Re: Getting started with Avro + Reading from an Avro formatted file

2012-01-24 Thread selvi k
> $ avro --help has some options that can help you out. > > For "avro cat", the following may help: > > --fields=FIELDS fields to show, comma separated (show all by default) Thanks a lot for this pointer, Harsh. This is how I chanced upon the filter flag. I am going to take a look at the blog p

Re: Getting started with Avro + Reading from an Avro formatted file

2012-01-24 Thread selvi k
On Tue, Jan 24, 2012 at 4:00 PM, Douglas Creager wrote: > > Also my understanding is that I must 'install' or deploy Avro before I > can try out the C bindings suggested by Douglas. I am stating this since I > am not exactly clear by what this meant: - "especially since the C > bindings don't hav

Re: Getting started with Avro + Reading from an Avro formatted file

2012-01-24 Thread selvi k
I was able to set both up and use them. And they work like a charm! :) - The advantage of the C version for me was that the CSV file it created retained the field names for every field. Even though this makes it bulky, as I move my data through different processing steps, this would come in handy

Re: Getting started with Avro + Reading from an Avro formatted file

2012-01-24 Thread Harsh J
Selvi, (Forgot to reply to this before) On Wed, Jan 25, 2012 at 1:07 AM, selvi k wrote: > 3. With regards to the two suggested ways, would either of these techniques > allow me to filter my data records using some sort of a condition on a > field? (or a few fields) If not it seems like I would h

Re: Getting started with Avro + Reading from an Avro formatted file

2012-01-24 Thread Douglas Creager
> Also my understanding is that I must 'install' or deploy Avro before I can > try out the C bindings suggested by Douglas. I am stating this since I am not > exactly clear by what this meant: - "especially since the C bindings don't > have any library dependencies to install". I am assuming it

Re: Getting started with Avro + Reading from an Avro formatted file

2012-01-24 Thread Harsh J
If you want to try out the Python API for Avro datafiles, I had written a short blog post on reading/writing that at http://www.harshj.com/2010/04/25/writing-and-reading-avro-data-files-using-python/ which still holds good I think. Hope this helps. On Wed, Jan 25, 2012 at 1:50 AM, selvi k wrote:
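For anyone following the thread, here is a rough sketch of the Python route discussed here: read records from an Avro data file, keep only those matching a condition on a field, and write a few chosen fields out as CSV. It is not taken from the blog post. It assumes the `avro` package is installed and that the top-level schema is a record, so each datum comes back as a dict. The helper names (`read_avro_records`, `records_to_csv`) and the sample field names (`type`, `id`) are made up for illustration and are not the actual job history schema.

```python
import csv
import io


def read_avro_records(path):
    """Yield each datum of an Avro data file (needs the `avro` package).

    Assumption: the file's top-level schema is a record, so each datum
    is a dict keyed by field name.
    """
    # Imported here so the CSV helper below works without avro installed.
    from avro.datafile import DataFileReader
    from avro.io import DatumReader

    reader = DataFileReader(open(path, "rb"), DatumReader())
    try:
        for record in reader:
            yield record
    finally:
        reader.close()


def records_to_csv(records, fields, out, keep=lambda record: True):
    """Write the chosen fields of every record accepted by `keep` as CSV.

    Returns the number of data rows written (the header is not counted).
    """
    writer = csv.DictWriter(out, fieldnames=fields)
    writer.writeheader()
    written = 0
    for record in records:
        if keep(record):
            # Fields missing from a record become empty cells.
            writer.writerow({name: record.get(name, "") for name in fields})
            written += 1
    return written


if __name__ == "__main__":
    # Stand-in data; with a real file use read_avro_records("some_file.avro").
    sample = [
        {"type": "JOB_SUBMITTED", "id": 1, "extra": "ignored"},
        {"type": "TASK_STARTED", "id": 2},
    ]
    buffer = io.StringIO()
    records_to_csv(sample, ["type", "id"], buffer,
                   keep=lambda r: r["type"].startswith("JOB"))
    print(buffer.getvalue())
```

The filter is just a Python callable, so a condition over several fields is a matter of writing a longer lambda or a small function.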

Re: Getting started with Avro + Reading from an Avro formatted file

2012-01-24 Thread selvi k
I found out what the issue was: I first needed to install snappy, downloaded from http://code.google.com/p/snappy/. After a simple ./configure, make and make install, 'easy_install avro' completed successfully. I will try out both the CSV conversion options and update this thread in a bit. -

Re: Getting started with Avro + Reading from an Avro formatted file

2012-01-24 Thread selvi k
Douglas and Harsh - Thanks a lot for the immediate and detailed replies! Looks like both of these would work well for me. In order to start trying these, I have tried a few things to get started with Avro, but this is where I am stuck: 1. I first downloaded the stable version in the form of "av

Re: Getting started with Avro + Reading from an Avro formatted file

2012-01-24 Thread Harsh J
Selvi, Expanding on Douglas' response, if you have installed Avro's python libraries (Simplest way to get latest stable is: "easy_install avro", or install from the distribution -- Post back if you need help on this), you can simply do, using the now-installed 'avro' executable: $ ls sample_input

Re: Getting started with Avro + Reading from an Avro formatted file

2012-01-24 Thread Douglas Creager
> I want to be able to read from an Avro formatted log file (specifically the > History Log file created at the end of a Hadoop job) and create a Comma > Separated file of certain log entries. I need a csv file because this is the > format that is accepted by post processing software I am workin

Getting started with Avro + Reading from an Avro formatted file

2012-01-24 Thread selvi k
Hello All, I would like some suggestions on where I can start in the Avro project. I want to be able to read from an Avro formatted log file (specifically the History Log file created at the end of a Hadoop job) and create a Comma Separated file of certain log entries. I need a csv file because