Wonder if there has been any progress on getting Parquet reading to work in Crunch? Currently I can't read specific classes that were written out to Parquet Avro files without getting a casting exception. I also can't read Parquet files as generics if the generated class that represents the data from the schema is on the classpath during the generic read. If I remove the generated class from the jar and read the Parquet data as a GenericRecord, it works.

Thanks!
Danny

From: [email protected]
Date: Mon, 29 Sep 2014 20:43:35 -0700
Subject: Re: Trouble with Avro records
To: [email protected]

On Mon, Sep 29, 2014 at 2:32 PM, Danny Morgan <[email protected]> wrote:

Hi Josh,

From: [email protected]
Date: Mon, 29 Sep 2014 13:49:01 -0700
Subject: Re: Trouble with Avro records
To: [email protected]

> Hrm, that's surprising. We upgraded the Avro version to deal w/some Java 7
> stuff-- maybe exclude it and give it another go?

Yea that did the trick.

> On the Parquet stuff, this is apparently a long standing issue w/Parquet that
> requires some major surgery to fix. I have some book stuff to get done over the
> next couple of weeks so I don't have the time to do it myself, but I'll
> keep checking in with the parquet folks.

Working on a Crunch book possibly??!?!

Sadly, no-- data science stuff this time, instead of data engineering. :)

Thanks Again!
Danny

--
Director of Data Science
Cloudera
Twitter: @josh_wills
