Patch is here: https://issues.apache.org/jira/browse/CRUNCH-333
On Mon, Jan 27, 2014 at 10:08 AM, Josh Wills <[email protected]> wrote:

> Of course. I wrote up a little patch that adds a method to From.java to
> open the Avro file and pull out the schema and return a Source of
> GenericData.Record, but I had to roll to some meetings before I got a
> chance to test it. I'll post something later this evening ET.
>
> On Jan 27, 2014 11:56 AM, "Magnus Runesson" <[email protected]> wrote:
>
>> Thanks for the quick answer.
>>
>> It is totally OK and reasonable to take one file in a directory and
>> assume all the others have the same schema.
>>
>> On 2014-01-27 18:27, Josh Wills wrote:
>>
>> No, I haven't written a way to do that yet, and I feel bad about it -- a
>> Clouderan asked me for just such a feature a couple of weeks ago and it
>> slipped my mind. I don't think it's hard to do, just a little tedious, and
>> it will require refreshing my memory of the Avro APIs. There's also the
>> potential issue that multiple Avro files in the same input directory can
>> have different schemas, so the one we would end up reading might be
>> somewhat arbitrary (e.g., based on the timestamp of the files in the
>> directory, or some such thing) -- is that OK?
>>
>> On Mon, Jan 27, 2014 at 9:12 AM, Magnus Runesson <[email protected]> wrote:
>>
>>> Can I in (S)Crunch read an Avro file into GenericRecord without providing
>>> the schema? I want Crunch to get the schema from the Avro file it reads.
>>> How do I do it?
>>>
>>> /Magnus

--
Director of Data Science
Cloudera <http://www.cloudera.com>
Twitter: @josh_wills <http://twitter.com/josh_wills>
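
For readers following along, here is a minimal sketch of the "open the Avro file and pull out the schema" step discussed above. It is not the CRUNCH-333 patch. Real code would normally let Avro's own DataFileReader expose the schema via getSchema(). To stay dependency-free, this version parses the Avro object container file header by hand (magic bytes, then a metadata map holding the "avro.schema" entry). The class and method names (AvroHeaderSchema, readSchemaJson) are hypothetical.

```java
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

/** Hypothetical helper: extracts the writer schema JSON from an Avro container file header. */
final class AvroHeaderSchema {
    // Avro object container files start with the bytes 'O', 'b', 'j', 0x01.
    private static final byte[] MAGIC = {'O', 'b', 'j', 1};

    /** Decodes one Avro long: variable-length, zigzag-encoded. */
    static long readLong(InputStream in) throws IOException {
        long raw = 0;
        int shift = 0;
        while (true) {
            int b = in.read();
            if (b < 0) throw new EOFException("truncated varint");
            raw |= (long) (b & 0x7f) << shift;
            if ((b & 0x80) == 0) break;          // high bit clear: last byte
            shift += 7;
            if (shift > 63) throw new IOException("varint too long");
        }
        return (raw >>> 1) ^ -(raw & 1);         // undo zigzag encoding
    }

    /** Reads a length-prefixed byte sequence (Avro "bytes" and "string" share this layout). */
    static byte[] readBytes(InputStream in) throws IOException {
        long len = readLong(in);
        if (len < 0 || len > Integer.MAX_VALUE) throw new IOException("bad length: " + len);
        byte[] buf = new byte[(int) len];
        new DataInputStream(in).readFully(buf);
        return buf;
    }

    /** Returns the JSON text stored under the "avro.schema" metadata key. */
    static String readSchemaJson(InputStream in) throws IOException {
        byte[] magic = new byte[4];
        new DataInputStream(in).readFully(magic);
        if (!Arrays.equals(magic, MAGIC)) throw new IOException("not an Avro container file");

        String schema = null;
        // The metadata map is written as blocks of entries, ended by a block count of 0.
        for (long count = readLong(in); count != 0; count = readLong(in)) {
            if (count < 0) {                     // negative count: a byte size follows
                count = -count;
                readLong(in);
            }
            for (long i = 0; i < count; i++) {
                String key = new String(readBytes(in), StandardCharsets.UTF_8);
                byte[] value = readBytes(in);
                if (key.equals("avro.schema")) {
                    schema = new String(value, StandardCharsets.UTF_8);
                }
            }
        }
        if (schema == null) throw new IOException("header has no avro.schema entry");
        return schema;
    }
}
```

For a directory of Avro files, the caveat raised in the thread still applies: this reads the schema of whichever single file is opened, and the other files are only assumed to match it.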
