Hi Micah,

Thanks for your help. It's good to see some more examples of ORC in Crunch. The single ORC record created manually in the test setup is what I needed to see.

Thanks,
Ben

On Mon, Sep 14, 2015 at 9:50 PM, Micah Whitacre <[email protected]> wrote:

> Ben,
>
> You might look at the OrcSourceTarget integration tests[1]. I'm not an
> expert at OrcFiles but looks like it has a few examples for reading/writing
> data.
>
> [1] -
> https://github.com/apache/crunch/blob/master/crunch-hive/src/it/java/org/apache/crunch/io/orc/OrcFileSourceTargetIT.java#L64
>
> On Mon, Sep 14, 2015 at 8:29 AM, Ben Watson <[email protected]>
> wrote:
>
>> Hi all,
>>
>> I'm trying to write a simple converter in Crunch to turn Sequence files
>> into ORC files. The only examples that I can find for dealing with ORC
>> files are the tutorial at
>> http://hortonworks.com/blog/using-orcfile-cascading-apache-crunch/ and
>> then the discussion at https://issues.apache.org/jira/browse/CRUNCH-450.
>> The tutorial seems to only show how to output data that's already in ORC
>> format, which isn't much use for me here.
>>
>> It would be nice to be able to output ORC files like you can with Java
>> MapReduce -
>> http://hadoopathome.logdown.com/posts/277986-using-multipleoutputs-with-orc-in-mapreduce
>> - specifying a Struct, parsing each record into some type of object, and
>> letting the output do the rest. I've tried to replicate this in Crunch by
>> writing a MapFn that basically turns each record into an OrcWritable, but
>> it doesn't work, and even if it did I suspect it wouldn't be very efficient.
>>
>> Is this something that's already possible that I'm missing?
>>
>> Thanks,
>>
>> Ben
>>
>
>
