Hi all,

I'm trying to write a simple converter in Crunch to turn Sequence files into ORC files. The only examples I can find for dealing with ORC files are the tutorial at http://hortonworks.com/blog/using-orcfile-cascading-apache-crunch/ and the discussion at https://issues.apache.org/jira/browse/CRUNCH-450. The tutorial only seems to show how to output data that's already in ORC format, which isn't much use to me here.
It would be nice to be able to output ORC files the way you can with Java MapReduce (http://hadoopathome.logdown.com/posts/277986-using-multipleoutputs-with-orc-in-mapreduce): specify a struct, parse each record into some type of object, and let the output do the rest. I've tried to replicate this in Crunch by writing a MapFn that turns each record into an OrcWritable, but it doesn't work, and even if it did I suspect it wouldn't be very efficient. Is this already possible, and I'm just missing it?

Thanks,
Ben
