Hey Som,

The Pipeline object that coordinates the flow has a getConfiguration() method where you can set any options you like, and they will propagate to all of your jobs.
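
A minimal sketch of the getConfiguration() route, assuming the classic (Hadoop 1.x) output-compression property names and an MRPipeline. The class name, the helper method, and the choice of DefaultCodec are placeholders for illustration, and the input and output paths come from the command line; swap in whatever codec you actually want.

```java
import org.apache.crunch.PCollection;
import org.apache.crunch.Pipeline;
import org.apache.crunch.PipelineResult;
import org.apache.crunch.impl.mr.MRPipeline;
import org.apache.crunch.io.To;
import org.apache.hadoop.conf.Configuration;

// Hypothetical example class; not part of Crunch.
public class CompressedOutputExample {

  // Turn on block-compressed job output.
  // Assumption: classic property names; newer Hadoop releases map them
  // to mapreduce.output.fileoutputformat.* equivalents.
  static void enableBlockCompression(Configuration conf) {
    conf.setBoolean("mapred.output.compress", true);
    conf.set("mapred.output.compression.type", "BLOCK");
    // Placeholder codec; replace with the codec class you want to use.
    conf.set("mapred.output.compression.codec",
        "org.apache.hadoop.io.compress.DefaultCodec");
  }

  public static void main(String[] args) throws Exception {
    Pipeline pipeline =
        new MRPipeline(CompressedOutputExample.class, new Configuration());

    // Options set here are handed to every job the pipeline launches.
    enableBlockCompression(pipeline.getConfiguration());

    // args[0] = input text path, args[1] = output sequence file path.
    PCollection<String> lines = pipeline.readTextFile(args[0]);
    pipeline.write(lines, To.sequenceFile(args[1]));

    PipelineResult result = pipeline.done();
    System.exit(result.succeeded() ? 0 : 1);
  }
}
```

Hard-coding the options like this ties them to the pipeline code, which is why passing them on the command line is often more convenient while experimenting.
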

I usually implement Hadoop's Tool interface and then specify these configuration options on the command line, so I can play with them independently of my pipeline logic. I end up with something like:

    hadoop jar <crunch-job.jar> -D mapred.output.compress=true -D mapred.output.compression.type=BLOCK

etc.

I think that having some syntactic sugar for compressing Target objects (like To.sequenceFile or To.avroFile) would be a nice JIRA.

J

On Fri, Aug 2, 2013 at 3:58 PM, Som Satpathy <[email protected]> wrote:

> Hi all,
>
> I am trying to write compressed sequence files at the end of my crunch
> pipeline. I'm doing a pipeline.write(mycollection, To.sequenceFile(path))
> for that.
> However, Crunch is writing an uncompressed sequence file by default. How
> do I pass the codec that I want to use to Crunch?
>
> Looking forward for your inputs.
>
> Thanks,
> Som

--
Director of Data Science
Cloudera <http://www.cloudera.com>
Twitter: @josh_wills <http://twitter.com/josh_wills>
