Hey Nipur,

AvroPathPerKeyTarget is the closest thing to what you want. You can use it on a PTable<String, T>, where T is any type that Avro supports. It writes multiple output files under a common base directory, with the name of each file determined by the value of the String key in the PTable.
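A minimal sketch of how that could look for the CSV case, not a tested recipe. The class name BucketByTimestamp, the timestampKey helper, and the assumption that the timestamp is the first CSV column are all hypothetical and only there for illustration; each record is kept as a plain string, and the input and output paths come from the command line. The groupByKey() before the write is there so that all records for a given key are written together.

```java
import org.apache.crunch.MapFn;
import org.apache.crunch.Pair;
import org.apache.crunch.Pipeline;
import org.apache.crunch.impl.mr.MRPipeline;
import org.apache.crunch.io.From;
import org.apache.crunch.io.avro.AvroPathPerKeyTarget;
import org.apache.crunch.types.avro.Avros;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;

public class BucketByTimestamp {

  // Hypothetical helper: assumes the timestamp is the first CSV column.
  // The key ends up in the output path, so it should be a value that is
  // safe to use as a path component.
  static String timestampKey(String csvLine) {
    return csvLine.split(",", 2)[0].trim();
  }

  public static void main(String[] args) {
    // args[0] = input CSV path, args[1] = base output directory (both assumed).
    Pipeline pipeline = new MRPipeline(BucketByTimestamp.class, new Configuration());

    pipeline.read(From.textFile(args[0]))
        // Key each line by its timestamp; the value is the unparsed line.
        .parallelDo(new MapFn<String, Pair<String, String>>() {
          @Override
          public Pair<String, String> map(String line) {
            return Pair.of(timestampKey(line), line);
          }
        }, Avros.tableOf(Avros.strings(), Avros.strings()))
        // Bring all records with the same key together before writing.
        .groupByKey()
        // Write Avro output under args[1], split by key.
        .write(new AvroPathPerKeyTarget(new Path(args[1])));

    pipeline.done();
  }
}
```

Exactly where the key shows up under the base directory may depend on the Crunch version, so it is worth checking the output layout on a small input first.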
Josh

On Mon, Jul 6, 2015 at 11:47 AM, Nipur Patodi <[email protected]> wrote:

> Hi All,
>
> I am very new to Crunch.
>
> I am trying to read data from a CSV file using MR pipelines. I need to
> convert and bucketize this data on the basis of the timestamp, which is a
> field in the CSV. I need to write the data for each timestamp into a
> single file.
>
> This scenario is equivalent to writing the values (records) per key (which
> is the timestamp) to different files.
>
> I can achieve this using the multiple output format in MapReduce.
>
> Do we have any equivalent concept or design pattern to achieve the same
> behavior using Crunch?
>
> Please suggest.
>
> Thanks,
>
> _Nipur

--
Director of Data Science
Cloudera <http://www.cloudera.com>
Twitter: @josh_wills <http://twitter.com/josh_wills>
