Hi all, I have a Crunch job that should process a large sequence file and produce a single CSV file. I am using "pipeline.writeTextFile(transformedRecords, csvFilePath)" to write the CSV output (csvFilePath is a path like "/data/csv_directory"). The larger the input sequence file, the more mappers are created, and an equal number of CSV output files is written.
In classic MapReduce, one could produce a single output file by setting the number of reducers to 1 when configuring the job. How can I achieve this with Crunch? I would really appreciate any help here. Thanks, Som
