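You need to add a call to MultipleOutputs.close() in your reducer's cleanup()
method:

@Override
protected void cleanup(Context context) throws IOException, InterruptedException {
    // Closes every record writer that MultipleOutputs opened.
    mos.close();
    ...
}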
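
As far as I know, MultipleOutputs keeps one record writer open per output name, and whatever those writers have buffered only reaches the file when they are closed. That would explain why the files exist but are empty. Here is a rough sketch of the same effect using plain java.io rather than Hadoop; the class and method names (CloseDemo, writeLine) are made up for illustration and are not part of any Hadoop API:

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class CloseDemo {
    // Writes one short record through a buffered writer and returns the
    // size of the file on disk afterwards. When close is false the writer
    // is deliberately left open, so the record stays in the buffer.
    static long writeLine(Path file, boolean close) throws IOException {
        BufferedWriter writer = Files.newBufferedWriter(file);
        writer.write("id\tline\n");
        if (close) {
            writer.close(); // flushes the buffer to the file
        }
        return Files.size(file);
    }

    public static void main(String[] args) throws IOException {
        Path unclosed = Files.createTempFile("unclosed", ".txt");
        Path closed = Files.createTempFile("closed", ".txt");
        System.out.println("unclosed: " + writeLine(unclosed, false) + " bytes");
        System.out.println("closed:   " + writeLine(closed, true) + " bytes");
    }
}
```

The unclosed file is created but stays at zero bytes, which matches what you are seeing: the file name shows up, the data does not.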
On Fri, May 6, 2011 at 1:55 PM, Geoffry Roberts
<[email protected]> wrote:
> All,
>
> I am attempting to take a large file and split it up into a series of
> smaller files. I want the smaller files to be named based on values taken
> from the large file. I am using
> org.apache.hadoop.mapreduce.lib.output.MultipleOutputs to do this.
>
> The job runs without error and produces a set of files as expected and each
> file is named as expected. But most of the files are empty. Apparently, no
> data was written to them. The fact that the file was created at all should
> confirm that there was data coming in from the mapper. My reducer counts
> the values as it iterates through them, then logs the count. I am seeing
> reasonable counts in my logs. The number of lines in an output file should
> equal the count. I have counts but no lines.
>
> What could be causing this?
>
> My Mapper:
> protected void map(LongWritable key, Text value, Context ctx) throws
> IOException,
> InterruptedException {
> String[] ss = value.toString().split(",");
> String locale = ss[F.DEPARTURE_LOCALE];
> ctx.write(new Text(locale), value);
> }
>
> My Reducer:
> private MultipleOutputs<Text, Text> mos;
>
> @Override
> protected void setup(Context ctx) throws IOException, InterruptedException
> {
> mos = new MultipleOutputs<Text, Text>(ctx);
> }
>
> @Override
> protected void reduce(Text key, Iterable<Text> values, Context ctx)
> throws IOException, InterruptedException {
> int k = 0;
> /*
> * The key at this point can have blanks and slashes. Let us get rid
> * of both.
> */
> String blankless = key.toString().replace(' ', '+');
> String path = blankless.replace("/", "");
> try {
> for (Text value : values) {
> k++;
> String[] ss = value.toString().split(F.DELIMITER);
> String id = ss[F.ID];
> String[] sslessid = Arrays.copyOfRange(ss, 1, ss.length);
> String line = UT.array2String(sslessid);
>
> // An output file is being created,
> mos.write(new Text(id), new Text(line), path);
> }
> } catch (NullPointerException e) {
> LOG.error("<br/>" + "blankless=" + blankless);
> LOG.error("<br/>" + "values=" + values.toString());
> }
>
> // In my logs, I see reasonable counts even when the output file is empty.
> LOG.info("<br/>key=" + path + " count=" + k);
> }
> --
> Geoffry Roberts
>
>
--
Joseph Echeverria
Cloudera, Inc.
443.305.9434