My mapper code is as follows, and I am not sure whether every file is being
closed correctly.
import java.io.IOException;

import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.output.MultipleOutputs;

public class TheMapper extends Mapper<LongWritable, Text, Text, NullWritable> {

    private MultipleOutputs<Text, NullWritable> outputs;

    @Override
    protected void setup(Context ctx) {
        outputs = new MultipleOutputs<Text, NullWritable>(ctx);
    }

    @Override
    protected void map(LongWritable o, Text t, Context ctx)
            throws IOException, InterruptedException {
        // Each record is written under the base output path "2014-01-20/".
        outputs.write(t, NullWritable.get(), "2014-01-20/");
    }

    @Override
    protected void cleanup(Context ctx)
            throws IOException, InterruptedException {
        // close() flushes and closes the MultipleOutputs record writers,
        // so the final file lengths get reported to the NameNode.
        outputs.close();
    }
}
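
For completeness, a minimal driver sketch for a map-only job like this one
(the class name, paths, and the LazyOutputFormat choice are illustrative
assumptions, not taken from the original job; LazyOutputFormat just suppresses
the empty default part-* files that would otherwise sit next to the
MultipleOutputs files):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.mapreduce.lib.output.LazyOutputFormat;
import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;

public class TheDriver {
    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "multiple-outputs");
        job.setJarByClass(TheDriver.class);
        job.setMapperClass(TheMapper.class);
        job.setNumReduceTasks(0);              // map-only job
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(NullWritable.class);
        // Suppress the empty default part files; MultipleOutputs
        // writes its own files under the given base paths.
        LazyOutputFormat.setOutputFormatClass(job, TextOutputFormat.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
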
2014-04-08 21:17 GMT+08:00 Peyman Mohajerian <[email protected]>:
> If you didn't close the file correctly, then the NameNode wouldn't be
> notified of the file's final size. The file size is metadata that comes from
> the NameNode.
>
>
> On Tue, Apr 8, 2014 at 4:35 AM, Tao Xiao <[email protected]> wrote:
>
>> I wrote some data into a file using MultipleOutputs in my mappers. I can see
>> the contents of this file using "hadoop fs -cat <file>", but its size is
>> reported as zero by the command "hadoop fs -du <file>" or "hadoop fs -ls
>> <file>", as follows:
>>
>> -rw-r--r-- 3 hadoop hadoop 0 2014-04-07 22:06
>> /test/xt/out/2014-01-20/-m-00000
>>
>> BTW, when I download this file from HDFS to the local file system, it shows
>> the correct size. Why is the size reported as zero by the Hadoop CLI?
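
Following up on the explanation quoted above: one way to see exactly what the
NameNode is reporting is to ask it for the file's status through the
FileSystem API. A minimal sketch (the class name is hypothetical; the path is
taken from the listing above):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class CheckReportedLength {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);
        // getFileStatus() returns metadata held by the NameNode; the
        // length is only final once the file has been closed properly.
        FileStatus st =
            fs.getFileStatus(new Path("/test/xt/out/2014-01-20/-m-00000"));
        System.out.println(st.getPath() + " length=" + st.getLen());
    }
}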