Multiple output files

Ryan LeCompte Sat, 06 Sep 2008 09:35:36 -0700

Hello,

I have a question regarding multiple output files that get produced as
a result of using multiple reduce tasks for a job (as opposed to only
one). If I'm using a custom Writable and thus writing sequence file
output, am I guaranteed that all of the data for a particular key will
appear in a single output file (e.g., part-0000), or is it possible
that the values could be split across multiple part-xxxx files? At the
end of the job I'm using the sequence file reader to read each custom
key/writable pair from each output file. Is it possible that the same
key could appear in multiple output files? If so, does Hadoop
automatically gather all of the values for a particular key across all
of the output files?


Thanks,
Ryan
