Hi Saptarshi,

Are you able to reproduce this on the 0.20.1-rc1 release candidate uploaded last week?

http://people.apache.org/~omalley/hadoop-0.20.1-rc1/

If so, it would be worth putting together a test case. If you can reproduce
this in a JUnit test (even if it only happens once every few runs) you
should definitely open a JIRA.
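
As a starting point, here is a minimal sketch of the shape such a test could
take. It deliberately does not touch Hadoop: LineWriter is a hypothetical
stand-in for the RecordWriter in your copied code, and the in-memory sink
stands in for the output file. All names here are my own assumptions. It only
illustrates your hypothesis (b), that records handed to write() never reach
the file if the buffered stream is not flushed or closed; it does not show
that this is what happens in TextOutputFormat.

```java
import java.io.BufferedOutputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

class LineWriterCheck {

    // Hypothetical stand-in for the RecordWriter: key, tab, value, newline.
    static final class LineWriter {
        private final DataOutputStream out;

        LineWriter(DataOutputStream out) {
            this.out = out;
        }

        synchronized void write(String key, String value) throws IOException {
            out.write(key.getBytes(StandardCharsets.UTF_8));
            out.write('\t');
            out.write(value.getBytes(StandardCharsets.UTF_8));
            out.write('\n');
        }

        void close() throws IOException {
            out.close(); // flushes the buffered stream underneath
        }
    }

    // The 52 upper- and lower-case letters used as keys.
    static List<String> alphabetKeys() {
        List<String> keys = new ArrayList<>();
        for (char c = 'A'; c <= 'Z'; c++) keys.add(String.valueOf(c));
        for (char c = 'a'; c <= 'z'; c++) keys.add(String.valueOf(c));
        return keys;
    }

    // Writes every key once and returns how many lines reached the sink.
    static int linesWritten(boolean closeWriter) {
        try {
            ByteArrayOutputStream sink = new ByteArrayOutputStream();
            LineWriter writer = new LineWriter(
                new DataOutputStream(new BufferedOutputStream(sink)));
            for (String key : alphabetKeys()) {
                writer.write(key, "v");
            }
            if (closeWriter) {
                writer.close();
            }
            String text = new String(sink.toByteArray(), StandardCharsets.UTF_8);
            return text.isEmpty() ? 0 : text.split("\n").length;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("closed:   " + linesWritten(true));
        System.out.println("unclosed: " + linesWritten(false));
    }
}
```

A real test case would replace LineWriter with your copied output format, run
a small job, and assert that the number of lines in the output directory
equals the number of keys passed to write().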

Thanks,
-Todd

On Sat, Sep 5, 2009 at 12:22 PM, Saptarshi Guha <[email protected]> wrote:

> Hello,
> I'm using the TextOutputFormat in mapreduce/lib/output with Hadoop 0.20,
> and it appears not to be writing all the keys to the output file, even though
> the write method in the RecordWriter is receiving them. Let me explain.
>
> 1) I copied TextOutputFormat, save for some added debugging print messages:
>
>    public synchronized void write(K key, V value)
>      throws IOException {
>
>      boolean nullKey = key == null || key instanceof NullWritable;
>      boolean nullValue = value == null || value instanceof NullWritable;
>      if (nullKey && nullValue) {
>        return;
>      }
>      if (!nullKey) {
>        writeObject(key);
>      }
>      if (!(nullKey || nullValue)) {
>        out.write(keyValueSeparator);
>      }
>      if (!nullValue) {
>        writeObject(value);
>      }
>      out.write(newline);
>
>      System.out.println("Key="+key.toString());
>      System.out.println("Value="+value.toString());
>    }
>
> I expect 52 keys, corresponding to the upper- and lower-case letters of the
> alphabet. I get fewer than 52 keys in the output folder: the count varies
> between runs (sometimes 44), and once I even got all 52.
> /However/, the write method above does receive the missing K,V pairs, as
> evidenced by the log file messages, i.e. I see Key=(missing key) and
> Value=(missing-value).
> Hence, for some reason, it is a) not writing, b) writing but not
> flushing/committing, or c) the temporary outputs are getting deleted.
> Also, if a given reducer has received e.g. 5 keys, I see messages for all 5
> keys, of which a few (but not all) are missing from the output.
>
> SequenceFileOutputFormat does not have the same issue (all 52 keys are present).
>
> Any ideas? Is this my bug?
> Kind Regards
> Saptarshi
>
> Version: 0.20.0, r763504
> Compiled: Thu Apr 9 05:18:40 UTC 2009 by ndaley
> Identifier: 200908281653
>
>
>
> Saptarshi Guha | [email protected] |
> http://www.stat.purdue.edu/~sguha
> Kindness is a language which the deaf can hear and the blind can read.
>                -- Mark Twain
>
>
