LineRecordWriter should not always insert tab char between key and value
------------------------------------------------------------------------

                 Key: HADOOP-819
                 URL: http://issues.apache.org/jira/browse/HADOOP-819
             Project: Hadoop
          Issue Type: Improvement
            Reporter: Runping Qi



With the current implementation of LineRecordWriter in TextOutputFormat, the 
client cannot pass a null key or value to the write function, and a tab 
character is always inserted between the key and the value. This works fine 
most of the time. However, in some cases, one just does not want the extra 
tab character. A common example: if I need to implement a utility similar to 
Unix sort that uses some fields of each line as the sort key, I can have my 
map extract the sort key from each line and pass the whole line as the value. 
The reducer just outputs the values and ignores the keys. However, if I use 
TextOutputFormat, each line of my output will contain an extra tab character, 
which is annoying. 

A simple solution is to let the write function of LineRecordWriter accept a 
null key argument and, when the key is null, write out only the value. 
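
A minimal sketch of the proposed behavior follows. It is not the actual 
Hadoop code: NullAwareLineWriter is a hypothetical stand-in for 
LineRecordWriter that writes to a plain java.io.PrintWriter so it runs 
without Hadoop on the classpath. Handling a null value in the same way, and 
skipping a record when both key and value are null, are assumptions that go 
beyond what is proposed above.

```java
import java.io.PrintWriter;
import java.io.StringWriter;

// Hypothetical stand-in for LineRecordWriter; not the actual Hadoop class.
class NullAwareLineWriter {
    private final PrintWriter out;

    NullAwareLineWriter(PrintWriter out) {
        this.out = out;
    }

    // Writes one record per line. The tab separator is emitted only when
    // both key and value are present.
    void write(Object key, Object value) {
        if (key == null && value == null) {
            return; // assumption: nothing to write, so skip the record
        }
        if (key != null) {
            out.print(key);
        }
        if (key != null && value != null) {
            out.print('\t');
        }
        if (value != null) {
            out.print(value); // assumption: a null value is tolerated too
        }
        out.print('\n');
    }

    void close() {
        out.close();
    }

    public static void main(String[] args) {
        StringWriter buffer = new StringWriter();
        NullAwareLineWriter writer =
            new NullAwareLineWriter(new PrintWriter(buffer));
        writer.write("sortKey", "line with key");  // key, tab, value
        writer.write(null, "whole line, no tab");  // value only
        writer.close();
        System.out.print(buffer);
    }
}
```

With a writer like this, the sort example's reducer could pass null as the 
key and the original line as the value, so each output line would match its 
input line exactly.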

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: 
http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
