[ https://issues.apache.org/jira/browse/HADOOP-819?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12488336 ]

Hadoop QA commented on HADOOP-819:
----------------------------------

Integrated in Hadoop-Nightly #55 (See 
http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Nightly/55/)

> LineRecordWriter should not always insert tab char between key and value
> ------------------------------------------------------------------------
>
>                 Key: HADOOP-819
>                 URL: https://issues.apache.org/jira/browse/HADOOP-819
>             Project: Hadoop
>          Issue Type: Improvement
>          Components: mapred
>            Reporter: Runping Qi
>         Assigned To: Runping Qi
>             Fix For: 0.13.0
>
>         Attachments: patch-819.txt
>
>
> With the current implementation of LineRecordWriter in TextOutputFormat, the 
> client cannot pass a null key or a null value to the write function, and a tab 
> char is always inserted between the key and the value. This works fine most of 
> the time. However, in some cases, one just does not want the extra tab char. 
> A common example: if I need to implement a utility similar to the unix sort, 
> with some fields in the lines as the sort key, I can have my map extract the 
> sort key from each line and pass the whole line as the value. The reducer just 
> outputs the values and ignores the keys. However, if I use TextOutputFormat, 
> my output will have an extra tab char in each of the lines, which is annoying. 
> A simple solution is to let the write function of LineRecordWriter accept a 
> null key argument and, when the key is null, write out only the value. 
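
The proposed behavior can be sketched roughly as follows. This is a minimal, 
standalone illustration and not the code in patch-819.txt: the class name 
LineRecordSketch and the method formatRecord are hypothetical, it does not use 
any Hadoop classes, and the handling of a null value (mirroring the null key 
case) is an assumption beyond what the description asks for.

```java
public class LineRecordSketch {

    /**
     * Formats one output line. The tab separator is emitted only when both
     * the key and the value are present (hypothetical helper, not Hadoop API).
     */
    public static String formatRecord(Object key, Object value) {
        boolean hasKey = key != null;
        boolean hasValue = value != null;
        if (!hasKey && !hasValue) {
            return "";               // assumption: nothing to write at all
        }
        StringBuilder line = new StringBuilder();
        if (hasKey) {
            line.append(key);
        }
        if (hasKey && hasValue) {
            line.append('\t');       // separator only between two real fields
        }
        if (hasValue) {
            line.append(value);
        }
        line.append('\n');
        return line.toString();
    }

    public static void main(String[] args) {
        // Sort-like job: the reducer passes a null key and the whole line as value.
        System.out.print(formatRecord(null, "3 apples"));
        // Ordinary key/value record still gets the tab separator.
        System.out.print(formatRecord("fruit", "apples"));
    }
}
```

With something along these lines, a reducer for the sort-like utility would 
pass null as the key and the original line as the value, and the output line 
would carry no extra tab char.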

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
