[ https://issues.apache.org/jira/browse/HADOOP-819?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12488336 ]
Hadoop QA commented on HADOOP-819:
----------------------------------

Integrated in Hadoop-Nightly #55 (See http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Nightly/55/)

> LineRecordWriter should not always insert tab char between key and value
> ------------------------------------------------------------------------
>
>                 Key: HADOOP-819
>                 URL: https://issues.apache.org/jira/browse/HADOOP-819
>             Project: Hadoop
>          Issue Type: Improvement
>          Components: mapred
>            Reporter: Runping Qi
>         Assigned To: Runping Qi
>             Fix For: 0.13.0
>
>         Attachments: patch-819.txt
>
>
> With the current implementation of LineRecordWriter in TextOutputFormat, the
> client cannot pass a null key or value to the write function, and a tab char
> is always inserted between the key and value. This works fine most of the
> time. However, in some cases, one just does not want the extra tab char. A
> common example: if I need to implement a utility similar to the unix sort,
> with some fields in the lines as the sort key, I can have my map extract the
> sort key from each line and pass the whole line as the value. The reducer
> just outputs the values and ignores the keys. However, if I use
> TextOutputFormat, my output will have an extra tab char in each of the
> lines, which is annoying.
> A simple solution is to let the write function of LineRecordWriter accept a
> null key argument, and write out only the value if the key is null.
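
For illustration only, the behavior proposed in the description could be sketched roughly as follows. This is not the code from patch-819.txt and does not use the Hadoop API: the class name NullAwareLineWriter is hypothetical, and it wraps a plain java.io.Writer. The issue only asks for the null-key case; treating a null value the same way, and skipping the record when both are null, are assumptions made here.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.Writer;

// Hypothetical stand-in for a line-oriented record writer; not Hadoop's class.
public class NullAwareLineWriter {
    private final Writer out;

    public NullAwareLineWriter(Writer out) {
        this.out = out;
    }

    // Writes "key<TAB>value", but omits the tab when either side is null.
    public void write(Object key, Object value) throws IOException {
        boolean hasKey = key != null;
        boolean hasValue = value != null;
        if (!hasKey && !hasValue) {
            return; // assumption: nothing to write, so emit no line at all
        }
        if (hasKey) {
            out.write(key.toString());
        }
        if (hasKey && hasValue) {
            out.write('\t'); // separator only when both parts are present
        }
        if (hasValue) {
            out.write(value.toString());
        }
        out.write('\n');
    }

    public static void main(String[] args) throws IOException {
        StringWriter buf = new StringWriter();
        NullAwareLineWriter writer = new NullAwareLineWriter(buf);
        writer.write("sortkey", "a full input line"); // key, tab, value
        writer.write(null, "a full input line");      // value only, no tab
        System.out.print(buf);
    }
}
```

With a writer like this, the reducer in the sort example would pass null as the key and the whole line as the value, so each output line is exactly the input line.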