Using "Line" in the class names stresses that these classes are intended for Streaming.

Unix commands invoked by Streaming operate on lines, not on Text.

Understanding what "Text" means requires knowing the internals of Hadoop, while "line" is used here in its common meaning.

On Apr 12, 2007, at 2:14 PM, Doug Cutting (JIRA) wrote:


[ https://issues.apache.org/jira/browse/HADOOP-1214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12488504 ]

Doug Cutting commented on HADOOP-1214:
--------------------------------------

There are some whitespace-only changes in TextInputFormat.java.

SequenceFileLineRecordReader might better be called SequenceFileTextRecordReader, since it doesn't convert things to lines, but rather to Text, no?

Similarly, SequenceFileToLineInputFormat might be called SequenceFileTextInputFormat.




the first step for streaming clean up
-------------------------------------

                Key: HADOOP-1214
                URL: https://issues.apache.org/jira/browse/HADOOP-1214
            Project: Hadoop
         Issue Type: Improvement
         Components: contrib/streaming
           Reporter: Runping Qi
        Assigned To: Runping Qi
        Attachments: patch-1214.txt


This is the first step for streaming clean up.
This step will mainly replace the various streaming classes for input formats, output formats, record readers, etc. with Hadoop's counterparts.
This step will maintain backward compatibility.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

