[jira] [Created] (HADOOP-9168) The Naming and Inheritance for RecordReader, LineRecordReader, LineReader

2012-12-26 Thread Gelesh (JIRA)
Gelesh created HADOOP-9168:
--

 Summary: The Naming and Inheritance for RecordReader, 
LineRecordReader, LineReader 
 Key: HADOOP-9168
 URL: https://issues.apache.org/jira/browse/HADOOP-9168
 Project: Hadoop Common
  Issue Type: Improvement
  Components: util
Affects Versions: 0.23.5, 2.0.2-alpha, 0.21.0
Reporter: Gelesh
Priority: Minor
 Fix For: site, hudson, 1.2.0, 0.23.2



I feel LineReader is not the correct name, since it reads up to a given 
delimiter.

How about TextRecordReader?
That sounds right, but LineReader is not a RecordReader by inheritance,
although by functionality it is the record reader.

Now, looking at it from a different angle:


In general,
an InputFormat has two main responsibilities:
1) Read a split.
2) Generate key/value pairs from what was read from the split.

TextInputFormat
has a RecordReader, which is extended by LineRecordReader,
which in turn uses another class, LineReader.

But in practice,
LineReader does the actual reading of the file, and
LineRecordReader generates the key/value pairs.

I would suggest,

RecordReader to be renamed as KeyValueGenerator,
LineRecordReader to be renamed as TextInputKeyValueGenerator,
LineReader to be renamed as delimitedTextReader.

Generic attributes of LineReader (such as start, pos, end, buffer, bufferBytes,
etc.) to be abstracted into a class called RecordReader,
since they are all specific to reading the given input.

delimitedTextReader class could extend RecordReader.

Now the names would make better sense. We must also look into compatibility.
The change might be unfit to deploy unless a new API is introduced.
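
To make the proposed layering concrete, here is a rough sketch of how the
renamed classes might relate. It is only an illustration under assumptions: it
uses plain java.io rather than the real Hadoop classes, every signature and
method name (readRecord, getPos, nextKeyValue on these types) is hypothetical,
the stream is assumed to be already positioned at the split start, and
delimitedTextReader is capitalized as DelimitedTextReader to follow Java class
naming.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical: the role RecordReader plays today (producing key/value pairs).
interface KeyValueGenerator<K, V> {
    boolean nextKeyValue() throws IOException;
    K getCurrentKey();
    V getCurrentValue();
}

// Hypothetical: the proposed new RecordReader, holding only the generic
// read state (start, pos, end) for one split.
abstract class RecordReader {
    protected final InputStream in;
    protected final long start;
    protected final long end;
    protected long pos;

    protected RecordReader(InputStream in, long start, long end) {
        this.in = in;
        this.start = start;
        this.end = end;
        this.pos = start;
    }

    long getPos() {
        return pos;
    }

    // Returns the next raw record, or null when nothing is left to read.
    abstract byte[] readRecord() throws IOException;
}

// Hypothetical: today's LineReader, renamed because it splits on any
// single-byte delimiter rather than only on line endings.
class DelimitedTextReader extends RecordReader {
    private final byte delimiter;

    DelimitedTextReader(InputStream in, long start, long end, byte delimiter) {
        super(in, start, end);
        this.delimiter = delimiter;
    }

    @Override
    byte[] readRecord() throws IOException {
        ByteArrayOutputStream record = new ByteArrayOutputStream();
        boolean readAny = false;
        int b;
        while (pos < end && (b = in.read()) != -1) {
            pos++;
            readAny = true;
            if ((byte) b == delimiter) {
                break; // delimiter is consumed but not part of the record
            }
            record.write(b);
        }
        return readAny ? record.toByteArray() : null;
    }
}

// Hypothetical: today's LineRecordReader, which only turns raw records
// into (byte offset, text) pairs and delegates all reading.
class TextInputKeyValueGenerator implements KeyValueGenerator<Long, String> {
    private final RecordReader reader;
    private Long key;
    private String value;

    TextInputKeyValueGenerator(RecordReader reader) {
        this.reader = reader;
    }

    @Override
    public boolean nextKeyValue() throws IOException {
        long offset = reader.getPos();
        byte[] record = reader.readRecord();
        if (record == null) {
            return false;
        }
        key = offset;
        value = new String(record, StandardCharsets.UTF_8);
        return true;
    }

    @Override
    public Long getCurrentKey() {
        return key;
    }

    @Override
    public String getCurrentValue() {
        return value;
    }
}
```

In this sketch the reading concern (positions and delimiter handling) lives
entirely under RecordReader, and the key/value concern lives entirely under
KeyValueGenerator, which is the separation the renaming is meant to express.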

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (HADOOP-9169) Bring branch-0.23 ExitUtil up to same level as branch-2

2012-12-26 Thread Robert Joseph Evans (JIRA)
Robert Joseph Evans created HADOOP-9169:
---

 Summary: Bring branch-0.23 ExitUtil up to same level as branch-2
 Key: HADOOP-9169
 URL: https://issues.apache.org/jira/browse/HADOOP-9169
 Project: Hadoop Common
  Issue Type: Bug
Affects Versions: 0.23.5
Reporter: Robert Joseph Evans
Assignee: Robert Joseph Evans


ExitUtil in 0.23 is behind branch-2, because a number of changes went in that 
were part of HDFS JIRA.
