[ 
https://issues.apache.org/jira/browse/HADOOP-1053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12478116
 ] 

Doug Cutting commented on HADOOP-1053:
--------------------------------------

Milind, you're right: if we implement some things in the io package as records 
then we'll have a circular package structure: these packages would no longer be 
well layered.  I don't see how that would cost us much, but it is true.

If we want to (1) define some things that are currently in the io package as 
records, (2) not duplicate code, and (3) keep things well layered, then we'd 
need to restructure things.  The io runtime (readVInt, compareBytes, Writable, 
etc.) would need to be split into a separate package from classes that we 
might define using the record package, like IntWritable and BytesWritable, so 
that the package layering might be ioruntime > record > iostructs.
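
To make that layering concrete, here is a rough single-file sketch.  It is 
only an illustration under stated assumptions: the nested types stand in for 
the three hypothetical packages (ioruntime, record, iostructs), the Writable 
and IntWritable shown are simplified stand-ins rather than the real Hadoop 
classes, and the Record base class and roundTrip helper are invented for the 
example.  Each layer refers only to the layer before it.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

class LayeringSketch {

    // Layer 1, standing in for "ioruntime": knows nothing about records.
    interface Writable {
        void write(DataOutput out) throws IOException;
        void readFields(DataInput in) throws IOException;
    }

    // Layer 2, standing in for "record": depends only on the runtime layer.
    // (Hypothetical base class, not the real org.apache.hadoop.record API.)
    static abstract class Record implements Writable {
        abstract void serialize(DataOutput out) throws IOException;
        abstract void deserialize(DataInput in) throws IOException;

        @Override
        public void write(DataOutput out) throws IOException {
            serialize(out);
        }

        @Override
        public void readFields(DataInput in) throws IOException {
            deserialize(in);
        }
    }

    // Layer 3, standing in for "iostructs": a concrete type defined as a record.
    static class IntWritable extends Record {
        int value;

        @Override
        void serialize(DataOutput out) throws IOException {
            out.writeInt(value);
        }

        @Override
        void deserialize(DataInput in) throws IOException {
            value = in.readInt();
        }
    }

    // Hypothetical helper: writes a value through the top layer and reads it back.
    static int roundTrip(int v) {
        try {
            IntWritable w = new IntWritable();
            w.value = v;
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            w.write(new DataOutputStream(bytes));

            IntWritable r = new IntWritable();
            r.readFields(new DataInputStream(
                    new ByteArrayInputStream(bytes.toByteArray())));
            return r.value;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip(42));
    }
}
```

Note that nothing in the first layer mentions Record, and nothing in the 
second mentions IntWritable, which is the one-way dependency that a circular 
io/record structure would give up.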

But what would be the point?  We could probably decompose nearly every package 
into well-layered sub-packages, but that is disruptive, since it is not 
back-compatible. Occasionally it is warranted, when packages get too big and 
poorly defined, and we have other reasons to change public APIs.  For example, 
I would like to someday reorganize mapred into several sub-packages (e.g., 
client, protocol, tasktracker, jobtracker), rename org.apache.hadoop.dfs to 
org.apache.hadoop.fs.hdfs, make the util package smaller, etc., but we 
don't want to rush into such changes.

In summary, I still fail to see an overwhelming argument for making 
org.apache.hadoop.record independent of org.apache.hadoop.io.  What am I 
missing?

> Make Record I/O functionally modular from the rest of Hadoop
> ------------------------------------------------------------
>
>                 Key: HADOOP-1053
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1053
>             Project: Hadoop
>          Issue Type: Improvement
>          Components: record
>    Affects Versions: 0.11.2
>         Environment: All
>            Reporter: Milind Bhandarkar
>         Assigned To: Milind Bhandarkar
>             Fix For: 0.13.0
>
>         Attachments: jute-patch.txt
>
>
> This issue has been created to separate one proposal originally included in 
> HADOOP-941, for which no consensus could be reached. For earlier discussion 
> about the issue, please see HADOOP-941.
> I will summarize the proposal here.  We need to provide a way for users to 
> use the record I/O framework outside of Hadoop.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
