[ https://issues.apache.org/jira/browse/HADOOP-3315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12602132#action_12602132 ]

Owen O'Malley commented on HADOOP-3315:
---------------------------------------

Srikanth, I don't understand your concern. When the user calls append(long, 
long), the writer can decide whether or not to start a new block based on the 
declared lengths. Then, as the client calls write(byte[], int, int) on the 
output stream, the bytes can be written directly to the file stream or to the 
codec's ByteBuffer. For codecs like lzo, a write may be broken into multiple 
calls to handle the required chunking.
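
To make that concrete, here is a rough sketch of the kind of writer I have in 
mind. The class name, the block-size policy, and the bookkeeping in 
finishBlock() are my own illustrations, not the actual TFile patch:

    import java.io.IOException;
    import java.io.OutputStream;

    // Illustrative sketch only: names and the block-boundary policy are
    // assumptions, not the proposed implementation.
    public class BlockWriter {
      private final OutputStream out;   // file stream, or codec's stream
      private final long maxBlockSize;  // uncompressed bytes per block
      private long bytesInBlock = 0;

      public BlockWriter(OutputStream out, long maxBlockSize) {
        this.out = out;
        this.maxBlockSize = maxBlockSize;
      }

      /**
       * The caller declares the key and value lengths up front, so the
       * writer can decide here, before any record bytes arrive, whether
       * to close the current block and start a new one.
       */
      public OutputStream append(long keyLength, long valueLength)
          throws IOException {
        long recordLength = keyLength + valueLength;
        if (bytesInBlock > 0 && bytesInBlock + recordLength > maxBlockSize) {
          finishBlock();
        }
        bytesInBlock += recordLength;
        // Subsequent write(byte[], int, int) calls on the returned stream
        // go straight to the file stream or the codec's buffer; a codec
        // like lzo may split each write internally for its chunking.
        return out;
      }

      private void finishBlock() throws IOException {
        out.flush();   // placeholder for per-block index bookkeeping
        bytesInBlock = 0;
      }
    }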

And yes, to make this efficient, you need to be able to get the serialized 
length of the objects. 
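
For instance, the serialization layer could expose something along these 
lines (a hypothetical interface; Writable has no such method today, so it 
would have to be added):

    import java.io.DataOutput;
    import java.io.IOException;

    // Hypothetical interface, shown only to illustrate the requirement.
    public interface LengthAwareWritable {
      /** Bytes that write(out) will produce, computed without serializing. */
      long getSerializedLength();

      void write(DataOutput out) throws IOException;
    }

With that, the caller can invoke append(key.getSerializedLength(), 
value.getSerializedLength()) before serializing either object.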

> New binary file format
> ----------------------
>
>                 Key: HADOOP-3315
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3315
>             Project: Hadoop Core
>          Issue Type: New Feature
>          Components: io
>            Reporter: Owen O'Malley
>            Assignee: Srikanth Kakani
>         Attachments: Tfile-1.pdf, TFile-2.pdf
>
>
> SequenceFile's block compression format is too complex and requires 4 codecs 
> to compress or decompress. It would be good to have a file format that only 
> needs 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
