[jira] Commented: (HADOOP-3315) New binary file format

stack (JIRA) Thu, 29 Jan 2009 12:28:24 -0800

    [ 
https://issues.apache.org/jira/browse/HADOOP-3315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12668588#action_12668588
 ]


stack commented on HADOOP-3315:
-------------------------------

Hmm.  Looking at random accesses, it seems like a bunch of time is spent in 
inBlockAdvance, advancing sequentially through blocks rather than doing 
something like a binary search to find the desired block location.  Also, as 
we advance, we create and destroy a bunch of objects, such as the stream that 
holds the value.  Can you comment on why this is?  Compression should be on 
tfile block boundaries, right, so there should be nothing to stop us hopping 
into the midst of a tfile.  
Thanks.
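
To make the binary search suggestion concrete, here is a rough sketch.  It 
assumes the reader keeps an in-memory, sorted array holding the first key of 
each block; that array, the class name BlockIndexSearch and the method 
findBlock are all hypothetical and are not taken from the TFile patch.  It 
only illustrates the lookup, not how TFile actually lays out its index.

```java
import java.util.Arrays;

public class BlockIndexSearch {
    /**
     * Hypothetical helper: given the first key of every block in ascending
     * order, return the index of the only block that could hold the key,
     * or -1 if the key sorts before the first block.
     */
    public static int findBlock(String[] firstKeys, String key) {
        int pos = Arrays.binarySearch(firstKeys, key);
        if (pos >= 0) {
            return pos;                 // key is exactly the first key of a block
        }
        int insertionPoint = -pos - 1;  // index of the first block key greater than key
        return insertionPoint - 1;      // so the candidate is the block just before it
    }

    public static void main(String[] args) {
        // Assumed sample index: one entry per block.
        String[] firstKeys = {"apple", "grape", "melon", "peach"};
        System.out.println(findBlock(firstKeys, "kiwi"));
    }
}
```

With an index like this, a random read would cost O(log n) comparisons to 
pick the block, and only that one block would need to be decompressed and 
scanned.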

> New binary file format
> ----------------------
>
>                 Key: HADOOP-3315
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3315
>             Project: Hadoop Core
>          Issue Type: New Feature
>          Components: io
>            Reporter: Owen O'Malley
>            Assignee: Amir Youssefi
>             Fix For: 0.21.0
>
>         Attachments: HADOOP-3315_20080908_TFILE_PREVIEW_WITH_LZO_TESTS.patch, 
> HADOOP-3315_20080915_TFILE.patch, hadoop-trunk-tfile.patch, 
> hadoop-trunk-tfile.patch, TFile Specification 20081217.pdf
>
>
> SequenceFile's block compression format is too complex and requires 4 codecs 
> to compress or decompress. It would be good to have a file format that only 
> needs 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
