[ 
https://issues.apache.org/jira/browse/HADOOP-3315?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hong Tang updated HADOOP-3315:
------------------------------

    Attachment: TFile Specification Final.pdf

The TFile specification.

TFile is an immutable <Key, Value> storage format meant to replace 
SequenceFile. Compared with SequenceFile, TFile provides block compression, 
sorted and unsorted keys, block indexing (for sorted TFiles only), scan 
(bulk read), and application-level metadata. The data abstraction and 
storage representation are language-neutral, which allows implementations 
in various languages (although this specification only describes the Java 
API).

> New binary file format
> ----------------------
>
>                 Key: HADOOP-3315
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3315
>             Project: Hadoop Core
>          Issue Type: New Feature
>          Components: io
>            Reporter: Owen O'Malley
>            Assignee: Amir Youssefi
>         Attachments: HADOOP-3315_TFILE_PREVIEW.patch, 
> HADOOP-3315_TFILE_PREVIEW_WITH_LZO_TESTS.patch, TFile Specification Final.pdf
>
>
> SequenceFile's block compression format is too complex and requires 4 codecs 
> to compress or decompress. It would be good to have a file format that only 
> needs 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
