[jira] Commented: (HADOOP-3315) New binary file format

stack (JIRA) Mon, 12 May 2008 12:17:19 -0700

    [ 
https://issues.apache.org/jira/browse/HADOOP-3315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12596169#action_12596169
 ]


stack commented on HADOOP-3315:
-------------------------------

This proposal looks great.  Here are a couple of comments:

In 'Goals', it says 'support all kinds of columns'.  Do you mean all column 
data types?  It also says 'support seek to key and seek to row'.  What is the 
difference between a key and a row?

In the description of the blockidx, it says 'Sparse index of keys into the 
datablocks'.  What does this mean?  Will the key at the start of each block 
be in the block index, and only that key?  Or will the index also have entries 
for keys from the middle of blocks?
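
To make the question concrete, here is a minimal sketch of the first reading, 
where the index holds only the first key of each block.  It is in Python for 
brevity, and the names (build_sparse_index, find_block) and the sample data 
are made up for illustration; nothing here is taken from the TFile proposal.

```python
from bisect import bisect_right

def build_sparse_index(blocks):
    """Return the first key of every data block (one index entry per block)."""
    return [block[0][0] for block in blocks]

def find_block(index, key):
    """Return the number of the only block that could hold `key`.

    Returns -1 when `key` sorts before the first key of the first block.
    """
    return bisect_right(index, key) - 1

# Hypothetical data: three blocks of sorted (key, value) pairs.
blocks = [
    [("apple", 1), ("banana", 2)],
    [("cherry", 3), ("date", 4)],
    [("fig", 5), ("grape", 6)],
]
index = build_sparse_index(blocks)
block_no = find_block(index, "date")
```

Under this reading a lookup still has to scan inside the chosen block.  If keys 
from the middle of blocks were indexed too, that scan would be shorter at the 
cost of a larger index.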

Does the metadata value have to be a String?  It looks like it doesn't have to 
be -- that I can specify my own keyClass and valClass.  For example, I would 
like to be able to write a bloom filter into the metadata.
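
As an illustration of the bloom filter case, here is a rough sketch of a 
filter that serializes to plain bytes, which is the kind of non-String value 
I would want to store.  It is in Python for brevity; the BloomFilter class, 
its parameters, and the "bloom" metadata key are all hypothetical and not part 
of the proposal.

```python
import hashlib

class BloomFilter:
    """Tiny bloom filter; num_bits is assumed to be a multiple of 8."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits // 8)

    def _positions(self, key):
        # Derive num_hashes bit positions by salting SHA-256 with a counter.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(bytes([i]) + key).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, key):
        # False means definitely absent; True means possibly present.
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(key))

    def to_bytes(self):
        return bytes(self.bits)

    @classmethod
    def from_bytes(cls, raw, num_hashes=3):
        restored = cls(num_bits=len(raw) * 8, num_hashes=num_hashes)
        restored.bits = bytearray(raw)
        return restored

# Writer side: the filter becomes an opaque byte value in the metadata.
bf = BloomFilter()
bf.add(b"row-1")
metadata = {"bloom": bf.to_bytes()}  # stand-in for the file's metadata block

# Reader side: rebuild the filter from the stored bytes.
restored = BloomFilter.from_bytes(metadata["bloom"])
```

The point is only that the metadata value is binary, so a String-only value 
type would force an extra encoding step.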

It's not clear that users can add their own metadata to imeta.  You might 
state this explicitly.

Section 3.2 where you describe two different kinds of index is a little 
confusing (I'm not clear on RO vs. Key as per above).

In the Writer API, you state that a null key class is for a keyless column.  
What does a null value class imply?

Is the Writer API missing metadata writing?  Same for reading.

Reading talks about rowids but the writer does not.  Is this intentional?

For the reader API, could you expose methods for getting the key only, without 
reading the value?

> New binary file format
> ----------------------
>
>                 Key: HADOOP-3315
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3315
>             Project: Hadoop Core
>          Issue Type: New Feature
>          Components: io
>            Reporter: Owen O'Malley
>            Assignee: Srikanth Kakani
>         Attachments: Tfile-1.pdf
>
>
> SequenceFile's block compression format is too complex and requires 4 codecs 
> to compress or decompress. It would be good to have a file format that only 
> needs 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
