[ 
https://issues.apache.org/jira/browse/HADOOP-3315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12634028#action_12634028
 ] 

stack commented on HADOOP-3315:
-------------------------------

@Hong.  Yeah.  You got all my questions.  Here are some follow-on comments.

It would be great if you could figure out some way of avoiding the double 
decompress when doing a single random access.  Yeah on the name of the 
alternate comparator, etc.

On "indexing the region by the endKey", pardon me.  I thought ceiling was a 
Scanner-only method.  I see now what you are suggesting, and that could almost 
work (leaving aside the need to rewrite a bunch of ornery hbase code to make 
the change).

But what if the following in TFile.Reader:

{code}
    private final MetaBlockTFileDataIndex metaTFileDataIndex;
    private final MetaBlockTFileMeta metaTFileMeta;
{code}

...had protected accessors so that they were accessible by a subclass?  Then a 
subclass could do its own implementation of ceiling that returns the closest 
key before rather than the closest key after, as currently implemented.  To do 
this, I'd replicate the body of ceiling; I would need the accessors so I could 
make the following call in my subclass:

{code}
      int blkIndex = metaTFileDataIndex.lowerBound(inKey);
{code}

and if searchInBlock were also protected instead of private, a subclass could 
ask it to find the key equal to or before rather than equal to or after?
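
To make the "closest before" versus "closest after" distinction concrete, here 
is a rough, self-contained sketch.  It is not TFile code: a plain sorted list 
of Strings stands in for the index and the block contents, and the class name 
FloorLookupSketch and its lowerBound, ceiling and floor methods are all made 
up for illustration.  It only shows how both lookups could share one 
lower-bound search and differ in the final step.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; none of these names come from TFile.
class FloorLookupSketch {

    // Index of the first element >= key, or keys.size() if all are smaller.
    // Assumes keys is sorted in ascending order.
    static int lowerBound(List<String> keys, String key) {
        int lo = 0;
        int hi = keys.size();
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;
            if (keys.get(mid).compareTo(key) < 0) {
                lo = mid + 1;
            } else {
                hi = mid;
            }
        }
        return lo;
    }

    // Closest key equal to or after the given key, or null if there is none.
    static String ceiling(List<String> keys, String key) {
        int i = lowerBound(keys, key);
        return i < keys.size() ? keys.get(i) : null;
    }

    // Closest key equal to or before the given key, or null if there is none.
    static String floor(List<String> keys, String key) {
        int i = lowerBound(keys, key);
        if (i < keys.size() && keys.get(i).equals(key)) {
            return keys.get(i); // exact match
        }
        return i > 0 ? keys.get(i - 1) : null; // step back one entry
    }

    public static void main(String[] args) {
        List<String> keys = Arrays.asList("b", "d", "f");
        System.out.println(ceiling(keys, "e")); // f
        System.out.println(floor(keys, "e"));   // d
    }
}
```

In a real subclass the step back could cross a block boundary, so the previous 
block might have to be consulted as well; the sketch ignores that.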





> New binary file format
> ----------------------
>
>                 Key: HADOOP-3315
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3315
>             Project: Hadoop Core
>          Issue Type: New Feature
>          Components: io
>            Reporter: Owen O'Malley
>            Assignee: Amir Youssefi
>         Attachments: HADOOP-3315_20080908_TFILE_PREVIEW_WITH_LZO_TESTS.patch, 
> HADOOP-3315_20080915_TFILE.patch, TFile Specification Final.pdf
>
>
> SequenceFile's block compression format is too complex and requires 4 codecs 
> to compress or decompress. It would be good to have a file format that only 
> needs 
