[
https://issues.apache.org/jira/browse/HADOOP-3315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12634285#action_12634285
]
Hong Tang commented on HADOOP-3315:
-----------------------------------
bq. Sorry if this need seems exotic, but I think we can get away with casting
this need under the 'Extensibility' TFile Design Principle. In our application,
keys are row/column/timestamp. If there are millions of columns in a row and we
want to skip to the next row, we can't next-next-next through the keys. It'll be
too slow. We need to skip ahead to the new row. The block index won't help in
this regard.
Yes, it sounds reasonable to make the various indices in TFile protected
instead of private.
Just curious: would your auxiliary index remember how many records start with
the same row key, so that you could use that count to advance quickly? If so, a
better approach than opening up advanceCursorInBlock() would be to provide an
advanceCursor(n) API on the Scanner.
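
To make the suggestion concrete, here is a rough sketch of how a per-row record
count could drive an advanceCursor(n) call. Everything in it is an assumption
for illustration: ToyScanner is a toy in-memory stand-in and not the TFile
Scanner, buildRowCounts() stands in for whatever auxiliary index the
application keeps, and keys are assumed to be "row/column/timestamp" strings
already in sorted order.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch; none of these names are part of TFile. */
public class RowSkipSketch {

    /** Toy stand-in for a scanner over sorted keys (not the TFile Scanner). */
    static final class ToyScanner {
        private final List<String> keys;
        private int pos = 0;

        ToyScanner(List<String> sortedKeys) { this.keys = sortedKeys; }

        boolean atEnd() { return pos >= keys.size(); }

        String key() { return keys.get(pos); }

        /** Move forward by n records in one step, clamping at the end. */
        void advanceCursor(long n) {
            if (n < 0) throw new IllegalArgumentException("n must be >= 0");
            pos = (int) Math.min((long) keys.size(), pos + n);
        }
    }

    /** Extracts the row part of a "row/column/timestamp" key. */
    static String rowOf(String key) {
        int slash = key.indexOf('/');
        return slash < 0 ? key : key.substring(0, slash);
    }

    /** Assumed auxiliary index: row key -> number of records in that row. */
    static Map<String, Long> buildRowCounts(List<String> sortedKeys) {
        Map<String, Long> counts = new LinkedHashMap<>();
        for (String k : sortedKeys) {
            counts.merge(rowOf(k), 1L, Long::sum);
        }
        return counts;
    }

    /** Visits the first key of every row, skipping the rest via the counts. */
    static List<String> firstKeyOfEachRow(List<String> sortedKeys) {
        Map<String, Long> counts = buildRowCounts(sortedKeys);
        ToyScanner scanner = new ToyScanner(sortedKeys);
        List<String> firsts = new ArrayList<>();
        while (!scanner.atEnd()) {
            String key = scanner.key();
            firsts.add(key);
            // The scanner sits on the first record of a row here, so jumping
            // by the row's record count lands on the first record of the next.
            scanner.advanceCursor(counts.get(rowOf(key)));
        }
        return firsts;
    }

    public static void main(String[] args) {
        List<String> keys = Arrays.asList(
            "r1/a/3", "r1/a/1", "r1/b/2", "r2/a/5", "r3/a/1", "r3/c/9");
        System.out.println(firstKeyOfEachRow(keys));
    }
}
```

The point of the count-based call is that the caller never has to know where
block boundaries fall; how a real Scanner would implement the jump across
blocks is left open here.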
> New binary file format
> ----------------------
>
> Key: HADOOP-3315
> URL: https://issues.apache.org/jira/browse/HADOOP-3315
> Project: Hadoop Core
> Issue Type: New Feature
> Components: io
> Reporter: Owen O'Malley
> Assignee: Amir Youssefi
> Attachments: HADOOP-3315_20080908_TFILE_PREVIEW_WITH_LZO_TESTS.patch,
> HADOOP-3315_20080915_TFILE.patch, TFile Specification Final.pdf
>
>
> SequenceFile's block compression format is too complex and requires 4 codecs
> to compress or decompress. It would be good to have a file format that only
> needs