[
https://issues.apache.org/jira/browse/HADOOP-3315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12633980#action_12633980
]
Hong Tang commented on HADOOP-3315:
-----------------------------------
bq. On "indexing the region by the endKey", pardon me, I'm not sure I follow.
Currently index is block-based, not key-based IIUC so can I even make an index
that has all keys? Or, can you make an index that is key-based? (Even if I
could index all keys, if key/values are small, might make for a big index so
might need something like the MapFile interval).
Sorry for the confusion. My comment was in response to your question about whether
TFile can support something similar to MapFile's getClosest() call. The answer
is that we cannot implement those semantics efficiently: the API would require a
bidirectional iterator, and the underlying decompression stream can only be read
forward.
My understanding of your use case is that you currently have a MapFile whose
key is <region startKey> and whose value may contain <region endKey, ...>. Given a
client key, you perform getClosest(before==true) to get to the right region
entry in the MapFile. To support this use case in TFile, you can use <region
endKey> as the TFile key and <region startKey, ...> as the TFile value. Then
TFile.Reader.ceiling(clientKey) will get you to the right entry.
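
To illustrate why the two layouts find the same region, here is a minimal sketch
that uses java.util.TreeMap as a stand-in for both files; it does not use the
MapFile or TFile APIs. The class name, the region names, and the integer key
ranges are all hypothetical, and the sketch assumes each region's endKey is
inclusive. If endKey were exclusive, a strictly-greater lookup
(TreeMap.higherEntry) would be the closer analogue.

```java
import java.util.Map;
import java.util.TreeMap;

public class RegionLookup {
    // MapFile-style layout: key = region startKey, value = region name.
    static final TreeMap<Integer, String> BY_START = new TreeMap<>();
    // Layout suggested above: key = region endKey, value = region name.
    static final TreeMap<Integer, String> BY_END = new TreeMap<>();

    static {
        // Hypothetical regions with inclusive bounds: [0,9], [10,19], [20,29].
        addRegion(0, 9, "region-1");
        addRegion(10, 19, "region-2");
        addRegion(20, 29, "region-3");
    }

    static void addRegion(int startKey, int endKey, String name) {
        BY_START.put(startKey, name);
        BY_END.put(endKey, name);
    }

    // Analogue of getClosest(before==true): greatest startKey <= clientKey.
    // This needs a "step back" when clientKey falls between two keys.
    static String lookupByStartKey(int clientKey) {
        Map.Entry<Integer, String> e = BY_START.floorEntry(clientKey);
        return e == null ? null : e.getValue();
    }

    // Analogue of ceiling(clientKey): smallest endKey >= clientKey.
    // A forward-only scan can answer this by stopping at the first key
    // that is not less than clientKey.
    static String lookupByEndKey(int clientKey) {
        Map.Entry<Integer, String> e = BY_END.ceilingEntry(clientKey);
        return e == null ? null : e.getValue();
    }

    public static void main(String[] args) {
        for (int k : new int[] {0, 15, 19, 20}) {
            System.out.println(k + " -> " + lookupByStartKey(k)
                    + " / " + lookupByEndKey(k));
        }
    }
}
```

For every client key inside a region, both lookups return the same region name,
but only the endKey layout can be served by a reader that moves forward through
the keys.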
> New binary file format
> ----------------------
>
> Key: HADOOP-3315
> URL: https://issues.apache.org/jira/browse/HADOOP-3315
> Project: Hadoop Core
> Issue Type: New Feature
> Components: io
> Reporter: Owen O'Malley
> Assignee: Amir Youssefi
> Attachments: HADOOP-3315_20080908_TFILE_PREVIEW_WITH_LZO_TESTS.patch,
> HADOOP-3315_20080915_TFILE.patch, TFile Specification Final.pdf
>
>
> SequenceFile's block compression format is too complex and requires 4 codecs
> to compress or decompress. It would be good to have a file format that only
> needs