[
https://issues.apache.org/jira/browse/HBASE-61?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
stack updated HBASE-61:
-----------------------
Attachment: hfile3.patch
Latest version of the hfile patch. Scanners work properly now, and the API is
stripped down. Actually, we do need the SimpleBufferedInputStream between tfile
and DFSInputStream, just with a smaller buffer size, for the sake of increased
concurrency. We also need to change how we read so that we pull in the whole
block rather than reading it piecemeal, as tfile currently does. The tfile is
block based, but reads on the backing stream do not pull in whole blocks; they
read only what's needed. Because we read and decompress only the part we need,
this can be faster in certain circumstances, but it leaves no whole block to
cache, which frustrates caching on a block basis and, more importantly, caching
decompressed blocks.
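
A rough sketch of the whole-block read described above, not code from the hfile
or tfile patches. Every name here (WholeBlockReader, getBlock, backingReads) is
hypothetical, a byte array stands in for the DFS stream, and gzip stands in for
whatever codec the file actually uses. The idea is only that each miss reads and
decompresses one entire block, so there is a complete decompressed block to put
in an LRU cache:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

/** Hypothetical illustration only; not part of any HBase or Hadoop API. */
class WholeBlockReader {
    private final byte[] file;      // stands in for the backing DFS stream
    private final int[] offsets;    // start offset of each compressed block
    private final int[] lengths;    // compressed length of each block
    private final Map<Integer, byte[]> cache; // decompressed blocks, LRU order
    int backingReads = 0;           // how many times we went to the backing store

    WholeBlockReader(byte[] file, int[] offsets, int[] lengths, final int maxCachedBlocks) {
        this.file = file;
        this.offsets = offsets;
        this.lengths = lengths;
        // accessOrder=true makes the eldest entry the least recently used one.
        this.cache = new LinkedHashMap<Integer, byte[]>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<Integer, byte[]> eldest) {
                return size() > maxCachedBlocks;
            }
        };
    }

    /** Returns the decompressed block, reading the whole block on a cache miss. */
    byte[] getBlock(int index) {
        byte[] cached = cache.get(index);
        if (cached != null) {
            return cached;
        }
        // One read for the entire compressed block, not just the bytes needed.
        byte[] compressed =
            Arrays.copyOfRange(file, offsets[index], offsets[index] + lengths[index]);
        backingReads++;
        byte[] plain = gunzip(compressed);
        cache.put(index, plain);
        return plain;
    }

    static byte[] gzip(byte[] plain) {
        try {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            try (GZIPOutputStream gz = new GZIPOutputStream(out)) {
                gz.write(plain);
            }
            return out.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    static byte[] gunzip(byte[] compressed) {
        try (GZIPInputStream gz = new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[4096];
            int n;
            while ((n = gz.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
            return out.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The tradeoff is the one noted above: a lookup that touches only a few bytes
still pays to read and decompress the full block, but repeated lookups in the
same block are then served from the cache without going back to the stream.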
I'd work on this next, but I have been chatting with Ryan Rawson over the last
few days and he just sent me his rfile patch. I'm going to help out on that
effort for a while.
> [hbase] Create an HBase-specific MapFile implementation
> -------------------------------------------------------
>
> Key: HBASE-61
> URL: https://issues.apache.org/jira/browse/HBASE-61
> Project: Hadoop HBase
> Issue Type: Improvement
> Components: io
> Reporter: Bryan Duxbury
> Assignee: stack
> Priority: Minor
> Fix For: 0.20.0
>
> Attachments: cpucalltreetfile.html, hfile.patch, hfile2.patch,
> hfile3.patch, longestkey.patch, tfile.patch, tfile3.patch
>
>
> Today, HBase uses the Hadoop MapFile class to store data persistently to
> disk. This is convenient, as it's already done (and maintained by other
> people :). However, it's beginning to look like there might be possible
> performance benefits to be had from doing an HBase-specific implementation of
> MapFile that incorporated some precise features.
> This issue should serve as a place to track discussion about what features
> might be included in such an implementation.