[ 
https://issues.apache.org/jira/browse/HBASE-61?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-61:
-----------------------

    Attachment: hfile3.patch

Latest version of the hfile patch.  Scanners work properly now and the API is 
stripped down.  It turns out we do need the SimpleBufferedInputStream between 
tfile and DFSInputStream, just with a smaller buffer size, for the sake of 
increased concurrency.  We also need to change how we read so that we pull in 
the whole block rather than reading it piecemeal as tfile currently does.  
tfile is block-based, but its reads on the backing stream do not pull in whole 
blocks; it reads only what's needed.  Because it reads and decompresses just 
the part it needs, it can be faster in certain circumstances, but it also means 
there is never a whole block to cache, which frustrates caching on a block 
basis or, more importantly, caching decompressed blocks.
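
To illustrate the whole-block idea, here is a minimal sketch of a reader that 
decompresses an entire block on first access and keeps the decompressed bytes 
in a small LRU cache.  It is not taken from the hfile or tfile patches: the 
class name WholeBlockReadSketch, the BlockCache helper, the in-memory array 
standing in for the backing stream, and the choice of gzip are all assumptions 
made for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical sketch; none of these names come from the hfile or tfile patches.
class WholeBlockReadSketch {

    /** Tiny LRU cache of decompressed blocks, keyed by block number. */
    static final class BlockCache extends LinkedHashMap<Integer, byte[]> {
        private final int maxBlocks;

        BlockCache(int maxBlocks) {
            super(16, 0.75f, true); // access order, so the eldest entry is the LRU one
            this.maxBlocks = maxBlocks;
        }

        @Override
        protected boolean removeEldestEntry(Map.Entry<Integer, byte[]> eldest) {
            return size() > maxBlocks;
        }
    }

    // Stands in for compressed blocks on the backing stream (assumption).
    private final byte[][] compressedBlocks;
    private final BlockCache cache;
    int decompressions = 0; // how many times a block was actually inflated

    WholeBlockReadSketch(byte[][] compressedBlocks, int cacheSize) {
        this.compressedBlocks = compressedBlocks;
        this.cache = new BlockCache(cacheSize);
    }

    /** Returns the whole decompressed block, inflating it only on a cache miss. */
    byte[] getBlock(int blockNumber) {
        byte[] block = cache.get(blockNumber);
        if (block == null) {
            block = gunzip(compressedBlocks[blockNumber]);
            decompressions++;
            cache.put(blockNumber, block);
        }
        return block;
    }

    /** Compresses one block; used here only to build sample input. */
    static byte[] gzip(byte[] raw) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(out)) {
            gz.write(raw);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    /** Decompresses a whole block in one pass. */
    static byte[] gunzip(byte[] compressed) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPInputStream gz =
                 new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            byte[] buf = new byte[4096];
            int n;
            while ((n = gz.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }
}
```

The tradeoff is the one described above: a single small read pays for 
inflating the full block, but repeated reads against the same block are served 
from the cache without touching the backing stream or the decompressor.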

I'd work on this next, but I have been chatting with Ryan Rawson over the last 
few days and he just sent me his rfile patch.  Going to help out on that effort 
for a while.

> [hbase] Create an HBase-specific MapFile implementation
> -------------------------------------------------------
>
>                 Key: HBASE-61
>                 URL: https://issues.apache.org/jira/browse/HBASE-61
>             Project: Hadoop HBase
>          Issue Type: Improvement
>          Components: io
>            Reporter: Bryan Duxbury
>            Assignee: stack
>            Priority: Minor
>             Fix For: 0.20.0
>
>         Attachments: cpucalltreetfile.html, hfile.patch, hfile2.patch, 
> hfile3.patch, longestkey.patch, tfile.patch, tfile3.patch
>
>
> Today, HBase uses the Hadoop MapFile class to store data persistently to 
> disk. This is convenient, as it's already done (and maintained by other 
> people :). However, it's beginning to look like there might be possible 
> performance benefits to be had from doing an HBase-specific implementation of 
> MapFile that incorporated some precise features.
> This issue should serve as a place to track discussion about what features 
> might be included in such an implementation.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
