[ 
https://issues.apache.org/jira/browse/HBASE-5001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13167284#comment-13167284
 ] 

Lars Hofhansl commented on HBASE-5001:
--------------------------------------

* Bytes.add(hfileNameInBytes, Bytes.toBytes(offset)) -> 0.07us

But a byte[] cannot be used directly as a key in a map, no? It would need to be 
wrapped in a HashedBytes, so:
* new HashedBytes(Bytes.add(x, Bytes.toBytes(offset))); -> 0.08us

Which brought me to a new idea: what if we have a CacheKey object that takes a 
String and a long:
* new CacheKey(hfileName, offset) -> 0.01us

That would be the cleanest design anyway. CacheKey would implement proper 
equals and hashCode methods.
The LruCache could then just take a CacheKey (or even just java.lang.Object) as 
the cache key, so that we can pass in whatever we want.
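A minimal sketch of what such a CacheKey could look like (hypothetical; field 
names and the hash mixing are my own, not the actual HBase implementation):

```java
// Hypothetical sketch: an immutable cache key pairing the hfile name with
// the block offset, avoiding the string concatenation on every lookup.
final class CacheKey {
    private final String hfileName;
    private final long offset;

    CacheKey(String hfileName, long offset) {
        this.hfileName = hfileName;
        this.offset = offset;
    }

    @Override
    public int hashCode() {
        // Combine the name's hash with both halves of the offset.
        return hfileName.hashCode() * 127 + (int) (offset ^ (offset >>> 32));
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof CacheKey)) return false;
        CacheKey other = (CacheKey) o;
        return offset == other.offset && hfileName.equals(other.hfileName);
    }
}
```

Since equals and hashCode are defined over both fields, two keys built from the 
same name and offset hit the same map entry with no per-lookup allocation 
beyond the small key object itself.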

                
> Improve the performance of block cache keys
> -------------------------------------------
>
>                 Key: HBASE-5001
>                 URL: https://issues.apache.org/jira/browse/HBASE-5001
>             Project: HBase
>          Issue Type: Improvement
>    Affects Versions: 0.90.4
>            Reporter: Jean-Daniel Cryans
>            Priority: Minor
>             Fix For: 0.94.0
>
>
> Doing a pure random read test on data that's 100% in the block cache, I see 
> that we are spending quite some time in getBlockCacheKey:
> {quote}
> "IPC Server handler 19 on 62023" daemon prio=10 tid=0x00007fe0501ff800 
> nid=0x6c87 runnable [0x00007fe0577f6000]
>    java.lang.Thread.State: RUNNABLE
>       at java.util.Arrays.copyOf(Arrays.java:2882)
>       at 
> java.lang.AbstractStringBuilder.expandCapacity(AbstractStringBuilder.java:100)
>       at 
> java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:390)
>       at java.lang.StringBuilder.append(StringBuilder.java:119)
>       at 
> org.apache.hadoop.hbase.io.hfile.HFile.getBlockCacheKey(HFile.java:457)
>       at 
> org.apache.hadoop.hbase.io.hfile.HFileReaderV2.readBlock(HFileReaderV2.java:249)
>       at 
> org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$BlockIndexReader.seekToDataBlock(HFileBlockIndex.java:209)
>       at 
> org.apache.hadoop.hbase.io.hfile.HFileReaderV2$ScannerV2.seekTo(HFileReaderV2.java:521)
>       at 
> org.apache.hadoop.hbase.io.hfile.HFileReaderV2$ScannerV2.seekTo(HFileReaderV2.java:536)
>       at 
> org.apache.hadoop.hbase.regionserver.StoreFileScanner.seekAtOrAfter(StoreFileScanner.java:178)
>       at 
> org.apache.hadoop.hbase.regionserver.StoreFileScanner.seek(StoreFileScanner.java:111)
>       at 
> org.apache.hadoop.hbase.regionserver.StoreFileScanner.seekExactly(StoreFileScanner.java:219)
>       at 
> org.apache.hadoop.hbase.regionserver.StoreScanner.<init>(StoreScanner.java:80)
>       at 
> org.apache.hadoop.hbase.regionserver.Store.getScanner(Store.java:1689)
>       at 
> org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.<init>(HRegion.java:2857)
> {quote}
> Since the HFile name size is known and the offset is a long, it should be 
> possible to allocate exactly what we need. Maybe use byte[] as the key and 
> drop the separator too.
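The exact-size allocation the issue suggests could be sketched like this 
(hypothetical helper; name and use of ByteBuffer are my own, assuming the key 
is the hfile name bytes followed directly by the 8-byte offset, no separator):

```java
import java.nio.ByteBuffer;

// Hypothetical sketch: build the block cache key as a byte[] of exactly
// name-length + 8 bytes, instead of concatenating strings.
final class BlockCacheKeys {
    static byte[] blockCacheKey(byte[] hfileNameBytes, long offset) {
        return ByteBuffer.allocate(hfileNameBytes.length + Long.BYTES)
                .put(hfileNameBytes)   // hfile name, verbatim
                .putLong(offset)       // 8-byte big-endian offset
                .array();
    }
}
```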

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
