Sergey Shelukhin created HBASE-10418:
----------------------------------------
Summary: give blocks of smaller store files priority in cache
Key: HBASE-10418
URL: https://issues.apache.org/jira/browse/HBASE-10418
Project: HBase
Issue Type: Improvement
Components: regionserver
Reporter: Sergey Shelukhin
This is just an idea at this point; I don't have a patch, nor do I plan to make
one in the near future.
It would help most with datasets that don't fit in memory, especially when
scans are involved.
Scans (and gets, when bloom filters don't help) have to read from all store
files. A short-range request will hit one block in every file.
If small files are more likely to be entirely in memory, requests will on
average hit fewer blocks from the FS.
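
To make the idea concrete, here is a minimal sketch in Python (not a patch, and
not HBase's actual BlockCache API). The class SmallFileFirstCache and its
methods are hypothetical names chosen for illustration. It assumes the cache
knows the size of the store file each block came from, and that capacity is
counted in blocks. On eviction it drops a block of the largest file first,
falling back to least-recently-used order among blocks of equally large files.

```python
from collections import OrderedDict


class SmallFileFirstCache:
    """Toy block cache that prefers to keep blocks of smaller store files.

    Hypothetical illustration only; names and structure are assumptions.
    """

    def __init__(self, capacity_blocks):
        self.capacity = capacity_blocks
        # (file_name, block_index) -> size of the owning file in bytes.
        # Insertion order doubles as access order: oldest access comes first.
        self.blocks = OrderedDict()

    def get(self, file_name, block_index):
        """Return True on a cache hit and mark the block as recently used."""
        key = (file_name, block_index)
        if key in self.blocks:
            self.blocks.move_to_end(key)
            return True
        return False

    def put(self, file_name, block_index, file_size):
        """Cache a block, evicting until the cache is within capacity."""
        key = (file_name, block_index)
        self.blocks[key] = file_size
        self.blocks.move_to_end(key)
        while len(self.blocks) > self.capacity:
            self._evict_one()

    def _evict_one(self):
        # max() returns the first maximal key in iteration order, so among
        # blocks of the largest file the least recently used one is dropped.
        victim = max(self.blocks, key=lambda k: self.blocks[k])
        del self.blocks[victim]


cache = SmallFileFirstCache(capacity_blocks=3)
cache.put("small", 0, file_size=10)
cache.put("big", 0, file_size=1000)
cache.put("big", 1, file_size=1000)
cache.put("small", 1, file_size=10)  # over capacity: a "big" block is evicted
```

After the last put, both blocks of the small file are still cached and the
oldest block of the big file is gone. A real policy would probably blend file
size with recency rather than rank by size alone, since strict ranking lets a
cold small file pin cache space indefinitely.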
For scans that read a lot of data, it's better to read blocks sequentially from
a big file and get the blocks of small files from cache, rather than read a mix
of FS and cached blocks from different files, because the (HBase) blocks of a
big file would be sequential within one HDFS block.
--
This message was sent by Atlassian JIRA
(v6.1.5#6160)