[
https://issues.apache.org/jira/browse/HADOOP-1398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12561327#action_12561327
]
Tom White commented on HADOOP-1398:
-----------------------------------
I ran some benchmarks of PerformanceEvaluation with and without block caching
enabled. The setup was similar to that described in
http://wiki.apache.org/hadoop/Hbase/PerformanceEvaluation, with three machines
on EC2: one running the namenode and HBase master, one running a datanode and a
region server, and one running a datanode and the PerformanceEvaluation program.
Number of operations per second:
||Experiment||Block cache disabled||Block cache enabled||
|sequential reads|119|182|
|random reads|110|123|
I've seen quite a lot of variation in the results of PerformanceEvaluation, so
I'm reluctant to read too much into these figures. But I think we can say that
the block cache doesn't seem to slow down the system.
> Add in-memory caching of data
> -----------------------------
>
> Key: HADOOP-1398
> URL: https://issues.apache.org/jira/browse/HADOOP-1398
> Project: Hadoop
> Issue Type: New Feature
> Components: contrib/hbase
> Reporter: Jim Kellerman
> Priority: Trivial
> Attachments: commons-collections-3.2.jar, hadoop-blockcache-v2.patch,
> hadoop-blockcache-v3.patch, hadoop-blockcache-v4.1.patch,
> hadoop-blockcache-v4.patch, hadoop-blockcache.patch
>
>
> Bigtable provides two in-memory caches: one for row/column data and one for
> disk blocks.
> The size of each cache should be configurable, data should be loaded lazily,
> and the cache managed by an LRU mechanism.
> One complication of the block cache is that all data is read through a
> SequenceFile.Reader, which ultimately reads data from disk via an RPC proxy
> for ClientProtocol. This would imply that the block caching would have to be
> pushed down to either the DFSClient or SequenceFile.Reader.
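
The requirements in the description above (configurable size, lazy loading,
LRU management) can be illustrated with a minimal sketch. This is not the code
from the attached patches. The class name LruBlockCacheSketch, its loader
callback, and the methods isCached() and size() are hypothetical names chosen
for illustration, and the sketch assumes Java 8 or later. It relies only on
java.util.LinkedHashMap in access order with removeEldestEntry overridden.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

/**
 * Hypothetical sketch of a fixed-capacity LRU cache that loads values lazily.
 * Not the HADOOP-1398 implementation; names are illustrative assumptions.
 */
class LruBlockCacheSketch<K, V> {
    private final Function<K, V> loader;
    private final LinkedHashMap<K, V> entries;

    /**
     * @param maxEntries configurable capacity (number of cached entries)
     * @param loader     called on a cache miss, e.g. to read a block from disk
     */
    LruBlockCacheSketch(final int maxEntries, Function<K, V> loader) {
        if (maxEntries <= 0) {
            throw new IllegalArgumentException("maxEntries must be positive");
        }
        this.loader = loader;
        // accessOrder = true: iteration order runs from least to most recently used.
        this.entries = new LinkedHashMap<K, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                // Evict the least recently used entry once capacity is exceeded.
                return size() > maxEntries;
            }
        };
    }

    /** Returns the cached value, loading it on first use (lazy loading). */
    synchronized V get(K key) {
        V value = entries.get(key); // a hit also marks the entry as recently used
        if (value == null) {
            value = loader.apply(key);
            entries.put(key, value); // may evict the eldest entry
        }
        return value;
    }

    /** Checks for presence without affecting the LRU order. */
    synchronized boolean isCached(K key) {
        return entries.containsKey(key);
    }

    synchronized int size() {
        return entries.size();
    }
}
```

With a capacity of 2, reading keys 1, 2, 1, 3 in that order calls the loader
three times and evicts key 2, since key 1 was touched more recently. The
capacity here counts entries; a real block cache would more likely bound the
total number of bytes held.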