[jira] Commented: (HADOOP-1398) Add in-memory caching of data

Tom White (JIRA) Tue, 22 Jan 2008 04:38:54 -0800

    [ https://issues.apache.org/jira/browse/HADOOP-1398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12561327#action_12561327 ]


Tom White commented on HADOOP-1398:
-----------------------------------

I ran some benchmarks of PerformanceEvaluation with and without block caching 
enabled. The setup was similar to that described in 
http://wiki.apache.org/hadoop/Hbase/PerformanceEvaluation, with three machines 
on EC2: one running the namenode and HBase master, one running a datanode and a 
region server, and one running a datanode and the PerformanceEvaluation program.

Number of operations per second:

||Experiment||Block cache disabled||Block cache enabled||
|sequential reads|119|182|
|random reads|110|123|

I've seen quite a lot of variation between runs of PerformanceEvaluation, so 
I'm reluctant to read too much into these figures. They do suggest, though, 
that enabling the block cache at least doesn't slow the system down.
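
For reference, the relative change implied by the table can be worked out with a few lines of Java. This is only a sketch of the arithmetic on the four numbers above; the class and method names (CacheSpeedup, percentChange) are made up for illustration and are not part of PerformanceEvaluation.

```java
import java.util.Locale;

// Hypothetical helper, not part of HBase or PerformanceEvaluation.
class CacheSpeedup {
    // Percentage change in throughput going from "before" to "after".
    static double percentChange(double before, double after) {
        return (after - before) / before * 100.0;
    }

    public static void main(String[] args) {
        // Operations per second taken from the table above.
        double sequential = percentChange(119, 182);
        double random = percentChange(110, 123);
        System.out.println(String.format(Locale.ROOT,
            "sequential reads: %+.1f%%", sequential));
        System.out.println(String.format(Locale.ROOT,
            "random reads: %+.1f%%", random));
    }
}
```

That works out to roughly +53% for sequential reads and +12% for random reads, subject to the run-to-run variation mentioned above.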


> Add in-memory caching of data
> -----------------------------
>
>                 Key: HADOOP-1398
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1398
>             Project: Hadoop
>          Issue Type: New Feature
>          Components: contrib/hbase
>            Reporter: Jim Kellerman
>            Priority: Trivial
>         Attachments: commons-collections-3.2.jar, hadoop-blockcache-v2.patch, 
> hadoop-blockcache-v3.patch, hadoop-blockcache-v4.1.patch, 
> hadoop-blockcache-v4.patch, hadoop-blockcache.patch
>
>
> Bigtable provides two in-memory caches: one for row/column data and one for 
> disk blocks.
> The size of each cache should be configurable, data should be loaded lazily, 
> and the cache should be managed by an LRU mechanism.
> One complication of the block cache is that all data is read through a 
> SequenceFile.Reader, which ultimately reads data off disk via an RPC proxy 
> for ClientProtocol. This would imply that the block caching would have to be 
> pushed down to either the DFSClient or SequenceFile.Reader.
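
As an illustration of the three requirements in the description (configurable size, lazy loading, LRU eviction), here is a minimal sketch built on java.util.LinkedHashMap in access order. It is not the code in the attached patches. The class name LruBlockCache, the loader function, and the entry-count limit are assumptions made for the example; a real block cache would more likely bound memory by bytes and load blocks through SequenceFile.Reader or the DFSClient.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical sketch of a size-bounded, lazily loaded LRU cache.
// Not taken from the hadoop-blockcache patches.
class LruBlockCache<K, V> {
    private final int maxEntries;          // configurable cache size (entry count)
    private final Function<K, V> loader;   // assumed callback that reads a block on a miss
    private final LinkedHashMap<K, V> entries;
    private int loads = 0;                 // number of times the loader was invoked

    LruBlockCache(int maxEntries, Function<K, V> loader) {
        if (maxEntries <= 0) {
            throw new IllegalArgumentException("maxEntries must be positive");
        }
        this.maxEntries = maxEntries;
        this.loader = loader;
        // accessOrder = true keeps iteration order least-recently-used first.
        this.entries = new LinkedHashMap<K, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                // Evict the least recently used entry once the limit is exceeded.
                return size() > LruBlockCache.this.maxEntries;
            }
        };
    }

    // Returns the cached value, loading it only on first use (lazy loading).
    synchronized V get(K key) {
        V value = entries.get(key);
        if (value == null) {
            value = loader.apply(key);
            loads++;
            entries.put(key, value);
        }
        return value;
    }

    // containsKey does not count as an access, so it leaves the LRU order alone.
    synchronized boolean isCached(K key) {
        return entries.containsKey(key);
    }

    synchronized int loadCount() {
        return loads;
    }
}
```

With a limit of two entries, reading blocks 1, 2, 1 and then 3 evicts block 2, because block 1 was touched more recently. The methods are synchronized only to keep the sketch safe to share between threads; a region server would probably want finer-grained locking.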

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
