[ https://issues.apache.org/jira/browse/HADOOP-1398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12559208#action_12559208 ]
stack commented on HADOOP-1398:
-------------------------------

Patch looks great Tom.

You pass 'length' in the below but it's not used:

{code}
+  protected FSDataInputStream openFile(FileSystem fs, Path file,
+      int bufferSize, long length) throws IOException {
+    return fs.open(file, bufferSize);
{code}

I presume you have plans for it later?

You have confidence in the LruMap class? You don't have unit tests (though these things are hard to test). I ask because, though small, these kinds of classes can sometimes prove a little tricky....

Do you have any numbers for how it improves throughput when cached blocks are 'hot'? And you talked of a slight 'cost'. Do you have rough numbers for that too? (Playing on the datanode with the size of the CRC blocks, a similar type of blocking to what you have here, there was no discernible difference from adjusting sizes.)

What do we need to add to make it easy to enable/disable this feature on a per-column basis? Currently, edits to column config. require taking the column offline. Changing this configuration looks safe to do while the column stays online. Would you agree?

> Add in-memory caching of data
> -----------------------------
>
>                 Key: HADOOP-1398
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1398
>             Project: Hadoop
>          Issue Type: New Feature
>          Components: contrib/hbase
>            Reporter: Jim Kellerman
>            Priority: Trivial
>         Attachments: hadoop-blockcache.patch
>
>
> Bigtable provides two in-memory caches: one for row/column data and one for
> disk block caches.
> The size of each cache should be configurable, data should be loaded lazily,
> and the cache managed by an LRU mechanism.
> One complication of the block cache is that all data is read through a
> SequenceFile.Reader which ultimately reads data off of disk via an RPC proxy
> for ClientProtocol. This would imply that the block caching would have to be
> pushed down to either the DFSClient or SequenceFile.Reader.

--
This message is automatically generated by JIRA.
- You can reply to this email to add a comment to the issue online.
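
On the LruMap and unit-test question in the comment: for reference, a minimal sketch of a size-bounded LRU map, assuming it is built on java.util.LinkedHashMap in access order. The class name LruSketch and its maxEntries parameter are hypothetical; this is not the LruMap class from hadoop-blockcache.patch, and it is not thread-safe without external synchronization.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/**
 * Hypothetical size-bounded LRU map (illustration only, not the patch's LruMap).
 * Not thread-safe: callers must synchronize externally if it is shared.
 */
class LruSketch<K, V> extends LinkedHashMap<K, V> {
    private static final long serialVersionUID = 1L;

    // Assumed configuration knob: maximum number of cached entries.
    private final int maxEntries;

    LruSketch(int maxEntries) {
        // accessOrder = true: iteration runs from least to most recently accessed.
        super(16, 0.75f, true);
        if (maxEntries <= 0) {
            throw new IllegalArgumentException("maxEntries must be positive");
        }
        this.maxEntries = maxEntries;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        // Called by LinkedHashMap after each insertion; returning true
        // evicts the least recently accessed entry.
        return size() > maxEntries;
    }

    public static void main(String[] args) {
        LruSketch<String, Integer> cache = new LruSketch<String, Integer>(2);
        cache.put("a", 1);
        cache.put("b", 2);
        cache.get("a");    // touch "a" so that "b" becomes the eldest entry
        cache.put("c", 3); // exceeds the bound, so "b" is evicted
        System.out.println(cache.keySet()); // prints [a, c]
    }
}
```

The eviction order is the part most worth a unit test: insert up to the bound, touch one key, insert one more, and assert that the untouched key is the one that disappears.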