[jira] [Commented] (HBASE-11331) [blockcache] lazy block decompression
[ https://issues.apache.org/jira/browse/HBASE-11331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14128811#comment-14128811 ]

Nick Dimiduk commented on HBASE-11331:

{quote}
Left some comments at RB.
bq. Any other comments from reviewers? Are we happy with the config name hbase.block.data.cachecompressed ?
Should this be {{hbase.block.cache.data.cachecompressed}}?
{quote}

There's not a lot of consistency around cache configurations. We also have:
- hbase.rs.cacheblocksonwrite
- hfile.block.index.cacheonwrite
- hfile.block.bloom.cacheonwrite
- hbase.rs.evictblocksonclose
- hbase.bucketcache.*
- hbase.rs.prefetchblocksonopen
- hbase.offheapcache.minblocksize

[blockcache] lazy block decompression
Key: HBASE-11331
URL: https://issues.apache.org/jira/browse/HBASE-11331
Project: HBase
Issue Type: Improvement
Components: regionserver
Reporter: Nick Dimiduk
Assignee: Nick Dimiduk
Fix For: 2.0.0, 0.98.7, 0.99.1
Attachments: HBASE-11331.00.patch, HBASE-11331.01.patch, HBASE-11331.02.patch, HBASE-11331.03.patch, HBASE-11331.04.patch, HBASE-11331.05-0.98.patch, HBASE-11331.05.patch, HBASE-11331.06-0.98.patch, HBASE-11331.06.patch, HBASE-11331.07-0.98.patch, HBASE-11331.07.patch, HBASE-11331LazyBlockDecompressperfcompare.pdf, hbase-hbase-master-hor17n36.gq1.ygridcore.net.log, lazy-decompress.02.0.pdf, lazy-decompress.02.1.json, lazy-decompress.02.1.pdf, v03-20g-045g-false.pdf, v03-20g-045g-true-16h.pdf, v03-20g-045g-true.pdf

Maintaining data in its compressed form in the block cache will greatly increase our effective blockcache size and should show a meaningful improvement in cache hit rates in well-designed applications. The idea here is to lazily decompress/decrypt blocks when they're consumed, rather than as soon as they're pulled off of disk. This is related to but less invasive than HBASE-8894.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-11331) [blockcache] lazy block decompression
[ https://issues.apache.org/jira/browse/HBASE-11331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14128833#comment-14128833 ]

stack commented on HBASE-11331:

+1

Add to the simplification of block cache config a note that we need to unify the configuration args?

Needs fat release note and on commit, add something to the refguide in the block cache section else I'm afraid folks won't come across this feature.
[jira] [Commented] (HBASE-11331) [blockcache] lazy block decompression
[ https://issues.apache.org/jira/browse/HBASE-11331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14129211#comment-14129211 ]

Enis Soztutar commented on HBASE-11331:

Is this right? Assuming HeapByteBuffer here.

{code}
   @Override
   public void serialize(ByteBuffer destination) {
-    ByteBuffer dupBuf = this.buf.duplicate();
-    dupBuf.rewind();
-    destination.put(dupBuf);
+    // assumes HeapByteBuffer
+    destination.put(this.buf.array(), this.buf.arrayOffset(),
+      getSerializedLength() - EXTRA_SERIALIZATION_SPACE);
     serializeExtraInfo(destination);
   }
{code}

On naming conf, I'll go with whatever you think is better.

Agreed with Stack. Fat release note would be good.
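For readers following the HeapByteBuffer question above, here is a minimal standalone sketch of the two copy styles in the diff, using only java.nio. It is not the HBase code: the class and method names (SerializeSketch, copyViaDuplicate, copyViaArray) are made up for illustration, and the length argument stands in for getSerializedLength() - EXTRA_SERIALIZATION_SPACE. The point it shows is that array()/arrayOffset() are only usable when hasArray() is true, which is not the case for direct buffers.

```java
import java.nio.ByteBuffer;

// Illustrative only; names are hypothetical and not taken from HFileBlock.
class SerializeSketch {

    /** Copies the first `length` bytes through a duplicate view; works for heap and direct buffers. */
    static void copyViaDuplicate(ByteBuffer src, int length, ByteBuffer dest) {
        ByteBuffer dup = src.duplicate(); // shares content, has its own position/limit
        dup.clear();                      // position = 0, limit = capacity
        dup.limit(length);
        dest.put(dup);
    }

    /** Copies the first `length` bytes straight from the backing array; needs an accessible array. */
    static void copyViaArray(ByteBuffer src, int length, ByteBuffer dest) {
        if (!src.hasArray()) {
            // Direct (off-heap) and read-only buffers end up here.
            throw new UnsupportedOperationException("buffer has no accessible backing array");
        }
        dest.put(src.array(), src.arrayOffset(), length);
    }

    public static void main(String[] args) {
        ByteBuffer heap = ByteBuffer.wrap(new byte[] {1, 2, 3, 4, 5, 6});
        ByteBuffer direct = ByteBuffer.allocateDirect(6);

        ByteBuffer out = ByteBuffer.allocate(4);
        copyViaArray(heap, 4, out);
        System.out.println("heap copy wrote " + out.position() + " bytes");

        out.clear();
        copyViaDuplicate(direct, 4, out);
        System.out.println("direct copy wrote " + out.position() + " bytes");

        System.out.println("direct.hasArray() = " + direct.hasArray());
    }
}
```

The array-based copy ignores the source buffer's position and limit entirely, which is what makes it depend on the heap-buffer assumption raised in the comment.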
[jira] [Commented] (HBASE-11331) [blockcache] lazy block decompression
[ https://issues.apache.org/jira/browse/HBASE-11331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14129242#comment-14129242 ]

Nick Dimiduk commented on HBASE-11331:

Thanks [~enis], [~stack] for having a look.

bq. Add to the simplification of block cache config a note that we need to unify the configuration args?

Alright, I'll open a ticket.

bq. Needs fat release note and on commit, add something to the refguide in the block cache section else I'm afraid folks won't come across this feature.

Release note I'd planned, just what's in the commit message; you want more than this?

{noformat}
When hbase.block.data.cachecompressed=true, DATA (and ENCODED_DATA) blocks are cached in the BlockCache in their on-disk format. This is different from the default behavior, which decompresses and decrypts a block before caching.
{noformat}

I'll open a docs ticket for updating the book.

bq. Is this right? Assuming HeapByteBuffer here.

Alas, yes. There's a number of assumptions baked into HFileBlock about HeapByteArrays. The whole read-ahead of the next block's header feature is based on this supposition (look for reads beyond limit without checking capacity; this is how I ran into the serialization bug in the first place).
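To make the "reads beyond limit without checking capacity" remark concrete, the following is a small sketch in plain java.nio of how bytes past a heap buffer's limit stay reachable through its backing array. It is an illustration under assumptions, not the HFileBlock implementation: HeaderPeekSketch, blockView, peekNextHeader and the 4-byte HEADER_SIZE are all hypothetical. Unlike the code described in the comment, this sketch does check capacity before reading.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

// Illustrative only; names and the header size are hypothetical.
class HeaderPeekSketch {
    static final int HEADER_SIZE = 4;

    /** Wraps bytes holding one block plus the next block's header; the limit hides the header. */
    static ByteBuffer blockView(byte[] onDisk, int blockLen) {
        ByteBuffer buf = ByteBuffer.wrap(onDisk); // capacity = onDisk.length
        buf.limit(blockLen);                      // readers see only the block itself
        return buf;
    }

    /** Reads the bytes sitting between limit and capacity via the backing array. */
    static byte[] peekNextHeader(ByteBuffer block) {
        if (!block.hasArray()) {
            throw new UnsupportedOperationException("needs a heap buffer");
        }
        if (block.capacity() - block.limit() < HEADER_SIZE) {
            throw new IllegalStateException("no read-ahead bytes behind the limit");
        }
        int start = block.arrayOffset() + block.limit();
        return Arrays.copyOfRange(block.array(), start, start + HEADER_SIZE);
    }

    public static void main(String[] args) {
        byte[] onDisk = {10, 11, 12, 13, 14, 15, 16, 17, 1, 2, 3, 4};
        ByteBuffer block = blockView(onDisk, 8);
        System.out.println("visible bytes: " + block.remaining());
        System.out.println("next header:   " + Arrays.toString(peekNextHeader(block)));
    }
}
```

A buffer rebuilt from only the visible bytes (capacity equal to limit) has nothing behind its limit, so the same peek fails the capacity check in this sketch.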
[jira] [Commented] (HBASE-11331) [blockcache] lazy block decompression
[ https://issues.apache.org/jira/browse/HBASE-11331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14129280#comment-14129280 ]

Enis Soztutar commented on HBASE-11331:

bq. Release note I'd planned, just what's in the commit message; you want more than this?

Maybe add why you would want this (X times more block cache while trading more CPU usage).

bq. There's a number of assumptions baked into HFileBlock about HeapByteArrays

Ok. Seems we may need to address that in the future.

+1 for 0.99+.
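The trade-off mentioned above (more effective cache in exchange for CPU on each read) can be sketched with a toy cache that stores blocks in compressed form and inflates them only when read. This is a rough illustration, not HBase's BlockCache: LazyCacheSketch and its methods are hypothetical, java.util.zip's Deflater/Inflater stand in for whatever codec a table uses, and encryption is left out.

```java
import java.io.ByteArrayOutputStream;
import java.util.HashMap;
import java.util.Map;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

// Illustrative only; a toy stand-in for caching blocks in their on-disk (compressed) form.
class LazyCacheSketch {
    private final Map<String, byte[]> cache = new HashMap<>();

    /** Produces the "on-disk" form of a block. */
    static byte[] compress(byte[] raw) {
        Deflater deflater = new Deflater();
        deflater.setInput(raw);
        deflater.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] chunk = new byte[1024];
        while (!deflater.finished()) {
            out.write(chunk, 0, deflater.deflate(chunk));
        }
        deflater.end();
        return out.toByteArray();
    }

    /** Inflates a cached block; this is the CPU cost paid on every cache hit. */
    static byte[] decompress(byte[] packed) {
        Inflater inflater = new Inflater();
        inflater.setInput(packed);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] chunk = new byte[1024];
        try {
            while (!inflater.finished()) {
                int n = inflater.inflate(chunk);
                if (n == 0 && (inflater.needsInput() || inflater.needsDictionary())) {
                    throw new IllegalStateException("truncated or unsupported input");
                }
                out.write(chunk, 0, n);
            }
        } catch (DataFormatException e) {
            throw new IllegalStateException("corrupt block", e);
        } finally {
            inflater.end();
        }
        return out.toByteArray();
    }

    /** Caches the block as it came off disk, without decompressing it first. */
    void putOnDiskForm(String key, byte[] packed) {
        cache.put(key, packed);
    }

    /** Decompresses lazily, at the moment the block is consumed. */
    byte[] read(String key) {
        byte[] packed = cache.get(key);
        return packed == null ? null : decompress(packed);
    }

    /** Bytes actually held by the cache. */
    long cachedBytes() {
        long total = 0;
        for (byte[] b : cache.values()) {
            total += b.length;
        }
        return total;
    }

    public static void main(String[] args) {
        byte[] raw = new byte[64 * 1024]; // highly compressible: all zeros
        LazyCacheSketch cache = new LazyCacheSketch();
        cache.putOnDiskForm("block-0", compress(raw));
        System.out.println("raw bytes:    " + raw.length);
        System.out.println("cached bytes: " + cache.cachedBytes());
        System.out.println("read bytes:   " + cache.read("block-0").length);
    }
}
```

How much extra cache this buys depends entirely on how well the data compresses; the all-zeros block here is a best case and says nothing about real workloads.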
[jira] [Commented] (HBASE-11331) [blockcache] lazy block decompression
[ https://issues.apache.org/jira/browse/HBASE-11331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14129286#comment-14129286 ]

Nick Dimiduk commented on HBASE-11331:

Opened HBASE-11938 for cache configuration and HBASE-11939 for updating the book.
[jira] [Commented] (HBASE-11331) [blockcache] lazy block decompression
[ https://issues.apache.org/jira/browse/HBASE-11331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14129287#comment-14129287 ]

stack commented on HBASE-11331:

bq. you want more than this?

No. The note you have looks good.
[jira] [Commented] (HBASE-11331) [blockcache] lazy block decompression
[ https://issues.apache.org/jira/browse/HBASE-11331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14129296#comment-14129296 ]

Andrew Purtell commented on HBASE-11331:

+1 for 0.98. 0.98.7 is open.
[jira] [Commented] (HBASE-11331) [blockcache] lazy block decompression
[ https://issues.apache.org/jira/browse/HBASE-11331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14129427#comment-14129427 ]

Hudson commented on HBASE-11331:

SUCCESS: Integrated in HBase-TRUNK #5489 (See [https://builds.apache.org/job/HBase-TRUNK/5489/])
HBASE-11331 [blockcache] lazy block decompression (ndimiduk: rev eec15bd17218a2543c9f4cbb9a78841a9fdec043)
* hbase-server/src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileEncryption.java
* hbase-server/src/test/java/org/apache/hadoop/hbase/io/hfile/TestForceCacheImportantBlocks.java
* hbase-server/src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFile.java
* hbase-server/src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlockCompatibility.java
* hbase-server/src/test/java/org/apache/hadoop/hbase/io/hfile/TestCacheOnWrite.java
* hbase-server/src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileWriterV3.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/LruBlockCache.java
* hbase-server/src/test/java/org/apache/hadoop/hbase/io/hfile/TestLazyDataBlockDecompression.java
* hbase-server/src/test/java/org/apache/hadoop/hbase/io/hfile/TestCacheConfig.java
* hbase-common/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileContext.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java
* hbase-server/src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileWriterV2.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlockIndex.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java
* hbase-server/src/main/jamon/org/apache/hadoop/hbase/tmpl/regionserver/BlockCacheTmpl.jamon
* hbase-server/src/test/java/org/apache/hadoop/hbase/io/hfile/TestChecksum.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/CacheConfig.java
* hbase-server/src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlock.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV2.java
[jira] [Commented] (HBASE-11331) [blockcache] lazy block decompression
[ https://issues.apache.org/jira/browse/HBASE-11331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14129471#comment-14129471 ]

Hudson commented on HBASE-11331:

FAILURE: Integrated in HBase-0.98 #510 (See [https://builds.apache.org/job/HBase-0.98/510/])
HBASE-11331 [blockcache] lazy block decompression (ndimiduk: rev b8851309e0ff1e04a3a0abd5fbad6d1868a151bc)
* hbase-common/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileContext.java
* hbase-server/src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlock.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java
* hbase-server/src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlockCompatibility.java
* hbase-server/src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileWriterV2.java
* hbase-server/src/test/java/org/apache/hadoop/hbase/io/hfile/TestChecksum.java
* hbase-server/src/test/java/org/apache/hadoop/hbase/io/hfile/TestForceCacheImportantBlocks.java
* hbase-server/src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFile.java
* hbase-server/src/main/jamon/org/apache/hadoop/hbase/tmpl/regionserver/BlockCacheTmpl.jamon
* hbase-server/src/test/java/org/apache/hadoop/hbase/io/hfile/TestLazyDataBlockDecompression.java
* hbase-server/src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileEncryption.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlockIndex.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/LruBlockCache.java
* hbase-server/src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileWriterV3.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV2.java
* hbase-server/src/test/java/org/apache/hadoop/hbase/io/hfile/TestCacheOnWrite.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/CacheConfig.java
[jira] [Commented] (HBASE-11331) [blockcache] lazy block decompression
[ https://issues.apache.org/jira/browse/HBASE-11331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14129511#comment-14129511 ]

Hudson commented on HBASE-11331:

SUCCESS: Integrated in HBase-1.0 #171 (See [https://builds.apache.org/job/HBase-1.0/171/])
HBASE-11331 [blockcache] lazy block decompression (ndimiduk: rev 4d51cf0ee7d2a508569bd96630eb6d2d9afd6493)
* hbase-server/src/main/jamon/org/apache/hadoop/hbase/tmpl/regionserver/BlockCacheTmpl.jamon
* hbase-server/src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlock.java
* hbase-server/src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFile.java
* hbase-server/src/test/java/org/apache/hadoop/hbase/io/hfile/TestChecksum.java
* hbase-server/src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileEncryption.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV2.java
* hbase-server/src/test/java/org/apache/hadoop/hbase/io/hfile/TestCacheConfig.java
* hbase-server/src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileWriterV2.java
* hbase-server/src/test/java/org/apache/hadoop/hbase/io/hfile/TestForceCacheImportantBlocks.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/CacheConfig.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlockIndex.java
* hbase-server/src/test/java/org/apache/hadoop/hbase/io/hfile/TestCacheOnWrite.java
* hbase-server/src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlockCompatibility.java
* hbase-common/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileContext.java
* hbase-server/src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileWriterV3.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/LruBlockCache.java
* hbase-server/src/test/java/org/apache/hadoop/hbase/io/hfile/TestLazyDataBlockDecompression.java
[jira] [Commented] (HBASE-11331) [blockcache] lazy block decompression
[ https://issues.apache.org/jira/browse/HBASE-11331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14129521#comment-14129521 ]

Hudson commented on HBASE-11331:

SUCCESS: Integrated in HBase-0.98-on-Hadoop-1.1 #483 (See [https://builds.apache.org/job/HBase-0.98-on-Hadoop-1.1/483/])
HBASE-11331 [blockcache] lazy block decompression (ndimiduk: rev b8851309e0ff1e04a3a0abd5fbad6d1868a151bc)
* hbase-server/src/main/jamon/org/apache/hadoop/hbase/tmpl/regionserver/BlockCacheTmpl.jamon
* hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/CacheConfig.java
* hbase-server/src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFile.java
* hbase-server/src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlockCompatibility.java
* hbase-server/src/test/java/org/apache/hadoop/hbase/io/hfile/TestCacheOnWrite.java
* hbase-server/src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileWriterV3.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV2.java
* hbase-server/src/test/java/org/apache/hadoop/hbase/io/hfile/TestChecksum.java
* hbase-server/src/test/java/org/apache/hadoop/hbase/io/hfile/TestLazyDataBlockDecompression.java
* hbase-server/src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileEncryption.java
* hbase-server/src/test/java/org/apache/hadoop/hbase/io/hfile/TestForceCacheImportantBlocks.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java
* hbase-server/src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileWriterV2.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlockIndex.java
* hbase-server/src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlock.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/LruBlockCache.java
* hbase-common/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileContext.java
[jira] [Commented] (HBASE-11331) [blockcache] lazy block decompression
[ https://issues.apache.org/jira/browse/HBASE-11331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14127736#comment-14127736 ]

Ted Yu commented on HBASE-11331:

+1
[jira] [Commented] (HBASE-11331) [blockcache] lazy block decompression
[ https://issues.apache.org/jira/browse/HBASE-11331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14127828#comment-14127828 ] Hadoop QA commented on HBASE-11331: --- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12667525/HBASE-11331.07.patch against trunk revision . ATTACHMENT ID: 12667525 {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 44 new or modified tests. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100 {color:green}+1 site{color}. The mvn site goal succeeds with this patch. {color:red}-1 core tests{color}. The patch failed these unit tests: org.apache.hadoop.hbase.client.TestMultiParallel org.apache.hadoop.hbase.master.TestDistributedLogSplitting {color:red}-1 core zombie tests{color}. 
There are 6 zombie test(s): at org.apache.hadoop.hbase.util.TestBytes.testToStringBytesBinaryReversible(TestBytes.java:295) at org.apache.hadoop.hbase.backup.TestHFileArchiving.testCleaningRace(TestHFileArchiving.java:372) at org.apache.hadoop.hbase.ipc.TestDelayedRpc.testTooManyDelayedRpcs(TestDelayedRpc.java:202) at org.apache.hadoop.hbase.TestIOFencing.testFencingAroundCompaction(TestIOFencing.java:224) Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/10800//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10800//artifact/patchprocess/newPatchFindbugsWarningshbase-protocol.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10800//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10800//artifact/patchprocess/newPatchFindbugsWarningshbase-thrift.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10800//artifact/patchprocess/newPatchFindbugsWarningshbase-server.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10800//artifact/patchprocess/newPatchFindbugsWarningshbase-common.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10800//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10800//artifact/patchprocess/newPatchFindbugsWarningshbase-examples.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10800//artifact/patchprocess/newPatchFindbugsWarningshbase-client.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10800//artifact/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/10800//console This message is automatically generated. 
[jira] [Commented] (HBASE-11331) [blockcache] lazy block decompression
[ https://issues.apache.org/jira/browse/HBASE-11331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14112523#comment-14112523 ] Nick Dimiduk commented on HBASE-11331: -- Quick update: I've hit an ArrayIndexOutOfBoundsException while running a mixed workload via LTT. Investigating.
[jira] [Commented] (HBASE-11331) [blockcache] lazy block decompression
[ https://issues.apache.org/jira/browse/HBASE-11331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14111244#comment-14111244 ] Nick Dimiduk commented on HBASE-11331: --
bq. The schema option needs to be false by default
This is the intention. If it's not, I've messed up. Will double-check before commit.
[~misty] where would I put a section in the book on this feature? Maybe something under 9.6.4. Block Cache?
[jira] [Commented] (HBASE-11331) [blockcache] lazy block decompression
[ https://issues.apache.org/jira/browse/HBASE-11331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14111319#comment-14111319 ] Nick Dimiduk commented on HBASE-11331: -- Any other comments from reviewers? Are we happy with the config name hbase.block.data.cachecompressed?
[jira] [Commented] (HBASE-11331) [blockcache] lazy block decompression
[ https://issues.apache.org/jira/browse/HBASE-11331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14111391#comment-14111391 ] Enis Soztutar commented on HBASE-11331: --- Left some comments at RB.
bq. Any other comments from reviewers? Are we happy with the config name hbase.block.data.cachecompressed ?
Should this be {{hbase.block.cache.data.cachecompressed}}?
[jira] [Commented] (HBASE-11331) [blockcache] lazy block decompression
[ https://issues.apache.org/jira/browse/HBASE-11331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14111393#comment-14111393 ] Enis Soztutar commented on HBASE-11331: --- Also add the parameter to hbase-default.xml and release notes so that it is findable.
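To make the request above concrete, here is a minimal sketch of what a site-level override might look like once the parameter is documented. It assumes the property name proposed in this thread, hbase.block.data.cachecompressed, which was still under discussion at this point; the helper names make_property and render_site_fragment are hypothetical and exist only for this illustration.

```python
import xml.etree.ElementTree as ET

# Property name as proposed in this thread; treat it as an assumption,
# since the reviewers were still debating the final name.
CACHE_COMPRESSED_KEY = "hbase.block.data.cachecompressed"


def make_property(name: str, value: str) -> ET.Element:
    """Build one Hadoop-style <property> element with <name> and <value>."""
    prop = ET.Element("property")
    ET.SubElement(prop, "name").text = name
    ET.SubElement(prop, "value").text = value
    return prop


def render_site_fragment(enabled: bool) -> str:
    """Render a <configuration> fragment that sets the flag to true or false."""
    config = ET.Element("configuration")
    config.append(make_property(CACHE_COMPRESSED_KEY, "true" if enabled else "false"))
    return ET.tostring(config, encoding="unicode")


if __name__ == "__main__":
    print(render_site_fragment(enabled=True))
```

The fragment only shows the shape of the entry; the default value and the description text would come from whatever lands in hbase-default.xml.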
[jira] [Commented] (HBASE-11331) [blockcache] lazy block decompression
[ https://issues.apache.org/jira/browse/HBASE-11331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14111547#comment-14111547 ] Misty Stanley-Jones commented on HBASE-11331: - [~ndimiduk] wow there are a lot of comments here and I don't have bandwidth to parse it right now. From looking at the description, it looks like maybe this needs a mention in the blockcache section and also in the compression / codec appendix. WDYT?
[jira] [Commented] (HBASE-11331) [blockcache] lazy block decompression
[ https://issues.apache.org/jira/browse/HBASE-11331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14109622#comment-14109622 ] Nick Dimiduk commented on HBASE-11331: -- Running some numbers where the dataset is larger than the available blockcache in both cases; will report back. In the meantime, it would be nice to get some more eyes on the patch. I'll also be creating patches for branch-1 and 0.98. ping [~enis], [~apurtell], [~lhofhansl]
[jira] [Commented] (HBASE-11331) [blockcache] lazy block decompression
[ https://issues.apache.org/jira/browse/HBASE-11331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14109769#comment-14109769 ] Nick Dimiduk commented on HBASE-11331: -- This data is from increasing the query range to size=100; more data than fits in cache with either config. I had to run the warmup and test for longer periods because it took longer for the oscache and BlockCache to reach steady-state.
|| ||=false, 100g||=true, 100g||delta||
|hbase.regionserver.server.Get_num_ops|313.83|466.69|{color:green}49%{color}|
|hbase.regionserver.server.Get_mean|28.91 ms|20.00 ms|{color:green}-31%{color}|
|hbase.regionserver.server.Get_99th_percentile|221.06 ms|197.30 ms|{color:green}-11%{color}|
|hbase.regionserver.jvmmetrics.GcTimeMillis|26.99 ms|48.84 ms|{color:red}81%{color}|
|proc.loadavg.1min|11.71|12.00|{color:red}2%{color}|
|proc.stat.cpu.percpu{type=iowait}|343.11|404.10|{color:red}18%{color}|
|hbase.regionserver.server.blockCacheCount|181.85 k|716.79 k|{color:green}294%{color}|
Overall, I'd say that you want this feature enabled unless:
# your data decompressed fits in BlockCache
# your machines are not your own and CPU time is at a premium (i.e., AWS).
The 2nd point above is just a guess. It's likely other factors are at play in these environments.
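For readers who want to check the delta column of the table in the comment above, the following is a small sketch that recomputes each percentage from the two measured values. The numbers are copied from that table; the function names percent_delta and summarize are hypothetical helpers, not part of HBase or of the patch.

```python
def percent_delta(baseline: float, candidate: float) -> float:
    """Relative change of candidate versus baseline, in percent."""
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    return (candidate - baseline) / baseline * 100.0


# (metric, value with =false, value with =true), copied from the 100g table.
MEASUREMENTS = [
    ("hbase.regionserver.server.Get_num_ops", 313.83, 466.69),
    ("hbase.regionserver.server.Get_mean", 28.91, 20.00),
    ("hbase.regionserver.server.Get_99th_percentile", 221.06, 197.30),
    ("hbase.regionserver.jvmmetrics.GcTimeMillis", 26.99, 48.84),
    ("proc.loadavg.1min", 11.71, 12.00),
    ("proc.stat.cpu.percpu{type=iowait}", 343.11, 404.10),
    ("hbase.regionserver.server.blockCacheCount", 181.85, 716.79),
]


def summarize(rows):
    """Map each metric name to its delta, rounded to a whole percent."""
    return {name: round(percent_delta(off, on)) for name, off, on in rows}


if __name__ == "__main__":
    for name, delta in summarize(MEASUREMENTS).items():
        print(f"{name}: {delta:+d}%")
```

Whether a positive delta is good or bad depends on the metric (more ops is good, more GC time is not), which is what the green and red markup in the table encodes.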
[jira] [Commented] (HBASE-11331) [blockcache] lazy block decompression
[ https://issues.apache.org/jira/browse/HBASE-11331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14109792#comment-14109792 ] stack commented on HBASE-11331: --- 50% more ops for 80% more GC (20% more CPU) sounds like reasonable trade off. Would be interesting to see how it does in a long running test? Does the extra GC bring on the dreaded FGC? Is GC steady? On the blockCacheCount, this is indication of our caching more blocks? 3x?
[jira] [Commented] (HBASE-11331) [blockcache] lazy block decompression
[ https://issues.apache.org/jira/browse/HBASE-11331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14109817#comment-14109817 ] Nick Dimiduk commented on HBASE-11331: --
bq. Would be interesting to see how it does in a long running test?
Today's results were from 90m warmup + 40m test. The 16h run I posted on Thursday [0] seemed stable. No slope in those graphs. This is just PE mind you, no concurrent writes.
bq. On the blockCacheCount, this is indication of our caching more blocks? 3x?
Yes, exactly. It's purely the compression ratio manifest in memory. All this is using SNAPPY and the reported compression ratio is 0.2473, thus the ~3x increase. Using GZ would probably give better BC utilization but higher CPU load (should translate to a standard compression benchmark).
[0]: https://issues.apache.org/jira/secure/attachment/12663479/v03-20g-045g-true-16h.pdf
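The relationship between the reported compression ratio and the block count can be sketched with a little arithmetic. This is an idealized estimate, assuming every cached block compresses at the reported ratio and ignoring index and bloom blocks; the function names are hypothetical.

```python
def capacity_multiplier(compression_ratio: float) -> float:
    """How many times more blocks fit in the same memory when kept compressed.

    compression_ratio is compressed size divided by uncompressed size.
    """
    if not 0.0 < compression_ratio <= 1.0:
        raise ValueError("compression ratio must be in (0, 1]")
    return 1.0 / compression_ratio


def percent_increase(compression_ratio: float) -> float:
    """Extra blocks cached, as a percentage over the uncompressed baseline."""
    return (capacity_multiplier(compression_ratio) - 1.0) * 100.0


if __name__ == "__main__":
    ratio = 0.2473  # SNAPPY ratio reported in the comment above
    print(f"multiplier: {capacity_multiplier(ratio):.2f}x")
    print(f"increase:   {percent_increase(ratio):.0f}%")
```

Under these assumptions a ratio of 0.2473 predicts roughly four times as many blocks in total, that is about three times more than before, which is in line with the 294% blockCacheCount delta reported earlier in the thread.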
[jira] [Commented] (HBASE-11331) [blockcache] lazy block decompression
[ https://issues.apache.org/jira/browse/HBASE-11331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14109930#comment-14109930 ] Hadoop QA commented on HBASE-11331: ---
{color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12664245/HBASE-11331.05-0.98.patch against trunk revision . ATTACHMENT ID: 12664245
{color:green}+1 @author{color}. The patch does not contain any @author tags.
{color:green}+1 tests included{color}. The patch appears to include 46 new or modified tests.
{color:red}-1 patch{color}. The patch command could not apply the patch.
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/10574//console
This message is automatically generated.
[jira] [Commented] (HBASE-11331) [blockcache] lazy block decompression
[ https://issues.apache.org/jira/browse/HBASE-11331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14110159#comment-14110159 ] Andrew Purtell commented on HBASE-11331: +1 for 0.98. This is good work. Didn't relish going through HFile code again (hasn't been long enough since HBASE-7544). Nothing jumped out to me as wrong. I will try taking another look when not spread so thin. The schema option needs to be false by default. We can write a note in the release notes and/or manual that if using compression the user should try enabling the option and only disable if observing too high CPU usage or too much GC under their workload.
[jira] [Commented] (HBASE-11331) [blockcache] lazy block decompression
[ https://issues.apache.org/jira/browse/HBASE-11331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14107368#comment-14107368 ] Nick Dimiduk commented on HBASE-11331: -- As best as I can tell, both of these configurations stay entirely in BlockCache. Verified by looking at the RS BlockCache stats and confirmed by the low iowait stat being basically flat for them both. Looks like enabling this feature when it's not needed is quite expensive.
|| ||=false, 11g||=true, 40g||delta||
|hbase.regionserver.server.Get_num_ops|15.15 k|6.07 k|{color:red}-60%{color}|
|hbase.regionserver.server.Get_mean|0.00 ns|0.00 ns|{color:green}0%{color}|
|hbase.regionserver.server.Get_99th_percentile|1.00 ms|22.65 ms|{color:red}2165%{color}|
|hbase.regionserver.jvmmetrics.GcTimeMillis|48.89 ms|441.33 ms|{color:red}802%{color}|
|proc.loadavg.1min|0.56|3.25|{color:red}480%{color}|
|proc.stat.cpu.percpu{type=iowait}|3.55|3.47|{color:green}-2%{color}|
|hbase.regionserver.server.blockCacheCount|181.75 k|666.44 k|{color:green}266%{color}|
[jira] [Commented] (HBASE-11331) [blockcache] lazy block decompression
[ https://issues.apache.org/jira/browse/HBASE-11331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14105622#comment-14105622 ] Nick Dimiduk commented on HBASE-11331: -- I'm now investigating those heart-beat looking GC spikes in the compressed=true graphs.
[jira] [Commented] (HBASE-11331) [blockcache] lazy block decompression
[ https://issues.apache.org/jira/browse/HBASE-11331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14105617#comment-14105617 ] Nick Dimiduk commented on HBASE-11331: -- Previous comment is re: v03-20g-045g-false.pdf, v03-20g-045g-true.pdf
[jira] [Commented] (HBASE-11331) [blockcache] lazy block decompression
[ https://issues.apache.org/jira/browse/HBASE-11331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14105635#comment-14105635 ] Vladimir Rodionov commented on HBASE-11331: --- The standard YCSB is very friendly to compressed block cache (especially, with zipfian data access pattern). Just to let you know, Nick.
[jira] [Commented] (HBASE-11331) [blockcache] lazy block decompression
[ https://issues.apache.org/jira/browse/HBASE-11331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14105865#comment-14105865 ] Nick Dimiduk commented on HBASE-11331: -- [~vrodionov] I think PE has zipfian now too (or maybe that's just the value size?). I'll take a look at YCSB, thanks.
[jira] [Commented] (HBASE-11331) [blockcache] lazy block decompression
[ https://issues.apache.org/jira/browse/HBASE-11331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14105888#comment-14105888 ] Nick Dimiduk commented on HBASE-11331: -- Interesting that proc.loadavg.1min decreases even though we know there are more decompress operations happening and the GC activity increases. Maybe that measurement includes iowait?
[jira] [Commented] (HBASE-11331) [blockcache] lazy block decompression
[ https://issues.apache.org/jira/browse/HBASE-11331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14106147#comment-14106147 ] stack commented on HBASE-11331: --- You figure the compressor issue, our not reusing them? 13x the GC because we are doing 10x the throughput is fair enough. All other numbers are very nice. This is best case (when =false, we are seeking? Or is it always inside fscache?) What is the 'cost' keeping stuff compressed? What if you do a run where all fits in cache, for both cases?
[jira] [Commented] (HBASE-11331) [blockcache] lazy block decompression
[ https://issues.apache.org/jira/browse/HBASE-11331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14106165#comment-14106165 ] Nick Dimiduk commented on HBASE-11331: -- Thanks for having a look [~stack]. bq. You figure the compressor issue, our not reusing them? I've punted on the compressor as a non-issue, but I haven't run with a profiler yet. I think it was related to the non-native gz impl while running locally. I'll re-enable tracing there as well with the next runs and see what I see. bq. This is best case (when =false, we are seeking? Or is it always inside fscache?) Yes, this configuration is targeting a best case for this patch. The fscache is minimized with this config; it seems to stay down around 3.5g (vs 11.5g blockcache). Compression ratio is reported as 0.2437, so --size=45 should be ~11g compressed -- larger than the fscache. Because the PE test is random, I believe we'll be thrashing the fscache with both =true and =false. The iowait charts indicate both configs are doing IO constantly, just more with =false (as expected). bq. What is the 'cost' keeping stuff compressed? What if you do a run where all fits in cache, for both cases? I'm testing a couple more scenarios; this one was already on the list.
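The size estimate in the comment above can be checked with a few lines of arithmetic. This is only a sketch: the class and method names are made up for illustration and are not part of HBase or PerformanceEvaluation, and the calculation ignores per-block and cache bookkeeping overhead. The inputs (45g data set, ratio 0.2437, ~3.5g fscache) are the figures quoted in the comment.

```java
// Hypothetical helper for illustration only; not an HBase class.
final class CompressedSizeEstimate {

    /**
     * Estimates the compressed footprint of a data set.
     * ratio is compressed size divided by uncompressed size (0.2437 in the comment).
     */
    static double compressedGb(double uncompressedGb, double ratio) {
        if (uncompressedGb < 0.0 || ratio <= 0.0) {
            throw new IllegalArgumentException("size must be >= 0 and ratio must be > 0");
        }
        return uncompressedGb * ratio;
    }

    /** True if a data set of dataGb could fit entirely in a cache of cacheGb. */
    static boolean fitsIn(double dataGb, double cacheGb) {
        return dataGb <= cacheGb;
    }

    public static void main(String[] args) {
        double compressed = compressedGb(45.0, 0.2437); // figures from the comment
        System.out.printf("estimated compressed size: %.2fg%n", compressed);
        System.out.println("fits in ~3.5g fscache: " + fitsIn(compressed, 3.5));
    }
}
```

With these inputs the estimate comes to roughly 11g, which is why the compressed data set is expected to overflow the fscache in both configurations.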
[jira] [Commented] (HBASE-11331) [blockcache] lazy block decompression
[ https://issues.apache.org/jira/browse/HBASE-11331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14106221#comment-14106221 ] Hadoop QA commented on HBASE-11331: --- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12663549/HBASE-11331.05.patch against trunk revision . ATTACHMENT ID: 12663549 {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 44 new or modified tests. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:red}-1 javadoc{color}. The javadoc tool appears to have generated 7 warning messages. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 lineLengths{color}. The patch introduces the following lines longer than 100: +LOG.trace(freed + StringUtils.byteDesc(bytesFreed) + from single and multi buckets); +LOG.trace(freed + StringUtils.byteDesc(bytesFreed) + total from all three buckets ); {color:green}+1 site{color}. The mvn site goal succeeds with this patch. {color:red}-1 core tests{color}. The patch failed these unit tests: {color:red}-1 core zombie tests{color}. 
There are 2 zombie test(s): at org.apache.hadoop.hbase.mapreduce.TestImportTsv.testMROnTable(TestImportTsv.java:113) Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/10521//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10521//artifact/patchprocess/newPatchFindbugsWarningshbase-protocol.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10521//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10521//artifact/patchprocess/newPatchFindbugsWarningshbase-thrift.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10521//artifact/patchprocess/newPatchFindbugsWarningshbase-server.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10521//artifact/patchprocess/newPatchFindbugsWarningshbase-common.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10521//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10521//artifact/patchprocess/newPatchFindbugsWarningshbase-examples.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10521//artifact/patchprocess/newPatchFindbugsWarningshbase-client.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10521//artifact/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/10521//console This message is automatically generated. 
[jira] [Commented] (HBASE-11331) [blockcache] lazy block decompression
[ https://issues.apache.org/jira/browse/HBASE-11331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14088136#comment-14088136 ] stack commented on HBASE-11331: --- Any chance of our fixing compressor reuse. As you say, I'd imagine it'd mess up any possibility of nice numbers when this feature is enabled. I'm game to rerun test when you say its ready. Is that a fancy graph tool in front of a tsdb, [~ndimiduk]? Nice graphs.
[jira] [Commented] (HBASE-11331) [blockcache] lazy block decompression
[ https://issues.apache.org/jira/browse/HBASE-11331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14088193#comment-14088193 ] Nick Dimiduk commented on HBASE-11331: -- bq. Any chance of our fixing compressor reuse. I'll look into it. If anything, it'll be a separate ticket. bq. I'm game to rerun test when you say its ready. Hold off on rerunning until I verify/refute my hypothesis on the blockcache heapsize stuff. bq. Nice graphs Aw shucks, thanks boss. This is Grafana running on top of OpenTSDB. No offense, but I find it nicer to use these dashboards than the gnuplot stuff we get otherwise. I'm still learning it, but I'll attach the dashboard json file I used to make the latest figures.
[jira] [Commented] (HBASE-11331) [blockcache] lazy block decompression
[ https://issues.apache.org/jira/browse/HBASE-11331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14088271#comment-14088271 ] Andrew Purtell commented on HBASE-11331: bq. Grafana running on top of OpenTSDB Very nice. Is there a way to autogen the JSON? Pardon for hijacking the issue.
[jira] [Commented] (HBASE-11331) [blockcache] lazy block decompression
[ https://issues.apache.org/jira/browse/HBASE-11331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14088436#comment-14088436 ] Nick Dimiduk commented on HBASE-11331: -- bq. Is there a way to autogen the JSON? The json file I attached was exported from Grafana; I did not build it in a text editor. Its UI starts with a black dashboard, and you add to it the rows and columns with the metrics you want. Probably I'm not explaining it well; I'll leave it to grafana.org/docs
[jira] [Commented] (HBASE-11331) [blockcache] lazy block decompression
[ https://issues.apache.org/jira/browse/HBASE-11331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14087060#comment-14087060 ] Nick Dimiduk commented on HBASE-11331: -- Looks like I have about 10g used as disk cache. Will run the workloads again with RS heap increased to 20g.
[jira] [Commented] (HBASE-11331) [blockcache] lazy block decompression
[ https://issues.apache.org/jira/browse/HBASE-11331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14087200#comment-14087200 ] Hadoop QA commented on HBASE-11331: --- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12659287/HBASE-11331.02.patch against trunk revision . ATTACHMENT ID: 12659287 {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 36 new or modified tests. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100 {color:green}+1 site{color}. The mvn site goal succeeds with this patch. {color:red}-1 core tests{color}. 
The patch failed these unit tests: org.apache.hadoop.hbase.TestIOFencing org.apache.hadoop.hbase.regionserver.TestRegionReplicas org.apache.hadoop.hbase.client.TestReplicasClient org.apache.hadoop.hbase.master.TestRestartCluster org.apache.hadoop.hbase.TestRegionRebalancing Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/10308//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10308//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10308//artifact/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10308//artifact/patchprocess/newPatchFindbugsWarningshbase-common.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10308//artifact/patchprocess/newPatchFindbugsWarningshbase-thrift.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10308//artifact/patchprocess/newPatchFindbugsWarningshbase-examples.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10308//artifact/patchprocess/newPatchFindbugsWarningshbase-client.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10308//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10308//artifact/patchprocess/newPatchFindbugsWarningshbase-server.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/10308//artifact/patchprocess/newPatchFindbugsWarningshbase-protocol.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/10308//console This message is automatically generated. 
[jira] [Commented] (HBASE-11331) [blockcache] lazy block decompression
[ https://issues.apache.org/jira/browse/HBASE-11331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14083268#comment-14083268 ] Nick Dimiduk commented on HBASE-11331: -- Looks like the lack of compressor reuse is a known issue, somewhere between HBASE-5881, HBASE-7435, and HADOOP-9171.
[jira] [Commented] (HBASE-11331) [blockcache] lazy block decompression
[ https://issues.apache.org/jira/browse/HBASE-11331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14075297#comment-14075297 ] stack commented on HBASE-11331: --- Seeing this... trying to figure it in client: {code}2014-07-26 00:38:46,924 DEBUG [IPC Client (1094864774) connection to c2021.halxg.cloudera.com/10.20.84.27:16020 from stack] ipc.RpcClient: IPC Client (1094864774) connection to c2021.halxg.cloudera.com/10.20.84.27:16020 from stack: got response header call_id: 1102 exception { exception_class_name: java.io.IOException stack_trace: java.io.IOException\n\tat org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2047)\n\tat org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:98)\n\tat org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:114)\n\tat org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:94)\n\tat java.lang.Thread.run(Thread.java:744)\nCaused by: java.lang.ArrayIndexOutOfBoundsException\n\tat org.apache.hadoop.hbase.io.hfile.HFileBlock.unpack(HFileBlock.java:498)\n\tat org.apache.hadoop.hbase.io.hfile.HFileReaderV2.getCachedBlock(HFileReaderV2.java:270)\n\tat org.apache.hadoop.hbase.io.hfile.HFileReaderV2.readBlock(HFileReaderV2.java:424)\n\tat org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$BlockIndexReader.loadDataBlockWithScanInfo(HFileBlockIndex.java:260)\n\tat org.apache.hadoop.hbase.io.hfile.HFileReaderV2$AbstractScannerV2.seekTo(HFileReaderV2.java:644)\n\tat org.apache.hadoop.hbase.io.hfile.HFileReaderV2$AbstractScannerV2.reseekTo(HFileReaderV2.java:624)\n\tat org.apache.hadoop.hbase.regionserver.StoreFileScanner.reseekAtOrAfter(StoreFileScanner.java:251)\n\tat org.apache.hadoop.hbase.regionserver.StoreFileScanner.reseek(StoreFileScanner.java:176)\n\tat org.apache.hadoop.hbase.regionserver.NonLazyKeyValueScanner.doRealSeek(NonLazyKeyValueScanner.java:55)\n\tat org.apache.hadoop.hbase.regionserver.KeyValueHeap.generalizedSeek(KeyValueHeap.java:312)\n\tat 
org.apache.hadoop.hbase.regionserver.KeyValueHeap.requestSeek(KeyValueHeap.java:268)\n\tat org.apache.hadoop.hbase.regionserver.StoreScanner.reseek(StoreScanner.java:721)\n\tat org.apache.hadoop.hbase.regionserver.StoreScanner.seekAsDirection(StoreScanner.java:709)\n\tat org.apache.hadoop.hbase.regionserver.StoreScanner.next(StoreScanner.java:559)\n\tat org.apache.hadoop.hbase.regionserver.KeyValueHeap.next(KeyValueHeap.java:139)\n\tat org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.populateResult(HRegion.java:4002)\n\tat org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.nextInternal(HRegion.java:4082)\n\tat org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.nextRaw(HRegion.java:3951)\n\tat org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.next(HRegion.java:3933)\n\tat org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.next(HRegion.java:3920)\n\tat org.apache.hadoop.hbase.regionserver.HRegion.get(HRegion.java:4865)\n\tat org.apache.hadoop.hbase.regionserver.HRegion.get(HRegion.java:4838)\n\tat org.apache.hadoop.hbase.regionserver.RSRpcServices.get(RSRpcServices.java:1619)\n\tat org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:29990)\n\tat org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2013)\n\t... 4 more\n do_not_retry: false }, totalSize: 2735 bytes{code} [blockcache] lazy block decompression - Key: HBASE-11331 URL: https://issues.apache.org/jira/browse/HBASE-11331 Project: HBase Issue Type: Improvement Components: regionserver Reporter: Nick Dimiduk Assignee: Nick Dimiduk Attachments: HBASE-11331.00.patch, HBASE-11331.01.patch, HBASE-11331LazyBlockDecompressperfcompare.pdf Maintaining data in its compressed form in the block cache will greatly increase our effective blockcache size and should show a meaning improvement in cache hit rates in well designed applications. 
[jira] [Commented] (HBASE-11331) [blockcache] lazy block decompression
[ https://issues.apache.org/jira/browse/HBASE-11331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14075225#comment-14075225 ] stack commented on HBASE-11331: --- bq. Do you think that's necessary for this feature, or an acceptable follow-on JIRA? Follow-on. Trying your latest patch. Will make new report with more variety to it.
[jira] [Commented] (HBASE-11331) [blockcache] lazy block decompression
[ https://issues.apache.org/jira/browse/HBASE-11331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14075234#comment-14075234 ] stack commented on HBASE-11331: --- bq. tilted against. ... it being on by default. For some use cases enabling it will make sense, but not in the general case.
[jira] [Commented] (HBASE-11331) [blockcache] lazy block decompression
[ https://issues.apache.org/jira/browse/HBASE-11331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14075233#comment-14075233 ] stack commented on HBASE-11331: --- [~ndimiduk] IMO, this can't be on by default given the previous report. Benefit is not enough. Will post a new report in the next few days but think the benefit vs cost will be about the same; tilted against.
[jira] [Commented] (HBASE-11331) [blockcache] lazy block decompression
[ https://issues.apache.org/jira/browse/HBASE-11331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14073878#comment-14073878 ] Nick Dimiduk commented on HBASE-11331: -- bq. How feasible keeping count of how many times a block has been decompressed and if over a configurable threshold, instead shove the decompressed block back into the block cache in place of the compressed one? We already count if been accessed more than once? Could we leverage this fact? I like it. Do you think that's necessary for this feature, or an acceptable follow-on JIRA?
[jira] [Commented] (HBASE-11331) [blockcache] lazy block decompression
[ https://issues.apache.org/jira/browse/HBASE-11331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14073883#comment-14073883 ] Nick Dimiduk commented on HBASE-11331: -- bq. Or, if hot, move the decompressed and decoded block up into L1? This sounds like a feature to add to CombinedCache. Can blocks become less hot and be demoted back down to a compressed state in L2, or is promotion a one-way street? I guess regular block eviction will take care of this naturally.
[jira] [Commented] (HBASE-11331) [blockcache] lazy block decompression
[ https://issues.apache.org/jira/browse/HBASE-11331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14050892#comment-14050892 ] stack commented on HBASE-11331: --- bq. How feasible keeping count of how many times a block has been decompressed and if over a configurable threshold, instead shove the decompressed block back into the block cache in place of the compressed one? Or, if hot, move the decompressed and decoded block up into L1?
[jira] [Commented] (HBASE-11331) [blockcache] lazy block decompression
[ https://issues.apache.org/jira/browse/HBASE-11331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14037824#comment-14037824 ] stack commented on HBASE-11331: --- How feasible keeping count of how many times a block has been decompressed and if over a configurable threshold, instead shove the decompressed block back into the block cache in place of the compressed one? We already count if been accessed more than once? Could we leverage this fact? bq. This is related to but less invasive than HBASE-8894. Would a better characterization be that this is a core piece of HBASE-8894, only done more in line w/ how the hbase master branch works now? (HBASE-8894 interjects a special-case handling of its L2 cache when reading blocks from HDFS... This makes do without special interjection.)
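The threshold idea raised in this comment (count decompressions per block and, past a configurable limit, swap the decompressed block into the cache in place of the compressed one) could look roughly like the sketch below. Everything here is an assumption for illustration: `PromotingBlockCache`, its methods, and the string keys are hypothetical, the `unpacker` function stands in for whatever decompress/decrypt step the real cache would use, and the sketch is not thread-safe and does no size accounting or eviction.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.UnaryOperator;

// Hypothetical sketch; not HBase's BlockCache API.
final class PromotingBlockCache {

    private static final class Entry {
        byte[] bytes;      // packed (compressed) or unpacked payload
        boolean packed;    // true while the payload is still compressed
        int unpackCount;   // how many times this block has been unpacked on read
    }

    private final Map<String, Entry> entries = new HashMap<>();
    private final int promoteThreshold;
    private final UnaryOperator<byte[]> unpacker;

    PromotingBlockCache(int promoteThreshold, UnaryOperator<byte[]> unpacker) {
        if (promoteThreshold < 1) {
            throw new IllegalArgumentException("promoteThreshold must be >= 1");
        }
        this.promoteThreshold = promoteThreshold;
        this.unpacker = unpacker;
    }

    /** Caches a block in its packed (compressed) form. */
    void putPacked(String key, byte[] packedBytes) {
        Entry e = new Entry();
        e.bytes = packedBytes;
        e.packed = true;
        entries.put(key, e);
    }

    /** Returns the unpacked block, or null on a cache miss. */
    byte[] get(String key) {
        Entry e = entries.get(key);
        if (e == null) {
            return null;
        }
        if (!e.packed) {
            return e.bytes; // already promoted: no unpack cost
        }
        byte[] unpacked = unpacker.apply(e.bytes);
        e.unpackCount++;
        if (e.unpackCount >= promoteThreshold) {
            // Hot block: keep the unpacked form in place of the packed one.
            e.bytes = unpacked;
            e.packed = false;
        }
        return unpacked;
    }

    /** True if the key is cached and still held in packed form. */
    boolean isPacked(String key) {
        Entry e = entries.get(key);
        return e != null && e.packed;
    }
}
```

The trade-off is the one discussed in the thread: a promoted block costs more cache space but no longer pays the unpack cost on every read, so the threshold decides how hot a block must be before that exchange is worthwhile.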
[jira] [Commented] (HBASE-11331) [blockcache] lazy block decompression
[ https://issues.apache.org/jira/browse/HBASE-11331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14030586#comment-14030586 ]

Anoop Sam John commented on HBASE-11331:

Does not decrypting really give (so much) of a size saving?

bq. decompressing every time I read from a block rather than as we have now where we decompress once

Yes, that is a concerning factor, and +1 for making it configurable.
[jira] [Commented] (HBASE-11331) [blockcache] lazy block decompression
[ https://issues.apache.org/jira/browse/HBASE-11331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14030948#comment-14030948 ]

Vladimir Rodionov commented on HBASE-11331:

These two statements contradict each other:

{quote}
Maintaining data in its compressed form in the block cache will greatly increase our effective blockcache size and should show a meaning improvement in cache hit rates in well designed applications.
{quote}

{quote}
The idea here is to lazily decompress/decrypt blocks when they're consumed, rather than as soon as they're pulled off of disk.
{quote}

You either keep blocks compressed in the cache and decompress them on demand (1), or you decompress them lazily and keep them decompressed after that (2). What does lazy decompression mean in this case? If you cache blocks on reads only (most of the time, and the default behavior), lazy decompression makes little sense, because the block will be accessed immediately after it gets into the cache. Lazy decompression makes sense only if you cache blocks on writes, but in that case (2) contradicts (1), as I mentioned already.
[jira] [Commented] (HBASE-11331) [blockcache] lazy block decompression
[ https://issues.apache.org/jira/browse/HBASE-11331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14030968#comment-14030968 ]

Nick Dimiduk commented on HBASE-11331:

In this implementation, the decompressed block does not replace the compressed block in the cache. The decompression cost is paid on every block access. I need to profile the scanner path to ensure a single request is not decompressing the same block multiple times. For hot blocks, I expect this to result in increased CPU load compared with decompressing the block only once. For a more evenly distributed access pattern, this should greatly reduce the number of disk seeks because more data is cached. I believe the latter use case is more common.
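The behavior described in this comment (the compressed block stays in the cache and every access pays for decompression) can be illustrated with a small sketch. It is not the patch itself: `LazyDecompressCache` and its methods are hypothetical names, and `java.util.zip` is assumed as a stand-in codec.

```java
import java.io.ByteArrayOutputStream;
import java.util.HashMap;
import java.util.Map;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

/** Hypothetical sketch; this is not the HBase BlockCache API. */
class LazyDecompressCache {
  private final Map<String, byte[]> compressedBlocks = new HashMap<>();
  private long decompressionCount = 0;

  /** Stores only the compressed form of the block. */
  void put(String key, byte[] uncompressed) {
    Deflater deflater = new Deflater();
    deflater.setInput(uncompressed);
    deflater.finish();
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    byte[] buf = new byte[4096];
    while (!deflater.finished()) {
      out.write(buf, 0, deflater.deflate(buf));
    }
    deflater.end();
    compressedBlocks.put(key, out.toByteArray());
  }

  /** Decompresses on every call; the result is handed to the caller and never cached. */
  byte[] get(String key) {
    byte[] compressed = compressedBlocks.get(key);
    if (compressed == null) return null;
    decompressionCount++;
    Inflater inflater = new Inflater();
    inflater.setInput(compressed);
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    byte[] buf = new byte[4096];
    try {
      while (!inflater.finished()) {
        int n = inflater.inflate(buf);
        if (n == 0 && inflater.needsInput()) break; // truncated input: stop rather than spin
        out.write(buf, 0, n);
      }
    } catch (DataFormatException ex) {
      throw new IllegalStateException("corrupt block", ex);
    } finally {
      inflater.end();
    }
    return out.toByteArray();
  }

  /** Bytes actually held by the cache, i.e. the compressed footprint. */
  long cachedBytes() {
    long total = 0;
    for (byte[] b : compressedBlocks.values()) total += b.length;
    return total;
  }

  long decompressionCount() { return decompressionCount; }
}
```

Reading the same key twice bumps `decompressionCount()` twice, which is the CPU cost for hot blocks; `cachedBytes()` staying below the uncompressed size is the capacity gain for evenly distributed access.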
[jira] [Commented] (HBASE-11331) [blockcache] lazy block decompression
[ https://issues.apache.org/jira/browse/HBASE-11331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14030986#comment-14030986 ]

Vladimir Rodionov commented on HBASE-11331:

[~ndimiduk], where do you keep decompressed blocks? In a fast on-heap cache? You do not have to decompress a block every time you access it; decompress it once, and all subsequent scanner.next calls will read from the decompressed block. Sorry, I am not following you here.
[jira] [Commented] (HBASE-11331) [blockcache] lazy block decompression
[ https://issues.apache.org/jira/browse/HBASE-11331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14031002#comment-14031002 ]

Vladimir Rodionov commented on HBASE-11331:

In my own tests I have seen performance drop from ~100K to ~75K ops with compression on. This is with a custom LZ4 codec, and YMMV. I think a 20-25% penalty is not big enough to justify the uncompressed mode of operation. In many cases, having more data in the cache is much more important than peak performance.
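One way to reason about this trade-off is a simple expected-cost model: every read pays the hit cost with probability equal to the hit rate and the miss cost otherwise. The sketch below is only an illustration. `CacheCostModel` and all cost units are hypothetical, the model ignores CPU contention, and it assumes the miss cost is the same in both modes. Only the 100K to 75K ratio comes from the comment above; it implies each cached read takes 1/0.75 (about 1.33) times as long.

```java
/** Hypothetical back-of-the-envelope model; all costs are in arbitrary time units. */
class CacheCostModel {
  /** Expected cost of one read: hits pay hitCost, misses pay missCost. */
  static double expectedCost(double hitRate, double hitCost, double missCost) {
    if (hitRate < 0.0 || hitRate > 1.0) {
      throw new IllegalArgumentException("hitRate must be in [0, 1]");
    }
    return hitRate * hitCost + (1.0 - hitRate) * missCost;
  }

  /**
   * Hit rate a cache with slower hits (slowHitCost) needs in order to match the
   * expected cost of the baseline cache. Solves
   *   h * slowHitCost + (1 - h) * missCost = expectedCost(baseHitRate, baseHitCost, missCost)
   * for h. Requires missCost > slowHitCost.
   */
  static double breakEvenHitRate(double baseHitRate, double baseHitCost,
                                 double slowHitCost, double missCost) {
    if (missCost <= slowHitCost) {
      throw new IllegalArgumentException("missCost must exceed slowHitCost");
    }
    double base = expectedCost(baseHitRate, baseHitCost, missCost);
    return (missCost - base) / (missCost - slowHitCost);
  }
}
```

For example, with an assumed 80% baseline hit rate, a hit cost of 10 units, and a miss cost of 1000 units, `breakEvenHitRate(0.80, 10.0, 10.0 / 0.75, 1000.0)` comes out at roughly 0.803: under these made-up numbers, a compressed cache breaks even if the extra capacity lifts the hit rate by a fraction of a percentage point.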
[jira] [Commented] (HBASE-11331) [blockcache] lazy block decompression
[ https://issues.apache.org/jira/browse/HBASE-11331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14031006#comment-14031006 ]

Nick Dimiduk commented on HBASE-11331:

Decompressed blocks aren't stored anywhere in the patch posted. This patch, just like the current master code, decompresses an HFileBlock into an on-heap ByteBuffer. There's no additional cache layer for decompressed blocks as it stands currently; they're decompressed, consumed, and thrown away. HFileReaderV2$AbstractScannerV2 keeps a reference to the current block, so a single GET shouldn't pay the cost of decompression multiple times, but I need to confirm this is true.
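The point about the scanner holding a reference to the current block can be sketched as below. This is an assumption-laden illustration, not the code of HFileReaderV2$AbstractScannerV2: `CurrentBlockScanner` is a hypothetical class, and the `IntFunction` passed to it stands in for "look up the block in the cache and decompress it".

```java
import java.util.function.IntFunction;

/** Hypothetical sketch of a scanner that decompresses a block once per visit. */
class CurrentBlockScanner {
  // Stand-in for cache lookup plus decompression of the block at a given index.
  private final IntFunction<byte[]> blockLoader;
  private int currentIndex;
  private byte[] currentBlock; // reference to the decompressed current block

  CurrentBlockScanner(IntFunction<byte[]> blockLoader) {
    this.blockLoader = blockLoader;
  }

  /** Reads one byte; the loader runs only when the scanner moves to a different block. */
  byte read(int blockIndex, int offset) {
    if (currentBlock == null || blockIndex != currentIndex) {
      currentBlock = blockLoader.apply(blockIndex);
      currentIndex = blockIndex;
    }
    return currentBlock[offset];
  }
}
```

Repeated reads within one block reuse the held reference, so the decompression cost is paid per block visited rather than per cell read. Once the scanner moves on, the decompressed bytes become garbage, matching the "decompressed, consumed, and thrown away" description.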
[jira] [Commented] (HBASE-11331) [blockcache] lazy block decompression
[ https://issues.apache.org/jira/browse/HBASE-11331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14031009#comment-14031009 ]

Nick Dimiduk commented on HBASE-11331:

bq. In many cases, having more data in a cache is much more important than peak performance.

We are in complete agreement on this point.
[jira] [Commented] (HBASE-11331) [blockcache] lazy block decompression
[ https://issues.apache.org/jira/browse/HBASE-11331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14028512#comment-14028512 ]

Hadoop QA commented on HBASE-11331:

-1 overall. Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/12649880/HBASE-11331.00.patch
  against trunk revision .
  ATTACHMENT ID: 12649880

+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 24 new or modified tests.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
+1 javadoc. The javadoc tool did not generate any warning messages.
+1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
+1 lineLengths. The patch does not introduce lines longer than 100
+1 site. The mvn site goal succeeds with this patch.
-1 core tests. The patch failed these unit tests:
  org.apache.hadoop.hbase.io.encoding.TestLoadAndSwitchEncodeOnDisk
  org.apache.hadoop.hbase.io.hfile.TestCacheOnWrite
  org.apache.hadoop.hbase.regionserver.TestMultiColumnScanner
  org.apache.hadoop.hbase.io.hfile.TestHFileBlock
-1 core zombie tests. There are 1 zombie test(s):
  at org.apache.hadoop.hbase.regionserver.TestHRegion.testWritesWhileGetting(TestHRegion.java:3499)

Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/9747//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/9747//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/9747//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/9747//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/9747//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-thrift.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/9747//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/9747//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/9747//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/9747//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/9747//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/9747//console

This message is automatically generated.
[jira] [Commented] (HBASE-11331) [blockcache] lazy block decompression
[ https://issues.apache.org/jira/browse/HBASE-11331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14028647#comment-14028647 ]

stack commented on HBASE-11331:

Looks great to me, [~ndimiduk]. So with this patch I am decompressing every time I read from a block, rather than, as we have now, decompressing once and keeping that result in the blockcache? It will be interesting to see how this compares. It may show that this should be optional behavior (we give up CPU here... we need to start winning it back elsewhere).