[jira] [Updated] (HBASE-4683) Always cache index and bloom blocks

2012-02-09 Thread Phabricator (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4683?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Phabricator updated HBASE-4683:
---

Attachment: D1695.1.patch

mbautin requested code review of [jira] [HBASE-4683] Test that we always cache 
index and bloom blocks.
Reviewers: JIRA, jdcryans, lhofhansl, Liyin

  This was reviewed D807 but was not checked in. Submitting unit test as a 
separate patch, and extending the scope of the test to also handle the case 
when block cache is enabled for the column family.

TEST PLAN
  Run unit tests

REVISION DETAIL
  https://reviews.facebook.net/D1695

AFFECTED FILES
  src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java
  
src/test/java/org/apache/hadoop/hbase/io/hfile/TestForceCacheImportantBlocks.java

MANAGE HERALD DIFFERENTIAL RULES
  https://reviews.facebook.net/herald/view/differential/

WHY DID I GET THIS EMAIL?
  https://reviews.facebook.net/herald/transcript/3621/

Tip: use the X-Herald-Rules header to filter Herald messages in your client.


 Always cache index and bloom blocks
 ---

 Key: HBASE-4683
 URL: https://issues.apache.org/jira/browse/HBASE-4683
 Project: HBase
  Issue Type: New Feature
Reporter: Lars Hofhansl
Assignee: Mikhail Bautin
Priority: Minor
 Fix For: 0.94.0, 0.92.0

 Attachments: 0001-Cache-important-block-types.patch, 4683-v2.txt, 
 4683.txt, D1695.1.patch, D807.1.patch, D807.2.patch, D807.3.patch, 
 HBASE-4683-0.92-v2.patch, HBASE-4683-v3.patch


 This would add a new boolean config option: hfile.block.cache.datablocks
 Default would be true.
 Setting this to false allows HBase in a mode where only index blocks are 
 cached, which is useful for analytical scenarios where a useful working set 
 of the data cannot be expected to fit into the (aggregate) cache.
 This is the equivalent of setting cacheBlocks to false on all scans 
 (including scans on behalf of gets).
 I would like to get a general feeling about what folks think about this.
 The change itself would be simple.
 Update (Mikhail): we probably don't need a new conf option. Instead, we will 
 make index blocks cached by default.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-4683) Always cache index and bloom blocks

2011-12-14 Thread Phabricator (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4683?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Phabricator updated HBASE-4683:
---

Attachment: D807.3.patch

mbautin updated the revision [jira] [HBASE-4683] Always cache index and bloom 
blocks.
Reviewers: jdcryans, lhofhansl, JIRA

  Rebasing up to r1214519.

REVISION DETAIL
  https://reviews.facebook.net/D807

AFFECTED FILES
  src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileWriter.java
  src/main/java/org/apache/hadoop/hbase/io/hfile/CacheConfig.java
  src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java
  src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV1.java
  src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java
  src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV2.java
  src/main/java/org/apache/hadoop/hbase/regionserver/Store.java
  src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java
  
src/main/java/org/apache/hadoop/hbase/regionserver/metrics/SchemaConfigured.java
  src/main/java/org/apache/hadoop/hbase/regionserver/metrics/SchemaMetrics.java
  src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java
  
src/test/java/org/apache/hadoop/hbase/io/hfile/TestForceCacheImportantBlocks.java
  src/test/java/org/apache/hadoop/hbase/regionserver/CreateRandomStoreFile.java
  src/test/java/org/apache/hadoop/hbase/regionserver/TestHRegion.java
  src/test/java/org/apache/hadoop/hbase/regionserver/TestMultiColumnScanner.java
  src/test/java/org/apache/hadoop/hbase/regionserver/TestScanWithBloomError.java
  src/test/java/org/apache/hadoop/hbase/regionserver/TestSeekOptimizations.java
  
src/test/java/org/apache/hadoop/hbase/regionserver/metrics/TestSchemaConfigured.java


 Always cache index and bloom blocks
 ---

 Key: HBASE-4683
 URL: https://issues.apache.org/jira/browse/HBASE-4683
 Project: HBase
  Issue Type: New Feature
Reporter: Lars Hofhansl
Assignee: Mikhail Bautin
Priority: Minor
 Fix For: 0.92.0, 0.94.0

 Attachments: 4683-v2.txt, 4683.txt, D807.1.patch, D807.2.patch, 
 D807.3.patch, HBASE-4683-0.92-v2.patch, HBASE-4683-v3.patch


 This would add a new boolean config option: hfile.block.cache.datablocks
 Default would be true.
 Setting this to false allows HBase in a mode where only index blocks are 
 cached, which is useful for analytical scenarios where a useful working set 
 of the data cannot be expected to fit into the (aggregate) cache.
 This is the equivalent of setting cacheBlocks to false on all scans 
 (including scans on behalf of gets).
 I would like to get a general feeling about what folks think about this.
 The change itself would be simple.
 Update (Mikhail): we probably don't need a new conf option. Instead, we will 
 make index blocks cached by default.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-4683) Always cache index and bloom blocks

2011-12-14 Thread Mikhail Bautin (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4683?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mikhail Bautin updated HBASE-4683:
--

Attachment: 0001-Cache-important-block-types.patch

Attaching the patch rebased on top of r1214519.

 Always cache index and bloom blocks
 ---

 Key: HBASE-4683
 URL: https://issues.apache.org/jira/browse/HBASE-4683
 Project: HBase
  Issue Type: New Feature
Reporter: Lars Hofhansl
Assignee: Mikhail Bautin
Priority: Minor
 Fix For: 0.92.0, 0.94.0

 Attachments: 0001-Cache-important-block-types.patch, 4683-v2.txt, 
 4683.txt, D807.1.patch, D807.2.patch, D807.3.patch, HBASE-4683-0.92-v2.patch, 
 HBASE-4683-v3.patch


 This would add a new boolean config option: hfile.block.cache.datablocks
 Default would be true.
 Setting this to false allows HBase in a mode where only index blocks are 
 cached, which is useful for analytical scenarios where a useful working set 
 of the data cannot be expected to fit into the (aggregate) cache.
 This is the equivalent of setting cacheBlocks to false on all scans 
 (including scans on behalf of gets).
 I would like to get a general feeling about what folks think about this.
 The change itself would be simple.
 Update (Mikhail): we probably don't need a new conf option. Instead, we will 
 make index blocks cached by default.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-4683) Always cache index and bloom blocks

2011-12-13 Thread Phabricator (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4683?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Phabricator updated HBASE-4683:
---

Attachment: D807.2.patch

mbautin updated the revision [jira] [HBASE-4683] Always cache index and bloom 
blocks.
Reviewers: jdcryans, lhofhansl, JIRA

  Actually, the metrics changes are important. The unit test uses per-CF 
metrics to determine whether the blocks have been cached (or not cached, for 
data blocks), and when running the unit test I discovered some bugs in how 
table and CF name are passed down to HFile readers/writers, so I had to clean 
that up a bit.

REVISION DETAIL
  https://reviews.facebook.net/D807

AFFECTED FILES
  src/main/java/org/apache/hadoop/hbase/io/hfile/CacheConfig.java
  src/main/java/org/apache/hadoop/hbase/io/hfile/HFile.java
  src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV1.java
  src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java
  src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV2.java
  src/main/java/org/apache/hadoop/hbase/regionserver/Store.java
  src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java
  
src/main/java/org/apache/hadoop/hbase/regionserver/metrics/SchemaConfigured.java
  src/main/java/org/apache/hadoop/hbase/regionserver/metrics/SchemaMetrics.java
  src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java
  
src/test/java/org/apache/hadoop/hbase/io/hfile/TestForceCacheImportantBlocks.java
  src/test/java/org/apache/hadoop/hbase/regionserver/CreateRandomStoreFile.java
  src/test/java/org/apache/hadoop/hbase/regionserver/TestHRegion.java
  src/test/java/org/apache/hadoop/hbase/regionserver/TestMultiColumnScanner.java
  src/test/java/org/apache/hadoop/hbase/regionserver/TestScanWithBloomError.java
  src/test/java/org/apache/hadoop/hbase/regionserver/TestSeekOptimizations.java
  
src/test/java/org/apache/hadoop/hbase/regionserver/metrics/TestSchemaConfigured.java


 Always cache index and bloom blocks
 ---

 Key: HBASE-4683
 URL: https://issues.apache.org/jira/browse/HBASE-4683
 Project: HBase
  Issue Type: New Feature
Reporter: Lars Hofhansl
Assignee: Mikhail Bautin
Priority: Minor
 Fix For: 0.94.0

 Attachments: 4683-v2.txt, 4683.txt, D807.1.patch, D807.2.patch, 
 HBASE-4683-v3.patch


 This would add a new boolean config option: hfile.block.cache.datablocks
 Default would be true.
 Setting this to false allows HBase in a mode where only index blocks are 
 cached, which is useful for analytical scenarios where a useful working set 
 of the data cannot be expected to fit into the (aggregate) cache.
 This is the equivalent of setting cacheBlocks to false on all scans 
 (including scans on behalf of gets).
 I would like to get a general feeling about what folks think about this.
 The change itself would be simple.
 Update (Mikhail): we probably don't need a new conf option. Instead, we will 
 make index blocks cached by default.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-4683) Always cache index and bloom blocks

2011-12-13 Thread Jean-Daniel Cryans (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4683?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jean-Daniel Cryans updated HBASE-4683:
--

Fix Version/s: 0.92.0

This needs to be in 0.92 too, the potential performance regression is too big.

 Always cache index and bloom blocks
 ---

 Key: HBASE-4683
 URL: https://issues.apache.org/jira/browse/HBASE-4683
 Project: HBase
  Issue Type: New Feature
Reporter: Lars Hofhansl
Assignee: Mikhail Bautin
Priority: Minor
 Fix For: 0.92.0, 0.94.0

 Attachments: 4683-v2.txt, 4683.txt, D807.1.patch, D807.2.patch, 
 HBASE-4683-v3.patch


 This would add a new boolean config option: hfile.block.cache.datablocks
 Default would be true.
 Setting this to false allows HBase in a mode where only index blocks are 
 cached, which is useful for analytical scenarios where a useful working set 
 of the data cannot be expected to fit into the (aggregate) cache.
 This is the equivalent of setting cacheBlocks to false on all scans 
 (including scans on behalf of gets).
 I would like to get a general feeling about what folks think about this.
 The change itself would be simple.
 Update (Mikhail): we probably don't need a new conf option. Instead, we will 
 make index blocks cached by default.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-4683) Always cache index and bloom blocks

2011-12-12 Thread Mikhail Bautin (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4683?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mikhail Bautin updated HBASE-4683:
--

Summary: Always cache index and bloom blocks  (was: Always cache index 
blocks)

 Always cache index and bloom blocks
 ---

 Key: HBASE-4683
 URL: https://issues.apache.org/jira/browse/HBASE-4683
 Project: HBase
  Issue Type: New Feature
Reporter: Lars Hofhansl
Assignee: Mikhail Bautin
Priority: Minor
 Fix For: 0.94.0

 Attachments: 4683-v2.txt, 4683.txt, HBASE-4683-v3.patch


 This would add a new boolean config option: hfile.block.cache.datablocks
 Default would be true.
 Setting this to false allows HBase in a mode where only index blocks are 
 cached, which is useful for analytical scenarios where a useful working set 
 of the data cannot be expected to fit into the (aggregate) cache.
 This is the equivalent of setting cacheBlocks to false on all scans 
 (including scans on behalf of gets).
 I would like to get a general feeling about what folks think about this.
 The change itself would be simple.
 Update (Mikhail): we probably don't need a new conf option. Instead, we will 
 make index blocks cached by default.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-4683) Always cache index and bloom blocks

2011-12-12 Thread Phabricator (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4683?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Phabricator updated HBASE-4683:
---

Attachment: D807.1.patch

mbautin requested code review of [jira] [HBASE-4683] Always cache index and 
bloom blocks.
Reviewers: jdcryans, lhofhansl, JIRA

  Making sure that we always cache index and bloom blocks, as long as we have 
block cache, even if the cache is disabled for the CF. Adding a unit test to 
verify this using metrics. In process of doing this, I've factored out a region 
creation method in HBaseTestingUtility, and cleaned up a few things in HFile 
readers/writers. I have also categorized HFile v1 bloom blocks (that are 
technically meta blocks) as bloom blocks in metrics.

TEST PLAN
  Unit tests

REVISION DETAIL
  https://reviews.facebook.net/D807

AFFECTED FILES
  src/main/java/org/apache/hadoop/hbase/io/hfile/CacheConfig.java
  src/main/java/org/apache/hadoop/hbase/io/hfile/HFile.java
  src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV1.java
  src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java
  
src/main/java/org/apache/hadoop/hbase/regionserver/metrics/SchemaConfigured.java
  src/main/java/org/apache/hadoop/hbase/regionserver/metrics/SchemaMetrics.java
  src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java
  
src/test/java/org/apache/hadoop/hbase/io/hfile/TestForceCacheImportantBlocks.java
  src/test/java/org/apache/hadoop/hbase/regionserver/CreateRandomStoreFile.java
  src/test/java/org/apache/hadoop/hbase/regionserver/TestHRegion.java
  src/test/java/org/apache/hadoop/hbase/regionserver/TestMultiColumnScanner.java
  src/test/java/org/apache/hadoop/hbase/regionserver/TestScanWithBloomError.java
  src/test/java/org/apache/hadoop/hbase/regionserver/TestSeekOptimizations.java

MANAGE HERALD DIFFERENTIAL RULES
  https://reviews.facebook.net/herald/view/differential/

WHY DID I GET THIS EMAIL?
  https://reviews.facebook.net/herald/transcript/1719/

Tip: use the X-Herald-Rules header to filter Herald messages in your client.


 Always cache index and bloom blocks
 ---

 Key: HBASE-4683
 URL: https://issues.apache.org/jira/browse/HBASE-4683
 Project: HBase
  Issue Type: New Feature
Reporter: Lars Hofhansl
Assignee: Mikhail Bautin
Priority: Minor
 Fix For: 0.94.0

 Attachments: 4683-v2.txt, 4683.txt, D807.1.patch, HBASE-4683-v3.patch


 This would add a new boolean config option: hfile.block.cache.datablocks
 Default would be true.
 Setting this to false allows HBase in a mode where only index blocks are 
 cached, which is useful for analytical scenarios where a useful working set 
 of the data cannot be expected to fit into the (aggregate) cache.
 This is the equivalent of setting cacheBlocks to false on all scans 
 (including scans on behalf of gets).
 I would like to get a general feeling about what folks think about this.
 The change itself would be simple.
 Update (Mikhail): we probably don't need a new conf option. Instead, we will 
 make index blocks cached by default.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira