[jira] [Commented] (HBASE-2794) Utilize ROWCOL bloom filter if multiple columns within same family are requested in a Get

2011-09-30 Thread Ted Yu (Commented) (JIRA)


[ https://issues.apache.org/jira/browse/HBASE-2794?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13118265#comment-13118265 ]

Ted Yu commented on HBASE-2794:
---

Integrated to 0.92 branch and TRUNK.

Thanks for the patch Mikhail.

Thanks for the review Jonathan.

 Utilize ROWCOL bloom filter if multiple columns within same family are 
 requested in a Get
 -

 Key: HBASE-2794
 URL: https://issues.apache.org/jira/browse/HBASE-2794
 Project: HBase
  Issue Type: Improvement
  Components: performance
Reporter: Kannan Muthukkaruppan
Assignee: Mikhail Bautin
 Fix For: 0.92.0


 Noticed the following snippet in StoreFile.java:Scanner:shouldSeek():
 {code}
 switch(bloomFilterType) {
   case ROW:
     key = row;
     break;
   case ROWCOL:
     if (columns.size() == 1) {
       byte[] col = columns.first();
       key = Bytes.add(row, col);
       break;
     }
     //$FALL-THROUGH$
   default:
     return true;
 }
 {code}
 If columns.size() > 1, then we currently don't take advantage of the bloom 
 filter.  We should optimize this to check the bloom for each of the columns 
 and, if none of the columns are present in the bloom, avoid opening the file.
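
 The per-column check proposed above could look roughly like the following. This is a minimal sketch, not the committed patch: ToyBloom, concat and shouldSeek here are hypothetical names, and the toy filter only stands in for the store file's real Bloom filter. The one behaviour taken from the snippet above is that a request with no explicit columns cannot be pruned and returns true.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.BitSet;
import java.util.List;

/** Illustrative sketch only; all names are hypothetical, not HBase APIs. */
class RowColBloomSketch {

  /** Toy Bloom filter over byte[] keys, standing in for a store file's filter. */
  static final class ToyBloom {
    private final BitSet bits;
    private final int size;
    private final int hashes;

    ToyBloom(int size, int hashes) {
      this.bits = new BitSet(size);
      this.size = size;
      this.hashes = hashes;
    }

    /** i-th probe position, derived from two hash values (double hashing). */
    private int index(byte[] key, int i) {
      int h1 = Arrays.hashCode(key);
      int h2 = (h1 >>> 16) | 1; // forced odd so successive probes differ
      return Math.floorMod(h1 + i * h2, size);
    }

    void add(byte[] key) {
      for (int i = 0; i < hashes; i++) {
        bits.set(index(key, i));
      }
    }

    /** False means definitely absent; true means possibly present. */
    boolean mightContain(byte[] key) {
      for (int i = 0; i < hashes; i++) {
        if (!bits.get(index(key, i))) {
          return false;
        }
      }
      return true;
    }
  }

  /** Builds the row+column key, playing the role of Bytes.add(row, col). */
  static byte[] concat(byte[] row, byte[] col) {
    byte[] key = Arrays.copyOf(row, row.length + col.length);
    System.arraycopy(col, 0, key, row.length, col.length);
    return key;
  }

  /** Seek only if at least one requested row+column key may be in the file. */
  static boolean shouldSeek(ToyBloom bloom, byte[] row, List<byte[]> columns) {
    if (columns.isEmpty()) {
      return true; // no explicit columns: the filter cannot rule the file out
    }
    for (byte[] col : columns) {
      if (bloom.mightContain(concat(row, col))) {
        return true;
      }
    }
    return false; // every requested column is definitely absent
  }

  public static void main(String[] args) {
    ToyBloom bloom = new ToyBloom(1 << 16, 3);
    byte[] row = "row1".getBytes(StandardCharsets.UTF_8);
    bloom.add(concat(row, "colB".getBytes(StandardCharsets.UTF_8)));
    List<byte[]> wanted = Arrays.asList(
        "colA".getBytes(StandardCharsets.UTF_8),
        "colB".getBytes(StandardCharsets.UTF_8));
    System.out.println("seek this file? " + shouldSeek(bloom, row, wanted));
  }
}
```

 Because a Bloom filter can report false positives but never false negatives, returning false here is safe: the file can be skipped only when every requested column is reported absent.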

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HBASE-2794) Utilize ROWCOL bloom filter if multiple columns within same family are requested in a Get

2011-09-30 Thread jirapos...@reviews.apache.org (Commented) (JIRA)


[ https://issues.apache.org/jira/browse/HBASE-2794?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13118294#comment-13118294 ]

jirapos...@reviews.apache.org commented on HBASE-2794:
--


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/2084/#review2226
---

Ship it!


I'm +0 on committing this.  I tried reviewing it but I don't know this code 
well.  The added unit test is nicely intrusive and the asserts look right.  
What about Nicolas's performance concerns?  How are they addressed by this 
patch?  I'm running a build of the patch and if that passes I'm +1 on commit. 


src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
https://reviews.apache.org/r/2084/#comment5175

Interesting method name.  We should use this pattern everywhere we have to 
do this.



src/main/java/org/apache/hadoop/hbase/regionserver/KeyValueHeap.java
https://reviews.apache.org/r/2084/#comment5176

Should we get rid of this javadoc if it is an override?  (Let us know; we can 
do it on commit.)


- Michael


On 2011-09-29 21:05:20, Mikhail Bautin wrote:
bq.  
bq.  ---
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/2084/
bq.  ---
bq.  
bq.  (Updated 2011-09-29 21:05:20)
bq.  
bq.  
bq.  Review request for hbase.
bq.  
bq.  
bq.  Summary
bq.  ---
bq.  
bq.  Previously we only used row-column Bloom filters for scans that only requested one column. We have seen production queries that request up to 200 columns, and with say ~6 store files per store (region / column family combination) this might have resulted in 1200 block read operations in the worst case. With this diff we will be avoiding seeks on store files that we know don't contain the row/column of interest when using an ExplicitColumnTracker. The performance should remain the same for column range queries.
bq.  
bq.  
bq.  This addresses bug HBASE-2794.
bq.  https://issues.apache.org/jira/browse/HBASE-2794
bq.  
bq.  
bq.  Diffs
bq.  -
bq.  
bq.    src/main/java/org/apache/hadoop/hbase/KeyValue.java 585c4a8 
bq.    src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileReader.java f5173c4 
bq.    src/main/java/org/apache/hadoop/hbase/io/hfile/HFile.java a3d778e 
bq.    src/main/java/org/apache/hadoop/hbase/regionserver/AbstractKeyValueScanner.java PRE-CREATION 
bq.    src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java 7cbdb98 
bq.    src/main/java/org/apache/hadoop/hbase/regionserver/KeyValueHeap.java 9d9895c 
bq.    src/main/java/org/apache/hadoop/hbase/regionserver/KeyValueScanner.java 6cdada7 
bq.    src/main/java/org/apache/hadoop/hbase/regionserver/MemStore.java 4aa72de 
bq.    src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java 68cdac5 
bq.    src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java fd9e7ef 
bq.    src/main/java/org/apache/hadoop/hbase/regionserver/StoreFileScanner.java 08d3ba4 
bq.    src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java ac2348e 
bq.    src/main/java/org/apache/hadoop/hbase/util/CollectionBackedScanner.java 32f88fb 
bq.    src/test/java/org/apache/hadoop/hbase/regionserver/TestKeyValueHeap.java a5d13f7 
bq.    src/test/java/org/apache/hadoop/hbase/regionserver/TestMultiColumnScanner.java baee696 
bq.    src/test/java/org/apache/hadoop/hbase/regionserver/TestScanWithBloomError.java PRE-CREATION 
bq.  
bq.  Diff: https://reviews.apache.org/r/2084/diff
bq.  
bq.  
bq.  Testing
bq.  ---
bq.  
bq.  Existing unit tests. A new unit test (TestScanWithBloomError). Load testing using HBaseTest.
bq.  
bq.  
bq.  Thanks,
bq.  
bq.  Mikhail
bq.  
bq.
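
To make the worst-case figure in the quoted summary concrete (200 requested columns times roughly 6 store files gives 1200 block reads), here is a small sketch of the arithmetic and of how pruning reduces it. The class and method names are hypothetical, and each store file is modelled as an exact set of the columns it holds for the row; a real Bloom filter can report false positives, so the pruned count in practice may be somewhat higher.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Illustrative arithmetic only; names are hypothetical, not HBase APIs. */
class SeekCountSketch {

  /** Without per-column pruning, every requested column may touch every store file. */
  static int worstCaseSeeks(int requestedColumns, int storeFiles) {
    return requestedColumns * storeFiles;
  }

  /** With pruning, a seek happens only where the file may hold the column. */
  static int seeksWithFilter(List<String> requested, List<Set<String>> columnsPerFile) {
    int seeks = 0;
    for (Set<String> fileColumns : columnsPerFile) {
      for (String col : requested) {
        if (fileColumns.contains(col)) {
          seeks++;
        }
      }
    }
    return seeks;
  }

  public static void main(String[] args) {
    System.out.println("worst case: " + worstCaseSeeks(200, 6));

    List<String> requested = Arrays.asList("a", "b", "c");
    List<Set<String>> files = Arrays.<Set<String>>asList(
        new HashSet<String>(Arrays.asList("a")),
        new HashSet<String>(),
        new HashSet<String>(Arrays.asList("b", "c")));
    System.out.println("unpruned: " + worstCaseSeeks(requested.size(), files.size()));
    System.out.println("pruned:   " + seeksWithFilter(requested, files));
  }
}
```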




[jira] [Commented] (HBASE-2794) Utilize ROWCOL bloom filter if multiple columns within same family are requested in a Get

2011-09-30 Thread stack (Commented) (JIRA)


[ https://issues.apache.org/jira/browse/HBASE-2794?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13118449#comment-13118449 ]

stack commented on HBASE-2794:
--

These failed after running full suite but seem unrelated:

{code}
Failed tests:
  testOnlineChangeTableSchema(org.apache.hadoop.hbase.client.TestAdmin)
  testForceSplitMultiFamily(org.apache.hadoop.hbase.client.TestAdmin): expected:<2> but was:<1>

Tests in error:
  testEnableDisableAddColumnDeleteColumn(org.apache.hadoop.hbase.client.TestAdmin): org.apache.hadoop.hbase.TableNotEnabledException: testMasterAdmin
{code}



[jira] [Commented] (HBASE-2794) Utilize ROWCOL bloom filter if multiple columns within same family are requested in a Get

2011-09-30 Thread Mikhail Bautin (Commented) (JIRA)


[ https://issues.apache.org/jira/browse/HBASE-2794?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13118471#comment-13118471 ]

Mikhail Bautin commented on HBASE-2794:
---

@Michael: I am observing a different set of spuriously failing tests, also 
seemingly unrelated.

2011-09-29_20_41_15 | tests: 1015, fail: 0, err: 0, skip: 21, time: 6027.3
2011-09-29_23_09_51 | tests: 1012, fail: 0, err: 0, skip: 21, time: 5328.0
2011-09-30_01_44_42 | tests: 1015, fail: 0, err: 0, skip: 21, time: 6338.4
2011-09-30_04_28_29 | tests: 1015, fail: 0, err: 0, skip: 21, time: 6079.2
2011-09-30_07_00_24 | tests: 1015, fail: 1, err: 0, skip: 21, time: 6656.2, failed: Admin
2011-09-30_09_41_53 | tests: 1015, fail: 0, err: 0, skip: 21, time: 5900.8
2011-09-30_12_10_25 | tests: 1004, fail: 1, err: 0, skip: 21, time: 5397.7, failed: DistributedLogSplitting

(Patch applied on top of http://svn.apache.org/repos/asf/hbase/trunk@1176613)



[jira] [Commented] (HBASE-2794) Utilize ROWCOL bloom filter if multiple columns within same family are requested in a Get

2011-09-30 Thread Hudson (Commented) (JIRA)


[ https://issues.apache.org/jira/browse/HBASE-2794?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13118632#comment-13118632 ]

Hudson commented on HBASE-2794:
---

Integrated in HBase-TRUNK #2274 (See 
[https://builds.apache.org/job/HBase-TRUNK/2274/])
HBASE-2794  Utilize ROWCOL bloom filter if multiple columns within same 
family
   are requested in a Get (Mikhail Bautin)

tedyu : 
Files : 
* /hbase/trunk/CHANGES.txt
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/KeyValue.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileReader.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/HFile.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/AbstractKeyValueScanner.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/KeyValueHeap.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/KeyValueScanner.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/MemStore.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreFileScanner.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/util/CollectionBackedScanner.java
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestKeyValueHeap.java
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestMultiColumnScanner.java
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestScanWithBloomError.java


