[jira] [Commented] (HBASE-21065) Try ROW_INDEX_V1 encoding on meta table (fix bloomfilters on meta while we are at it)

2022-03-28 Thread Bryan Beaudreault (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-21065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17513580#comment-17513580
 ] 

Bryan Beaudreault commented on HBASE-21065:
---

Just as an update, we integrated this patch into our environment and also 
updated all of our existing hbase2 clusters to have this setup. That's 
currently only about 70 smallish clusters, but we'll be getting to our bigger 
clusters soon. Will loop back if we encounter any issues, but until then assume 
that no news is good :)

> Try ROW_INDEX_V1 encoding on meta table (fix bloomfilters on meta while we 
> are at it)
> -
>
> Key: HBASE-21065
> URL: https://issues.apache.org/jira/browse/HBASE-21065
> Project: HBase
> Issue Type: Improvement
> Components: meta, Performance
> Reporter: Michael Stack
> Assignee: Andrew Kyle Purtell
> Priority: Major
> Fix For: 3.0.0-alpha-1, 2.5.0, 2.6.0
>
>
> Some users end up hitting meta hard. The bulk of that load is probably because 
> our client goes to meta too often, and the real 'fix' for a saturated meta is 
> splitting it, but the encoding that came in with HBASE-16213, ROW_INDEX_V1, 
> could help in the near term. It adds an index on hfile blocks and helped 
> improve random reads against user-space tables (fewer compares, as we use the 
> index to go direct to the requested Cells rather than look at each Cell in 
> turn until we find what we want -- see RN on HBASE-16213 for citation).
> I also noticed while code-reading that we don't enable blooms on the 
> hbase:meta table; enabling them could save some CPU and speed things up a bit 
> too:
> {code}
> // Disable blooms for meta.  Needs work.  Seems to mess w/ getClosestOrBefore.
> .setBloomFilterType(BloomType.NONE)
> {code}
> This issue is about doing a bit of perf compare of encoding *on* vs the 
> current default (and will check the diff in size of indexed blocks).
> Meta access is mostly random-read, I believe (a review of one user's access 
> showed this, at least for their workload). The nice addition, HBASE-19722 
> Meta query statistics metrics source, would help verify if it saw some usage 
> on a prod cluster.
> If all is good, I'd like to make a small patch, one that could be easily 
> backported, with minimal changes in it.
> As is, it's all a little awkward as the meta table schema is hard-coded and 
> meta is immutable -- stuff we'll have to fix if we want to split meta -- so 
> in the meantime it requires a code change to enable (and a backport of 
> HBASE-16213 -- this patch is in 1.4.0 only currently, perhaps that is 
> enough). The code change to enable is small:
> {code}
> diff --git a/hbase-server/src/main/java/org/apache/hadoop/hbase/util/FSTableDescriptors.java b/hbase-server/src/main/java/org/apache/hadoop/hbase/util/FSTableDescriptors.java
> index 28c7ec3c2f..8f08f94dc1 100644
> --- a/hbase-server/src/main/java/org/apache/hadoop/hbase/util/FSTableDescriptors.java
> +++ b/hbase-server/src/main/java/org/apache/hadoop/hbase/util/FSTableDescriptors.java
> @@ -160,6 +160,7 @@ public class FSTableDescriptors implements TableDescriptors {
>  .setScope(HConstants.REPLICATION_SCOPE_LOCAL)
>  // Disable blooms for meta.  Needs work.  Seems to mess w/ getClosestOrBefore.
>  .setBloomFilterType(BloomType.NONE)
> +.setDataBlockEncoding(org.apache.hadoop.hbase.io.encoding.DataBlockEncoding.ROW_INDEX_V1)
>  .build())
>  .setColumnFamily(ColumnFamilyDescriptorBuilder.newBuilder(HConstants.TABLE_FAMILY)
>  .setMaxVersions(conf.getInt(HConstants.HBASE_META_VERSIONS,
> {code}
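
For readers unfamiliar with why a per-block row index should mean fewer compares on random reads, here is a toy sketch in plain Java. It is not HBase code and does not reflect the actual ROW_INDEX_V1 on-disk layout; the class name RowIndexSketch, the Seek holder, and the "row-NNNN" keys are all made up for illustration. It models one block as a sorted list of row keys and counts key comparisons for a sequential seek versus a binary search over row positions, which is roughly the effect the description attributes to the index.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of seeking inside one sorted data block. Hypothetical; not HBase code. */
public class RowIndexSketch {

  /** Outcome of a seek: position of the row (or -1 if absent) and key comparisons made. */
  public static final class Seek {
    public final int position;
    public final int compares;

    Seek(int position, int compares) {
      this.position = position;
      this.compares = compares;
    }
  }

  /** Sequential seek: compare against each row in turn until the target is found or passed. */
  public static Seek seekSequential(List<String> sortedRows, String target) {
    int compares = 0;
    for (int i = 0; i < sortedRows.size(); i++) {
      compares++;
      int c = sortedRows.get(i).compareTo(target);
      if (c == 0) {
        return new Seek(i, compares);
      }
      if (c > 0) {
        break; // already past where the target would sort
      }
    }
    return new Seek(-1, compares);
  }

  /** Indexed seek: binary search over row positions, as a per-block row index would allow. */
  public static Seek seekIndexed(List<String> sortedRows, String target) {
    int lo = 0;
    int hi = sortedRows.size() - 1;
    int compares = 0;
    while (lo <= hi) {
      int mid = (lo + hi) >>> 1;
      compares++;
      int c = sortedRows.get(mid).compareTo(target);
      if (c == 0) {
        return new Seek(mid, compares);
      }
      if (c < 0) {
        lo = mid + 1;
      } else {
        hi = mid - 1;
      }
    }
    return new Seek(-1, compares);
  }

  public static void main(String[] args) {
    // Build a block of 1024 made-up, lexicographically sorted row keys.
    List<String> rows = new ArrayList<>();
    for (int i = 0; i < 1024; i++) {
      rows.add(String.format("row-%04d", i));
    }
    Seek sequential = seekSequential(rows, "row-0700");
    Seek indexed = seekIndexed(rows, "row-0700");
    System.out.println("sequential compares=" + sequential.compares
        + ", indexed compares=" + indexed.compares);
  }
}
```

The sequential seek costs one comparison per row up to the target, while the indexed seek is bounded by roughly log2 of the row count. Whether that translates into a measurable win on hbase:meta is exactly what the perf compare in this issue was meant to find out.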



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Commented] (HBASE-21065) Try ROW_INDEX_V1 encoding on meta table (fix bloomfilters on meta while we are at it)

2022-03-25 Thread Hudson (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-21065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17512232#comment-17512232
 ] 

Hudson commented on HBASE-21065:


Results for branch branch-2
[build #497 on 
builds.a.o|https://ci-hbase.apache.org/job/HBase%20Nightly/job/branch-2/497/]: 
(x) *{color:red}-1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general 
report|https://ci-hbase.apache.org/job/HBase%20Nightly/job/branch-2/497/General_20Nightly_20Build_20Report/]




(x) {color:red}-1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://ci-hbase.apache.org/job/HBase%20Nightly/job/branch-2/497/JDK8_20Nightly_20Build_20Report_20_28Hadoop2_29/]


(/) {color:green}+1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) 
report|https://ci-hbase.apache.org/job/HBase%20Nightly/job/branch-2/497/JDK8_20Nightly_20Build_20Report_20_28Hadoop3_29/]


(/) {color:green}+1 jdk11 hadoop3 checks{color}
-- For more information [see jdk11 
report|https://ci-hbase.apache.org/job/HBase%20Nightly/job/branch-2/497/JDK11_20Nightly_20Build_20Report_20_28Hadoop3_29/]


(/) {color:green}+1 source release artifact{color}
-- See build output for details.


(x) {color:red}-1 client integration test{color}
-- Failed when running client tests on top of Hadoop 2. [see log for 
details|https://ci-hbase.apache.org/job/HBase%20Nightly/job/branch-2/497//artifact/output-integration/hadoop-2.log].
 (note that this means we didn't run on Hadoop 3)




[jira] [Commented] (HBASE-21065) Try ROW_INDEX_V1 encoding on meta table (fix bloomfilters on meta while we are at it)

2022-03-24 Thread Hudson (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-21065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17512072#comment-17512072
 ] 

Hudson commented on HBASE-21065:


Results for branch branch-2.5
[build #74 on 
builds.a.o|https://ci-hbase.apache.org/job/HBase%20Nightly/job/branch-2.5/74/]: 
(x) *{color:red}-1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general 
report|https://ci-hbase.apache.org/job/HBase%20Nightly/job/branch-2.5/74/General_20Nightly_20Build_20Report/]




(/) {color:green}+1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://ci-hbase.apache.org/job/HBase%20Nightly/job/branch-2.5/74/JDK8_20Nightly_20Build_20Report_20_28Hadoop2_29/]


(/) {color:green}+1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) 
report|https://ci-hbase.apache.org/job/HBase%20Nightly/job/branch-2.5/74/JDK8_20Nightly_20Build_20Report_20_28Hadoop3_29/]


(/) {color:green}+1 jdk11 hadoop3 checks{color}
-- For more information [see jdk11 
report|https://ci-hbase.apache.org/job/HBase%20Nightly/job/branch-2.5/74/JDK11_20Nightly_20Build_20Report_20_28Hadoop3_29/]


(/) {color:green}+1 source release artifact{color}
-- See build output for details.


(x) {color:red}-1 client integration test{color}
-- Failed when running client tests on top of Hadoop 2. [see log for 
details|https://ci-hbase.apache.org/job/HBase%20Nightly/job/branch-2.5/74//artifact/output-integration/hadoop-2.log].
 (note that this means we didn't run on Hadoop 3)




[jira] [Commented] (HBASE-21065) Try ROW_INDEX_V1 encoding on meta table (fix bloomfilters on meta while we are at it)

2022-03-23 Thread Viraj Jasani (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-21065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17511412#comment-17511412
 ] 

Viraj Jasani commented on HBASE-21065:
--

Thank you [~bbeaudreault]! Changes look good, let's wait for full build QA 
results.



[jira] [Commented] (HBASE-21065) Try ROW_INDEX_V1 encoding on meta table (fix bloomfilters on meta while we are at it)

2022-03-23 Thread Bryan Beaudreault (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-21065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17511342#comment-17511342
 ] 

Bryan Beaudreault commented on HBASE-21065:
---

[~apurtell] I backported this to my company's branch-2-based fork and it 
required some minor conflict resolution. In order to save you the same work I 
also pushed it as a PR for the branch-2 backport: 
[https://github.com/apache/hbase/pull/4268]. The commit there should apply 
cleanly to branch-2.5 as well.



[jira] [Commented] (HBASE-21065) Try ROW_INDEX_V1 encoding on meta table (fix bloomfilters on meta while we are at it)

2022-03-22 Thread Andrew Kyle Purtell (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-21065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17510866#comment-17510866
 ] 

Andrew Kyle Purtell commented on HBASE-21065:
-

That is my fault. I shouldn’t be on JIRA on mobile. I put the 3.0.0 fix version 
back. We can resolve this once the changes have been committed to branch-2 and 
branch-2.5 and all will be well. 

> Try ROW_INDEX_V1 encoding on meta table (fix bloomfilters on meta while we 
> are at it)
> -
>
> Key: HBASE-21065
> URL: https://issues.apache.org/jira/browse/HBASE-21065
> Project: HBase
> Issue Type: Improvement
> Components: meta, Performance
> Reporter: Michael Stack
> Assignee: Andrew Kyle Purtell
> Priority: Major
> Fix For: 2.5.0, 2.6.0, 3.0.0-alpha-3


[jira] [Commented] (HBASE-21065) Try ROW_INDEX_V1 encoding on meta table (fix bloomfilters on meta while we are at it)

2022-03-22 Thread Bryan Beaudreault (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-21065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17510842#comment-17510842
 ] 

Bryan Beaudreault commented on HBASE-21065:
---

Hey [~apurtell] just to clarify since you're traveling. Sorry if I did a faux 
pas by commenting on this old, resolved issue, but here's what happened:
 * This issue was originally resolved with a fix version of 3.0.0-alpha-1, 
because the attached GH PR was merged into master only
 * I stumbled across this and was wondering why we couldn't do this for 
branch-2, since it seems valuable. So I commented here to ask the question
 * Now you've removed the 3.0.0-alpha fix version, which I don't think is 
correct since it is resolved there, as Duo mentions

I think probably what I should have done was to create a new Jira for "Backport 
HBASE-21065 to branch-2". Since we've already re-opened this issue, I think we 
have 2 options:
 * Return fix versions to 3.0.0-alpha-1 and resolve as fixed, then create the 
above Jira for backport
 * Set fix versions to 3.0.0-alpha-1, 2.5.0, 2.6.0 and resolve once the PR has 
been applied to branch-2.5 and branch-2

Sorry for the confusion here. 

> Try ROW_INDEX_V1 encoding on meta table (fix bloomfilters on meta while we 
> are at it)
> -
>
> Key: HBASE-21065
> URL: https://issues.apache.org/jira/browse/HBASE-21065
> Project: HBase
> Issue Type: Improvement
> Components: meta, Performance
> Reporter: Michael Stack
> Assignee: Andrew Kyle Purtell
> Priority: Major
> Fix For: 2.5.0, 2.6.0


[jira] [Commented] (HBASE-21065) Try ROW_INDEX_V1 encoding on meta table (fix bloomfilters on meta while we are at it)

2022-03-22 Thread Andrew Kyle Purtell (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-21065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17510800#comment-17510800
 ] 

Andrew Kyle Purtell commented on HBASE-21065:
-

Great. I am traveling so did not check the code before posting. No concerns 
obviously if already done. Let me update fix versions here just for 2.x. 

> Try ROW_INDEX_V1 encoding on meta table (fix bloomfilters on meta while we 
> are at it)
> -
>
> Key: HBASE-21065
> URL: https://issues.apache.org/jira/browse/HBASE-21065
> Project: HBase
> Issue Type: Improvement
> Components: meta, Performance
> Reporter: Michael Stack
> Assignee: Andrew Kyle Purtell
> Priority: Major
> Fix For: 2.5.0, 2.6.0, 3.0.0-alpha-3


[jira] [Commented] (HBASE-21065) Try ROW_INDEX_V1 encoding on meta table (fix bloomfilters on meta while we are at it)

2022-03-22 Thread Duo Zhang (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-21065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17510687#comment-17510687
 ] 

Duo Zhang commented on HBASE-21065:
---

I assume you mean making this change in 2.x? We already have this landed on 
master, i.e., 3.0.0-alpha.

I have no big concerns about landing it on 2.x; it should be an improvement.

> Try ROW_INDEX_V1 encoding on meta table (fix bloomfilters on meta while we 
> are at it)
> -
>
> Key: HBASE-21065
> URL: https://issues.apache.org/jira/browse/HBASE-21065
> Project: HBase
>  Issue Type: Improvement
>  Components: meta, Performance
>Reporter: Michael Stack
>Assignee: Andrew Kyle Purtell
>Priority: Major
> Fix For: 2.5.0, 2.6.0, 3.0.0-alpha-3
>
>


[jira] [Commented] (HBASE-21065) Try ROW_INDEX_V1 encoding on meta table (fix bloomfilters on meta while we are at it)

2022-03-22 Thread Andrew Kyle Purtell (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-21065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17510558#comment-17510558
 ] 

Andrew Kyle Purtell commented on HBASE-21065:
-

(y)

> Try ROW_INDEX_V1 encoding on meta table (fix bloomfilters on meta while we 
> are at it)
> -
>
> Key: HBASE-21065
> URL: https://issues.apache.org/jira/browse/HBASE-21065
> Project: HBase
>  Issue Type: Improvement
>  Components: meta, Performance
>Reporter: Michael Stack
>Assignee: Michael Stack
>Priority: Major
> Fix For: 3.0.0-alpha-1
>
>


[jira] [Commented] (HBASE-21065) Try ROW_INDEX_V1 encoding on meta table (fix bloomfilters on meta while we are at it)

2022-03-22 Thread Bryan Beaudreault (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-21065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17510439#comment-17510439
 ] 

Bryan Beaudreault commented on HBASE-21065:
---

What you summarized is what I was thinking.



[jira] [Commented] (HBASE-21065) Try ROW_INDEX_V1 encoding on meta table (fix bloomfilters on meta while we are at it)

2022-03-21 Thread Andrew Kyle Purtell (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-21065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17509947#comment-17509947
 ] 

Andrew Kyle Purtell commented on HBASE-21065:
-

bq. I agree it might be nice to do this by default for 2.5.0 if we already have 
it enabled in some production.

It is unlikely this will receive testing beyond what we can muster on a 
voluntary basis for the release, and enabling it by default would be an opt-out 
change that enables wide-scale testing. I am confident the mechanics of the 
encoding and bloomfilters themselves are not problematic, but a change like this 
could introduce interesting multi-factor consequences for some environments. 
This would be documented as such in a release note, including instructions on 
how to opt out. I think that can address generic concerns, but I would be 
interested if you have a specific test scenario in mind.



[jira] [Commented] (HBASE-21065) Try ROW_INDEX_V1 encoding on meta table (fix bloomfilters on meta while we are at it)

2022-03-19 Thread Bryan Beaudreault (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-21065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17509339#comment-17509339
 ] 

Bryan Beaudreault commented on HBASE-21065:
---

Thanks for the details both. Took a closer look at the patch. Glad I can just 
enable for my case.

I agree it might be nice to do this by default for 2.5.0 if we already have it 
enabled in some production. The biggest issue with ROW_INDEX_V1 seems to be 
size, and meta should not be large. I find this encoding is great for random 
read, which I agree is often the case with meta.
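
To make the random-read point concrete, here is a minimal sketch of why a per-block row index cuts down on comparisons. It is a simplified model only, not HBase's actual ROW_INDEX_V1 code: the class RowIndexSketch, its methods, and the sample row keys are all hypothetical, and a block is modelled as a plain sorted list of row keys. The assumption is that the index lets a seek binary-search the rows in a block instead of walking Cells in order, at the cost of some extra bytes of index per row (which is the size concern mentioned above).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical illustration; none of these names exist in HBase.
final class RowIndexSketch {

    /** Position of the matching row (or -1) plus the number of key comparisons made. */
    static final class SeekResult {
        final int position;
        final int comparisons;

        SeekResult(int position, int comparisons) {
            this.position = position;
            this.comparisons = comparisons;
        }
    }

    /** No index: compare each row key in turn until the target is found or passed. */
    static SeekResult linearSeek(List<String> sortedRows, String target) {
        int comparisons = 0;
        for (int i = 0; i < sortedRows.size(); i++) {
            comparisons++;
            int c = sortedRows.get(i).compareTo(target);
            if (c == 0) {
                return new SeekResult(i, comparisons);
            }
            if (c > 0) {
                break; // rows are sorted, so the target cannot appear later
            }
        }
        return new SeekResult(-1, comparisons);
    }

    /** With a row index: binary search over the row positions in the block. */
    static SeekResult indexedSeek(List<String> sortedRows, String target) {
        int lo = 0;
        int hi = sortedRows.size() - 1;
        int comparisons = 0;
        while (lo <= hi) {
            int mid = (lo + hi) >>> 1;
            comparisons++;
            int c = sortedRows.get(mid).compareTo(target);
            if (c == 0) {
                return new SeekResult(mid, comparisons);
            }
            if (c < 0) {
                lo = mid + 1;
            } else {
                hi = mid - 1;
            }
        }
        return new SeekResult(-1, comparisons);
    }

    /** Builds a sorted block of n zero-padded, made-up row keys. */
    static List<String> sampleBlock(int n) {
        List<String> rows = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            rows.add(String.format(Locale.ROOT, "row-%05d", i));
        }
        return rows;
    }

    public static void main(String[] args) {
        List<String> block = sampleBlock(1024);
        String target = "row-01000";
        SeekResult linear = linearSeek(block, target);
        SeekResult indexed = indexedSeek(block, target);
        System.out.println("linear comparisons:  " + linear.comparisons);
        System.out.println("indexed comparisons: " + indexed.comparisons);
    }
}
```

In this toy model the linear seek cost grows with the row's position in the block, while the indexed seek stays logarithmic in the number of rows. Whether that translates into a measurable win on a real hbase:meta depends on block size and workload, so it is worth measuring rather than assuming.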



[jira] [Commented] (HBASE-21065) Try ROW_INDEX_V1 encoding on meta table (fix bloomfilters on meta while we are at it)

2022-03-19 Thread Andrew Kyle Purtell (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-21065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17509331#comment-17509331
 ] 

Andrew Kyle Purtell commented on HBASE-21065:
-

bq. If you are just interested in addressing meta hotspot issues, edit your 
hbase:meta and enable ROW_INDEX_V1 and BLOOMFILTER... You can since 2.3.0. I 
checked that at least one cluster where I work has this in place – 2.4.x, 
BLOOMFILTER => 'ROW', DATA_BLOCK_ENCODING => 'ROW_INDEX_V1'  on all hbase:meta 
columnfamilies. Enabling meta replicas also helped.

With the exception of enabling replicas, is this reasonable to try as the 
default in the upcoming 2.5.0? It is coming in for a landing.



[jira] [Commented] (HBASE-21065) Try ROW_INDEX_V1 encoding on meta table (fix bloomfilters on meta while we are at it)

2022-03-19 Thread Michael Stack (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-21065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17509330#comment-17509330
 ] 

Michael Stack commented on HBASE-21065:
---

[~bbeaudreault] making the default hbase:meta schema enable ROW_INDEX_V1? 
Probably not a technical issue. IIRC, we probably thought changing the default 
hbase:meta schema should wait on a major version release (I think the patch and 
the subject on this Jira are out of alignment, so there might be some confusion 
here as to what this Jira did).

If you are just interested in addressing meta hotspot issues, edit your 
hbase:meta and enable ROW_INDEX_V1 and BLOOMFILTER... You can since 2.3.0. I 
checked that at least one cluster where I work has this in place – 2.4.x, 
BLOOMFILTER => 'ROW', DATA_BLOCK_ENCODING => 'ROW_INDEX_V1'  on all hbase:meta 
columnfamilies. Enabling meta replicas also helped.

 



[jira] [Commented] (HBASE-21065) Try ROW_INDEX_V1 encoding on meta table (fix bloomfilters on meta while we are at it)

2022-03-18 Thread Bryan Beaudreault (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-21065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17509118#comment-17509118
 ] 

Bryan Beaudreault commented on HBASE-21065:
---

Looks like a handful of conflicts, so maybe complicated. But curious if there 
was a technical reason beyond that.



[jira] [Commented] (HBASE-21065) Try ROW_INDEX_V1 encoding on meta table (fix bloomfilters on meta while we are at it)

2022-03-18 Thread Bryan Beaudreault (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-21065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17509117#comment-17509117
 ] 

Bryan Beaudreault commented on HBASE-21065:
---

[~stack] sorry to ping on an old issue, but I'm curious – is there a reason 
this can't land in branch-2? We have meta hotspot issues often, so any 
improvement there will help. I see HBASE-23705 landed in branch-2. I can take a 
look at a backport if there were no real reasons.

> Try ROW_INDEX_V1 encoding on meta table (fix bloomfilters on meta while we 
> are at it)
> -
>
> Key: HBASE-21065
> URL: https://issues.apache.org/jira/browse/HBASE-21065
> Project: HBase
>  Issue Type: Improvement
>  Components: meta, Performance
>Reporter: Michael Stack
>Assignee: Michael Stack
>Priority: Major
> Fix For: 3.0.0-alpha-1
>
>
> Some users end up hitting meta hard. Bulk is probably because our client goes 
> to meta too often, and the real 'fix' for a saturated meta is splitting it, 
> but the encoding that came in with HBASE-16213, ROW_INDEX_V1, could help in 
> the near term. It adds an index on hfile blocks and helped improve random 
> reads against user-space tables (less compares as we used index to go direct 
> to requested Cells rather than look at each Cell in turn until we found what 
> we wanted -- see RN on HBASE-16213 for citation).
> I also noticed code-reading that we don't enable blooms on hbase:meta tables; 
> that could save some CPU and speed things up a bit too:
> {code}
> // Disable blooms for meta.  Needs work.  Seems to mess w/ 
> getClosestOrBefore.
> .setBloomFilterType(BloomType.NONE)
> {code}
> This issue is about doing a bit of perf compare of encoding *on* vs current 
> default (and will check diff in size of indexed blocks).
> Meta access is mostly random-read, I believe (a review of one user's access 
> showed this, at least for their workload). The nice addition, HBASE-19722 
> Meta query statistics metrics source, would help verify this if it saw some usage 
> on a prod cluster.
> If all is good, I'd like to make a small patch, one that could be easily 
> backported, with minimal changes in it.
> As is, it's all a little awkward as the meta table schema is hard-coded and 
> meta is immutable -- stuff we'll have to fix if we want to split meta -- so 
> in the meantime it requires a code change to enable (and a backport of 
> HBASE-16213 -- this patch is in 1.4.0 only currently, perhaps that is 
> enough). Code change to enable is small:
> {code}
> diff --git a/hbase-server/src/main/java/org/apache/hadoop/hbase/util/FSTableDescriptors.java b/hbase-server/src/main/java/org/apache/hadoop/hbase/util/FSTableDescriptors.java
> index 28c7ec3c2f..8f08f94dc1 100644
> --- a/hbase-server/src/main/java/org/apache/hadoop/hbase/util/FSTableDescriptors.java
> +++ b/hbase-server/src/main/java/org/apache/hadoop/hbase/util/FSTableDescriptors.java
> @@ -160,6 +160,7 @@ public class FSTableDescriptors implements TableDescriptors {
>  .setScope(HConstants.REPLICATION_SCOPE_LOCAL)
>  // Disable blooms for meta.  Needs work.  Seems to mess w/ getClosestOrBefore.
>  .setBloomFilterType(BloomType.NONE)
> + .setDataBlockEncoding(org.apache.hadoop.hbase.io.encoding.DataBlockEncoding.ROW_INDEX_V1)
>  .build())
>  .setColumnFamily(ColumnFamilyDescriptorBuilder.newBuilder(HConstants.TABLE_FAMILY)
>  .setMaxVersions(conf.getInt(HConstants.HBASE_META_VERSIONS,
> {code}
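
For anyone wanting intuition on the "fewer compares" point in the description above, here is a rough, self-contained sketch. It is not HBase code: the class `RowIndexSeekSketch`, its methods, and the `row%04d` keys are all made up for illustration, and a plain sorted list stands in for the rows of one block. It only counts how many row compares a sequential scan makes versus a binary search over an index of row positions.

```java
import java.util.ArrayList;
import java.util.List;

// Illustration only: all names here are hypothetical, nothing is taken from HBase.
public class RowIndexSeekSketch {

  // Builds n sorted, zero-padded row keys to stand in for the rows of one block.
  static List<String> sampleRows(int n) {
    List<String> rows = new ArrayList<>();
    for (int i = 0; i < n; i++) {
      rows.add(String.format("row%04d", i));
    }
    return rows;
  }

  // Sequential scan: compare each row in turn until we reach the target.
  static int linearCompares(List<String> rows, String target) {
    int compares = 0;
    for (String row : rows) {
      compares++;
      if (row.compareTo(target) >= 0) {
        break;
      }
    }
    return compares;
  }

  // Indexed seek: binary search over row positions, counting compares.
  static int indexedCompares(List<String> rows, String target) {
    int compares = 0;
    int lo = 0;
    int hi = rows.size() - 1;
    while (lo <= hi) {
      int mid = (lo + hi) >>> 1;
      compares++;
      int c = rows.get(mid).compareTo(target);
      if (c == 0) {
        break;
      }
      if (c < 0) {
        lo = mid + 1;
      } else {
        hi = mid - 1;
      }
    }
    return compares;
  }

  public static void main(String[] args) {
    List<String> rows = sampleRows(1024);
    String target = "row1000";
    System.out.println("sequential compares: " + linearCompares(rows, target));
    System.out.println("indexed compares:    " + indexedCompares(rows, target));
  }
}
```

The sequential count grows with the position of the row in the block, while the indexed count is bounded by roughly log2 of the number of rows. Real block sizes and cell layouts will differ, so treat this as intuition rather than a benchmark.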



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Commented] (HBASE-21065) Try ROW_INDEX_V1 encoding on meta table (fix bloomfilters on meta while we are at it)

2020-01-24 Thread Hudson (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-21065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17022976#comment-17022976
 ] 

Hudson commented on HBASE-21065:


Results for branch master
[build #1607 on 
builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/master/1607/]: (x) 
*{color:red}-1 overall{color}*

details (if available):

(x) {color:red}-1 general checks{color}
-- For more information [see general 
report|https://builds.apache.org/job/HBase%20Nightly/job/master/1607//General_Nightly_Build_Report/]




(x) {color:red}-1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://builds.apache.org/job/HBase%20Nightly/job/master/1607//JDK8_Nightly_Build_Report_(Hadoop2)/]


(x) {color:red}-1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) 
report|https://builds.apache.org/job/HBase%20Nightly/job/master/1607//JDK8_Nightly_Build_Report_(Hadoop3)/]


(/) {color:green}+1 source release artifact{color}
-- See build output for details.


(/) {color:green}+1 client integration test{color}


> Try ROW_INDEX_V1 encoding on meta table (fix bloomfilters on meta while we 
> are at it)
> -
>
> Key: HBASE-21065
> URL: https://issues.apache.org/jira/browse/HBASE-21065
> Project: HBase
>  Issue Type: Improvement
>  Components: meta, Performance
>Reporter: Michael Stack
>Assignee: Michael Stack
>Priority: Major
> Fix For: 3.0.0
>
>



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HBASE-21065) Try ROW_INDEX_V1 encoding on meta table (fix bloomfilters on meta while we are at it)

2020-01-20 Thread Michael Stack (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-21065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17019677#comment-17019677
 ] 

Michael Stack commented on HBASE-21065:
---

HBASE-23705 is about fixing our cellcomparator handling. It needs to land 
before this can go in.

> Try ROW_INDEX_V1 encoding on meta table (fix bloomfilters on meta while we 
> are at it)
> -
>
> Key: HBASE-21065
> URL: https://issues.apache.org/jira/browse/HBASE-21065
> Project: HBase
>  Issue Type: Improvement
>  Components: meta, Performance
>Reporter: Michael Stack
>Assignee: Michael Stack
>Priority: Major
>





[jira] [Commented] (HBASE-21065) Try ROW_INDEX_V1 encoding on meta table (fix bloomfilters on meta while we are at it)

2020-01-17 Thread Michael Stack (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-21065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17018270#comment-17018270
 ] 

Michael Stack commented on HBASE-21065:
---

Changing the encoding on meta exposes the fact that the ROW_INDEX_V1 encoder 
does not work on the hbase:meta table; it has hard-coded the user-space 
CellComparator. Reviewing how CellComparators are instantiated around the 
codebase, we are inconsistent, and the encoding context does not know which 
CellComparator is appropriate. Let me fix this first. Will fix the UT failures 
we're seeing in the PR here.
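
As a rough illustration of why a hard-coded comparator matters for an index-based seek, here is a small self-contained sketch. It is an assumption-laden toy, not the HBase comparators: `META_LIKE` is a hypothetical, simplified ordering that splits a `table,startKey,regionId` row at the first and last comma and compares part by part, and plain `String` natural order stands in for a bytewise user-space comparison. Rows sorted one way and binary-searched the other way can be missed.

```java
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

// Illustration only: a toy ordering, not HBase's actual meta or user-space comparators.
public class ComparatorMismatchSketch {

  // Splits "table,startKey,regionId" at the first and last comma.
  static String[] split(String row) {
    int first = row.indexOf(',');
    int last = row.lastIndexOf(',');
    return new String[] {
      row.substring(0, first), row.substring(first + 1, last), row.substring(last + 1)
    };
  }

  // Hypothetical meta-style ordering: compare table, then start key, then region id.
  static final Comparator<String> META_LIKE = (a, b) -> {
    String[] x = split(a);
    String[] y = split(b);
    for (int i = 0; i < 3; i++) {
      int c = x[i].compareTo(y[i]);
      if (c != 0) {
        return c;
      }
    }
    return 0;
  };

  // Index-style seek: binary search using whichever comparator the caller supplies.
  static int seek(List<String> sortedRows, String key, Comparator<String> cmp) {
    return Collections.binarySearch(sortedRows, key, cmp);
  }

  public static void main(String[] args) {
    // Sorted by META_LIKE: start key "a" sorts before "a!", which sorts before "b".
    List<String> rows = List.of("t,a,9", "t,a!,1", "t,b,1");
    // Matching comparator: the row is found at index 0.
    System.out.println(seek(rows, "t,a,9", META_LIKE));
    // Mismatched comparator: '!' sorts below ',' in plain order, so the search
    // goes the wrong way and returns a negative (not found) result.
    System.out.println(seek(rows, "t,a,9", Comparator.naturalOrder()));
  }
}
```

The row is present in both cases; only the comparator handed to the search differs. That is the general shape of the problem when the comparator used for seeking does not match the order in which the data was written.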

> Try ROW_INDEX_V1 encoding on meta table (fix bloomfilters on meta while we 
> are at it)
> -
>
> Key: HBASE-21065
> URL: https://issues.apache.org/jira/browse/HBASE-21065
> Project: HBase
>  Issue Type: Improvement
>  Components: meta, Performance
>Reporter: Michael Stack
>Assignee: Michael Stack
>Priority: Major
>





[jira] [Commented] (HBASE-21065) Try ROW_INDEX_V1 encoding on meta table (fix bloomfilters on meta while we are at it)

2020-01-09 Thread Michael Stack (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-21065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17012252#comment-17012252
 ] 

Michael Stack commented on HBASE-21065:
---

Put up a patch. Let's see how it does.

> Try ROW_INDEX_V1 encoding on meta table (fix bloomfilters on meta while we 
> are at it)
> -
>
> Key: HBASE-21065
> URL: https://issues.apache.org/jira/browse/HBASE-21065
> Project: HBase
>  Issue Type: Improvement
>  Components: meta, Performance
>Reporter: Michael Stack
>Assignee: Michael Stack
>Priority: Major
>


