[jira] [Commented] (HBASE-15249) Provide lower bound on number of regions in region normalizer for pre-split tables
[ https://issues.apache.org/jira/browse/HBASE-15249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15180623#comment-15180623 ]

Aravindan Vijayan commented on HBASE-15249:
---

On further investigation, we find that the normalizer is not causing an issue for the AMS use case. We ran the normalizer in the following scenarios and verified that it respects the initial splits recommended by AMS.

1. Small cluster (~10 Nodes)
{code}
2016-02-29 23:22:29,372 DEBUG [avijayan-hbase-3.novalocal,56004,1456787532496_ChoreService_1] normalizer.SimpleRegionNormalizer: Computing normalization plan for table: METRIC_RECORD, number of regions: 10
2016-02-29 23:22:29,372 DEBUG [avijayan-hbase-3.novalocal,56004,1456787532496_ChoreService_1] normalizer.SimpleRegionNormalizer: Table METRIC_RECORD, total aggregated regions size: 0
2016-02-29 23:22:29,372 DEBUG [avijayan-hbase-3.novalocal,56004,1456787532496_ChoreService_1] normalizer.SimpleRegionNormalizer: Table METRIC_RECORD, average region size: 0.0
2016-02-29 23:22:29,372 DEBUG [avijayan-hbase-3.novalocal,56004,1456787532496_ChoreService_1] normalizer.SimpleRegionNormalizer: No normalization needed, regions look good for table: METRIC_RECORD
{code}

2. Large cluster (~900 Nodes) with hbase.normalizer.period = 10 minutes
{code}
2016-03-04 20:55:14,168 DEBUG [perf-a-2.c.pramod-thangali.internal,61300,1457124301027_ChoreService_1] normalizer.SimpleRegionNormalizer: Computing normalization plan for table: METRIC_RECORD, number of regions: 10
2016-03-04 20:55:14,168 DEBUG [perf-a-2.c.pramod-thangali.internal,61300,1457124301027_ChoreService_1] normalizer.SimpleRegionNormalizer: Table METRIC_RECORD, total aggregated regions size: 157
2016-03-04 20:55:14,168 DEBUG [perf-a-2.c.pramod-thangali.internal,61300,1457124301027_ChoreService_1] normalizer.SimpleRegionNormalizer: Table METRIC_RECORD, average region size: 15.7
2016-03-04 20:55:14,169 DEBUG [perf-a-2.c.pramod-thangali.internal,61300,1457124301027_ChoreService_1] normalizer.SimpleRegionNormalizer: Table METRIC_RECORD, largest region [B@6901ddd has size 37, more than 2 times than avg size, splitting
2016-03-04 20:55:14,170 INFO [perf-a-2.c.pramod-thangali.internal,61300,1457124301027_ChoreService_1] normalizer.SplitNormalizationPlan: Executing splitting normalization plan: SplitNormalizationPlan{regionInfo={ENCODED => 318e17da72832e72005e202f09a7ee55, NAME => 'METRIC_RECORD,regionserver.Server.Increment_num_ops\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00,1457124324424.318e17da72832e72005e202f09a7ee55.', STARTKEY => 'regionserver.Server.Increment_num_ops\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00', ENDKEY => 'rpc.rpc.CallQueueLength\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'}, splitPoint=null}
{code}

3. Large cluster with hbase.normalizer.period = 1 minute, to test normalizer behavior for close to empty regions
{code}
2016-03-04 21:50:50,465 DEBUG [perf-a-2.c.pramod-thangali.internal,61300,1457127998192_ChoreService_1] normalizer.SimpleRegionNormalizer: Computing normalization plan for table: METRIC_RECORD, number of regions: 10
2016-03-04 21:50:50,467 DEBUG [perf-a-2.c.pramod-thangali.internal,61300,1457127998192_ChoreService_1] normalizer.SimpleRegionNormalizer: Table METRIC_RECORD, total aggregated regions size: 0
2016-03-04 21:50:50,467 DEBUG [perf-a-2.c.pramod-thangali.internal,61300,1457127998192_ChoreService_1] normalizer.SimpleRegionNormalizer: Table METRIC_RECORD, average region size: 0.0
2016-03-04 21:50:50,467 DEBUG [perf-a-2.c.pramod-thangali.internal,61300,1457127998192_ChoreService_1] normalizer.SimpleRegionNormalizer: No normalization needed, regions look good for table: METRIC_RECORD
{code}

> Provide lower bound on number of regions in region normalizer for pre-split tables
> --
>
> Key: HBASE-15249
> URL: https://issues.apache.org/jira/browse/HBASE-15249
> Project: HBase
> Issue Type: Bug
> Reporter: Ted Yu
> Assignee: Ted Yu
> Labels: normalization
> Attachments: HBASE-15249.v1.txt, HBASE-15249.v2.txt
>
> AMS (Ambari Metrics System) developer found the following scenario:
> Metrics table was pre-split with many regions on large cluster (1600 nodes).
> After some time, AMS stopped working because region normalizer merged the regions into few big regions which were not able to serve high read / write load.
> This is a big problem since the write requests flood the regions faster than the splits can happen resulting in poor performance.
> We should consider setting reasonable lower bound on region count.
> If the table is pre-split, we can use initial region count as the lower bound.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
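The log from the second scenario shows the arithmetic behind the split decision: 10 regions, a total size of 157, an average of 15.7, and a largest region of size 37, which is more than twice the average (31.4). The Python sketch below reproduces only that check. The function name and the individual region sizes are hypothetical (the log reports just the total and the largest region), and this is an illustration of the logged rule, not the SimpleRegionNormalizer source.

```python
def should_split_largest(region_sizes, factor=2.0):
    """Return (average, largest, split?) for a list of region sizes.

    Follows the rule visible in the log: the largest region is split when
    its size is more than `factor` times the table's average region size.
    """
    if not region_sizes:
        return 0.0, 0, False
    average = sum(region_sizes) / len(region_sizes)
    largest = max(region_sizes)
    return average, largest, largest > factor * average


# Hypothetical sizes chosen to match the logged figures:
# 10 regions, total 157, largest 37.
sizes = [37, 20, 15, 15, 15, 15, 10, 10, 10, 10]
average, largest, split = should_split_largest(sizes)
```

With ten empty regions, as in scenarios 1 and 3, the average is 0.0 and the comparison is false, which is consistent with the "No normalization needed" lines.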
[jira] [Commented] (HBASE-15249) Provide lower bound on number of regions in region normalizer for pre-split tables
[ https://issues.apache.org/jira/browse/HBASE-15249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15172206#comment-15172206 ]

Elliott Clark commented on HBASE-15249:
---

This whole normalizer seems to be running off the rails. We can't add a new config every time there's a new use case where the normalizer doesn't behave the ideal way. That leads to a feature that is so complex that everyone gets it wrong.

It seems like the normalizer is currently using incorrect logic and incorrect signals. Are we sure this is a feature that will ever be complete?
[jira] [Commented] (HBASE-15249) Provide lower bound on number of regions in region normalizer for pre-split tables
[ https://issues.apache.org/jira/browse/HBASE-15249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15150949#comment-15150949 ]

Siddharth Wagle commented on HBASE-15249:
-

{quote}
What does the math look like for region splits
{quote}
Ref: AMBARI-13039. We use _memstore.lowerLimit_ and _memstore.flush.size_ to calculate the memory available to the memstore and the maximum number of regions. Then we calculate lexically equidistant split points for the large tables, based on the services deployed by Ambari (from a static list of metrics that we mined from a deployed cluster).

{quote}You need to run normalizer?{quote}
In a stable state, the normalizer seems to work well for us in managing the region boundaries. We do give the user the option to disable it with a configuration setting in AMS (a precautionary tactic on our end). All in all, we can definitely live without the normalizer: it was not available to us until very recently, and the pre-splitting pre-dates the normalizer setting in AMS.

The best use case for the normalizer for us is this: an Ambari user can, let's say, add a service (for example, KAFKA) that starts writing a ton of metrics and introduces a skew where the previous splits become irrelevant.

[~stack] / [~anoop.hbase] Thanks for the feedback.
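The pre-split calculation described in this comment has two steps: derive a region count from the memstore settings, then choose evenly spaced split points from a sorted list of metric names. The Python sketch below is one simple reading of that description. Both function names, the formula, and the sample values are assumptions made for illustration; the actual AMS logic is tracked in AMBARI-13039 and is not reproduced here.

```python
def max_regions(heap_bytes, memstore_lower_limit, flush_size_bytes):
    """Assumed formula: memory available to the memstore divided by the
    flush size gives how many actively written regions fit at once."""
    memstore_bytes = heap_bytes * memstore_lower_limit
    return max(1, int(memstore_bytes // flush_size_bytes))


def equidistant_split_points(metric_names, region_count):
    """Pick region_count - 1 split keys, evenly spaced by position in the
    lexically sorted list of metric names (hypothetical interpretation)."""
    names = sorted(set(metric_names))
    if region_count <= 1 or len(names) < 2:
        return []
    # N regions need N - 1 split points; never ask for more than we have names.
    wanted = min(region_count, len(names)) - 1
    step = len(names) / (wanted + 1)
    return [names[int(i * step)] for i in range(1, wanted + 1)]


# Hypothetical inputs: 1 GiB heap, lower limit 0.5, 128 MiB flush size.
regions = max_regions(1024 ** 3, 0.5, 128 * 1024 ** 2)
splits = equidistant_split_points(["m%02d" % i for i in range(10)], 5)
```

Spacing by list position rather than by byte distance keeps the sketch short; a table whose metric names cluster under a few prefixes might need a different spacing rule.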
[jira] [Commented] (HBASE-15249) Provide lower bound on number of regions in region normalizer for pre-split tables
[ https://issues.apache.org/jira/browse/HBASE-15249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15149801#comment-15149801 ]

Anoop Sam John commented on HBASE-15249:

Thanks for the explanation of your usage, [~swagle]. Yes, what I was saying is that even if the write requests to a region are very few in number, it still may not be correct to merge it with another, as there is a clear indication that this region is growing. Maybe after some time it will get much more write load. When 2 regions are done with all their writes and the data will be used only for reads, they may get merged. The challenge is how we know whether a region is done with its writes :-)
[jira] [Commented] (HBASE-15249) Provide lower bound on number of regions in region normalizer for pre-split tables
[ https://issues.apache.org/jira/browse/HBASE-15249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15149636#comment-15149636 ]

stack commented on HBASE-15249:
---

bq. AMS (Ambari Metrics system), creates tables with pre-splits based on the knowledge of how many daemons will be writing metrics to HBase and the memory available to RS.

What does the math look like [~swagle]?

bq. and this count dropped shortly after the system came online.

You need to run normalizer? It's a new feature that is not yet in any shipping version of hbase and it is off by default. An important service like AMS might try and do without it, at least at first? Could you settle for a less aggressive set of initial splits that is somewhat a factor of how many servers there are involved? e.g. cluster node count / 10? As is, the default is to split aggressively at first so regions fan out over the cluster. That was not working for you?
[jira] [Commented] (HBASE-15249) Provide lower bound on number of regions in region normalizer for pre-split tables
[ https://issues.apache.org/jira/browse/HBASE-15249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15149247#comment-15149247 ]

Siddharth Wagle commented on HBASE-15249:
-

[~anoop.hbase] AMS (Ambari Metrics System) creates tables with pre-splits based on knowledge of how many daemons will be writing metrics to HBase and the memory available to the RS. What we saw is that on a large cluster we defined close to 10 initial pre-split Regions for the table that gets heavy Read/Write load, and this count dropped shortly after the system came online. This was during Ambari performance test runs.

*Note*: The metrics system is a constant write load system; however, there is a bootstrap lag as the cluster comes online. This is the grey area where, although Regions are not getting as many writes, merging them would be a bad idea. We enable the normalizer specifically because the initial splits might not be optimal; however, the Region count is certainly critical for us to support the volume of writes, which would eventually settle down to a consistent number.

We will try to get the numbers for you. Since the normalizer logs at DEBUG level, we could not capture this on our initial run.
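The issue description proposes using the initial region count of a pre-split table as a lower bound, which would address the drop in region count reported in this comment. A minimal Python sketch of such a guard follows. The function and parameter names are hypothetical, and this is not the logic of the attached HBASE-15249 patches, which are not shown in this thread.

```python
def allow_merge(current_region_count, min_region_count):
    """A merge turns two regions into one, so it removes exactly one region.

    Allow it only if the table would still have at least
    `min_region_count` regions afterwards.
    """
    return current_region_count - 1 >= min_region_count


# Hypothetical table pre-split into 10 regions, used as the lower bound.
initial_regions = 10
at_floor = allow_merge(10, initial_regions)      # merge would drop below the bound
above_floor = allow_merge(12, initial_regions)   # merge keeps 11 regions
```

A guard of this kind only blocks merges; it leaves splits untouched, so the normalizer could still correct a skewed initial split by dividing oversized regions.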
[jira] [Commented] (HBASE-15249) Provide lower bound on number of regions in region normalizer for pre-split tables
[ https://issues.apache.org/jira/browse/HBASE-15249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15142821#comment-15142821 ]

Ted Yu commented on HBASE-15249:

If some regions are empty, they wouldn't get write requests. But if regions receive even access, it is likely that regions would get write requests.

How about giving regions whose write requests are in the bottom 10% (can make this configurable) ?
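One possible reading of this suggestion is that only regions whose write-request counts fall in the bottom 10% of the table would be eligible for merging. The Python sketch below implements that reading. The function name, the dictionary input, and the sample counts are all hypothetical, and the suggestion itself was only a proposal in this thread.

```python
def merge_candidates_by_write_load(write_requests, bottom_fraction=0.10):
    """Return the regions whose write-request count ranks in the lowest
    `bottom_fraction` of the table.

    `write_requests` maps a region name to its write-request count.
    At least one region is always returned for a non-empty table.
    """
    if not write_requests:
        return []
    # Sort by count, then by name so ties are resolved deterministically.
    ranked = sorted(write_requests, key=lambda r: (write_requests[r], r))
    cutoff = max(1, int(len(ranked) * bottom_fraction))
    return ranked[:cutoff]


# Hypothetical counts: one region at 100 writes, nine at 1000.
counts = {"r%d" % i: 1000 for i in range(1, 10)}
counts["r0"] = 100
candidates = merge_candidates_by_write_load(counts)
```

Because the cutoff is a rank rather than an absolute rate, the lowest-ranked region is selected even when it carries a substantial load in absolute terms.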
[jira] [Commented] (HBASE-15249) Provide lower bound on number of regions in region normalizer for pre-split tables
[ https://issues.apache.org/jira/browse/HBASE-15249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15142935#comment-15142935 ]

stack commented on HBASE-15249:
---

bq. How about giving regions whose write requests are in the bottom 10% (can make this configurable) ?

Above is totally arbitrary. If a region is getting 100 hits a second, that's 10% of 1k -- you'd merge it because it is getting too little load?

bq. After some time, AMS stopped working because region normalizer merged the regions into few big regions which were not able to serve high read / write load. This is a big problem since the write requests flood the regions faster than the splits can happen resulting in poor performance.

Numbers? How many regions was too few? What was the hit rate that overwhelmed? What did the normalizer do? It merged up how many regions in how much time?

The problem here is that the time between presplit and the load coming on was too long? The normalizer merged up the regions before the load came on?
[jira] [Commented] (HBASE-15249) Provide lower bound on number of regions in region normalizer for pre-split tables
[ https://issues.apache.org/jira/browse/HBASE-15249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15143541#comment-15143541 ]

Ted Yu commented on HBASE-15249:

To better answer the questions above, we are trying to reproduce the problem with DEBUG logging on. Previously, the metrics table ended up with 2 regions.

Will be back with details.
[jira] [Commented] (HBASE-15249) Provide lower bound on number of regions in region normalizer for pre-split tables
[ https://issues.apache.org/jira/browse/HBASE-15249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15143049#comment-15143049 ]

Anoop Sam John commented on HBASE-15249:

Normalizer just checks the region size and decides on merge? Yes, when a region is getting write requests (even if far fewer than 10%), we should not allow it to be merged, IMHO. This means the region is growing. When 2 regions stop getting any write requests and maybe only have read requests, or not even that, they can be considered for merge.
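The rule suggested in this comment, that two regions qualify for a merge only once both have stopped receiving writes, could be approximated by comparing write-request counters between two successive normalizer runs. The Python sketch below assumes the counters are cumulative and that snapshots are kept between runs. Both function names are hypothetical, and counter resets (for example when a region is reopened) are ignored for brevity.

```python
def quiescent_regions(previous_counts, current_counts):
    """Regions whose cumulative write-request counter did not change
    between two snapshots. Regions missing from the earlier snapshot
    are treated as not yet known to be idle."""
    return {
        region
        for region, count in current_counts.items()
        if region in previous_counts and count == previous_counts[region]
    }


def can_merge(region_a, region_b, previous_counts, current_counts):
    """Allow a merge only when both regions saw no writes in the interval."""
    idle = quiescent_regions(previous_counts, current_counts)
    return region_a in idle and region_b in idle


# Hypothetical snapshots from two consecutive normalizer runs.
before = {"a": 500, "b": 120, "c": 40}
after = {"a": 500, "b": 120, "c": 41, "d": 0}
```

Even a single new write to region `c` keeps it out of the idle set, which matches the point that a region receiving any writes is still growing.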
[jira] [Commented] (HBASE-15249) Provide lower bound on number of regions in region normalizer for pre-split tables
[ https://issues.apache.org/jira/browse/HBASE-15249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15141875#comment-15141875 ]

Ted Yu commented on HBASE-15249:

Alternatively, we can retrieve write request counts from RegionLoad for the regions. If the two regions to be merged happen to receive the most write requests, don't merge them in the current iteration.
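The alternative described here can be sketched as a check that runs just before a merge plan is executed. The Python sketch below interprets "receive the most write requests" as "are the two highest-ranked regions by write-request count", which is an assumption. The function name and the dictionary input are hypothetical; in HBase the counts would come from RegionLoad, as the comment says.

```python
def skip_merge_this_iteration(region_a, region_b, write_requests):
    """Return True when the two regions proposed for merging are exactly
    the two regions with the highest write-request counts.

    `write_requests` maps a region name to its write-request count.
    """
    top_two = sorted(write_requests, key=write_requests.get, reverse=True)[:2]
    return {region_a, region_b} == set(top_two)


# Hypothetical per-region write-request counts.
writes = {"r1": 900, "r2": 850, "r3": 10, "r4": 5}
busy_pair = skip_merge_this_iteration("r1", "r2", writes)
quiet_pair = skip_merge_this_iteration("r3", "r4", writes)
```

Under this reading a merge of one busy region with one quiet region would still proceed, so the check is narrower than refusing to merge any region that is receiving writes.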
[jira] [Commented] (HBASE-15249) Provide lower bound on number of regions in region normalizer for pre-split tables
[ https://issues.apache.org/jira/browse/HBASE-15249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15142293#comment-15142293 ]

Anoop Sam John commented on HBASE-15249:

IMO, when 2 regions are getting write requests (not necessarily the highest counts), it may not be good to merge them. They are growing anyway, so merging based on region size won't be a good idea.