[jira] [Commented] (HBASE-20312) CCSMap: A faster, GC-friendly, less memory Concurrent Map for memstore
[ https://issues.apache.org/jira/browse/HBASE-20312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17159650#comment-17159650 ]

Reid Chan commented on HBASE-20312:
-----------------------------------

I'll try to make a patch for branch-1 first, have it tested, and then gather some statistics.

> CCSMap: A faster, GC-friendly, less memory Concurrent Map for memstore
> ----------------------------------------------------------------------
>
>                 Key: HBASE-20312
>                 URL: https://issues.apache.org/jira/browse/HBASE-20312
>             Project: HBase
>          Issue Type: New Feature
>          Components: regionserver
>            Reporter: Xiang Wang
>            Assignee: Chance Li
>            Priority: Major
>             Fix For: 3.0.0-alpha-1
>
>         Attachments: 1.1.2-ccsmap-number.png, HBASE-20312-1.3.2.patch,
> HBASE-20312-master.v1.patch, HBASE-20312-master.v2.patch,
> HBASE-20312-master.v3.patch, HBASE-20312-master.v4.patch,
> HBASE-20312-master.v5.patch, HBASE-20312-master.v6.patch,
> HBASE-20312-master.v7.patch, HBASE-20312-master.v8.patch,
> HBASE-20312-master.v9.patch, ccsmap-branch-1.1.patch, hits.png, jira1.png,
> jira2.png, jira3.png, off-heap-test-put-master.png,
> on-heap-test-put-master.png
>
>
> HBase currently uses ConcurrentSkipListMap as the memstore's data structure.
> Although MemSLAB reduces the memory fragmentation caused by key-value pairs,
> hundreds of millions of key-value pairs still make young-generation
> garbage-collection (GC) pauses long.
>
> ConcurrentSkipListMap has 2 GC problems:
> 1. HBase needs 3 objects on average to store one key-value: one
> Index (the skip list's average node height is 1), one Node, and one KeyValue.
> Too many objects are created for the memstore.
> 2. A recently inserted KeyValue and its map structure (Index, Node) are
> allocated in the young generation. The card table (for the CMS GC algorithm)
> or the RSet (for the G1 GC algorithm) changes frequently under high write
> throughput, which makes YGC slow.
>
> We developed a new skip-list map called CompactdConcurrentSkipListMap (CCSMap
> for short), which provides features similar to ConcurrentSkipListMap but
> gets rid of the per-key-value objects.
> CCSMap's memory structure is shown in this picture:
> !jira1.png!
>
> One CCSMap consists of a certain number of Chunks. One Chunk consists of a
> certain number of nodes. One node corresponds to one element. All of the
> element's information and its key-value are encoded in a contiguous memory
> segment without any objects.
> Features:
> 1. All insert, update, and delete operations on CCSMap are lock-free.
> 2. It consumes less memory: a 40% memory saving for 50-byte key-values.
> 3. It is faster on small key-values because of better cache-line usage: 20~30%
> better read/write throughput than ConcurrentSkipListMap for 50-byte key-values.
> CCSMap does not support reclaiming space when an element is deleted, but that
> doesn't matter for HBase because of region flush.
> CCSMap has been running on Alibaba's HBase clusters for over 17 months, and it
> cuts down YGC time significantly. Here are 2 graphs, before and after:
> !jira2.png!
> !jira3.png!



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
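
The issue description says each element is encoded in one contiguous memory segment without any per-element objects. The following is only a minimal sketch of that general idea, not the CCSMap format: the node layout (a 4-byte next offset, a 4-byte key length, a 4-byte value length, then the raw bytes) and the names ChunkSketch, writeNode, readKey, readValue and readNext are assumptions made for illustration. It is also single-threaded, whereas the description states that CCSMap operations are lock-free.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch: stores nodes inside one ByteBuffer "chunk" instead of
// creating an Index, a Node and a KeyValue object per element.
class ChunkSketch {
  // Assumed header: [next:int][keyLen:int][valueLen:int]
  private static final int HEADER_BYTES = 12;

  private final ByteBuffer chunk;
  private int nextFree = 0; // bump-pointer allocation; space is never reused

  ChunkSketch(int capacityBytes) {
    this.chunk = ByteBuffer.allocate(capacityBytes);
  }

  // Writes one node and returns its offset, or -1 if the chunk is full.
  int writeNode(byte[] key, byte[] value, int nextOffset) {
    int size = HEADER_BYTES + key.length + value.length;
    if (nextFree + size > chunk.capacity()) {
      return -1;
    }
    int offset = nextFree;
    chunk.putInt(offset, nextOffset);
    chunk.putInt(offset + 4, key.length);
    chunk.putInt(offset + 8, value.length);
    copyIn(offset + HEADER_BYTES, key);
    copyIn(offset + HEADER_BYTES + key.length, value);
    nextFree += size;
    return offset;
  }

  int readNext(int offset) {
    return chunk.getInt(offset);
  }

  byte[] readKey(int offset) {
    return copyOut(offset + HEADER_BYTES, chunk.getInt(offset + 4));
  }

  byte[] readValue(int offset) {
    int keyLen = chunk.getInt(offset + 4);
    return copyOut(offset + HEADER_BYTES + keyLen, chunk.getInt(offset + 8));
  }

  private void copyIn(int at, byte[] src) {
    for (int i = 0; i < src.length; i++) {
      chunk.put(at + i, src[i]);
    }
  }

  private byte[] copyOut(int at, int len) {
    byte[] out = new byte[len];
    for (int i = 0; i < len; i++) {
      out[i] = chunk.get(at + i);
    }
    return out;
  }

  public static void main(String[] args) {
    ChunkSketch c = new ChunkSketch(64);
    int first = c.writeNode("k1".getBytes(StandardCharsets.UTF_8),
        "v1".getBytes(StandardCharsets.UTF_8), -1);
    int second = c.writeNode("k2".getBytes(StandardCharsets.UTF_8),
        "v2".getBytes(StandardCharsets.UTF_8), first);
    System.out.println(new String(c.readKey(second), StandardCharsets.UTF_8)
        + " links to the node at offset " + c.readNext(second));
  }
}
```

Because a node is addressed by an integer offset rather than a Java reference, the garbage collector sees one buffer per chunk instead of several small objects per key-value, which is the effect the description attributes to CCSMap.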
[jira] [Comment Edited] (HBASE-20312) CCSMap: A faster, GC-friendly, less memory Concurrent Map for memstore
[ https://issues.apache.org/jira/browse/HBASE-20312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17159643#comment-17159643 ]

Andrew Kyle Purtell edited comment on HBASE-20312 at 7/17/20, 3:31 AM:
-----------------------------------------------------------------------

An issue I had when looking at this, to maybe test it, is it was developed against 1.1, and already in 1.3 there were significant differences that prevented applying the patch to that version, never mind 2.x, which has extensively changed due to the in memory compaction work. As a proof of concept, in 1.1, it’s interesting, but will need work not to be a dead end there.

was (Author: apurtell):
An issue I had when looking at this, to maybe test it, is it was developed against 1.1, and already in 1.3 there were significant differences that made applying the patch to that version, never mind 2.x, which has extensively changed due to the in memory compaction work. As a proof of concept, in 1.1, it’s interesting, but will need work not to be a dead end there.
[jira] [Commented] (HBASE-20312) CCSMap: A faster, GC-friendly, less memory Concurrent Map for memstore
[ https://issues.apache.org/jira/browse/HBASE-20312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17159643#comment-17159643 ]

Andrew Kyle Purtell commented on HBASE-20312:
---------------------------------------------

An issue I had when looking at this, to maybe test it, is it was developed against 1.1, and already in 1.3 there were significant differences that made applying the patch to that version, never mind 2.x, which has extensively changed due to the in memory compaction work. As a proof of concept, in 1.1, it’s interesting, but will need work not to be a dead end there.
[jira] [Resolved] (HBASE-24578) [WAL] Add a parameter to config RingBufferEventHandler's SyncFuture count
[ https://issues.apache.org/jira/browse/HBASE-24578?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Reid Chan resolved HBASE-24578.
-------------------------------
    Hadoop Flags: Reviewed
      Resolution: Fixed

> [WAL] Add a parameter to config RingBufferEventHandler's SyncFuture count
> -------------------------------------------------------------------------
>
>                 Key: HBASE-24578
>                 URL: https://issues.apache.org/jira/browse/HBASE-24578
>             Project: HBase
>          Issue Type: Improvement
>          Components: wal
>    Affects Versions: 1.4.13, 2.2.5
>            Reporter: Reid Chan
>            Assignee: wenfeiyi666
>            Priority: Major
>             Fix For: 3.0.0-alpha-1, 2.3.1, 1.7.0, 2.2.6
>
>
> RingBufferEventHandler's SyncFuture count is currently taken from
> {{hbase.regionserver.handler.count}}, which works well with the default WAL
> provider (one WAL per regionserver).
> With the WAL group provider, whether grouping by group or using one WAL per
> region, this default is a poor fit. If a regionserver has 100 regions and the
> WAL-per-region strategy is used, the regionserver allocates 100
> SyncFuture[$hbase.regionserver.handler.count] arrays:
> {code}
> int maxHandlersCount = conf.getInt(HConstants.REGION_SERVER_HANDLER_COUNT, 200);
> this.ringBufferEventHandler = new RingBufferEventHandler(
>     conf.getInt("hbase.regionserver.hlog.syncer.count", 5), maxHandlersCount);
> ...
>
> RingBufferEventHandler(final int syncRunnerCount, final int maxHandlersCount) {
>   this.syncFutures = new SyncFuture[maxHandlersCount];
>   ...
> }
> {code}
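
To make the arithmetic in the description concrete, here is a back-of-the-envelope sketch of how many SyncFuture slots a regionserver ends up with. The class and method names are made up for illustration; the figure of 200 is the fallback passed to conf.getInt in the quoted snippet, and 100 is the region count used in the description.

```java
// Hypothetical helper, not HBase code: counts SyncFuture array slots only,
// it does not measure actual heap usage.
class SyncFutureSlotEstimate {

  // Each WAL's RingBufferEventHandler allocates one SyncFuture[slotsPerWal] array.
  static long totalSlots(int walCount, int slotsPerWal) {
    return (long) walCount * slotsPerWal;
  }

  public static void main(String[] args) {
    int handlerCount = 200; // fallback value in the quoted snippet

    // Default provider: one WAL per regionserver.
    System.out.println("one WAL:        " + totalSlots(1, handlerCount));

    // WAL-per-region strategy with 100 regions, as in the description.
    System.out.println("WAL per region: " + totalSlots(100, handlerCount));
  }
}
```

With these inputs the slot count grows from 200 to 20,000, which is why a separate, smaller setting per WAL is useful once the number of WALs grows.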
[jira] [Updated] (HBASE-24578) [WAL] Add a parameter to config RingBufferEventHandler's SyncFuture count
[ https://issues.apache.org/jira/browse/HBASE-24578?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Reid Chan updated HBASE-24578:
------------------------------
    Release Note: Introduce a new parameter "hbase.regionserver.wal.sync.batch.count" to control the WAL sync batch size, which equals "hbase.regionserver.handler.count" by default. The default should work well with the default WAL provider (one WAL per regionserver). If you use read/write separated handlers, you can set "hbase.regionserver.wal.sync.batch.count" to the number of write handlers. If you use wal-per-groups or wal-per-region, consider lowering "hbase.regionserver.wal.sync.batch.count": the default will be too big and will consume more memory as the number of WALs grows.  (was: Introduce a new parameter "hbase.regionserver.wal.sync.batch.count" to control the WAL sync batch size, which equals "hbase.regionserver.handler.count" by default. The default should work well with the default WAL provider (one WAL per regionserver). If you use Read/Write separated handlers, you can set "hbase.regionserver.wal.sync.batch.count" to the number of write handlers. If you use wal-per-groups or wal-per-region, consider lowering "hbase.regionserver.wal.sync.batch.count": the default will be too big and will consume more memory as the number of WALs grows.)
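
The release note describes a setting that falls back to the handler count when it is not set. The sketch below shows that fallback using java.util.Properties as a stand-in for the HBase Configuration object, so it is not the code from the patch; the class and method names are hypothetical, and the fallback of 200 for the handler count simply mirrors the snippet quoted in the issue description.

```java
import java.util.Properties;

// Hypothetical sketch of "new key defaults to the handler count".
class WalSyncBatchConfigSketch {
  static final String HANDLER_COUNT = "hbase.regionserver.handler.count";
  static final String SYNC_BATCH_COUNT = "hbase.regionserver.wal.sync.batch.count";

  static int syncBatchCount(Properties conf) {
    // 200 mirrors the fallback used in the snippet quoted in the issue.
    int handlerCount = Integer.parseInt(conf.getProperty(HANDLER_COUNT, "200"));
    // If the new key is absent, use the handler count.
    return Integer.parseInt(
        conf.getProperty(SYNC_BATCH_COUNT, String.valueOf(handlerCount)));
  }

  public static void main(String[] args) {
    Properties conf = new Properties();
    conf.setProperty(HANDLER_COUNT, "120");
    System.out.println("unset:    " + syncBatchCount(conf));

    // For example, with WAL per region one might lower the per-WAL value.
    conf.setProperty(SYNC_BATCH_COUNT, "16");
    System.out.println("explicit: " + syncBatchCount(conf));
  }
}
```

The value 16 is arbitrary; the release note only says to consider lowering the setting for wal-per-groups or wal-per-region, or to match the number of write handlers when read/write handlers are separated.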
[jira] [Updated] (HBASE-24578) [WAL] Add a parameter to config RingBufferEventHandler's SyncFuture count
[ https://issues.apache.org/jira/browse/HBASE-24578?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Reid Chan updated HBASE-24578:
------------------------------
    Release Note: Introduce a new parameter "hbase.regionserver.wal.sync.batch.count" to control the WAL sync batch size, which equals "hbase.regionserver.handler.count" by default. The default should work well with the default WAL provider (one WAL per regionserver). If you use Read/Write separated handlers, you can set "hbase.regionserver.wal.sync.batch.count" to the number of write handlers. If you use wal-per-groups or wal-per-region, consider lowering "hbase.regionserver.wal.sync.batch.count": the default will be too big and will consume more memory as the number of WALs grows.  (was: Introduce a new parameter "hbase.regionserver.wal.sync.batch.count" to control the WAL sync batch size, which equals "hbase.regionserver.handler.count" by default. The default should work well with the default WAL provider (one WAL per regionserver). If you use Read/Write separated handlers, you can set "hbase.regionserver.wal.sync.batch.count" to the number of write handlers.)
[jira] [Updated] (HBASE-24578) [WAL] Add a parameter to config RingBufferEventHandler's SyncFuture count
[ https://issues.apache.org/jira/browse/HBASE-24578?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Reid Chan updated HBASE-24578:
------------------------------
    Release Note: Introduce a new parameter "hbase.regionserver.wal.sync.batch.count" to control the WAL sync batch size, which equals "hbase.regionserver.handler.count" by default. The default should work well with the default WAL provider (one WAL per regionserver). If you use Read/Write separated handlers, you can set "hbase.regionserver.wal.sync.batch.count" to the number of write handlers.
[jira] [Updated] (HBASE-24578) [WAL] Add a parameter to config RingBufferEventHandler's SyncFuture count
[ https://issues.apache.org/jira/browse/HBASE-24578?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Reid Chan updated HBASE-24578:
------------------------------
    Fix Version/s: 1.7.0
[jira] [Commented] (HBASE-20312) CCSMap: A faster, GC-friendly, less memory Concurrent Map for memstore
[ https://issues.apache.org/jira/browse/HBASE-20312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17159625#comment-17159625 ]

Reid Chan commented on HBASE-20312:
-----------------------------------

Is anyone still working on this? I'd like to get some insight into it and have it tested.
[jira] [Assigned] (HBASE-24745) 'Failed report transition' logs too often
[ https://issues.apache.org/jira/browse/HBASE-24745?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

wenfeiyi666 reassigned HBASE-24745:
-----------------------------------

    Assignee: wenfeiyi666

> 'Failed report transition' logs too often
> -----------------------------------------
>
>                 Key: HBASE-24745
>                 URL: https://issues.apache.org/jira/browse/HBASE-24745
>             Project: HBase
>          Issue Type: Sub-task
>    Affects Versions: 2.3.0
>            Reporter: Michael Stack
>            Assignee: wenfeiyi666
>            Priority: Minor
>
> The parent issue fixed a backoff that was too aggressive. Now I notice we try
> too much. Saw 9k logs in 17 seconds of the below type...
> {code:java}
> 2020-07-15 14:36:23,104 INFO
> org.apache.hadoop.hbase.regionserver.HRegionServer: Failed report transition
> server { host_name: "X.example.org" port: 16020 start_code: 1594823099666 }
> transition { transition_code: CLOSED region_info { region_id:
> 1594814749475 table_name { namespace: "default" qualifier:
> "IntegrationTestBigLinkedList" } start_key: "\"\"\"\"\"\"\" " end_key:
> "#Q\352\f\003" offline: false split: false replica_id: 0 } proc_id:
> 81545 }; retry (#) after 200805ms delay (Master is coming online...).
> {code}
> The delay doesn't seem correct or respected.
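
The report above contrasts a logged delay of 200805 ms with roughly 9k log lines in 17 seconds. As a rough sketch of what "respecting the delay" would mean, the helper below computes a capped exponential backoff and checks whether enough time has passed before the next attempt. It is not the HRegionServer retry code: the class, the method names, and the doubling policy are all assumptions for illustration.

```java
// Hypothetical sketch of a retry gate that honours the computed delay.
class ReportRetrySketch {

  // Capped exponential backoff: base * 2^attempt, limited to maxMillis.
  static long backoffMillis(long baseMillis, long maxMillis, int attempt) {
    int shift = Math.min(attempt, 30); // keep the shift small to avoid overflow
    return Math.min(baseMillis << shift, maxMillis);
  }

  // A retry (and its log line) is allowed only once the delay has elapsed.
  static boolean mayRetry(long lastAttemptMillis, long delayMillis, long nowMillis) {
    return nowMillis - lastAttemptMillis >= delayMillis;
  }

  public static void main(String[] args) {
    long delay = 200805L; // delay reported in the quoted log line
    // 17 seconds after the last attempt, no retry should have happened yet.
    System.out.println(mayRetry(0L, delay, 17_000L));
    // Once the full delay has passed, one retry is allowed.
    System.out.println(mayRetry(0L, delay, delay));
  }
}
```

Under such a gate, a 17-second window after a failure with a 200805 ms delay would contain no further attempts, which is the mismatch the reporter points out.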
[GitHub] [hbase] Reidddddd merged pull request #2063: HBASE-24578 [WAL] Add a parameter to config RingBufferEventHandler's SyncFuture count
Reidddddd merged pull request #2063:
URL: https://github.com/apache/hbase/pull/2063


   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
[GitHub] [hbase] Reidddddd commented on pull request #2063: HBASE-24578 [WAL] Add a parameter to config RingBufferEventHandler's SyncFuture count
Reidddddd commented on pull request #2063:
URL: https://github.com/apache/hbase/pull/2063#issuecomment-659794617


   Let me push it directly; it shouldn't cause much of a problem. It has been stuck for too long.
[jira] [Commented] (HBASE-24637) Reseek regression related to filter SKIP hinting
[ https://issues.apache.org/jira/browse/HBASE-24637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17159618#comment-17159618 ]

Lars Hofhansl commented on HBASE-24637:
---------------------------------------

I see the logic in UserScanQueryMatcher (in mergeFilterResponse()) has changed to do exactly the logic I described above.
It tries to be smarter in the case where the Filter said SKIP but the SQM said SEEK. In theory SEEK'ing is better, but it looks like it's causing exactly this change of behavior.

> Reseek regression related to filter SKIP hinting
> ------------------------------------------------
>
>                 Key: HBASE-24637
>                 URL: https://issues.apache.org/jira/browse/HBASE-24637
>             Project: HBase
>          Issue Type: Bug
>          Components: Filters, Performance, Scanners
>    Affects Versions: 2.2.5
>            Reporter: Andrew Kyle Purtell
>            Priority: Major
>         Attachments: W-7665966-FAST_DIFF-FILTER_ALL.pdf,
> W-7665966-Instrument-low-level-scan-details-branch-1.patch,
> W-7665966-Instrument-low-level-scan-details-branch-2.2.patch,
> parse_call_trace.pl
>
>
> I have been looking into reported performance regressions in HBase 2 relative
> to HBase 1. Depending on the test scenario, HBase 2 can demonstrate
> significantly better microbenchmarks in a number of cases, and usually shows
> improvement in whole cluster benchmarks like YCSB.
> To assist in debugging I added methods to RpcServer for updating per-call
> metrics that leverage the fact it puts a reference to the current Call into a
> thread local and that all activity for a given RPC is processed by a single
> thread context. I then instrumented ScanQueryMatcher (in branch-1) and its
> various friends (in branch-2.2), StoreScanner, HFileReaderV2 and
> HFileReaderV3 (in branch-1) and HFileReaderImpl (in branch-2.2), HFileBlock,
> and DefaultMemStore (branch-1) and SegmentScanner (branch-2.2). Test tables
> with one family and 1, 5, 10, 20, 50, and 100 distinct column-qualifiers per
> row were created, snapshot, dropped, and cloned from the snapshot. Both 1.6
> and 2.2 versions under test operated on identical data files in HDFS. For
> tests with 1.6 and 2.2 on the server side the same 1.6 PE client was used, to
> ensure only the server side differed.
> The results for pe --filterAll were revealing. See attached.
> It appears a refactor to ScanQueryMatcher and friends has disabled the
> ability of filters to provide meaningful SKIP hints, which disables an
> optimization that avoids reseeking, leading to a serious and proportional
> regression in reseek activity and time spent in that code path. So for
> queries that use filters, there can be a substantial regression.
> Other test cases that did not use filters did not show this regression. If
> filters are not used the behavior of ScanQueryMatcher between 1.6 and 2.2 was
> almost identical, as measured by counts of the hint types returned, whether
> or not column or version trackers are called, and counts of store seeks or
> reseeks. Regarding micro-timings, there was a 10% variance in my testing and
> results generally fell within this range, except for the filter all case of
> course.
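
The comment above concerns how a filter's SKIP answer is combined with the query matcher's own seek hint. The sketch below contrasts two ways of merging them. It is a simplified illustration under assumptions, not the body of mergeFilterResponse(): the enum is a hypothetical subset of match codes, and "filterSkipWins" is only a guess at the shape of the earlier behaviour, offered for comparison.

```java
// Hypothetical sketch of two merge policies for the case where the filter
// answers SKIP and the matcher may have its own hint.
class MergeFilterResponseSketch {
  enum Code { INCLUDE, SKIP, SEEK_NEXT_COL, SEEK_NEXT_ROW }

  // Policy A (assumed earlier behaviour): the filter's SKIP is passed through,
  // so the scanner steps to the next cell without reseeking.
  static Code filterSkipWins(Code matcherCode) {
    return Code.SKIP;
  }

  // Policy B (the "smarter" merge described in the comment): if the matcher
  // wanted to seek, keep that seek even though the filter said SKIP.
  static Code preferMatcherSeek(Code matcherCode) {
    switch (matcherCode) {
      case SEEK_NEXT_COL:
      case SEEK_NEXT_ROW:
        return matcherCode;
      default:
        return Code.SKIP;
    }
  }

  public static void main(String[] args) {
    for (Code matcher : Code.values()) {
      System.out.println("matcher=" + matcher
          + " A=" + filterSkipWins(matcher)
          + " B=" + preferMatcherSeek(matcher));
    }
  }
}
```

Under policy B every filtered cell whose matcher code is a seek triggers a reseek, which matches the reported growth in reseek counts for the filter-all workload; whether that is the complete explanation is for the issue discussion to settle.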
[GitHub] [hbase] bsglz commented on pull request #2042: HBASE-24704 Make the Table Schema easier to view even there are multi…
bsglz commented on pull request #2042:
URL: https://github.com/apache/hbase/pull/2042#issuecomment-659786642


   PTAL. Thanks. @wchevreuil @virajjasani
[GitHub] [hbase] bsglz commented on pull request #2044: HBASE-24709 Support MoveCostFunction use a lower multiplier in offpea…
bsglz commented on pull request #2044:
URL: https://github.com/apache/hbase/pull/2044#issuecomment-659786007


   @virajjasani Ping.
[GitHub] [hbase] bsglz commented on pull request #2049: HBASE-24382 Flush partial stores of region filtered by seqId when arc…
bsglz commented on pull request #2049:
URL: https://github.com/apache/hbase/pull/2049#issuecomment-659785821


   Could you help to merge this one? Thanks. @wchevreuil
[GitHub] [hbase] bsglz commented on pull request #2054: HBASE-24664 Some changing of split region by overall region size rath…
bsglz commented on pull request #2054:
URL: https://github.com/apache/hbase/pull/2054#issuecomment-659785660


   @wchevreuil Ping.
[jira] [Commented] (HBASE-24665) all wal of RegionGroupingProvider together roll
[ https://issues.apache.org/jira/browse/HBASE-24665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17159597#comment-17159597 ]

wenfeiyi666 commented on HBASE-24665:
-------------------------------------

ping [~zhangduo] and [~anoop.hbase]

> all wal of RegionGroupingProvider together roll
> -----------------------------------------------
>
>                 Key: HBASE-24665
>                 URL: https://issues.apache.org/jira/browse/HBASE-24665
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 2.3.0, master, 2.1.10, 1.4.14, 2.2.6
>            Reporter: wenfeiyi666
>            Assignee: wenfeiyi666
>            Priority: Major
>             Fix For: 3.0.0-alpha-1, 2.3.1, 2.1.10, 1.4.14, 2.2.7
>
>
> When multiwal is used and any one WAL requests a roll, all WALs are rolled
> together.
[GitHub] [hbase] Apache-HBase commented on pull request #2052: HBASE-24718 : Generic NamedQueue framework for multiple use-cases (Refactor SlowLog responses)
Apache-HBase commented on pull request #2052: URL: https://github.com/apache/hbase/pull/2052#issuecomment-659732589

:confetti_ball: **+1 overall**

| Vote | Subsystem | Runtime | Comment |
|:----:|----------:|--------:|:--------|
| +0 :ok: | reexec | 1m 0s | Docker mode activated. |
| -0 :warning: | yetus | 0m 3s | Unprocessed flag(s): --brief-report-file --spotbugs-strict-precheck --whitespace-eol-ignore-list --whitespace-tabs-ignore-list --quick-hadoopcheck |
||| _ Prechecks _ |
||| _ master Compile Tests _ |
| +0 :ok: | mvndep | 0m 21s | Maven dependency ordering for branch |
| +1 :green_heart: | mvninstall | 3m 58s | master passed |
| +1 :green_heart: | compile | 2m 26s | master passed |
| +1 :green_heart: | shadedjars | 6m 1s | branch has no errors when building our shaded downstream artifacts. |
| +1 :green_heart: | javadoc | 1m 31s | master passed |
| -0 :warning: | patch | 8m 7s | Used diff version of patch file. Binary files and potentially other changes not applied. Please rebase and squash commits if necessary. |
||| _ Patch Compile Tests _ |
| +0 :ok: | mvndep | 0m 13s | Maven dependency ordering for patch |
| +1 :green_heart: | mvninstall | 3m 44s | the patch passed |
| +1 :green_heart: | compile | 2m 28s | the patch passed |
| +1 :green_heart: | javac | 2m 28s | the patch passed |
| +1 :green_heart: | shadedjars | 6m 2s | patch has no errors when building our shaded downstream artifacts. |
| +1 :green_heart: | javadoc | 1m 27s | the patch passed |
||| _ Other Tests _ |
| +1 :green_heart: | unit | 0m 47s | hbase-protocol-shaded in the patch passed. |
| +1 :green_heart: | unit | 1m 44s | hbase-common in the patch passed. |
| +1 :green_heart: | unit | 1m 7s | hbase-client in the patch passed. |
| +1 :green_heart: | unit | 201m 35s | hbase-server in the patch passed. |
| | | 236m 34s | |

| Subsystem | Report/Notes |
|----------:|:-------------|
| Docker | Client=19.03.12 Server=19.03.12 base: https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-2052/4/artifact/yetus-jdk8-hadoop3-check/output/Dockerfile |
| GITHUB PR | https://github.com/apache/hbase/pull/2052 |
| Optional Tests | javac javadoc unit shadedjars compile |
| uname | Linux 6c81c7f0a354 4.15.0-101-generic #102-Ubuntu SMP Mon May 11 10:07:26 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev-support/hbase-personality.sh |
| git revision | master / 2505c7760d |
| Default Java | 1.8.0_232 |
| Test Results | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-2052/4/testReport/ |
| Max. process+thread count | 3476 (vs. ulimit of 12500) |
| modules | C: hbase-protocol-shaded hbase-common hbase-client hbase-server U: . |
| Console output | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-2052/4/console |
| versions | git=2.17.1 maven=(cecedd343002696d0abb50b32b541b8a6ba2883f) |
| Powered by | Apache Yetus 0.11.1 https://yetus.apache.org |

This message was automatically generated.
[GitHub] [hbase] Apache-HBase commented on pull request #2052: HBASE-24718 : Generic NamedQueue framework for multiple use-cases (Refactor SlowLog responses)
Apache-HBase commented on pull request #2052: URL: https://github.com/apache/hbase/pull/2052#issuecomment-659729307

:confetti_ball: **+1 overall**

| Vote | Subsystem | Runtime | Comment |
|:----:|----------:|--------:|:--------|
| +0 :ok: | reexec | 0m 30s | Docker mode activated. |
| -0 :warning: | yetus | 0m 3s | Unprocessed flag(s): --brief-report-file --spotbugs-strict-precheck --whitespace-eol-ignore-list --whitespace-tabs-ignore-list --quick-hadoopcheck |
||| _ Prechecks _ |
||| _ master Compile Tests _ |
| +0 :ok: | mvndep | 0m 22s | Maven dependency ordering for branch |
| +1 :green_heart: | mvninstall | 4m 39s | master passed |
| +1 :green_heart: | compile | 3m 7s | master passed |
| +1 :green_heart: | shadedjars | 6m 21s | branch has no errors when building our shaded downstream artifacts. |
| -0 :warning: | javadoc | 0m 25s | hbase-client in master failed. |
| -0 :warning: | javadoc | 0m 16s | hbase-common in master failed. |
| -0 :warning: | javadoc | 0m 40s | hbase-server in master failed. |
| -0 :warning: | patch | 8m 33s | Used diff version of patch file. Binary files and potentially other changes not applied. Please rebase and squash commits if necessary. |
||| _ Patch Compile Tests _ |
| +0 :ok: | mvndep | 0m 13s | Maven dependency ordering for patch |
| +1 :green_heart: | mvninstall | 4m 29s | the patch passed |
| +1 :green_heart: | compile | 3m 3s | the patch passed |
| +1 :green_heart: | javac | 3m 3s | the patch passed |
| +1 :green_heart: | shadedjars | 6m 16s | patch has no errors when building our shaded downstream artifacts. |
| -0 :warning: | javadoc | 0m 16s | hbase-common in the patch failed. |
| -0 :warning: | javadoc | 0m 25s | hbase-client in the patch failed. |
| -0 :warning: | javadoc | 0m 40s | hbase-server in the patch failed. |
||| _ Other Tests _ |
| +1 :green_heart: | unit | 1m 2s | hbase-protocol-shaded in the patch passed. |
| +1 :green_heart: | unit | 1m 47s | hbase-common in the patch passed. |
| +1 :green_heart: | unit | 1m 22s | hbase-client in the patch passed. |
| +1 :green_heart: | unit | 188m 50s | hbase-server in the patch passed. |
| | | 227m 20s | |

| Subsystem | Report/Notes |
|----------:|:-------------|
| Docker | Client=19.03.12 Server=19.03.12 base: https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-2052/4/artifact/yetus-jdk11-hadoop3-check/output/Dockerfile |
| GITHUB PR | https://github.com/apache/hbase/pull/2052 |
| Optional Tests | javac javadoc unit shadedjars compile |
| uname | Linux 87bd73b4c07e 4.15.0-101-generic #102-Ubuntu SMP Mon May 11 10:07:26 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev-support/hbase-personality.sh |
| git revision | master / 2505c7760d |
| Default Java | 2020-01-14 |
| javadoc | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-2052/4/artifact/yetus-jdk11-hadoop3-check/output/branch-javadoc-hbase-client.txt |
| javadoc | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-2052/4/artifact/yetus-jdk11-hadoop3-check/output/branch-javadoc-hbase-common.txt |
| javadoc | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-2052/4/artifact/yetus-jdk11-hadoop3-check/output/branch-javadoc-hbase-server.txt |
| javadoc | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-2052/4/artifact/yetus-jdk11-hadoop3-check/output/patch-javadoc-hbase-common.txt |
| javadoc | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-2052/4/artifact/yetus-jdk11-hadoop3-check/output/patch-javadoc-hbase-client.txt |
| javadoc | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-2052/4/artifact/yetus-jdk11-hadoop3-check/output/patch-javadoc-hbase-server.txt |
| Test Results | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-2052/4/testReport/ |
| Max. process+thread count | 3480 (vs. ulimit of 12500) |
| modules | C: hbase-protocol-shaded hbase-common hbase-client hbase-server U: . |
| Console output | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-2052/4/console |
| versions | git=2.17.1 maven=(cecedd343002696d0abb50b32b541b8a6ba2883f) |
| Powered by | Apache Yetus 0.11.1 https://yetus.apache.org |

This message was automatically generated.
[jira] [Commented] (HBASE-24467) Backport HBASE-23963: Split TestFromClientSide; it takes too long to complete timing out
[ https://issues.apache.org/jira/browse/HBASE-24467?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17159535#comment-17159535 ] Hudson commented on HBASE-24467: Results for branch branch-2.2 [build #914 on builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.2/914/]: (x) *{color:red}-1 overall{color}* details (if available): (/) {color:green}+1 general checks{color} -- For more information [see general report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.2/914//General_Nightly_Build_Report/] (/) {color:green}+1 jdk8 hadoop2 checks{color} -- For more information [see jdk8 (hadoop2) report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.2/914//JDK8_Nightly_Build_Report_(Hadoop2)/] (x) {color:red}-1 jdk8 hadoop3 checks{color} -- For more information [see jdk8 (hadoop3) report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.2/914//JDK8_Nightly_Build_Report_(Hadoop3)/] (/) {color:green}+1 source release artifact{color} -- See build output for details. (/) {color:green}+1 client integration test{color} > Backport HBASE-23963: Split TestFromClientSide; it takes too long to complete > timing out > > > Key: HBASE-24467 > URL: https://issues.apache.org/jira/browse/HBASE-24467 > Project: HBase > Issue Type: Improvement >Affects Versions: 2.2.5 >Reporter: Guanghao Zhang >Assignee: Guanghao Zhang >Priority: Major > Fix For: 2.2.6 > >
[jira] [Commented] (HBASE-24721) rename_rsgroup overwriting the existing rsgroup.
[ https://issues.apache.org/jira/browse/HBASE-24721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17159534#comment-17159534 ] Hudson commented on HBASE-24721: Results for branch branch-2.2 [build #914 on builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.2/914/]: (x) *{color:red}-1 overall{color}* details (if available): (/) {color:green}+1 general checks{color} -- For more information [see general report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.2/914//General_Nightly_Build_Report/] (/) {color:green}+1 jdk8 hadoop2 checks{color} -- For more information [see jdk8 (hadoop2) report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.2/914//JDK8_Nightly_Build_Report_(Hadoop2)/] (x) {color:red}-1 jdk8 hadoop3 checks{color} -- For more information [see jdk8 (hadoop3) report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.2/914//JDK8_Nightly_Build_Report_(Hadoop3)/] (/) {color:green}+1 source release artifact{color} -- See build output for details. (/) {color:green}+1 client integration test{color} > rename_rsgroup overwriting the existing rsgroup. > > > Key: HBASE-24721 > URL: https://issues.apache.org/jira/browse/HBASE-24721 > Project: HBase > Issue Type: Bug >Reporter: chiranjeevi >Assignee: Mohammad Arshad >Priority: Major > Fix For: 3.0.0-alpha-1, 2.3.1, 1.7.0, 2.4.0, 2.2.6 > >
> rename_rsgroup overwrites an existing rsgroup.
> Steps:
> 1) add_rsgroup 'RSG1' and 'RSG2'
> 2) move_servers_rsgroup 'RSG1',['server1:port']
> 3) rename_rsgroup 'RSG1','RSG2'
> After step 3, RSG1 overwrites RSG2, and the region servers that were added to RSG1 are no longer available.
> Ideally the system should show the error message "Group already exists: RSG2".
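The behaviour the HBASE-24721 report asks for can be sketched as a guard on the rename path. This is only an illustration under stated assumptions: `RsGroupRegistry` and its methods are hypothetical in-memory stand-ins, not the HBase rsgroup API, and the sketch does not claim to match the patch that was actually committed. The error text follows the message proposed in the report.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical in-memory stand-in for rsgroup bookkeeping; not the HBase rsgroup API.
class RsGroupRegistry {
  private final Map<String, Set<String>> groups = new HashMap<>();

  void addGroup(String name) {
    if (groups.containsKey(name)) {
      throw new IllegalArgumentException("Group already exists: " + name);
    }
    groups.put(name, new TreeSet<>());
  }

  void addServer(String group, String server) {
    Set<String> servers = groups.get(group);
    if (servers == null) {
      throw new IllegalArgumentException("Group does not exist: " + group);
    }
    servers.add(server);
  }

  // Reject the rename up front instead of silently replacing the target group.
  void renameGroup(String oldName, String newName) {
    if (!groups.containsKey(oldName)) {
      throw new IllegalArgumentException("Group does not exist: " + oldName);
    }
    if (groups.containsKey(newName)) {
      throw new IllegalArgumentException("Group already exists: " + newName);
    }
    groups.put(newName, groups.remove(oldName));
  }

  // Returns the servers of a group, or null if the group is unknown.
  Set<String> servers(String group) {
    return groups.get(group);
  }
}
```

With this guard, replaying the three steps from the report leaves RSG1 and its servers untouched and surfaces the error to the caller, while a rename onto an unused name still succeeds.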
[jira] [Commented] (HBASE-24615) MutableRangeHistogram#updateSnapshotRangeMetrics doesn't calculate the distribution for last bucket.
[ https://issues.apache.org/jira/browse/HBASE-24615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17159533#comment-17159533 ] Hudson commented on HBASE-24615: Results for branch branch-2.2 [build #914 on builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.2/914/]: (x) *{color:red}-1 overall{color}* details (if available): (/) {color:green}+1 general checks{color} -- For more information [see general report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.2/914//General_Nightly_Build_Report/] (/) {color:green}+1 jdk8 hadoop2 checks{color} -- For more information [see jdk8 (hadoop2) report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.2/914//JDK8_Nightly_Build_Report_(Hadoop2)/] (x) {color:red}-1 jdk8 hadoop3 checks{color} -- For more information [see jdk8 (hadoop3) report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.2/914//JDK8_Nightly_Build_Report_(Hadoop3)/] (/) {color:green}+1 source release artifact{color} -- See build output for details. (/) {color:green}+1 client integration test{color} > MutableRangeHistogram#updateSnapshotRangeMetrics doesn't calculate the > distribution for last bucket. > > > Key: HBASE-24615 > URL: https://issues.apache.org/jira/browse/HBASE-24615 > Project: HBase > Issue Type: Bug > Components: metrics >Affects Versions: 2.3.0, master, 1.3.7, 2.2.6 >Reporter: Rushabh Shah >Assignee: wenfeiyi666 >Priority: Major > Fix For: 3.0.0-alpha-1, 2.3.1, 1.7.0, 2.4.0, 2.2.6 > >
> We are not processing the distribution for last bucket.
> https://github.com/apache/hbase/blob/master/hbase-hadoop-compat/src/main/java/org/apache/hadoop/metrics2/lib/MutableRangeHistogram.java#L70
> {code:java}
> public void updateSnapshotRangeMetrics(MetricsRecordBuilder metricsRecordBuilder,
>     Snapshot snapshot) {
>   long priorRange = 0;
>   long cumNum = 0;
>   final long[] ranges = getRanges();
>   final String rangeType = getRangeType();
>   for (int i = 0; i < ranges.length - 1; i++) { // -> The bug lies here. We are not processing last bucket.
>     long val = snapshot.getCountAtOrBelow(ranges[i]);
>     if (val - cumNum > 0) {
>       metricsRecordBuilder.addCounter(
>           Interns.info(name + "_" + rangeType + "_" + priorRange + "-" + ranges[i], desc),
>           val - cumNum);
>     }
>     priorRange = ranges[i];
>     cumNum = val;
>   }
>   long val = snapshot.getCount();
>   if (val - cumNum > 0) {
>     metricsRecordBuilder.addCounter(
>         Interns.info(name + "_" + rangeType + "_" + ranges[ranges.length - 1] + "-inf", desc),
>         val - cumNum);
>   }
> }
> {code}
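To make the effect of the quoted loop bound concrete, here is a minimal standalone sketch of the same bucketing logic with the loop running over every range boundary. It is one possible correction, not necessarily the change committed for HBASE-24615. `RangeBuckets`, `countAtOrBelow` and `distribute` are hypothetical names, and plain arrays stand in for `Snapshot` and `MetricsRecordBuilder`.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Simplified stand-in for the quoted method: plain arrays replace Snapshot and MetricsRecordBuilder.
class RangeBuckets {
  // Number of samples with value <= bound.
  static long countAtOrBelow(long[] samples, long bound) {
    long n = 0;
    for (long s : samples) {
      if (s <= bound) {
        n++;
      }
    }
    return n;
  }

  // Returns bucket label -> count, visiting every range boundary including the last one.
  // Assumes ranges is non-empty and sorted ascending.
  static Map<String, Long> distribute(long[] ranges, long[] samples) {
    Map<String, Long> out = new LinkedHashMap<>();
    long priorRange = 0;
    long cumNum = 0;
    for (int i = 0; i < ranges.length; i++) { // ranges.length, not ranges.length - 1
      long val = countAtOrBelow(samples, ranges[i]);
      if (val - cumNum > 0) {
        out.put(priorRange + "-" + ranges[i], val - cumNum);
      }
      priorRange = ranges[i];
      cumNum = val;
    }
    // Whatever lies above the last boundary goes to the open-ended bucket.
    long total = samples.length;
    if (total - cumNum > 0) {
      out.put(ranges[ranges.length - 1] + "-inf", total - cumNum);
    }
    return out;
  }
}
```

With ranges `{1, 3, 10}` and samples `{1, 2, 5, 20}`, the sample 5 lands in its own `3-10` bucket. With the `ranges.length - 1` bound it would instead be folded into `10-inf`, which is the symptom the report describes.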
[jira] [Commented] (HBASE-24689) Generate CHANGES.md and RELEASENOTES.md for 2.2.6
[ https://issues.apache.org/jira/browse/HBASE-24689?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17159538#comment-17159538 ] Hudson commented on HBASE-24689: Results for branch branch-2.2 [build #914 on builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.2/914/]: (x) *{color:red}-1 overall{color}* details (if available): (/) {color:green}+1 general checks{color} -- For more information [see general report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.2/914//General_Nightly_Build_Report/] (/) {color:green}+1 jdk8 hadoop2 checks{color} -- For more information [see jdk8 (hadoop2) report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.2/914//JDK8_Nightly_Build_Report_(Hadoop2)/] (x) {color:red}-1 jdk8 hadoop3 checks{color} -- For more information [see jdk8 (hadoop3) report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.2/914//JDK8_Nightly_Build_Report_(Hadoop3)/] (/) {color:green}+1 source release artifact{color} -- See build output for details. (/) {color:green}+1 client integration test{color} > Generate CHANGES.md and RELEASENOTES.md for 2.2.6 > - > > Key: HBASE-24689 > URL: https://issues.apache.org/jira/browse/HBASE-24689 > Project: HBase > Issue Type: Sub-task >Reporter: Guanghao Zhang >Assignee: Guanghao Zhang >Priority: Major > Fix For: 2.2.6 > >
[jira] [Commented] (HBASE-23963) Split TestFromClientSide; it takes too long to complete timing out
[ https://issues.apache.org/jira/browse/HBASE-23963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17159536#comment-17159536 ] Hudson commented on HBASE-23963: Results for branch branch-2.2 [build #914 on builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.2/914/]: (x) *{color:red}-1 overall{color}* details (if available): (/) {color:green}+1 general checks{color} -- For more information [see general report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.2/914//General_Nightly_Build_Report/] (/) {color:green}+1 jdk8 hadoop2 checks{color} -- For more information [see jdk8 (hadoop2) report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.2/914//JDK8_Nightly_Build_Report_(Hadoop2)/] (x) {color:red}-1 jdk8 hadoop3 checks{color} -- For more information [see jdk8 (hadoop3) report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.2/914//JDK8_Nightly_Build_Report_(Hadoop3)/] (/) {color:green}+1 source release artifact{color} -- See build output for details. (/) {color:green}+1 client integration test{color} > Split TestFromClientSide; it takes too long to complete timing out > -- > > Key: HBASE-23963 > URL: https://issues.apache.org/jira/browse/HBASE-23963 > Project: HBase > Issue Type: Test > Components: test >Reporter: Michael Stack >Assignee: Michael Stack >Priority: Major > Fix For: 3.0.0-alpha-1, 2.3.0 > > Attachments: Screen Shot 2020-03-11 at 7.43.27 AM.png > > > The TestFromClientSide test was parameterized recently so we'd run the full suite > with one of three registries. The test now often takes longer than the 13-minute maximum > allowed. Split the test.
[GitHub] [hbase] Apache-HBase commented on pull request #2077: HBASE-24684 Fetch ReplicationSink servers list from HMaster instead o…
Apache-HBase commented on pull request #2077: URL: https://github.com/apache/hbase/pull/2077#issuecomment-659701434

:broken_heart: **-1 overall**

| Vote | Subsystem | Runtime | Comment |
|:----:|----------:|--------:|:--------|
| +0 :ok: | reexec | 1m 45s | Docker mode activated. |
| -0 :warning: | yetus | 0m 3s | Unprocessed flag(s): --brief-report-file --spotbugs-strict-precheck --whitespace-eol-ignore-list --whitespace-tabs-ignore-list --quick-hadoopcheck |
||| _ Prechecks _ |
||| _ HBASE-24666 Compile Tests _ |
| +0 :ok: | mvndep | 0m 25s | Maven dependency ordering for branch |
| +1 :green_heart: | mvninstall | 5m 6s | HBASE-24666 passed |
| +1 :green_heart: | compile | 3m 48s | HBASE-24666 passed |
| +1 :green_heart: | shadedjars | 6m 44s | branch has no errors when building our shaded downstream artifacts. |
| -0 :warning: | javadoc | 0m 29s | hbase-client in HBASE-24666 failed. |
| -0 :warning: | javadoc | 0m 46s | hbase-server in HBASE-24666 failed. |
| -0 :warning: | javadoc | 1m 3s | hbase-thrift in HBASE-24666 failed. |
||| _ Patch Compile Tests _ |
| +0 :ok: | mvndep | 0m 15s | Maven dependency ordering for patch |
| +1 :green_heart: | mvninstall | 4m 57s | the patch passed |
| +1 :green_heart: | compile | 3m 33s | the patch passed |
| +1 :green_heart: | javac | 3m 33s | the patch passed |
| +1 :green_heart: | shadedjars | 6m 46s | patch has no errors when building our shaded downstream artifacts. |
| -0 :warning: | javadoc | 0m 31s | hbase-client in the patch failed. |
| -0 :warning: | javadoc | 0m 51s | hbase-server in the patch failed. |
| -0 :warning: | javadoc | 1m 11s | hbase-thrift in the patch failed. |
||| _ Other Tests _ |
| +1 :green_heart: | unit | 1m 16s | hbase-protocol-shaded in the patch passed. |
| +1 :green_heart: | unit | 1m 48s | hbase-client in the patch passed. |
| -1 :x: | unit | 231m 36s | hbase-server in the patch failed. |
| +1 :green_heart: | unit | 5m 50s | hbase-thrift in the patch passed. |
| | | 281m 37s | |

| Subsystem | Report/Notes |
|----------:|:-------------|
| Docker | Client=19.03.12 Server=19.03.12 base: https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-2077/3/artifact/yetus-jdk11-hadoop3-check/output/Dockerfile |
| GITHUB PR | https://github.com/apache/hbase/pull/2077 |
| Optional Tests | javac javadoc unit shadedjars compile |
| uname | Linux b088724c72b6 4.15.0-74-generic #84-Ubuntu SMP Thu Dec 19 08:06:28 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev-support/hbase-personality.sh |
| git revision | HBASE-24666 / 9e8c930feb |
| Default Java | 2020-01-14 |
| javadoc | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-2077/3/artifact/yetus-jdk11-hadoop3-check/output/branch-javadoc-hbase-client.txt |
| javadoc | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-2077/3/artifact/yetus-jdk11-hadoop3-check/output/branch-javadoc-hbase-server.txt |
| javadoc | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-2077/3/artifact/yetus-jdk11-hadoop3-check/output/branch-javadoc-hbase-thrift.txt |
| javadoc | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-2077/3/artifact/yetus-jdk11-hadoop3-check/output/patch-javadoc-hbase-client.txt |
| javadoc | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-2077/3/artifact/yetus-jdk11-hadoop3-check/output/patch-javadoc-hbase-server.txt |
| javadoc | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-2077/3/artifact/yetus-jdk11-hadoop3-check/output/patch-javadoc-hbase-thrift.txt |
| unit | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-2077/3/artifact/yetus-jdk11-hadoop3-check/output/patch-unit-hbase-server.txt |
| Test Results | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-2077/3/testReport/ |
| Max. process+thread count | 3037 (vs. ulimit of 12500) |
| modules | C: hbase-protocol-shaded hbase-client hbase-server hbase-thrift U: . |
| Console output | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-2077/3/console |
| versions | git=2.17.1 maven=(cecedd343002696d0abb50b32b541b8a6ba2883f) |
| Powered by | Apache Yetus 0.11.1 https://yetus.apache.org |

This message was automatically generated.
[GitHub] [hbase] sguggilam commented on pull request #2076: HBASE-24740 Enable journal logging for HBase snapshot operation
sguggilam commented on pull request #2076: URL: https://github.com/apache/hbase/pull/2076#issuecomment-659700387
> Looks good overall, few minor nits. We are planning PR for master branch also?

Yes @virajjasani, we can commit this PR for master branch as well.
[GitHub] [hbase] sguggilam commented on a change in pull request #2076: HBASE-24740 Enable journal logging for HBase snapshot operation
sguggilam commented on a change in pull request #2076: URL: https://github.com/apache/hbase/pull/2076#discussion_r456104565

## File path: hbase-server/src/main/java/org/apache/hadoop/hbase/snapshot/SnapshotManifest.java ##
@@ -343,6 +343,13 @@ private void load() throws IOException {
     }
   }
+  /**
+   * Sets the status task for monitoring all the subtasks for Snapshot operation
+   */
+  public void setMonitoredTask(MonitoredTask statusTask) {

Review comment: There are a few other places where the create method is called, in both src and test classes. I thought it would be easier for callers to set the status optionally when needed, instead of passing it to the create call, where every caller would have to pass null explicitly when there is no task object.
[GitHub] [hbase] sguggilam commented on a change in pull request #2076: HBASE-24740 Enable journal logging for HBase snapshot operation
sguggilam commented on a change in pull request #2076: URL: https://github.com/apache/hbase/pull/2076#discussion_r456104808

## File path: hbase-server/src/main/java/org/apache/hadoop/hbase/master/snapshot/SnapshotManager.java ##
@@ -624,14 +623,16 @@ private void takeSnapshotInternal(SnapshotDescription snapshot) throws IOExcepti
     AssignmentManager assignmentMgr = master.getAssignmentManager();
     if (assignmentMgr.getTableStateManager().isTableState(snapshotTable, ZooKeeperProtos.Table.State.ENABLED)) {
-      LOG.debug("Table enabled, starting distributed snapshot.");
+      LOG.debug("Table enabled, starting distributed snapshot for "

Review comment: Makes sense, updated in latest patch.
[jira] [Created] (HBASE-24745) 'Failed report transition' logs too often
Michael Stack created HBASE-24745: - Summary: 'Failed report transition' logs too often Key: HBASE-24745 URL: https://issues.apache.org/jira/browse/HBASE-24745 Project: HBase Issue Type: Sub-task Affects Versions: 2.3.0 Reporter: Michael Stack

The parent issue fixed a backoff that was too aggressive. Now I notice we try too much. Saw 9k logs in 17 seconds of the below type...
{code:java}
2020-07-15 14:36:23,104 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Failed report transition server { host_name: "X.example.org" port: 16020 start_code: 1594823099666 } transition { transition_code: CLOSED region_info { region_id: 1594814749475 table_name { namespace: "default" qualifier: "IntegrationTestBigLinkedList" } start_key: "\"\"\"\"\"\"\" " end_key: "#Q\352\f\003" offline: false split: false replica_id: 0 } proc_id: 81545 }; retry (#) after 200805ms delay (Master is coming online...).
{code}
The delay doesn't seem correct or respected.
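As a point of comparison for the HBASE-24745 report, the sketch below shows what a retry loop looks like when the computed delay is both capped and actually waited out before the next attempt. It is an illustration only: `Backoff`, `delayMs` and `retry` are hypothetical names, not HBase's retry classes, and the sleeper is passed in so the behaviour can be checked without real waiting. It assumes `initialMs` and `maxMs` are positive.

```java
import java.util.function.BooleanSupplier;
import java.util.function.LongConsumer;

// Hypothetical helper, not an HBase class: exponential backoff with an upper bound.
class Backoff {
  private final long initialMs;
  private final long maxMs;

  Backoff(long initialMs, long maxMs) {
    this.initialMs = initialMs;
    this.maxMs = maxMs;
  }

  // Delay before retry number `attempt` (0-based): initialMs doubled `attempt` times, capped at maxMs.
  long delayMs(int attempt) {
    long delay = initialMs;
    for (int i = 0; i < attempt && delay < maxMs; i++) {
      // Jump straight to the cap when doubling would pass it; this also avoids overflow.
      delay = delay > maxMs / 2 ? maxMs : delay * 2;
    }
    return Math.min(delay, maxMs);
  }

  // Runs op until it succeeds or maxAttempts is used up, waiting the computed delay after each failure.
  // Returns the number of attempts made, or -1 if every attempt failed.
  static int retry(BooleanSupplier op, Backoff backoff, int maxAttempts, LongConsumer sleeper) {
    for (int attempt = 0; attempt < maxAttempts; attempt++) {
      if (op.getAsBoolean()) {
        return attempt + 1;
      }
      sleeper.accept(backoff.delayMs(attempt));
    }
    return -1;
  }
}
```

In production code the sleeper would block, for example by wrapping `Thread.sleep`. If failures are logged once per attempt, waiting out the delay also bounds the log rate, which is the symptom the report highlights.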
[GitHub] [hbase] Apache-HBase commented on pull request #2076: HBASE-24740 Enable journal logging for HBase snapshot operation
Apache-HBase commented on pull request #2076: URL: https://github.com/apache/hbase/pull/2076#issuecomment-659693967

:confetti_ball: **+1 overall**

| Vote | Subsystem | Runtime | Comment |
|:----:|----------:|--------:|:--------|
| +0 :ok: | reexec | 0m 41s | Docker mode activated. |
||| _ Prechecks _ |
| +1 :green_heart: | dupname | 0m 0s | No case conflicting files found. |
| +1 :green_heart: | hbaseanti | 0m 0s | Patch does not have any anti-patterns. |
| +1 :green_heart: | @author | 0m 0s | The patch does not contain any @author tags. |
| -0 :warning: | test4tests | 0m 0s | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. |
||| _ branch-1 Compile Tests _ |
| +1 :green_heart: | mvninstall | 9m 49s | branch-1 passed |
| +1 :green_heart: | compile | 0m 40s | branch-1 passed with JDK v1.8.0_252 |
| +1 :green_heart: | compile | 0m 46s | branch-1 passed with JDK v1.7.0_262 |
| +1 :green_heart: | checkstyle | 1m 41s | branch-1 passed |
| +1 :green_heart: | shadedjars | 3m 2s | branch has no errors when building our shaded downstream artifacts. |
| +1 :green_heart: | javadoc | 0m 50s | branch-1 passed with JDK v1.8.0_252 |
| +1 :green_heart: | javadoc | 0m 41s | branch-1 passed with JDK v1.7.0_262 |
| +0 :ok: | spotbugs | 3m 3s | Used deprecated FindBugs config; considering switching to SpotBugs. |
| +1 :green_heart: | findbugs | 3m 0s | branch-1 passed |
||| _ Patch Compile Tests _ |
| +1 :green_heart: | mvninstall | 1m 56s | the patch passed |
| +1 :green_heart: | compile | 0m 42s | the patch passed with JDK v1.8.0_252 |
| +1 :green_heart: | javac | 0m 42s | the patch passed |
| +1 :green_heart: | compile | 0m 44s | the patch passed with JDK v1.7.0_262 |
| +1 :green_heart: | javac | 0m 44s | the patch passed |
| +1 :green_heart: | checkstyle | 1m 28s | hbase-server: The patch generated 0 new + 20 unchanged - 10 fixed = 20 total (was 30) |
| +1 :green_heart: | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 :green_heart: | shadedjars | 2m 46s | patch has no errors when building our shaded downstream artifacts. |
| +1 :green_heart: | hadoopcheck | 4m 45s | Patch does not cause any errors with Hadoop 2.8.5 2.9.2. |
| +1 :green_heart: | javadoc | 0m 32s | the patch passed with JDK v1.8.0_252 |
| +1 :green_heart: | javadoc | 0m 44s | the patch passed with JDK v1.7.0_262 |
| +1 :green_heart: | findbugs | 2m 54s | the patch passed |
||| _ Other Tests _ |
| +1 :green_heart: | unit | 133m 40s | hbase-server in the patch passed. |
| +1 :green_heart: | asflicense | 0m 38s | The patch does not generate ASF License warnings. |
| | | 175m 40s | |

| Subsystem | Report/Notes |
|----------:|:-------------|
| Docker | Client=19.03.12 Server=19.03.12 base: https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-2076/2/artifact/out/Dockerfile |
| GITHUB PR | https://github.com/apache/hbase/pull/2076 |
| Optional Tests | dupname asflicense javac javadoc unit spotbugs findbugs shadedjars hadoopcheck hbaseanti checkstyle compile |
| uname | Linux 34822685206c 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /home/jenkins/jenkins-slave/workspace/Base-PreCommit-GitHub-PR_PR-2076/out/precommit/personality/provided.sh |
| git revision | branch-1 / 71aec0f |
| Default Java | 1.7.0_262 |
| Multi-JDK versions | /usr/lib/jvm/zulu-8-amd64:1.8.0_252 /usr/lib/jvm/zulu-7-amd64:1.7.0_262 |
| Test Results | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-2076/2/testReport/ |
| Max. process+thread count | 4488 (vs. ulimit of 1) |
| modules | C: hbase-server U: hbase-server |
| Console output | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-2076/2/console |
| versions | git=1.9.1 maven=3.0.5 findbugs=3.0.1 |
| Powered by | Apache Yetus 0.11.1 https://yetus.apache.org |

This message was automatically generated.
[jira] [Assigned] (HBASE-7880) HFile Recovery/Rewrite Tool
[ https://issues.apache.org/jira/browse/HBASE-7880?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Huaxiang Sun reassigned HBASE-7880: --- Assignee: Huaxiang Sun (was: Hua Xiang) > HFile Recovery/Rewrite Tool > --- > > Key: HBASE-7880 > URL: https://issues.apache.org/jira/browse/HBASE-7880 > Project: HBase > Issue Type: New Feature > Components: HFile >Affects Versions: 0.95.2 >Reporter: Matteo Bertozzi >Assignee: Huaxiang Sun >Priority: Minor > Attachments: HBASE-7880-v0.patch > > > Sometimes it is useful to have a tool to migrate files from a new version to an > old version (e.g. convert a new XYZ encoded/compressed file to an old > "uncompressed" format). > It would also be useful to be able to recover an hfile from a corrupted > state (e.g. trailer missing/broken, ...). > The "user" can provide the information about the file (compression & co) and > try to recover as much as possible from the file by reading data blocks.
[jira] [Assigned] (HBASE-18070) Enable memstore replication for meta replica
[ https://issues.apache.org/jira/browse/HBASE-18070?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Huaxiang Sun reassigned HBASE-18070: Assignee: Huaxiang Sun (was: Hua Xiang) > Enable memstore replication for meta replica > > > Key: HBASE-18070 > URL: https://issues.apache.org/jira/browse/HBASE-18070 > Project: HBase > Issue Type: New Feature >Reporter: Hua Xiang >Assignee: Huaxiang Sun >Priority: Major > > Based on the current doc, memstore replication is not enabled for meta > replica. Memstore replication will be a good improvement for meta replica. > Create jira to track this effort (feasibility, design, implementation, etc).
[jira] [Commented] (HBASE-24637) Reseek regression related to filter SKIP hinting
[ https://issues.apache.org/jira/browse/HBASE-24637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17159494#comment-17159494 ] Andrew Kyle Purtell commented on HBASE-24637: - I updated the description to center this on the reseeking, which is the problem, of which the hinting changes seem correlated but may not be causative > Reseek regression related to filter SKIP hinting > > > Key: HBASE-24637 > URL: https://issues.apache.org/jira/browse/HBASE-24637 > Project: HBase > Issue Type: Bug > Components: Filters, Performance, Scanners >Affects Versions: 2.2.5 >Reporter: Andrew Kyle Purtell >Priority: Major > Attachments: W-7665966-FAST_DIFF-FILTER_ALL.pdf, > W-7665966-Instrument-low-level-scan-details-branch-1.patch, > W-7665966-Instrument-low-level-scan-details-branch-2.2.patch, > parse_call_trace.pl > > > I have been looking into reported performance regressions in HBase 2 relative > to HBase 1. Depending on the test scenario, HBase 2 can demonstrate > significantly better microbenchmarks in a number of cases, and usually shows > improvement in whole cluster benchmarks like YCSB. > To assist in debugging I added methods to RpcServer for updating per-call > metrics that leverage the fact it puts a reference to the current Call into a > thread local and that all activity for a given RPC is processed by a single > thread context. I then instrumented ScanQueryMatcher (in branch-1) and its > various friends (in branch-2.2), StoreScanner, HFileReaderV2 and > HFileReaderV3 (in branch-1) and HFileReaderImpl (in branch-2.2), HFileBlock, > and DefaultMemStore (branch-1) and SegmentScanner (branch-2.2). Test tables > with one family and 1, 5, 10, 20, 50, and 100 distinct column-qualifiers per > row were created, snapshot, dropped, and cloned from the snapshot. Both 1.6 > and 2.2 versions under test operated on identical data files in HDFS. 
For > tests with 1.6 and 2.2 on the server side the same 1.6 PE client was used, to > ensure only the server side differed. > The results for pe --filterAll were revealing. See attached. > It appears a refactor to ScanQueryMatcher and friends has disabled the > ability of filters to provide meaningful SKIP hints, which disables an > optimization that avoids reseeking, leading to a serious and proportional > regression in reseek activity and time spent in that code path. So for > queries that use filters, there can be a substantial regression. > Other test cases that did not use filters did not show this regression. If > filters are not used the behavior of ScanQueryMatcher between 1.6 and 2.2 was > almost identical, as measured by counts of the hint types returned, whether > or not column or version trackers are called, and counts of store seeks or > reseeks. Regarding micro-timings, there was a 10% variance in my testing and > results generally fell within this range, except for the filter all case of > course. -- This message was sent by Atlassian Jira (v8.3.4#803005)
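The per-call instrumentation described in the issue text above (a reference to the current call kept in a thread local, updated from deep inside the scan path) can be pictured roughly as follows. This is only a minimal sketch under stated assumptions: `CallMetricsSketch`, `Call`, `begin`, `end` and `update` are hypothetical names invented for illustration and are not the methods added to RpcServer by the attached patches. The metric names `store_reseek` and `store_reseek_ms` are the ones mentioned in the comments on this issue.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch of per-call metrics; not HBase's actual RpcServer API. */
final class CallMetricsSketch {

  /** Stand-in for the RPC call object that the server parks in a thread local. */
  static final class Call {
    // Plain HashMap: the sketch assumes one thread handles all work for a call.
    private final Map<String, Long> counters = new HashMap<>();

    void increment(String name, long delta) {
      counters.merge(name, delta, Long::sum);
    }

    long get(String name) {
      return counters.getOrDefault(name, 0L);
    }
  }

  private static final ThreadLocal<Call> CURRENT_CALL = new ThreadLocal<>();

  /** Handler thread marks the call it is about to process. */
  static void begin(Call call) {
    CURRENT_CALL.set(call);
  }

  /** Handler thread clears the marker when the call is done. */
  static void end() {
    CURRENT_CALL.remove();
  }

  /** Called from low-level code (scanner, reader, ...); no-op outside a call. */
  static void update(String name, long delta) {
    Call call = CURRENT_CALL.get();
    if (call != null) {
      call.increment(name, delta);
    }
  }

  public static void main(String[] args) {
    Call call = new Call();
    begin(call);
    try {
      update("store_reseek", 1);     // one reseek observed
      update("store_reseek_ms", 3);  // illustrative timing value only
    } finally {
      end();
    }
    System.out.println("store_reseek=" + call.get("store_reseek"));
  }
}
```

The point of the design is that low-level classes need no extra parameters: they call a static `update` and the thread local routes the count to whichever call the current handler thread is serving.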
[jira] [Updated] (HBASE-24637) Reseek regression related to filter SKIP hinting
[ https://issues.apache.org/jira/browse/HBASE-24637?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Kyle Purtell updated HBASE-24637: Summary: Reseek regression related to filter SKIP hinting (was: Filter SKIP hinting regression)
[jira] [Comment Edited] (HBASE-24637) Filter SKIP hinting regression
[ https://issues.apache.org/jira/browse/HBASE-24637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17159481#comment-17159481 ] Andrew Kyle Purtell edited comment on HBASE-24637 at 7/16/20, 8:51 PM: --- I agree the difference in hint codes is not the regression per se, the reseeking is the regression. There is a serious and proportional cost spent in reseeking in branch-2 that is absent in branch-1 under identical test conditions and same store files in hdfs. The metrics for this are store_reseek and store_reseek_ms. It is suspicious both hint code and reseek metrics show such deviation in branch-2 as opposed to branch-1 was (Author: apurtell): I agree the difference in hint codes is not the regression per se, the reseeking is the regression. There is a serious and proportional cost spent in reseeking in branch-2 that is absent in branch-1 under identical test conditions and same store files in hdfs.
[jira] [Commented] (HBASE-24637) Filter SKIP hinting regression
[ https://issues.apache.org/jira/browse/HBASE-24637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17159486#comment-17159486 ] Lars Hofhansl commented on HBASE-24637: --- Yep. And unfortunately whether a SEEK is an advantage depends on many factors. If there are many versions, then seeking to the next column and row is cheaper than skipping, and having these hints enables that. However, if there are few versions, then SKIP'ing is better, and the optimization can only figure out so much. I agree that we should restore the previous behavior. It pains me a bit, since getting more information from the SQM is a good thing - looks like this was too much of a good thing :) And likely just introduced by accident anyway.
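The SKIP versus SEEK trade-off discussed in this thread can be illustrated with a toy model. The sketch below is an assumption-laden simplification, not HBase's StoreScanner or ScanQueryMatcher code: cells are modelled as a sorted `long[]`, a SKIP is a one-cell step, a SEEK is a binary search, and `resolveSeekHint` is a hypothetical check that downgrades a SEEK hint to SKIPs when the target still lies before the first key of the next block. All names here are invented for illustration.

```java
/** Toy model of honoring or downgrading a SEEK hint; not HBase's real code. */
final class SkipOrSeekSketch {

  enum Action { SKIP, SEEK }

  /**
   * Hypothetical check: if the target sorts before the first key of the next
   * block, it is in the block already being read, so stepping cell by cell is
   * assumed to be cheaper than a reseek. The check itself has a cost.
   */
  static Action resolveSeekHint(long targetKey, long nextBlockFirstKey) {
    return targetKey < nextBlockFirstKey ? Action.SKIP : Action.SEEK;
  }

  /** A series of SKIPs: linear walk to the first key >= targetKey. */
  static int skipTo(long[] sortedKeys, int pos, long targetKey) {
    while (pos < sortedKeys.length && sortedKeys[pos] < targetKey) {
      pos++;
    }
    return pos;
  }

  /** A SEEK: binary search in [pos, length) for the first key >= targetKey. */
  static int seekTo(long[] sortedKeys, int pos, long targetKey) {
    int lo = pos;
    int hi = sortedKeys.length;
    while (lo < hi) {
      int mid = (lo + hi) >>> 1;
      if (sortedKeys[mid] < targetKey) {
        lo = mid + 1;
      } else {
        hi = mid;
      }
    }
    return lo;
  }

  public static void main(String[] args) {
    long[] keys = {1, 3, 5, 7, 9, 11};
    // Both strategies land on the same cell; only the work done differs.
    System.out.println(skipTo(keys, 0, 6) == seekTo(keys, 0, 6));
    System.out.println(resolveSeekHint(6, 8));
  }
}
```

Because both paths reach the same position, a SEEK hint can always be replaced by repeated SKIPs, whereas a bare SKIP carries no target that could be upgraded to a SEEK. With few cells between the current position and the target the linear walk wins; with many (for example, many versions per column) the seek wins.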
[jira] [Comment Edited] (HBASE-24637) Filter SKIP hinting regression
[ https://issues.apache.org/jira/browse/HBASE-24637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17159477#comment-17159477 ] Lars Hofhansl edited comment on HBASE-24637 at 7/16/20, 8:46 PM: - I see. SKIP is not a hint as such, though, it's the default. The hint (which can be ignored) is the SEEK hint. Both are implemented with return codes. Looks like the SQM is now marking transitions from Column to Column with a SEEK-to-next-column hint, and for each row with a SEEK-to-next-row. Also looking at the numbers, the optimization I mentioned is turning the vast majority of SEEKs back into SKIPs (and that check is not free). As I said, it's not wrong per se (need to look at the code more), but that does not mean that there isn't a performance regression - as I have described in the previous comment - that we need to fix, possibly by restoring the old behavior. Edit: Grammar :) was (Author: lhofhansl): I see. SKIP is not a hint as such, though, it's the default. The hint (which can be ignore) is the SEEK hint. Both are implemented with return codes. Looks like the SQM is now marking transitions from Column to Column with a SEEK-to-next-column hint, and for each row with a SEEK-to-next-row. Also look at the numbers the optimization I mentioned is turning the vast majority back into SKIPs (and that check is not free). As I said, it's not wrong per se (need to look at the code more), but that does not mean that there isn't a performance regression - as I have described in the previous comment - that we need to fix, possibly by restoring the old behavior.
[jira] [Commented] (HBASE-24637) Filter SKIP hinting regression
[ https://issues.apache.org/jira/browse/HBASE-24637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17159481#comment-17159481 ] Andrew Kyle Purtell commented on HBASE-24637: - I agree the difference in hint codes is not the regression per se, the reseeking is the regression. There is a serious and proportional cost spent in reseeking in branch-2 that is absent in branch-1 under identical test conditions and same store files in hdfs.
[jira] [Commented] (HBASE-24637) Filter SKIP hinting regression
[ https://issues.apache.org/jira/browse/HBASE-24637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17159480#comment-17159480 ] Lars Hofhansl commented on HBASE-24637: --- That is to say: A SEEK is a hint that can be turned into a series of SKIPs, but a SKIP has no extra information.
[jira] [Commented] (HBASE-24637) Filter SKIP hinting regression
[ https://issues.apache.org/jira/browse/HBASE-24637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17159477#comment-17159477 ] Lars Hofhansl commented on HBASE-24637: --- I see. SKIP is not a hint as such, though, it's the default. The hint (which can be ignored) is the SEEK hint. Both are implemented with return codes. Looks like the SQM is now marking transitions from Column to Column with a SEEK-to-next-column hint, and for each row with a SEEK-to-next-row. Also, looking at the numbers, the optimization I mentioned is turning the vast majority back into SKIPs (and that check is not free). As I said, it's not wrong per se (need to look at the code more), but that does not mean that there isn't a performance regression - as I have described in the previous comment - that we need to fix, possibly by restoring the old behavior.
[jira] [Comment Edited] (HBASE-24637) Filter SKIP hinting regression
[ https://issues.apache.org/jira/browse/HBASE-24637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17159469#comment-17159469 ] Andrew Kyle Purtell edited comment on HBASE-24637 at 7/16/20, 8:41 PM: --- To be extra clear (although it’s in the description :-) ) the issue here is with SKIP hinting, not SEEK hinting. You can see in the numbers that the filters are continuing to hint SKIP, but SQM is no longer honoring it. See filter_hint_xxx and sqm_hint_xxx metrics. In branch-1 filter_hint_skip and sqm_hint_skip are the same. In branch-2 filter_hint_skip is the same as branch-1, but sqm_hint_skip is always 0. was (Author: apurtell): To be extra clear (although it’s in the description :-) ) the issue here is with SKIP hinting, not SEEK hinting. You can see in the numbers that the filters are continuing to hint SKIP, but SQM is no longer honoring it. See filter_hint_xxx and sqm_hint_xxx metrics
[jira] [Commented] (HBASE-24637) Filter SKIP hinting regression
[ https://issues.apache.org/jira/browse/HBASE-24637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17159469#comment-17159469 ] Andrew Kyle Purtell commented on HBASE-24637: - To be extra clear (although it’s in the description :-) ) the issue here is with SKIP hinting, not SEEK hinting. You can see in the numbers that the filters are continuing to hint SKIP, but SQM is no longer honoring it. See filter_hint_xxx and sqm_hint_xxx metrics > Filter SKIP hinting regression > -- > > Key: HBASE-24637 > URL: https://issues.apache.org/jira/browse/HBASE-24637 > Project: HBase > Issue Type: Bug > Components: Filters, Performance, Scanners >Affects Versions: 2.2.5 >Reporter: Andrew Kyle Purtell >Priority: Major > Attachments: W-7665966-FAST_DIFF-FILTER_ALL.pdf, > W-7665966-Instrument-low-level-scan-details-branch-1.patch, > W-7665966-Instrument-low-level-scan-details-branch-2.2.patch, > parse_call_trace.pl > > > I have been looking into reported performance regressions in HBase 2 relative > to HBase 1. Depending on the test scenario, HBase 2 can demonstrate > significantly better microbenchmarks in a number of cases, and usually shows > improvement in whole cluster benchmarks like YCSB. > To assist in debugging I added methods to RpcServer for updating per-call > metrics that leverage the fact it puts a reference to the current Call into a > thread local and that all activity for a given RPC is processed by a single > thread context. I then instrumented ScanQueryMatcher (in branch-1) and its > various friends (in branch-2.2), StoreScanner, HFileReaderV2 and > HFileReaderV3 (in branch-1) and HFileReaderImpl (in branch-2.2), HFileBlock, > and DefaultMemStore (branch-1) and SegmentScanner (branch-2.2). Test tables > with one family and 1, 5, 10, 20, 50, and 100 distinct column-qualifiers per > row were created, snapshot, dropped, and cloned from the snapshot. Both 1.6 > and 2.2 versions under test operated on identical data files in HDFS. 
For > tests with 1.6 and 2.2 on the server side the same 1.6 PE client was used, to > ensure only the server side differed. > The results for pe --filterAll were revealing. See attached. > It appears a refactor to ScanQueryMatcher and friends has disabled the > ability of filters to provide meaningful SKIP hints, which disables an > optimization that avoids reseeking, leading to a serious and proportional > regression in reseek activity and time spent in that code path. So for > queries that use filters, there can be a substantial regression. > Other test cases that did not use filters did not show this regression. If > filters are not used the behavior of ScanQueryMatcher between 1.6 and 2.2 was > almost identical, as measured by counts of the hint types returned, whether > or not column or version trackers are called, and counts of store seeks or > reseeks. Regarding micro-timings, there was a 10% variance in my testing and > results generally fell within this range, except for the filter all case of > course. -- This message was sent by Atlassian Jira (v8.3.4#803005)
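The per-call metrics approach described in this issue (a reference to the current Call kept in a thread local, with all activity for a given RPC handled on a single thread) can be sketched roughly as below. This is only an illustration under stated assumptions, not the attached instrumentation patch: `CallSketch`, `PerCallMetricsSketch`, and the counter name `example_counter` are hypothetical, and the real RpcServer API is not reproduced here.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for an RPC call object that carries per-call counters. */
final class CallSketch {
    private final Map<String, Long> counters = new HashMap<>();

    /** Adds delta to the named counter, creating it on first use. */
    void increment(String name, long delta) {
        counters.merge(name, delta, Long::sum);
    }

    /** Returns the counter value, or 0 if it was never updated. */
    long get(String name) {
        return counters.getOrDefault(name, 0L);
    }
}

/** Hypothetical helper: instrumented code updates metrics without being handed the call. */
final class PerCallMetricsSketch {
    // One slot per handler thread; valid only because a single thread processes the whole RPC.
    private static final ThreadLocal<CallSketch> CURRENT_CALL = new ThreadLocal<>();

    private PerCallMetricsSketch() {
    }

    /** Called by the (assumed) RPC handler before it starts processing a call. */
    static void setCurrentCall(CallSketch call) {
        CURRENT_CALL.set(call);
    }

    /** Called by the handler when the call completes, so counters do not leak into the next call. */
    static void clearCurrentCall() {
        CURRENT_CALL.remove();
    }

    /** Called from deep inside the scan path; a no-op when no call is bound to this thread. */
    static void update(String name, long delta) {
        CallSketch call = CURRENT_CALL.get();
        if (call != null) {
            call.increment(name, delta);
        }
    }
}
```

Because the lookup goes through the thread local, low-level classes such as a scanner or block reader would not need a call parameter threaded through their method signatures. The trade-off is that the counts are only meaningful while one thread owns the RPC.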
[jira] [Updated] (HBASE-24742) Improve performance of SKIP vs SEEK logic
[ https://issues.apache.org/jira/browse/HBASE-24742?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bharath Vissapragada updated HBASE-24742: - Fix Version/s: 2.2.6 2.1.10 > Improve performance of SKIP vs SEEK logic > - > > Key: HBASE-24742 > URL: https://issues.apache.org/jira/browse/HBASE-24742 > Project: HBase > Issue Type: Bug > Components: Performance, regionserver >Affects Versions: 3.0.0-alpha-1, 1.7.0, 2.4.0 >Reporter: Lars Hofhansl >Assignee: Lars Hofhansl >Priority: Major > Fix For: 3.0.0-alpha-1, 1.7.0, 2.4.0, 2.1.10, 2.2.6 > > Attachments: 24742-master.txt, hbase-1.6-regression-flame-graph.png, > hbase-24742-branch-1.txt > > > In our testing of HBase 1.3 against the current tip of branch-1 we saw a 30% > slowdown in scanning scenarios. > We tracked it back to HBASE-17958 and HBASE-19863. > Both add comparisons to one of the tightest HBase has. > [~bharathv] -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (HBASE-24637) Filter SKIP hinting regression
[ https://issues.apache.org/jira/browse/HBASE-24637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17159467#comment-17159467 ] Lars Hofhansl commented on HBASE-24637: --- Maybe I misunderstood the data in the pdf...? Looks like the SQM is hinting seeks way more in branch-2 than branch-1. > Filter SKIP hinting regression > -- > > Key: HBASE-24637 > URL: https://issues.apache.org/jira/browse/HBASE-24637 > Project: HBase > Issue Type: Bug > Components: Filters, Performance, Scanners >Affects Versions: 2.2.5 >Reporter: Andrew Kyle Purtell >Priority: Major > Attachments: W-7665966-FAST_DIFF-FILTER_ALL.pdf, > W-7665966-Instrument-low-level-scan-details-branch-1.patch, > W-7665966-Instrument-low-level-scan-details-branch-2.2.patch, > parse_call_trace.pl > > > I have been looking into reported performance regressions in HBase 2 relative > to HBase 1. Depending on the test scenario, HBase 2 can demonstrate > significantly better microbenchmarks in a number of cases, and usually shows > improvement in whole cluster benchmarks like YCSB. > To assist in debugging I added methods to RpcServer for updating per-call > metrics that leverage the fact it puts a reference to the current Call into a > thread local and that all activity for a given RPC is processed by a single > thread context. I then instrumented ScanQueryMatcher (in branch-1) and its > various friends (in branch-2.2), StoreScanner, HFileReaderV2 and > HFileReaderV3 (in branch-1) and HFileReaderImpl (in branch-2.2), HFileBlock, > and DefaultMemStore (branch-1) and SegmentScanner (branch-2.2). Test tables > with one family and 1, 5, 10, 20, 50, and 100 distinct column-qualifiers per > row were created, snapshot, dropped, and cloned from the snapshot. Both 1.6 > and 2.2 versions under test operated on identical data files in HDFS. For > tests with 1.6 and 2.2 on the server side the same 1.6 PE client was used, to > ensure only the server side differed. 
> The results for pe --filterAll were revealing. See attached. > It appears a refactor to ScanQueryMatcher and friends has disabled the > ability of filters to provide meaningful SKIP hints, which disables an > optimization that avoids reseeking, leading to a serious and proportional > regression in reseek activity and time spent in that code path. So for > queries that use filters, there can be a substantial regression. > Other test cases that did not use filters did not show this regression. If > filters are not used the behavior of ScanQueryMatcher between 1.6 and 2.2 was > almost identical, as measured by counts of the hint types returned, whether > or not column or version trackers are called, and counts of store seeks or > reseeks. Regarding micro-timings, there was a 10% variance in my testing and > results generally fell within this range, except for the filter all case of > course. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Comment Edited] (HBASE-24637) Filter SKIP hinting regression
[ https://issues.apache.org/jira/browse/HBASE-24637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17159459#comment-17159459 ] Lars Hofhansl edited comment on HBASE-24637 at 7/16/20, 8:31 PM: - Hmm... The SQM giving more precise SEEK hints is not necessarily wrong. It's a hint that a SEEK is *possible*. The SKIP vs SEEK optimization I put in place a while ago then decides at the StoreScanner whether to follow that hint or not. Now, that optimization itself is not free: it adds 1 or 2 extra compares. In HBASE-24742 I managed to remove one compare in most cases. So it might be better now, but it's still not good if we issue too many SEEK hints, for each of which we then have to decide whether to follow it or not. was (Author: lhofhansl): Hmm... The SQM giving more precise SEEK hints is not necessarily wrong. It's a hint that a SEEK is *possible*. The SKIP vs SEEK optimization I put in place a while ago then decides at the StoreScanner whether to follow that hint or not. Now, that optimization itself is not free: it adds one compare per Cell-version + 1 or 2 extra compares (# versions + 1 or 2 in total). In HBASE-24742 I managed to remove one compare in most cases. So it might be better now, but it's still not good if we issue too many SEEK hints, for each of which we then have to decide whether to follow it or not. > Filter SKIP hinting regression > -- > > Key: HBASE-24637 > URL: https://issues.apache.org/jira/browse/HBASE-24637 > Project: HBase > Issue Type: Bug > Components: Filters, Performance, Scanners >Affects Versions: 2.2.5 >Reporter: Andrew Kyle Purtell >Priority: Major > Attachments: W-7665966-FAST_DIFF-FILTER_ALL.pdf, > W-7665966-Instrument-low-level-scan-details-branch-1.patch, > W-7665966-Instrument-low-level-scan-details-branch-2.2.patch, > parse_call_trace.pl > > > I have been looking into reported performance regressions in HBase 2 relative > to HBase 1.
Depending on the test scenario, HBase 2 can demonstrate > significantly better microbenchmarks in a number of cases, and usually shows > improvement in whole cluster benchmarks like YCSB. > To assist in debugging I added methods to RpcServer for updating per-call > metrics that leverage the fact it puts a reference to the current Call into a > thread local and that all activity for a given RPC is processed by a single > thread context. I then instrumented ScanQueryMatcher (in branch-1) and its > various friends (in branch-2.2), StoreScanner, HFileReaderV2 and > HFileReaderV3 (in branch-1) and HFileReaderImpl (in branch-2.2), HFileBlock, > and DefaultMemStore (branch-1) and SegmentScanner (branch-2.2). Test tables > with one family and 1, 5, 10, 20, 50, and 100 distinct column-qualifiers per > row were created, snapshot, dropped, and cloned from the snapshot. Both 1.6 > and 2.2 versions under test operated on identical data files in HDFS. For > tests with 1.6 and 2.2 on the server side the same 1.6 PE client was used, to > ensure only the server side differed. > The results for pe --filterAll were revealing. See attached. > It appears a refactor to ScanQueryMatcher and friends has disabled the > ability of filters to provide meaningful SKIP hints, which disables an > optimization that avoids reseeking, leading to a serious and proportional > regression in reseek activity and time spent in that code path. So for > queries that use filters, there can be a substantial regression. > Other test cases that did not use filters did not show this regression. If > filters are not used the behavior of ScanQueryMatcher between 1.6 and 2.2 was > almost identical, as measured by counts of the hint types returned, whether > or not column or version trackers are called, and counts of store seeks or > reseeks. Regarding micro-timings, there was a 10% variance in my testing and > results generally fell within this range, except for the filter all case of > course. 
-- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (HBASE-24637) Filter SKIP hinting regression
[ https://issues.apache.org/jira/browse/HBASE-24637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17159462#comment-17159462 ] Andrew Kyle Purtell commented on HBASE-24637: - The filters are hinting SKIP but SQM, unlike in 1, is not, and furthermore it is reseeking at great expense where 1 does not at all. There is a real regression here. This is not an improvement. > Filter SKIP hinting regression > -- > > Key: HBASE-24637 > URL: https://issues.apache.org/jira/browse/HBASE-24637 > Project: HBase > Issue Type: Bug > Components: Filters, Performance, Scanners >Affects Versions: 2.2.5 >Reporter: Andrew Kyle Purtell >Priority: Major > Attachments: W-7665966-FAST_DIFF-FILTER_ALL.pdf, > W-7665966-Instrument-low-level-scan-details-branch-1.patch, > W-7665966-Instrument-low-level-scan-details-branch-2.2.patch, > parse_call_trace.pl > > > I have been looking into reported performance regressions in HBase 2 relative > to HBase 1. Depending on the test scenario, HBase 2 can demonstrate > significantly better microbenchmarks in a number of cases, and usually shows > improvement in whole cluster benchmarks like YCSB. > To assist in debugging I added methods to RpcServer for updating per-call > metrics that leverage the fact it puts a reference to the current Call into a > thread local and that all activity for a given RPC is processed by a single > thread context. I then instrumented ScanQueryMatcher (in branch-1) and its > various friends (in branch-2.2), StoreScanner, HFileReaderV2 and > HFileReaderV3 (in branch-1) and HFileReaderImpl (in branch-2.2), HFileBlock, > and DefaultMemStore (branch-1) and SegmentScanner (branch-2.2). Test tables > with one family and 1, 5, 10, 20, 50, and 100 distinct column-qualifiers per > row were created, snapshot, dropped, and cloned from the snapshot. Both 1.6 > and 2.2 versions under test operated on identical data files in HDFS. 
For > tests with 1.6 and 2.2 on the server side the same 1.6 PE client was used, to > ensure only the server side differed. > The results for pe --filterAll were revealing. See attached. > It appears a refactor to ScanQueryMatcher and friends has disabled the > ability of filters to provide meaningful SKIP hints, which disables an > optimization that avoids reseeking, leading to a serious and proportional > regression in reseek activity and time spent in that code path. So for > queries that use filters, there can be a substantial regression. > Other test cases that did not use filters did not show this regression. If > filters are not used the behavior of ScanQueryMatcher between 1.6 and 2.2 was > almost identical, as measured by counts of the hint types returned, whether > or not column or version trackers are called, and counts of store seeks or > reseeks. Regarding micro-timings, there was a 10% variance in my testing and > results generally fell within this range, except for the filter all case of > course. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (HBASE-24637) Filter SKIP hinting regression
[ https://issues.apache.org/jira/browse/HBASE-24637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17159459#comment-17159459 ] Lars Hofhansl commented on HBASE-24637: --- Hmm... The SQM giving more precise SEEK hints is not necessarily wrong. It's a hint that a SEEK is *possible*. The SKIP vs SEEK optimization I put in place a while ago then decides at the StoreScanner whether to follow that hint or not. Now, that optimization itself is not free: it adds one compare per Cell-version + 1 or 2 extra compares (# versions + 1 or 2 in total). In HBASE-24742 I managed to remove one compare in most cases. So it might be better now, but it's still not good if we issue too many SEEK hints, for each of which we then have to decide whether to follow it or not. > Filter SKIP hinting regression > -- > > Key: HBASE-24637 > URL: https://issues.apache.org/jira/browse/HBASE-24637 > Project: HBase > Issue Type: Bug > Components: Filters, Performance, Scanners >Affects Versions: 2.2.5 >Reporter: Andrew Kyle Purtell >Priority: Major > Attachments: W-7665966-FAST_DIFF-FILTER_ALL.pdf, > W-7665966-Instrument-low-level-scan-details-branch-1.patch, > W-7665966-Instrument-low-level-scan-details-branch-2.2.patch, > parse_call_trace.pl > > > I have been looking into reported performance regressions in HBase 2 relative > to HBase 1. Depending on the test scenario, HBase 2 can demonstrate > significantly better microbenchmarks in a number of cases, and usually shows > improvement in whole cluster benchmarks like YCSB. > To assist in debugging I added methods to RpcServer for updating per-call > metrics that leverage the fact it puts a reference to the current Call into a > thread local and that all activity for a given RPC is processed by a single > thread context.
I then instrumented ScanQueryMatcher (in branch-1) and its > various friends (in branch-2.2), StoreScanner, HFileReaderV2 and > HFileReaderV3 (in branch-1) and HFileReaderImpl (in branch-2.2), HFileBlock, > and DefaultMemStore (branch-1) and SegmentScanner (branch-2.2). Test tables > with one family and 1, 5, 10, 20, 50, and 100 distinct column-qualifiers per > row were created, snapshot, dropped, and cloned from the snapshot. Both 1.6 > and 2.2 versions under test operated on identical data files in HDFS. For > tests with 1.6 and 2.2 on the server side the same 1.6 PE client was used, to > ensure only the server side differed. > The results for pe --filterAll were revealing. See attached. > It appears a refactor to ScanQueryMatcher and friends has disabled the > ability of filters to provide meaningful SKIP hints, which disables an > optimization that avoids reseeking, leading to a serious and proportional > regression in reseek activity and time spent in that code path. So for > queries that use filters, there can be a substantial regression. > Other test cases that did not use filters did not show this regression. If > filters are not used the behavior of ScanQueryMatcher between 1.6 and 2.2 was > almost identical, as measured by counts of the hint types returned, whether > or not column or version trackers are called, and counts of store seeks or > reseeks. Regarding micro-timings, there was a 10% variance in my testing and > results generally fell within this range, except for the filter all case of > course. -- This message was sent by Atlassian Jira (v8.3.4#803005)
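The comment above describes a SEEK hint as a statement that a seek is possible, with the StoreScanner deciding whether to act on it at the cost of extra compares. A minimal sketch of that kind of decision follows. It assumes the choice is made by comparing the seek target against the first key of the next block, which is an assumption for illustration and not a copy of the StoreScanner code. `SkipVsSeekSketch`, `decide`, and the plain `String` keys are hypothetical simplifications.

```java
/** Hypothetical sketch: decide whether to honor a SEEK hint or fall back to SKIP. */
final class SkipVsSeekSketch {
    enum Action { SKIP, SEEK }

    private SkipVsSeekSketch() {
    }

    /**
     * @param seekTarget        key the matcher hinted the scanner could seek to
     * @param nextBlockFirstKey first key of the next block, or null when it is not known
     * @return SKIP when the target would land before the next block starts (stepping
     *         cell by cell is assumed cheaper than a reseek), otherwise SEEK
     */
    static Action decide(String seekTarget, String nextBlockFirstKey) {
        if (nextBlockFirstKey == null) {
            // Nothing to compare against, so follow the hint as given.
            return Action.SEEK;
        }
        // This is the extra compare paid for every SEEK hint, whether or not it is followed.
        return seekTarget.compareTo(nextBlockFirstKey) < 0 ? Action.SKIP : Action.SEEK;
    }
}
```

Under this assumption the per-hint cost is small but not zero, which fits the concern raised in the comment: the more SEEK hints are issued, the more often this check runs, even when the outcome is to skip anyway.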
[GitHub] [hbase] Apache-HBase commented on pull request #2052: HBASE-24718 : Generic NamedQueue framework for multiple use-cases (Refactor SlowLog responses)
Apache-HBase commented on pull request #2052:
URL: https://github.com/apache/hbase/pull/2052#issuecomment-659652716

:confetti_ball: **+1 overall**

| Vote | Subsystem | Runtime | Comment |
|:----:|----------:|--------:|:--------|
| +0 :ok: | reexec | 1m 42s | Docker mode activated. |
| | | | _ Prechecks _ |
| +1 :green_heart: | dupname | 0m 0s | No case conflicting files found. |
| +0 :ok: | prototool | 0m 0s | prototool was not available. |
| +1 :green_heart: | hbaseanti | 0m 0s | Patch does not have any anti-patterns. |
| +1 :green_heart: | @author | 0m 0s | The patch does not contain any @author tags. |
| | | | _ master Compile Tests _ |
| +0 :ok: | mvndep | 0m 24s | Maven dependency ordering for branch |
| +1 :green_heart: | mvninstall | 4m 28s | master passed |
| +1 :green_heart: | checkstyle | 2m 34s | master passed |
| +0 :ok: | refguide | 5m 54s | branch has no errors when building the reference guide. See footer for rendered docs, which you should manually inspect. |
| +1 :green_heart: | spotbugs | 8m 33s | master passed |
| -0 :warning: | patch | 2m 53s | Used diff version of patch file. Binary files and potentially other changes not applied. Please rebase and squash commits if necessary. |
| | | | _ Patch Compile Tests _ |
| +0 :ok: | mvndep | 0m 13s | Maven dependency ordering for patch |
| +1 :green_heart: | mvninstall | 4m 12s | the patch passed |
| -0 :warning: | checkstyle | 1m 22s | hbase-server: The patch generated 18 new + 113 unchanged - 4 fixed = 131 total (was 117) |
| +1 :green_heart: | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 :green_heart: | xml | 0m 1s | The patch has no ill-formed XML file. |
| +0 :ok: | refguide | 6m 28s | patch has no errors when building the reference guide. See footer for rendered docs, which you should manually inspect. |
| +1 :green_heart: | hadoopcheck | 13m 8s | Patch does not cause any errors with Hadoop 3.1.2 3.2.1. |
| +1 :green_heart: | hbaseprotoc | 2m 36s | the patch passed |
| +1 :green_heart: | spotbugs | 8m 46s | the patch passed |
| | | | _ Other Tests _ |
| +1 :green_heart: | asflicense | 0m 43s | The patch does not generate ASF License warnings. |
| | | 71m 16s | |

| Subsystem | Report/Notes |
|----------:|:-------------|
| Docker | Client=19.03.12 Server=19.03.12 base: https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-2052/4/artifact/yetus-general-check/output/Dockerfile |
| GITHUB PR | https://github.com/apache/hbase/pull/2052 |
| Optional Tests | dupname asflicense spotbugs hadoopcheck hbaseanti checkstyle refguide xml cc hbaseprotoc prototool |
| uname | Linux e7aba7bdc4c8 4.15.0-101-generic #102-Ubuntu SMP Mon May 11 10:07:26 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev-support/hbase-personality.sh |
| git revision | master / 2505c7760d |
| refguide | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-2052/4/artifact/yetus-general-check/output/branch-site/book.html |
| checkstyle | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-2052/4/artifact/yetus-general-check/output/diff-checkstyle-hbase-server.txt |
| refguide | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-2052/4/artifact/yetus-general-check/output/patch-site/book.html |
| Max. process+thread count | 84 (vs. ulimit of 12500) |
| modules | C: hbase-protocol-shaded hbase-common hbase-client hbase-server U: . |
| Console output | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-2052/4/console |
| versions | git=2.17.1 maven=(cecedd343002696d0abb50b32b541b8a6ba2883f) spotbugs=3.1.12 |
| Powered by | Apache Yetus 0.11.1 https://yetus.apache.org |

This message was automatically generated.

This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Commented] (HBASE-24742) Improve performance of SKIP vs SEEK logic
[ https://issues.apache.org/jira/browse/HBASE-24742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17159456#comment-17159456 ] Andrew Kyle Purtell commented on HBASE-24742: - Sounds good. The other issue is available for branch-2 findings. > Improve performance of SKIP vs SEEK logic > - > > Key: HBASE-24742 > URL: https://issues.apache.org/jira/browse/HBASE-24742 > Project: HBase > Issue Type: Bug > Components: Performance, regionserver >Affects Versions: 3.0.0-alpha-1, 1.7.0, 2.4.0 >Reporter: Lars Hofhansl >Assignee: Lars Hofhansl >Priority: Major > Fix For: 3.0.0-alpha-1, 1.7.0, 2.4.0 > > Attachments: 24742-master.txt, hbase-1.6-regression-flame-graph.png, > hbase-24742-branch-1.txt > > > In our testing of HBase 1.3 against the current tip of branch-1 we saw a 30% > slowdown in scanning scenarios. > We tracked it back to HBASE-17958 and HBASE-19863. > Both add comparisons to one of the tightest HBase has. > [~bharathv] -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (HBASE-24739) [Build] branch-1's build seems broken because of pylint
[ https://issues.apache.org/jira/browse/HBASE-24739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17159453#comment-17159453 ] Hudson commented on HBASE-24739: Results for branch branch-1.4 [build #1230 on builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.4/1230/]: (x) *{color:red}-1 overall{color}* details (if available): (/) {color:green}+1 general checks{color} -- For more information [see general report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.4/1230//General_Nightly_Build_Report/] (x) {color:red}-1 jdk7 checks{color} -- For more information [see jdk7 report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.4/1230//JDK7_Nightly_Build_Report/] (/) {color:green}+1 jdk8 hadoop2 checks{color} -- For more information [see jdk8 (hadoop2) report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.4/1230//JDK8_Nightly_Build_Report_(Hadoop2)/] (x) {color:red}-1 source release artifact{color} -- See build output for details. > [Build] branch-1's build seems broken because of pylint > --- > > Key: HBASE-24739 > URL: https://issues.apache.org/jira/browse/HBASE-24739 > Project: HBase > Issue Type: Bug > Components: build >Reporter: Reid Chan >Assignee: Reid Chan >Priority: Blocker > Fix For: 1.7.0, 1.4.14 > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (HBASE-24742) Improve performance of SKIP vs SEEK logic
[ https://issues.apache.org/jira/browse/HBASE-24742?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lars Hofhansl resolved HBASE-24742. --- Resolution: Fixed Also pushed to branch-2 and master. > Improve performance of SKIP vs SEEK logic > - > > Key: HBASE-24742 > URL: https://issues.apache.org/jira/browse/HBASE-24742 > Project: HBase > Issue Type: Bug > Components: Performance, regionserver >Affects Versions: 3.0.0-alpha-1, 1.7.0, 2.4.0 >Reporter: Lars Hofhansl >Assignee: Lars Hofhansl >Priority: Major > Fix For: 3.0.0-alpha-1, 1.7.0, 2.4.0 > > Attachments: 24742-master.txt, hbase-1.6-regression-flame-graph.png, > hbase-24742-branch-1.txt > > > In our testing of HBase 1.3 against the current tip of branch-1 we saw a 30% > slowdown in scanning scenarios. > We tracked it back to HBASE-17958 and HBASE-19863. > Both add comparisons to one of the tightest HBase has. > [~bharathv] -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HBASE-24742) Improve performance of SKIP vs SEEK logic
[ https://issues.apache.org/jira/browse/HBASE-24742?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lars Hofhansl updated HBASE-24742: -- Attachment: 24742-master.txt > Improve performance of SKIP vs SEEK logic > - > > Key: HBASE-24742 > URL: https://issues.apache.org/jira/browse/HBASE-24742 > Project: HBase > Issue Type: Bug > Components: Performance, regionserver >Affects Versions: 3.0.0-alpha-1, 1.7.0, 2.4.0 >Reporter: Lars Hofhansl >Assignee: Lars Hofhansl >Priority: Major > Fix For: 3.0.0-alpha-1, 1.7.0, 2.4.0 > > Attachments: 24742-master.txt, hbase-1.6-regression-flame-graph.png, > hbase-24742-branch-1.txt > > > In our testing of HBase 1.3 against the current tip of branch-1 we saw a 30% > slowdown in scanning scenarios. > We tracked it back to HBASE-17958 and HBASE-19863. > Both add comparisons to one of the tightest HBase has. > [~bharathv] -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (HBASE-24742) Improve performance of SKIP vs SEEK logic
[ https://issues.apache.org/jira/browse/HBASE-24742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17159444#comment-17159444 ] Lars Hofhansl commented on HBASE-24742: --- Master (and branch-2) patch. Will just apply as they're the same as the branch-1 patch. > Improve performance of SKIP vs SEEK logic > - > > Key: HBASE-24742 > URL: https://issues.apache.org/jira/browse/HBASE-24742 > Project: HBase > Issue Type: Bug > Components: Performance, regionserver >Affects Versions: 3.0.0-alpha-1, 1.7.0, 2.4.0 >Reporter: Lars Hofhansl >Assignee: Lars Hofhansl >Priority: Major > Fix For: 3.0.0-alpha-1, 1.7.0, 2.4.0 > > Attachments: 24742-master.txt, hbase-1.6-regression-flame-graph.png, > hbase-24742-branch-1.txt > > > In our testing of HBase 1.3 against the current tip of branch-1 we saw a 30% > slowdown in scanning scenarios. > We tracked it back to HBASE-17958 and HBASE-19863. > Both add comparisons to one of the tightest HBase has. > [~bharathv] -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HBASE-24742) Improve performance of SKIP vs SEEK logic
[ https://issues.apache.org/jira/browse/HBASE-24742?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lars Hofhansl updated HBASE-24742: -- Fix Version/s: 2.4.0 3.0.0-alpha-1 > Improve performance of SKIP vs SEEK logic > - > > Key: HBASE-24742 > URL: https://issues.apache.org/jira/browse/HBASE-24742 > Project: HBase > Issue Type: Bug > Components: Performance, regionserver >Affects Versions: 3.0.0-alpha-1, 1.7.0, 2.4.0 >Reporter: Lars Hofhansl >Assignee: Lars Hofhansl >Priority: Major > Fix For: 3.0.0-alpha-1, 1.7.0, 2.4.0 > > Attachments: hbase-1.6-regression-flame-graph.png, > hbase-24742-branch-1.txt > > > In our testing of HBase 1.3 against the current tip of branch-1 we saw a 30% > slowdown in scanning scenarios. > We tracked it back to HBASE-17958 and HBASE-19863. > Both add comparisons to one of the tightest HBase has. > [~bharathv] -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Reopened] (HBASE-24742) Improve performance of SKIP vs SEEK logic
[ https://issues.apache.org/jira/browse/HBASE-24742?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lars Hofhansl reopened HBASE-24742: --- Lemme put this into branch-2 and master as well. > Improve performance of SKIP vs SEEK logic > - > > Key: HBASE-24742 > URL: https://issues.apache.org/jira/browse/HBASE-24742 > Project: HBase > Issue Type: Bug > Components: Performance, regionserver >Affects Versions: 3.0.0-alpha-1, 1.7.0, 2.4.0 >Reporter: Lars Hofhansl >Assignee: Lars Hofhansl >Priority: Major > Fix For: 1.7.0 > > Attachments: hbase-1.6-regression-flame-graph.png, > hbase-24742-branch-1.txt > > > In our testing of HBase 1.3 against the current tip of branch-1 we saw a 30% > slowdown in scanning scenarios. > We tracked it back to HBASE-17958 and HBASE-19863. > Both add comparisons to one of the tightest HBase has. > [~bharathv] -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Comment Edited] (HBASE-24742) Improve performance of SKIP vs SEEK logic
[ https://issues.apache.org/jira/browse/HBASE-24742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17159385#comment-17159385 ] Lars Hofhansl edited comment on HBASE-24742 at 7/16/20, 7:50 PM: - Merged into branch-1. I'll look into master/branch-2, but my feeling is that things are quite different there. [~apurtell] (before you yell at me for not looking at branch-2/master) :) was (Author: lhofhansl): Merged into branch-1. I'll look into master/branch-2, but my feeling is that things are quite different there. > Improve performance of SKIP vs SEEK logic > - > > Key: HBASE-24742 > URL: https://issues.apache.org/jira/browse/HBASE-24742 > Project: HBase > Issue Type: Bug > Components: Performance, regionserver >Affects Versions: 3.0.0-alpha-1, 1.7.0, 2.4.0 >Reporter: Lars Hofhansl >Assignee: Lars Hofhansl >Priority: Major > Fix For: 1.7.0 > > Attachments: hbase-1.6-regression-flame-graph.png, > hbase-24742-branch-1.txt > > > In our testing of HBase 1.3 against the current tip of branch-1 we saw a 30% > slowdown in scanning scenarios. > We tracked it back to HBASE-17958 and HBASE-19863. > Both add comparisons to one of the tightest HBase has. > [~bharathv] -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Comment Edited] (HBASE-24459) Move the locateMeta logic from AsyncMetaRegionTableLocator to ConnectionRegistry
[ https://issues.apache.org/jira/browse/HBASE-24459?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17159413#comment-17159413 ] Viraj Jasani edited comment on HBASE-24459 at 7/16/20, 7:28 PM: Thanks [~zhangduo]. {quote}at server side, we will provide caches at backup masters to handle the requests. {quote} This means backup masters will have to talk to active HMaster(through client - asyncClusterConnection / ZKConnectionRegistry?) to get latest RegionLocations for meta? Moreover, cache invalidation for backup masters might be a tricky case. was (Author: vjasani): Thanks [~zhangduo]. {quote}at server side, we will provide caches at backup masters to handle the requests. {quote} This means backup masters will have to talk to active HMaster(through client / ZKConnectionRegistry?) to get latest RegionLocations for meta since we can't get that detail from ZK anymore? > Move the locateMeta logic from AsyncMetaRegionTableLocator to > ConnectionRegistry > > > Key: HBASE-24459 > URL: https://issues.apache.org/jira/browse/HBASE-24459 > Project: HBase > Issue Type: Sub-task >Reporter: Duo Zhang >Priority: Major > > Now the related code is only in AsyncMetaRegionTableLocator, we could make > the actually implementation pluggable, so we do not always need to go to the > active master. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (HBASE-24459) Move the locateMeta logic from AsyncMetaRegionTableLocator to ConnectionRegistry
[ https://issues.apache.org/jira/browse/HBASE-24459?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17159413#comment-17159413 ] Viraj Jasani commented on HBASE-24459: -- Thanks [~zhangduo]. {quote}at server side, we will provide caches at backup masters to handle the requests. {quote} This means backup masters will have to talk to active HMaster(through client / ZKConnectionRegistry?) to get latest RegionLocations for meta since we can't get that detail from ZK anymore? > Move the locateMeta logic from AsyncMetaRegionTableLocator to > ConnectionRegistry > > > Key: HBASE-24459 > URL: https://issues.apache.org/jira/browse/HBASE-24459 > Project: HBase > Issue Type: Sub-task >Reporter: Duo Zhang >Priority: Major > > Now the related code is only in AsyncMetaRegionTableLocator, we could make > the actually implementation pluggable, so we do not always need to go to the > active master. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [hbase] Apache-HBase commented on pull request #2077: HBASE-24684 Fetch ReplicationSink servers list from HMaster instead o…
Apache-HBase commented on pull request #2077:
URL: https://github.com/apache/hbase/pull/2077#issuecomment-659585448

:confetti_ball: **+1 overall**

| Vote | Subsystem | Runtime | Comment |
|:----:|----------:|--------:|:--------|
| +0 :ok: | reexec | 0m 32s | Docker mode activated. |
| | | | _ Prechecks _ |
| +1 :green_heart: | dupname | 0m 1s | No case conflicting files found. |
| +0 :ok: | prototool | 0m 1s | prototool was not available. |
| +1 :green_heart: | hbaseanti | 0m 0s | Patch does not have any anti-patterns. |
| +1 :green_heart: | @author | 0m 0s | The patch does not contain any @author tags. |
| | | | _ HBASE-24666 Compile Tests _ |
| +0 :ok: | mvndep | 0m 13s | Maven dependency ordering for branch |
| +1 :green_heart: | mvninstall | 3m 25s | HBASE-24666 passed |
| +1 :green_heart: | checkstyle | 2m 26s | HBASE-24666 passed |
| +1 :green_heart: | spotbugs | 7m 21s | HBASE-24666 passed |
| | | | _ Patch Compile Tests _ |
| +0 :ok: | mvndep | 0m 13s | Maven dependency ordering for patch |
| +1 :green_heart: | mvninstall | 3m 24s | the patch passed |
| +1 :green_heart: | checkstyle | 0m 11s | The patch passed checkstyle in hbase-protocol-shaded |
| +1 :green_heart: | checkstyle | 0m 27s | The patch passed checkstyle in hbase-client |
| +1 :green_heart: | checkstyle | 1m 8s | hbase-server: The patch generated 0 new + 185 unchanged - 2 fixed = 185 total (was 187) |
| +1 :green_heart: | checkstyle | 0m 39s | The patch passed checkstyle in hbase-thrift |
| +1 :green_heart: | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 :green_heart: | hadoopcheck | 11m 5s | Patch does not cause any errors with Hadoop 3.1.2 3.2.1. |
| +1 :green_heart: | hbaseprotoc | 2m 27s | the patch passed |
| +1 :green_heart: | spotbugs | 8m 1s | the patch passed |
| | | | _ Other Tests _ |
| +1 :green_heart: | asflicense | 0m 50s | The patch does not generate ASF License warnings. |
| | | 50m 24s | |

| Subsystem | Report/Notes |
|----------:|:-------------|
| Docker | Client=19.03.12 Server=19.03.12 base: https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-2077/3/artifact/yetus-general-check/output/Dockerfile |
| GITHUB PR | https://github.com/apache/hbase/pull/2077 |
| Optional Tests | dupname asflicense spotbugs hadoopcheck hbaseanti checkstyle cc hbaseprotoc prototool |
| uname | Linux 285950dfccff 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev-support/hbase-personality.sh |
| git revision | HBASE-24666 / 9e8c930feb |
| Max. process+thread count | 94 (vs. ulimit of 12500) |
| modules | C: hbase-protocol-shaded hbase-client hbase-server hbase-thrift U: . |
| Console output | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-2077/3/console |
| versions | git=2.17.1 maven=(cecedd343002696d0abb50b32b541b8a6ba2883f) spotbugs=3.1.12 |
| Powered by | Apache Yetus 0.11.1 https://yetus.apache.org |

This message was automatically generated.

This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Resolved] (HBASE-24742) Improve performance of SKIP vs SEEK logic
[ https://issues.apache.org/jira/browse/HBASE-24742?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lars Hofhansl resolved HBASE-24742. --- Resolution: Fixed > Improve performance of SKIP vs SEEK logic > - > > Key: HBASE-24742 > URL: https://issues.apache.org/jira/browse/HBASE-24742 > Project: HBase > Issue Type: Bug > Components: Performance, regionserver >Affects Versions: 3.0.0-alpha-1, 1.7.0, 2.4.0 >Reporter: Lars Hofhansl >Assignee: Lars Hofhansl >Priority: Major > Fix For: 1.7.0 > > Attachments: hbase-1.6-regression-flame-graph.png, > hbase-24742-branch-1.txt > > > In our testing of HBase 1.3 against the current tip of branch-1 we saw a 30% > slowdown in scanning scenarios. > We tracked it back to HBASE-17958 and HBASE-19863. > Both add comparisons to one of the tightest HBase has. > [~bharathv] -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (HBASE-24742) Improve performance of SKIP vs SEEK logic
[ https://issues.apache.org/jira/browse/HBASE-24742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17159385#comment-17159385 ] Lars Hofhansl commented on HBASE-24742: --- Merged into branch-1. I'll look into master/branch-2, but my feeling is that things are quite different there. > Improve performance of SKIP vs SEEK logic > - > > Key: HBASE-24742 > URL: https://issues.apache.org/jira/browse/HBASE-24742 > Project: HBase > Issue Type: Bug > Components: Performance, regionserver >Affects Versions: 3.0.0-alpha-1, 1.7.0, 2.4.0 >Reporter: Lars Hofhansl >Assignee: Lars Hofhansl >Priority: Major > Fix For: 1.7.0 > > Attachments: hbase-1.6-regression-flame-graph.png, > hbase-24742-branch-1.txt > > > In our testing of HBase 1.3 against the current tip of branch-1 we saw a 30% > slowdown in scanning scenarios. > We tracked it back to HBASE-17958 and HBASE-19863. > Both add comparisons to one of the tightest HBase has. > [~bharathv] -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HBASE-24742) Improve performance of SKIP vs SEEK logic
[ https://issues.apache.org/jira/browse/HBASE-24742?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lars Hofhansl updated HBASE-24742: -- Fix Version/s: 1.7.0 > Improve performance of SKIP vs SEEK logic > - > > Key: HBASE-24742 > URL: https://issues.apache.org/jira/browse/HBASE-24742 > Project: HBase > Issue Type: Bug > Components: Performance, regionserver >Affects Versions: 3.0.0-alpha-1, 1.7.0, 2.4.0 >Reporter: Lars Hofhansl >Assignee: Lars Hofhansl >Priority: Major > Fix For: 1.7.0 > > Attachments: hbase-1.6-regression-flame-graph.png, > hbase-24742-branch-1.txt > > > In our testing of HBase 1.3 against the current tip of branch-1 we saw a 30% > slowdown in scanning scenarios. > We tracked it back to HBASE-17958 and HBASE-19863. > Both add comparisons to one of the tightest HBase has. > [~bharathv] -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [hbase] lhofhansl commented on pull request #2075: HBASE-24742 Improve performance of SKIP vs SEEK logic.
lhofhansl commented on pull request #2075: URL: https://github.com/apache/hbase/pull/2075#issuecomment-659553602 Darn... It used the wrong (old) email in the commit. Oh well.
[GitHub] [hbase] lhofhansl merged pull request #2075: HBASE-24742 Improve performance of SKIP vs SEEK logic.
lhofhansl merged pull request #2075: URL: https://github.com/apache/hbase/pull/2075
[jira] [Commented] (HBASE-24376) MergeNormalizer is merging non-adjacent regions and causing region overlaps/holes.
[ https://issues.apache.org/jira/browse/HBASE-24376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17159355#comment-17159355 ] Huaxiang Sun commented on HBASE-24376: -- I think in hbase-1, the normalizer uses a force flag to do the merge, which means that if the two regions are not next to each other, it will still merge them. Are you sure that inconsistency is caused by normalizer? I.e, before normalizer run, table is consistent, after normalizer run, there is inconsistency. If that is the case, the issue is that normalizer merges two non-adjacent regions, which will cause overlaps. There is one such issue with hbase-2, but I checked the code, hbase-1 seems ok. You can go over the master log, dump out meta table, and inconsistency report from hbck to check if that is the case. > MergeNormalizer is merging non-adjacent regions and causing region > overlaps/holes. > -- > > Key: HBASE-24376 > URL: https://issues.apache.org/jira/browse/HBASE-24376 > Project: HBase > Issue Type: Bug > Components: master >Affects Versions: 2.3.0 >Reporter: Huaxiang Sun >Assignee: Huaxiang Sun >Priority: Critical > Fix For: 3.0.0-alpha-1, 2.3.0 > > > Currently, we found normalizer was merging regions which are non-adjacent, it > will cause inconsistencies in the cluster. 
> {code:java} > 439055 2020-05-08 17:47:09,814 INFO > org.apache.hadoop.hbase.master.normalizer.MergeNormalizationPlan: Executing > merging normalization plan: MergeNormalizationPlan{firstRegion={ENCODED => > 47fe236a5e3649ded95cb64ad0c08492, NAME => > 'TABLE,\x03\x01\x05\x01\x04\x02,1554838974870.47fe236a5e3649ded95cb64ad > 0c08492.', STARTKEY => '\x03\x01\x05\x01\x04\x02', ENDKEY => > '\x03\x01\x05\x01\x04\x02\x01\x02\x02201904082200\x00\x00\x03Mac\x00\x00\x00\x00\x00\x00\x00\x00\x00iMac13,1\x00\x00\x00\x00\x00\x049.3-14E260\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x02\x05'}, > secondRegion={ENCODED => 0c0f2aa67f4329d5c4 8ba0320f173d31, NAME => > 'TABLE,\x03\x01\x05\x02\x01\x01,1554830735526.0c0f2aa67f4329d5c48ba0320f173d31.', > STARTKEY => '\x03\x01\x05\x02\x01\x01', ENDKEY => > '\x03\x01\x05\x02\x01\x02'}} > 439056 2020-05-08 17:47:11,438 INFO org.apache.hadoop.hbase.ScheduledChore: > CatalogJanitor-*:16000 average execution time: 1676219193 ns. > 439057 2020-05-08 17:47:11,730 INFO org.apache.hadoop.hbase.master.HMaster: > Client=null/null merge regions [47fe236a5e3649ded95cb64ad0c08492], > [0c0f2aa67f4329d5c48ba0320f173d31] > {code} > > The root cause is that getMergeNormalizationPlan() uses a list of regionInfo > which is ordered by regionName. regionName does not necessary guarantee the > order of STARTKEY (let's say 'aa1', 'aa1!', in order of regionName, it will > be 'aa1!' followed by 'aa1'. This will result in normalizer merging > non-adjacent regions into one and creates overlaps. This is not an issue in > branch-1 as the list is already ordered by RegionInfo.COMPARATOR in > normalizer. > -- This message was sent by Atlassian Jira (v8.3.4#803005)
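The root cause described above, that ordering by region name is not the same as ordering by STARTKEY, can be reproduced in isolation. The following is a minimal sketch and not HBase code: `Region` is a hypothetical stand-in for RegionInfo, the name layout is simplified to `table,startKey,timestamp`, and Java `String` comparison stands in for the byte-wise comparison HBase uses (equivalent for these ASCII keys).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

final class RegionOrderSketch {
  /** Hypothetical stand-in for a region: a start key and a name derived from it. */
  static final class Region {
    final String startKey;
    final String name;

    Region(String table, String startKey, long createdTs) {
      this.startKey = startKey;
      // Simplified name layout (assumption): table,startKey,timestamp
      this.name = table + "," + startKey + "," + createdTs;
    }
  }

  // Ordering by full region name, as the buggy plan is described as doing.
  static final Comparator<Region> BY_NAME = Comparator.comparing((Region r) -> r.name);
  // Ordering by start key, which is what adjacency actually depends on.
  static final Comparator<Region> BY_START_KEY = Comparator.comparing((Region r) -> r.startKey);

  static List<Region> sample() {
    return Arrays.asList(
        new Region("T", "aa1", 1L),
        new Region("T", "aa1!", 1L),
        new Region("T", "aa2", 1L));
  }

  /** Returns the start keys of the given regions after sorting with the given order. */
  static List<String> startKeysOrderedBy(List<Region> regions, Comparator<Region> order) {
    List<Region> sorted = new ArrayList<>(regions);
    sorted.sort(order);
    List<String> keys = new ArrayList<>();
    for (Region r : sorted) {
      keys.add(r.startKey);
    }
    return keys;
  }

  public static void main(String[] args) {
    // '!' (0x21) sorts before ',' (0x2C), so the name "T,aa1!,1" precedes "T,aa1,1".
    System.out.println(startKeysOrderedBy(sample(), BY_NAME));      // [aa1!, aa1, aa2]
    System.out.println(startKeysOrderedBy(sample(), BY_START_KEY)); // [aa1, aa1!, aa2]
  }
}
```

In the name-ordered list, 'aa1' and 'aa2' sit next to each other even though 'aa1!' lies between them in key space, so a plan that merges list neighbours would merge non-adjacent regions.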
[GitHub] [hbase] lhofhansl commented on pull request #2075: HBASE-24742 Improve performance of SKIP vs SEEK logic.
lhofhansl commented on pull request #2075: URL: https://github.com/apache/hbase/pull/2075#issuecomment-659543608 Alright going to merge in a few.
[GitHub] [hbase] bharathv commented on pull request #2075: HBASE-24742 Improve performance of SKIP vs SEEK logic.
bharathv commented on pull request #2075: URL: https://github.com/apache/hbase/pull/2075#issuecomment-659540298 Agree that the indents can be fixed separately in the interest of keeping the patch smaller. +1 from my side on the patch.
[GitHub] [hbase] lhofhansl edited a comment on pull request #2075: HBASE-24742 Improve performance of SKIP vs SEEK logic.
lhofhansl edited a comment on pull request #2075: URL: https://github.com/apache/hbase/pull/2075#issuecomment-659516053 Re: Checkstyle... The indentation of the entire block in StoreScanner.\ is wrong (5 instead of 4). Happy to fix the entire block, but it'd be unrelated and make the patch larger.
[GitHub] [hbase] lhofhansl commented on pull request #2075: HBASE-24742 Improve performance of SKIP vs SEEK logic.
lhofhansl commented on pull request #2075: URL: https://github.com/apache/hbase/pull/2075#issuecomment-659516053 Re: Checkstyle... The indentation of the entire block in StoreScanner. is wrong (5 instead of 4). Happy to fix the entire block, but it'd be unrelated and make the patch larger.
[GitHub] [hbase] joshelser commented on a change in pull request #1935: HBASE-22146 SpaceQuotaViolationPolicy Disable is not working in Names…
joshelser commented on a change in pull request #1935:
URL: https://github.com/apache/hbase/pull/1935#discussion_r455881727

## File path: hbase-client/src/main/java/org/apache/hadoop/hbase/quotas/QuotaTableUtil.java ##
@@ -628,6 +628,34 @@ static Put createPutForNamespaceSnapshotSize(String namespace, long size) {
     }
   }
 
+  /**
+   * Remove table usage snapshots (u:p columns) for the namespace passed
+   * @param connection connection to re-use
+   * @param namespace the namespace to fetch the list of table usage snapshots
+   */
+  static void deleteTableUsageSnapshotsForNamespace(Connection connection, String namespace)
+    throws IOException {
+    Scan s = new Scan();
+    //Get rows for all tables in namespace
+    s.setRowPrefixFilter(Bytes.toBytes("t." + namespace));

Review comment:
need the namespace delimiter at the end of your prefix filter, otherwise you'll over-match e.g. deleting usage for the namespace 'foo' would also end up matching and deleting for the namespace 'foobar'. Making it 't.foo:' should be good enough.

## File path: hbase-client/src/main/java/org/apache/hadoop/hbase/quotas/QuotaTableUtil.java ##
@@ -628,6 +628,34 @@ static Put createPutForNamespaceSnapshotSize(String namespace, long size) {
     }
   }
 
+  /**
+   * Remove table usage snapshots (u:p columns) for the namespace passed
+   * @param connection connection to re-use
+   * @param namespace the namespace to fetch the list of table usage snapshots
+   */
+  static void deleteTableUsageSnapshotsForNamespace(Connection connection, String namespace)
+    throws IOException {
+    Scan s = new Scan();
+    //Get rows for all tables in namespace
+    s.setRowPrefixFilter(Bytes.toBytes("t." + namespace));

Review comment:
Bonus points if you write a test which demonstrates the bug before you fix it :)

## File path: hbase-server/src/main/java/org/apache/hadoop/hbase/quotas/QuotaUtil.java ##
@@ -266,6 +266,13 @@ private static void deleteQuotas(final Connection connection, final byte[] rowKe
     if (qualifier != null) {
       delete.addColumns(QUOTA_FAMILY_INFO, qualifier);
     }
+    if (isNamespaceRowKey(rowKey)) {
+      String ns = getNamespaceFromRowKey(rowKey);
+      Quotas namespaceQuota = getNamespaceQuota(connection,ns);
+      if (namespaceQuota != null && namespaceQuota.hasSpace()) {
+        deleteTableUsageSnapshotsForNamespace(connection, ns);

Review comment:
Please add a comment here as to why we need to delete the table usage rows (to prevent the next person from having to come back and do the same investigation we did :)
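The over-match the reviewer describes for the row prefix can be shown without a cluster. This is a minimal sketch under stated assumptions: the row keys are made-up examples in the 't.' + namespace + ':' + table shape implied by the review comment, and `startsWith` is a hand-written byte-prefix check standing in for what a row prefix filter does, not the HBase implementation.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class PrefixMatchSketch {
  /** Byte-wise prefix test, the matching rule a row prefix scan applies. */
  static boolean startsWith(byte[] row, byte[] prefix) {
    if (row.length < prefix.length) {
      return false;
    }
    for (int i = 0; i < prefix.length; i++) {
      if (row[i] != prefix[i]) {
        return false;
      }
    }
    return true;
  }

  /** Returns the rows whose key begins with the given prefix. */
  static List<String> rowsWithPrefix(List<String> rows, String prefix) {
    byte[] p = prefix.getBytes(StandardCharsets.UTF_8);
    List<String> matched = new ArrayList<>();
    for (String row : rows) {
      if (startsWith(row.getBytes(StandardCharsets.UTF_8), p)) {
        matched.add(row);
      }
    }
    return matched;
  }

  /** Hypothetical row keys for tables in namespaces 'foo' and 'foobar'. */
  static List<String> sampleRows() {
    return Arrays.asList("t.foo:t1", "t.foo:t2", "t.foobar:t1");
  }

  public static void main(String[] args) {
    // Without the delimiter, the 'foobar' row is matched too.
    System.out.println(rowsWithPrefix(sampleRows(), "t.foo"));  // [t.foo:t1, t.foo:t2, t.foobar:t1]
    // With the namespace delimiter appended, only 'foo' rows match.
    System.out.println(rowsWithPrefix(sampleRows(), "t.foo:")); // [t.foo:t1, t.foo:t2]
  }
}
```

A test along these lines, run against the real scan, would fail before the delimiter is added and pass afterwards, which is the kind of demonstration the second review comment asks for.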
[GitHub] [hbase] Apache9 commented on a change in pull request #2014: HBASE-24673 TransitionRegionStateProcedure of non-meta regions should…
Apache9 commented on a change in pull request #2014:
URL: https://github.com/apache/hbase/pull/2014#discussion_r455880826

## File path: hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/TransitRegionStateProcedure.java ##
@@ -200,14 +200,21 @@ private void queueAssign(MasterProcedureEnv env, RegionStateNode regionNode)
     }
   }
 
-  private void openRegion(MasterProcedureEnv env, RegionStateNode regionNode) throws IOException {
+  private void openRegion(MasterProcedureEnv env, RegionStateNode regionNode)
+    throws IOException, ProcedureSuspendedException {
     ServerName loc = regionNode.getRegionLocation();
     if (loc == null) {
       LOG.warn("No location specified for {}, jump back to state {} to get one", getRegion(),
         RegionStateTransitionState.REGION_STATE_TRANSITION_GET_ASSIGN_CANDIDATE);
       setNextState(RegionStateTransitionState.REGION_STATE_TRANSITION_GET_ASSIGN_CANDIDATE);
       return;
     }
+    final boolean isMeta = regionNode.getRegionInfo().isMetaRegion();
+    final boolean isMetaAvailable = !env.getAssignmentManager().isMetaRegionInTransition();
+    if (!isMeta && !isMetaAvailable) {
+      // meta is not assigned yet, so yield
+      throw new ProcedureSuspendedException();

Review comment:
This cannot solve all the problems, neither the check here nor the check in waitInitialized. It can always happen that meta is online when you check, but goes offline by the time you actually write to it... So in general, we should have a way to deal with meta update failures (maybe just a retry at the procedure level?) and have a smaller timeout on the meta update operation.
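The review comment's point is that a check-then-write sequence cannot rule out a failed meta update, so the write itself has to tolerate failure. One way to picture the "retry at procedure level" idea is a bounded retry wrapper around the update. The sketch below is purely illustrative: `MetaUpdate` and `runWithRetries` are hypothetical names, not part of the procedure framework, and a real procedure would suspend and reschedule with backoff rather than loop inline.

```java
import java.io.IOException;

final class MetaUpdateRetrySketch {
  /** Hypothetical stand-in for one attempt to write a region state to meta. */
  interface MetaUpdate {
    void run() throws IOException;
  }

  /**
   * Runs the update until it succeeds or maxAttempts is exhausted.
   * Returns the number of attempts used; rethrows the last failure otherwise.
   */
  static int runWithRetries(MetaUpdate update, int maxAttempts) throws IOException {
    if (maxAttempts < 1) {
      throw new IllegalArgumentException("maxAttempts must be >= 1");
    }
    IOException last = null;
    for (int attempt = 1; attempt <= maxAttempts; attempt++) {
      try {
        update.run();
        return attempt;
      } catch (IOException e) {
        // Meta may have gone offline after any earlier availability check; try again.
        last = e;
      }
    }
    throw last;
  }
}
```

The design choice here is that no availability check appears at all: the failure is handled where it surfaces, which stays correct even when meta goes offline between a check and the write.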
[jira] [Resolved] (HBASE-24739) [Build] branch-1's build seems broken because of pylint
[ https://issues.apache.org/jira/browse/HBASE-24739?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reid Chan resolved HBASE-24739. --- Hadoop Flags: Reviewed Resolution: Fixed > [Build] branch-1's build seems broken because of pylint > --- > > Key: HBASE-24739 > URL: https://issues.apache.org/jira/browse/HBASE-24739 > Project: HBase > Issue Type: Bug > Components: build >Reporter: Reid Chan >Assignee: Reid Chan >Priority: Blocker > Fix For: 1.7.0, 1.4.14 > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (HBASE-24739) [Build] branch-1's build seems broken because of pylint
[ https://issues.apache.org/jira/browse/HBASE-24739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17159286#comment-17159286 ] Reid Chan commented on HBASE-24739: --- Pushed to branch-1 and branch-1.4 > [Build] branch-1's build seems broken because of pylint > --- > > Key: HBASE-24739 > URL: https://issues.apache.org/jira/browse/HBASE-24739 > Project: HBase > Issue Type: Bug > Components: build >Reporter: Reid Chan >Assignee: Reid Chan >Priority: Blocker > Fix For: 1.7.0, 1.4.14 > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HBASE-24739) [Build] branch-1's build seems broken because of pylint
[ https://issues.apache.org/jira/browse/HBASE-24739?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reid Chan updated HBASE-24739: -- Component/s: build > [Build] branch-1's build seems broken because of pylint > --- > > Key: HBASE-24739 > URL: https://issues.apache.org/jira/browse/HBASE-24739 > Project: HBase > Issue Type: Bug > Components: build >Reporter: Reid Chan >Assignee: Reid Chan >Priority: Blocker > Fix For: 1.7.0, 1.4.14 > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HBASE-24739) [Build] branch-1's build seems broken because of pylint
[ https://issues.apache.org/jira/browse/HBASE-24739?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reid Chan updated HBASE-24739: -- Fix Version/s: 1.4.14 1.7.0 > [Build] branch-1's build seems broken because of pylint > --- > > Key: HBASE-24739 > URL: https://issues.apache.org/jira/browse/HBASE-24739 > Project: HBase > Issue Type: Bug >Reporter: Reid Chan >Assignee: Reid Chan >Priority: Blocker > Fix For: 1.7.0, 1.4.14 > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [hbase] wchevreuil commented on a change in pull request #2052: HBASE-24718 : Generic NamedQueue framework for multiple use-cases (Refactor SlowLog responses)
wchevreuil commented on a change in pull request #2052: URL: https://github.com/apache/hbase/pull/2052#discussion_r455866079 ## File path: hbase-server/src/main/java/org/apache/hadoop/hbase/namequeues/LogEventHandler.java ## @@ -0,0 +1,101 @@ +/* + * + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.hadoop.hbase.namequeues; + +import com.lmax.disruptor.EventHandler; +import com.lmax.disruptor.RingBuffer; + +import java.util.HashMap; +import java.util.Map; + +import org.apache.hadoop.conf.Configuration; +import org.apache.hadoop.hbase.namequeues.request.NamedQueueGetRequest; +import org.apache.hadoop.hbase.namequeues.response.NamedQueueGetResponse; +import org.apache.yetus.audience.InterfaceAudience; + +/** + * Event Handler run by disruptor ringbuffer consumer. + * Although this is generic implementation for namedQueue, it can have individual queue specific + * logic. + */ +@InterfaceAudience.Private +class LogEventHandler implements EventHandler { + + // Map that binds namedQueues to corresponding queue service implementation. + // If NamedQueue of specific type is enabled, corresponding service will be used to + // insert and retrieve records. 
+ // Individual queue sizes should be determined based on their individual configs within + // each service. + private final Map namedQueueServices = +new HashMap<>(); + + LogEventHandler(final Configuration conf) { +// add all service mappings here +namedQueueServices Review comment: Use reflection and avoid changing this class with every new implementation added, something similar to what we do for [SaslServerAuthenticationProviders](https://github.com/apache/hbase/blob/master/hbase-server/src/main/java/org/apache/hadoop/hbase/security/provider/SaslServerAuthenticationProviders.java#L107) This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Commented] (HBASE-24459) Move the locateMeta logic from AsyncMetaRegionTableLocator to ConnectionRegistry
[ https://issues.apache.org/jira/browse/HBASE-24459?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17159242#comment-17159242 ] Duo Zhang commented on HBASE-24459: --- The plan is to let the ZKConnectionRegistry to go to the active master, as we can get the active master from zk. It will be the registry used inside HBase Cluster, i.e, the Connections at RS side will use it to locate meta. And for MasterConnectionRegistry, the client side is almost the same, we will do hedge requests to the configured masters, and at server side, we will provide caches at backup masters to handle the requests. > Move the locateMeta logic from AsyncMetaRegionTableLocator to > ConnectionRegistry > > > Key: HBASE-24459 > URL: https://issues.apache.org/jira/browse/HBASE-24459 > Project: HBase > Issue Type: Sub-task >Reporter: Duo Zhang >Priority: Major > > Now the related code is only in AsyncMetaRegionTableLocator, we could make > the actually implementation pluggable, so we do not always need to go to the > active master. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HBASE-24690) Set version to 2.2.6 in branch-2.2 for first RC of 2.2.6
[ https://issues.apache.org/jira/browse/HBASE-24690?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Guanghao Zhang updated HBASE-24690: --- Fix Version/s: 2.2.6 > Set version to 2.2.6 in branch-2.2 for first RC of 2.2.6 > > > Key: HBASE-24690 > URL: https://issues.apache.org/jira/browse/HBASE-24690 > Project: HBase > Issue Type: Sub-task >Reporter: Guanghao Zhang >Priority: Major > Fix For: 2.2.6 > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (HBASE-24690) Set version to 2.2.6 in branch-2.2 for first RC of 2.2.6
[ https://issues.apache.org/jira/browse/HBASE-24690?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Guanghao Zhang resolved HBASE-24690. Assignee: Guanghao Zhang Resolution: Fixed > Set version to 2.2.6 in branch-2.2 for first RC of 2.2.6 > > > Key: HBASE-24690 > URL: https://issues.apache.org/jira/browse/HBASE-24690 > Project: HBase > Issue Type: Sub-task >Reporter: Guanghao Zhang >Assignee: Guanghao Zhang >Priority: Major > Fix For: 2.2.6 > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (HBASE-24467) Backport HBASE-23963: Split TestFromClientSide; it takes too long to complete timing out
[ https://issues.apache.org/jira/browse/HBASE-24467?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Guanghao Zhang resolved HBASE-24467. Resolution: Fixed > Backport HBASE-23963: Split TestFromClientSide; it takes too long to complete > timing out > > > Key: HBASE-24467 > URL: https://issues.apache.org/jira/browse/HBASE-24467 > Project: HBase > Issue Type: Improvement >Affects Versions: 2.2.5 >Reporter: Guanghao Zhang >Assignee: Guanghao Zhang >Priority: Major > Fix For: 2.2.6 > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [hbase] Apache-HBase commented on pull request #2077: HBASE-24684 Fetch ReplicationSink servers list from HMaster instead o…
Apache-HBase commented on pull request #2077:
URL: https://github.com/apache/hbase/pull/2077#issuecomment-659345109

:confetti_ball: **+1 overall**

| Vote | Subsystem | Runtime | Comment |
|:----:|----------:|:--------|:--------|
| +0 :ok: | reexec | 0m 31s | Docker mode activated. |
||| _ Prechecks _ |
| +1 :green_heart: | dupname | 0m 1s | No case conflicting files found. |
| +0 :ok: | prototool | 0m 1s | prototool was not available. |
| +1 :green_heart: | hbaseanti | 0m 0s | Patch does not have any anti-patterns. |
| +1 :green_heart: | @author | 0m 0s | The patch does not contain any @author tags. |
||| _ HBASE-24666 Compile Tests _ |
| +0 :ok: | mvndep | 0m 23s | Maven dependency ordering for branch |
| +1 :green_heart: | mvninstall | 3m 37s | HBASE-24666 passed |
| +1 :green_heart: | checkstyle | 2m 31s | HBASE-24666 passed |
| +1 :green_heart: | spotbugs | 7m 31s | HBASE-24666 passed |
||| _ Patch Compile Tests _ |
| +0 :ok: | mvndep | 0m 13s | Maven dependency ordering for patch |
| +1 :green_heart: | mvninstall | 3m 19s | the patch passed |
| -0 :warning: | checkstyle | 0m 28s | hbase-client: The patch generated 2 new + 38 unchanged - 0 fixed = 40 total (was 38) |
| -0 :warning: | checkstyle | 1m 7s | hbase-server: The patch generated 1 new + 185 unchanged - 2 fixed = 186 total (was 187) |
| +1 :green_heart: | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 :green_heart: | hadoopcheck | 11m 11s | Patch does not cause any errors with Hadoop 3.1.2 3.2.1. |
| +1 :green_heart: | hbaseprotoc | 2m 27s | the patch passed |
| +1 :green_heart: | spotbugs | 8m 14s | the patch passed |
||| _ Other Tests _ |
| +1 :green_heart: | asflicense | 0m 51s | The patch does not generate ASF License warnings. |
| | | 51m 21s | |

| Subsystem | Report/Notes |
|----------:|:-------------|
| Docker | Client=19.03.12 Server=19.03.12 base: https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-2077/2/artifact/yetus-general-check/output/Dockerfile |
| GITHUB PR | https://github.com/apache/hbase/pull/2077 |
| Optional Tests | dupname asflicense spotbugs hadoopcheck hbaseanti checkstyle cc hbaseprotoc prototool |
| uname | Linux 54059399f6e4 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev-support/hbase-personality.sh |
| git revision | HBASE-24666 / 9e8c930feb |
| checkstyle | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-2077/2/artifact/yetus-general-check/output/diff-checkstyle-hbase-client.txt |
| checkstyle | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-2077/2/artifact/yetus-general-check/output/diff-checkstyle-hbase-server.txt |
| Max. process+thread count | 94 (vs. ulimit of 12500) |
| modules | C: hbase-protocol-shaded hbase-client hbase-server hbase-thrift U: . |
| Console output | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-2077/2/console |
| versions | git=2.17.1 maven=(cecedd343002696d0abb50b32b541b8a6ba2883f) spotbugs=3.1.12 |
| Powered by | Apache Yetus 0.11.1 https://yetus.apache.org |

This message was automatically generated.
[jira] [Comment Edited] (HBASE-24459) Move the locateMeta logic from AsyncMetaRegionTableLocator to ConnectionRegistry
[ https://issues.apache.org/jira/browse/HBASE-24459?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17158625#comment-17158625 ] Viraj Jasani edited comment on HBASE-24459 at 7/16/20, 11:06 AM: - Few questions: If we move locateMeta to ConnectionRegistry, MasterRegistry can call MasterProtos.locateMetaRegion(), however, what about ZKConnectionRegistry? Are we expecting default registry to be MasterRegistry with splittable meta work (skimmed through design doc, maybe I missed something related to this)? I think mostly no because many internal cluster connections still do require ZKConnection right? I hope the intention of this Jira is for clients to hedge the requests in a random order to avoid hot-spotting a single HMaster, hence using Master Registry. Also, with splittable meta, we have no chance of ZK locating meta regions. Is that correct? As of now, logic of *locateMetaRegion() call to active HMaster* is kept in AsyncMetaTableRegionLocator. Is it due to the same reason that we don't have a way for ZK Resitry to provide us meta regions? was (Author: vjasani): Few questions: If we move locateMeta to ConnectionRegistry, MasterRegistry can call MasterProtos.locateMetaRegion(), however, what about ZKConnectionRegistry? Are we expecting default registry to be MasterRegistry with splittable meta work (skimmed through design doc, maybe I missed something related to this)? I think mostly no because many internal cluster connections still do require ZKConnection right? I hope the intention of this Jira is for clients to hedge the requests in a random order to avoid hot-spotting a single HMaster, hence using Master Registry. Also, with splittable meta, we have no chance of ZK locating meta regions. Is that correct? 
> Move the locateMeta logic from AsyncMetaRegionTableLocator to > ConnectionRegistry > > > Key: HBASE-24459 > URL: https://issues.apache.org/jira/browse/HBASE-24459 > Project: HBase > Issue Type: Sub-task >Reporter: Duo Zhang >Priority: Major > > Now the related code is only in AsyncMetaRegionTableLocator, we could make > the actually implementation pluggable, so we do not always need to go to the > active master. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (HBASE-24739) [Build] branch-1's build seems broken because of pylint
[ https://issues.apache.org/jira/browse/HBASE-24739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17159124#comment-17159124 ] Hudson commented on HBASE-24739: Results for branch branch-1 [build #1326 on builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-1/1326/]: (x) *{color:red}-1 overall{color}* details (if available): (/) {color:green}+1 general checks{color} -- For more information [see general report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1/1326//General_Nightly_Build_Report/] (x) {color:red}-1 jdk7 checks{color} -- For more information [see jdk7 report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1/1326//JDK7_Nightly_Build_Report/] (x) {color:red}-1 jdk8 hadoop2 checks{color} -- For more information [see jdk8 (hadoop2) report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1/1326//JDK8_Nightly_Build_Report_(Hadoop2)/] (x) {color:red}-1 source release artifact{color} -- See build output for details. > [Build] branch-1's build seems broken because of pylint > --- > > Key: HBASE-24739 > URL: https://issues.apache.org/jira/browse/HBASE-24739 > Project: HBase > Issue Type: Bug >Reporter: Reid Chan >Assignee: Reid Chan >Priority: Blocker > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (HBASE-24721) rename_rsgroup overwriting the existing rsgroup.
[ https://issues.apache.org/jira/browse/HBASE-24721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17159123#comment-17159123 ] Hudson commented on HBASE-24721: Results for branch branch-1 [build #1326 on builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-1/1326/]: (x) *{color:red}-1 overall{color}* details (if available): (/) {color:green}+1 general checks{color} -- For more information [see general report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1/1326//General_Nightly_Build_Report/] (x) {color:red}-1 jdk7 checks{color} -- For more information [see jdk7 report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1/1326//JDK7_Nightly_Build_Report/] (x) {color:red}-1 jdk8 hadoop2 checks{color} -- For more information [see jdk8 (hadoop2) report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1/1326//JDK8_Nightly_Build_Report_(Hadoop2)/] (x) {color:red}-1 source release artifact{color} -- See build output for details. > rename_rsgroup overwriting the existing rsgroup. > > > Key: HBASE-24721 > URL: https://issues.apache.org/jira/browse/HBASE-24721 > Project: HBase > Issue Type: Bug >Reporter: chiranjeevi >Assignee: Mohammad Arshad >Priority: Major > Fix For: 3.0.0-alpha-1, 2.3.1, 1.7.0, 2.4.0, 2.2.6 > > > rename_rsgroup overwriting the current rsgroup. > Steps: > 1)add_rsgroup 'RSG1' and 'RSG2' > 2)move_servers_rsgroup 'RSG1',['server1:port'] > 3)rename_rsgroup 'RSG1','RSG2' > After performing step3 RSG1 overwriting to RSG2 and region servers added in > RSG1 are not available now. > Ideally system should show error message Group already exists: RSG2 > -- This message was sent by Atlassian Jira (v8.3.4#803005)
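The expected behaviour in the HBASE-24721 report, rejecting a rename onto an existing group instead of overwriting it, amounts to one extra existence check before the move. The sketch below models it with an in-memory map; `RenameGroupSketch` and its methods are hypothetical and are not the RSGroup admin API. Only the error text "Group already exists: RSG2" is taken from the report.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

final class RenameGroupSketch {
  // Group name -> servers assigned to it (in-memory stand-in for rsgroup state).
  private final Map<String, Set<String>> groups = new HashMap<>();

  void addGroup(String name) {
    if (groups.containsKey(name)) {
      throw new IllegalArgumentException("Group already exists: " + name);
    }
    groups.put(name, new HashSet<>());
  }

  void moveServer(String group, String server) {
    Set<String> target = groups.get(group);
    if (target == null) {
      throw new IllegalArgumentException("Group does not exist: " + group);
    }
    // A server belongs to one group at a time, so drop it from any other group first.
    for (Set<String> servers : groups.values()) {
      servers.remove(server);
    }
    target.add(server);
  }

  void rename(String oldName, String newName) {
    if (!groups.containsKey(oldName)) {
      throw new IllegalArgumentException("Group does not exist: " + oldName);
    }
    // The check the report asks for: never overwrite an existing target group.
    if (groups.containsKey(newName)) {
      throw new IllegalArgumentException("Group already exists: " + newName);
    }
    groups.put(newName, groups.remove(oldName));
  }

  /** Returns a copy of the group's servers, or null if the group does not exist. */
  Set<String> serversOf(String group) {
    Set<String> servers = groups.get(group);
    return servers == null ? null : new HashSet<>(servers);
  }
}
```

Replaying the three steps from the report against this model, the rename in step 3 fails with the expected message and RSG1 keeps its server.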
[jira] [Commented] (HBASE-24615) MutableRangeHistogram#updateSnapshotRangeMetrics doesn't calculate the distribution for last bucket.
[ https://issues.apache.org/jira/browse/HBASE-24615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17159122#comment-17159122 ] Hudson commented on HBASE-24615: Results for branch branch-1 [build #1326 on builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-1/1326/]: (x) *{color:red}-1 overall{color}* details (if available): (/) {color:green}+1 general checks{color} -- For more information [see general report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1/1326//General_Nightly_Build_Report/] (x) {color:red}-1 jdk7 checks{color} -- For more information [see jdk7 report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1/1326//JDK7_Nightly_Build_Report/] (x) {color:red}-1 jdk8 hadoop2 checks{color} -- For more information [see jdk8 (hadoop2) report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1/1326//JDK8_Nightly_Build_Report_(Hadoop2)/] (x) {color:red}-1 source release artifact{color} -- See build output for details. > MutableRangeHistogram#updateSnapshotRangeMetrics doesn't calculate the > distribution for last bucket. > > > Key: HBASE-24615 > URL: https://issues.apache.org/jira/browse/HBASE-24615 > Project: HBase > Issue Type: Bug > Components: metrics >Affects Versions: 2.3.0, master, 1.3.7, 2.2.6 >Reporter: Rushabh Shah >Assignee: wenfeiyi666 >Priority: Major > Fix For: 3.0.0-alpha-1, 2.3.1, 1.7.0, 2.4.0, 2.2.6 > > > We are not processing the distribution for last bucket. > https://github.com/apache/hbase/blob/master/hbase-hadoop-compat/src/main/java/org/apache/hadoop/metrics2/lib/MutableRangeHistogram.java#L70 > {code:java} > public void updateSnapshotRangeMetrics(MetricsRecordBuilder > metricsRecordBuilder, > Snapshot snapshot) { > long priorRange = 0; > long cumNum = 0; > final long[] ranges = getRanges(); > final String rangeType = getRangeType(); > for (int i = 0; i < ranges.length - 1; i++) { -> The bug lies > here. We are not processing last bucket. 
> long val = snapshot.getCountAtOrBelow(ranges[i]); > if (val - cumNum > 0) { > metricsRecordBuilder.addCounter( > Interns.info(name + "_" + rangeType + "_" + priorRange + "-" + > ranges[i], desc), > val - cumNum); > } > priorRange = ranges[i]; > cumNum = val; > } > long val = snapshot.getCount(); > if (val - cumNum > 0) { > metricsRecordBuilder.addCounter( > Interns.info(name + "_" + rangeType + "_" + ranges[ranges.length - > 1] + "-inf", desc), > val - cumNum); > } > } > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
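The effect of the loop bound in HBASE-24615 can be reproduced without any Hadoop or HBase classes. The sketch below is an assumption-laden simplification: RangeBucketSketch, countAtOrBelow over a plain array, and the includeLastRange flag are hypothetical stand-ins for the Snapshot and MetricsRecordBuilder used in the quoted method, and iterating up to ranges.length is one plausible correction rather than the committed fix.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical, dependency-free model of the bucket loop quoted above.
final class RangeBucketSketch {

  // Stand-in for Snapshot#getCountAtOrBelow: number of samples <= bound.
  static long countAtOrBelow(long[] samples, long bound) {
    long n = 0;
    for (long s : samples) {
      if (s <= bound) {
        n++;
      }
    }
    return n;
  }

  // includeLastRange=false mirrors the quoted loop (i < ranges.length - 1);
  // includeLastRange=true also visits the last finite range.
  static Map<String, Long> buckets(long[] ranges, long[] samples, boolean includeLastRange) {
    Map<String, Long> out = new LinkedHashMap<>();
    long priorRange = 0;
    long cumNum = 0;
    int limit = includeLastRange ? ranges.length : ranges.length - 1;
    for (int i = 0; i < limit; i++) {
      long val = countAtOrBelow(samples, ranges[i]);
      if (val - cumNum > 0) {
        out.put(priorRange + "-" + ranges[i], val - cumNum);
      }
      priorRange = ranges[i];
      cumNum = val;
    }
    long total = samples.length;
    if (total - cumNum > 0) {
      out.put(ranges[ranges.length - 1] + "-inf", total - cumNum);
    }
    return out;
  }

  public static void main(String[] args) {
    long[] ranges = {1, 3, 10};
    long[] samples = {1, 2, 5, 20};
    System.out.println("quoted loop:   " + buckets(ranges, samples, false));
    System.out.println("full iteration: " + buckets(ranges, samples, true));
  }
}
```

In this model the sample 5 belongs to the 3-10 bucket, but the shorter loop never emits that bucket and counts the sample under 10-inf instead.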
[GitHub] [hbase] virajjasani commented on a change in pull request #2076: HBASE-24740 Enable journal logging for HBase snapshot operation
virajjasani commented on a change in pull request #2076: URL: https://github.com/apache/hbase/pull/2076#discussion_r455697655 ## File path: hbase-server/src/main/java/org/apache/hadoop/hbase/snapshot/SnapshotManifest.java ## @@ -343,6 +343,13 @@ private void load() throws IOException { } } + /** + * Sets the status task for monitoring all the subtasks for Snapshot operation + */ + public void setMonitoredTask(MonitoredTask statusTask) { Review comment: Passing with create() sounds good, this class is anyways IA.Private, we can add new argument to public methods. ## File path: hbase-server/src/main/java/org/apache/hadoop/hbase/master/snapshot/SnapshotManager.java ## @@ -34,22 +37,21 @@ import java.util.concurrent.TimeUnit; import java.util.concurrent.locks.ReadWriteLock; import java.util.concurrent.locks.ReentrantReadWriteLock; - import org.apache.commons.logging.Log; import org.apache.commons.logging.LogFactory; -import org.apache.hadoop.hbase.classification.InterfaceAudience; -import org.apache.hadoop.hbase.classification.InterfaceStability; import org.apache.hadoop.conf.Configuration; import org.apache.hadoop.fs.FSDataInputStream; import org.apache.hadoop.fs.FileStatus; import org.apache.hadoop.fs.FileSystem; import org.apache.hadoop.fs.Path; -import org.apache.hadoop.hbase.TableName; import org.apache.hadoop.hbase.HBaseInterfaceAudience; import org.apache.hadoop.hbase.HConstants; import org.apache.hadoop.hbase.HTableDescriptor; -import org.apache.hadoop.hbase.Stoppable; import org.apache.hadoop.hbase.MetaTableAccessor; +import org.apache.hadoop.hbase.Stoppable; +import org.apache.hadoop.hbase.TableName; +import org.apache.hadoop.hbase.classification.InterfaceAudience; +import org.apache.hadoop.hbase.classification.InterfaceStability; Review comment: looks like import reorder is due to formatter :) This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [hbase] infraio merged pull request #2073: HBASE-24467 Backport HBASE-23963: Split TestFromClientSide; it takes …
infraio merged pull request #2073: URL: https://github.com/apache/hbase/pull/2073 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [hbase] ddupg commented on a change in pull request #2077: HBASE-24684 Fetch ReplicationSink servers list from HMaster instead o…
ddupg commented on a change in pull request #2077: URL: https://github.com/apache/hbase/pull/2077#discussion_r455626321 ## File path: hbase-server/src/main/java/org/apache/hadoop/hbase/replication/HBaseReplicationEndpoint.java ## @@ -148,27 +149,30 @@ public boolean isAborted() { return false; } + /** + * Get the connection to peer cluster + * @return connection to peer cluster + * @throws IOException + */ + protected synchronized Connection getPeerConnection() throws IOException { +if (peerConnection == null) { + peerConnection = ConnectionFactory.createConnection(ctx.getConfiguration()); +} +return peerConnection; + } + /** * Get the list of all the region servers from the specified peer - * @param zkw zk connection to use * @return list of region server addresses or an empty list if the slave is unavailable */ - protected static List fetchSlavesAddresses(ZKWatcher zkw) Review comment: OK, try the new impl first and fall back to the old ZK impl second, is that OK? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
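The ordering proposed in the review comment on #2077 (new implementation first, old ZK implementation second) might look roughly like the following. Everything here is hypothetical: SinkFetchFallbackSketch, fetchWithFallback, and the two Callable sources are illustrative placeholders for the HMaster-based and ZooKeeper-based lookups, and this sketch assumes the fallback is triggered only when the primary lookup throws.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.Callable;

// Hypothetical sketch of "primary source first, legacy source second".
final class SinkFetchFallbackSketch {

  static List<String> fetchWithFallback(Callable<List<String>> primary,
      Callable<List<String>> fallback) {
    try {
      List<String> servers = primary.call();
      if (servers != null) {
        return servers;
      }
    } catch (Exception e) {
      // Primary lookup failed; fall through to the legacy lookup.
    }
    try {
      List<String> servers = fallback.call();
      if (servers != null) {
        return servers;
      }
    } catch (Exception e) {
      // Both lookups failed; report no servers, as the quoted javadoc describes.
    }
    return Collections.emptyList();
  }

  public static void main(String[] args) {
    List<String> servers = fetchWithFallback(
        () -> { throw new java.io.IOException("master lookup unavailable"); },
        () -> Arrays.asList("rs1:16020", "rs2:16020"));
    System.out.println(servers);
  }
}
```

Whether an empty result from the primary source should also trigger the fallback is a design choice this sketch leaves out.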
[GitHub] [hbase] Apache-HBase commented on pull request #2071: HBASE-24743 Reject to add a peer which replicate to itself earlier
Apache-HBase commented on pull request #2071: URL: https://github.com/apache/hbase/pull/2071#issuecomment-659248670 :broken_heart: **-1 overall** | Vote | Subsystem | Runtime | Comment | |::|--:|:|:| | +0 :ok: | reexec | 1m 2s | Docker mode activated. | | -0 :warning: | yetus | 0m 2s | Unprocessed flag(s): --brief-report-file --spotbugs-strict-precheck --whitespace-eol-ignore-list --whitespace-tabs-ignore-list --quick-hadoopcheck | ||| _ Prechecks _ | ||| _ master Compile Tests _ | | +1 :green_heart: | mvninstall | 4m 30s | master passed | | +1 :green_heart: | compile | 1m 11s | master passed | | +1 :green_heart: | shadedjars | 6m 18s | branch has no errors when building our shaded downstream artifacts. | | -0 :warning: | javadoc | 0m 40s | hbase-server in master failed. | ||| _ Patch Compile Tests _ | | +1 :green_heart: | mvninstall | 4m 28s | the patch passed | | +1 :green_heart: | compile | 1m 9s | the patch passed | | +1 :green_heart: | javac | 1m 9s | the patch passed | | +1 :green_heart: | shadedjars | 6m 21s | patch has no errors when building our shaded downstream artifacts. | | -0 :warning: | javadoc | 0m 41s | hbase-server in the patch failed. | ||| _ Other Tests _ | | -1 :x: | unit | 201m 27s | hbase-server in the patch failed. 
| | | | 229m 22s | | | Subsystem | Report/Notes | |--:|:-| | Docker | Client=19.03.12 Server=19.03.12 base: https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-2071/3/artifact/yetus-jdk11-hadoop3-check/output/Dockerfile | | GITHUB PR | https://github.com/apache/hbase/pull/2071 | | Optional Tests | javac javadoc unit shadedjars compile | | uname | Linux 3680f72cba1e 4.15.0-101-generic #102-Ubuntu SMP Mon May 11 10:07:26 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | dev-support/hbase-personality.sh | | git revision | master / 2505c7760d | | Default Java | 2020-01-14 | | javadoc | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-2071/3/artifact/yetus-jdk11-hadoop3-check/output/branch-javadoc-hbase-server.txt | | javadoc | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-2071/3/artifact/yetus-jdk11-hadoop3-check/output/patch-javadoc-hbase-server.txt | | unit | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-2071/3/artifact/yetus-jdk11-hadoop3-check/output/patch-unit-hbase-server.txt | | Test Results | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-2071/3/testReport/ | | Max. process+thread count | 3126 (vs. ulimit of 12500) | | modules | C: hbase-server U: hbase-server | | Console output | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-2071/3/console | | versions | git=2.17.1 maven=(cecedd343002696d0abb50b32b541b8a6ba2883f) | | Powered by | Apache Yetus 0.11.1 https://yetus.apache.org | This message was automatically generated. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Commented] (HBASE-11288) Splittable Meta
[ https://issues.apache.org/jira/browse/HBASE-11288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17159008#comment-17159008 ] Hudson commented on HBASE-11288: Results for branch HBASE-11288 [build #3 on builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/HBASE-11288/3/]: (x) *{color:red}-1 overall{color}* details (if available): (x) {color:red}-1 general checks{color} -- For more information [see general report|https://builds.apache.org/job/HBase%20Nightly/job/HBASE-11288/3/General_20Nightly_20Build_20Report/] (x) {color:red}-1 jdk8 hadoop2 checks{color} -- For more information [see jdk8 (hadoop2) report|https://builds.apache.org/job/HBase%20Nightly/job/HBASE-11288/3/JDK8_20Nightly_20Build_20Report_20_28Hadoop2_29/] (x) {color:red}-1 jdk8 hadoop3 checks{color} -- For more information [see jdk8 (hadoop3) report|https://builds.apache.org/job/HBase%20Nightly/job/HBASE-11288/3/JDK8_20Nightly_20Build_20Report_20_28Hadoop3_29/] (x) {color:red}-1 jdk11 hadoop3 checks{color} -- For more information [see jdk11 report|https://builds.apache.org/job/HBase%20Nightly/job/HBASE-11288/3/JDK11_20Nightly_20Build_20Report_20_28Hadoop3_29/] (/) {color:green}+1 source release artifact{color} -- See build output for details. (/) {color:green}+1 client integration test{color} > Splittable Meta > --- > > Key: HBASE-11288 > URL: https://issues.apache.org/jira/browse/HBASE-11288 > Project: HBase > Issue Type: Umbrella > Components: meta >Reporter: Francis Christopher Liu >Assignee: Francis Christopher Liu >Priority: Major > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [hbase] Apache-HBase commented on pull request #2075: HBASE-24742 Improve performance of SKIP vs SEEK logic.
Apache-HBase commented on pull request #2075: URL: https://github.com/apache/hbase/pull/2075#issuecomment-659226090 :broken_heart: **-1 overall** | Vote | Subsystem | Runtime | Comment | |::|--:|:|:| | +0 :ok: | reexec | 0m 37s | Docker mode activated. | ||| _ Prechecks _ | | +1 :green_heart: | dupname | 0m 0s | No case conflicting files found. | | +1 :green_heart: | hbaseanti | 0m 0s | Patch does not have any anti-patterns. | | +1 :green_heart: | @author | 0m 0s | The patch does not contain any @author tags. | | -0 :warning: | test4tests | 0m 0s | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. | ||| _ branch-1 Compile Tests _ | | +1 :green_heart: | mvninstall | 9m 46s | branch-1 passed | | +1 :green_heart: | compile | 0m 39s | branch-1 passed with JDK v1.8.0_252 | | +1 :green_heart: | compile | 0m 44s | branch-1 passed with JDK v1.7.0_262 | | +1 :green_heart: | checkstyle | 1m 42s | branch-1 passed | | +1 :green_heart: | shadedjars | 3m 0s | branch has no errors when building our shaded downstream artifacts. | | +1 :green_heart: | javadoc | 0m 48s | branch-1 passed with JDK v1.8.0_252 | | +1 :green_heart: | javadoc | 0m 40s | branch-1 passed with JDK v1.7.0_262 | | +0 :ok: | spotbugs | 3m 3s | Used deprecated FindBugs config; considering switching to SpotBugs. 
| | +1 :green_heart: | findbugs | 3m 0s | branch-1 passed | ||| _ Patch Compile Tests _ | | +1 :green_heart: | mvninstall | 1m 56s | the patch passed | | +1 :green_heart: | compile | 0m 40s | the patch passed with JDK v1.8.0_252 | | +1 :green_heart: | javac | 0m 40s | the patch passed | | +1 :green_heart: | compile | 0m 43s | the patch passed with JDK v1.7.0_262 | | +1 :green_heart: | javac | 0m 43s | the patch passed | | -1 :x: | checkstyle | 1m 30s | hbase-server: The patch generated 1 new + 38 unchanged - 1 fixed = 39 total (was 39) | | +1 :green_heart: | whitespace | 0m 0s | The patch has no whitespace issues. | | +1 :green_heart: | shadedjars | 2m 47s | patch has no errors when building our shaded downstream artifacts. | | +1 :green_heart: | hadoopcheck | 4m 43s | Patch does not cause any errors with Hadoop 2.8.5 2.9.2. | | +1 :green_heart: | javadoc | 0m 31s | the patch passed with JDK v1.8.0_252 | | +1 :green_heart: | javadoc | 0m 41s | the patch passed with JDK v1.7.0_262 | | +1 :green_heart: | findbugs | 2m 50s | the patch passed | ||| _ Other Tests _ | | +1 :green_heart: | unit | 134m 29s | hbase-server in the patch passed. | | +1 :green_heart: | asflicense | 0m 38s | The patch does not generate ASF License warnings. 
| | | | 176m 2s | | | Subsystem | Report/Notes | |--:|:-| | Docker | Client=19.03.12 Server=19.03.12 base: https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-2075/3/artifact/out/Dockerfile | | GITHUB PR | https://github.com/apache/hbase/pull/2075 | | Optional Tests | dupname asflicense javac javadoc unit spotbugs findbugs shadedjars hadoopcheck hbaseanti checkstyle compile | | uname | Linux af8ea015eb72 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /home/jenkins/jenkins-slave/workspace/Base-PreCommit-GitHub-PR_PR-2075/out/precommit/personality/provided.sh | | git revision | branch-1 / b249092 | | Default Java | 1.7.0_262 | | Multi-JDK versions | /usr/lib/jvm/zulu-8-amd64:1.8.0_252 /usr/lib/jvm/zulu-7-amd64:1.7.0_262 | | checkstyle | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-2075/3/artifact/out/diff-checkstyle-hbase-server.txt | | Test Results | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-2075/3/testReport/ | | Max. process+thread count | 4208 (vs. ulimit of 1) | | modules | C: hbase-server U: hbase-server | | Console output | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-2075/3/console | | versions | git=1.9.1 maven=3.0.5 findbugs=3.0.1 | | Powered by | Apache Yetus 0.11.1 https://yetus.apache.org | This message was automatically generated. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [hbase] Apache-HBase commented on pull request #2077: HBASE-24684 Fetch ReplicationSink servers list from HMaster instead o…
Apache-HBase commented on pull request #2077: URL: https://github.com/apache/hbase/pull/2077#issuecomment-659222632 :broken_heart: **-1 overall** | Vote | Subsystem | Runtime | Comment | |::|--:|:|:| | +0 :ok: | reexec | 1m 10s | Docker mode activated. | | -0 :warning: | yetus | 0m 3s | Unprocessed flag(s): --brief-report-file --spotbugs-strict-precheck --whitespace-eol-ignore-list --whitespace-tabs-ignore-list --quick-hadoopcheck | ||| _ Prechecks _ | ||| _ HBASE-24666 Compile Tests _ | | +0 :ok: | mvndep | 0m 22s | Maven dependency ordering for branch | | +1 :green_heart: | mvninstall | 4m 42s | HBASE-24666 passed | | +1 :green_heart: | compile | 3m 20s | HBASE-24666 passed | | +1 :green_heart: | shadedjars | 6m 22s | branch has no errors when building our shaded downstream artifacts. | | -0 :warning: | javadoc | 0m 25s | hbase-client in HBASE-24666 failed. | | -0 :warning: | javadoc | 0m 40s | hbase-server in HBASE-24666 failed. | | -0 :warning: | javadoc | 1m 0s | hbase-thrift in HBASE-24666 failed. | ||| _ Patch Compile Tests _ | | +0 :ok: | mvndep | 0m 13s | Maven dependency ordering for patch | | +1 :green_heart: | mvninstall | 4m 31s | the patch passed | | +1 :green_heart: | compile | 3m 21s | the patch passed | | +1 :green_heart: | javac | 3m 21s | the patch passed | | +1 :green_heart: | shadedjars | 6m 27s | patch has no errors when building our shaded downstream artifacts. | | -0 :warning: | javadoc | 0m 24s | hbase-client in the patch failed. | | -0 :warning: | javadoc | 0m 42s | hbase-server in the patch failed. | | -0 :warning: | javadoc | 0m 57s | hbase-thrift in the patch failed. | ||| _ Other Tests _ | | +1 :green_heart: | unit | 1m 3s | hbase-protocol-shaded in the patch passed. | | +1 :green_heart: | unit | 1m 27s | hbase-client in the patch passed. | | -1 :x: | unit | 214m 18s | hbase-server in the patch failed. | | +1 :green_heart: | unit | 5m 3s | hbase-thrift in the patch passed. 
| | | | 259m 0s | | | Subsystem | Report/Notes | |--:|:-| | Docker | Client=19.03.12 Server=19.03.12 base: https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-2077/1/artifact/yetus-jdk11-hadoop3-check/output/Dockerfile | | GITHUB PR | https://github.com/apache/hbase/pull/2077 | | Optional Tests | javac javadoc unit shadedjars compile | | uname | Linux fa7455056916 4.15.0-101-generic #102-Ubuntu SMP Mon May 11 10:07:26 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | dev-support/hbase-personality.sh | | git revision | HBASE-24666 / 9e8c930feb | | Default Java | 2020-01-14 | | javadoc | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-2077/1/artifact/yetus-jdk11-hadoop3-check/output/branch-javadoc-hbase-client.txt | | javadoc | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-2077/1/artifact/yetus-jdk11-hadoop3-check/output/branch-javadoc-hbase-server.txt | | javadoc | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-2077/1/artifact/yetus-jdk11-hadoop3-check/output/branch-javadoc-hbase-thrift.txt | | javadoc | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-2077/1/artifact/yetus-jdk11-hadoop3-check/output/patch-javadoc-hbase-client.txt | | javadoc | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-2077/1/artifact/yetus-jdk11-hadoop3-check/output/patch-javadoc-hbase-server.txt | | javadoc | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-2077/1/artifact/yetus-jdk11-hadoop3-check/output/patch-javadoc-hbase-thrift.txt | | unit | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-2077/1/artifact/yetus-jdk11-hadoop3-check/output/patch-unit-hbase-server.txt | | Test Results | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-2077/1/testReport/ | | Max. process+thread count | 2947 (vs. ulimit of 12500) | | modules | C: hbase-protocol-shaded hbase-client hbase-server hbase-thrift U: . 
| | Console output | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-2077/1/console | | versions | git=2.17.1 maven=(cecedd343002696d0abb50b32b541b8a6ba2883f) | | Powered by | Apache Yetus 0.11.1 https://yetus.apache.org | This message was automatically generated. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [hbase] Apache-HBase commented on pull request #2071: HBASE-24743 Reject to add a peer which replicate to itself earlier
Apache-HBase commented on pull request #2071: URL: https://github.com/apache/hbase/pull/2071#issuecomment-659216624 :broken_heart: **-1 overall** | Vote | Subsystem | Runtime | Comment | |::|--:|:|:| | +0 :ok: | reexec | 0m 31s | Docker mode activated. | | -0 :warning: | yetus | 0m 3s | Unprocessed flag(s): --brief-report-file --spotbugs-strict-precheck --whitespace-eol-ignore-list --whitespace-tabs-ignore-list --quick-hadoopcheck | ||| _ Prechecks _ | ||| _ master Compile Tests _ | | +1 :green_heart: | mvninstall | 3m 24s | master passed | | +1 :green_heart: | compile | 0m 55s | master passed | | +1 :green_heart: | shadedjars | 5m 39s | branch has no errors when building our shaded downstream artifacts. | | +1 :green_heart: | javadoc | 0m 39s | master passed | ||| _ Patch Compile Tests _ | | +1 :green_heart: | mvninstall | 3m 29s | the patch passed | | +1 :green_heart: | compile | 0m 54s | the patch passed | | +1 :green_heart: | javac | 0m 54s | the patch passed | | +1 :green_heart: | shadedjars | 5m 36s | patch has no errors when building our shaded downstream artifacts. | | +1 :green_heart: | javadoc | 0m 36s | the patch passed | ||| _ Other Tests _ | | -1 :x: | unit | 146m 46s | hbase-server in the patch failed. 
| | | | 170m 25s | | | Subsystem | Report/Notes | |--:|:-| | Docker | Client=19.03.12 Server=19.03.12 base: https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-2071/3/artifact/yetus-jdk8-hadoop3-check/output/Dockerfile | | GITHUB PR | https://github.com/apache/hbase/pull/2071 | | Optional Tests | javac javadoc unit shadedjars compile | | uname | Linux e34379920a0a 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | dev-support/hbase-personality.sh | | git revision | master / 2505c7760d | | Default Java | 1.8.0_232 | | unit | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-2071/3/artifact/yetus-jdk8-hadoop3-check/output/patch-unit-hbase-server.txt | | Test Results | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-2071/3/testReport/ | | Max. process+thread count | 4644 (vs. ulimit of 12500) | | modules | C: hbase-server U: hbase-server | | Console output | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-2071/3/console | | versions | git=2.17.1 maven=(cecedd343002696d0abb50b32b541b8a6ba2883f) | | Powered by | Apache Yetus 0.11.1 https://yetus.apache.org | This message was automatically generated. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Commented] (HBASE-24376) MergeNormalizer is merging non-adjacent regions and causing region overlaps/holes.
[ https://issues.apache.org/jira/browse/HBASE-24376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17158914#comment-17158914 ] Ruslan Sabitov commented on HBASE-24376: [~huaxiangsun] Thank you for your reply. Here is another example: 2020-06-04 17:31:59,611 INFO org.apache.hadoop.hbase.master.normalizer.MergeNormalizationPlan: Executing merging normalization plan: MergeNormalizationPlan{firstRegion={ENCODED => 5d2d90c3697145c15bc2dc1841bbd140, NAME => 'tableName,-7-7d--TPxHgzWi3x6Nw,1591257734317.5d2d90c3697145c15bc2dc1841bbd140.', STARTKEY => '-7-7d--TPxHgzWi3x6Nw', ENDKEY => '-F--'}, secondRegion={ENCODED => 5efe3811574f9f390d4cc8fc6099aa18, NAME => 'tableName,-Myro7PP4ncDkwLSQqVv,1591257751881.5efe3811574f9f390d4cc8fc6099aa18.', STARTKEY => '-Myro7PP4ncDkwLSQqVv', ENDKEY => '-V–'}} I parsed the HBase Master log and collected all lines that contain the text MergeNormalizationPlan and where r1's ENDKEY is not equal to r2's STARTKEY: [https://pastebin.com/5wqr28br] I'd like to add that this table becomes inconsistent each time I enable normalization for it. Do you have any ideas on how to check whether the problem is in the normalization? > MergeNormalizer is merging non-adjacent regions and causing region > overlaps/holes. > -- > > Key: HBASE-24376 > URL: https://issues.apache.org/jira/browse/HBASE-24376 > Project: HBase > Issue Type: Bug > Components: master >Affects Versions: 2.3.0 >Reporter: Huaxiang Sun >Assignee: Huaxiang Sun >Priority: Critical > Fix For: 3.0.0-alpha-1, 2.3.0 > > > Currently, we found that the normalizer was merging regions which are non-adjacent, which > causes inconsistencies in the cluster. 
> {code:java} > 439055 2020-05-08 17:47:09,814 INFO > org.apache.hadoop.hbase.master.normalizer.MergeNormalizationPlan: Executing > merging normalization plan: MergeNormalizationPlan{firstRegion={ENCODED => > 47fe236a5e3649ded95cb64ad0c08492, NAME => > 'TABLE,\x03\x01\x05\x01\x04\x02,1554838974870.47fe236a5e3649ded95cb64ad0c08492.', STARTKEY => '\x03\x01\x05\x01\x04\x02', ENDKEY => > '\x03\x01\x05\x01\x04\x02\x01\x02\x02201904082200\x00\x00\x03Mac\x00\x00\x00\x00\x00\x00\x00\x00\x00iMac13,1\x00\x00\x00\x00\x00\x049.3-14E260\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x02\x05'}, > secondRegion={ENCODED => 0c0f2aa67f4329d5c48ba0320f173d31, NAME => > 'TABLE,\x03\x01\x05\x02\x01\x01,1554830735526.0c0f2aa67f4329d5c48ba0320f173d31.', > STARTKEY => '\x03\x01\x05\x02\x01\x01', ENDKEY => > '\x03\x01\x05\x02\x01\x02'}} > 439056 2020-05-08 17:47:11,438 INFO org.apache.hadoop.hbase.ScheduledChore: > CatalogJanitor-*:16000 average execution time: 1676219193 ns. > 439057 2020-05-08 17:47:11,730 INFO org.apache.hadoop.hbase.master.HMaster: > Client=null/null merge regions [47fe236a5e3649ded95cb64ad0c08492], > [0c0f2aa67f4329d5c48ba0320f173d31] > {code} > > The root cause is that getMergeNormalizationPlan() uses a list of regionInfo > which is ordered by regionName. regionName does not necessarily guarantee the > order of STARTKEY (for example, given start keys 'aa1' and 'aa1!', ordering by regionName puts > 'aa1!' before 'aa1'). This results in the normalizer merging > non-adjacent regions into one and creating overlaps. This is not an issue in > branch-1, as the list is already ordered by RegionInfo.COMPARATOR in the > normalizer. > -- This message was sent by Atlassian Jira (v8.3.4#803005)
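The 'aa1' versus 'aa1!' ordering described in HBASE-24376 can be checked with plain strings. The sketch below is a simplification under stated assumptions: RegionOrderSketch and its Region class are hypothetical, region names are modelled as "table,startKey,timestamp" strings, and String comparison stands in for HBase's byte-array comparators (the two agree for the ASCII keys used here).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical model: shows that name order and start-key order can disagree.
final class RegionOrderSketch {

  static final class Region {
    final String table;
    final String startKey;
    final String endKey;
    final long timestamp;

    Region(String table, String startKey, String endKey, long timestamp) {
      this.table = table;
      this.startKey = startKey;
      this.endKey = endKey;
      this.timestamp = timestamp;
    }

    // Simplified region name: table, start key, and timestamp joined by commas.
    String name() {
      return table + "," + startKey + "," + timestamp;
    }
  }

  // Two regions are adjacent when the first ends where the second starts.
  static boolean adjacent(Region first, Region second) {
    return first.endKey.equals(second.startKey);
  }

  static List<Region> sortedByName(List<Region> regions) {
    List<Region> copy = new ArrayList<>(regions);
    copy.sort(Comparator.comparing(Region::name));
    return copy;
  }

  static List<Region> sortedByStartKey(List<Region> regions) {
    List<Region> copy = new ArrayList<>(regions);
    copy.sort(Comparator.comparing((Region r) -> r.startKey));
    return copy;
  }

  public static void main(String[] args) {
    List<Region> regions = Arrays.asList(
        new Region("t", "aa1", "aa1!", 1L),
        new Region("t", "aa1!", "aa2", 1L));
    List<Region> byName = sortedByName(regions);
    List<Region> byKey = sortedByStartKey(regions);
    System.out.println("by name:      " + byName.get(0).startKey + ", " + byName.get(1).startKey
        + " adjacent=" + adjacent(byName.get(0), byName.get(1)));
    System.out.println("by start key: " + byKey.get(0).startKey + ", " + byKey.get(1).startKey
        + " adjacent=" + adjacent(byKey.get(0), byKey.get(1)));
  }
}
```

The disagreement comes from the separator: in the name, the comma after 'aa1' is compared against '!', and '!' (0x21) sorts before ',' (0x2C), so the region starting at 'aa1!' comes first. A plan that pairs neighbours from the name-ordered list would therefore pair regions whose end and start keys do not meet.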