[jira] [Commented] (HBASE-21356) bulkLoadHFile API should ensure that rs has the source hfile's write permission
[ https://issues.apache.org/jira/browse/HBASE-21356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16659643#comment-16659643 ] Umesh Agashe commented on HBASE-21356: -- +1, lgtm. A few comments left on RB. > bulkLoadHFile API should ensure that rs has the source hfile's write > permission > --- > > Key: HBASE-21356 > URL: https://issues.apache.org/jira/browse/HBASE-21356 > Project: HBase > Issue Type: Bug >Reporter: Zheng Hu >Assignee: Zheng Hu >Priority: Major > Fix For: 3.0.0, 2.2.0, 2.0.3, 2.1.2 > > Attachments: HBASE-21356.v1.patch > > > If the rs bulk loads an HFile that it has no write permission on, it can still read & > compact the hfile. But once the compaction finishes and the HFile is > moved to the archive directory, the HFileCleaner has no permission to delete it, > so the HFile stays in HDFS indefinitely. > The server side needs to check the file's write permission when bulkLoadHFile runs > and reject the request if the rs has no write permission. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
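The check proposed in HBASE-21356 can be sketched roughly as follows. This is a minimal, self-contained illustration and not the patch itself: `HFileMeta`, `canWrite` and `checkBeforeBulkLoad` are hypothetical names, and the POSIX-style owner/group/other mode bits are a simplified stand-in for whatever permission check the region server actually performs against HDFS.

```java
import java.util.Arrays;

final class BulkLoadPermissionCheck {
    /** Hypothetical stand-in for an HFile's owner, group and POSIX-style mode bits. */
    static final class HFileMeta {
        final String owner;
        final String group;
        final int mode; // octal, e.g. 0644

        HFileMeta(String owner, String group, int mode) {
            this.owner = owner;
            this.group = group;
            this.mode = mode;
        }
    }

    private static final int WRITE_BIT = 2;

    /** Returns true if the user may write the file under owner/group/other rules. */
    static boolean canWrite(HFileMeta file, String user, String... groups) {
        int bits;
        if (user.equals(file.owner)) {
            bits = (file.mode >> 6) & 7;          // owner triplet
        } else if (Arrays.asList(groups).contains(file.group)) {
            bits = (file.mode >> 3) & 7;          // group triplet
        } else {
            bits = file.mode & 7;                 // other triplet
        }
        return (bits & WRITE_BIT) != 0;
    }

    /** Rejects the bulk load up front instead of leaving an undeletable file in the archive. */
    static void checkBeforeBulkLoad(HFileMeta file, String rsUser, String... rsGroups) {
        if (!canWrite(file, rsUser, rsGroups)) {
            throw new SecurityException(
                "user '" + rsUser + "' has no write permission on the source HFile; rejecting bulk load");
        }
    }

    public static void main(String[] args) {
        HFileMeta hfile = new HFileMeta("etl", "etl", 0644);
        System.out.println(canWrite(hfile, "hbase"));        // false: other bits are r--
        System.out.println(canWrite(hfile, "hbase", "etl")); // false: group bits are r--
        System.out.println(canWrite(new HFileMeta("etl", "etl", 0664), "hbase", "etl")); // true: group bits are rw-
    }
}
```

The point of rejecting early is that the failure surfaces to the client that issued the bulk load, rather than showing up much later as a file the cleaner cannot remove.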
[jira] [Commented] (HBASE-21214) [hbck2] setTableState just sets hbase:meta state, not in-memory state
[ https://issues.apache.org/jira/browse/HBASE-21214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16624218#comment-16624218 ] Umesh Agashe commented on HBASE-21214: -- +1 lgtm > [hbck2] setTableState just sets hbase:meta state, not in-memory state > - > > Key: HBASE-21214 > URL: https://issues.apache.org/jira/browse/HBASE-21214 > Project: HBase > Issue Type: Sub-task > Components: amv2, hbck2 >Reporter: stack >Assignee: stack >Priority: Major > Fix For: 2.1.1 > > Attachments: 21214.patch, HBASE-21214.master.001.patch > > > Means that we have to go get another Master to see the table state change > because in-memory state is still pegged at the old value. > TODO: Check the is_enabled/is_disabled shell commands to make sure they are > reading from the right place. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
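The failure mode described in HBASE-21214 can be illustrated with a small sketch. It assumes a simple model in which table state is persisted in one map (standing in for hbase:meta) and cached in another (standing in for the Master's in-memory state); `TableStateDemo` and its method names are hypothetical and say nothing about how the actual fix is implemented.

```java
import java.util.HashMap;
import java.util.Map;

final class TableStateDemo {
    enum State { ENABLED, DISABLED }

    // Stand-in for the state persisted in hbase:meta.
    private final Map<String, State> meta = new HashMap<>();
    // Stand-in for the Master's in-memory view of table state.
    private final Map<String, State> cache = new HashMap<>();

    /** The behaviour the issue describes: only the persisted copy changes. */
    void setStateMetaOnly(String table, State state) {
        meta.put(table, state);
    }

    /** Write-through variant: persist, then refresh the in-memory copy. */
    void setState(String table, State state) {
        meta.put(table, state);
        cache.put(table, state);
    }

    /** Reads are served from memory, loading from meta only on a miss. */
    State getState(String table) {
        return cache.computeIfAbsent(table, meta::get);
    }

    public static void main(String[] args) {
        TableStateDemo store = new TableStateDemo();
        store.setState("t1", State.ENABLED);
        store.setStateMetaOnly("t1", State.DISABLED);
        System.out.println(store.getState("t1")); // ENABLED: the in-memory value is stale
        store.setState("t1", State.DISABLED);
        System.out.println(store.getState("t1")); // DISABLED
    }
}
```

In this model the stale read persists until something rebuilds the in-memory map, which matches the report that a different Master is needed to see the change.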
[jira] [Commented] (HBASE-20941) Create and implement HbckService in master
[ https://issues.apache.org/jira/browse/HBASE-20941?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16624213#comment-16624213 ] Umesh Agashe commented on HBASE-20941: -- bq. It does not update the in-memory state in the Master [~stack], the changes for this JIRA were committed on August 7, and HBASE-21025 then added a cache for TableState. > Create and implement HbckService in master > -- > > Key: HBASE-20941 > URL: https://issues.apache.org/jira/browse/HBASE-20941 > Project: HBase > Issue Type: Sub-task >Reporter: Umesh Agashe >Assignee: Umesh Agashe >Priority: Major > Fix For: 3.0.0, 2.2.0, 2.1.1, 2.0.2 > > Attachments: hbase-20941.master.001.patch, > hbase-20941.master.002.patch, hbase-20941.master.003.patch, > hbase-20941.master.004.patch, hbase-20941.master.004.patch, > hbase-20941.master.004.patch > > > Create HbckService in master and implement the following methods: > # setTableState(): If a table's state is inconsistent with the actions/procedures > working on it, manipulating its state in meta sometimes fixes things. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Comment Edited] (HBASE-21023) Add bypassProcedureToCompletion() API to HbckService
[ https://issues.apache.org/jira/browse/HBASE-21023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16619828#comment-16619828 ] Umesh Agashe edited comment on HBASE-21023 at 9/18/18 10:43 PM: ah! looking... Thanks [~stack]! was (Author: uagashe): ah! looking... > Add bypassProcedureToCompletion() API to HbckService > > > Key: HBASE-21023 > URL: https://issues.apache.org/jira/browse/HBASE-21023 > Project: HBase > Issue Type: Sub-task > Components: hbck2 >Affects Versions: 2.0.1 >Reporter: Umesh Agashe >Assignee: Umesh Agashe >Priority: Major > Fix For: 2.2.0 > > Attachments: hbase-21023.master.001.patch, > hbase-21023.master.001.patch, hbase-21023.master.002.patch > > > completeProcedure/s(): some procedures do not support abort at every step. > When these procedures get stuck then they can not be aborted or make further > progress. Corrective action is to bypass these procedures to completion. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-21023) Add bypassProcedureToCompletion() API to HbckService
[ https://issues.apache.org/jira/browse/HBASE-21023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16619828#comment-16619828 ] Umesh Agashe commented on HBASE-21023: -- ah! looking... > Add bypassProcedureToCompletion() API to HbckService > > > Key: HBASE-21023 > URL: https://issues.apache.org/jira/browse/HBASE-21023 > Project: HBase > Issue Type: Sub-task > Components: hbck2 >Affects Versions: 2.0.1 >Reporter: Umesh Agashe >Assignee: Umesh Agashe >Priority: Major > Fix For: 2.2.0 > > Attachments: hbase-21023.master.001.patch, > hbase-21023.master.001.patch, hbase-21023.master.002.patch > > > completeProcedure/s(): some procedures do not support abort at every step. > When these procedures get stuck then they can not be aborted or make further > progress. Corrective action is to bypass these procedures to completion. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-21169) Initiate hbck2 tool in hbase-operator-tools repo
[ https://issues.apache.org/jira/browse/HBASE-21169?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16619789#comment-16619789 ] Umesh Agashe commented on HBASE-21169: -- {quote}Maybe we need to introduce a new HBaseInterfaceAudience called HBCK to indicate that this tool is used by HBCK2? {quote} The patch for HBASE-20941 already adds this. It's used as: {code:java} @InterfaceAudience.LimitedPrivate(HBaseInterfaceAudience.HBCK) {code} > Initiate hbck2 tool in hbase-operator-tools repo > > > Key: HBASE-21169 > URL: https://issues.apache.org/jira/browse/HBASE-21169 > Project: HBase > Issue Type: Sub-task > Components: hbck2 >Affects Versions: 2.1.0 >Reporter: Umesh Agashe >Assignee: stack >Priority: Major > Attachments: hbase-21169.master.001.patch > > > Create hbck2 tool in hbase-operator-tools > (https://github.com/apache/hbase-operator-tools.git) repo. This is not > intended to be a complete tool, only the initial changes: usage, ability to > connect to the server, logging, use of the newly added HbckService, etc. Code > changes to address specific use cases can be added later as the tool evolves. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
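To show what an audience annotation like the one quoted above conveys, here is a small self-contained sketch. It deliberately does not depend on the real annotation classes: `LimitedPrivate`, `Audience` and `HbckLike` below are local, hypothetical stand-ins, and the runtime retention and the "HBCK" string value are assumptions made for the demo.

```java
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.util.Arrays;

final class AudienceDemo {
    /** Local stand-in for an InterfaceAudience.LimitedPrivate-style annotation. */
    @Retention(RetentionPolicy.RUNTIME)
    @interface LimitedPrivate {
        String[] value();
    }

    /** Local stand-in for a holder of audience constants. */
    static final class Audience {
        static final String HBCK = "HBCK";

        private Audience() {
        }
    }

    /** An interface marked as intended only for the HBCK audience. */
    @LimitedPrivate(Audience.HBCK)
    interface HbckLike {
    }

    /** Returns true if the type declares itself limited to the given audience. */
    static boolean isLimitedTo(Class<?> type, String audience) {
        LimitedPrivate marker = type.getAnnotation(LimitedPrivate.class);
        return marker != null && Arrays.asList(marker.value()).contains(audience);
    }

    public static void main(String[] args) {
        System.out.println(isLimitedTo(HbckLike.class, Audience.HBCK)); // true
        System.out.println(isLimitedTo(String.class, Audience.HBCK));   // false
    }
}
```

The annotation documents intent for readers and tooling; it does not by itself stop other callers from using the interface.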
[jira] [Commented] (HBASE-21023) Add bypassProcedureToCompletion() API to HbckService
[ https://issues.apache.org/jira/browse/HBASE-21023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16615446#comment-16615446 ] Umesh Agashe commented on HBASE-21023: -- Attached patch 002 with changes as per review comments. > Add bypassProcedureToCompletion() API to HbckService > > > Key: HBASE-21023 > URL: https://issues.apache.org/jira/browse/HBASE-21023 > Project: HBase > Issue Type: Sub-task > Components: hbck2 >Affects Versions: 2.0.1 >Reporter: Umesh Agashe >Assignee: Umesh Agashe >Priority: Major > Fix For: 2.2.0 > > Attachments: hbase-21023.master.001.patch, > hbase-21023.master.001.patch, hbase-21023.master.002.patch > > > completeProcedure/s(): some procedures do not support abort at every step. > When these procedures get stuck then they can not be aborted or make further > progress. Corrective action is to bypass these procedures to completion. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HBASE-21023) Add bypassProcedureToCompletion() API to HbckService
[ https://issues.apache.org/jira/browse/HBASE-21023?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Umesh Agashe updated HBASE-21023: - Attachment: hbase-21023.master.002.patch > Add bypassProcedureToCompletion() API to HbckService > > > Key: HBASE-21023 > URL: https://issues.apache.org/jira/browse/HBASE-21023 > Project: HBase > Issue Type: Sub-task > Components: hbck2 >Affects Versions: 2.0.1 >Reporter: Umesh Agashe >Assignee: Umesh Agashe >Priority: Major > Fix For: 2.2.0 > > Attachments: hbase-21023.master.001.patch, > hbase-21023.master.001.patch, hbase-21023.master.002.patch > > > completeProcedure/s(): some procedures do not support abort at every step. > When these procedures get stuck then they can not be aborted or make further > progress. Corrective action is to bypass these procedures to completion. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HBASE-21169) Initiate hbck2 tool in hbase-operator-tools repo
[ https://issues.apache.org/jira/browse/HBASE-21169?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Umesh Agashe updated HBASE-21169: - Attachment: hbase-21169.master.001.patch > Initiate hbck2 tool in hbase-operator-tools repo > > > Key: HBASE-21169 > URL: https://issues.apache.org/jira/browse/HBASE-21169 > Project: HBase > Issue Type: Sub-task > Components: hbck2 >Affects Versions: 2.1.0 >Reporter: Umesh Agashe >Assignee: stack >Priority: Major > Attachments: hbase-21169.master.001.patch > > > Create hbck2 tool in hbase-operator-tools > (https://github.com/apache/hbase-operator-tools.git) repo. This is not > intended to be a complete tool, only the initial changes: usage, ability to > connect to the server, logging, use of the newly added HbckService, etc. Code > changes to address specific use cases can be added later as the tool evolves. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-21169) Initiate hbck2 tool in hbase-operator-tools repo
[ https://issues.apache.org/jira/browse/HBASE-21169?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16615245#comment-16615245 ] Umesh Agashe commented on HBASE-21169: -- "working" is vague in the comment above. To be clear, I have some code changes; I am not working on the doc. Thanks for the doc link [~stack]! > Initiate hbck2 tool in hbase-operator-tools repo > > > Key: HBASE-21169 > URL: https://issues.apache.org/jira/browse/HBASE-21169 > Project: HBase > Issue Type: Sub-task > Components: hbck2 >Affects Versions: 2.1.0 >Reporter: Umesh Agashe >Assignee: stack >Priority: Major > > Create hbck2 tool in hbase-operator-tools > (https://github.com/apache/hbase-operator-tools.git) repo. This is not > intended to be a complete tool, only the initial changes: usage, ability to > connect to the server, logging, use of the newly added HbckService, etc. Code > changes to address specific use cases can be added later as the tool evolves. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-21169) Initiate hbck2 tool in hbase-operator-tools repo
[ https://issues.apache.org/jira/browse/HBASE-21169?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16615158#comment-16615158 ] Umesh Agashe commented on HBASE-21169: -- [~stack], I was working on it and thought I had assigned it to myself to indicate that. Let's talk about it. > Initiate hbck2 tool in hbase-operator-tools repo > > > Key: HBASE-21169 > URL: https://issues.apache.org/jira/browse/HBASE-21169 > Project: HBase > Issue Type: Sub-task > Components: hbck2 >Affects Versions: 2.1.0 >Reporter: Umesh Agashe >Assignee: stack >Priority: Major > > Create hbck2 tool in hbase-operator-tools > (https://github.com/apache/hbase-operator-tools.git) repo. This is not > intended to be a complete tool, only the initial changes: usage, ability to > connect to the server, logging, use of the newly added HbckService, etc. Code > changes to address specific use cases can be added later as the tool evolves. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HBASE-21169) Initiate hbck2 tool in hbase-operator-tools repo
Umesh Agashe created HBASE-21169: Summary: Initiate hbck2 tool in hbase-operator-tools repo Key: HBASE-21169 URL: https://issues.apache.org/jira/browse/HBASE-21169 Project: HBase Issue Type: Sub-task Components: hbck2 Affects Versions: 2.1.0 Reporter: Umesh Agashe Assignee: Umesh Agashe Create hbck2 tool in hbase-operator-tools (https://github.com/apache/hbase-operator-tools.git) repo. This is not intended to be a complete tool, only the initial changes: usage, ability to connect to the server, logging, use of the newly added HbckService, etc. Code changes to address specific use cases can be added later as the tool evolves. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-21023) Add bypassProcedureToCompletion() API to HbckService
[ https://issues.apache.org/jira/browse/HBASE-21023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16606560#comment-16606560 ] Umesh Agashe commented on HBASE-21023: -- Retrying; the errors don't look related. > Add bypassProcedureToCompletion() API to HbckService > > > Key: HBASE-21023 > URL: https://issues.apache.org/jira/browse/HBASE-21023 > Project: HBase > Issue Type: Sub-task > Components: hbck2 >Affects Versions: 2.0.1 >Reporter: Umesh Agashe >Assignee: Umesh Agashe >Priority: Major > Fix For: 2.2.0 > > Attachments: hbase-21023.master.001.patch, > hbase-21023.master.001.patch > > > completeProcedure/s(): some procedures do not support abort at every step. > When these procedures get stuck then they can not be aborted or make further > progress. Corrective action is to bypass these procedures to completion. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HBASE-21023) Add bypassProcedureToCompletion() API to HbckService
[ https://issues.apache.org/jira/browse/HBASE-21023?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Umesh Agashe updated HBASE-21023: - Attachment: hbase-21023.master.001.patch > Add bypassProcedureToCompletion() API to HbckService > > > Key: HBASE-21023 > URL: https://issues.apache.org/jira/browse/HBASE-21023 > Project: HBase > Issue Type: Sub-task > Components: hbck2 >Affects Versions: 2.0.1 >Reporter: Umesh Agashe >Assignee: Umesh Agashe >Priority: Major > Fix For: 2.2.0 > > Attachments: hbase-21023.master.001.patch, > hbase-21023.master.001.patch > > > completeProcedure/s(): some procedures do not support abort at every step. > When these procedures get stuck then they can not be aborted or make further > progress. Corrective action is to bypass these procedures to completion. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HBASE-21023) Add bypassProcedureToCompletion() API to HbckService
[ https://issues.apache.org/jira/browse/HBASE-21023?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Umesh Agashe updated HBASE-21023: - Status: Patch Available (was: In Progress) > Add bypassProcedureToCompletion() API to HbckService > > > Key: HBASE-21023 > URL: https://issues.apache.org/jira/browse/HBASE-21023 > Project: HBase > Issue Type: Sub-task > Components: hbck2 >Affects Versions: 2.0.1 >Reporter: Umesh Agashe >Assignee: Umesh Agashe >Priority: Major > Fix For: 2.2.0 > > Attachments: hbase-21023.master.001.patch > > > completeProcedure/s(): some procedures do not support abort at every step. > When these procedures get stuck then they can not be aborted or make further > progress. Corrective action is to bypass these procedures to completion. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-21023) Add bypassProcedureToCompletion() API to HbckService
[ https://issues.apache.org/jira/browse/HBASE-21023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16606212#comment-16606212 ] Umesh Agashe commented on HBASE-21023: -- Patch adds API for bypassing procedure to completion for clients to use. See comments above from [~stack] and [~allan163] regarding choice of client and how it should use the API. > Add bypassProcedureToCompletion() API to HbckService > > > Key: HBASE-21023 > URL: https://issues.apache.org/jira/browse/HBASE-21023 > Project: HBase > Issue Type: Sub-task > Components: hbck2 >Affects Versions: 2.0.1 >Reporter: Umesh Agashe >Assignee: Umesh Agashe >Priority: Major > Fix For: 2.2.0 > > Attachments: hbase-21023.master.001.patch > > > completeProcedure/s(): some procedures do not support abort at every step. > When these procedures get stuck then they can not be aborted or make further > progress. Corrective action is to bypass these procedures to completion. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work started] (HBASE-21023) Add bypassProcedureToCompletion() API to HbckService
[ https://issues.apache.org/jira/browse/HBASE-21023?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on HBASE-21023 started by Umesh Agashe. > Add bypassProcedureToCompletion() API to HbckService > > > Key: HBASE-21023 > URL: https://issues.apache.org/jira/browse/HBASE-21023 > Project: HBase > Issue Type: Sub-task > Components: hbck2 >Affects Versions: 2.0.1 >Reporter: Umesh Agashe >Assignee: Umesh Agashe >Priority: Major > Fix For: 2.2.0 > > Attachments: hbase-21023.master.001.patch > > > completeProcedure/s(): some procedures do not support abort at every step. > When these procedures get stuck then they can not be aborted or make further > progress. Corrective action is to bypass these procedures to completion. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HBASE-21023) Add bypassProcedureToCompletion() API to HbckService
[ https://issues.apache.org/jira/browse/HBASE-21023?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Umesh Agashe updated HBASE-21023: - Attachment: hbase-21023.master.001.patch > Add bypassProcedureToCompletion() API to HbckService > > > Key: HBASE-21023 > URL: https://issues.apache.org/jira/browse/HBASE-21023 > Project: HBase > Issue Type: Sub-task > Components: hbck2 >Affects Versions: 2.0.1 >Reporter: Umesh Agashe >Assignee: Umesh Agashe >Priority: Major > Fix For: 2.2.0 > > Attachments: hbase-21023.master.001.patch > > > completeProcedure/s(): some procedures do not support abort at every step. > When these procedures get stuck then they can not be aborted or make further > progress. Corrective action is to bypass these procedures to completion. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HBASE-21023) Add bypassProcedureToCompletion() API to HbckService
[ https://issues.apache.org/jira/browse/HBASE-21023?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Umesh Agashe updated HBASE-21023: - Summary: Add bypassProcedureToCompletion() API to HbckService (was: Add completeProcedure/s() API to HbckService) > Add bypassProcedureToCompletion() API to HbckService > > > Key: HBASE-21023 > URL: https://issues.apache.org/jira/browse/HBASE-21023 > Project: HBase > Issue Type: Sub-task > Components: hbck2 >Affects Versions: 2.0.1 >Reporter: Umesh Agashe >Assignee: Umesh Agashe >Priority: Major > Fix For: 2.2.0 > > > completeProcedure/s(): some procedures do not support abort at every step. > When these procedures get stuck then they can not be aborted or make further > progress. Corrective action is to bypass these procedures to completion. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-21023) Add completeProcedure/s() API to HbckService
[ https://issues.apache.org/jira/browse/HBASE-21023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16597822#comment-16597822 ] Umesh Agashe commented on HBASE-21023: -- HBASE-21083 adds the ability to bypass procedures to completion, which can be used as an alternative to purging procedures. The subject and description of this Jira are therefore changed to completeProcedure/s(), which will bypass the procedure/s and their parents to completion without doing the actual work. Operators can use this from hbck2 to unstick stuck procedures. > Add completeProcedure/s() API to HbckService > > > Key: HBASE-21023 > URL: https://issues.apache.org/jira/browse/HBASE-21023 > Project: HBase > Issue Type: Sub-task > Components: hbck2 >Affects Versions: 2.0.1 >Reporter: Umesh Agashe >Assignee: Umesh Agashe >Priority: Major > Fix For: 2.2.0 > > > completeProcedure/s(): some procedures do not support abort at every step. > When these procedures get stuck then they can not be aborted or make further > progress. Corrective action is to bypass these procedures to completion. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HBASE-21023) Add completeProcedure/s() API to HbckService
[ https://issues.apache.org/jira/browse/HBASE-21023?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Umesh Agashe updated HBASE-21023: - Description: completeProcedure/s(): some procedures do not support abort at every step. When these procedures get stuck then they can not be aborted or make further progress. Corrective action is to bypass these procedures to completion. (was: purgeProcedure/s(): some procedures do not support abort at every step. When these procedures get stuck then they can not be aborted or make further progress. Corrective action is to purge these procedures from ProcWAL. Provide option to purge sub-procedures as well.) > Add completeProcedure/s() API to HbckService > > > Key: HBASE-21023 > URL: https://issues.apache.org/jira/browse/HBASE-21023 > Project: HBase > Issue Type: Sub-task > Components: hbck2 >Affects Versions: 2.0.1 >Reporter: Umesh Agashe >Assignee: Umesh Agashe >Priority: Major > Fix For: 2.2.0 > > > completeProcedure/s(): some procedures do not support abort at every step. > When these procedures get stuck then they can not be aborted or make further > progress. Corrective action is to bypass these procedures to completion. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HBASE-21023) Add completeProcedure/s() API to HbckService
[ https://issues.apache.org/jira/browse/HBASE-21023?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Umesh Agashe updated HBASE-21023: - Summary: Add completeProcedure/s() API to HbckService (was: Add purgeProcedure/s() API to HbckService) > Add completeProcedure/s() API to HbckService > > > Key: HBASE-21023 > URL: https://issues.apache.org/jira/browse/HBASE-21023 > Project: HBase > Issue Type: Sub-task > Components: hbck2 >Affects Versions: 2.0.1 >Reporter: Umesh Agashe >Assignee: Umesh Agashe >Priority: Major > Fix For: 2.2.0 > > > purgeProcedure/s(): some procedures do not support abort at every step. When > these procedures get stuck then they can not be aborted or make further > progress. Corrective action is to purge these procedures from ProcWAL. > Provide option to purge sub-procedures as well. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Comment Edited] (HBASE-21083) Introduce a mechanism to bypass the execution of a stuck procedure
[ https://issues.apache.org/jira/browse/HBASE-21083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16597814#comment-16597814 ] Umesh Agashe edited comment on HBASE-21083 at 8/30/18 7:20 PM: --- [~stack], can this be committed to master as well? was (Author: uagashe): @stack, can this be committed to master as well? > Introduce a mechanism to bypass the execution of a stuck procedure > -- > > Key: HBASE-21083 > URL: https://issues.apache.org/jira/browse/HBASE-21083 > Project: HBase > Issue Type: Sub-task > Components: amv2 >Affects Versions: 2.1.0, 2.0.1 >Reporter: Allan Yang >Assignee: Allan Yang >Priority: Major > Fix For: 3.0.0, 2.1.1, 2.0.2 > > Attachments: HBASE-21083.branch-2.0.001.patch, > HBASE-21083.branch-2.0.002.patch, HBASE-21083.branch-2.0.003.patch, > HBASE-21083.branch-2.0.003.patch, HBASE-21083.branch-2.0.003.patch, > HBASE-21083.branch-2.1.001.patch > > > Discussed offline with [~stack] and [~Apache9]. We all agreed that we need to > introduce a mechanism to 'force complete' a stuck procedure, so that AMv2 can > continue running. > We still have undiscovered bugs hiding in our AMv2 and procedureV2 > system, so we need a way to intervene on stuck procedures before HBCK2 is > ready. This is crucial for a production-ready system. > For now, we have few ways to intervene on running procedures. Aborting > them is not a good choice, since some procedures are not abortable, and some > may have overridden the abort() method so that it ignores the abort > request. > So here I introduce a mechanism to bypass the execution of a stuck > procedure. > Basically, I added a field called 'bypass' to the Procedure class. If this > field is set to true, all the logic in execute/rollback is skipped, letting > the procedure and its ancestors complete normally and finally release their lock > resources. > Note that bypassing a procedure may leave the cluster in an intermediate state, > e.g. a region not assigned, or some HDFS files left behind. > Operators need to know the side effects of bypassing and recover the > inconsistent state of the cluster themselves, for example by issuing new procedures to > assign the regions. > A patch will be uploaded and a Review Board request opened. For now, only APIs in > ProcedureExecutor are provided. If everything is fine, I will add it to the master > service and add a shell command to bypass a procedure. Or maybe we can use > dynamically compiled JSPs to execute those APIs, as mentioned in HBASE-20679. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
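The 'bypass' idea from the HBASE-21083 description can be sketched in a few lines. This is a toy model under stated assumptions, not HBase's Procedure or ProcedureExecutor: `Proc`, `bypass` and `runToCompletion` are hypothetical names, rollback is left out, and marking the ancestors as bypassed follows the reading in the HBASE-21023 comment that parents are also completed without doing actual work.

```java
final class BypassDemo {
    /** Hypothetical, much-simplified procedure with an optional parent. */
    static final class Proc {
        final String name;
        final Proc parent;   // null for a root procedure
        final Runnable work; // the real execute logic
        boolean bypass;
        boolean done;
        boolean holdsLock = true;

        Proc(String name, Proc parent, Runnable work) {
            this.name = name;
            this.parent = parent;
            this.work = work;
        }
    }

    /** Marks the procedure and its ancestors so that their work is skipped. */
    static void bypass(Proc proc) {
        for (Proc cur = proc; cur != null; cur = cur.parent) {
            cur.bypass = true;
        }
    }

    /** Completes the procedure, then each ancestor, releasing locks as it goes. */
    static void runToCompletion(Proc proc) {
        for (Proc cur = proc; cur != null; cur = cur.parent) {
            if (!cur.bypass) {
                cur.work.run(); // skipped entirely for bypassed procedures
            }
            cur.done = true;
            cur.holdsLock = false;
        }
    }

    public static void main(String[] args) {
        Runnable stuck = () -> {
            throw new IllegalStateException("stuck");
        };
        Proc parent = new Proc("parent", null, stuck);
        Proc child = new Proc("child", parent, stuck);
        bypass(child);
        runToCompletion(child);
        System.out.println(child.done + " " + parent.done + " " + child.holdsLock); // true true false
    }
}
```

Nothing in the sketch repairs whatever the skipped work was supposed to do, which is the side effect the description warns operators about.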
[jira] [Commented] (HBASE-21083) Introduce a mechanism to bypass the execution of a stuck procedure
[ https://issues.apache.org/jira/browse/HBASE-21083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16597814#comment-16597814 ] Umesh Agashe commented on HBASE-21083: -- @stack, can this be committed to master as well? > Introduce a mechanism to bypass the execution of a stuck procedure > -- > > Key: HBASE-21083 > URL: https://issues.apache.org/jira/browse/HBASE-21083 > Project: HBase > Issue Type: Sub-task > Components: amv2 >Affects Versions: 2.1.0, 2.0.1 >Reporter: Allan Yang >Assignee: Allan Yang >Priority: Major > Fix For: 3.0.0, 2.1.1, 2.0.2 > > Attachments: HBASE-21083.branch-2.0.001.patch, > HBASE-21083.branch-2.0.002.patch, HBASE-21083.branch-2.0.003.patch, > HBASE-21083.branch-2.0.003.patch, HBASE-21083.branch-2.0.003.patch, > HBASE-21083.branch-2.1.001.patch > > > Discussed offline with [~stack] and [~Apache9]. We all agreed that we need to > introduce a mechanism to 'force complete' a stuck procedure, so that AMv2 can > continue running. > We still have undiscovered bugs hiding in our AMv2 and procedureV2 > system, so we need a way to intervene on stuck procedures before HBCK2 is > ready. This is crucial for a production-ready system. > For now, we have few ways to intervene on running procedures. Aborting > them is not a good choice, since some procedures are not abortable, and some > may have overridden the abort() method so that it ignores the abort > request. > So here I introduce a mechanism to bypass the execution of a stuck > procedure. > Basically, I added a field called 'bypass' to the Procedure class. If this > field is set to true, all the logic in execute/rollback is skipped, letting > the procedure and its ancestors complete normally and finally release their lock > resources. > Note that bypassing a procedure may leave the cluster in an intermediate state, > e.g. a region not assigned, or some HDFS files left behind. > Operators need to know the side effects of bypassing and recover the > inconsistent state of the cluster themselves, for example by issuing new procedures to > assign the regions. > A patch will be uploaded and a Review Board request opened. For now, only APIs in > ProcedureExecutor are provided. If everything is fine, I will add it to the master > service and add a shell command to bypass a procedure. Or maybe we can use > dynamically compiled JSPs to execute those APIs, as mentioned in HBASE-20679. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-21083) Introduce a mechanism to bypass the execution of a stuck procedure
[ https://issues.apache.org/jira/browse/HBASE-21083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16595494#comment-16595494 ] Umesh Agashe commented on HBASE-21083: -- Thanks for addressing the review comments, [~stack]! Thanks [~allan163] for the changes! > Introduce a mechanism to bypass the execution of a stuck procedure > -- > > Key: HBASE-21083 > URL: https://issues.apache.org/jira/browse/HBASE-21083 > Project: HBase > Issue Type: Sub-task > Components: amv2 >Affects Versions: 2.1.0, 2.0.1 >Reporter: Allan Yang >Assignee: Allan Yang >Priority: Major > Fix For: 3.0.0, 2.1.1, 2.0.3 > > Attachments: HBASE-21083.branch-2.0.001.patch, > HBASE-21083.branch-2.0.002.patch, HBASE-21083.branch-2.0.003.patch, > HBASE-21083.branch-2.1.001.patch > > > Discussed offline with [~stack] and [~Apache9]. We all agreed that we need to > introduce a mechanism to 'force complete' a stuck procedure, so that AMv2 can > continue running. > We still have undiscovered bugs hiding in our AMv2 and procedureV2 > system, so we need a way to intervene on stuck procedures before HBCK2 is > ready. This is crucial for a production-ready system. > For now, we have few ways to intervene on running procedures. Aborting > them is not a good choice, since some procedures are not abortable, and some > may have overridden the abort() method so that it ignores the abort > request. > So here I introduce a mechanism to bypass the execution of a stuck > procedure. > Basically, I added a field called 'bypass' to the Procedure class. If this > field is set to true, all the logic in execute/rollback is skipped, letting > the procedure and its ancestors complete normally and finally release their lock > resources. > Note that bypassing a procedure may leave the cluster in an intermediate state, > e.g. a region not assigned, or some HDFS files left behind. > Operators need to know the side effects of bypassing and recover the > inconsistent state of the cluster themselves, for example by issuing new procedures to > assign the regions. > A patch will be uploaded and a Review Board request opened. For now, only APIs in > ProcedureExecutor are provided. If everything is fine, I will add it to the master > service and add a shell command to bypass a procedure. Or maybe we can use > dynamically compiled JSPs to execute those APIs, as mentioned in HBASE-20679. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-20941) Create and implement HbckService in master
[ https://issues.apache.org/jira/browse/HBASE-20941?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16594100#comment-16594100 ] Umesh Agashe commented on HBASE-20941: -- hadoop.hbase.util.TestHBaseFsckReplication failed in the last 2 builds. 'IOException("Duplicate hbck - Abort")' is thrown when the lock file for hbck already exists because another instance of hbck is running. In this case I think a stale lock file didn't get cleaned up. This seems to be unrelated to the changes in the patch. The test passes locally in my dev environment. > Create and implement HbckService in master > -- > > Key: HBASE-20941 > URL: https://issues.apache.org/jira/browse/HBASE-20941 > Project: HBase > Issue Type: Sub-task >Reporter: Umesh Agashe >Assignee: Umesh Agashe >Priority: Major > Attachments: hbase-20941.master.001.patch, > hbase-20941.master.002.patch, hbase-20941.master.003.patch, > hbase-20941.master.004.patch, hbase-20941.master.004.patch, > hbase-20941.master.004.patch > > > Create HbckService in master and implement the following methods: > # setTableState(): If a table's state is inconsistent with the actions/procedures > working on it, manipulating its state in meta sometimes fixes things. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
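The lock-file behaviour behind the "Duplicate hbck - Abort" failure can be sketched as below. This is an analogue on the local filesystem using java.nio, assumed for illustration only; hbck's real lock handling is not shown in this thread, and `HbckLockDemo`, `acquire`, `release` and the file name are hypothetical. It shows why a lock file left behind by an earlier run makes the next run abort even though no other instance is alive.

```java
import java.io.IOException;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;

final class HbckLockDemo {
    /** Creates the lock file atomically; fails if some run already created it. */
    static Path acquire(Path lockFile) throws IOException {
        try {
            return Files.createFile(lockFile);
        } catch (FileAlreadyExistsException e) {
            // The file may belong to a live instance or be stale; this check cannot tell.
            throw new IOException("Duplicate hbck - Abort", e);
        }
    }

    /** Removes the lock file; a run that dies before this leaves a stale lock. */
    static void release(Path lockFile) throws IOException {
        Files.deleteIfExists(lockFile);
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("hbck-demo");
        Path lock = dir.resolve("hbck-demo.lock");
        acquire(lock);
        try {
            acquire(lock); // simulates a second run while the lock file exists
        } catch (IOException e) {
            System.out.println(e.getMessage()); // Duplicate hbck - Abort
        }
        release(lock);
        acquire(lock); // succeeds again once the lock file is gone
        release(lock);
        Files.delete(dir);
    }
}
```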
[jira] [Commented] (HBASE-20941) Create and implement HbckService in master
[ https://issues.apache.org/jira/browse/HBASE-20941?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16592075#comment-16592075 ] Umesh Agashe commented on HBASE-20941: -- retry > Create and implement HbckService in master > -- > > Key: HBASE-20941 > URL: https://issues.apache.org/jira/browse/HBASE-20941 > Project: HBase > Issue Type: Sub-task >Reporter: Umesh Agashe >Assignee: Umesh Agashe >Priority: Major > Attachments: hbase-20941.master.001.patch, > hbase-20941.master.002.patch, hbase-20941.master.003.patch, > hbase-20941.master.004.patch, hbase-20941.master.004.patch > > > Create HbckService in master and implement following methods: > # setTableState(): If table state are inconsistent with action/ procedures > working on them, sometimes manipulating their states in meta fix things. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HBASE-20941) Create and implement HbckService in master
[ https://issues.apache.org/jira/browse/HBASE-20941?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Umesh Agashe updated HBASE-20941: - Attachment: hbase-20941.master.004.patch > Create and implement HbckService in master > -- > > Key: HBASE-20941 > URL: https://issues.apache.org/jira/browse/HBASE-20941 > Project: HBase > Issue Type: Sub-task >Reporter: Umesh Agashe >Assignee: Umesh Agashe >Priority: Major > Attachments: hbase-20941.master.001.patch, > hbase-20941.master.002.patch, hbase-20941.master.003.patch, > hbase-20941.master.004.patch, hbase-20941.master.004.patch > > > Create HbckService in master and implement following methods: > # setTableState(): If table state are inconsistent with action/ procedures > working on them, sometimes manipulating their states in meta fix things. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HBASE-20941) Create and implement HbckService in master
[ https://issues.apache.org/jira/browse/HBASE-20941?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Umesh Agashe updated HBASE-20941: - Attachment: hbase-20941.master.004.patch > Create and implement HbckService in master > -- > > Key: HBASE-20941 > URL: https://issues.apache.org/jira/browse/HBASE-20941 > Project: HBase > Issue Type: Sub-task >Reporter: Umesh Agashe >Assignee: Umesh Agashe >Priority: Major > Attachments: hbase-20941.master.001.patch, > hbase-20941.master.002.patch, hbase-20941.master.003.patch, > hbase-20941.master.004.patch > > > Create HbckService in master and implement following methods: > # setTableState(): If table state are inconsistent with action/ procedures > working on them, sometimes manipulating their states in meta fix things. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-20941) Create and implement HbckService in master
[ https://issues.apache.org/jira/browse/HBASE-20941?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16589272#comment-16589272 ] Umesh Agashe commented on HBASE-20941: -- Thanks for the review [~stack]! Build passes and all review comments so far are addressed. Waiting for more reviews or a ship-it. > Create and implement HbckService in master > -- > > Key: HBASE-20941 > URL: https://issues.apache.org/jira/browse/HBASE-20941 > Project: HBase > Issue Type: Sub-task >Reporter: Umesh Agashe >Assignee: Umesh Agashe >Priority: Major > Attachments: hbase-20941.master.001.patch, > hbase-20941.master.002.patch, hbase-20941.master.003.patch > > > Create HbckService in master and implement following methods: > # setTableState(): If table state are inconsistent with action/ procedures > working on them, sometimes manipulating their states in meta fix things. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HBASE-20941) Create and implement HbckService in master
[ https://issues.apache.org/jira/browse/HBASE-20941?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Umesh Agashe updated HBASE-20941: - Attachment: hbase-20941.master.003.patch > Create and implement HbckService in master > -- > > Key: HBASE-20941 > URL: https://issues.apache.org/jira/browse/HBASE-20941 > Project: HBase > Issue Type: Sub-task >Reporter: Umesh Agashe >Assignee: Umesh Agashe >Priority: Major > Attachments: hbase-20941.master.001.patch, > hbase-20941.master.002.patch, hbase-20941.master.003.patch > > > Create HbckService in master and implement following methods: > # setTableState(): If table state are inconsistent with action/ procedures > working on them, sometimes manipulating their states in meta fix things. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-20941) Create and implement HbckService in master
[ https://issues.apache.org/jira/browse/HBASE-20941?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16584467#comment-16584467 ] Umesh Agashe commented on HBASE-20941: -- Uploaded patch 002 with changes per review comments. > Create and implement HbckService in master > -- > > Key: HBASE-20941 > URL: https://issues.apache.org/jira/browse/HBASE-20941 > Project: HBase > Issue Type: Sub-task >Reporter: Umesh Agashe >Assignee: Umesh Agashe >Priority: Major > Attachments: hbase-20941.master.001.patch, > hbase-20941.master.002.patch > > > Create HbckService in master and implement following methods: > # setTableState(): If table state are inconsistent with action/ procedures > working on them, sometimes manipulating their states in meta fix things. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HBASE-20941) Create and implement HbckService in master
[ https://issues.apache.org/jira/browse/HBASE-20941?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Umesh Agashe updated HBASE-20941: - Attachment: hbase-20941.master.002.patch > Create and implement HbckService in master > -- > > Key: HBASE-20941 > URL: https://issues.apache.org/jira/browse/HBASE-20941 > Project: HBase > Issue Type: Sub-task >Reporter: Umesh Agashe >Assignee: Umesh Agashe >Priority: Major > Attachments: hbase-20941.master.001.patch, > hbase-20941.master.002.patch > > > Create HbckService in master and implement following methods: > # setTableState(): If table state are inconsistent with action/ procedures > working on them, sometimes manipulating their states in meta fix things. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Comment Edited] (HBASE-20874) Sending compaction descriptions from all regionservers to master.
[ https://issues.apache.org/jira/browse/HBASE-20874?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16584237#comment-16584237 ] Umesh Agashe edited comment on HBASE-20874 at 8/17/18 6:10 PM: --- {code} /testptch/hbase/hbase-shell/src/main/ruby/hbase/admin.rb:102:81: C: Metrics/LineLength: Line is too long. [100/80] /testptch/hbase/hbase-shell/src/main/ruby/hbase/admin.rb:114:81: C: Metrics/LineLength: Line is too long. [99/80] /testptch/hbase/hbase-shell/src/main/ruby/shell/commands/compactions.rb:36:81: C: Metrics/LineLength: Line is too long. [90/80] /testptch/hbase/hbase-shell/src/main/ruby/shell/commands/compactions.rb:37:81: C: Metrics/LineLength: Line is too long. [83/80] /testptch/hbase/hbase-shell/src/main/ruby/shell/commands/compactions.rb:43:81: C: Metrics/LineLength: Line is too long. [84/80] /testptch/hbase/hbase-shell/src/main/ruby/shell/commands/compactions.rb:44:81: C: Metrics/LineLength: Line is too long. [89/80] /testptch/hbase/hbase-shell/src/main/ruby/shell/commands/compactions.rb:45:81: C: Metrics/LineLength: Line is too long. [97/80] /testptch/hbase/hbase-shell/src/main/ruby/shell/commands/compactions.rb:46:81: C: Metrics/LineLength: Line is too long. [97/80] /testptch/hbase/hbase-shell/src/main/ruby/shell/commands/compactions.rb:47:81: C: Metrics/LineLength: Line is too long. [81/80]{code} The above errors will go away after addressing HBASE-20851 and following issues are showing up in most files: {code} /testptch/hbase/hbase-shell/src/main/ruby/shell/commands/compactions.rb:35:7: C: Metrics/AbcSize: Assignment Branch Condition size for command is too high. [45.01/15] /testptch/hbase/hbase-shell/src/main/ruby/shell/commands/compactions.rb:35:7: C: Metrics/MethodLength: Method has too many lines. 
[16/10] /testptch/hbase/hbase-shell/src/main/ruby/shell/commands/compactions.rb:36:26: C: Style/WordArray: Use `%w` or `%W` for an array of words.{code} was (Author: uagashe): /testptch/hbase/hbase-shell/src/main/ruby/hbase/admin.rb:102:81: C: Metrics/LineLength: Line is too long. [100/80] /testptch/hbase/hbase-shell/src/main/ruby/hbase/admin.rb:114:81: C: Metrics/LineLength: Line is too long. [99/80] /testptch/hbase/hbase-shell/src/main/ruby/shell/commands/compactions.rb:36:81: C: Metrics/LineLength: Line is too long. [90/80] /testptch/hbase/hbase-shell/src/main/ruby/shell/commands/compactions.rb:37:81: C: Metrics/LineLength: Line is too long. [83/80] /testptch/hbase/hbase-shell/src/main/ruby/shell/commands/compactions.rb:43:81: C: Metrics/LineLength: Line is too long. [84/80] /testptch/hbase/hbase-shell/src/main/ruby/shell/commands/compactions.rb:44:81: C: Metrics/LineLength: Line is too long. [89/80] /testptch/hbase/hbase-shell/src/main/ruby/shell/commands/compactions.rb:45:81: C: Metrics/LineLength: Line is too long. [97/80] /testptch/hbase/hbase-shell/src/main/ruby/shell/commands/compactions.rb:46:81: C: Metrics/LineLength: Line is too long. [97/80] /testptch/hbase/hbase-shell/src/main/ruby/shell/commands/compactions.rb:47:81: C: Metrics/LineLength: Line is too long. [81/80] The above errors will go away after addressing HBASE-20851 and following issues are showing up in most files: /testptch/hbase/hbase-shell/src/main/ruby/shell/commands/compactions.rb:35:7: C: Metrics/AbcSize: Assignment Branch Condition size for command is too high. [45.01/15] /testptch/hbase/hbase-shell/src/main/ruby/shell/commands/compactions.rb:35:7: C: Metrics/MethodLength: Method has too many lines. [16/10] /testptch/hbase/hbase-shell/src/main/ruby/shell/commands/compactions.rb:36:26: C: Style/WordArray: Use `%w` or `%W` for an array of words. > Sending compaction descriptions from all regionservers to master. 
> - > > Key: HBASE-20874 > URL: https://issues.apache.org/jira/browse/HBASE-20874 > Project: HBase > Issue Type: Sub-task >Reporter: Mohit Goel >Assignee: Mohit Goel >Priority: Minor > Attachments: HBASE-20874.master.004.patch, > HBASE-20874.master.005.patch, HBASE-20874.master.006.patch > > > Need to send the compaction descriptions from region servers to the master, to > let the master know the entire compaction state of the cluster. Further, need > to change the implementation of client-side APIs like getCompactionState, > which will consult the master for the result instead of sending individual > requests to regionservers. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-20874) Sending compaction descriptions from all regionservers to master.
[ https://issues.apache.org/jira/browse/HBASE-20874?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16584237#comment-16584237 ] Umesh Agashe commented on HBASE-20874: -- /testptch/hbase/hbase-shell/src/main/ruby/hbase/admin.rb:102:81: C: Metrics/LineLength: Line is too long. [100/80] /testptch/hbase/hbase-shell/src/main/ruby/hbase/admin.rb:114:81: C: Metrics/LineLength: Line is too long. [99/80] /testptch/hbase/hbase-shell/src/main/ruby/shell/commands/compactions.rb:36:81: C: Metrics/LineLength: Line is too long. [90/80] /testptch/hbase/hbase-shell/src/main/ruby/shell/commands/compactions.rb:37:81: C: Metrics/LineLength: Line is too long. [83/80] /testptch/hbase/hbase-shell/src/main/ruby/shell/commands/compactions.rb:43:81: C: Metrics/LineLength: Line is too long. [84/80] /testptch/hbase/hbase-shell/src/main/ruby/shell/commands/compactions.rb:44:81: C: Metrics/LineLength: Line is too long. [89/80] /testptch/hbase/hbase-shell/src/main/ruby/shell/commands/compactions.rb:45:81: C: Metrics/LineLength: Line is too long. [97/80] /testptch/hbase/hbase-shell/src/main/ruby/shell/commands/compactions.rb:46:81: C: Metrics/LineLength: Line is too long. [97/80] /testptch/hbase/hbase-shell/src/main/ruby/shell/commands/compactions.rb:47:81: C: Metrics/LineLength: Line is too long. [81/80] The above errors will go away after addressing HBASE-20851 and following issues are showing up in most files: /testptch/hbase/hbase-shell/src/main/ruby/shell/commands/compactions.rb:35:7: C: Metrics/AbcSize: Assignment Branch Condition size for command is too high. [45.01/15] /testptch/hbase/hbase-shell/src/main/ruby/shell/commands/compactions.rb:35:7: C: Metrics/MethodLength: Method has too many lines. [16/10] /testptch/hbase/hbase-shell/src/main/ruby/shell/commands/compactions.rb:36:26: C: Style/WordArray: Use `%w` or `%W` for an array of words. > Sending compaction descriptions from all regionservers to master. 
> - > > Key: HBASE-20874 > URL: https://issues.apache.org/jira/browse/HBASE-20874 > Project: HBase > Issue Type: Sub-task >Reporter: Mohit Goel >Assignee: Mohit Goel >Priority: Minor > Attachments: HBASE-20874.master.004.patch, > HBASE-20874.master.005.patch, HBASE-20874.master.006.patch > > > Need to send the compaction descriptions from region servers to the master, to > let the master know the entire compaction state of the cluster. Further, need > to change the implementation of client-side APIs like getCompactionState, > which will consult the master for the result instead of sending individual > requests to regionservers. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-20941) Create and implement HbckService in master
[ https://issues.apache.org/jira/browse/HBASE-20941?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16576973#comment-16576973 ] Umesh Agashe commented on HBASE-20941: -- Thanks for the review, [~busbey]. Working on changes per review comments. > Create and implement HbckService in master > -- > > Key: HBASE-20941 > URL: https://issues.apache.org/jira/browse/HBASE-20941 > Project: HBase > Issue Type: Sub-task >Reporter: Umesh Agashe >Assignee: Umesh Agashe >Priority: Major > Attachments: hbase-20941.master.001.patch > > > Create HbckService in master and implement following methods: > # setTableState(): If table state are inconsistent with action/ procedures > working on them, sometimes manipulating their states in meta fix things. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-20482) Print a link to the ref guide chapter for the shell during startup
[ https://issues.apache.org/jira/browse/HBASE-20482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16572467#comment-16572467 ] Umesh Agashe commented on HBASE-20482: -- +1 lgtm > Print a link to the ref guide chapter for the shell during startup > -- > > Key: HBASE-20482 > URL: https://issues.apache.org/jira/browse/HBASE-20482 > Project: HBase > Issue Type: Task > Components: documentation, shell >Reporter: Sakthi >Assignee: Sakthi >Priority: Minor > Attachments: hbase-20482.branch-1.2.001.patch, > hbase-20482.branch-2.0.001.patch, hbase-20482.master.001.patch, > hbase-20482.master.002.patch, hbase-20482.master.003.patch > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HBASE-21023) Add purgeProcedure/s() API to HbckService
[ https://issues.apache.org/jira/browse/HBASE-21023?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Umesh Agashe updated HBASE-21023: - Description: purgeProcedure/s(): some procedures do not support abort at every step. When these procedures get stuck then they can not be aborted or make further progress. Corrective action is to purge these procedures from ProcWAL. Provide option to purge sub-procedures as well. > Add purgeProcedure/s() API to HbckService > - > > Key: HBASE-21023 > URL: https://issues.apache.org/jira/browse/HBASE-21023 > Project: HBase > Issue Type: Sub-task > Components: hbck2 >Affects Versions: 2.0.1 >Reporter: Umesh Agashe >Assignee: Umesh Agashe >Priority: Major > Fix For: 2.2.0 > > > purgeProcedure/s(): some procedures do not support abort at every step. When > these procedures get stuck then they can not be aborted or make further > progress. Corrective action is to purge these procedures from ProcWAL. > Provide option to purge sub-procedures as well. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
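The "purge, optionally with sub-procedures" behaviour requested in HBASE-21023 can be sketched as a walk over parent/child links. This is a guess at the shape only: the issue does not describe an implementation, and `PurgeSketch`, `add()`, `contains()` and `purge()` are hypothetical names operating on an in-memory map rather than on the real ProcWAL.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class PurgeSketch {
  // procId -> parent procId (null for a root procedure).
  // Hypothetical in-memory stand-in for a procedure store.
  private final Map<Long, Long> parentOf = new HashMap<>();

  void add(long procId, Long parentId) {
    parentOf.put(procId, parentId);
  }

  boolean contains(long procId) {
    return parentOf.containsKey(procId);
  }

  /**
   * Removes procId from the store and returns the ids that were removed.
   * When includeSubProcedures is true, all descendants are removed as well.
   */
  List<Long> purge(long procId, boolean includeSubProcedures) {
    List<Long> removed = new ArrayList<>();
    if (!parentOf.containsKey(procId)) {
      return removed;
    }
    Deque<Long> pending = new ArrayDeque<>();
    pending.push(procId);
    while (!pending.isEmpty()) {
      long current = pending.pop();
      parentOf.remove(current);
      removed.add(current);
      if (includeSubProcedures) {
        // Queue every direct child of the procedure just removed.
        for (Map.Entry<Long, Long> e : parentOf.entrySet()) {
          if (e.getValue() != null && e.getValue() == current) {
            pending.push(e.getKey());
          }
        }
      }
    }
    return removed;
  }

  public static void main(String[] args) {
    PurgeSketch store = new PurgeSketch();
    store.add(1L, null); // stuck root procedure
    store.add(2L, 1L);   // child of 1
    store.add(3L, 2L);   // grandchild of 1
    store.add(4L, null); // unrelated procedure
    System.out.println("purged: " + store.purge(1L, true));
    System.out.println("unrelated procedure kept: " + store.contains(4L));
  }
}
```

In this sketch, purging a parent with `includeSubProcedures` set to false leaves its children in the store without a parent, which is one reason an option to purge sub-procedures together with the parent is useful.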
[jira] [Created] (HBASE-21023) Add purgeProcedure/s() API to HbckService
Umesh Agashe created HBASE-21023: Summary: Add purgeProcedure/s() API to HbckService Key: HBASE-21023 URL: https://issues.apache.org/jira/browse/HBASE-21023 Project: HBase Issue Type: Sub-task Components: hbck2 Affects Versions: 2.0.1 Reporter: Umesh Agashe Assignee: Umesh Agashe Fix For: 2.2.0 -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HBASE-20941) Create and implement HbckService in master
[ https://issues.apache.org/jira/browse/HBASE-20941?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Umesh Agashe updated HBASE-20941: - Status: Patch Available (was: In Progress) > Create and implement HbckService in master > -- > > Key: HBASE-20941 > URL: https://issues.apache.org/jira/browse/HBASE-20941 > Project: HBase > Issue Type: Sub-task >Reporter: Umesh Agashe >Assignee: Umesh Agashe >Priority: Major > Attachments: hbase-20941.master.001.patch > > > Create HbckService in master and implement following methods: > # setTableState(): If table state are inconsistent with action/ procedures > working on them, sometimes manipulating their states in meta fix things. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-20941) Create and implement HbckService in master
[ https://issues.apache.org/jira/browse/HBASE-20941?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16572166#comment-16572166 ] Umesh Agashe commented on HBASE-20941: -- Considering the size of the patch, I am moving the following API out to a separate JIRA: * purgeProcedure/s(): some procedures do not support abort at every step. When these procedures get stuck then they can not be aborted or make further progress. Corrective action is to purge these procedures from ProcWAL. Provide option to purge sub-procedures as well. The patch adds and implements HbckService in master and adds a UT for the client. > Create and implement HbckService in master > -- > > Key: HBASE-20941 > URL: https://issues.apache.org/jira/browse/HBASE-20941 > Project: HBase > Issue Type: Sub-task >Reporter: Umesh Agashe >Assignee: Umesh Agashe >Priority: Major > Attachments: hbase-20941.master.001.patch > > > Create HbckService in master and implement following methods: > # setTableState(): If table state are inconsistent with action/ procedures > working on them, sometimes manipulating their states in meta fix things. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HBASE-20941) Create and implement HbckService in master
[ https://issues.apache.org/jira/browse/HBASE-20941?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Umesh Agashe updated HBASE-20941: - Description: Create HbckService in master and implement following methods: # setTableState(): If table state are inconsistent with action/ procedures working on them, sometimes manipulating their states in meta fix things. was: Create HbckService in master and implement following methods: # purgeProcedure/s(): some procedures do not support abort at every step. When these procedures get stuck then they can not be aborted or make further progress. Corrective action is to purge these procedures from ProcWAL. Provide option to purge sub-procedures as well. # setTable/RegionState(): If table/ region state are inconsistent with action/ procedures working on them, sometimes manipulating their states in meta fix things. > Create and implement HbckService in master > -- > > Key: HBASE-20941 > URL: https://issues.apache.org/jira/browse/HBASE-20941 > Project: HBase > Issue Type: Sub-task >Reporter: Umesh Agashe >Assignee: Umesh Agashe >Priority: Major > Attachments: hbase-20941.master.001.patch > > > Create HbckService in master and implement following methods: > # setTableState(): If table state are inconsistent with action/ procedures > working on them, sometimes manipulating their states in meta fix things. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HBASE-20941) Create and implement HbckService in master
[ https://issues.apache.org/jira/browse/HBASE-20941?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Umesh Agashe updated HBASE-20941: - Attachment: hbase-20941.master.001.patch > Create and implement HbckService in master > -- > > Key: HBASE-20941 > URL: https://issues.apache.org/jira/browse/HBASE-20941 > Project: HBase > Issue Type: Sub-task >Reporter: Umesh Agashe >Assignee: Umesh Agashe >Priority: Major > Attachments: hbase-20941.master.001.patch > > > Create HbckService in master and implement following methods: > # purgeProcedure/s(): some procedures do not support abort at every step. > When these procedures get stuck then they can not be aborted or make further > progress. Corrective action is to purge these procedures from ProcWAL. > Provide option to purge sub-procedures as well. > # setTable/RegionState(): If table/ region state are inconsistent with > action/ procedures working on them, sometimes manipulating their states in > meta fix things. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work started] (HBASE-20941) Create and implement HbckService in master
[ https://issues.apache.org/jira/browse/HBASE-20941?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on HBASE-20941 started by Umesh Agashe. > Create and implement HbckService in master > -- > > Key: HBASE-20941 > URL: https://issues.apache.org/jira/browse/HBASE-20941 > Project: HBase > Issue Type: Sub-task >Reporter: Umesh Agashe >Assignee: Umesh Agashe >Priority: Major > > Create HbckService in master and implement following methods: > # purgeProcedure/s(): some procedures do not support abort at every step. > When these procedures get stuck then they can not be aborted or make further > progress. Corrective action is to purge these procedures from ProcWAL. > Provide option to purge sub-procedures as well. > # setTable/RegionState(): If table/ region state are inconsistent with > action/ procedures working on them, sometimes manipulating their states in > meta fix things. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-21018) RS crashed because AsyncFS was unable to update HDFS data encryption key
[ https://issues.apache.org/jira/browse/HBASE-21018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16572068#comment-16572068 ] Umesh Agashe commented on HBASE-21018: -- +1 > RS crashed because AsyncFS was unable to update HDFS data encryption key > > > Key: HBASE-21018 > URL: https://issues.apache.org/jira/browse/HBASE-21018 > Project: HBase > Issue Type: Bug > Components: wal >Affects Versions: 2.0.0 > Environment: Hadoop 3.0.0, HBase 2.0.0, > HDFS configuration dfs.encrypt.data.transfer = true >Reporter: Wei-Chiu Chuang >Priority: Critical > Attachments: HBASE-21018.master.001.patch > > > We (+[~uagashe]) found HBase RegionServer doesn't update HDFS data encryption > key correctly, and in some cases after retry 10 times, it aborts. > {noformat} > 2018-08-03 17:37:03,233 WARN > org.apache.hadoop.hbase.io.asyncfs.FanOutOneBlockAsyncDFSOutputHelper: create > fan-out dfs output > /hbase/WALs/rs1.example.com,22101,1533318719239/rs1.example.com%2C22101%2C1533318719239.rs1.example.com%2C22101%2C1533318719239.regiongroup-0.1533343022981 > failed, retry = 1 > org.apache.hadoop.hdfs.protocol.datatransfer.InvalidEncryptionKeyException: > Can't re-compute encryption key for nonce, since the required block key > (keyID=1685436998) doesn't exist. 
Current key: 1085959374 > at > org.apache.hadoop.hbase.io.asyncfs.FanOutOneBlockAsyncDFSOutputSaslHelper$SaslNegotiateHandler.check(FanOutOneBlockAsyncDFSOutputSaslHelper.java:399) > at > org.apache.hadoop.hbase.io.asyncfs.FanOutOneBlockAsyncDFSOutputSaslHelper$SaslNegotiateHandler.channelRead(FanOutOneBlockAsyncDFSOutputSaslHelper.java:470) > at > org.apache.hbase.thirdparty.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362) > at > org.apache.hbase.thirdparty.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348) > at > org.apache.hbase.thirdparty.io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340) > at > org.apache.hbase.thirdparty.io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:102) > at > org.apache.hbase.thirdparty.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362) > at > org.apache.hbase.thirdparty.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348) > at > org.apache.hbase.thirdparty.io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340) > at > org.apache.hbase.thirdparty.io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:310) > at > org.apache.hbase.thirdparty.io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:284) > at > org.apache.hbase.thirdparty.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362) > at > org.apache.hbase.thirdparty.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348) > at > org.apache.hbase.thirdparty.io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340) > at > 
org.apache.hbase.thirdparty.io.netty.handler.timeout.IdleStateHandler.channelRead(IdleStateHandler.java:286) > at > org.apache.hbase.thirdparty.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362) > at > org.apache.hbase.thirdparty.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348) > at > org.apache.hbase.thirdparty.io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340) > at > org.apache.hbase.thirdparty.io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1359) > at > org.apache.hbase.thirdparty.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362) > at > org.apache.hbase.thirdparty.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348) > at > org.apache.hbase.thirdparty.io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:935) > at > org.apache.hbase.thirdparty.io.netty.channel.epoll.AbstractEpollStreamChannel$EpollStreamUnsafe.epollInReady(AbstractEpollStreamChannel.java:801) > at > org.apache.hbase.thirdparty.io.netty.channel.epo
[jira] [Commented] (HBASE-21018) RS crashed because AsyncFS was unable to update HDFS data encryption key
[ https://issues.apache.org/jira/browse/HBASE-21018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16571064#comment-16571064 ] Umesh Agashe commented on HBASE-21018: -- Hi [~Apache9], any details you can provide on this will be helpful. thanks! > RS crashed because AsyncFS was unable to update HDFS data encryption key > > > Key: HBASE-21018 > URL: https://issues.apache.org/jira/browse/HBASE-21018 > Project: HBase > Issue Type: Bug > Components: wal >Affects Versions: 2.0.0 > Environment: Hadoop 3.0.0, HBase 2.0.0, > HDFS configuration dfs.encrypt.data.transfer = true >Reporter: Wei-Chiu Chuang >Priority: Critical > Attachments: HBASE-21018.master.001.patch > > > We (+[~uagashe]) found HBase RegionServer doesn't update HDFS data encryption > key correctly, and in some cases after retry 10 times, it aborts. > {noformat} > 2018-08-03 17:37:03,233 WARN > org.apache.hadoop.hbase.io.asyncfs.FanOutOneBlockAsyncDFSOutputHelper: create > fan-out dfs output > /hbase/WALs/rs1.example.com,22101,1533318719239/rs1.example.com%2C22101%2C1533318719239.rs1.example.com%2C22101%2C1533318719239.regiongroup-0.1533343022981 > failed, retry = 1 > org.apache.hadoop.hdfs.protocol.datatransfer.InvalidEncryptionKeyException: > Can't re-compute encryption key for nonce, since the required block key > (keyID=1685436998) doesn't exist. 
Current key: 1085959374 > at > org.apache.hadoop.hbase.io.asyncfs.FanOutOneBlockAsyncDFSOutputSaslHelper$SaslNegotiateHandler.check(FanOutOneBlockAsyncDFSOutputSaslHelper.java:399) > at > org.apache.hadoop.hbase.io.asyncfs.FanOutOneBlockAsyncDFSOutputSaslHelper$SaslNegotiateHandler.channelRead(FanOutOneBlockAsyncDFSOutputSaslHelper.java:470) > at > org.apache.hbase.thirdparty.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362) > at > org.apache.hbase.thirdparty.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348) > at > org.apache.hbase.thirdparty.io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340) > at > org.apache.hbase.thirdparty.io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:102) > at > org.apache.hbase.thirdparty.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362) > at > org.apache.hbase.thirdparty.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348) > at > org.apache.hbase.thirdparty.io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340) > at > org.apache.hbase.thirdparty.io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:310) > at > org.apache.hbase.thirdparty.io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:284) > at > org.apache.hbase.thirdparty.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362) > at > org.apache.hbase.thirdparty.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348) > at > org.apache.hbase.thirdparty.io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340) > at > 
org.apache.hbase.thirdparty.io.netty.handler.timeout.IdleStateHandler.channelRead(IdleStateHandler.java:286) > at > org.apache.hbase.thirdparty.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362) > at > org.apache.hbase.thirdparty.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348) > at > org.apache.hbase.thirdparty.io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340) > at > org.apache.hbase.thirdparty.io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1359) > at > org.apache.hbase.thirdparty.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362) > at > org.apache.hbase.thirdparty.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348) > at > org.apache.hbase.thirdparty.io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:935) > at > org.apache.hbase.thirdparty.io.netty.channel.epoll.AbstractEpollStreamChannel$EpollStreamUnsafe.epollInReady(AbstractEpollStreamChannel.j
[jira] [Commented] (HBASE-20815) In TestServerCrashProcedure collect and assert on submitted and failed counts for ServerCrashProcedure
[ https://issues.apache.org/jira/browse/HBASE-20815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16556368#comment-16556368 ] Umesh Agashe commented on HBASE-20815: -- +1, lgtm. Thanks for adding testRecoveryOnRsWithMeta() , [~xucang]! > In TestServerCrashProcedure collect and assert on submitted and failed counts > for ServerCrashProcedure > -- > > Key: HBASE-20815 > URL: https://issues.apache.org/jira/browse/HBASE-20815 > Project: HBase > Issue Type: Bug > Components: amv2 >Reporter: Umesh Agashe >Assignee: Xu Cang >Priority: Minor > Attachments: HBASE-20815.master.001.patch, > HBASE-20815.master.002.patch, HBASE-20815.master.002.patch > > > We need to collect and possibly assert on number of procedures submitted and > failed for ServerCrashProcedures. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HBASE-20941) Create and implement HbckService in master
[ https://issues.apache.org/jira/browse/HBASE-20941?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Umesh Agashe updated HBASE-20941: - Summary: Create and implement HbckService in master (was: Cre) > Create and implement HbckService in master > -- > > Key: HBASE-20941 > URL: https://issues.apache.org/jira/browse/HBASE-20941 > Project: HBase > Issue Type: Sub-task >Reporter: Umesh Agashe >Assignee: Umesh Agashe >Priority: Major > > Create HbckService in master and implement following methods: > # purgeProcedure/s(): some procedures do not support abort at every step. > When these procedures get stuck then they can not be aborted or make further > progress. Corrective action is to purge these procedures from ProcWAL. > Provide option to purge sub-procedures as well. > # setTable/RegionState(): If table/ region state are inconsistent with > action/ procedures working on them, sometimes manipulating their states in > meta fix things. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HBASE-20941) Cre
Umesh Agashe created HBASE-20941: Summary: Cre Key: HBASE-20941 URL: https://issues.apache.org/jira/browse/HBASE-20941 Project: HBase Issue Type: Sub-task Reporter: Umesh Agashe Assignee: Umesh Agashe Create HbckService in master and implement following methods: # purgeProcedure/s(): some procedures do not support abort at every step. When these procedures get stuck then they can not be aborted or make further progress. Corrective action is to purge these procedures from ProcWAL. Provide option to purge sub-procedures as well. # setTable/RegionState(): If table/ region state are inconsistent with action/ procedures working on them, sometimes manipulating their states in meta fix things. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-20815) In TestServerCrashProcedure collect and assert on submitted and failed counts for ServerCrashProcedure
[ https://issues.apache.org/jira/browse/HBASE-20815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16556075#comment-16556075 ] Umesh Agashe commented on HBASE-20815: -- Unit test for patch 002 failed. Failure doesn't look related to the changes. Retrying. > In TestServerCrashProcedure collect and assert on submitted and failed counts > for ServerCrashProcedure > -- > > Key: HBASE-20815 > URL: https://issues.apache.org/jira/browse/HBASE-20815 > Project: HBase > Issue Type: Bug > Components: amv2 >Reporter: Umesh Agashe >Assignee: Xu Cang >Priority: Minor > Attachments: HBASE-20815.master.001.patch, > HBASE-20815.master.002.patch, HBASE-20815.master.002.patch > > > We need to collect and possibly assert on number of procedures submitted and > failed for ServerCrashProcedures. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HBASE-20815) In TestServerCrashProcedure collect and assert on submitted and failed counts for ServerCrashProcedure
[ https://issues.apache.org/jira/browse/HBASE-20815?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Umesh Agashe updated HBASE-20815: - Attachment: HBASE-20815.master.002.patch > In TestServerCrashProcedure collect and assert on submitted and failed counts > for ServerCrashProcedure > -- > > Key: HBASE-20815 > URL: https://issues.apache.org/jira/browse/HBASE-20815 > Project: HBase > Issue Type: Bug > Components: amv2 >Reporter: Umesh Agashe >Assignee: Xu Cang >Priority: Minor > Attachments: HBASE-20815.master.001.patch, > HBASE-20815.master.002.patch, HBASE-20815.master.002.patch > > > We need to collect and possibly assert on number of procedures submitted and > failed for ServerCrashProcedures. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-20815) In TestServerCrashProcedure collect and assert on submitted and failed counts for ServerCrashProcedure
[ https://issues.apache.org/jira/browse/HBASE-20815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16554918#comment-16554918 ] Umesh Agashe commented on HBASE-20815: -- Thanks for the patch [~xucang]! It looks good. One minor comment: I see testCount is incremented before making calls to testRecoveryAndDoubleExecution() in all instances. Can testCount be incremented in testRecoveryAndDoubleExecution() itself? Thanks! > In TestServerCrashProcedure collect and assert on submitted and failed counts > for ServerCrashProcedure > -- > > Key: HBASE-20815 > URL: https://issues.apache.org/jira/browse/HBASE-20815 > Project: HBase > Issue Type: Bug > Components: amv2 >Reporter: Umesh Agashe >Assignee: Xu Cang >Priority: Minor > Attachments: HBASE-20815.master.001.patch > > > We need to collect and possibly assert on number of procedures submitted and > failed for ServerCrashProcedures. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
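The review comment above suggests incrementing testCount inside testRecoveryAndDoubleExecution() rather than before each call. A minimal sketch of that refactoring follows; the class name, the boolean parameter, and the getter are hypothetical stand-ins, and the real test body is elided, so this is not the code from the patch.

```java
public class TestCountSketch {
    // Number of recovery scenarios executed so far.
    private int testCount = 0;

    // Hypothetical stand-in for testRecoveryAndDoubleExecution().
    // The counter is incremented here, once, instead of at every call site.
    void testRecoveryAndDoubleExecution(boolean carryingMeta) {
        testCount++;
        // ... the real test body (crash the server, replay procedures) is elided ...
    }

    int getTestCount() {
        return testCount;
    }

    public static void main(String[] args) {
        TestCountSketch t = new TestCountSketch();
        t.testRecoveryAndDoubleExecution(false);
        t.testRecoveryAndDoubleExecution(true);
        System.out.println(t.getTestCount());
    }
}
```

With the increment inside the helper, a new caller cannot forget to bump the counter, which keeps any later assertion on submitted counts consistent.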
[jira] [Commented] (HBASE-6028) Implement a cancel for in-progress compactions
[ https://issues.apache.org/jira/browse/HBASE-6028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16534206#comment-16534206 ] Umesh Agashe commented on HBASE-6028: - [~busbey], created HBASE-20851 for rubocop config changes. > Implement a cancel for in-progress compactions > -- > > Key: HBASE-6028 > URL: https://issues.apache.org/jira/browse/HBASE-6028 > Project: HBase > Issue Type: Bug > Components: regionserver >Reporter: Derek Wollenstein >Assignee: Mohit Goel >Priority: Minor > Labels: beginner > Attachments: HBASE-6028.master.007.patch, > HBASE-6028.master.008.patch, HBASE-6028.master.008.patch, > HBASE-6028.master.009.patch > > > Depending on current server load, it can be extremely expensive to run > periodic minor / major compactions. It would be helpful to have a feature > where a user could use the shell or a client tool to explicitly cancel an > in-progress compactions. This would allow a system to recover when too many > regions became eligible for compactions at once -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HBASE-20851) Change rubocop config for max line length of 100
[ https://issues.apache.org/jira/browse/HBASE-20851?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Umesh Agashe updated HBASE-20851: - Labels: beginner beginners (was: ) > Change rubocop config for max line length of 100 > > > Key: HBASE-20851 > URL: https://issues.apache.org/jira/browse/HBASE-20851 > Project: HBase > Issue Type: Bug > Components: shell >Affects Versions: 2.0.1 >Reporter: Umesh Agashe >Priority: Minor > Labels: beginner, beginners > > Existing ruby and Java code uses max line length of 100 characters. Change > rubocop config with: > {code:java} > Metrics/LineLength: > Max: 100 > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HBASE-20851) Change rubocop config for max line length of 100
Umesh Agashe created HBASE-20851: Summary: Change rubocop config for max line length of 100 Key: HBASE-20851 URL: https://issues.apache.org/jira/browse/HBASE-20851 Project: HBase Issue Type: Bug Components: shell Affects Versions: 2.0.1 Reporter: Umesh Agashe Existing ruby and Java code uses max line length of 100 characters. Change rubocop config with: {code:java} Metrics/LineLength: Max: 100 {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HBASE-6028) Implement a cancel for in-progress compactions
[ https://issues.apache.org/jira/browse/HBASE-6028?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Umesh Agashe updated HBASE-6028: Attachment: HBASE-6028.master.008.patch > Implement a cancel for in-progress compactions > -- > > Key: HBASE-6028 > URL: https://issues.apache.org/jira/browse/HBASE-6028 > Project: HBase > Issue Type: Bug > Components: regionserver >Reporter: Derek Wollenstein >Assignee: Mohit Goel >Priority: Minor > Labels: beginner > Attachments: HBASE-6028.master.007.patch, > HBASE-6028.master.008.patch, HBASE-6028.master.008.patch > > > Depending on current server load, it can be extremely expensive to run > periodic minor / major compactions. It would be helpful to have a feature > where a user could use the shell or a client tool to explicitly cancel an > in-progress compactions. This would allow a system to recover when too many > regions became eligible for compactions at once -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-6028) Implement a cancel for in-progress compactions
[ https://issues.apache.org/jira/browse/HBASE-6028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16530763#comment-16530763 ] Umesh Agashe commented on HBASE-6028: - 2 of the rubocop messages are about 'Line too long'. Rest of the ruby code has 100 chars wide lines, rubocop expects 80. These messages can be ignored. Unit test failure 'TestSyncReplicationStandbyKillMaster' doesn't seem to be related to the changes. Retrying the build. > Implement a cancel for in-progress compactions > -- > > Key: HBASE-6028 > URL: https://issues.apache.org/jira/browse/HBASE-6028 > Project: HBase > Issue Type: Bug > Components: regionserver >Reporter: Derek Wollenstein >Assignee: Mohit Goel >Priority: Minor > Labels: beginner > Attachments: HBASE-6028.master.007.patch, HBASE-6028.master.008.patch > > > Depending on current server load, it can be extremely expensive to run > periodic minor / major compactions. It would be helpful to have a feature > where a user could use the shell or a client tool to explicitly cancel an > in-progress compactions. This would allow a system to recover when too many > regions became eligible for compactions at once -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-20814) fix error prone assertion failure ignored warnings
[ https://issues.apache.org/jira/browse/HBASE-20814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16526865#comment-16526865 ] Umesh Agashe commented on HBASE-20814: -- +1, lgtm > fix error prone assertion failure ignored warnings > -- > > Key: HBASE-20814 > URL: https://issues.apache.org/jira/browse/HBASE-20814 > Project: HBase > Issue Type: Sub-task > Components: build, test >Reporter: Mike Drob >Assignee: Mike Drob >Priority: Major > Attachments: HBASE-20814.master.001.patch > > > when we have assertion failures ignored, that likely means we're missing a > test case, let's make sure our tests are actually running and covering what > we think they are. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-16549) Procedure v2 - Add new AM metrics
[ https://issues.apache.org/jira/browse/HBASE-16549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16526833#comment-16526833 ] Umesh Agashe commented on HBASE-16549: -- Done, HBASE-20815. Thanks [~mdrob]! > Procedure v2 - Add new AM metrics > - > > Key: HBASE-16549 > URL: https://issues.apache.org/jira/browse/HBASE-16549 > Project: HBase > Issue Type: Sub-task > Components: proc-v2, Region Assignment >Affects Versions: 2.0.0 >Reporter: Matteo Bertozzi >Assignee: Umesh Agashe >Priority: Major > Fix For: 2.0.0 > > Attachments: HBASE-16549-hbase-14614.v1.patch, > HBASE-16549-hbase-14614.v2-v3.patch, HBASE-16549-hbase-14614.v2.patch, > HBASE-16549-hbase-14614.v3.patch, HBASE-16549-hbase-14614.v3.patch, > HBASE-16549.master.v4.patch, HBASE-16549.master.v4.patch, > HBASE-16549.master.v4.patch, HBASE-16549.master.v5.patch > > > With the new AM we can add a bunch of metrics > - assign/unassign time > - server crash time > - grouping related metrics? (how many batch we do, and similar?) -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HBASE-20815) In TestServerCrashProcedure collect and assert on submitted and failed counts for ServerCrashProcedure
Umesh Agashe created HBASE-20815: Summary: In TestServerCrashProcedure collect and assert on submitted and failed counts for ServerCrashProcedure Key: HBASE-20815 URL: https://issues.apache.org/jira/browse/HBASE-20815 Project: HBase Issue Type: Bug Components: amv2 Reporter: Umesh Agashe We need to collect and possibly assert on number of procedures submitted and failed for ServerCrashProcedures. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-16549) Procedure v2 - Add new AM metrics
[ https://issues.apache.org/jira/browse/HBASE-16549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16526800#comment-16526800 ] Umesh Agashe commented on HBASE-16549: -- [~mdrob], it was added with the intention of using it later. When it was added, asserting on the counts was not possible due to the flakiness of the test. > Procedure v2 - Add new AM metrics > - > > Key: HBASE-16549 > URL: https://issues.apache.org/jira/browse/HBASE-16549 > Project: HBase > Issue Type: Sub-task > Components: proc-v2, Region Assignment >Affects Versions: 2.0.0 >Reporter: Matteo Bertozzi >Assignee: Umesh Agashe >Priority: Major > Fix For: 2.0.0 > > Attachments: HBASE-16549-hbase-14614.v1.patch, > HBASE-16549-hbase-14614.v2-v3.patch, HBASE-16549-hbase-14614.v2.patch, > HBASE-16549-hbase-14614.v3.patch, HBASE-16549-hbase-14614.v3.patch, > HBASE-16549.master.v4.patch, HBASE-16549.master.v4.patch, > HBASE-16549.master.v4.patch, HBASE-16549.master.v5.patch > > > With the new AM we can add a bunch of metrics > - assign/unassign time > - server crash time > - grouping related metrics? (how many batch we do, and similar?) -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (HBASE-18366) Fix flaky test hbase.master.procedure.TestServerCrashProcedure#testRecoveryAndDoubleExecutionOnRsWithMeta
[ https://issues.apache.org/jira/browse/HBASE-18366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Umesh Agashe resolved HBASE-18366. -- Resolution: Not A Problem Not flaky anymore. Fixed by other JIRAs. > Fix flaky test > hbase.master.procedure.TestServerCrashProcedure#testRecoveryAndDoubleExecutionOnRsWithMeta > - > > Key: HBASE-18366 > URL: https://issues.apache.org/jira/browse/HBASE-18366 > Project: HBase > Issue Type: Bug >Reporter: Umesh Agashe >Assignee: Umesh Agashe >Priority: Major > Attachments: hbase-18366.fix1.patch, hbase-18366.fix2.patch > > > It worked for a few days after enabling it with HBASE-18278. But started > failing after commits: > 6786b2b > 68436c9 > 75d2eca > 50bb045 > df93c13 > It works with one commit before: c5abb6c. Need to see what changed with those > commits. > Currently it fails with TableNotFoundException. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-20403) Prefetch sometimes doesn't work with encrypted file system
[ https://issues.apache.org/jira/browse/HBASE-20403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16518682#comment-16518682 ] Umesh Agashe commented on HBASE-20403: -- +1 for the patch! Nice. Thanks [~tlipcon]! > Prefetch sometimes doesn't work with encrypted file system > -- > > Key: HBASE-20403 > URL: https://issues.apache.org/jira/browse/HBASE-20403 > Project: HBase > Issue Type: Bug >Affects Versions: 2.0.0-beta-2 >Reporter: Umesh Agashe >Assignee: Todd Lipcon >Priority: Major > Fix For: 3.0.0 > > Attachments: hbase-20403.patch > > > Log from long running test has following stack trace a few times: > {code} > 2018-04-09 18:33:21,523 WARN > org.apache.hadoop.hbase.io.hfile.HFileReaderImpl: Prefetch > path=hdfs://ns1/hbase/data/default/IntegrationTestBigLinkedList_20180409172704/35f1a7ef13b9d327665228abdbcdffae/meta/9089d98b2a6b4847b3fcf6aceb124988, > offset=36884200, end=231005989 > java.lang.IllegalArgumentException > at java.nio.Buffer.limit(Buffer.java:275) > at > org.apache.hadoop.hdfs.ByteBufferStrategy.readFromBlock(ReaderStrategy.java:183) > at org.apache.hadoop.hdfs.DFSInputStream.readBuffer(DFSInputStream.java:705) > at > org.apache.hadoop.hdfs.DFSInputStream.readWithStrategy(DFSInputStream.java:766) > at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:831) > at > org.apache.hadoop.crypto.CryptoInputStream.read(CryptoInputStream.java:197) > at java.io.DataInputStream.read(DataInputStream.java:149) > at > org.apache.hadoop.hbase.io.hfile.HFileBlock.readWithExtra(HFileBlock.java:762) > at > org.apache.hadoop.hbase.io.hfile.HFileBlock$FSReaderImpl.readAtOffset(HFileBlock.java:1559) > at > org.apache.hadoop.hbase.io.hfile.HFileBlock$FSReaderImpl.readBlockDataInternal(HFileBlock.java:1771) > at > org.apache.hadoop.hbase.io.hfile.HFileBlock$FSReaderImpl.readBlockData(HFileBlock.java:1594) > at > org.apache.hadoop.hbase.io.hfile.HFileReaderImpl.readBlock(HFileReaderImpl.java:1488) > at > 
org.apache.hadoop.hbase.io.hfile.HFileReaderImpl$1.run(HFileReaderImpl.java:278) > at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at > java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180) > at > java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > at java.lang.Thread.run(Thread.java:748) > {code} > Size on disk calculations seem to get messed up due to encryption. Possible > fixes can be: > * check whether the file is encrypted with FileStatus#isEncrypted() and, if so, do not prefetch. > * document that hbase.rs.prefetchblocksonopen cannot be true if file is > encrypted. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
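The first fix proposed in the HBASE-20403 description (skip prefetch when the file is encrypted) can be sketched as a simple guard. This is only an illustration under stated assumptions: FileInfo is a hypothetical stand-in for the single FileStatus#isEncrypted() call so that the example compiles without Hadoop on the classpath, and shouldPrefetch is a made-up helper name. It is not necessarily what the committed patch does.

```java
public class PrefetchGuard {
    // Hypothetical stand-in for the one method of Hadoop's FileStatus used here.
    interface FileInfo {
        boolean isEncrypted();
    }

    // Prefetch only when it is enabled (hbase.rs.prefetchblocksonopen)
    // and the underlying file is not encrypted.
    static boolean shouldPrefetch(boolean prefetchOnOpen, FileInfo file) {
        return prefetchOnOpen && !file.isEncrypted();
    }

    public static void main(String[] args) {
        FileInfo plain = () -> false;
        FileInfo encrypted = () -> true;
        System.out.println(shouldPrefetch(true, plain));
        System.out.println(shouldPrefetch(true, encrypted));
    }
}
```

A guard like this trades prefetch performance on encrypted files for avoiding the IllegalArgumentException in the stack trace above; the second proposed fix (documentation only) would leave the decision to the operator.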
[jira] [Commented] (HBASE-19121) HBCK for AMv2 (A.K.A HBCK2)
[ https://issues.apache.org/jira/browse/HBASE-19121?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16512757#comment-16512757 ] Umesh Agashe commented on HBASE-19121: -- Current usage: {code:java} usage: hbase org.apache.hadoop.hbase.util.HBaseFsck2 [OPTIONS] [ACTIONS] Options: -l,--timelag Restrict actions to regions that are not updated in last seconds. -e,--noExclusive Run even if another instance of hbck is running. -t,--tables Restrict actions to specified comma seperated list of tables. -r,--regions Restrict actions to specified comma seperated list of regions. -s,--regionServers Restrict actions to specified comma seperated list of region servers. -d,--details Report details. -v,--verbose Verbose output. Actions: FixAssignments Try fixing assignments of regions stuck in transition by submitting assign/ unassign procedures.{code} > HBCK for AMv2 (A.K.A HBCK2) > --- > > Key: HBASE-19121 > URL: https://issues.apache.org/jira/browse/HBASE-19121 > Project: HBase > Issue Type: Bug > Components: hbck >Reporter: stack >Assignee: Umesh Agashe >Priority: Major > Attachments: hbase-19121.master.001.patch > > > We don't have an hbck for the new AM. Old hbck may actually do damage going > against AMv2. > Fix. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HBASE-19121) HBCK for AMv2 (A.K.A HBCK2)
[ https://issues.apache.org/jira/browse/HBASE-19121?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Umesh Agashe updated HBASE-19121: - Status: Patch Available (was: Open) > HBCK for AMv2 (A.K.A HBCK2) > --- > > Key: HBASE-19121 > URL: https://issues.apache.org/jira/browse/HBASE-19121 > Project: HBase > Issue Type: Bug > Components: hbck >Reporter: stack >Assignee: Umesh Agashe >Priority: Major > Attachments: hbase-19121.master.001.patch > > > We don't have an hbck for the new AM. Old hbck may actually do damage going > against AMv2. > Fix. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (HBASE-19121) HBCK for AMv2 (A.K.A HBCK2)
[ https://issues.apache.org/jira/browse/HBASE-19121?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Umesh Agashe reassigned HBASE-19121: Assignee: Umesh Agashe > HBCK for AMv2 (A.K.A HBCK2) > --- > > Key: HBASE-19121 > URL: https://issues.apache.org/jira/browse/HBASE-19121 > Project: HBase > Issue Type: Bug > Components: hbck >Reporter: stack >Assignee: Umesh Agashe >Priority: Major > Attachments: hbase-19121.master.001.patch > > > We don't have an hbck for the new AM. Old hbck may actually do damage going > against AMv2. > Fix. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-19121) HBCK for AMv2 (A.K.A HBCK2)
[ https://issues.apache.org/jira/browse/HBASE-19121?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16508721#comment-16508721 ] Umesh Agashe commented on HBASE-19121: -- HBCK2 will evolve. First version with basic command line options and parsing is in 001 patch. It also has action to FixAssignments. > HBCK for AMv2 (A.K.A HBCK2) > --- > > Key: HBASE-19121 > URL: https://issues.apache.org/jira/browse/HBASE-19121 > Project: HBase > Issue Type: Bug > Components: hbck >Reporter: stack >Priority: Major > Attachments: hbase-19121.master.001.patch > > > We don't have an hbck for the new AM. Old hbck may actually do damage going > against AMv2. > Fix. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HBASE-19121) HBCK for AMv2 (A.K.A HBCK2)
[ https://issues.apache.org/jira/browse/HBASE-19121?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Umesh Agashe updated HBASE-19121: - Attachment: hbase-19121.master.001.patch > HBCK for AMv2 (A.K.A HBCK2) > --- > > Key: HBASE-19121 > URL: https://issues.apache.org/jira/browse/HBASE-19121 > Project: HBase > Issue Type: Bug > Components: hbck >Reporter: stack >Priority: Major > Attachments: hbase-19121.master.001.patch > > > We don't have an hbck for the new AM. Old hbck may actually do damage going > against AMv2. > Fix. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-20679) Add the ability to compile JSP dynamically in Jetty
[ https://issues.apache.org/jira/browse/HBASE-20679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16502259#comment-16502259 ] Umesh Agashe commented on HBASE-20679: -- Thanks for the patch [~allan163]! @stack, we need to discuss moving away from the running-master requirement for hbck2. > Add the ability to compile JSP dynamically in Jetty > --- > > Key: HBASE-20679 > URL: https://issues.apache.org/jira/browse/HBASE-20679 > Project: HBase > Issue Type: New Feature >Affects Versions: 2.0.0 >Reporter: Allan Yang >Assignee: Allan Yang >Priority: Major > Fix For: 3.0.0 > > Attachments: HBASE-20679.002.patch, HBASE-20679.patch > > > As discussed in HBASE-20617, adding the ability to dynamically compile jsp > enables us to do some hot fixes. > For example, several days ago, in our testing HBase-2.0 cluster, > procedureWals were corrupted for some unknown reason. After restarting > the cluster, some procedures (AssignProcedure for example) were > corrupted and couldn't be replayed, so some regions were stuck in RIT forever. > We couldn't use HBCK since it doesn't support AssignmentV2 yet. As a matter > of fact, the namespace region was not online, so the master was not initialized, and > we couldn't even use shell commands like assign/move. But we wrote a jsp and > fixed this issue easily. 
The jsp file is like this: > {code:java} > <% > String action = request.getParameter("action"); > HMaster master = (HMaster)getServletContext().getAttribute(HMaster.MASTER); > List offlineRegionsToAssign = new ArrayList<>(); > List regionRITs = > master.getAssignmentManager() > .getRegionStates().getRegionsInTransition(); > for (RegionStates.RegionStateNode regionStateNode : regionRITs) { > // if regionStateNode don't have a procedure attached, but meta state > shows > // this region is in RIT, that means the previous procedure may be > corrupted > // we need to create a new assignProcedure to assign them > if (!regionStateNode.isInTransition()) { > offlineRegionsToAssign.add(regionStateNode.getRegionInfo()); > out.println("RIT region:" + regionStateNode); > } > } > // Assign offline regions. Uses round-robin. > if ("fix".equals(action) && offlineRegionsToAssign.size() > 0) { > > master.getMasterProcedureExecutor().submitProcedures(master.getAssignmentManager(). > createRoundRobinAssignProcedures(offlineRegionsToAssign)); > } else { > out.println("use ?action=fix to fix RIT regions"); > } > %> > {code} > Above it is only one example we can do if we have the ability to compile jsp > dynamically. We think it is very useful. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-20634) Reopen region while server crash can cause the procedure to be stuck
[ https://issues.apache.org/jira/browse/HBASE-20634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16498698#comment-16498698 ] Umesh Agashe commented on HBASE-20634: -- +1 lgtm > Reopen region while server crash can cause the procedure to be stuck > > > Key: HBASE-20634 > URL: https://issues.apache.org/jira/browse/HBASE-20634 > Project: HBase > Issue Type: Bug >Reporter: Duo Zhang >Assignee: stack >Priority: Critical > Fix For: 3.0.0, 2.1.0, 2.0.1 > > Attachments: HBASE-20634-UT.patch, HBASE-20634.branch-2.0.001.patch, > HBASE-20634.branch-2.0.002.patch, HBASE-20634.branch-2.0.003.patch, > HBASE-20634.branch-2.0.004.patch, HBASE-20634.branch-2.0.005.patch, > HBASE-20634.branch-2.0.006.patch, HBASE-20634.branch-2.0.006.patch, > HBASE-20634.branch-2.0.007.patch > > > Found this when implementing HBASE-20424, where we will transit the peer sync > replication state while there is a server crash. > The problem is that, in ServerCrashAssign, we do not have the region lock, so > after we call handleRIT to clear the existing > assign/unassign procedures related to this rs, and before we schedule the > assign procedures, it is possible that we schedule an unassign procedure > for a region on the crashed rs. This procedure will not receive the > ServerCrashException; instead, in addToRemoteDispatcher, it will find that it > cannot dispatch the remote call and then a FailedRemoteDispatchException > will be raised. But we do not treat this exception the same as > ServerCrashException; instead, we will try to expire the rs. Obviously the rs > has already been marked as expired, so this is almost a no-op. Then the > procedure will be stuck there forever. > A possible way to fix it is to treat FailedRemoteDispatchException the same > as ServerCrashException, as it will be created in addToRemoteDispatcher > only, and the only reason we cannot dispatch a remote call is that the rs > is already dead. 
The nodeMap is a ConcurrentMap so I think we could use > it as a guard. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
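The fix proposed in the HBASE-20634 description, treating FailedRemoteDispatchException like ServerCrashException, could look roughly like the sketch below. Everything here is an assumption for illustration: the two exception classes are local stand-ins that only borrow the names from the issue text, and the Action enum and onRemoteFailure method are hypothetical. The real handling lives in HBase's assignment procedures and is not reproduced here.

```java
public class RemoteFailureHandling {
    // Local stand-ins named after the exceptions mentioned in the issue.
    static class ServerCrashException extends Exception {}
    static class FailedRemoteDispatchException extends Exception {}

    // Hypothetical outcomes of handling a failed remote call.
    enum Action { HANDLE_AS_SERVER_CRASH, EXPIRE_SERVER }

    // Both exceptions imply the target region server is already dead,
    // so both take the server-crash path instead of a no-op expire.
    static Action onRemoteFailure(Exception e) {
        if (e instanceof ServerCrashException
                || e instanceof FailedRemoteDispatchException) {
            return Action.HANDLE_AS_SERVER_CRASH;
        }
        return Action.EXPIRE_SERVER;
    }

    public static void main(String[] args) {
        System.out.println(onRemoteFailure(new FailedRemoteDispatchException()));
        System.out.println(onRemoteFailure(new RuntimeException("timeout")));
    }
}
```

The point of the design is that a dispatch failure no longer falls through to expiring a server that is already marked expired, which is the step the description identifies as leaving the procedure stuck.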
[jira] [Commented] (HBASE-20634) Reopen region while server crash can cause the procedure to be stuck
[ https://issues.apache.org/jira/browse/HBASE-20634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16498618#comment-16498618 ] Umesh Agashe commented on HBASE-20634: -- [~stack], I have posted comments on RB for the latest patch. Thanks! > Reopen region while server crash can cause the procedure to be stuck > > > Key: HBASE-20634 > URL: https://issues.apache.org/jira/browse/HBASE-20634 > Project: HBase > Issue Type: Bug >Reporter: Duo Zhang >Assignee: stack >Priority: Critical > Fix For: 3.0.0, 2.1.0, 2.0.1 > > Attachments: HBASE-20634-UT.patch, HBASE-20634.branch-2.0.001.patch, > HBASE-20634.branch-2.0.002.patch, HBASE-20634.branch-2.0.003.patch, > HBASE-20634.branch-2.0.004.patch, HBASE-20634.branch-2.0.005.patch, > HBASE-20634.branch-2.0.006.patch, HBASE-20634.branch-2.0.006.patch > > > Found this when implementing HBASE-20424, where we will transit the peer sync > replication state while there is a server crash. > The problem is that, in ServerCrashAssign, we do not have the region lock, so > after we call handleRIT to clear the existing > assign/unassign procedures related to this rs, and before we schedule the > assign procedures, it is possible that we schedule an unassign procedure > for a region on the crashed rs. This procedure will not receive the > ServerCrashException; instead, in addToRemoteDispatcher, it will find that it > cannot dispatch the remote call and then a FailedRemoteDispatchException > will be raised. But we do not treat this exception the same as > ServerCrashException; instead, we will try to expire the rs. Obviously the rs > has already been marked as expired, so this is almost a no-op. Then the > procedure will be stuck there forever. 
> A possible way to fix it is to treat FailedRemoteDispatchException the same > as ServerCrashException, as it will be created in addToRemoteDispatcher > only, and the only reason we cannot dispatch a remote call is that the rs > is already dead. The nodeMap is a ConcurrentMap so I think we could use > it as a guard. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-20552) HBase RegionServer was shutdown due to UnexpectedStateException
[ https://issues.apache.org/jira/browse/HBASE-20552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16479831#comment-16479831 ] Umesh Agashe commented on HBASE-20552: -- [~elserj], I don't have a repro. I thought I had a repro but it was due to the bug which was inadvertently introduced in recent commit and got fixed in addendum (HBASE-20564). So far I found 2 instances of missing edits around the same time. First, in master proc wal where 003 is not able to read pids 468 onwards. And second, in meta region: pid=475 on 005 started with: {code:java} 2018-05-02 05:39:45,811 INFO [PEWorker-6] assignment.AssignProcedure: Starting pid=475, ppid=471, state=RUNNABLE:REGION_TRANSITION_QUEUE; AssignProcedure table=test_hbase_ha_load_test_tool_hbase, region=94f6ca283dbb4445b2bcdc321b734d28; rit=OFFLINE, location=ctr-e138-1518143905142-279227-01-07.hwx.site,16020,1525238558502; forceNewPlan=false, retain=true {code} After this it was updated twice on 005: {code:java} 2018-05-02 05:39:45,983 INFO [PEWorker-1] assignment.RegionStateStore: pid=475 updating hbase:meta row=94f6ca283dbb4445b2bcdc321b734d28, regionState=OPENING 2018-05-02 05:39:46,580 INFO [PEWorker-1] assignment.RegionStateStore: pid=475 updating hbase:meta row=94f6ca283dbb4445b2bcdc321b734d28, regionState=OPEN, openSeqNum=13401, regionLocation=ctr-e138-1518143905142-279227-01-02.hwx.site,16020,1525239334474 {code} But when 003 read and printed meta, it has: {code:java} 2018-05-02 05:44:08,236 INFO [master/ctr-e138-1518143905142-279227-01-03:2] assignment.RegionStateStore: Load hbase:meta entry region=94f6ca283dbb4445b2bcdc321b734d28, regionState=OPEN, lastHost=ctr-e138-1518143905142-279227-01-02.hwx.site,16020,1525239334474, regionLocation=ctr-e138-1518143905142-279227-01-07.hwx.site,16020,1525238558502 {code} The location server including timestamp matches to when pid=471 started "location=ctr-e138-1518143905142-279227-01-07.hwx.site,16020,1525238558502". 
So 2 updates from pid=471 to meta are missing. > HBase RegionServer was shutdown due to UnexpectedStateException > --- > > Key: HBASE-20552 > URL: https://issues.apache.org/jira/browse/HBASE-20552 > Project: HBase > Issue Type: Bug >Affects Versions: 2.0.0 >Reporter: Romil Choksi >Assignee: Umesh Agashe >Priority: Critical > Attachments: > 102143-master-ctr-e138-1518143905142-279227-01-03.hwx.site.log, > 102143-master-ctr-e138-1518143905142-279227-01-05.hwx.site.log, > 102143-regionserver-ctr-e138-1518143905142-279227-01-02.hwx.site.log, > 102143-regionserver-ctr-e138-1518143905142-279227-01-07.hwx.site.log, > 102143-regionserver-ctr-e138-1518143905142-279227-01-08.hwx.site.log > > > This was observed during cluster testing (source code sync'ed with hbase-2.0, > built May 2nd): > {code} > 2018-05-02 05:44:10,089 ERROR > [RpcServer.default.FPBQ.Fifo.handler=28,queue=1,port=2] > master.MasterRpcServices: Region server > ctr-e138-1518143905142-279227-01-02.hwx.site,16020,1525239334474 reported > a fatal error: > * ABORTING region server > ctr-e138-1518143905142-279227-01-02.hwx.site,16020,1525239334474: > org.apache.hadoop.hbase.YouAreDeadException: rit=OPEN, location=ctr-e138- > 1518143905142-279227-01-07.hwx.site,16020,1525239609353, > table=test_hbase_ha_load_test_tool_hbase, > region=94f6ca283dbb4445b2bcdc321b734d28reported OPEN on server=ctr-e138- > 1518143905142-279227-01-02.hwx.site,16020,1525239334474 but state has > otherwise. 
> at > org.apache.hadoop.hbase.master.assignment.AssignmentManager.checkOnlineRegionsReport(AssignmentManager.java:1065) > at > org.apache.hadoop.hbase.master.assignment.AssignmentManager.reportOnlineRegions(AssignmentManager.java:987) > at > org.apache.hadoop.hbase.master.MasterRpcServices.regionServerReport(MasterRpcServices.java:459) > at > org.apache.hadoop.hbase.shaded.protobuf.generated.RegionServerStatusProtos$RegionServerStatusService$2.callBlockingMethod(RegionServerStatusProtos.java:13118) > at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:409) > at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:131) > at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:324) > at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:304) > Caused by: org.apache.hadoop.hbase.exceptions.UnexpectedStateException: > rit=OPEN, > location=ctr-e138-1518143905142-279227-01-07.hwx.site,16020,1525239609353, > table=test_hbase_ha_load_test_tool_hbase, > region=94f6ca283dbb4445b2bcdc321b734d28reported OPEN on > server=ctr-e138-1518143905142-279227-01-02.hwx.site,16020,1525239334474 > but state has other
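The comparison made in the comment above (the regionLocation written by pid=475 on 005 versus the regionLocation that 003 loaded from meta) can be repeated mechanically on the two log lines. The following is only a minimal sketch: MetaLocationCheck and its fields method are hypothetical helpers written for this illustration, not HBase code, and the parsing assumes the "key=value, key=value" layout seen in the RegionStateStore lines quoted in the comment.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper (not part of HBase): extracts key=value fields from a log line. */
final class MetaLocationCheck {
    // A value runs up to the next ", " separator, whitespace, or end of line, so a
    // server name such as host,16020,1525239334474 (commas, no spaces) stays whole.
    private static final Pattern FIELD = Pattern.compile("(\\w+)=(\\S+?)(?=,\\s|\\s|$)");

    static Map<String, String> fields(String logLine) {
        Map<String, String> out = new LinkedHashMap<>();
        Matcher m = FIELD.matcher(logLine);
        while (m.find()) {
            out.put(m.group(1), m.group(2));
        }
        return out;
    }

    public static void main(String[] args) {
        // Last update written by pid=475 on 005 (message part of the log line only).
        String written = "pid=475 updating hbase:meta row=94f6ca283dbb4445b2bcdc321b734d28, "
            + "regionState=OPEN, openSeqNum=13401, "
            + "regionLocation=ctr-e138-1518143905142-279227-01-02.hwx.site,16020,1525239334474";
        // What 003 loaded from meta at startup.
        String loaded = "Load hbase:meta entry region=94f6ca283dbb4445b2bcdc321b734d28, "
            + "regionState=OPEN, "
            + "lastHost=ctr-e138-1518143905142-279227-01-02.hwx.site,16020,1525239334474, "
            + "regionLocation=ctr-e138-1518143905142-279227-01-07.hwx.site,16020,1525238558502";

        String writtenLocation = fields(written).get("regionLocation");
        String loadedLocation = fields(loaded).get("regionLocation");
        System.out.println("written regionLocation: " + writtenLocation);
        System.out.println("loaded regionLocation:  " + loadedLocation);
        System.out.println("locations match: " + writtenLocation.equals(loadedLocation));
    }
}
```

For these two lines the loaded lastHost equals the written regionLocation while the loaded regionLocation does not, which is the mismatch the comment describes.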
[jira] [Commented] (HBASE-20552) HBase RegionServer was shutdown due to UnexpectedStateException
[ https://issues.apache.org/jira/browse/HBASE-20552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16478044#comment-16478044 ] Umesh Agashe commented on HBASE-20552: -- [~yuzhih...@gmail.com], Just want to confirm that you saw this on branch-2.0 or master? > HBase RegionServer was shutdown due to UnexpectedStateException > --- > > Key: HBASE-20552 > URL: https://issues.apache.org/jira/browse/HBASE-20552 > Project: HBase > Issue Type: Bug >Affects Versions: 2.0.0 >Reporter: Romil Choksi >Assignee: Umesh Agashe >Priority: Critical > Attachments: > 102143-master-ctr-e138-1518143905142-279227-01-03.hwx.site.log, > 102143-master-ctr-e138-1518143905142-279227-01-05.hwx.site.log, > 102143-regionserver-ctr-e138-1518143905142-279227-01-02.hwx.site.log, > 102143-regionserver-ctr-e138-1518143905142-279227-01-07.hwx.site.log, > 102143-regionserver-ctr-e138-1518143905142-279227-01-08.hwx.site.log > > > This was observed during cluster testing (source code sync'ed with hbase-2.0, > built May 2nd): > {code} > 2018-05-02 05:44:10,089 ERROR > [RpcServer.default.FPBQ.Fifo.handler=28,queue=1,port=2] > master.MasterRpcServices: Region server > ctr-e138-1518143905142-279227-01-02.hwx.site,16020,1525239334474 reported > a fatal error: > * ABORTING region server > ctr-e138-1518143905142-279227-01-02.hwx.site,16020,1525239334474: > org.apache.hadoop.hbase.YouAreDeadException: rit=OPEN, location=ctr-e138- > 1518143905142-279227-01-07.hwx.site,16020,1525239609353, > table=test_hbase_ha_load_test_tool_hbase, > region=94f6ca283dbb4445b2bcdc321b734d28reported OPEN on server=ctr-e138- > 1518143905142-279227-01-02.hwx.site,16020,1525239334474 but state has > otherwise. 
> at > org.apache.hadoop.hbase.master.assignment.AssignmentManager.checkOnlineRegionsReport(AssignmentManager.java:1065) > at > org.apache.hadoop.hbase.master.assignment.AssignmentManager.reportOnlineRegions(AssignmentManager.java:987) > at > org.apache.hadoop.hbase.master.MasterRpcServices.regionServerReport(MasterRpcServices.java:459) > at > org.apache.hadoop.hbase.shaded.protobuf.generated.RegionServerStatusProtos$RegionServerStatusService$2.callBlockingMethod(RegionServerStatusProtos.java:13118) > at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:409) > at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:131) > at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:324) > at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:304) > Caused by: org.apache.hadoop.hbase.exceptions.UnexpectedStateException: > rit=OPEN, > location=ctr-e138-1518143905142-279227-01-07.hwx.site,16020,1525239609353, > table=test_hbase_ha_load_test_tool_hbase, > region=94f6ca283dbb4445b2bcdc321b734d28reported OPEN on > server=ctr-e138-1518143905142-279227-01-02.hwx.site,16020,1525239334474 > but state has otherwise. > at > org.apache.hadoop.hbase.master.assignment.AssignmentManager.checkOnlineRegionsReport(AssignmentManager.java:1037) > ... 7 more > * > Cause: > org.apache.hadoop.hbase.YouAreDeadException: > org.apache.hadoop.hbase.YouAreDeadException: rit=OPEN, > location=ctr-e138-1518143905142-279227-01-07.hwx.site,16020,1525239609353, >table=test_hbase_ha_load_test_tool_hbase, > region=94f6ca283dbb4445b2bcdc321b734d28reported OPEN on > server=ctr-e138-1518143905142-279227-01-02.hwx.site,16020,1525239334474 > but state has otherwise. 
> at > org.apache.hadoop.hbase.master.assignment.AssignmentManager.checkOnlineRegionsReport(AssignmentManager.java:1065) > at > org.apache.hadoop.hbase.master.assignment.AssignmentManager.reportOnlineRegions(AssignmentManager.java:987) > at > org.apache.hadoop.hbase.master.MasterRpcServices.regionServerReport(MasterRpcServices.java:459) > at > org.apache.hadoop.hbase.shaded.protobuf.generated.RegionServerStatusProtos$RegionServerStatusService$2.callBlockingMethod(RegionServerStatusProtos.java:13118) > at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:409) > at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:131) > at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:324) > at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:304) > Caused by: org.apache.hadoop.hbase.exceptions.UnexpectedStateException: > rit=OPEN, > location=ctr-e138-1518143905142-279227-01-07.hwx.site,16020,1525239609353, > table=test_hbase_ha_load_test_tool_hbase, > region=94f6ca283dbb4445b2bcdc321b734d28reported OPEN on > server=ctr-e138-1518143905142-279227-01-02.hwx.site,16020,1525239334474 > but state has
[jira] [Commented] (HBASE-20552) HBase RegionServer was shutdown due to UnexpectedStateException
[ https://issues.apache.org/jira/browse/HBASE-20552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16472885#comment-16472885 ] Umesh Agashe commented on HBASE-20552: -- I think it's a real problem in the code. Working on a repro and the patch.
[jira] [Work started] (HBASE-20552) HBase RegionServer was shutdown due to UnexpectedStateException
[ https://issues.apache.org/jira/browse/HBASE-20552?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on HBASE-20552 started by Umesh Agashe.
[jira] [Commented] (HBASE-20552) HBase RegionServer was shutdown due to UnexpectedStateException
[ https://issues.apache.org/jira/browse/HBASE-20552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16472482#comment-16472482 ]
Umesh Agashe commented on HBASE-20552: --
Further, M003 starts SCP with pid=507 for R007:
{code:java}
2018-05-02 05:44:08,413 INFO [PEWorker-6] procedure.ServerCrashProcedure: Start pid=507, state=RUNNABLE:SERVER_CRASH_START; ServerCrashProcedure server=ctr-e138-1518143905142-279227-01-07.hwx.site,16020,1525238558502, splitWal=true, meta=false
{code}
This starts AssignProcedure with pid=508 for region 94f6ca283dbb4445b2bcdc321b734d28:
{code:java}
2018-05-02 05:44:08,480 INFO [PEWorker-6] assignment.AssignProcedure: Starting pid=508, ppid=507, state=RUNNABLE:REGION_TRANSITION_QUEUE; AssignProcedure table=test_hbase_ha_load_test_tool_hbase, region=94f6ca283dbb4445b2bcdc321b734d28; rit=OFFLINE, location=ctr-e138-1518143905142-279227-01-07.hwx.site,16020,1525238558502; forceNewPlan=false, retain=true
2018-05-02 05:44:08,659 INFO [PEWorker-11] assignment.RegionStateStore: pid=508 updating hbase:meta row=94f6ca283dbb4445b2bcdc321b734d28, regionState=OPENING, regionLocation=ctr-e138-1518143905142-279227-01-07.hwx.site,16020,1525239609353
2018-05-02 05:44:08,727 INFO [PEWorker-11] assignment.RegionTransitionProcedure: Dispatch pid=508, ppid=507, state=RUNNABLE:REGION_TRANSITION_DISPATCH; AssignProcedure table=test_hbase_ha_load_test_tool_hbase, region=94f6ca283dbb4445b2bcdc321b734d28; rit=OPENING, location=ctr-e138-1518143905142-279227-01-07.hwx.site,16020,1525239609353
...
2018-05-02 05:44:09,213 DEBUG [RpcServer.default.FPBQ.Fifo.handler=28,queue=1,port=2] assignment.RegionTransitionProcedure: Received report OPENED seqId=13402, pid=508, ppid=507, state=RUNNABLE:REGION_TRANSITION_DISPATCH; AssignProcedure table=test_hbase_ha_load_test_tool_hbase, region=94f6ca283dbb4445b2bcdc321b734d28; rit=OPENING, location=ctr-e138-1518143905142-279227-01-07.hwx.site,16020,1525239609353
2018-05-02 05:44:09,213 DEBUG [PEWorker-12] assignment.RegionTransitionProcedure: Finishing pid=508, ppid=507, state=RUNNABLE:REGION_TRANSITION_FINISH; AssignProcedure table=test_hbase_ha_load_test_tool_hbase, region=94f6ca283dbb4445b2bcdc321b734d28; rit=OPENING, location=ctr-e138-1518143905142-279227-01-07.hwx.site,16020,1525239609353
2018-05-02 05:44:09,214 INFO [PEWorker-12] assignment.RegionStateStore: pid=508 updating hbase:meta row=94f6ca283dbb4445b2bcdc321b734d28, regionState=OPEN, openSeqNum=13402, regionLocation=ctr-e138-1518143905142-279227-01-07.hwx.site,16020,1525239609353
2018-05-02 05:44:09,258 INFO [PEWorker-12] procedure2.ProcedureExecutor: Finished subprocedure(s) of pid=507, state=RUNNABLE:SERVER_CRASH_HANDLE_RIT2; ServerCrashProcedure server=ctr-e138-1518143905142-279227-01-07.hwx.site,16020,1525238558502, splitWal=true, meta=false; resume parent processing.
2018-05-02 05:44:09,258 INFO [PEWorker-12] procedure2.ProcedureExecutor: Finished pid=508, ppid=507, state=SUCCESS; AssignProcedure table=test_hbase_ha_load_test_tool_hbase, region=94f6ca283dbb4445b2bcdc321b734d28 in 764msec
2018-05-02 05:44:09,273 INFO [PEWorker-14] procedure2.ProcedureExecutor: Finished pid=507, state=SUCCESS; ServerCrashProcedure server=ctr-e138-1518143905142-279227-01-07.hwx.site,16020,1525238558502, splitWal=true, meta=false in 975msec
{code}
The strange thing is that the SCP for R007 is assigning the region back to R007!
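One detail in the log lines above may help when reading this: an HBase server name is written as host,port,startcode, and the R007 name in the SCP (startcode 1525238558502) differs in its last field from the R007 name the region is assigned to (startcode 1525239609353). The sketch below only makes that comparison explicit. ServerInstance is a hypothetical class written for this note, not HBase's own ServerName, and the comparison alone does not establish whether the assignment was correct.

```java
/** Hypothetical stand-in for a parsed "host,port,startcode" string; not an HBase class. */
final class ServerInstance {
    final String host;
    final int port;
    final long startCode;

    private ServerInstance(String host, int port, long startCode) {
        this.host = host;
        this.port = port;
        this.startCode = startCode;
    }

    /** Parses the "host,port,startcode" form used in the logs above. */
    static ServerInstance parse(String serverName) {
        String[] parts = serverName.split(",");
        if (parts.length != 3) {
            throw new IllegalArgumentException("expected host,port,startcode: " + serverName);
        }
        return new ServerInstance(parts[0], Integer.parseInt(parts[1]), Long.parseLong(parts[2]));
    }

    /** Same machine and port, regardless of which process instance. */
    boolean sameHostAndPort(ServerInstance other) {
        return host.equals(other.host) && port == other.port;
    }

    /** Same process instance: host, port, and startcode all agree. */
    boolean sameInstance(ServerInstance other) {
        return sameHostAndPort(other) && startCode == other.startCode;
    }

    public static void main(String[] args) {
        // Server named in ServerCrashProcedure pid=507.
        ServerInstance crashed = parse(
            "ctr-e138-1518143905142-279227-01-07.hwx.site,16020,1525238558502");
        // regionLocation written to hbase:meta by AssignProcedure pid=508.
        ServerInstance target = parse(
            "ctr-e138-1518143905142-279227-01-07.hwx.site,16020,1525239609353");

        System.out.println("same host and port: " + crashed.sameHostAndPort(target));
        System.out.println("same instance:      " + crashed.sameInstance(target));
        System.out.println("target started later: " + (target.startCode > crashed.startCode));
    }
}
```

Comparing on all three fields rather than on the host name alone is the design choice here, since the logs refer to both names simply as R007.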
[jira] [Commented] (HBASE-20552) HBase RegionServer was shutdown due to UnexpectedStateException
[ https://issues.apache.org/jira/browse/HBASE-20552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16472456#comment-16472456 ] Umesh Agashe commented on HBASE-20552: -- bq. Log for server 0002 was attached already. Thanks! And also for 007?
[jira] [Commented] (HBASE-20552) HBase RegionServer was shutdown due to UnexpectedStateException
[ https://issues.apache.org/jira/browse/HBASE-20552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16472438#comment-16472438 ] Umesh Agashe commented on HBASE-20552: -- bq. Was there any region on 0008 you're interested in? 670f6b815d2acac905130e5440d59304 1d954f21d711345a9587d995cecea136 91f73e76bbe7bc8a61b1b1299d34c6ab
[jira] [Comment Edited] (HBASE-20552) HBase RegionServer was shutdown due to UnexpectedStateException
[ https://issues.apache.org/jira/browse/HBASE-20552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16472409#comment-16472409 ]
Umesh Agashe edited comment on HBASE-20552 at 5/11/18 6:23 PM: ---
Usually the following warnings can be ignored, but these messages followed by "Completed pid=" look like trouble. When M003 became active at around 2018-05-02 05:43:33, there were a few warnings while reading the master proc WAL:
{code:java}
2018-05-02 05:43:33,529 WARN [master/ctr-e138-1518143905142-279227-01-03:2] wal.WALProcedureStore: Unable to read tracker for hdfs://mycluster/apps/hbase/data/MasterProcWALs/pv2-0004.log - Invalid Trailer version. got 8 expected 1
2018-05-02 05:43:33,638 DEBUG [master/ctr-e138-1518143905142-279227-01-03:2] wal.WALProcedureStore: Roll new state log: 5
2018-05-02 05:43:33,655 INFO [master/ctr-e138-1518143905142-279227-01-03:2] procedure2.ProcedureExecutor: Recovered WALProcedureStore lease in 219msec
2018-05-02 05:43:33,681 INFO [master/ctr-e138-1518143905142-279227-01-03:2] wal.ProcedureWALFormatReader: Rebuilding tracker for hdfs://mycluster/apps/hbase/data/MasterProcWALs/pv2-0004.log
2018-05-02 05:43:33,816 WARN [master/ctr-e138-1518143905142-279227-01-03:2] wal.ProcedureWALFormatReader: Nothing left to decode. Exiting with missing EOF, log=hdfs://mycluster/apps/hbase/data/MasterProcWALs/pv2-0004.log
2018-05-02 05:43:33,875 DEBUG [master/ctr-e138-1518143905142-279227-01-03:2] procedure2.ProcedureExecutor: Completed pid=467, state=SUCCESS; MoveRegionProcedure hri=4c37ee7a4e1210e481debdc2933fc4d2, source=ctr-e138-1518143905142-279227-01-02.hwx.site,16020,1525239334474, destination=ctr-e138-1518143905142-279227-01-03.hwx.site,16020,1525239425826
2018-05-02 05:43:33,876 DEBUG [master/ctr-e138-1518143905142-279227-01-03:2] procedure2.ProcedureExecutor: Completed pid=465, state=SUCCESS; MoveRegionProcedure hri=94f6ca283dbb4445b2bcdc321b734d28, source=ctr-e138-1518143905142-279227-01-02.hwx.site,16020,1525239334474, destination=ctr-e138-1518143905142-279227-01-07.hwx.site,16020,1525238558502
2018-05-02 05:43:33,876 DEBUG [master/ctr-e138-1518143905142-279227-01-03:2] procedure2.ProcedureExecutor: Completed pid=462, state=SUCCESS; MoveRegionProcedure hri=a8ff96226d546f0ea151823ae73e5a1b, source=ctr-e138-1518143905142-279227-01-02.hwx.site,16020,1525239334474, destination=ctr-e138-1518143905142-279227-01-08.hwx.site,16020,1525238658606
{code}
During startup, M003 has no log messages for procedures with ids 468 to 504, even though they ran and completed on M005. This is unusual. RecoverMetaProcedure on M003 starts with id 505, which is correct.
Orthogonal to the above observation, we have a meta update issue as well. On M005, pid=471 is the SCP for R007, which also hosts meta.
Meta is re-assigned with pid=472 to R002 which is followed by other region assignments {code:java} pid=478 e75a388bc2011feed75bdc1a0e99a9a9 regionLocation=ctr-e138-1518143905142-279227-01-02.hwx.site pid=474 670f6b815d2acac905130e5440d59304 regionLocation=ctr-e138-1518143905142-279227-01-08.hwx.site pid=479 c963eb77dbdc6dbab886dbe4eebba5ad regionLocation=ctr-e138-1518143905142-279227-01-06.hwx.site pid=481 b5180eee96b616afdf79578309c66a11 regionLocation=ctr-e138-1518143905142-279227-01-02.hwx.site pid=486 8dc6fd2022c2fdf8c065fbd16cadaaca regionLocation=ctr-e138-1518143905142-279227-01-03.hwx.site pid=480 f3db9f9879ed03f488dcb89bea834237 regionLocation=ctr-e138-1518143905142-279227-01-02.hwx.site pid=484 c078deb2474e9c19b85b5fdb9efaa47d regionLocation=ctr-e138-1518143905142-279227-01-06.hwx.site pid=475 94f6ca283dbb4445b2bcdc321b734d28 regionLocation=ctr-e138-1518143905142-279227-01-02.hwx.site pid=483 1d954f21d711345a9587d995cecea136 regionLocation=ctr-e138-1518143905142-279227-01-08.hwx.site pid=476 1595f38ee901be7c67b997fe2fc95951 regionLocation=ctr-e138-1518143905142-279227-01-06.hwx.site pid=482 a6e0d7561c4f19e78f94d37462588281 regionLocation=ctr-e138-1518143905142-279227-01-06.hwx.site pid=485 91f73e76bbe7bc8a61b1b1299d34c6ab regionLocation=ctr-e138-1518143905142-279227-01-08.hwx.site pid=477 a0620fc83de532a37f6a9bb8f99cc6c4 regionLocation=ctr-e138-1518143905142-279227-01-03.hwx.site{code} >From the logs all the procedures finished successfully without skipping steps. >Meta doesn't seem to be updated for 4 of these assignments. When M003 logs all >regions from meta at startup, locations for following 4 regions don't match >with the target locations in above procedures: {code:java} 670f6b815d2acac905130e5440d59304 ctr-e138-1518143905142-279227-01-08.hwx.site lastHost=ctr-e138-1518143905142-279227-01-07.hwx.site regionLocation=ctr-e138-1518143905142-279227-01-
[jira] [Commented] (HBASE-20552) HBase RegionServer was shutdown due to UnexpectedStateException
[ https://issues.apache.org/jira/browse/HBASE-20552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16472409#comment-16472409 ] Umesh Agashe commented on HBASE-20552: -- Usually the following warnings can be ignored, but these messages followed by "Completed pid=" look like trouble. When M003 became active at around 2018-05-02 05:43:33, there were a few warnings while reading the master proc WAL: {code:java} 2018-05-02 05:43:33,529 WARN [master/ctr-e138-1518143905142-279227-01-03:2] wal.WALProcedureStore: Unable to read tracker for hdfs://mycluster/apps/hbase/data/MasterProcWALs/pv2-0004.log - Invalid Trailer version. got 8 expected 1 2018-05-02 05:43:33,638 DEBUG [master/ctr-e138-1518143905142-279227-01-03:2] wal.WALProcedureStore: Roll new state log: 5 2018-05-02 05:43:33,655 INFO [master/ctr-e138-1518143905142-279227-01-03:2] procedure2.ProcedureExecutor: Recovered WALProcedureStore lease in 219msec 2018-05-02 05:43:33,681 INFO [master/ctr-e138-1518143905142-279227-01-03:2] wal.ProcedureWALFormatReader: Rebuilding tracker for hdfs://mycluster/apps/hbase/data/MasterProcWALs/pv2-0004.log 2018-05-02 05:43:33,816 WARN [master/ctr-e138-1518143905142-279227-01-03:2] wal.ProcedureWALFormatReader: Nothing left to decode. Exiting with missing EOF, log=hdfs://mycluster/apps/hbase/data/MasterProcWALs/pv2-0004.log 2018-05-02 05:43:33,875 DEBUG [master/ctr-e138-1518143905142-279227-01-03:2] procedure2.ProcedureExecutor: Completed pid=467, state=SUCCESS; MoveRegionProcedure hri=4c37ee7a4e1210e481debdc2933fc4d2, source=ctr-e138-1518143905142-279227-01-02.hwx.site,16020,1525239334474, destination=ctr-e138-1518143905142-279227-01-03.hwx.site,16020,1525239425826 2018-05-02 05:43:33,876 DEBUG [master/ctr-e138-1518143905142-279227-01-03:2] procedure2.ProcedureExecutor: Completed pid=465, state=SUCCESS; MoveRegionProcedure hri=94f6ca283dbb4445b2bcdc321b734d28, source=ctr-e138-1518143905142-279227-01-02.hwx.site,16020,1525239334474, destination=ctr-e138-1518143905142-279227-01-07.hwx.site,16020,1525238558502 2018-05-02 05:43:33,876 DEBUG [master/ctr-e138-1518143905142-279227-01-03:2] procedure2.ProcedureExecutor: Completed pid=462, state=SUCCESS; MoveRegionProcedure hri=a8ff96226d546f0ea151823ae73e5a1b, source=ctr-e138-1518143905142-279227-01-02.hwx.site,16020,1525239334474, destination=ctr-e138-1518143905142-279227-01-08.hwx.site,16020,1525238658606{code} During startup, M003 has no log messages for procedures with ids 468 to 504 even though they ran and completed on M005. This is unusual. RecoverMetaProcedure on M003 starts with id 505, which is correct. Orthogonal to the above observation, we have a meta update issue as well. On M005, pid=471 is the SCP for R007, which also hosts meta.
Meta is re-assigned with pid=472 to R002, which is followed by other region assignments: {code:java} pid=478 e75a388bc2011feed75bdc1a0e99a9a9 regionLocation=ctr-e138-1518143905142-279227-01-02.hwx.site pid=474 670f6b815d2acac905130e5440d59304 regionLocation=ctr-e138-1518143905142-279227-01-08.hwx.site pid=479 c963eb77dbdc6dbab886dbe4eebba5ad regionLocation=ctr-e138-1518143905142-279227-01-06.hwx.site pid=481 b5180eee96b616afdf79578309c66a11 regionLocation=ctr-e138-1518143905142-279227-01-02.hwx.site pid=486 8dc6fd2022c2fdf8c065fbd16cadaaca regionLocation=ctr-e138-1518143905142-279227-01-03.hwx.site pid=480 f3db9f9879ed03f488dcb89bea834237 regionLocation=ctr-e138-1518143905142-279227-01-02.hwx.site pid=484 c078deb2474e9c19b85b5fdb9efaa47d regionLocation=ctr-e138-1518143905142-279227-01-06.hwx.site pid=475 94f6ca283dbb4445b2bcdc321b734d28 regionLocation=ctr-e138-1518143905142-279227-01-02.hwx.site pid=483 1d954f21d711345a9587d995cecea136 regionLocation=ctr-e138-1518143905142-279227-01-08.hwx.site pid=476 1595f38ee901be7c67b997fe2fc95951 regionLocation=ctr-e138-1518143905142-279227-01-06.hwx.site pid=482 a6e0d7561c4f19e78f94d37462588281 regionLocation=ctr-e138-1518143905142-279227-01-06.hwx.site pid=485 91f73e76bbe7bc8a61b1b1299d34c6ab regionLocation=ctr-e138-1518143905142-279227-01-08.hwx.site pid=477 a0620fc83de532a37f6a9bb8f99cc6c4 regionLocation=ctr-e138-1518143905142-279227-01-03.hwx.site{code} From the logs, all the procedures finished successfully without skipping steps. Meta doesn't seem to be updated for 4 of these assignments. When M003 logs all regions from meta at startup, the locations for the following 4 regions don't match the target locations in the above procedures: {code:java} 670f6b815d2acac905130e5440d59304 ctr-e138-1518143905142-279227-01-08.hwx.site lastHost=ctr-e138-1518143905142-279227-01-07.hwx.site regionLocation=ctr-e138-1518143905142-279227-01-07.hwx.site 94f6ca283dbb4445b2bcdc321b734
[jira] [Commented] (HBASE-20544) downstream HBaseTestingUtility fails with invalid port
[ https://issues.apache.org/jira/browse/HBASE-20544?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16472384#comment-16472384 ] Umesh Agashe commented on HBASE-20544: -- +1 for addendum > downstream HBaseTestingUtility fails with invalid port > -- > > Key: HBASE-20544 > URL: https://issues.apache.org/jira/browse/HBASE-20544 > Project: HBase > Issue Type: Bug > Components: test >Affects Versions: 2.0.0 >Reporter: Sean Busbey >Assignee: Sean Busbey >Priority: Blocker > Fix For: 3.0.0, 2.1.0, 2.0.1 > > Attachments: HBASE-20544.0.patch, HBASE-20544.1.patch, > HBASE-20544.2.patch, HBASE-20544.addendum.0.patch > > > Attempting to update hbase-downstreamer to use our 2.0.0 release fails with > an invalid port in the event that {{hbase.localcluster.assign.random.ports}} > isn't set (or is set to false, specifically): > {code} > 2018-05-08 06:10:06,508 ERROR [main] regionserver.HRegionServer > (HRegionServer.java:(631)) - Failed construction RegionServer > java.lang.IllegalArgumentException: port out of range:-1 > at java.net.InetSocketAddress.checkPort(InetSocketAddress.java:143) > at java.net.InetSocketAddress.(InetSocketAddress.java:224) > at > org.apache.hadoop.hbase.regionserver.RSRpcServices.(RSRpcServices.java:1217) > at > org.apache.hadoop.hbase.regionserver.RSRpcServices.(RSRpcServices.java:1184) > at > org.apache.hadoop.hbase.regionserver.HRegionServer.createRpcServices(HRegionServer.java:723) > at > org.apache.hadoop.hbase.regionserver.HRegionServer.(HRegionServer.java:561) > at > org.apache.hadoop.hbase.MiniHBaseCluster$MiniHBaseClusterRegionServer.(MiniHBaseCluster.java:147) > at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) > at > sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62) > at > sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) > at java.lang.reflect.Constructor.newInstance(Constructor.java:423) > 
at > org.apache.hadoop.hbase.util.JVMClusterUtil.createRegionServerThread(JVMClusterUtil.java:86) > at > org.apache.hadoop.hbase.LocalHBaseCluster.addRegionServer(LocalHBaseCluster.java:184) > at > org.apache.hadoop.hbase.LocalHBaseCluster$1.run(LocalHBaseCluster.java:198) > at > org.apache.hadoop.hbase.LocalHBaseCluster$1.run(LocalHBaseCluster.java:195) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657) > at > org.apache.hadoop.hbase.security.User$SecureHadoopUser.runAs(User.java:313) > at > org.apache.hadoop.hbase.LocalHBaseCluster.addRegionServer(LocalHBaseCluster.java:194) > at > org.apache.hadoop.hbase.MiniHBaseCluster.init(MiniHBaseCluster.java:261) > at > org.apache.hadoop.hbase.MiniHBaseCluster.(MiniHBaseCluster.java:121) > at > org.apache.hadoop.hbase.HBaseTestingUtility.startMiniHBaseCluster(HBaseTestingUtility.java:1042) > at > org.apache.hadoop.hbase.HBaseTestingUtility.startMiniCluster(HBaseTestingUtility.java:988) > at > org.apache.hadoop.hbase.HBaseTestingUtility.startMiniCluster(HBaseTestingUtility.java:859) > at > org.apache.hadoop.hbase.HBaseTestingUtility.startMiniCluster(HBaseTestingUtility.java:853) > at > org.apache.hadoop.hbase.HBaseTestingUtility.startMiniCluster(HBaseTestingUtility.java:782) > at > org.hbase.downstreamer.TestHBaseMiniCluster.testSpinUpMiniHBaseCluster(TestHBaseMiniCluster.java:16) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47) > at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > at > 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44) > at > org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271) > at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70) > at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50) > at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238) > at org.junit.runners.Pa
[jira] [Assigned] (HBASE-20552) HBase RegionServer was shutdown due to UnexpectedStateException
[ https://issues.apache.org/jira/browse/HBASE-20552?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Umesh Agashe reassigned HBASE-20552: Assignee: Umesh Agashe > HBase RegionServer was shutdown due to UnexpectedStateException > --- > > Key: HBASE-20552 > URL: https://issues.apache.org/jira/browse/HBASE-20552 > Project: HBase > Issue Type: Bug >Affects Versions: 2.0.0 >Reporter: Romil Choksi >Assignee: Umesh Agashe >Priority: Critical > Attachments: > 102143-master-ctr-e138-1518143905142-279227-01-03.hwx.site.log, > 102143-master-ctr-e138-1518143905142-279227-01-05.hwx.site.log, > 102143-regionserver-ctr-e138-1518143905142-279227-01-02.hwx.site.log > > > This was observed during cluster testing (source code sync'ed with hbase-2.0, > built May 2nd): > {code} > 2018-05-02 05:44:10,089 ERROR > [RpcServer.default.FPBQ.Fifo.handler=28,queue=1,port=2] > master.MasterRpcServices: Region server > ctr-e138-1518143905142-279227-01-02.hwx.site,16020,1525239334474 reported > a fatal error: > * ABORTING region server > ctr-e138-1518143905142-279227-01-02.hwx.site,16020,1525239334474: > org.apache.hadoop.hbase.YouAreDeadException: rit=OPEN, location=ctr-e138- > 1518143905142-279227-01-07.hwx.site,16020,1525239609353, > table=test_hbase_ha_load_test_tool_hbase, > region=94f6ca283dbb4445b2bcdc321b734d28reported OPEN on server=ctr-e138- > 1518143905142-279227-01-02.hwx.site,16020,1525239334474 but state has > otherwise. 
> at > org.apache.hadoop.hbase.master.assignment.AssignmentManager.checkOnlineRegionsReport(AssignmentManager.java:1065) > at > org.apache.hadoop.hbase.master.assignment.AssignmentManager.reportOnlineRegions(AssignmentManager.java:987) > at > org.apache.hadoop.hbase.master.MasterRpcServices.regionServerReport(MasterRpcServices.java:459) > at > org.apache.hadoop.hbase.shaded.protobuf.generated.RegionServerStatusProtos$RegionServerStatusService$2.callBlockingMethod(RegionServerStatusProtos.java:13118) > at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:409) > at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:131) > at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:324) > at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:304) > Caused by: org.apache.hadoop.hbase.exceptions.UnexpectedStateException: > rit=OPEN, > location=ctr-e138-1518143905142-279227-01-07.hwx.site,16020,1525239609353, > table=test_hbase_ha_load_test_tool_hbase, > region=94f6ca283dbb4445b2bcdc321b734d28reported OPEN on > server=ctr-e138-1518143905142-279227-01-02.hwx.site,16020,1525239334474 > but state has otherwise. > at > org.apache.hadoop.hbase.master.assignment.AssignmentManager.checkOnlineRegionsReport(AssignmentManager.java:1037) > ... 7 more > * > Cause: > org.apache.hadoop.hbase.YouAreDeadException: > org.apache.hadoop.hbase.YouAreDeadException: rit=OPEN, > location=ctr-e138-1518143905142-279227-01-07.hwx.site,16020,1525239609353, >table=test_hbase_ha_load_test_tool_hbase, > region=94f6ca283dbb4445b2bcdc321b734d28reported OPEN on > server=ctr-e138-1518143905142-279227-01-02.hwx.site,16020,1525239334474 > but state has otherwise. 
> at > org.apache.hadoop.hbase.master.assignment.AssignmentManager.checkOnlineRegionsReport(AssignmentManager.java:1065) > at > org.apache.hadoop.hbase.master.assignment.AssignmentManager.reportOnlineRegions(AssignmentManager.java:987) > at > org.apache.hadoop.hbase.master.MasterRpcServices.regionServerReport(MasterRpcServices.java:459) > at > org.apache.hadoop.hbase.shaded.protobuf.generated.RegionServerStatusProtos$RegionServerStatusService$2.callBlockingMethod(RegionServerStatusProtos.java:13118) > at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:409) > at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:131) > at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:324) > at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:304) > Caused by: org.apache.hadoop.hbase.exceptions.UnexpectedStateException: > rit=OPEN, > location=ctr-e138-1518143905142-279227-01-07.hwx.site,16020,1525239609353, > table=test_hbase_ha_load_test_tool_hbase, > region=94f6ca283dbb4445b2bcdc321b734d28reported OPEN on > server=ctr-e138-1518143905142-279227-01-02.hwx.site,16020,1525239334474 > but state has otherwise. > at > org.apache.hadoop.hbase.master.assignment.AssignmentManager.checkOnlineRegionsReport(AssignmentManager.java:1037) > ... 7 more > at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) > at > sun.reflect.NativeConstructorA
[jira] [Commented] (HBASE-20552) HBase RegionServer was shutdown due to UnexpectedStateException
[ https://issues.apache.org/jira/browse/HBASE-20552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16469544#comment-16469544 ] Umesh Agashe commented on HBASE-20552: -- Thanks for attaching the logs. I need to go through the logs to see if it's similar to what we have seen so far... > HBase RegionServer was shutdown due to UnexpectedStateException > --- > > Key: HBASE-20552 > URL: https://issues.apache.org/jira/browse/HBASE-20552 > Project: HBase > Issue Type: Bug >Affects Versions: 2.0.0 >Reporter: Romil Choksi >Priority: Critical > Attachments: > 102143-master-ctr-e138-1518143905142-279227-01-03.hwx.site.log, > 102143-master-ctr-e138-1518143905142-279227-01-05.hwx.site.log, > 102143-regionserver-ctr-e138-1518143905142-279227-01-02.hwx.site.log > > > This was observed during cluster testing (source code sync'ed with hbase-2.0, > built May 2nd): > {code} > 2018-05-02 05:44:10,089 ERROR > [RpcServer.default.FPBQ.Fifo.handler=28,queue=1,port=2] > master.MasterRpcServices: Region server > ctr-e138-1518143905142-279227-01-02.hwx.site,16020,1525239334474 reported > a fatal error: > * ABORTING region server > ctr-e138-1518143905142-279227-01-02.hwx.site,16020,1525239334474: > org.apache.hadoop.hbase.YouAreDeadException: rit=OPEN, location=ctr-e138- > 1518143905142-279227-01-07.hwx.site,16020,1525239609353, > table=test_hbase_ha_load_test_tool_hbase, > region=94f6ca283dbb4445b2bcdc321b734d28reported OPEN on server=ctr-e138- > 1518143905142-279227-01-02.hwx.site,16020,1525239334474 but state has > otherwise. 
> at > org.apache.hadoop.hbase.master.assignment.AssignmentManager.checkOnlineRegionsReport(AssignmentManager.java:1065) > at > org.apache.hadoop.hbase.master.assignment.AssignmentManager.reportOnlineRegions(AssignmentManager.java:987) > at > org.apache.hadoop.hbase.master.MasterRpcServices.regionServerReport(MasterRpcServices.java:459) > at > org.apache.hadoop.hbase.shaded.protobuf.generated.RegionServerStatusProtos$RegionServerStatusService$2.callBlockingMethod(RegionServerStatusProtos.java:13118) > at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:409) > at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:131) > at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:324) > at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:304) > Caused by: org.apache.hadoop.hbase.exceptions.UnexpectedStateException: > rit=OPEN, > location=ctr-e138-1518143905142-279227-01-07.hwx.site,16020,1525239609353, > table=test_hbase_ha_load_test_tool_hbase, > region=94f6ca283dbb4445b2bcdc321b734d28reported OPEN on > server=ctr-e138-1518143905142-279227-01-02.hwx.site,16020,1525239334474 > but state has otherwise. > at > org.apache.hadoop.hbase.master.assignment.AssignmentManager.checkOnlineRegionsReport(AssignmentManager.java:1037) > ... 7 more > * > Cause: > org.apache.hadoop.hbase.YouAreDeadException: > org.apache.hadoop.hbase.YouAreDeadException: rit=OPEN, > location=ctr-e138-1518143905142-279227-01-07.hwx.site,16020,1525239609353, >table=test_hbase_ha_load_test_tool_hbase, > region=94f6ca283dbb4445b2bcdc321b734d28reported OPEN on > server=ctr-e138-1518143905142-279227-01-02.hwx.site,16020,1525239334474 > but state has otherwise. 
> at > org.apache.hadoop.hbase.master.assignment.AssignmentManager.checkOnlineRegionsReport(AssignmentManager.java:1065) > at > org.apache.hadoop.hbase.master.assignment.AssignmentManager.reportOnlineRegions(AssignmentManager.java:987) > at > org.apache.hadoop.hbase.master.MasterRpcServices.regionServerReport(MasterRpcServices.java:459) > at > org.apache.hadoop.hbase.shaded.protobuf.generated.RegionServerStatusProtos$RegionServerStatusService$2.callBlockingMethod(RegionServerStatusProtos.java:13118) > at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:409) > at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:131) > at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:324) > at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:304) > Caused by: org.apache.hadoop.hbase.exceptions.UnexpectedStateException: > rit=OPEN, > location=ctr-e138-1518143905142-279227-01-07.hwx.site,16020,1525239609353, > table=test_hbase_ha_load_test_tool_hbase, > region=94f6ca283dbb4445b2bcdc321b734d28reported OPEN on > server=ctr-e138-1518143905142-279227-01-02.hwx.site,16020,1525239334474 > but state has otherwise. > at > org.apache.hadoop.hbase.master.assignment.AssignmentManager.checkOnlineRegionsReport(AssignmentManager.java:1037) > ... 7 more > at sun.reflect.N
[jira] [Commented] (HBASE-20544) downstream HBaseTestingUtility fails with invalid port
[ https://issues.apache.org/jira/browse/HBASE-20544?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16469511#comment-16469511 ] Umesh Agashe commented on HBASE-20544: -- IMO defaults ever being set to -1 is possible but not probable. If defaults are ever set to -1 then shouldn't the condition be like: {code:java} int port = conf.getInt(HConstants.MASTER_INFO_PORT, HConstants.DEFAULT_MASTER_INFOPORT); if (port != -1 && port == HConstants.DEFAULT_MASTER_INFOPORT) { {code} Feel free to ignore the nit. I've already added my +1 to the changes. > downstream HBaseTestingUtility fails with invalid port > -- > > Key: HBASE-20544 > URL: https://issues.apache.org/jira/browse/HBASE-20544 > Project: HBase > Issue Type: Bug > Components: test >Affects Versions: 2.0.0 >Reporter: Sean Busbey >Assignee: Sean Busbey >Priority: Blocker > Fix For: 3.0.0, 2.1.0, 2.0.1 > > Attachments: HBASE-20544.0.patch, HBASE-20544.1.patch, > HBASE-20544.2.patch > > > Attempting to update hbase-downstreamer to use our 2.0.0 release fails with > an invalid port in the event that {{hbase.localcluster.assign.random.ports}} > isn't set (or is set to false, specifically): > {code} > 2018-05-08 06:10:06,508 ERROR [main] regionserver.HRegionServer > (HRegionServer.java:(631)) - Failed construction RegionServer > java.lang.IllegalArgumentException: port out of range:-1 > at java.net.InetSocketAddress.checkPort(InetSocketAddress.java:143) > at java.net.InetSocketAddress.(InetSocketAddress.java:224) > at > org.apache.hadoop.hbase.regionserver.RSRpcServices.(RSRpcServices.java:1217) > at > org.apache.hadoop.hbase.regionserver.RSRpcServices.(RSRpcServices.java:1184) > at > org.apache.hadoop.hbase.regionserver.HRegionServer.createRpcServices(HRegionServer.java:723) > at > org.apache.hadoop.hbase.regionserver.HRegionServer.(HRegionServer.java:561) > at > org.apache.hadoop.hbase.MiniHBaseCluster$MiniHBaseClusterRegionServer.(MiniHBaseCluster.java:147) > at 
sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) > at > sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62) > at > sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) > at java.lang.reflect.Constructor.newInstance(Constructor.java:423) > at > org.apache.hadoop.hbase.util.JVMClusterUtil.createRegionServerThread(JVMClusterUtil.java:86) > at > org.apache.hadoop.hbase.LocalHBaseCluster.addRegionServer(LocalHBaseCluster.java:184) > at > org.apache.hadoop.hbase.LocalHBaseCluster$1.run(LocalHBaseCluster.java:198) > at > org.apache.hadoop.hbase.LocalHBaseCluster$1.run(LocalHBaseCluster.java:195) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657) > at > org.apache.hadoop.hbase.security.User$SecureHadoopUser.runAs(User.java:313) > at > org.apache.hadoop.hbase.LocalHBaseCluster.addRegionServer(LocalHBaseCluster.java:194) > at > org.apache.hadoop.hbase.MiniHBaseCluster.init(MiniHBaseCluster.java:261) > at > org.apache.hadoop.hbase.MiniHBaseCluster.(MiniHBaseCluster.java:121) > at > org.apache.hadoop.hbase.HBaseTestingUtility.startMiniHBaseCluster(HBaseTestingUtility.java:1042) > at > org.apache.hadoop.hbase.HBaseTestingUtility.startMiniCluster(HBaseTestingUtility.java:988) > at > org.apache.hadoop.hbase.HBaseTestingUtility.startMiniCluster(HBaseTestingUtility.java:859) > at > org.apache.hadoop.hbase.HBaseTestingUtility.startMiniCluster(HBaseTestingUtility.java:853) > at > org.apache.hadoop.hbase.HBaseTestingUtility.startMiniCluster(HBaseTestingUtility.java:782) > at > org.hbase.downstreamer.TestHBaseMiniCluster.testSpinUpMiniHBaseCluster(TestHBaseMiniCluster.java:16) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47) > at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44) > at > org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > at org.junit.runners.Parent
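The condition suggested in the comment above (HBASE-20544, "if defaults are ever set to -1") can be tried in isolation. The following is only a minimal sketch under stated assumptions: `effectivePort` is a hypothetical stand-in for `conf.getInt(key, default)`, the default is passed as a parameter so that a `-1` default can be simulated, and `16010` is just an illustrative non-negative default. None of these names are HBase code.

```java
class GuardedPortCheck {
    // Hypothetical stand-in for conf.getInt(key, defaultPort):
    // returns the configured value, or defaultPort when nothing is configured (null).
    static int effectivePort(Integer configured, int defaultPort) {
        return configured != null ? configured : defaultPort;
    }

    // Shape of the condition proposed in the comment: the port is still
    // at its default AND that default is not the "disabled" value -1.
    static boolean guarded(Integer configured, int defaultPort) {
        int port = effectivePort(configured, defaultPort);
        return port != -1 && port == defaultPort;
    }

    // Same check without the "!= -1" guard.
    static boolean unguarded(Integer configured, int defaultPort) {
        return effectivePort(configured, defaultPort) == defaultPort;
    }

    public static void main(String[] args) {
        int[] defaults = {16010, -1};           // illustrative default, and a simulated -1 default
        Integer[] configured = {null, -1, 16010};
        for (int d : defaults) {
            for (Integer c : configured) {
                System.out.println("default=" + d + " configured=" + c
                    + " guarded=" + guarded(c, d) + " unguarded=" + unguarded(c, d));
            }
        }
    }
}
```

With a default other than -1 the two methods agree on every input. They differ only when the default itself is -1, where the guarded form never reports the port as "left at default".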
[jira] [Commented] (HBASE-20544) downstream HBaseTestingUtility fails with invalid port
[ https://issues.apache.org/jira/browse/HBASE-20544?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16469395#comment-16469395 ] Umesh Agashe commented on HBASE-20544: -- If it's explicitly set to -1, then (-1 == HConstants.DEFAULT_MASTER_INFOPORT) will be false, which is the same as (-1 != -1) being false. > downstream HBaseTestingUtility fails with invalid port > -- > > Key: HBASE-20544 > URL: https://issues.apache.org/jira/browse/HBASE-20544 > Project: HBase > Issue Type: Bug > Components: test >Affects Versions: 2.0.0 >Reporter: Sean Busbey >Assignee: Sean Busbey >Priority: Blocker > Fix For: 3.0.0, 2.1.0, 2.0.1 > > Attachments: HBASE-20544.0.patch, HBASE-20544.1.patch, > HBASE-20544.2.patch > > > Attempting to update hbase-downstreamer to use our 2.0.0 release fails with > an invalid port in the event that {{hbase.localcluster.assign.random.ports}} > isn't set (or is set to false, specifically): > {code} > 2018-05-08 06:10:06,508 ERROR [main] regionserver.HRegionServer > (HRegionServer.java:(631)) - Failed construction RegionServer > java.lang.IllegalArgumentException: port out of range:-1 > at java.net.InetSocketAddress.checkPort(InetSocketAddress.java:143) > at java.net.InetSocketAddress.(InetSocketAddress.java:224) > at > org.apache.hadoop.hbase.regionserver.RSRpcServices.(RSRpcServices.java:1217) > at > org.apache.hadoop.hbase.regionserver.RSRpcServices.(RSRpcServices.java:1184) > at > org.apache.hadoop.hbase.regionserver.HRegionServer.createRpcServices(HRegionServer.java:723) > at > org.apache.hadoop.hbase.regionserver.HRegionServer.(HRegionServer.java:561) > at > org.apache.hadoop.hbase.MiniHBaseCluster$MiniHBaseClusterRegionServer.(MiniHBaseCluster.java:147) > at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) > at > sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62) > at > 
sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) > at java.lang.reflect.Constructor.newInstance(Constructor.java:423) > at > org.apache.hadoop.hbase.util.JVMClusterUtil.createRegionServerThread(JVMClusterUtil.java:86) > at > org.apache.hadoop.hbase.LocalHBaseCluster.addRegionServer(LocalHBaseCluster.java:184) > at > org.apache.hadoop.hbase.LocalHBaseCluster$1.run(LocalHBaseCluster.java:198) > at > org.apache.hadoop.hbase.LocalHBaseCluster$1.run(LocalHBaseCluster.java:195) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657) > at > org.apache.hadoop.hbase.security.User$SecureHadoopUser.runAs(User.java:313) > at > org.apache.hadoop.hbase.LocalHBaseCluster.addRegionServer(LocalHBaseCluster.java:194) > at > org.apache.hadoop.hbase.MiniHBaseCluster.init(MiniHBaseCluster.java:261) > at > org.apache.hadoop.hbase.MiniHBaseCluster.(MiniHBaseCluster.java:121) > at > org.apache.hadoop.hbase.HBaseTestingUtility.startMiniHBaseCluster(HBaseTestingUtility.java:1042) > at > org.apache.hadoop.hbase.HBaseTestingUtility.startMiniCluster(HBaseTestingUtility.java:988) > at > org.apache.hadoop.hbase.HBaseTestingUtility.startMiniCluster(HBaseTestingUtility.java:859) > at > org.apache.hadoop.hbase.HBaseTestingUtility.startMiniCluster(HBaseTestingUtility.java:853) > at > org.apache.hadoop.hbase.HBaseTestingUtility.startMiniCluster(HBaseTestingUtility.java:782) > at > org.hbase.downstreamer.TestHBaseMiniCluster.testSpinUpMiniHBaseCluster(TestHBaseMiniCluster.java:16) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47) > at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44) > at > org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271) > at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70) > at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50) > at org
[jira] [Commented] (HBASE-20544) downstream HBaseTestingUtility fails with invalid port
[ https://issues.apache.org/jira/browse/HBASE-20544?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16469372#comment-16469372 ] Umesh Agashe commented on HBASE-20544: -- +1 for the latest patch. nit: {code:java} if (conf.getInt(HConstants.REGIONSERVER_INFO_PORT, 0) != -1 && conf.getInt(HConstants.REGIONSERVER_INFO_PORT, HConstants.DEFAULT_REGIONSERVER_INFOPORT) == HConstants.DEFAULT_REGIONSERVER_INFOPORT) {{code} can be effectively changed to: {code:java} if (conf.getInt(HConstants.REGIONSERVER_INFO_PORT, HConstants.DEFAULT_REGIONSERVER_INFOPORT) == HConstants.DEFAULT_REGIONSERVER_INFOPORT) {{code} and same for: {code:java} if (conf.getInt(HConstants.MASTER_INFO_PORT, 0) != -1 && conf.getInt(HConstants.MASTER_INFO_PORT, HConstants.DEFAULT_MASTER_INFOPORT) == HConstants.DEFAULT_MASTER_INFOPORT) {{code} > downstream HBaseTestingUtility fails with invalid port > -- > > Key: HBASE-20544 > URL: https://issues.apache.org/jira/browse/HBASE-20544 > Project: HBase > Issue Type: Bug > Components: test >Affects Versions: 2.0.0 >Reporter: Sean Busbey >Assignee: Sean Busbey >Priority: Blocker > Fix For: 3.0.0, 2.1.0, 2.0.1 > > Attachments: HBASE-20544.0.patch, HBASE-20544.1.patch, > HBASE-20544.2.patch > > > Attempting to update hbase-downstreamer to use our 2.0.0 release fails with > an invalid port in the event that {{hbase.localcluster.assign.random.ports}} > isn't set (or is set to false, specifically): > {code} > 2018-05-08 06:10:06,508 ERROR [main] regionserver.HRegionServer > (HRegionServer.java:(631)) - Failed construction RegionServer > java.lang.IllegalArgumentException: port out of range:-1 > at java.net.InetSocketAddress.checkPort(InetSocketAddress.java:143) > at java.net.InetSocketAddress.(InetSocketAddress.java:224) > at > org.apache.hadoop.hbase.regionserver.RSRpcServices.(RSRpcServices.java:1217) > at > org.apache.hadoop.hbase.regionserver.RSRpcServices.(RSRpcServices.java:1184) > at > 
org.apache.hadoop.hbase.regionserver.HRegionServer.createRpcServices(HRegionServer.java:723) > at > org.apache.hadoop.hbase.regionserver.HRegionServer.(HRegionServer.java:561) > at > org.apache.hadoop.hbase.MiniHBaseCluster$MiniHBaseClusterRegionServer.(MiniHBaseCluster.java:147) > at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) > at > sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62) > at > sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) > at java.lang.reflect.Constructor.newInstance(Constructor.java:423) > at > org.apache.hadoop.hbase.util.JVMClusterUtil.createRegionServerThread(JVMClusterUtil.java:86) > at > org.apache.hadoop.hbase.LocalHBaseCluster.addRegionServer(LocalHBaseCluster.java:184) > at > org.apache.hadoop.hbase.LocalHBaseCluster$1.run(LocalHBaseCluster.java:198) > at > org.apache.hadoop.hbase.LocalHBaseCluster$1.run(LocalHBaseCluster.java:195) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657) > at > org.apache.hadoop.hbase.security.User$SecureHadoopUser.runAs(User.java:313) > at > org.apache.hadoop.hbase.LocalHBaseCluster.addRegionServer(LocalHBaseCluster.java:194) > at > org.apache.hadoop.hbase.MiniHBaseCluster.init(MiniHBaseCluster.java:261) > at > org.apache.hadoop.hbase.MiniHBaseCluster.(MiniHBaseCluster.java:121) > at > org.apache.hadoop.hbase.HBaseTestingUtility.startMiniHBaseCluster(HBaseTestingUtility.java:1042) > at > org.apache.hadoop.hbase.HBaseTestingUtility.startMiniCluster(HBaseTestingUtility.java:988) > at > org.apache.hadoop.hbase.HBaseTestingUtility.startMiniCluster(HBaseTestingUtility.java:859) > at > org.apache.hadoop.hbase.HBaseTestingUtility.startMiniCluster(HBaseTestingUtility.java:853) > at > 
org.apache.hadoop.hbase.HBaseTestingUtility.startMiniCluster(HBaseTestingUtility.java:782) > at > org.hbase.downstreamer.TestHBaseMiniCluster.testSpinUpMiniHBaseCluster(TestHBaseMiniCluster.java:16) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47) > at > org.junit.internal.runner
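The nit in the review comment above rests on a small piece of boolean reasoning: if the configured info port equals the default, it cannot also be -1, so the leading `!= -1` test adds nothing. The following is a minimal sketch of that equivalence, not HBase code. It assumes a default info port of 16030 and uses a plain `Map` in place of the HBase `Configuration`; the class `InfoPortCheckSketch`, its `getInt` helper, and the constants are hypothetical stand-ins.

```java
import java.util.HashMap;
import java.util.Map;

public class InfoPortCheckSketch {
    // Assumed value; stands in for HConstants.DEFAULT_REGIONSERVER_INFOPORT.
    static final int DEFAULT_INFOPORT = 16030;
    // Assumed key; stands in for HConstants.REGIONSERVER_INFO_PORT.
    static final String INFO_PORT_KEY = "hbase.regionserver.info.port";

    // Hypothetical stand-in for Configuration.getInt(key, defaultValue).
    static int getInt(Map<String, Integer> conf, String key, int defaultValue) {
        Integer value = conf.get(key);
        return value != null ? value : defaultValue;
    }

    // The two-part condition as quoted from the patch under review.
    static boolean originalCheck(Map<String, Integer> conf) {
        return getInt(conf, INFO_PORT_KEY, 0) != -1
            && getInt(conf, INFO_PORT_KEY, DEFAULT_INFOPORT) == DEFAULT_INFOPORT;
    }

    // The single-part condition suggested in the comment.
    static boolean simplifiedCheck(Map<String, Integer> conf) {
        return getInt(conf, INFO_PORT_KEY, DEFAULT_INFOPORT) == DEFAULT_INFOPORT;
    }

    public static void main(String[] args) {
        // null means "key not set"; the other values cover disabled, zero,
        // default, and an arbitrary custom port.
        Integer[] candidates = {null, -1, 0, DEFAULT_INFOPORT, 8080};
        for (Integer port : candidates) {
            Map<String, Integer> conf = new HashMap<>();
            if (port != null) {
                conf.put(INFO_PORT_KEY, port);
            }
            System.out.println(port + ": original=" + originalCheck(conf)
                + " simplified=" + simplifiedCheck(conf));
        }
    }
}
```

Both checks agree for every candidate, because the second clause of the original condition can only hold when the value is the default, which is never -1.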
[jira] [Commented] (HBASE-20224) Web UI is broken in standalone mode
[ https://issues.apache.org/jira/browse/HBASE-20224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16467916#comment-16467916 ] Umesh Agashe commented on HBASE-20224: -- [~busbey], can you reconcile the patch for HBASE-20544 with the patch 004 here? Specifically around files: hbase-server/src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java and hbase-server/src/main/java/org/apache/hadoop/hbase/LocalHBaseCluster.java hbase-server/src/test/java/org/apache/hadoop/hbase/MiniHBaseCluster.java > Web UI is broken in standalone mode > --- > > Key: HBASE-20224 > URL: https://issues.apache.org/jira/browse/HBASE-20224 > Project: HBase > Issue Type: Bug > Components: UI, Usability >Affects Versions: 2.0.0-beta-2 >Reporter: Umesh Agashe >Assignee: Umesh Agashe >Priority: Blocker > Fix For: 2.0.0, 2.0.1 > > Attachments: > 0001-HBASE-20224-Web-UI-is-broken-in-standalone-mode-ADDE.ADDENDUM.patch, > 20224-addendum.3.txt, 20224.addendum.4, 20224.addendum.5, 20224.addendum.6, > HBASE-20224.master.004.patch, hbase-20224.master.001.patch, > hbase-20224.master.002.patch, hbase-20224.master.003.patch, > hbase-20224.master.addendum.patch > > > Web UI doesn't show up in standalone mode on default port. This can be seen > on master and branch-2. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-20224) Web UI is broken in standalone mode
[ https://issues.apache.org/jira/browse/HBASE-20224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16467891#comment-16467891 ] Umesh Agashe commented on HBASE-20224: -- I don't see patch 004 committed. Can this be re-opened for tracking? > Web UI is broken in standalone mode > --- > > Key: HBASE-20224 > URL: https://issues.apache.org/jira/browse/HBASE-20224 > Project: HBase > Issue Type: Bug > Components: UI, Usability >Affects Versions: 2.0.0-beta-2 >Reporter: Umesh Agashe >Assignee: Umesh Agashe >Priority: Blocker > Fix For: 2.0.0 > > Attachments: > 0001-HBASE-20224-Web-UI-is-broken-in-standalone-mode-ADDE.ADDENDUM.patch, > 20224-addendum.3.txt, 20224.addendum.4, 20224.addendum.5, 20224.addendum.6, > HBASE-20224.master.004.patch, hbase-20224.master.001.patch, > hbase-20224.master.002.patch, hbase-20224.master.003.patch, > hbase-20224.master.addendum.patch > > > Web UI doesn't show up in standalone mode on default port. This can be seen > on master and branch-2. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-20514) On Master restart if table is stuck in DISABLING state, CLOSED regions should not be considered stuck in-transition
[ https://issues.apache.org/jira/browse/HBASE-20514?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16460225#comment-16460225 ] Umesh Agashe commented on HBASE-20514: -- Added DISABLING to the table-state check: a region is now ignored if its table is in DISABLED or DISABLING state. > On Master restart if table is stuck in DISABLING state, CLOSED regions should > not be considered stuck in-transition > --- > > Key: HBASE-20514 > URL: https://issues.apache.org/jira/browse/HBASE-20514 > Project: HBase > Issue Type: Bug > Components: amv2 >Affects Versions: 2.0.0 >Reporter: Umesh Agashe >Assignee: Umesh Agashe >Priority: Major > Fix For: 2.0.1 > > Attachments: hbase-20514.master.001.patch > > > When master is restarted, in AssignmentManager#loadMeta(), if table is in > DISABLED state nothing is done for regions in CLOSED state. But if table is > stuck in DISABLING state then CLOSED regions are considered as stuck > in-transition. CLOSED regions of DISABLING/ DISABLED table can be handled the > same way. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
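The change described in this comment can be sketched roughly as follows. This is only an illustration of the idea, not the actual `AssignmentManager#loadMeta()` code: the class `LoadMetaSketch`, both enums, and the method `shouldIgnoreClosedRegion` are hypothetical names chosen for the example.

```java
import java.util.EnumSet;
import java.util.Set;

public class LoadMetaSketch {
    // Simplified, hypothetical state enums; the real HBase types carry more states.
    enum TableState { ENABLED, ENABLING, DISABLED, DISABLING }
    enum RegionState { OPEN, OPENING, CLOSED, CLOSING }

    // Before the fix only DISABLED would be in this set (per the issue description).
    static final Set<TableState> TABLE_STATES_TO_IGNORE =
        EnumSet.of(TableState.DISABLED, TableState.DISABLING);

    // A CLOSED region of a DISABLED or DISABLING table needs no further handling,
    // so it should not be reported as stuck in transition.
    static boolean shouldIgnoreClosedRegion(TableState table, RegionState region) {
        return region == RegionState.CLOSED && TABLE_STATES_TO_IGNORE.contains(table);
    }

    public static void main(String[] args) {
        for (TableState table : TableState.values()) {
            System.out.println(table + " + CLOSED -> ignore="
                + shouldIgnoreClosedRegion(table, RegionState.CLOSED));
        }
    }
}
```

Keeping the ignorable table states in one set means DISABLED and DISABLING are handled by the same branch, which is the behaviour the issue asks for.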
[jira] [Updated] (HBASE-20514) On Master restart if table is stuck in DISABLING state, CLOSED regions should not be considered stuck in-transition
[ https://issues.apache.org/jira/browse/HBASE-20514?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Umesh Agashe updated HBASE-20514: - Attachment: hbase-20514.master.001.patch > On Master restart if table is stuck in DISABLING state, CLOSED regions should > not be considered stuck in-transition > --- > > Key: HBASE-20514 > URL: https://issues.apache.org/jira/browse/HBASE-20514 > Project: HBase > Issue Type: Bug > Components: amv2 >Affects Versions: 2.0.0 >Reporter: Umesh Agashe >Assignee: Umesh Agashe >Priority: Major > Fix For: 2.0.1 > > Attachments: hbase-20514.master.001.patch > > > When master is restarted, in AssignmentManager#loadMeta(), if table is in > DISABLED state nothing is done for regions in CLOSED state. But if table is > stuck in DISABLING state then CLOSED regions are considered as stuck > in-transition. CLOSED regions of DISABLING/ DISABLED table can be handled the > same way. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HBASE-20514) On Master restart if table is stuck in DISABLING state, CLOSED regions should not be considered stuck in-transition
[ https://issues.apache.org/jira/browse/HBASE-20514?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Umesh Agashe updated HBASE-20514: - Status: Patch Available (was: In Progress) > On Master restart if table is stuck in DISABLING state, CLOSED regions should > not be considered stuck in-transition > --- > > Key: HBASE-20514 > URL: https://issues.apache.org/jira/browse/HBASE-20514 > Project: HBase > Issue Type: Bug > Components: amv2 >Affects Versions: 2.0.0 >Reporter: Umesh Agashe >Assignee: Umesh Agashe >Priority: Major > Fix For: 2.0.1 > > Attachments: hbase-20514.master.001.patch > > > When master is restarted, in AssignmentManager#loadMeta(), if table is in > DISABLED state nothing is done for regions in CLOSED state. But if table is > stuck in DISABLING state then CLOSED regions are considered as stuck > in-transition. CLOSED regions of DISABLING/ DISABLED table can be handled the > same way. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work started] (HBASE-20514) On Master restart if table is stuck in DISABLING state, CLOSED regions should not be considered stuck in-transition
[ https://issues.apache.org/jira/browse/HBASE-20514?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on HBASE-20514 started by Umesh Agashe. > On Master restart if table is stuck in DISABLING state, CLOSED regions should > not be considered stuck in-transition > --- > > Key: HBASE-20514 > URL: https://issues.apache.org/jira/browse/HBASE-20514 > Project: HBase > Issue Type: Bug > Components: amv2 >Affects Versions: 2.0.0 >Reporter: Umesh Agashe >Assignee: Umesh Agashe >Priority: Major > Fix For: 2.0.1 > > > When master is restarted, in AssignmentManager#loadMeta(), if table is in > DISABLED state nothing is done for regions in CLOSED state. But if table is > stuck in DISABLING state then CLOSED regions are considered as stuck > in-transition. CLOSED regions of DISABLING/ DISABLED table can be handled the > same way. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HBASE-20514) On Master restart if table is stuck in DISABLING state, CLOSED regions should not be considered stuck in-transition
Umesh Agashe created HBASE-20514: Summary: On Master restart if table is stuck in DISABLING state, CLOSED regions should not be considered stuck in-transition Key: HBASE-20514 URL: https://issues.apache.org/jira/browse/HBASE-20514 Project: HBase Issue Type: Bug Components: amv2 Affects Versions: 2.0.0 Reporter: Umesh Agashe Assignee: Umesh Agashe Fix For: 2.0.1 When master is restarted, in AssignmentManager#loadMeta(), if table is in DISABLED state nothing is done for regions in CLOSED state. But if table is stuck in DISABLING state then CLOSED regions are considered as stuck in-transition. CLOSED regions of DISABLING/ DISABLED table can be handled the same way. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-20492) UnassignProcedure is stuck in retry loop on region stuck in OPENING state
[ https://issues.apache.org/jira/browse/HBASE-20492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16459351#comment-16459351 ] Umesh Agashe commented on HBASE-20492: -- +1, after fixing new checkstyle errors. > UnassignProcedure is stuck in retry loop on region stuck in OPENING state > - > > Key: HBASE-20492 > URL: https://issues.apache.org/jira/browse/HBASE-20492 > Project: HBase > Issue Type: Bug > Components: amv2 >Affects Versions: 2.0.0 >Reporter: Umesh Agashe >Assignee: stack >Priority: Critical > Fix For: 2.0.1 > > Attachments: HBASE-20492.branch-2.0.001.patch, > HBASE-20492.branch-2.0.002.patch, HBASE-20492.branch-2.0.003.patch > > > UnassignProcedure gets stuck in a retry loop for a region stuck in OPENING > state. From logs: > {code:java} > 2018-04-25 15:59:53,825 WARN > org.apache.hadoop.hbase.master.assignment.RegionTransitionProcedure: > Retryable error trying to transition: pid=142564, > state=RUNNABLE:REGION_TRANSITION_DISPATCH; UnassignProcedure > table=IntegrationTestBigLinkedList_20180331004141, > region=bd2fb2c7d39236c9b9085f350358df7c, > server=vb1122.halxg.cloudera.com,22101,1522626198450; rit=OPENING, > location=vb1122.halxg.cloudera.com,22101,1522626198450 > org.apache.hadoop.hbase.exceptions.UnexpectedStateException: Expected > [SPLITTING, SPLIT, MERGING, OPEN, CLOSING] so could move to CLOSING but > current state=OPENING > at > org.apache.hadoop.hbase.master.assignment.RegionStates$RegionStateNode.transitionState(RegionStates.java:158) > at > org.apache.hadoop.hbase.master.assignment.AssignmentManager.markRegionAsClosing(AssignmentManager.java:1514) > at > org.apache.hadoop.hbase.master.assignment.UnassignProcedure.updateTransition(UnassignProcedure.java:179) > at > org.apache.hadoop.hbase.master.assignment.RegionTransitionProcedure.execute(RegionTransitionProcedure.java:309) > at > org.apache.hadoop.hbase.master.assignment.RegionTransitionProcedure.execute(RegionTransitionProcedure.java:85) > at 
org.apache.hadoop.hbase.procedure2.Procedure.doExecute(Procedure.java:845) > at > org.apache.hadoop.hbase.procedure2.ProcedureExecutor.execProcedure(ProcedureExecutor.java:1458) > at > org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeProcedure(ProcedureExecutor.java:1227) > at > org.apache.hadoop.hbase.procedure2.ProcedureExecutor.access$800(ProcedureExecutor.java:78) > at > org.apache.hadoop.hbase.procedure2.ProcedureExecutor$WorkerThread.run(ProcedureExecutor.java:1738) > 2018-04-25 15:59:53,892 WARN > org.apache.hadoop.hbase.master.assignment.RegionTransitionProcedure: > Retryable error trying to transition: pid=142564, > state=RUNNABLE:REGION_TRANSITION_DISPATCH; UnassignProcedure > table=IntegrationTestBigLinkedList_20180331004141, > region=bd2fb2c7d39236c9b9085f350358df7c, > server=vb1122.halxg.cloudera.com,22101,1522626198450; rit=OPENING, > location=vb1122.halxg.cloudera.com,22101,1522626198450 > org.apache.hadoop.hbase.exceptions.UnexpectedStateException: Expected > [SPLITTING, SPLIT, MERGING, OPEN, CLOSING] so could move to CLOSING but > current state=OPENING > at > org.apache.hadoop.hbase.master.assignment.RegionStates$RegionStateNode.transitionState(RegionStates.java:158) > at > org.apache.hadoop.hbase.master.assignment.AssignmentManager.markRegionAsClosing(AssignmentManager.java:1514) > at > org.apache.hadoop.hbase.master.assignment.UnassignProcedure.updateTransition(UnassignProcedure.java:179) > at > org.apache.hadoop.hbase.master.assignment.RegionTransitionProcedure.execute(RegionTransitionProcedure.java:309) > at > org.apache.hadoop.hbase.master.assignment.RegionTransitionProcedure.execute(RegionTransitionProcedure.java:85) > at org.apache.hadoop.hbase.procedure2.Procedure.doExecute(Procedure.java:845) > at > org.apache.hadoop.hbase.procedure2.ProcedureExecutor.execProcedure(ProcedureExecutor.java:1458) > at > org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeProcedure(ProcedureExecutor.java:1227) > at > 
org.apache.hadoop.hbase.procedure2.ProcedureExecutor.access$800(ProcedureExecutor.java:78) > at > org.apache.hadoop.hbase.procedure2.ProcedureExecutor$WorkerThread.run(ProcedureExecutor.java:1738){code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
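The log above suggests why the procedure loops: the move to CLOSING is accepted only from a fixed set of states, OPENING is not in that set, and the failure is treated as retryable. The sketch below illustrates that pattern under stated assumptions. It is not the real `RegionStateNode.transitionState` implementation: the class `TransitionSketch`, its `State` enum, the local `UnexpectedStateException`, and the bounded `retryUnassign` helper are all hypothetical.

```java
import java.util.EnumSet;
import java.util.Set;

public class TransitionSketch {
    // Simplified, hypothetical region states.
    enum State { OFFLINE, OPENING, OPEN, CLOSING, CLOSED, SPLITTING, SPLIT, MERGING }

    // Local stand-in for the exception class named in the log.
    static class UnexpectedStateException extends Exception {
        UnexpectedStateException(String message) {
            super(message);
        }
    }

    // States from which a move to CLOSING is accepted, copied from the log message.
    static final Set<State> EXPECTED_BEFORE_CLOSING = EnumSet.of(
        State.SPLITTING, State.SPLIT, State.MERGING, State.OPEN, State.CLOSING);

    // Returns the new state, or throws if the current state is not an expected one.
    static State transition(State current, State target, Set<State> expected)
            throws UnexpectedStateException {
        if (!expected.contains(current)) {
            throw new UnexpectedStateException("Expected " + expected
                + " so could move to " + target + " but current state=" + current);
        }
        return target;
    }

    // Retries the transition up to maxAttempts times and returns the failure count.
    // The bound exists only so the sketch terminates.
    static int retryUnassign(State current, int maxAttempts) {
        int failures = 0;
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            try {
                transition(current, State.CLOSING, EXPECTED_BEFORE_CLOSING);
                return failures;
            } catch (UnexpectedStateException e) {
                // Nothing here changes `current`, so the next attempt fails the same way.
                failures++;
            }
        }
        return failures;
    }

    public static void main(String[] args) {
        System.out.println("OPEN failures: " + retryUnassign(State.OPEN, 5));
        System.out.println("OPENING failures: " + retryUnassign(State.OPENING, 5));
    }
}
```

If nothing else moves the region out of OPENING, every retry in this sketch hits the same exception, which would match the repeated, identical warnings in the log.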
[jira] [Commented] (HBASE-19121) HBCK for AMv2 (A.K.A HBCK2)
[ https://issues.apache.org/jira/browse/HBASE-19121?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16456901#comment-16456901 ] Umesh Agashe commented on HBASE-19121: -- For region states: {code:java} scan 'hbase:meta', { ROWPREFIXFILTER => 't1,', COLUMNS => 'info:state'}{code} > HBCK for AMv2 (A.K.A HBCK2) > --- > > Key: HBASE-19121 > URL: https://issues.apache.org/jira/browse/HBASE-19121 > Project: HBase > Issue Type: Bug > Components: hbck >Reporter: stack >Priority: Major > > We don't have an hbck for the new AM. Old hbck may actually do damage going > against AMv2. > Fix. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-20492) UnassignProcedure is stuck in retry loop on region stuck in OPENING state
[ https://issues.apache.org/jira/browse/HBASE-20492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16453246#comment-16453246 ] Umesh Agashe commented on HBASE-20492: -- The hung procedure cannot be aborted, and restarting the master doesn't help. > UnassignProcedure is stuck in retry loop on region stuck in OPENING state > - > > Key: HBASE-20492 > URL: https://issues.apache.org/jira/browse/HBASE-20492 > Project: HBase > Issue Type: Improvement >Affects Versions: 2.0.0 >Reporter: Umesh Agashe >Priority: Major > Fix For: 2.0.1 > > > UnassignProcedure gets stuck in a retry loop for a region stuck in OPENING > state. From logs: > {code:java} > 2018-04-25 15:59:53,825 WARN > org.apache.hadoop.hbase.master.assignment.RegionTransitionProcedure: > Retryable error trying to transition: pid=142564, > state=RUNNABLE:REGION_TRANSITION_DISPATCH; UnassignProcedure > table=IntegrationTestBigLinkedList_20180331004141, > region=bd2fb2c7d39236c9b9085f350358df7c, > server=vb1122.halxg.cloudera.com,22101,1522626198450; rit=OPENING, > location=vb1122.halxg.cloudera.com,22101,1522626198450 > org.apache.hadoop.hbase.exceptions.UnexpectedStateException: Expected > [SPLITTING, SPLIT, MERGING, OPEN, CLOSING] so could move to CLOSING but > current state=OPENING > at > org.apache.hadoop.hbase.master.assignment.RegionStates$RegionStateNode.transitionState(RegionStates.java:158) > at > org.apache.hadoop.hbase.master.assignment.AssignmentManager.markRegionAsClosing(AssignmentManager.java:1514) > at > org.apache.hadoop.hbase.master.assignment.UnassignProcedure.updateTransition(UnassignProcedure.java:179) > at > org.apache.hadoop.hbase.master.assignment.RegionTransitionProcedure.execute(RegionTransitionProcedure.java:309) > at > org.apache.hadoop.hbase.master.assignment.RegionTransitionProcedure.execute(RegionTransitionProcedure.java:85) > at org.apache.hadoop.hbase.procedure2.Procedure.doExecute(Procedure.java:845) > at > 
org.apache.hadoop.hbase.procedure2.ProcedureExecutor.execProcedure(ProcedureExecutor.java:1458) > at > org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeProcedure(ProcedureExecutor.java:1227) > at > org.apache.hadoop.hbase.procedure2.ProcedureExecutor.access$800(ProcedureExecutor.java:78) > at > org.apache.hadoop.hbase.procedure2.ProcedureExecutor$WorkerThread.run(ProcedureExecutor.java:1738) > 2018-04-25 15:59:53,892 WARN > org.apache.hadoop.hbase.master.assignment.RegionTransitionProcedure: > Retryable error trying to transition: pid=142564, > state=RUNNABLE:REGION_TRANSITION_DISPATCH; UnassignProcedure > table=IntegrationTestBigLinkedList_20180331004141, > region=bd2fb2c7d39236c9b9085f350358df7c, > server=vb1122.halxg.cloudera.com,22101,1522626198450; rit=OPENING, > location=vb1122.halxg.cloudera.com,22101,1522626198450 > org.apache.hadoop.hbase.exceptions.UnexpectedStateException: Expected > [SPLITTING, SPLIT, MERGING, OPEN, CLOSING] so could move to CLOSING but > current state=OPENING > at > org.apache.hadoop.hbase.master.assignment.RegionStates$RegionStateNode.transitionState(RegionStates.java:158) > at > org.apache.hadoop.hbase.master.assignment.AssignmentManager.markRegionAsClosing(AssignmentManager.java:1514) > at > org.apache.hadoop.hbase.master.assignment.UnassignProcedure.updateTransition(UnassignProcedure.java:179) > at > org.apache.hadoop.hbase.master.assignment.RegionTransitionProcedure.execute(RegionTransitionProcedure.java:309) > at > org.apache.hadoop.hbase.master.assignment.RegionTransitionProcedure.execute(RegionTransitionProcedure.java:85) > at org.apache.hadoop.hbase.procedure2.Procedure.doExecute(Procedure.java:845) > at > org.apache.hadoop.hbase.procedure2.ProcedureExecutor.execProcedure(ProcedureExecutor.java:1458) > at > org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeProcedure(ProcedureExecutor.java:1227) > at > org.apache.hadoop.hbase.procedure2.ProcedureExecutor.access$800(ProcedureExecutor.java:78) > at > 
org.apache.hadoop.hbase.procedure2.ProcedureExecutor$WorkerThread.run(ProcedureExecutor.java:1738){code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HBASE-20492) UnassignProcedure is stuck in retry loop on region stuck in OPENING state
[ https://issues.apache.org/jira/browse/HBASE-20492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Umesh Agashe updated HBASE-20492: - Fix Version/s: 2.0.1 > UnassignProcedure is stuck in retry loop on region stuck in OPENING state > - > > Key: HBASE-20492 > URL: https://issues.apache.org/jira/browse/HBASE-20492 > Project: HBase > Issue Type: Improvement >Affects Versions: 2.0.0 >Reporter: Umesh Agashe >Priority: Major > Fix For: 2.0.1 > > > UnassignProcedure gets stuck in a retry loop for a region stuck in OPENING > state. From logs: > {code:java} > 2018-04-25 15:59:53,825 WARN > org.apache.hadoop.hbase.master.assignment.RegionTransitionProcedure: > Retryable error trying to transition: pid=142564, > state=RUNNABLE:REGION_TRANSITION_DISPATCH; UnassignProcedure > table=IntegrationTestBigLinkedList_20180331004141, > region=bd2fb2c7d39236c9b9085f350358df7c, > server=vb1122.halxg.cloudera.com,22101,1522626198450; rit=OPENING, > location=vb1122.halxg.cloudera.com,22101,1522626198450 > org.apache.hadoop.hbase.exceptions.UnexpectedStateException: Expected > [SPLITTING, SPLIT, MERGING, OPEN, CLOSING] so could move to CLOSING but > current state=OPENING > at > org.apache.hadoop.hbase.master.assignment.RegionStates$RegionStateNode.transitionState(RegionStates.java:158) > at > org.apache.hadoop.hbase.master.assignment.AssignmentManager.markRegionAsClosing(AssignmentManager.java:1514) > at > org.apache.hadoop.hbase.master.assignment.UnassignProcedure.updateTransition(UnassignProcedure.java:179) > at > org.apache.hadoop.hbase.master.assignment.RegionTransitionProcedure.execute(RegionTransitionProcedure.java:309) > at > org.apache.hadoop.hbase.master.assignment.RegionTransitionProcedure.execute(RegionTransitionProcedure.java:85) > at org.apache.hadoop.hbase.procedure2.Procedure.doExecute(Procedure.java:845) > at > org.apache.hadoop.hbase.procedure2.ProcedureExecutor.execProcedure(ProcedureExecutor.java:1458) > at > 
org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeProcedure(ProcedureExecutor.java:1227) > at > org.apache.hadoop.hbase.procedure2.ProcedureExecutor.access$800(ProcedureExecutor.java:78) > at > org.apache.hadoop.hbase.procedure2.ProcedureExecutor$WorkerThread.run(ProcedureExecutor.java:1738) > 2018-04-25 15:59:53,892 WARN > org.apache.hadoop.hbase.master.assignment.RegionTransitionProcedure: > Retryable error trying to transition: pid=142564, > state=RUNNABLE:REGION_TRANSITION_DISPATCH; UnassignProcedure > table=IntegrationTestBigLinkedList_20180331004141, > region=bd2fb2c7d39236c9b9085f350358df7c, > server=vb1122.halxg.cloudera.com,22101,1522626198450; rit=OPENING, > location=vb1122.halxg.cloudera.com,22101,1522626198450 > org.apache.hadoop.hbase.exceptions.UnexpectedStateException: Expected > [SPLITTING, SPLIT, MERGING, OPEN, CLOSING] so could move to CLOSING but > current state=OPENING > at > org.apache.hadoop.hbase.master.assignment.RegionStates$RegionStateNode.transitionState(RegionStates.java:158) > at > org.apache.hadoop.hbase.master.assignment.AssignmentManager.markRegionAsClosing(AssignmentManager.java:1514) > at > org.apache.hadoop.hbase.master.assignment.UnassignProcedure.updateTransition(UnassignProcedure.java:179) > at > org.apache.hadoop.hbase.master.assignment.RegionTransitionProcedure.execute(RegionTransitionProcedure.java:309) > at > org.apache.hadoop.hbase.master.assignment.RegionTransitionProcedure.execute(RegionTransitionProcedure.java:85) > at org.apache.hadoop.hbase.procedure2.Procedure.doExecute(Procedure.java:845) > at > org.apache.hadoop.hbase.procedure2.ProcedureExecutor.execProcedure(ProcedureExecutor.java:1458) > at > org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeProcedure(ProcedureExecutor.java:1227) > at > org.apache.hadoop.hbase.procedure2.ProcedureExecutor.access$800(ProcedureExecutor.java:78) > at > org.apache.hadoop.hbase.procedure2.ProcedureExecutor$WorkerThread.run(ProcedureExecutor.java:1738){code} -- 
[jira] [Updated] (HBASE-20492) UnassignProcedure is stuck in retry loop on region stuck in OPENING state
[ https://issues.apache.org/jira/browse/HBASE-20492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Umesh Agashe updated HBASE-20492: - Summary: UnassignProcedure is stuck in retry loop on region stuck in OPENING state (was: UnassignProcedure is stuck in retry loop on region with state OPENING) > UnassignProcedure is stuck in retry loop on region stuck in OPENING state > - > > Key: HBASE-20492 > URL: https://issues.apache.org/jira/browse/HBASE-20492 > Project: HBase > Issue Type: Improvement >Affects Versions: 2.0.0 >Reporter: Umesh Agashe >Priority: Major > Fix For: 2.0.1 > > > UnassignProcedure gets stuck in a retry loop for a region stuck in OPENING > state. From logs: > {code:java} > 2018-04-25 15:59:53,825 WARN > org.apache.hadoop.hbase.master.assignment.RegionTransitionProcedure: > Retryable error trying to transition: pid=142564, > state=RUNNABLE:REGION_TRANSITION_DISPATCH; UnassignProcedure > table=IntegrationTestBigLinkedList_20180331004141, > region=bd2fb2c7d39236c9b9085f350358df7c, > server=vb1122.halxg.cloudera.com,22101,1522626198450; rit=OPENING, > location=vb1122.halxg.cloudera.com,22101,1522626198450 > org.apache.hadoop.hbase.exceptions.UnexpectedStateException: Expected > [SPLITTING, SPLIT, MERGING, OPEN, CLOSING] so could move to CLOSING but > current state=OPENING > at > org.apache.hadoop.hbase.master.assignment.RegionStates$RegionStateNode.transitionState(RegionStates.java:158) > at > org.apache.hadoop.hbase.master.assignment.AssignmentManager.markRegionAsClosing(AssignmentManager.java:1514) > at > org.apache.hadoop.hbase.master.assignment.UnassignProcedure.updateTransition(UnassignProcedure.java:179) > at > org.apache.hadoop.hbase.master.assignment.RegionTransitionProcedure.execute(RegionTransitionProcedure.java:309) > at > org.apache.hadoop.hbase.master.assignment.RegionTransitionProcedure.execute(RegionTransitionProcedure.java:85) > at org.apache.hadoop.hbase.procedure2.Procedure.doExecute(Procedure.java:845) > at > 
org.apache.hadoop.hbase.procedure2.ProcedureExecutor.execProcedure(ProcedureExecutor.java:1458) > at > org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeProcedure(ProcedureExecutor.java:1227) > at > org.apache.hadoop.hbase.procedure2.ProcedureExecutor.access$800(ProcedureExecutor.java:78) > at > org.apache.hadoop.hbase.procedure2.ProcedureExecutor$WorkerThread.run(ProcedureExecutor.java:1738) > 2018-04-25 15:59:53,892 WARN > org.apache.hadoop.hbase.master.assignment.RegionTransitionProcedure: > Retryable error trying to transition: pid=142564, > state=RUNNABLE:REGION_TRANSITION_DISPATCH; UnassignProcedure > table=IntegrationTestBigLinkedList_20180331004141, > region=bd2fb2c7d39236c9b9085f350358df7c, > server=vb1122.halxg.cloudera.com,22101,1522626198450; rit=OPENING, > location=vb1122.halxg.cloudera.com,22101,1522626198450 > org.apache.hadoop.hbase.exceptions.UnexpectedStateException: Expected > [SPLITTING, SPLIT, MERGING, OPEN, CLOSING] so could move to CLOSING but > current state=OPENING > at > org.apache.hadoop.hbase.master.assignment.RegionStates$RegionStateNode.transitionState(RegionStates.java:158) > at > org.apache.hadoop.hbase.master.assignment.AssignmentManager.markRegionAsClosing(AssignmentManager.java:1514) > at > org.apache.hadoop.hbase.master.assignment.UnassignProcedure.updateTransition(UnassignProcedure.java:179) > at > org.apache.hadoop.hbase.master.assignment.RegionTransitionProcedure.execute(RegionTransitionProcedure.java:309) > at > org.apache.hadoop.hbase.master.assignment.RegionTransitionProcedure.execute(RegionTransitionProcedure.java:85) > at org.apache.hadoop.hbase.procedure2.Procedure.doExecute(Procedure.java:845) > at > org.apache.hadoop.hbase.procedure2.ProcedureExecutor.execProcedure(ProcedureExecutor.java:1458) > at > org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeProcedure(ProcedureExecutor.java:1227) > at > org.apache.hadoop.hbase.procedure2.ProcedureExecutor.access$800(ProcedureExecutor.java:78) > at > 
org.apache.hadoop.hbase.procedure2.ProcedureExecutor$WorkerThread.run(ProcedureExecutor.java:1738){code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HBASE-20492) UnassignProcedure is stuck in retry loop on region with state OPENING
Umesh Agashe created HBASE-20492: Summary: UnassignProcedure is stuck in retry loop on region with state OPENING Key: HBASE-20492 URL: https://issues.apache.org/jira/browse/HBASE-20492 Project: HBase Issue Type: Improvement Affects Versions: 2.0.0 Reporter: Umesh Agashe UnassignProcedure gets stuck in a retry loop for a region stuck in OPENING state. From logs: {code:java} 2018-04-25 15:59:53,825 WARN org.apache.hadoop.hbase.master.assignment.RegionTransitionProcedure: Retryable error trying to transition: pid=142564, state=RUNNABLE:REGION_TRANSITION_DISPATCH; UnassignProcedure table=IntegrationTestBigLinkedList_20180331004141, region=bd2fb2c7d39236c9b9085f350358df7c, server=vb1122.halxg.cloudera.com,22101,1522626198450; rit=OPENING, location=vb1122.halxg.cloudera.com,22101,1522626198450 org.apache.hadoop.hbase.exceptions.UnexpectedStateException: Expected [SPLITTING, SPLIT, MERGING, OPEN, CLOSING] so could move to CLOSING but current state=OPENING at org.apache.hadoop.hbase.master.assignment.RegionStates$RegionStateNode.transitionState(RegionStates.java:158) at org.apache.hadoop.hbase.master.assignment.AssignmentManager.markRegionAsClosing(AssignmentManager.java:1514) at org.apache.hadoop.hbase.master.assignment.UnassignProcedure.updateTransition(UnassignProcedure.java:179) at org.apache.hadoop.hbase.master.assignment.RegionTransitionProcedure.execute(RegionTransitionProcedure.java:309) at org.apache.hadoop.hbase.master.assignment.RegionTransitionProcedure.execute(RegionTransitionProcedure.java:85) at org.apache.hadoop.hbase.procedure2.Procedure.doExecute(Procedure.java:845) at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.execProcedure(ProcedureExecutor.java:1458) at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeProcedure(ProcedureExecutor.java:1227) at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.access$800(ProcedureExecutor.java:78) at org.apache.hadoop.hbase.procedure2.ProcedureExecutor$WorkerThread.run(ProcedureExecutor.java:1738) 
2018-04-25 15:59:53,892 WARN org.apache.hadoop.hbase.master.assignment.RegionTransitionProcedure: Retryable error trying to transition: pid=142564, state=RUNNABLE:REGION_TRANSITION_DISPATCH; UnassignProcedure table=IntegrationTestBigLinkedList_20180331004141, region=bd2fb2c7d39236c9b9085f350358df7c, server=vb1122.halxg.cloudera.com,22101,1522626198450; rit=OPENING, location=vb1122.halxg.cloudera.com,22101,1522626198450 org.apache.hadoop.hbase.exceptions.UnexpectedStateException: Expected [SPLITTING, SPLIT, MERGING, OPEN, CLOSING] so could move to CLOSING but current state=OPENING at org.apache.hadoop.hbase.master.assignment.RegionStates$RegionStateNode.transitionState(RegionStates.java:158) at org.apache.hadoop.hbase.master.assignment.AssignmentManager.markRegionAsClosing(AssignmentManager.java:1514) at org.apache.hadoop.hbase.master.assignment.UnassignProcedure.updateTransition(UnassignProcedure.java:179) at org.apache.hadoop.hbase.master.assignment.RegionTransitionProcedure.execute(RegionTransitionProcedure.java:309) at org.apache.hadoop.hbase.master.assignment.RegionTransitionProcedure.execute(RegionTransitionProcedure.java:85) at org.apache.hadoop.hbase.procedure2.Procedure.doExecute(Procedure.java:845) at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.execProcedure(ProcedureExecutor.java:1458) at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeProcedure(ProcedureExecutor.java:1227) at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.access$800(ProcedureExecutor.java:78) at org.apache.hadoop.hbase.procedure2.ProcedureExecutor$WorkerThread.run(ProcedureExecutor.java:1738){code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-20492) UnassignProcedure is stuck in retry loop on region with state OPENING
[ https://issues.apache.org/jira/browse/HBASE-20492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16453216#comment-16453216 ]

Umesh Agashe commented on HBASE-20492:

Logs get filled up with above log messages.

> UnassignProcedure is stuck in retry loop on region with state OPENING
>
> Key: HBASE-20492
> URL: https://issues.apache.org/jira/browse/HBASE-20492
> Project: HBase
> Issue Type: Improvement
> Affects Versions: 2.0.0
> Reporter: Umesh Agashe
> Priority: Major
>
> UnassignProcedure gets stuck in a retry loop for a region stuck in OPENING state. From logs:
> {code:java}
> 2018-04-25 15:59:53,825 WARN org.apache.hadoop.hbase.master.assignment.RegionTransitionProcedure: Retryable error trying to transition: pid=142564, state=RUNNABLE:REGION_TRANSITION_DISPATCH; UnassignProcedure table=IntegrationTestBigLinkedList_20180331004141, region=bd2fb2c7d39236c9b9085f350358df7c, server=vb1122.halxg.cloudera.com,22101,1522626198450; rit=OPENING, location=vb1122.halxg.cloudera.com,22101,1522626198450
> org.apache.hadoop.hbase.exceptions.UnexpectedStateException: Expected [SPLITTING, SPLIT, MERGING, OPEN, CLOSING] so could move to CLOSING but current state=OPENING
> 	at org.apache.hadoop.hbase.master.assignment.RegionStates$RegionStateNode.transitionState(RegionStates.java:158)
> 	at org.apache.hadoop.hbase.master.assignment.AssignmentManager.markRegionAsClosing(AssignmentManager.java:1514)
> 	at org.apache.hadoop.hbase.master.assignment.UnassignProcedure.updateTransition(UnassignProcedure.java:179)
> 	at org.apache.hadoop.hbase.master.assignment.RegionTransitionProcedure.execute(RegionTransitionProcedure.java:309)
> 	at org.apache.hadoop.hbase.master.assignment.RegionTransitionProcedure.execute(RegionTransitionProcedure.java:85)
> 	at org.apache.hadoop.hbase.procedure2.Procedure.doExecute(Procedure.java:845)
> 	at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.execProcedure(ProcedureExecutor.java:1458)
> 	at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeProcedure(ProcedureExecutor.java:1227)
> 	at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.access$800(ProcedureExecutor.java:78)
> 	at org.apache.hadoop.hbase.procedure2.ProcedureExecutor$WorkerThread.run(ProcedureExecutor.java:1738)
> 2018-04-25 15:59:53,892 WARN org.apache.hadoop.hbase.master.assignment.RegionTransitionProcedure: Retryable error trying to transition: pid=142564, state=RUNNABLE:REGION_TRANSITION_DISPATCH; UnassignProcedure table=IntegrationTestBigLinkedList_20180331004141, region=bd2fb2c7d39236c9b9085f350358df7c, server=vb1122.halxg.cloudera.com,22101,1522626198450; rit=OPENING, location=vb1122.halxg.cloudera.com,22101,1522626198450
> org.apache.hadoop.hbase.exceptions.UnexpectedStateException: Expected [SPLITTING, SPLIT, MERGING, OPEN, CLOSING] so could move to CLOSING but current state=OPENING
> 	at org.apache.hadoop.hbase.master.assignment.RegionStates$RegionStateNode.transitionState(RegionStates.java:158)
> 	at org.apache.hadoop.hbase.master.assignment.AssignmentManager.markRegionAsClosing(AssignmentManager.java:1514)
> 	at org.apache.hadoop.hbase.master.assignment.UnassignProcedure.updateTransition(UnassignProcedure.java:179)
> 	at org.apache.hadoop.hbase.master.assignment.RegionTransitionProcedure.execute(RegionTransitionProcedure.java:309)
> 	at org.apache.hadoop.hbase.master.assignment.RegionTransitionProcedure.execute(RegionTransitionProcedure.java:85)
> 	at org.apache.hadoop.hbase.procedure2.Procedure.doExecute(Procedure.java:845)
> 	at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.execProcedure(ProcedureExecutor.java:1458)
> 	at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeProcedure(ProcedureExecutor.java:1227)
> 	at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.access$800(ProcedureExecutor.java:78)
> 	at org.apache.hadoop.hbase.procedure2.ProcedureExecutor$WorkerThread.run(ProcedureExecutor.java:1738)
> {code}