[jira] [Updated] (HBASE-22408) add a metric for regions OPEN on non-live servers
[ https://issues.apache.org/jira/browse/HBASE-22408?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HBASE-22408: - Resolution: Fixed Fix Version/s: 2.3.0 3.0.0 Status: Resolved (was: Patch Available) Committed to master and branch-2. Thanks [~Apache9] for the review. > add a metric for regions OPEN on non-live servers > - > > Key: HBASE-22408 > URL: https://issues.apache.org/jira/browse/HBASE-22408 > Project: HBase > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin >Priority: Major > Fix For: 3.0.0, 2.3.0 > > > This serves 2 purposes for monitoring: > 1) Catching when regions are on dead servers due to long WAL splitting or > other delays in SCP. At that time, the regions are not listed as RITs; we'd > like to be able to have alerts in such cases. > 2) Catching various bugs in assignment and procWAL corruption, etc. that > leave region "OPEN" on a server that no longer exists, again to alert the > administrator via a metric. > Later, it might be possible to add more logic to distinguish 1 and 2, and to > mitigate 2 automatically and also set some metric to alert the administrator > to investigate later. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
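The metric described above can be sketched as a simple count over the master's view of region assignments. This is an illustrative model only, with hypothetical names; it is not the actual AssignmentManager code from the HBASE-22408 patch.

```java
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: count regions whose state is OPEN but whose
// assigned server is not in the live-server set. Such regions are not
// RITs, so a separate metric is needed to alert on them.
public class OpenOnDeadServersMetric {
    public static int countOpenRegionsOnNonLiveServers(
            Map<String, String> openRegionToServer, // region name -> server name
            Set<String> liveServers) {
        int count = 0;
        for (Map.Entry<String, String> e : openRegionToServer.entrySet()) {
            if (!liveServers.contains(e.getValue())) {
                count++; // OPEN on a server that is dead or unknown
            }
        }
        return count;
    }
}
```

A monitoring system would alert when this count stays above zero for longer than normal SCP/WAL-splitting delays.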
[jira] [Commented] (HBASE-22289) WAL-based log splitting resubmit threshold may result in a task being stuck forever
[ https://issues.apache.org/jira/browse/HBASE-22289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16844157#comment-16844157 ] Sergey Shelukhin commented on HBASE-22289: -- Thanks for taking it over the finish line! It shouldn't affect 2.2 and later versions, at least in this form, because this code has been replaced by procedures. They might have a similar bug but it would require a different fix. > WAL-based log splitting resubmit threshold may result in a task being stuck > forever > --- > > Key: HBASE-22289 > URL: https://issues.apache.org/jira/browse/HBASE-22289 > Project: HBase > Issue Type: Bug >Affects Versions: 2.1.0, 1.5.0 >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin >Priority: Major > Fix For: 2.0.6, 2.1.5 > > Attachments: HBASE-22289.01-branch-2.1.patch, > HBASE-22289.02-branch-2.1.patch, HBASE-22289.03-branch-2.1.patch, > HBASE-22289.branch-2.1.001.patch, HBASE-22289.branch-2.1.001.patch, > HBASE-22289.branch-2.1.001.patch > > > Not sure if this is handled better in procedure based WAL splitting; in any > case it affects versions before that. > The problem is not in ZK as such but in internal state tracking in master, it > seems. 
> Master: > {noformat} > 2019-04-21 01:49:49,584 INFO > [master/:17000.splitLogManager..Chore.1] > coordination.SplitLogManagerCoordination: Resubmitting task > .1555831286638 > {noformat} > worker-rs, split fails > {noformat} > > 2019-04-21 02:05:31,774 INFO > [RS_LOG_REPLAY_OPS-regionserver/:17020-1] wal.WALSplitter: > Processed 24 edits across 2 regions; edits skipped=457; log > file=.1555831286638, length=2156363702, corrupted=false, progress > failed=true > {noformat} > Master (not sure about the delay of the acquired-message; at any rate it > seems to detect the failure fine from this server) > {noformat} > 2019-04-21 02:11:14,928 INFO [main-EventThread] > coordination.SplitLogManagerCoordination: Task .1555831286638 acquired > by ,17020,139815097 > 2019-04-21 02:19:41,264 INFO > [master/:17000.splitLogManager..Chore.1] > coordination.SplitLogManagerCoordination: Skipping resubmissions of task > .1555831286638 because threshold 3 reached > {noformat} > After that this task is stuck in the limbo forever with the old worker, and > never resubmitted. > RS never logs anything else for this task. > Killing the RS on the worker unblocked the task and some other server did the > split very quickly, so seems like master doesn't clear the worker name in its > internal state when hitting the threshold... master never restarted so > restarting the master might have also cleared it. > This is extracted from splitlogmanager log messages, note the times. > {noformat} > 2019-04-21 02:2 1555831286638=last_update = 1555837874928 last_version = 11 > cur_worker_name = ,17020,139815097 status = in_progress > incarnation = 3 resubmits = 3 batch = installed = 24 done = 3 error = 20, > > 2019-04-22 11:1 1555831286638=last_update = 1555837874928 last_version = 11 > cur_worker_name = ,17020,139815097 status = in_progress > incarnation = 3 resubmits = 3 batch = installed = 24 done = 3 error = 20} > {noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
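The stuck-task behavior in the logs above can be modeled in a few lines. This is a hedged, illustrative reconstruction of the bug, not the real SplitLogManagerCoordination code: once the resubmit threshold is hit, the task is skipped but its worker name is never cleared, so the task stays bound to a worker that will never finish it; the sketched fix clears the worker when that worker dies.

```java
// Illustrative model of the resubmit-threshold bug; all names hypothetical.
public class SplitTask {
    static final int RESUBMIT_THRESHOLD = 3;
    String curWorker;
    int resubmits;

    // Buggy behavior: past the threshold the task is skipped
    // ("Skipping resubmissions ... threshold 3 reached") and curWorker
    // is left set, so no other worker can ever acquire the task.
    boolean tryResubmit() {
        if (resubmits >= RESUBMIT_THRESHOLD) {
            return false; // task stuck with the old worker forever
        }
        resubmits++;
        curWorker = null; // released back to the pool for reassignment
        return true;
    }

    // Sketch of a fix: when the owning worker dies, clear the binding even
    // if the threshold was reached, matching what killing the RS did above.
    void onWorkerDeath(String deadWorker) {
        if (deadWorker.equals(curWorker)) {
            curWorker = null;
        }
    }
}
```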
[jira] [Comment Edited] (HBASE-22432) HRegionServer rssStub handling is incorrect and inconsistent
[ https://issues.apache.org/jira/browse/HBASE-22432?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16840884#comment-16840884 ] Sergey Shelukhin edited comment on HBASE-22432 at 5/16/19 12:46 AM: Let's see how many tests this breaks. If the change in behavior breaks too many, I will add explicit flag to distinguish refresh vs shutdown and read that in ensure... and replace best-effort stuff with an atomic or smth like that. was (Author: sershe): Let's see how many tests this breaks. If the change in behavior breaks too many, I will add explicit flag to distinguish refresh vs shutdown and read that in ensure... > HRegionServer rssStub handling is incorrect and inconsistent > > > Key: HBASE-22432 > URL: https://issues.apache.org/jira/browse/HBASE-22432 > Project: HBase > Issue Type: Bug >Affects Versions: 3.0.0 >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin >Priority: Critical > > Some methods refresh stub on null, some assume (incorrectly) server is > shutting down. > Most methods reset stub to null on error, and also now we do it when ZK > changes. > One of the latter can cause server reports to not be sent until one of the > former methods (e.g. reporting procedure completion) executes and happens to > restore the stub for them. > Also, reset on error is done sometimes with and sometimes without a check. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
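The "replace best-effort stuff with an atomic" idea from the comment above can be sketched with an AtomicReference: reset-on-error only succeeds if the stub is still the instance that failed, so a stub refreshed concurrently by another thread is not wiped out. This is a generic sketch (the stub type is stood in for by Object), not the actual HRegionServer code.

```java
import java.util.concurrent.atomic.AtomicReference;

// Hedged sketch of "reset with a check" for a shared RPC stub.
public class StubHolder {
    private final AtomicReference<Object> rssStub = new AtomicReference<>();

    public void set(Object stub) { rssStub.set(stub); }

    public Object get() { return rssStub.get(); }

    // Clear the stub only if it is still the instance that saw the error;
    // a no-op if another thread already refreshed it.
    public boolean resetIfCurrent(Object failedStub) {
        return rssStub.compareAndSet(failedStub, null);
    }
}
```

With this pattern, "reset sometimes with and sometimes without a check" collapses into one consistent compare-and-set.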
[jira] [Updated] (HBASE-22432) HRegionServer rssStub handling is incorrect and inconsistent
[ https://issues.apache.org/jira/browse/HBASE-22432?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HBASE-22432: - Description: Some methods refresh stub on null, some assume (incorrectly) server is shutting down. Most methods reset stub to null on error, and also now we do it when ZK changes. One of the latter can cause server reports to not be sent until one of the former methods (e.g. reporting procedure completion) executes and happens to restore the stub for them. Also, reset on error is done sometimes with and sometimes without a check. was: Some methods refresh stub on null, some assume (incorrectly) server is shutting down. Most methods reset stub to null on error, and also now we do it when ZK changes. One of the latter can cause server reports to not be sent until one of the former methods (e.g. reporting procedure completion) executes and happens to restore the stub for them. Reset is done sometimes with and sometimes without a check. > HRegionServer rssStub handling is incorrect and inconsistent > > > Key: HBASE-22432 > URL: https://issues.apache.org/jira/browse/HBASE-22432 > Project: HBase > Issue Type: Bug >Affects Versions: 3.0.0 >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin >Priority: Critical > > Some methods refresh stub on null, some assume (incorrectly) server is > shutting down. > Most methods reset stub to null on error, and also now we do it when ZK > changes. > One of the latter can cause server reports to not be sent until one of the > former methods (e.g. reporting procedure completion) executes and happens to > restore the stub for them. > Also, reset on error is done sometimes with and sometimes without a check. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HBASE-22432) HRegionServer rssStub handling is incorrect and inconsistent
[ https://issues.apache.org/jira/browse/HBASE-22432?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HBASE-22432: - Affects Version/s: 3.0.0 > HRegionServer rssStub handling is incorrect and inconsistent > > > Key: HBASE-22432 > URL: https://issues.apache.org/jira/browse/HBASE-22432 > Project: HBase > Issue Type: Bug >Affects Versions: 3.0.0 >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin >Priority: Critical > > Some methods refresh stub on null, some assume (incorrectly) server is > shutting down. > Most methods reset stub to null on error, and also now we do it when ZK > changes. > One of the latter can cause server reports to not be sent until one of the > former methods executes and happens to restore the stub. > Reset is done sometimes with and sometimes without a check. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HBASE-22432) HRegionServer rssStub handling is incorrect and inconsistent
[ https://issues.apache.org/jira/browse/HBASE-22432?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HBASE-22432: - Description: Some methods refresh stub on null, some assume (incorrectly) server is shutting down. Most methods reset stub to null on error, and also now we do it when ZK changes. One of the latter can cause server reports to not be sent until one of the former methods (e.g. reporting procedure completion) executes and happens to restore the stub for them. Reset is done sometimes with and sometimes without a check. was: Some methods refresh stub on null, some assume (incorrectly) server is shutting down. Most methods reset stub to null on error, and also now we do it when ZK changes. One of the latter can cause server reports to not be sent until one of the former methods executes and happens to restore the stub. Reset is done sometimes with and sometimes without a check. > HRegionServer rssStub handling is incorrect and inconsistent > > > Key: HBASE-22432 > URL: https://issues.apache.org/jira/browse/HBASE-22432 > Project: HBase > Issue Type: Bug >Affects Versions: 3.0.0 >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin >Priority: Critical > > Some methods refresh stub on null, some assume (incorrectly) server is > shutting down. > Most methods reset stub to null on error, and also now we do it when ZK > changes. > One of the latter can cause server reports to not be sent until one of the > former methods (e.g. reporting procedure completion) executes and happens to > restore the stub for them. > Reset is done sometimes with and sometimes without a check. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Comment Edited] (HBASE-22432) HRegionServer rssStub handling is incorrect and inconsistent
[ https://issues.apache.org/jira/browse/HBASE-22432?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16840884#comment-16840884 ] Sergey Shelukhin edited comment on HBASE-22432 at 5/16/19 12:44 AM: Let's see how many tests this break. If the change in behavior breaks too many, I will add explicit flag to distinguish refresh vs shutdown and read that in ensure... was (Author: sershe): Let's see how many tests this break. If the change in behavior breaks too many, I will add explicit flag to distinguish refresh vs shutdown and read that in ensure... > HRegionServer rssStub handling is incorrect and inconsistent > > > Key: HBASE-22432 > URL: https://issues.apache.org/jira/browse/HBASE-22432 > Project: HBase > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin >Priority: Critical > > Some methods refresh stub on null, some assume (incorrectly) server is > shutting down. > Most methods reset stub to null on error, and also now we do it when ZK > changes. > One of the latter can cause server reports to not be sent until one of the > former methods executes and happens to restore the stub. > Reset is done sometimes with and sometimes without a check. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HBASE-22432) HRegionServer rssStub handling is incorrect and inconsistent
[ https://issues.apache.org/jira/browse/HBASE-22432?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HBASE-22432: - Description: Some methods refresh stub on null, some assume (incorrectly) server is shutting down. Most methods reset stub to null on error, and also now we do it when ZK changes. On of the latter can cause server reports to not be sent until one of the former methods executes and happens to restore the stub. Reset is done sometimes with and sometimes without a check. was: Some methods refresh stub on null, some assume (incorrectly) server is shutting down. The latter can cause server reports to not be sent until one of the former methods executes and happens to restore the stub. Reset is done sometimes with and sometimes without a check. > HRegionServer rssStub handling is incorrect and inconsistent > > > Key: HBASE-22432 > URL: https://issues.apache.org/jira/browse/HBASE-22432 > Project: HBase > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin >Priority: Major > > Some methods refresh stub on null, some assume (incorrectly) server is > shutting down. > Most methods reset stub to null on error, and also now we do it when ZK > changes. > On of the latter can cause server reports to not be sent until one of the > former methods executes and happens to restore the stub. > Reset is done sometimes with and sometimes without a check. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HBASE-22432) HRegionServer rssStub handling is incorrect and inconsistent
[ https://issues.apache.org/jira/browse/HBASE-22432?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HBASE-22432: - Status: Patch Available (was: Open) Let's see how many tests this break. If the change in behavior breaks too many, I will add explicit flag to distinguish refresh vs shutdown and read that in ensure... > HRegionServer rssStub handling is incorrect and inconsistent > > > Key: HBASE-22432 > URL: https://issues.apache.org/jira/browse/HBASE-22432 > Project: HBase > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin >Priority: Major > > Some methods refresh stub on null, some assume (incorrectly) server is > shutting down. > The latter can cause server reports to not be sent until one of the former > methods executes and happens to restore the stub. > Reset is done sometimes with and sometimes without a check. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Comment Edited] (HBASE-22432) HRegionServer rssStub handling is incorrect and inconsistent
[ https://issues.apache.org/jira/browse/HBASE-22432?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16840884#comment-16840884 ] Sergey Shelukhin edited comment on HBASE-22432 at 5/16/19 12:45 AM: Let's see how many tests this breaks. If the change in behavior breaks too many, I will add explicit flag to distinguish refresh vs shutdown and read that in ensure... was (Author: sershe): Let's see how many tests this break. If the change in behavior breaks too many, I will add explicit flag to distinguish refresh vs shutdown and read that in ensure... > HRegionServer rssStub handling is incorrect and inconsistent > > > Key: HBASE-22432 > URL: https://issues.apache.org/jira/browse/HBASE-22432 > Project: HBase > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin >Priority: Critical > > Some methods refresh stub on null, some assume (incorrectly) server is > shutting down. > Most methods reset stub to null on error, and also now we do it when ZK > changes. > One of the latter can cause server reports to not be sent until one of the > former methods executes and happens to restore the stub. > Reset is done sometimes with and sometimes without a check. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HBASE-22432) HRegionServer rssStub handling is incorrect and inconsistent
[ https://issues.apache.org/jira/browse/HBASE-22432?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HBASE-22432: - Description: Some methods refresh stub on null, some assume (incorrectly) server is shutting down. Most methods reset stub to null on error, and also now we do it when ZK changes. One of the latter can cause server reports to not be sent until one of the former methods executes and happens to restore the stub. Reset is done sometimes with and sometimes without a check. was: Some methods refresh stub on null, some assume (incorrectly) server is shutting down. Most methods reset stub to null on error, and also now we do it when ZK changes. On of the latter can cause server reports to not be sent until one of the former methods executes and happens to restore the stub. Reset is done sometimes with and sometimes without a check. > HRegionServer rssStub handling is incorrect and inconsistent > > > Key: HBASE-22432 > URL: https://issues.apache.org/jira/browse/HBASE-22432 > Project: HBase > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin >Priority: Critical > > Some methods refresh stub on null, some assume (incorrectly) server is > shutting down. > Most methods reset stub to null on error, and also now we do it when ZK > changes. > One of the latter can cause server reports to not be sent until one of the > former methods executes and happens to restore the stub. > Reset is done sometimes with and sometimes without a check. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HBASE-22432) HRegionServer rssStub handling is incorrect and inconsistent
[ https://issues.apache.org/jira/browse/HBASE-22432?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HBASE-22432: - Priority: Critical (was: Major) > HRegionServer rssStub handling is incorrect and inconsistent > > > Key: HBASE-22432 > URL: https://issues.apache.org/jira/browse/HBASE-22432 > Project: HBase > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin >Priority: Critical > > Some methods refresh stub on null, some assume (incorrectly) server is > shutting down. > Most methods reset stub to null on error, and also now we do it when ZK > changes. > On of the latter can cause server reports to not be sent until one of the > former methods executes and happens to restore the stub. > Reset is done sometimes with and sometimes without a check. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (HBASE-22432) HRegionServer rssStub handling is incorrect and inconsistent
[ https://issues.apache.org/jira/browse/HBASE-22432?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin reassigned HBASE-22432: Assignee: Sergey Shelukhin > HRegionServer rssStub handling is incorrect and inconsistent > > > Key: HBASE-22432 > URL: https://issues.apache.org/jira/browse/HBASE-22432 > Project: HBase > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin >Priority: Major > > Some methods refresh stub on null, some assume (incorrectly) server is > shutting down. > The latter can cause server reports to not be sent until one of the former > methods executes and happens to restore the stub. > Reset is done sometimes with and sometimes without a check. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HBASE-22432) HRegionServer rssStub handling is incorrect and inconsistent
Sergey Shelukhin created HBASE-22432: Summary: HRegionServer rssStub handling is incorrect and inconsistent Key: HBASE-22432 URL: https://issues.apache.org/jira/browse/HBASE-22432 Project: HBase Issue Type: Bug Reporter: Sergey Shelukhin Some methods refresh stub on null, some assume (incorrectly) server is shutting down. The latter can cause server reports to not be sent until one of the former methods executes and happens to restore the stub. Reset is done sometimes with and sometimes without a check. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HBASE-22428) better client-side throttling for dropped calls
Sergey Shelukhin created HBASE-22428: Summary: better client-side throttling for dropped calls Key: HBASE-22428 URL: https://issues.apache.org/jira/browse/HBASE-22428 Project: HBase Issue Type: Bug Reporter: Sergey Shelukhin Not sure yet how to implement this better. Either when we get CallTimeoutException on the client, or by having the timeout on the server be less than the RPC timeout to be able to actually respond to the client, we could do a better job of throttling retries. Right now if multiple clients are overloading a server and calls start to be dropped, they just all retry and keep the server overloaded. The server might have to track when requests from a client timed out to fail more aggressively when processing time is high. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
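One common way to keep retries from hammering an already-overloaded server is exponential backoff with full jitter: each attempt doubles the sleep ceiling and picks a random delay under it, so retrying clients spread out instead of synchronizing. This is a generic sketch of that technique, not the mechanism HBASE-22428 proposes.

```java
import java.util.concurrent.ThreadLocalRandom;

// Exponential backoff with full jitter for retry throttling (sketch).
public class RetryBackoff {
    // Returns a randomized sleep in ms for the given 0-based attempt:
    // the ceiling doubles per attempt, capped at maxMs, and the actual
    // delay is uniform in [0, ceiling].
    public static long backoffMs(int attempt, long baseMs, long maxMs) {
        long ceiling = Math.min(maxMs, baseMs << Math.min(attempt, 30));
        return ThreadLocalRandom.current().nextLong(ceiling + 1);
    }
}
```

A client would call this after each CallTimeoutException before retrying, instead of retrying immediately.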
[jira] [Updated] (HBASE-22410) add the notion of the expected # of servers for non-fixed server sets; report an alternative dead server metric
[ https://issues.apache.org/jira/browse/HBASE-22410?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HBASE-22410: - Status: Patch Available (was: Open) [~andrew.purt...@gmail.com] [~busbey] I cloned HBASE-22107 to add a better metric for the compute/etc scenarios... cleaning dead region server list after a timeout still wouldn't provide a reliable number, although it could still be done for maintainability in the original JIRA. > add the notion of the expected # of servers for non-fixed server sets; report > an alternative dead server metric > --- > > Key: HBASE-22410 > URL: https://issues.apache.org/jira/browse/HBASE-22410 > Project: HBase > Issue Type: Improvement > Components: Operability >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin >Priority: Major > > dead servers appear to only be cleaned up when a server comes up on the same > host and port; however, if HBase is running on smth like YARN with many more > hosts than RSes, RS may come up on a different server and the dead one will > never be cleaned. > The metric should be improved to account for that... it will potentially > require configuring master with expected number of region servers, so that > the metric could be output based on that. > Dead server list should also be expired based on timestamp in such cases. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HBASE-22410) add the notion of the expected # of servers for non-fixed server sets; report an alternative dead server metric
[ https://issues.apache.org/jira/browse/HBASE-22410?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HBASE-22410: - Summary: add the notion of the expected # of servers for non-fixed server sets; report an alternative dead server metric (was: add the notion of the expected # of servers and report a metric as an alternative to dead server metric for non-fixed server sets ) > add the notion of the expected # of servers for non-fixed server sets; report > an alternative dead server metric > --- > > Key: HBASE-22410 > URL: https://issues.apache.org/jira/browse/HBASE-22410 > Project: HBase > Issue Type: Improvement > Components: Operability >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin >Priority: Major > > dead servers appear to only be cleaned up when a server comes up on the same > host and port; however, if HBase is running on smth like YARN with many more > hosts than RSes, RS may come up on a different server and the dead one will > never be cleaned. > The metric should be improved to account for that... it will potentially > require configuring master with expected number of region servers, so that > the metric could be output based on that. > Dead server list should also be expired based on timestamp in such cases. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HBASE-22410) add the notion of the expected # of servers and report a metric as an alternative to dead server metric for non-fixed server sets
Sergey Shelukhin created HBASE-22410: Summary: add the notion of the expected # of servers and report a metric as an alternative to dead server metric for non-fixed server sets Key: HBASE-22410 URL: https://issues.apache.org/jira/browse/HBASE-22410 Project: HBase Issue Type: Improvement Components: Operability Reporter: Sergey Shelukhin dead servers appear to only be cleaned up when a server comes up on the same host and port; however, if HBase is running on smth like YARN with many more hosts than RSes, RS may come up on a different server and the dead one will never be cleaned. The metric should be improved to account for that... it will potentially require configuring master with expected number of region servers, so that the metric could be output based on that. Dead server list should also be expired based on timestamp in such cases. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
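The two ideas in the description above, expiring dead-server entries by timestamp and reporting the shortfall against a configured expected server count, can be sketched as follows. All names here are illustrative assumptions, not the HBase master's API.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Sketch: timestamp-based dead-server expiry plus an "expected servers"
// shortfall metric for non-fixed server sets (e.g. HBase on YARN).
public class DeadServerTracker {
    private final Map<String, Long> deadSince = new ConcurrentHashMap<>();
    private final long expiryMs;
    private final int expectedServers;

    public DeadServerTracker(long expiryMs, int expectedServers) {
        this.expiryMs = expiryMs;
        this.expectedServers = expectedServers;
    }

    public void markDead(String server, long nowMs) {
        deadSince.put(server, nowMs);
    }

    // Drop stale entries: a replacement RS may come up on a different
    // host, so host:port reuse can't be relied on to clean the list.
    public void expire(long nowMs) {
        deadSince.values().removeIf(t -> nowMs - t > expiryMs);
    }

    // Alternative metric: how far below the expected count we are.
    public int missingServers(int liveServers) {
        return Math.max(0, expectedServers - liveServers);
    }

    public int deadCount() {
        return deadSince.size();
    }
}
```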
[jira] [Updated] (HBASE-22410) add the notion of the expected # of servers and report a metric as an alternative to dead server metric for non-fixed server sets
[ https://issues.apache.org/jira/browse/HBASE-22410?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HBASE-22410: - Priority: Major (was: Minor) > add the notion of the expected # of servers and report a metric as an > alternative to dead server metric for non-fixed server sets > -- > > Key: HBASE-22410 > URL: https://issues.apache.org/jira/browse/HBASE-22410 > Project: HBase > Issue Type: Improvement > Components: Operability >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin >Priority: Major > > dead servers appear to only be cleaned up when a server comes up on the same > host and port; however, if HBase is running on smth like YARN with many more > hosts than RSes, RS may come up on a different server and the dead one will > never be cleaned. > The metric should be improved to account for that... it will potentially > require configuring master with expected number of region servers, so that > the metric could be output based on that. > Dead server list should also be expired based on timestamp in such cases. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (HBASE-22410) add the notion of the expected # of servers and report a metric as an alternative to dead server metric for non-fixed server sets
[ https://issues.apache.org/jira/browse/HBASE-22410?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin reassigned HBASE-22410: Assignee: Sergey Shelukhin > add the notion of the expected # of servers and report a metric as an > alternative to dead server metric for non-fixed server sets > -- > > Key: HBASE-22410 > URL: https://issues.apache.org/jira/browse/HBASE-22410 > Project: HBase > Issue Type: Improvement > Components: Operability >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin >Priority: Minor > > dead servers appear to only be cleaned up when a server comes up on the same > host and port; however, if HBase is running on smth like YARN with many more > hosts than RSes, RS may come up on a different server and the dead one will > never be cleaned. > The metric should be improved to account for that... it will potentially > require configuring master with expected number of region servers, so that > the metric could be output based on that. > Dead server list should also be expired based on timestamp in such cases. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HBASE-22408) add a metric for regions OPEN on non-live servers
[ https://issues.apache.org/jira/browse/HBASE-22408?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HBASE-22408: - Status: Patch Available (was: Open) Posted a PR. [~Apache9] do you mind taking a look? these metrics should be useful to catch the assignment issues and delays > add a metric for regions OPEN on non-live servers > - > > Key: HBASE-22408 > URL: https://issues.apache.org/jira/browse/HBASE-22408 > Project: HBase > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin >Priority: Major > > This serves 2 purposes for monitoring: > 1) Catching when regions are on dead servers due to long WAL splitting or > other delays in SCP. At that time, the regions are not listed as RITs; we'd > like to be able to have alerts in such cases. > 2) Catching various bugs in assignment and procWAL corruption, etc. that > leave region "OPEN" on a server that no longer exists, again to alert the > administrator via a metric. > Later, it might be possible to add more logic to distinguish 1 and 2, and to > mitigate 2 automatically and also set some metric to alert the administrator > to investigate later. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HBASE-22408) add a metric for regions OPEN on non-live servers
Sergey Shelukhin created HBASE-22408: Summary: add a metric for regions OPEN on non-live servers Key: HBASE-22408 URL: https://issues.apache.org/jira/browse/HBASE-22408 Project: HBase Issue Type: Bug Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin This serves 2 purposes for monitoring: 1) Catching when regions are on dead servers due to long WAL splitting or other delays in SCP; at that time, the regions are not listed as RITs; we'd like to be able to have alerts in such cases. 2) Catching various bugs in assignment and procWAL corruption, etc. that leave region "OPEN" on a server that no longer exists, again to alert the administrator via a metric. Later, it might be possible to add more logic to distinguish 1 and 2, and add logic to mitigate 2 automatically and also set some metric to alert the administrator to investigate later. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HBASE-22408) add a metric for regions OPEN on non-live servers
[ https://issues.apache.org/jira/browse/HBASE-22408?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HBASE-22408: - Description: This serves 2 purposes for monitoring: 1) Catching when regions are on dead servers due to long WAL splitting or other delays in SCP. At that time, the regions are not listed as RITs; we'd like to be able to have alerts in such cases. 2) Catching various bugs in assignment and procWAL corruption, etc. that leave region "OPEN" on a server that no longer exists, again to alert the administrator via a metric. Later, it might be possible to add more logic to distinguish 1 and 2, and add logic to mitigate 2 automatically and also set some metric to alert the administrator to investigate later. was: This serves 2 purposes for monitoring: 1) Catching when regions are on dead servers due to long WAL splitting or other delays in SCP; at that time, the regions are not listed as RITs; we'd like to be able to have alerts in such cases. 2) Catching various bugs in assignment and procWAL corruption, etc. that leave region "OPEN" on a server that no longer exists, again to alert the administrator via a metric. Later, it might be possible to add more logic to distinguish 1 and 2, and add logic to mitigate 2 automatically and also set some metric to alert the administrator to investigate later. > add a metric for regions OPEN on non-live servers > - > > Key: HBASE-22408 > URL: https://issues.apache.org/jira/browse/HBASE-22408 > Project: HBase > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin >Priority: Major > > This serves 2 purposes for monitoring: > 1) Catching when regions are on dead servers due to long WAL splitting or > other delays in SCP. At that time, the regions are not listed as RITs; we'd > like to be able to have alerts in such cases. > 2) Catching various bugs in assignment and procWAL corruption, etc. 
that > leave region "OPEN" on a server that no longer exists, again to alert the > administrator via a metric. > Later, it might be possible to add more logic to distinguish 1 and 2, and add > logic to mitigate 2 automatically and also set some metric to alert the > administrator to investigate later. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
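The check this metric describes can be sketched in a few lines: given the assignment map and the set of live servers, count regions recorded as OPEN on a server that is not live. This is a Python illustration with hypothetical names (HBase itself is Java, and the real logic lives in the master's assignment code), not the actual implementation.

```python
def count_regions_on_dead_servers(assignments, live_servers):
    """Count regions recorded as OPEN on a server that is not live.

    assignments: dict mapping region name -> (state, server name)
    live_servers: set of server names currently considered alive
    """
    return sum(
        1
        for state, server in assignments.values()
        if state == "OPEN" and server not in live_servers
    )

assignments = {
    "region-a": ("OPEN", "rs1"),
    "region-b": ("OPEN", "rs2"),    # rs2 is gone, e.g. its SCP is delayed by WAL splitting
    "region-c": ("CLOSED", "rs2"),  # not OPEN, so not counted
}
print(count_regions_on_dead_servers(assignments, {"rs1"}))  # 1
```

Note that region-b is exactly the case the issue describes: it is not a RIT, so only a metric like this would surface it for alerting.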
[jira] [Updated] (HBASE-22408) add a metric for regions OPEN on non-live servers
[ https://issues.apache.org/jira/browse/HBASE-22408?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HBASE-22408: - Description: This serves 2 purposes for monitoring: 1) Catching when regions are on dead servers due to long WAL splitting or other delays in SCP. At that time, the regions are not listed as RITs; we'd like to be able to have alerts in such cases. 2) Catching various bugs in assignment and procWAL corruption, etc. that leave region "OPEN" on a server that no longer exists, again to alert the administrator via a metric. Later, it might be possible to add more logic to distinguish 1 and 2, and to mitigate 2 automatically and also set some metric to alert the administrator to investigate later. was: This serves 2 purposes for monitoring: 1) Catching when regions are on dead servers due to long WAL splitting or other delays in SCP. At that time, the regions are not listed as RITs; we'd like to be able to have alerts in such cases. 2) Catching various bugs in assignment and procWAL corruption, etc. that leave region "OPEN" on a server that no longer exists, again to alert the administrator via a metric. Later, it might be possible to add more logic to distinguish 1 and 2, and add logic to mitigate 2 automatically and also set some metric to alert the administrator to investigate later. > add a metric for regions OPEN on non-live servers > - > > Key: HBASE-22408 > URL: https://issues.apache.org/jira/browse/HBASE-22408 > Project: HBase > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin >Priority: Major > > This serves 2 purposes for monitoring: > 1) Catching when regions are on dead servers due to long WAL splitting or > other delays in SCP. At that time, the regions are not listed as RITs; we'd > like to be able to have alerts in such cases. > 2) Catching various bugs in assignment and procWAL corruption, etc. 
that > leave region "OPEN" on a server that no longer exists, again to alert the > administrator via a metric. > Later, it might be possible to add more logic to distinguish 1 and 2, and to > mitigate 2 automatically and also set some metric to alert the administrator > to investigate later. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-22407) add an option to use Hadoop metrics tags for table metrics (and fix some issues in metrics)
[ https://issues.apache.org/jira/browse/HBASE-22407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16838856#comment-16838856 ] Sergey Shelukhin commented on HBASE-22407: -- Most of the changes are actually just refactoring, like moving code into overridable methods so it could be overridden > add an option to use Hadoop metrics tags for table metrics (and fix some > issues in metrics) > --- > > Key: HBASE-22407 > URL: https://issues.apache.org/jira/browse/HBASE-22407 > Project: HBase > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin >Priority: Major > Attachments: HBASE-22407.01.patch > > > Currently table metrics are output using custom metrics names that clutter > various metrics lists and are impossible to (sanely) aggregate. > We can use Hadoop MetricsTag to instead use tagging on a single metric (for a > given logical metric), allowing both per-table display and cross-table > aggregation on the other end. > In this JIRA (patch coming) I'd like to add the ability to do that > 1) Actual tagging in multiple paths that output table metrics. > 2) The ugliest part - preventing server-level metrics from being output in > tags case to avoid duplicate metrics. Seems like a large refactor of the > metrics is in order (not included)... > 3) Fixes for some issues where wrong metrics are output, metrics are not > output at all, exceptions like null Optional cause table metrics to not be > output forever, etc. > 4) Renaming several table-level latency metrics to be consistent with > server-level latency metrics. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HBASE-22407) add an option to use Hadoop metrics tags for table metrics (and fix some issues in metrics)
[ https://issues.apache.org/jira/browse/HBASE-22407?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HBASE-22407: - Status: Patch Available (was: Open) > add an option to use Hadoop metrics tags for table metrics (and fix some > issues in metrics) > --- > > Key: HBASE-22407 > URL: https://issues.apache.org/jira/browse/HBASE-22407 > Project: HBase > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin >Priority: Major > Attachments: HBASE-22407.01.patch > > > Currently table metrics are output using custom metrics names that clutter > various metrics lists and are impossible to (sanely) aggregate. > We can use Hadoop MetricsTag to instead use tagging on a single metric (for a > given logical metric), allowing both per-table display and cross-table > aggregation on the other end. > In this JIRA (patch coming) I'd like to add the ability to do that > 1) Actual tagging in multiple paths that output table metrics. > 2) The ugliest part - preventing server-level metrics from being output in > tags case to avoid duplicate metrics. Seems like a large refactor of the > metrics is in order (not included)... > 3) Fixes for some issues where wrong metrics are output, metrics are not > output at all, exceptions like null Optional cause table metrics to not be > output forever, etc. > 4) Renaming several table-level latency metrics to be consistent with > server-level latency metrics. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HBASE-22407) add an option to use Hadoop metrics tags for table metrics (and fix some issues in metrics)
[ https://issues.apache.org/jira/browse/HBASE-22407?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HBASE-22407: - Attachment: HBASE-22407.01.patch > add an option to use Hadoop metrics tags for table metrics (and fix some > issues in metrics) > --- > > Key: HBASE-22407 > URL: https://issues.apache.org/jira/browse/HBASE-22407 > Project: HBase > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin >Priority: Major > Attachments: HBASE-22407.01.patch > > > Currently table metrics are output using custom metrics names that clutter > various metrics lists and are impossible to (sanely) aggregate. > We can use Hadoop MetricsTag to instead use tagging on a single metric (for a > given logical metric), allowing both per-table display and cross-table > aggregation on the other end. > In this JIRA (patch coming) I'd like to add the ability to do that > 1) Actual tagging in multiple paths that output table metrics. > 2) The ugliest part - preventing server-level metrics from being output in > tags case to avoid duplicate metrics. Seems like a large refactor of the > metrics is in order (not included)... > 3) Fixes for some issues where wrong metrics are output, metrics are not > output at all, exceptions like null Optional cause table metrics to not be > output forever, etc. > 4) Renaming several table-level latency metrics to be consistent with > server-level latency metrics. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HBASE-22407) add an option to use Hadoop metrics tags for table metrics (and fix some issues in metrics)
Sergey Shelukhin created HBASE-22407: Summary: add an option to use Hadoop metrics tags for table metrics (and fix some issues in metrics) Key: HBASE-22407 URL: https://issues.apache.org/jira/browse/HBASE-22407 Project: HBase Issue Type: Bug Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Currently table metrics are output using custom metrics names that clutter various metrics lists and are impossible to (sanely) aggregate. We can use Hadoop MetricsTag to instead use tagging on a single metric (for a given logical metric), allowing both per-table display and cross-table aggregation on the other end. In this JIRA (patch coming) I'd like to add the ability to do that 1) Actual tagging in multiple paths that output table metrics. 2) The ugliest part - preventing server-level metrics from being output in tags case to avoid duplicate metrics. Seems like a large refactor of the metrics is in order (not included)... 3) Fixes for some issues where wrong metrics are output, metrics are not output at all, exceptions like null Optional cause table metrics to not be output forever, etc. 4) Renaming several table-level latency metrics to be consistent with server-level latency metrics. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
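The difference between name-mangled per-table metrics and a single tagged metric can be illustrated with plain data structures. This is a Python sketch with made-up metric names (HBase's actual naming scheme and Hadoop's MetricsTag API differ in detail); it only shows why tags make cross-table aggregation trivial.

```python
# Today: one metric per table, with the table baked into the name
# (illustrative names, not HBase's exact scheme). Aggregating across
# tables requires parsing the metric names.
mangled = {
    "table_t1_getCount": 10,
    "table_t2_getCount": 5,
}

# With tags: one logical metric, with the table carried as a tag.
# Consumers can either display per-table series or aggregate across
# tables, with no name parsing on the other end.
tagged = [
    ("getCount", {"table": "t1"}, 10),
    ("getCount", {"table": "t2"}, 5),
]

per_table = {tags["table"]: v for name, tags, v in tagged if name == "getCount"}
total = sum(v for name, tags, v in tagged if name == "getCount")
print(per_table, total)  # {'t1': 10, 't2': 5} 15
```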
[jira] [Commented] (HBASE-22254) refactor and improve decommissioning logic
[ https://issues.apache.org/jira/browse/HBASE-22254?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16838786#comment-16838786 ] Sergey Shelukhin commented on HBASE-22254: -- The tests are now passing. > refactor and improve decommissioning logic > -- > > Key: HBASE-22254 > URL: https://issues.apache.org/jira/browse/HBASE-22254 > Project: HBase > Issue Type: Improvement >Affects Versions: 3.0.0 >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin >Priority: Major > Attachments: HBASE-22254.01.patch, HBASE-22254.02.patch, > HBASE-22254.03.patch, HBASE-22254.patch > > > Making some changes needed to support better decommissioning on large > clusters and with container mode; to test those and add clarity I moved parts > of decommissioning logic from HMaster, Draining tracker, and ServerManager > into a separate class. > Features added/improvements: > 1) More resilient off-loading; right now off-loading fails for a subset of > regions in case of a single region failure; is never done on master restart, > etc. > 2) Option to kill RS after off-loading (good for container mode HBase, e.g. > on YARN). > 3) Option to specify machine names only to decommission, for the API to be > usable for an external system that doesn't care about HBase server names, or > e.g. multiple RS in containers on the same node. > 4) Option to replace existing decommissioning list instead of adding to it > (the same; to avoid additionally remembering what was previously sent to > HBase). > 5) Tests, comments ;) -- This message was sent by Atlassian JIRA (v7.6.3#76005)
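Point 1 in the issue description above (off-loading that tolerates per-region failures instead of aborting on the first one) can be sketched as a collect-and-continue loop. This is a hedged Python illustration with a hypothetical `move_region` callback, not HBase's actual decommissioning code.

```python
def offload_regions(regions, move_region):
    """Move every region off a decommissioning server, continuing past
    individual failures instead of aborting the whole off-load."""
    failed = []
    for region in regions:
        try:
            move_region(region)
        except RuntimeError:
            failed.append(region)  # remember it so the caller can retry later
    return failed

def flaky_move(region):
    """Stand-in mover that fails for one region."""
    if region == "region-b":
        raise RuntimeError("move of %s failed" % region)

print(offload_regions(["region-a", "region-b", "region-c"], flaky_move))
# ['region-b']
```

The design point is simply that one bad region no longer blocks off-loading the rest; the failures are surfaced for retry rather than swallowed or fatal.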
[jira] [Updated] (HBASE-22376) master can fail to start w/NPE if lastflushedseqids file is empty
[ https://issues.apache.org/jira/browse/HBASE-22376?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HBASE-22376: - Resolution: Fixed Status: Resolved (was: Patch Available) Committed to master only; the fix in HBASE-20727 only exists on master > master can fail to start w/NPE if lastflushedseqids file is empty > - > > Key: HBASE-22376 > URL: https://issues.apache.org/jira/browse/HBASE-22376 > Project: HBase > Issue Type: Bug >Affects Versions: 3.0.0 >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin >Priority: Major > Fix For: 3.0.0 > > Attachments: HBASE-22376.patch > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HBASE-22376) master can fail to start w/NPE if lastflushedseqids file is empty
[ https://issues.apache.org/jira/browse/HBASE-22376?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HBASE-22376: - Affects Version/s: (was: 2.2.0) > master can fail to start w/NPE if lastflushedseqids file is empty > - > > Key: HBASE-22376 > URL: https://issues.apache.org/jira/browse/HBASE-22376 > Project: HBase > Issue Type: Bug >Affects Versions: 3.0.0 >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin >Priority: Major > Fix For: 3.0.0 > > Attachments: HBASE-22376.patch > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-22376) master can fail to start w/NPE if lastflushedseqids file is empty
[ https://issues.apache.org/jira/browse/HBASE-22376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16837672#comment-16837672 ] Sergey Shelukhin commented on HBASE-22376: -- Thanks for the review! > master can fail to start w/NPE if lastflushedseqids file is empty > - > > Key: HBASE-22376 > URL: https://issues.apache.org/jira/browse/HBASE-22376 > Project: HBase > Issue Type: Bug >Affects Versions: 3.0.0 >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin >Priority: Major > Fix For: 3.0.0 > > Attachments: HBASE-22376.patch > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HBASE-22376) master can fail to start w/NPE if lastflushedseqids file is empty
[ https://issues.apache.org/jira/browse/HBASE-22376?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HBASE-22376: - Fix Version/s: (was: 2.2.0) > master can fail to start w/NPE if lastflushedseqids file is empty > - > > Key: HBASE-22376 > URL: https://issues.apache.org/jira/browse/HBASE-22376 > Project: HBase > Issue Type: Bug >Affects Versions: 3.0.0, 2.2.0 >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin >Priority: Major > Fix For: 3.0.0 > > Attachments: HBASE-22376.patch > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HBASE-22254) refactor and improve decommissioning logic
[ https://issues.apache.org/jira/browse/HBASE-22254?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HBASE-22254: - Attachment: (was: HBASE-22254.03.patch) > refactor and improve decommissioning logic > -- > > Key: HBASE-22254 > URL: https://issues.apache.org/jira/browse/HBASE-22254 > Project: HBase > Issue Type: Improvement >Affects Versions: 3.0.0 >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin >Priority: Major > Attachments: HBASE-22254.01.patch, HBASE-22254.02.patch, > HBASE-22254.03.patch, HBASE-22254.patch > > > Making some changes needed to support better decommissioning on large > clusters and with container mode; to test those and add clarify I moved parts > of decommissioning logic from HMaster, Draining tracker, and ServerManager > into a separate class. > Features added/improvements: > 1) More resilient off-loading; right now off-loading fails for a subset of > regions in case of a single region failure; is never done on master restart, > etc. > 2) Option to kill RS after off-loading (good for container mode HBase, e.g. > on YARN). > 3) Option to specify machine names only to decommission, for the API to be > usable for an external system that doesn't care about HBase server names, or > e.g. multiple RS in containers on the same node. > 4) Option to replace existing decommissioning list instead of adding to it > (the same; to avoid additionally remembering what was previously sent to > HBase). > 5) Tests, comments ;) -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HBASE-22254) refactor and improve decommissioning logic
[ https://issues.apache.org/jira/browse/HBASE-22254?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HBASE-22254: - Attachment: HBASE-22254.03.patch > refactor and improve decommissioning logic > -- > > Key: HBASE-22254 > URL: https://issues.apache.org/jira/browse/HBASE-22254 > Project: HBase > Issue Type: Improvement >Affects Versions: 3.0.0 >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin >Priority: Major > Attachments: HBASE-22254.01.patch, HBASE-22254.02.patch, > HBASE-22254.03.patch, HBASE-22254.patch > > > Making some changes needed to support better decommissioning on large > clusters and with container mode; to test those and add clarify I moved parts > of decommissioning logic from HMaster, Draining tracker, and ServerManager > into a separate class. > Features added/improvements: > 1) More resilient off-loading; right now off-loading fails for a subset of > regions in case of a single region failure; is never done on master restart, > etc. > 2) Option to kill RS after off-loading (good for container mode HBase, e.g. > on YARN). > 3) Option to specify machine names only to decommission, for the API to be > usable for an external system that doesn't care about HBase server names, or > e.g. multiple RS in containers on the same node. > 4) Option to replace existing decommissioning list instead of adding to it > (the same; to avoid additionally remembering what was previously sent to > HBase). > 5) Tests, comments ;) -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-22254) refactor and improve decommissioning logic
[ https://issues.apache.org/jira/browse/HBASE-22254?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16837641#comment-16837641 ] Sergey Shelukhin commented on HBASE-22254: -- Fixed the admin test. > refactor and improve decommissioning logic > -- > > Key: HBASE-22254 > URL: https://issues.apache.org/jira/browse/HBASE-22254 > Project: HBase > Issue Type: Improvement >Affects Versions: 3.0.0 >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin >Priority: Major > Attachments: HBASE-22254.01.patch, HBASE-22254.02.patch, > HBASE-22254.03.patch, HBASE-22254.patch > > > Making some changes needed to support better decommissioning on large > clusters and with container mode; to test those and add clarify I moved parts > of decommissioning logic from HMaster, Draining tracker, and ServerManager > into a separate class. > Features added/improvements: > 1) More resilient off-loading; right now off-loading fails for a subset of > regions in case of a single region failure; is never done on master restart, > etc. > 2) Option to kill RS after off-loading (good for container mode HBase, e.g. > on YARN). > 3) Option to specify machine names only to decommission, for the API to be > usable for an external system that doesn't care about HBase server names, or > e.g. multiple RS in containers on the same node. > 4) Option to replace existing decommissioning list instead of adding to it > (the same; to avoid additionally remembering what was previously sent to > HBase). > 5) Tests, comments ;) -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HBASE-22254) refactor and improve decommissioning logic
[ https://issues.apache.org/jira/browse/HBASE-22254?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HBASE-22254: - Attachment: HBASE-22254.03.patch > refactor and improve decommissioning logic > -- > > Key: HBASE-22254 > URL: https://issues.apache.org/jira/browse/HBASE-22254 > Project: HBase > Issue Type: Improvement >Affects Versions: 3.0.0 >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin >Priority: Major > Attachments: HBASE-22254.01.patch, HBASE-22254.02.patch, > HBASE-22254.03.patch, HBASE-22254.patch > > > Making some changes needed to support better decommissioning on large > clusters and with container mode; to test those and add clarify I moved parts > of decommissioning logic from HMaster, Draining tracker, and ServerManager > into a separate class. > Features added/improvements: > 1) More resilient off-loading; right now off-loading fails for a subset of > regions in case of a single region failure; is never done on master restart, > etc. > 2) Option to kill RS after off-loading (good for container mode HBase, e.g. > on YARN). > 3) Option to specify machine names only to decommission, for the API to be > usable for an external system that doesn't care about HBase server names, or > e.g. multiple RS in containers on the same node. > 4) Option to replace existing decommissioning list instead of adding to it > (the same; to avoid additionally remembering what was previously sent to > HBase). > 5) Tests, comments ;) -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-22254) refactor and improve decommissioning logic
[ https://issues.apache.org/jira/browse/HBASE-22254?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16837585#comment-16837585 ] Sergey Shelukhin commented on HBASE-22254: -- Most test failures look spurious; the admin one looks real. Apparently offload cannot be tested (and there's no existing test for it) because there's only one server. > refactor and improve decommissioning logic > -- > > Key: HBASE-22254 > URL: https://issues.apache.org/jira/browse/HBASE-22254 > Project: HBase > Issue Type: Improvement >Affects Versions: 3.0.0 >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin >Priority: Major > Attachments: HBASE-22254.01.patch, HBASE-22254.02.patch, > HBASE-22254.patch > > > Making some changes needed to support better decommissioning on large > clusters and with container mode; to test those and add clarity I moved parts > of decommissioning logic from HMaster, Draining tracker, and ServerManager > into a separate class. > Features added/improvements: > 1) More resilient off-loading; right now off-loading fails for a subset of > regions in case of a single region failure; is never done on master restart, > etc. > 2) Option to kill RS after off-loading (good for container mode HBase, e.g. > on YARN). > 3) Option to specify machine names only to decommission, for the API to be > usable for an external system that doesn't care about HBase server names, or > e.g. multiple RS in containers on the same node. > 4) Option to replace existing decommissioning list instead of adding to it > (the same; to avoid additionally remembering what was previously sent to > HBase). > 5) Tests, comments ;) -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-22254) refactor and improve decommissioning logic
[ https://issues.apache.org/jira/browse/HBASE-22254?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16837491#comment-16837491 ] Sergey Shelukhin commented on HBASE-22254: -- The Ruby warnings are in code that I basically copy-pasted, so I'm going to ignore most of them. Looking at the rest... > refactor and improve decommissioning logic > -- > > Key: HBASE-22254 > URL: https://issues.apache.org/jira/browse/HBASE-22254 > Project: HBase > Issue Type: Improvement >Affects Versions: 3.0.0 >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin >Priority: Major > Attachments: HBASE-22254.01.patch, HBASE-22254.02.patch, > HBASE-22254.patch > > > Making some changes needed to support better decommissioning on large > clusters and with container mode; to test those and add clarity I moved parts > of decommissioning logic from HMaster, Draining tracker, and ServerManager > into a separate class. > Features added/improvements: > 1) More resilient off-loading; right now off-loading fails for a subset of > regions in case of a single region failure; is never done on master restart, > etc. > 2) Option to kill RS after off-loading (good for container mode HBase, e.g. > on YARN). > 3) Option to specify machine names only to decommission, for the API to be > usable for an external system that doesn't care about HBase server names, or > e.g. multiple RS in containers on the same node. > 4) Option to replace existing decommissioning list instead of adding to it > (the same; to avoid additionally remembering what was previously sent to > HBase). > 5) Tests, comments ;) -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-22254) refactor and improve decommissioning logic
[ https://issues.apache.org/jira/browse/HBASE-22254?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16836817#comment-16836817 ] Sergey Shelukhin commented on HBASE-22254: -- Btw, these APIs were added in HBASE-17370 but the book wasn't updated, it still relies on znode creation there... I wonder if the book should be updated w/this patch > refactor and improve decommissioning logic > -- > > Key: HBASE-22254 > URL: https://issues.apache.org/jira/browse/HBASE-22254 > Project: HBase > Issue Type: Improvement >Affects Versions: 3.0.0 >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin >Priority: Major > Attachments: HBASE-22254.01.patch, HBASE-22254.02.patch, > HBASE-22254.patch > > > Making some changes needed to support better decommissioning on large > clusters and with container mode; to test those and add clarify I moved parts > of decommissioning logic from HMaster, Draining tracker, and ServerManager > into a separate class. > Features added/improvements: > 1) More resilient off-loading; right now off-loading fails for a subset of > regions in case of a single region failure; is never done on master restart, > etc. > 2) Option to kill RS after off-loading (good for container mode HBase, e.g. > on YARN). > 3) Option to specify machine names only to decommission, for the API to be > usable for an external system that doesn't care about HBase server names, or > e.g. multiple RS in containers on the same node. > 4) Option to replace existing decommissioning list instead of adding to it > (the same; to avoid additionally remembering what was previously sent to > HBase). > 5) Tests, comments ;) -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HBASE-22254) refactor and improve decommissioning logic
[ https://issues.apache.org/jira/browse/HBASE-22254?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HBASE-22254: - Status: Patch Available (was: Open) Fixed the test, also made the client changes to make new API features usable not just directly via a request. > refactor and improve decommissioning logic > -- > > Key: HBASE-22254 > URL: https://issues.apache.org/jira/browse/HBASE-22254 > Project: HBase > Issue Type: Improvement >Affects Versions: 3.0.0 >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin >Priority: Major > Attachments: HBASE-22254.01.patch, HBASE-22254.02.patch, > HBASE-22254.patch > > > Making some changes needed to support better decommissioning on large > clusters and with container mode; to test those and add clarify I moved parts > of decommissioning logic from HMaster, Draining tracker, and ServerManager > into a separate class. > Features added/improvements: > 1) More resilient off-loading; right now off-loading fails for a subset of > regions in case of a single region failure; is never done on master restart, > etc. > 2) Option to kill RS after off-loading (good for container mode HBase, e.g. > on YARN). > 3) Option to specify machine names only to decommission, for the API to be > usable for an external system that doesn't care about HBase server names, or > e.g. multiple RS in containers on the same node. > 4) Option to replace existing decommissioning list instead of adding to it > (the same; to avoid additionally remembering what was previously sent to > HBase). > 5) Tests, comments ;) -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HBASE-22254) refactor and improve decommissioning logic
[ https://issues.apache.org/jira/browse/HBASE-22254?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HBASE-22254: - Attachment: HBASE-22254.02.patch > refactor and improve decommissioning logic > -- > > Key: HBASE-22254 > URL: https://issues.apache.org/jira/browse/HBASE-22254 > Project: HBase > Issue Type: Improvement >Affects Versions: 3.0.0 >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin >Priority: Major > Attachments: HBASE-22254.01.patch, HBASE-22254.02.patch, > HBASE-22254.patch > > > Making some changes needed to support better decommissioning on large > clusters and with container mode; to test those and add clarify I moved parts > of decommissioning logic from HMaster, Draining tracker, and ServerManager > into a separate class. > Features added/improvements: > 1) More resilient off-loading; right now off-loading fails for a subset of > regions in case of a single region failure; is never done on master restart, > etc. > 2) Option to kill RS after off-loading (good for container mode HBase, e.g. > on YARN). > 3) Option to specify machine names only to decommission, for the API to be > usable for an external system that doesn't care about HBase server names, or > e.g. multiple RS in containers on the same node. > 4) Option to replace existing decommissioning list instead of adding to it > (the same; to avoid additionally remembering what was previously sent to > HBase). > 5) Tests, comments ;) -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-17370) Fix or provide shell scripts to drain and decommission region server
[ https://issues.apache.org/jira/browse/HBASE-17370?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16836783#comment-16836783 ] Sergey Shelukhin commented on HBASE-17370: -- Should the book be updated for this? Looks like it still suggests creating znodes to decommission region servers. > Fix or provide shell scripts to drain and decommission region server > > > Key: HBASE-17370 > URL: https://issues.apache.org/jira/browse/HBASE-17370 > Project: HBase > Issue Type: Sub-task >Reporter: Jerry He >Assignee: Nihal Jain >Priority: Major > Labels: operability > Fix For: 3.0.0, 2.2.0 > > Attachments: HBASE-17370.branch-2.001.patch, > HBASE-17370.master.001.patch, HBASE-17370.master.002.patch > > > 1. Update the existing shell scripts to use the new drain-related API. > 2. Or provide new shell scripts. > 3. Provide a 'decommission' shell tool that puts the server in drain mode and > offloads the server. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-22376) master can fail to start w/NPE if lastflushedseqids file is empty
[ https://issues.apache.org/jira/browse/HBASE-22376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16835951#comment-16835951 ] Sergey Shelukhin commented on HBASE-22376: -- [~psomogyi] [~Apache9] can you take a look? It's a tiny fix. The caller of this code already catches and ignores IOException, but for an empty file, PB returns null. > master can fail to start w/NPE if lastflushedseqids file is empty > - > > Key: HBASE-22376 > URL: https://issues.apache.org/jira/browse/HBASE-22376 > Project: HBase > Issue Type: Bug >Affects Versions: 3.0.0, 2.2.0 >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin >Priority: Major > Fix For: 3.0.0, 2.2.0 > > Attachments: HBASE-22376.patch > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
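The failure mode in the comment above, and the shape of the fix, can be sketched as follows. Java protobuf's parseDelimitedFrom does return null at end of stream rather than throwing, so an empty file yields null and the caller NPEs; this Python sketch uses a hypothetical stand-in parser with the same behavior, not HBase's actual code.

```python
def parse_seq_ids(data):
    """Stand-in for the protobuf parse; like Java protobuf's
    parseDelimitedFrom, it yields None when the input is empty."""
    if not data:
        return None
    return dict(pair.split("=") for pair in data.decode().split(","))

def load_last_flushed_seq_ids(data):
    parsed = parse_seq_ids(data)
    if parsed is None:
        # The fix: treat an empty file like a missing one instead of
        # dereferencing the null/None result and crashing startup.
        return {}
    return parsed

print(load_last_flushed_seq_ids(b""))           # {}
print(load_last_flushed_seq_ids(b"r1=5,r2=9"))  # {'r1': '5', 'r2': '9'}
```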
[jira] [Commented] (HBASE-22385) Consider "programmatic" HFiles
[ https://issues.apache.org/jira/browse/HBASE-22385?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16835830#comment-16835830 ] Sergey Shelukhin commented on HBASE-22385: -- Could the refactoring be used to allow multi-level splits by splitting references (preferably via a new modified reference, not multi-level references)? > Consider "programmatic" HFiles > -- > > Key: HBASE-22385 > URL: https://issues.apache.org/jira/browse/HBASE-22385 > Project: HBase > Issue Type: Brainstorming >Reporter: Lars Hofhansl >Priority: Major > > For various use cases (among others there is mass deletes) it would be great > if HBase had a mechanism for programmatic HFiles. I.e. HFiles (Reader) that > produce KeyValues just like any other old HFile, but the key values produced > are generated or produced by some other means rather than being physically > read from some storage medium. > In fact this could be a generalization for the various HFiles we have: > (Normal) HFiles, HFileLinks, HalfStoreFiles, etc. > A simple way could be to allow for storing a classname into the HFile. Upon > reading the HFile HBase would instantiate an instance of that class and that > instance is responsible for all further interaction with that HFile. For > normal HFiles it would just be the normal HFileReaderVx. For that we'd also > need to StoreFile.Reader into an interface (or a more basic base class) that > can be properly implemented. > (Remember this is Brainstorming :) ) -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HBASE-22360) Abort timer doesn't set when abort is called during graceful shutdown process
[ https://issues.apache.org/jira/browse/HBASE-22360?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HBASE-22360: - Resolution: Fixed Fix Version/s: 2.2.0 3.0.0 Status: Resolved (was: Patch Available) Committed to master and branch-2. > Abort timer doesn't set when abort is called during graceful shutdown process > - > > Key: HBASE-22360 > URL: https://issues.apache.org/jira/browse/HBASE-22360 > Project: HBase > Issue Type: Bug > Components: regionserver >Affects Versions: 3.0.0, 2.2.0 >Reporter: Bahram Chehrazy >Assignee: Bahram Chehrazy >Priority: Major > Fix For: 3.0.0, 2.2.0 > > Attachments: Set-the-abortMonitor-timer-in-the-abort-function-01.patch > > > The abort timer only gets set when the server is aborted. But if the server is > being gracefully stopped and something goes wrong causing an abort, the timer > may not get set, and the shutdown process could take a very long time or > completely stall the server. > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
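The bug above comes down to the abort watchdog being armed on only one code path. A minimal sketch of the idempotent-arming idea — names are illustrative and this is not the literal Set-the-abortMonitor-timer patch, which wires into the region server's abort path:

```java
import java.util.Timer;
import java.util.TimerTask;
import java.util.concurrent.atomic.AtomicBoolean;

public class AbortWatchdog {
    private final AtomicBoolean armed = new AtomicBoolean(false);
    private final Timer timer = new Timer("abort-watchdog", true);

    // Arm the kill timer exactly once, no matter which path reaches abort():
    // a plain abort, or an abort raised in the middle of a graceful shutdown.
    // Returns true only on the call that actually armed it.
    boolean startAbortTimerIfNeeded(long timeoutMillis) {
        if (!armed.compareAndSet(false, true)) {
            return false; // already armed by an earlier path
        }
        timer.schedule(new TimerTask() {
            @Override public void run() {
                // stand-in for the forced halt (Runtime.getRuntime().halt)
            }
        }, timeoutMillis);
        return true;
    }
}
```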
[jira] [Updated] (HBASE-22376) master can fail to start w/NPE if lastflushedseqids file is empty
[ https://issues.apache.org/jira/browse/HBASE-22376?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HBASE-22376: - Attachment: HBASE-22376.patch > master can fail to start w/NPE if lastflushedseqids file is empty > - > > Key: HBASE-22376 > URL: https://issues.apache.org/jira/browse/HBASE-22376 > Project: HBase > Issue Type: Bug >Affects Versions: 3.0.0, 2.2.0 >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin >Priority: Major > Fix For: 3.0.0, 2.2.0 > > Attachments: HBASE-22376.patch > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HBASE-22376) master can fail to start w/NPE if lastflushedseqids file is empty
[ https://issues.apache.org/jira/browse/HBASE-22376?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HBASE-22376: - Status: Patch Available (was: Open) > master can fail to start w/NPE if lastflushedseqids file is empty > - > > Key: HBASE-22376 > URL: https://issues.apache.org/jira/browse/HBASE-22376 > Project: HBase > Issue Type: Bug >Affects Versions: 3.0.0, 2.2.0 >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin >Priority: Major > Fix For: 3.0.0, 2.2.0 > > Attachments: HBASE-22376.patch > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HBASE-22376) master can fail to start w/NPE if lastflushedseqids file is empty
[ https://issues.apache.org/jira/browse/HBASE-22376?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HBASE-22376: - Affects Version/s: 2.2.0 3.0.0 > master can fail to start w/NPE if lastflushedseqids file is empty > - > > Key: HBASE-22376 > URL: https://issues.apache.org/jira/browse/HBASE-22376 > Project: HBase > Issue Type: Bug >Affects Versions: 3.0.0, 2.2.0 >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin >Priority: Major > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HBASE-22376) master can fail to start w/NPE if lastflushedseqids file is empty
[ https://issues.apache.org/jira/browse/HBASE-22376?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HBASE-22376: - Fix Version/s: 2.2.0 3.0.0 > master can fail to start w/NPE if lastflushedseqids file is empty > - > > Key: HBASE-22376 > URL: https://issues.apache.org/jira/browse/HBASE-22376 > Project: HBase > Issue Type: Bug >Affects Versions: 3.0.0, 2.2.0 >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin >Priority: Major > Fix For: 3.0.0, 2.2.0 > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HBASE-22376) master can fail to start w/NPE if lastflushedseqids file is empty
Sergey Shelukhin created HBASE-22376: Summary: master can fail to start w/NPE if lastflushedseqids file is empty Key: HBASE-22376 URL: https://issues.apache.org/jira/browse/HBASE-22376 Project: HBase Issue Type: Bug Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-22360) Abort timer doesn't set when abort is called during graceful shutdown process
[ https://issues.apache.org/jira/browse/HBASE-22360?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16834289#comment-16834289 ] Sergey Shelukhin commented on HBASE-22360: -- +1 > Abort timer doesn't set when abort is called during graceful shutdown process > - > > Key: HBASE-22360 > URL: https://issues.apache.org/jira/browse/HBASE-22360 > Project: HBase > Issue Type: Bug > Components: regionserver >Affects Versions: 3.0.0, 2.2.0 >Reporter: Bahram Chehrazy >Assignee: Bahram Chehrazy >Priority: Major > Attachments: Set-the-abortMonitor-timer-in-the-abort-function-01.patch > > > The abort timer only gets set when the server is aborted. But if the server is > being gracefully stopped and something goes wrong causing an abort, the timer > may not get set, and the shutdown process could take a very long time or > completely stall the server. > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HBASE-22346) scanner priorities/deadline units are invalid for non-huge scanners
[ https://issues.apache.org/jira/browse/HBASE-22346?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HBASE-22346: - Attachment: HBASE-22346.01.patch > scanner priorities/deadline units are invalid for non-huge scanners > --- > > Key: HBASE-22346 > URL: https://issues.apache.org/jira/browse/HBASE-22346 > Project: HBase > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin >Priority: Major > Attachments: HBASE-22346.01.patch, HBASE-22346.patch > > > I was looking at using the priority (deadline) queue for scanner requests; > what I see is that AnnotationReadingPriorityFunction, the only impl of the > deadline function available, implements getDeadline as sqrt of the number of > next() calls, from HBASE-10993. > However, CallPriorityComparator.compare, its only caller, adds that > "deadline" value to the callA.getReceiveTime() in milliseconds... > That results in some sort of a meaningless value that I assume only makes > sense "by coincidence" for telling apart broad and specific classes of > scanners... in practice the next() call count must be in the thousands before > it outweighs small differences in ReceiveTime. > When there's contention from many scanners, e.g. small scanners for meta, or > just users creating tons of scanners to the point where requests queue up, > the actual deadline is not accounted for and the priority function itself is > meaningless... In fact, as queueing increases, it becomes worse because > receive-time differences grow. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
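The unit mismatch described above is easy to see numerically: the sqrt term is unitless while receive time is in milliseconds, so the deadline only matters once next() counts are enormous. A sketch of the arithmetic — method names are illustrative, not the actual AnnotationReadingPriorityFunction/CallPriorityComparator code:

```java
public class DeadlineUnits {
    // Sketch of the getDeadline-style value: sqrt of the number of next()
    // calls, which is unitless (from HBASE-10993).
    static double scannerDeadline(long nextCalls) {
        return Math.sqrt(nextCalls);
    }

    // Sketch of the comparator's effective sort key: receive time in millis
    // plus the unitless sqrt term. For the sqrt term to outweigh a 100 ms
    // difference in receive time, nextCalls must exceed 10_000.
    static double effectivePriority(long receiveTimeMillis, long nextCalls) {
        return receiveTimeMillis + scannerDeadline(nextCalls);
    }
}
```

With a 100 ms receive-time gap, a scanner with 100 next() calls (sqrt = 10) still sorts ahead of a brand-new one that arrived later, so the "deadline" contributes almost nothing.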
[jira] [Updated] (HBASE-22354) master never sets abortRequested, and thus abort timeout doesn't work for it
[ https://issues.apache.org/jira/browse/HBASE-22354?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HBASE-22354: - Resolution: Fixed Fix Version/s: 3.0.0 Status: Resolved (was: Patch Available) Committed to master and branch-2. Thanks for the review! > master never sets abortRequested, and thus abort timeout doesn't work for it > > > Key: HBASE-22354 > URL: https://issues.apache.org/jira/browse/HBASE-22354 > Project: HBase > Issue Type: Bug >Affects Versions: 2.2.0 >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin >Priority: Major > Fix For: 3.0.0, 2.2.0 > > Attachments: HBASE-22354.patch > > > Discovered w/HBASE-22353 netty deadlock. > The property is not set, so the abort timer is not started. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (HBASE-22353) update non-shaded netty for Hadoop 2 to a more recent version of 3.6
[ https://issues.apache.org/jira/browse/HBASE-22353?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin reassigned HBASE-22353: Assignee: Sergey Shelukhin > update non-shaded netty for Hadoop 2 to a more recent version of 3.6 > > > Key: HBASE-22353 > URL: https://issues.apache.org/jira/browse/HBASE-22353 > Project: HBase > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin >Priority: Major > Attachments: HBASE-22353.patch > > > When using Netty socket for ZK, we got this deadlock. > Appears to be https://github.com/netty/netty/issues/1181 (or one of similar > tickets before that). > We are using Netty 3.6.2 for Hadoop 2, seems like it should be safe to > upgrade to 3.6.10, assuming it's purely a bugfix release for 3.6.2 and they > are compatible? > {noformat} > Java stack information for the threads listed above: > === > "main-SendThread(...)": >at org.jboss.netty.handler.ssl.SslHandler.wrap(SslHandler.java:958) >- waiting to lock <0xc91d8848> (a java.lang.Object) >- locked <0xcdcc7740> (a java.util.LinkedList) >at > org.jboss.netty.handler.ssl.SslHandler.handleDownstream(SslHandler.java:627) >at > org.jboss.netty.channel.DefaultChannelPipeline.sendDownstream(DefaultChannelPipeline.java:587) >at > org.jboss.netty.channel.DefaultChannelPipeline.sendDownstream(DefaultChannelPipeline.java:578) >at org.jboss.netty.channel.Channels.write(Channels.java:704) >at org.jboss.netty.channel.Channels.write(Channels.java:671) >at > org.jboss.netty.channel.AbstractChannel.write(AbstractChannel.java:248) >at > org.apache.zookeeper.ClientCnxnSocketNetty.sendPkt(ClientCnxnSocketNetty.java:268) >at > org.apache.zookeeper.ClientCnxnSocketNetty.doWrite(ClientCnxnSocketNetty.java:291) >at > org.apache.zookeeper.ClientCnxnSocketNetty.doTransport(ClientCnxnSocketNetty.java:249) >at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1146) > "New I/O worker #3": >at > 
org.jboss.netty.handler.ssl.SslHandler.channelClosed(SslHandler.java:1554) >- waiting to lock <0xcdcc7740> (a java.util.LinkedList) >at > org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:88) >at > org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:560) >at > org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:555) >at > org.jboss.netty.channel.Channels.fireChannelClosed(Channels.java:468) >at > org.jboss.netty.channel.socket.nio.AbstractNioWorker.close(AbstractNioWorker.java:351) >at > org.jboss.netty.channel.socket.nio.AbstractNioWorker.write0(AbstractNioWorker.java:254) >- locked <0xc91d8770> (a java.lang.Object) >at > org.jboss.netty.channel.socket.nio.AbstractNioWorker.writeFromUserCode(AbstractNioWorker.java:145) >at > org.jboss.netty.channel.socket.nio.NioClientSocketPipelineSink.eventSunk(NioClientSocketPipelineSink.java:83) >at > org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendDownstream(DefaultChannelPipeline.java:775) >at org.jboss.netty.channel.Channels.write(Channels.java:725) >at org.jboss.netty.channel.Channels.write(Channels.java:686) >at > org.jboss.netty.handler.ssl.SslHandler.wrapNonAppData(SslHandler.java:1140) >at org.jboss.netty.handler.ssl.SslHandler.unwrap(SslHandler.java:1229) >- locked <0xc91d8848> (a java.lang.Object) >at org.jboss.netty.handler.ssl.SslHandler.decode(SslHandler.java:910) >at > org.jboss.netty.handler.codec.frame.FrameDecoder.callDecode(FrameDecoder.java:425) >at > org.jboss.netty.handler.codec.frame.FrameDecoder.messageReceived(FrameDecoder.java:303) >at > org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70) >at > org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:560) >at > org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:555) >at > 
org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:268) >at > org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:255) >at org.jboss.netty.channel.socket.nio.NioWorker.read(NioWorker.java:88) >at > org.jboss.netty.channel.socket.nio.AbstractNioWorker.process(AbstractNioWorker.java:107) >at >
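The two stacks above show a classic lock-order inversion: the send thread holds the pending-writes list (0xcdcc7740) and waits for the handshake lock (0xc91d8848), while the I/O worker holds the handshake lock and waits for the list. The general remedy is to acquire both locks in one fixed order on every path; this is a sketch of that idea, not the literal netty fix:

```java
public class LockOrdering {
    // Stand-ins for SslHandler's two monitors in the trace above.
    private final Object handshakeLock = new Object();  // 0xc91d8848
    private final Object pendingWrites = new Object();  // 0xcdcc7740

    int sharedCounter = 0;

    // Deadlock-free version: both the write path and the close path take
    // handshakeLock first, then pendingWrites, so no cycle can form.
    void writePath() {
        synchronized (handshakeLock) {
            synchronized (pendingWrites) { sharedCounter++; }
        }
    }

    void closePath() {
        synchronized (handshakeLock) {
            synchronized (pendingWrites) { sharedCounter++; }
        }
    }
}
```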
[jira] [Updated] (HBASE-22353) update non-shaded netty for Hadoop 2 to a more recent version of 3.6
[ https://issues.apache.org/jira/browse/HBASE-22353?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HBASE-22353: - Attachment: HBASE-22353.patch > update non-shaded netty for Hadoop 2 to a more recent version of 3.6 > > > Key: HBASE-22353 > URL: https://issues.apache.org/jira/browse/HBASE-22353 > Project: HBase > Issue Type: Bug >Reporter: Sergey Shelukhin >Priority: Major > Attachments: HBASE-22353.patch > > > When using Netty socket for ZK, we got this deadlock. > Appears to be https://github.com/netty/netty/issues/1181 (or one of similar > tickets before that). > We are using Netty 3.6.2 for Hadoop 2, seems like it should be safe to > upgrade to 3.6.10, assuming it's purely a bugfix release for 3.6.2 and they > are compatible? > {noformat} > Java stack information for the threads listed above: > === > "main-SendThread(...)": >at org.jboss.netty.handler.ssl.SslHandler.wrap(SslHandler.java:958) >- waiting to lock <0xc91d8848> (a java.lang.Object) >- locked <0xcdcc7740> (a java.util.LinkedList) >at > org.jboss.netty.handler.ssl.SslHandler.handleDownstream(SslHandler.java:627) >at > org.jboss.netty.channel.DefaultChannelPipeline.sendDownstream(DefaultChannelPipeline.java:587) >at > org.jboss.netty.channel.DefaultChannelPipeline.sendDownstream(DefaultChannelPipeline.java:578) >at org.jboss.netty.channel.Channels.write(Channels.java:704) >at org.jboss.netty.channel.Channels.write(Channels.java:671) >at > org.jboss.netty.channel.AbstractChannel.write(AbstractChannel.java:248) >at > org.apache.zookeeper.ClientCnxnSocketNetty.sendPkt(ClientCnxnSocketNetty.java:268) >at > org.apache.zookeeper.ClientCnxnSocketNetty.doWrite(ClientCnxnSocketNetty.java:291) >at > org.apache.zookeeper.ClientCnxnSocketNetty.doTransport(ClientCnxnSocketNetty.java:249) >at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1146) > "New I/O worker #3": >at > org.jboss.netty.handler.ssl.SslHandler.channelClosed(SslHandler.java:1554) >- 
waiting to lock <0xcdcc7740> (a java.util.LinkedList) >at > org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:88) >at > org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:560) >at > org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:555) >at > org.jboss.netty.channel.Channels.fireChannelClosed(Channels.java:468) >at > org.jboss.netty.channel.socket.nio.AbstractNioWorker.close(AbstractNioWorker.java:351) >at > org.jboss.netty.channel.socket.nio.AbstractNioWorker.write0(AbstractNioWorker.java:254) >- locked <0xc91d8770> (a java.lang.Object) >at > org.jboss.netty.channel.socket.nio.AbstractNioWorker.writeFromUserCode(AbstractNioWorker.java:145) >at > org.jboss.netty.channel.socket.nio.NioClientSocketPipelineSink.eventSunk(NioClientSocketPipelineSink.java:83) >at > org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendDownstream(DefaultChannelPipeline.java:775) >at org.jboss.netty.channel.Channels.write(Channels.java:725) >at org.jboss.netty.channel.Channels.write(Channels.java:686) >at > org.jboss.netty.handler.ssl.SslHandler.wrapNonAppData(SslHandler.java:1140) >at org.jboss.netty.handler.ssl.SslHandler.unwrap(SslHandler.java:1229) >- locked <0xc91d8848> (a java.lang.Object) >at org.jboss.netty.handler.ssl.SslHandler.decode(SslHandler.java:910) >at > org.jboss.netty.handler.codec.frame.FrameDecoder.callDecode(FrameDecoder.java:425) >at > org.jboss.netty.handler.codec.frame.FrameDecoder.messageReceived(FrameDecoder.java:303) >at > org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70) >at > org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:560) >at > org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:555) >at > org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:268) >at > 
org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:255) >at org.jboss.netty.channel.socket.nio.NioWorker.read(NioWorker.java:88) >at > org.jboss.netty.channel.socket.nio.AbstractNioWorker.process(AbstractNioWorker.java:107) >at > org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:312) >at >
[jira] [Updated] (HBASE-22353) update non-shaded netty for Hadoop 2 to a more recent version of 3.6
[ https://issues.apache.org/jira/browse/HBASE-22353?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HBASE-22353: - Status: Patch Available (was: Open) > update non-shaded netty for Hadoop 2 to a more recent version of 3.6 > > > Key: HBASE-22353 > URL: https://issues.apache.org/jira/browse/HBASE-22353 > Project: HBase > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin >Priority: Major > Attachments: HBASE-22353.patch > > > When using Netty socket for ZK, we got this deadlock. > Appears to be https://github.com/netty/netty/issues/1181 (or one of similar > tickets before that). > We are using Netty 3.6.2 for Hadoop 2, seems like it should be safe to > upgrade to 3.6.10, assuming it's purely a bugfix release for 3.6.2 and they > are compatible? > {noformat} > Java stack information for the threads listed above: > === > "main-SendThread(...)": >at org.jboss.netty.handler.ssl.SslHandler.wrap(SslHandler.java:958) >- waiting to lock <0xc91d8848> (a java.lang.Object) >- locked <0xcdcc7740> (a java.util.LinkedList) >at > org.jboss.netty.handler.ssl.SslHandler.handleDownstream(SslHandler.java:627) >at > org.jboss.netty.channel.DefaultChannelPipeline.sendDownstream(DefaultChannelPipeline.java:587) >at > org.jboss.netty.channel.DefaultChannelPipeline.sendDownstream(DefaultChannelPipeline.java:578) >at org.jboss.netty.channel.Channels.write(Channels.java:704) >at org.jboss.netty.channel.Channels.write(Channels.java:671) >at > org.jboss.netty.channel.AbstractChannel.write(AbstractChannel.java:248) >at > org.apache.zookeeper.ClientCnxnSocketNetty.sendPkt(ClientCnxnSocketNetty.java:268) >at > org.apache.zookeeper.ClientCnxnSocketNetty.doWrite(ClientCnxnSocketNetty.java:291) >at > org.apache.zookeeper.ClientCnxnSocketNetty.doTransport(ClientCnxnSocketNetty.java:249) >at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1146) > "New I/O worker #3": >at > 
org.jboss.netty.handler.ssl.SslHandler.channelClosed(SslHandler.java:1554) >- waiting to lock <0xcdcc7740> (a java.util.LinkedList) >at > org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:88) >at > org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:560) >at > org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:555) >at > org.jboss.netty.channel.Channels.fireChannelClosed(Channels.java:468) >at > org.jboss.netty.channel.socket.nio.AbstractNioWorker.close(AbstractNioWorker.java:351) >at > org.jboss.netty.channel.socket.nio.AbstractNioWorker.write0(AbstractNioWorker.java:254) >- locked <0xc91d8770> (a java.lang.Object) >at > org.jboss.netty.channel.socket.nio.AbstractNioWorker.writeFromUserCode(AbstractNioWorker.java:145) >at > org.jboss.netty.channel.socket.nio.NioClientSocketPipelineSink.eventSunk(NioClientSocketPipelineSink.java:83) >at > org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendDownstream(DefaultChannelPipeline.java:775) >at org.jboss.netty.channel.Channels.write(Channels.java:725) >at org.jboss.netty.channel.Channels.write(Channels.java:686) >at > org.jboss.netty.handler.ssl.SslHandler.wrapNonAppData(SslHandler.java:1140) >at org.jboss.netty.handler.ssl.SslHandler.unwrap(SslHandler.java:1229) >- locked <0xc91d8848> (a java.lang.Object) >at org.jboss.netty.handler.ssl.SslHandler.decode(SslHandler.java:910) >at > org.jboss.netty.handler.codec.frame.FrameDecoder.callDecode(FrameDecoder.java:425) >at > org.jboss.netty.handler.codec.frame.FrameDecoder.messageReceived(FrameDecoder.java:303) >at > org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70) >at > org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:560) >at > org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:555) >at > 
org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:268) >at > org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:255) >at org.jboss.netty.channel.socket.nio.NioWorker.read(NioWorker.java:88) >at > org.jboss.netty.channel.socket.nio.AbstractNioWorker.process(AbstractNioWorker.java:107) >at >
[jira] [Updated] (HBASE-22348) allow one to actually disable replication svc
[ https://issues.apache.org/jira/browse/HBASE-22348?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HBASE-22348: - Fix Version/s: 2.2.0 > allow one to actually disable replication svc > - > > Key: HBASE-22348 > URL: https://issues.apache.org/jira/browse/HBASE-22348 > Project: HBase > Issue Type: Bug >Affects Versions: 2.2.0 >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin >Priority: Major > Fix For: 2.2.0 > > Attachments: HBASE-22348.patch > > > Minor, but it does create extra ZK traffic for no reason and there's no way > to disable that it appears. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HBASE-22354) master never sets abortRequested, and thus abort timeout doesn't work for it
[ https://issues.apache.org/jira/browse/HBASE-22354?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HBASE-22354: - Fix Version/s: 2.2.0 > master never sets abortRequested, and thus abort timeout doesn't work for it > > > Key: HBASE-22354 > URL: https://issues.apache.org/jira/browse/HBASE-22354 > Project: HBase > Issue Type: Bug >Affects Versions: 2.2.0 >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin >Priority: Major > Fix For: 2.2.0 > > Attachments: HBASE-22354.patch > > > Discovered w/HBASE-22353 netty deadlock. > The property is not set, so the abort timer is not started. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HBASE-22348) allow one to actually disable replication svc
[ https://issues.apache.org/jira/browse/HBASE-22348?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HBASE-22348: - Affects Version/s: 2.2.0 > allow one to actually disable replication svc > - > > Key: HBASE-22348 > URL: https://issues.apache.org/jira/browse/HBASE-22348 > Project: HBase > Issue Type: Bug >Affects Versions: 2.2.0 >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin >Priority: Major > Attachments: HBASE-22348.patch > > > Minor, but it does create extra ZK traffic for no reason and there's no way > to disable that it appears. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HBASE-22354) master never sets abortRequested, and thus abort timeout doesn't work for it
[ https://issues.apache.org/jira/browse/HBASE-22354?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HBASE-22354: - Affects Version/s: 2.2.0 > master never sets abortRequested, and thus abort timeout doesn't work for it > > > Key: HBASE-22354 > URL: https://issues.apache.org/jira/browse/HBASE-22354 > Project: HBase > Issue Type: Bug >Affects Versions: 2.2.0 >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin >Priority: Major > Attachments: HBASE-22354.patch > > > Discovered w/HBASE-22353 netty deadlock. > The property is not set, so the abort timer is not started. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HBASE-22354) master never sets abortRequested, and thus abort timeout doesn't work for it
Sergey Shelukhin created HBASE-22354: Summary: master never sets abortRequested, and thus abort timeout doesn't work for it Key: HBASE-22354 URL: https://issues.apache.org/jira/browse/HBASE-22354 Project: HBase Issue Type: Bug Reporter: Sergey Shelukhin Discovered w/HBASE-22353 netty deadlock. The property is not set, so the abort timer is not started. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (HBASE-22354) master never sets abortRequested, and thus abort timeout doesn't work for it
[ https://issues.apache.org/jira/browse/HBASE-22354?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin reassigned HBASE-22354: Assignee: Sergey Shelukhin > master never sets abortRequested, and thus abort timeout doesn't work for it > > > Key: HBASE-22354 > URL: https://issues.apache.org/jira/browse/HBASE-22354 > Project: HBase > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin >Priority: Major > > Discovered w/HBASE-22353 netty deadlock. > The property is not set, so the abort timer is not started. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HBASE-22354) master never sets abortRequested, and thus abort timeout doesn't work for it
[ https://issues.apache.org/jira/browse/HBASE-22354?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HBASE-22354: - Attachment: HBASE-22354.patch > master never sets abortRequested, and thus abort timeout doesn't work for it > > > Key: HBASE-22354 > URL: https://issues.apache.org/jira/browse/HBASE-22354 > Project: HBase > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin >Priority: Major > Attachments: HBASE-22354.patch > > > Discovered w/HBASE-22353 netty deadlock. > The property is not set, so the abort timer is not started. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HBASE-22354) master never sets abortRequested, and thus abort timeout doesn't work for it
[ https://issues.apache.org/jira/browse/HBASE-22354?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HBASE-22354: - Status: Patch Available (was: Open) Tiny patch... > master never sets abortRequested, and thus abort timeout doesn't work for it > > > Key: HBASE-22354 > URL: https://issues.apache.org/jira/browse/HBASE-22354 > Project: HBase > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin >Priority: Major > Attachments: HBASE-22354.patch > > > Discovered w/HBASE-22353 netty deadlock. > The property is not set, so the abort timer is not started. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
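The tiny patch mentioned above boils down to ordering: the watchdog keys off the abortRequested flag, so abort() must set the flag before the timer check runs. A minimal sketch — field and method names are illustrative, not HMaster's actual members:

```java
public class MasterAbort {
    volatile boolean abortRequested = false;
    boolean abortTimerStarted = false;

    // Sketch of the fix: set abortRequested first, then kick the watchdog.
    // The bug was that the master's abort path never set the flag, so the
    // timer below never started and a hung abort could stall forever.
    void abort(String reason) {
        abortRequested = true; // the assignment the master was missing
        startAbortTimerIfRequested();
        // ... continue with the shutdown sequence ...
    }

    void startAbortTimerIfRequested() {
        if (abortRequested) {
            abortTimerStarted = true; // stand-in for scheduling the abort watchdog
        }
    }
}
```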
[jira] [Created] (HBASE-22353) update non-shaded netty for Hadoop 2 to a more recent version of 3.6
Sergey Shelukhin created HBASE-22353: Summary: update non-shaded netty for Hadoop 2 to a more recent version of 3.6 Key: HBASE-22353 URL: https://issues.apache.org/jira/browse/HBASE-22353 Project: HBase Issue Type: Bug Reporter: Sergey Shelukhin When using Netty socket for ZK, we got this deadlock. Appears to be https://github.com/netty/netty/issues/1181 (or one of similar tickets before that). We are using Netty 3.6.2 for Hadoop 2, seems like it should be safe to upgrade to 3.6.10, assuming it's purely a bugfix release for 3.6.2 and they are compatible? {noformat} Java stack information for the threads listed above: === "main-SendThread(...)": at org.jboss.netty.handler.ssl.SslHandler.wrap(SslHandler.java:958) - waiting to lock <0xc91d8848> (a java.lang.Object) - locked <0xcdcc7740> (a java.util.LinkedList) at org.jboss.netty.handler.ssl.SslHandler.handleDownstream(SslHandler.java:627) at org.jboss.netty.channel.DefaultChannelPipeline.sendDownstream(DefaultChannelPipeline.java:587) at org.jboss.netty.channel.DefaultChannelPipeline.sendDownstream(DefaultChannelPipeline.java:578) at org.jboss.netty.channel.Channels.write(Channels.java:704) at org.jboss.netty.channel.Channels.write(Channels.java:671) at org.jboss.netty.channel.AbstractChannel.write(AbstractChannel.java:248) at org.apache.zookeeper.ClientCnxnSocketNetty.sendPkt(ClientCnxnSocketNetty.java:268) at org.apache.zookeeper.ClientCnxnSocketNetty.doWrite(ClientCnxnSocketNetty.java:291) at org.apache.zookeeper.ClientCnxnSocketNetty.doTransport(ClientCnxnSocketNetty.java:249) at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1146) "New I/O worker #3": at org.jboss.netty.handler.ssl.SslHandler.channelClosed(SslHandler.java:1554) - waiting to lock <0xcdcc7740> (a java.util.LinkedList) at org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:88) at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:560) at 
org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:555) at org.jboss.netty.channel.Channels.fireChannelClosed(Channels.java:468) at org.jboss.netty.channel.socket.nio.AbstractNioWorker.close(AbstractNioWorker.java:351) at org.jboss.netty.channel.socket.nio.AbstractNioWorker.write0(AbstractNioWorker.java:254) - locked <0xc91d8770> (a java.lang.Object) at org.jboss.netty.channel.socket.nio.AbstractNioWorker.writeFromUserCode(AbstractNioWorker.java:145) at org.jboss.netty.channel.socket.nio.NioClientSocketPipelineSink.eventSunk(NioClientSocketPipelineSink.java:83) at org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendDownstream(DefaultChannelPipeline.java:775) at org.jboss.netty.channel.Channels.write(Channels.java:725) at org.jboss.netty.channel.Channels.write(Channels.java:686) at org.jboss.netty.handler.ssl.SslHandler.wrapNonAppData(SslHandler.java:1140) at org.jboss.netty.handler.ssl.SslHandler.unwrap(SslHandler.java:1229) - locked <0xc91d8848> (a java.lang.Object) at org.jboss.netty.handler.ssl.SslHandler.decode(SslHandler.java:910) at org.jboss.netty.handler.codec.frame.FrameDecoder.callDecode(FrameDecoder.java:425) at org.jboss.netty.handler.codec.frame.FrameDecoder.messageReceived(FrameDecoder.java:303) at org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70) at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:560) at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:555) at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:268) at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:255) at org.jboss.netty.channel.socket.nio.NioWorker.read(NioWorker.java:88) at org.jboss.netty.channel.socket.nio.AbstractNioWorker.process(AbstractNioWorker.java:107) at 
org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:312) at org.jboss.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:88) at org.jboss.netty.channel.socket.nio.NioWorker.run(NioWorker.java:178) at org.jboss.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108) at org.jboss.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) at
[jira] [Moved] (HBASE-22352) use a system table as an alternative proc store
[ https://issues.apache.org/jira/browse/HBASE-22352?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin moved HIVE-21676 to HBASE-22352: - Key: HBASE-22352 (was: HIVE-21676) Project: HBase (was: Hive) > use a system table as an alternative proc store > --- > > Key: HBASE-22352 > URL: https://issues.apache.org/jira/browse/HBASE-22352 > Project: HBase > Issue Type: Bug >Reporter: Sergey Shelukhin >Priority: Major > > We keep hitting these issues: > {noformat} > 2019-04-30 23:41:52,164 INFO [master/master:17000:becomeActiveMaster] > procedure2.ProcedureExecutor: Starting 16 core workers (bigger of cpus/4 or > 16) with max (burst) worker count=160 > 2019-04-30 23:41:52,171 INFO [master/master:17000:becomeActiveMaster] > util.FSHDFSUtils: Recover lease on dfs file > .../MasterProcWALs/pv2-0481.log > 2019-04-30 23:41:52,176 INFO [master/master:17000:becomeActiveMaster] > util.FSHDFSUtils: Recovered lease, attempt=0 on > file=.../MasterProcWALs/pv2-0481.log after 5ms > 2019-04-30 23:41:52,288 INFO [master/master:17000:becomeActiveMaster] > util.FSHDFSUtils: Recover lease on dfs file > .../MasterProcWALs/pv2-0482.log > 2019-04-30 23:41:52,289 INFO [master/master:17000:becomeActiveMaster] > util.FSHDFSUtils: Recovered lease, attempt=0 on > file=.../MasterProcWALs/pv2-0482.log after 1ms > 2019-04-30 23:41:52,373 INFO [master/master:17000:becomeActiveMaster] > wal.WALProcedureStore: Rolled new Procedure Store WAL, id=483 > 2019-04-30 23:41:52,375 INFO [master/master:17000:becomeActiveMaster] > procedure2.ProcedureExecutor: Recovered WALProcedureStore lease in 206msec > 2019-04-30 23:41:52,782 INFO [master/master:17000:becomeActiveMaster] > wal.ProcedureWALFormatReader: Read 1556 entries in > .../MasterProcWALs/pv2-0482.log > 2019-04-30 23:41:55,370 INFO [master/master:17000:becomeActiveMaster] > wal.ProcedureWALFormatReader: Read 28113 entries in > .../MasterProcWALs/pv2-0481.log > 2019-04-30 23:41:55,384 ERROR 
[master/master:17000:becomeActiveMaster] > wal.WALProcedureTree: Missing stack id 166, max stack id is 181, root > procedure is Procedure(pid=289380, ppid=-1, > class=org.apache.hadoop.hbase.master.procedure.ServerCrashProcedure) > 2019-04-30 23:41:55,384 ERROR [master/master:17000:becomeActiveMaster] > wal.WALProcedureTree: Missing stack id 178, max stack id is 181, root > procedure is Procedure(pid=289380, ppid=-1, > class=org.apache.hadoop.hbase.master.procedure.ServerCrashProcedure) > 2019-04-30 23:41:55,389 ERROR [master/master:17000:becomeActiveMaster] > wal.WALProcedureTree: Missing stack id 359, max stack id is 360, root > procedure is Procedure(pid=285640, ppid=-1, > class=org.apache.hadoop.hbase.master.procedure.ServerCrashProcedure) > {noformat} > After that, the procedure(s) are lost and the cluster is stuck permanently. > There were no errors writing these files in the log, and no issues reading > them from HDFS, so it's purely a data loss issue in the structure. > I was thinking about debugging it, but on second thought, what we are trying to > store is some PB blob, by key. > Coincidentally, we have an "HBase" facility that we already deploy, that does > just that... and it even has a WAL implementation. I don't know why we cannot > use it for procedure state and have to invent another complex implementation > of a KV store inside a KV store. > In all/most cases, we don't even support rollback and use the latest state, > but if we need multiple versions, this HBase product even supports that! > I think we should add an hbase:proc table that would be maintained similarly to > meta. The latter, especially given the existing code for meta, should be much > simpler than a separate store implementation. > This should be pluggable and optional via a ProcStore interface (made more > abstract as relevant: update state, scan state, get). -- This message was sent by Atlassian JIRA (v7.6.3#76005)
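The pluggable, more abstract store contract the description asks for (update state, scan state, get) might look roughly like this. This is a sketch under assumptions: the interface name, method signatures, and the in-memory implementation are all hypothetical, not the actual HBase ProcedureStore API.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of a pluggable procedure-state store: a PB blob keyed
// by procedure id. Names are illustrative, not the real HBase API.
interface ProcStateStore {
    void update(long procId, byte[] stateBlob); // upsert the latest PB blob by key
    byte[] get(long procId);                    // point lookup by procedure id
    List<byte[]> scanAll();                     // full scan, e.g. for master recovery
    void delete(long procId);                   // remove a finished procedure
}

// Trivial in-memory implementation; an hbase:proc table-backed store would
// implement the same contract against meta-style storage.
class InMemoryProcStateStore implements ProcStateStore {
    private final Map<Long, byte[]> states = new HashMap<>();
    public void update(long procId, byte[] blob) { states.put(procId, blob); }
    public byte[] get(long procId) { return states.get(procId); }
    public List<byte[]> scanAll() { return new ArrayList<>(states.values()); }
    public void delete(long procId) { states.remove(procId); }
}
```

Since recovery only needs the latest state per procedure, a last-write-wins KV contract like this is sufficient; versioning, if ever needed, could come from the table's own multi-version support.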
[jira] [Updated] (HBASE-22348) allow one to actually disable replication svc
[ https://issues.apache.org/jira/browse/HBASE-22348?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HBASE-22348: - Description: Minor, but it does create extra ZK traffic for no reason and there's no way to disable that it appears. (was: Minor, but it does create extra ZK traffic for now reason and there's no way to disable that it appears. ) > allow one to actually disable replication svc > - > > Key: HBASE-22348 > URL: https://issues.apache.org/jira/browse/HBASE-22348 > Project: HBase > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin >Priority: Major > Attachments: HBASE-22348.patch > > > Minor, but it does create extra ZK traffic for no reason and there's no way > to disable that it appears. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-22348) allow one to actually disable replication svc
[ https://issues.apache.org/jira/browse/HBASE-22348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16831225#comment-16831225 ] Sergey Shelukhin commented on HBASE-22348: -- Other places appear to either have null checks already, or are within things like the bulk replication coprocessor, so they don't need one because they have to be enabled explicitly when using replication. > allow one to actually disable replication svc > - > > Key: HBASE-22348 > URL: https://issues.apache.org/jira/browse/HBASE-22348 > Project: HBase > Issue Type: Bug >Reporter: Sergey Shelukhin >Priority: Major > Attachments: HBASE-22348.patch > > > Minor, but it does create extra ZK traffic for no reason and there's no way > to disable it, it appears. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (HBASE-22348) allow one to actually disable replication svc
[ https://issues.apache.org/jira/browse/HBASE-22348?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin reassigned HBASE-22348: Assignee: Sergey Shelukhin > allow one to actually disable replication svc > - > > Key: HBASE-22348 > URL: https://issues.apache.org/jira/browse/HBASE-22348 > Project: HBase > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin >Priority: Major > Attachments: HBASE-22348.patch > > > Minor, but it does create extra ZK traffic for no reason and there's no way > to disable it, it appears. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HBASE-22348) allow one to actually disable replication svc
[ https://issues.apache.org/jira/browse/HBASE-22348?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HBASE-22348: - Status: Patch Available (was: Open) > allow one to actually disable replication svc > - > > Key: HBASE-22348 > URL: https://issues.apache.org/jira/browse/HBASE-22348 > Project: HBase > Issue Type: Bug >Reporter: Sergey Shelukhin >Priority: Major > Attachments: HBASE-22348.patch > > > Minor, but it does create extra ZK traffic for no reason and there's no way > to disable it, it appears. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HBASE-22348) allow one to actually disable replication svc
[ https://issues.apache.org/jira/browse/HBASE-22348?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HBASE-22348: - Attachment: HBASE-22348.patch > allow one to actually disable replication svc > - > > Key: HBASE-22348 > URL: https://issues.apache.org/jira/browse/HBASE-22348 > Project: HBase > Issue Type: Bug >Reporter: Sergey Shelukhin >Priority: Major > Attachments: HBASE-22348.patch > > > Minor, but it does create extra ZK traffic for no reason and there's no way > to disable it, it appears. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HBASE-22348) allow one to actually disable replication svc
Sergey Shelukhin created HBASE-22348: Summary: allow one to actually disable replication svc Key: HBASE-22348 URL: https://issues.apache.org/jira/browse/HBASE-22348 Project: HBase Issue Type: Bug Reporter: Sergey Shelukhin Minor, but it does create extra ZK traffic for no reason and there's no way to disable it, it appears. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HBASE-22347) try to archive WALs when closing a region or when shutting down RS
[ https://issues.apache.org/jira/browse/HBASE-22347?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HBASE-22347: - Description: When RS shuts down in an orderly manner due to an upgrade or decom, even if it has 0 regions (discovered when testing HBASE-22254), it still dies with some active WALs. WALs are then split by the master, and in the 0-region case the recovered edits are not used for anything. This splitting is a waste of time... if some region is moved away from the server it might also make sense to archive the WALs to avoid reading the extras. RS shutdown should archive WALs if possible after flushing/closing regions; given that the latter can fail, perhaps once before, and once after. Closing a region via an RPC should also try to archive the WAL. was: When RS shuts down in an orderly manner due to an upgrade or decom, even it has 0 regions (discovered when testing HBASE-22254), it still dies with some active WALs. WALs are then split by master, and in the 0-region case the recovered edits are not used for anything. RS shutdown should archive WALs if possible after flushing/closing regions; given that the latter can fail, perhaps once before, and once after. Closing a region via an RPC should also try to archive WAL. > try to archive WALs when closing a region or when shutting down RS > -- > > Key: HBASE-22347 > URL: https://issues.apache.org/jira/browse/HBASE-22347 > Project: HBase > Issue Type: Bug >Reporter: Sergey Shelukhin >Priority: Major > > When RS shuts down in an orderly manner due to an upgrade or decom, even if it > has 0 regions (discovered when testing HBASE-22254), it still dies with some > active WALs. > WALs are then split by the master, and in the 0-region case the recovered edits > are not used for anything. This splitting is a waste of time... if some > region is moved away from the server it might also make sense to archive the > WALs to avoid reading the extras. 
> RS shutdown should archive WALs if possible after flushing/closing regions; > given that the latter can fail, perhaps once before, and once after. > Closing a region via an RPC should also try to archive the WAL. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
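The proposed shutdown sequence (attempt archiving both before and after closing regions, since closing can fail) can be sketched as below. The class and method names are hypothetical placeholders, not the actual RegionServer API.

```java
// Sketch only: archive WALs once before closing regions (cheap no-op if the
// server still hosts regions) and once after, so an exception while closing
// regions does not skip the second attempt.
class WalArchiveOnShutdown {
    int archiveAttempts = 0;
    boolean regionCloseFails = false;
    boolean regionsClosed = false;

    // Stand-in for "archive WALs if possible": only succeeds when all edits
    // are flushed and no region still writes to the WAL.
    void archiveWalsIfPossible() {
        archiveAttempts++;
    }

    // Stand-in for flushing and closing all hosted regions; may fail.
    void closeAllRegions() {
        if (regionCloseFails) {
            throw new RuntimeException("region close failed");
        }
        regionsClosed = true;
    }

    void shutdown() {
        archiveWalsIfPossible();   // once before: covers the 0-region case
        try {
            closeAllRegions();
        } catch (RuntimeException e) {
            // closing can fail; still try to archive whatever is archivable
        }
        archiveWalsIfPossible();   // once after flushing/closing regions
    }
}
```

The payoff is that an orderly shutdown with no live WAL edits leaves nothing for the master to split.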
[jira] [Created] (HBASE-22347) try to archive WALs when closing a region or when shutting down RS
Sergey Shelukhin created HBASE-22347: Summary: try to archive WALs when closing a region or when shutting down RS Key: HBASE-22347 URL: https://issues.apache.org/jira/browse/HBASE-22347 Project: HBase Issue Type: Bug Reporter: Sergey Shelukhin When RS shuts down in an orderly manner due to an upgrade or decom, even if it has 0 regions (discovered when testing HBASE-22254), it still dies with some active WALs. WALs are then split by the master, and in the 0-region case the recovered edits are not used for anything. RS shutdown should archive WALs if possible after flushing/closing regions; given that the latter can fail, perhaps once before, and once after. Closing a region via an RPC should also try to archive the WAL. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HBASE-22346) scanner priorities/deadline units are invalid for non-huge scanners
[ https://issues.apache.org/jira/browse/HBASE-22346?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HBASE-22346: - Status: Patch Available (was: Open) > scanner priorities/deadline units are invalid for non-huge scanners > --- > > Key: HBASE-22346 > URL: https://issues.apache.org/jira/browse/HBASE-22346 > Project: HBase > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin >Priority: Major > Attachments: HBASE-22346.patch > > > I was looking at using the priority (deadline) queue for scanner requests; > what I see is that AnnotationReadingPriorityFunction, the only impl of the > deadline function available, implements getDeadline as sqrt of the number of > next() calls, from HBASE-10993. > However, CallPriorityComparator.compare, its only caller, adds that > "deadline" value to the callA.getReceiveTime() in milliseconds... > That results in some sort of a meaningless value that I assume only makes > sense "by coincidence" for telling apart broad and specific classes of > scanners... in practice, next() calls must number in the 1000s before it becomes > meaningful vs. small differences in ReceivedTime. > When there's contention from many scanners, e.g. small scanners for meta, or > just users creating tons of scanners to the point where requests queue up, > the actual deadline is not accounted for and the priority function itself is > meaningless... In fact, as queueing increases, it becomes worse because > receive-time differences grow. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
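The unit mismatch described in the issue can be sketched numerically. The class below is a simplified stand-in with hypothetical names, not the actual AnnotationReadingPriorityFunction or CallPriorityComparator code; it only reproduces the arithmetic the description attributes to them.

```java
// Sketch of the unit mismatch: a unitless sqrt(next-call count) "deadline"
// is added to a receive time measured in milliseconds, so small arrival-time
// differences swamp the scanner-age penalty.
class DeadlineUnitsDemo {

    // Simplified stand-in for the priority function's getDeadline:
    // sqrt of the number of next() calls the scanner has issued.
    static long getDeadline(long nextCallCount) {
        return (long) Math.sqrt(nextCallCount);
    }

    // Simplified stand-in for the comparator: the effective sort key is
    // receive time (ms) plus the unitless "deadline" penalty.
    static long effectivePriority(long receiveTimeMs, long nextCallCount) {
        return receiveTimeMs + getDeadline(nextCallCount);
    }
}
```

A scanner that has issued 10,000 next() calls only picks up a penalty of 100, so a brand-new scanner arriving 200 ms later still sorts behind it; the penalty only matters once next() counts reach the thousands, exactly as the description argues.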
[jira] [Updated] (HBASE-22346) scanner priorities/deadline units are invalid for non-huge scanners
[ https://issues.apache.org/jira/browse/HBASE-22346?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HBASE-22346: - Attachment: HBASE-22346.patch > scanner priorities/deadline units are invalid for non-huge scanners > --- > > Key: HBASE-22346 > URL: https://issues.apache.org/jira/browse/HBASE-22346 > Project: HBase > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin >Priority: Major > Attachments: HBASE-22346.patch > > > I was looking at using the priority (deadline) queue for scanner requests; > what I see is that AnnotationReadingPriorityFunction, the only impl of the > deadline function available, implements getDeadline as sqrt of the number of > next() calls, from HBASE-10993. > However, CallPriorityComparator.compare, its only caller, adds that > "deadline" value to the callA.getReceiveTime() in milliseconds... > That results in some sort of a meaningless value that I assume only make > sense "by coincidence" for telling apart broad and specific classes of > scanners... in practice next calls must be in the 1000s before it becomes > meaningful vs small differences in ReceivedTime > When there's contention from many scanners, e.g. small scanners for meta, or > just users creating tons of scanners to the point where requests queue up, > the actual deadline is not accounted for and the priority function itself is > meaningless... In fact as queueing increases, it becomes worse because > receivedtime differences grow. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-22346) scanner priorities/deadline units are invalid for non-huge scanners
[ https://issues.apache.org/jira/browse/HBASE-22346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16830826#comment-16830826 ] Sergey Shelukhin commented on HBASE-22346: -- [~stack] [~mbertozzi] does this make sense to you? It preserves the old behavior with low/no overhead when unset. We will probably run this for meta only on our cluster and see how it goes. > scanner priorities/deadline units are invalid for non-huge scanners > --- > > Key: HBASE-22346 > URL: https://issues.apache.org/jira/browse/HBASE-22346 > Project: HBase > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin >Priority: Major > > I was looking at using the priority (deadline) queue for scanner requests; > what I see is that AnnotationReadingPriorityFunction, the only impl of the > deadline function available, implements getDeadline as sqrt of the number of > next() calls, from HBASE-10993. > However, CallPriorityComparator.compare, its only caller, adds that > "deadline" value to the callA.getReceiveTime() in milliseconds... > That results in some sort of a meaningless value that I assume only makes > sense "by coincidence" for telling apart broad and specific classes of > scanners... in practice, next() calls must number in the 1000s before it becomes > meaningful vs. small differences in ReceivedTime. > When there's contention from many scanners, e.g. small scanners for meta, or > just users creating tons of scanners to the point where requests queue up, > the actual deadline is not accounted for and the priority function itself is > meaningless... In fact, as queueing increases, it becomes worse because > receive-time differences grow. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (HBASE-22346) scanner priorities/deadline units are invalid for non-huge scanners
[ https://issues.apache.org/jira/browse/HBASE-22346?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin reassigned HBASE-22346: Assignee: Sergey Shelukhin > scanner priorities/deadline units are invalid for non-huge scanners > --- > > Key: HBASE-22346 > URL: https://issues.apache.org/jira/browse/HBASE-22346 > Project: HBase > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin >Priority: Major > > I was looking at using the priority (deadline) queue for scanner requests; > what I see is that AnnotationReadingPriorityFunction, the only impl of the > deadline function available, implements getDeadline as sqrt of the number of > next() calls, from HBASE-10993. > However, CallPriorityComparator.compare, its only caller, adds that > "deadline" value to the callA.getReceiveTime() in milliseconds... > That results in some sort of a meaningless value that I assume only make > sense "by coincidence" for telling apart broad and specific classes of > scanners... in practice next calls must be in the 1000s before it becomes > meaningful vs small differences in ReceivedTime > When there's contention from many scanners, e.g. small scanners for meta, or > just users creating tons of scanners to the point where requests queue up, > the actual deadline is not accounted for and the priority function itself is > meaningless... In fact as queueing increases, it becomes worse because > receivedtime differences grow. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-22081) master shutdown: close RpcServer and procWAL first thing
[ https://issues.apache.org/jira/browse/HBASE-22081?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16830810#comment-16830810 ] Sergey Shelukhin commented on HBASE-22081: -- [~Apache9] does this patch make sense to you? It moves RPC server and proc WAL closing to the beginning of shutdown to limit potential race conditions with incorrect state/new requests. > master shutdown: close RpcServer and procWAL first thing > > > Key: HBASE-22081 > URL: https://issues.apache.org/jira/browse/HBASE-22081 > Project: HBase > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin >Priority: Major > Attachments: HBASE-22081.01.patch, HBASE-22081.02.patch, > HBASE-22081.03.patch, HBASE-22081.patch > > > I had a master get stuck due to HBASE-22079 and noticed it was logging RS > abort messages during shutdown. > [~bahramch] found some issues where messages are processed by the old master > during shutdown due to a race condition in the RS cache (or it could also happen > due to a network race). > Previously I found a bug where an SCP was created during master shutdown that > had incorrect state (because some structures had already been cleaned up). > I think before master fencing is implemented we can at least make these > issues much less likely by thinking about shutdown order. > 1) First kill the RPC server so we don't receive any more messages. There's no > need to receive messages when we are shutting down. Server heartbeats could > be impacted, I guess, but I don't think they will be, because we currently only > kill an RS on ZK timeout. > 2) Then do whatever cleanup we think is needed that requires the proc WAL. > 3) Then close the proc WAL so no errant threads can create more procs. > 4) Then do whatever other cleanup. > 5) Finally delete the znode. > Right now the znode is deleted somewhat early, I think, and the RpcServer is closed > very late. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
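The five-step ordering proposed in the description can be sketched as below. This is an illustrative skeleton only: the method names are placeholders for the corresponding master components, not actual HMaster methods.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of the proposed master shutdown order: stop accepting input first,
// finish work that needs the proc WAL, close the proc WAL, then the rest,
// and give up mastership (the znode) last.
class MasterShutdownOrder {
    final List<String> steps = new ArrayList<>();

    void shutdown() {
        stopRpcServer();         // 1) no more incoming messages
        cleanupNeedingProcWal(); // 2) cleanup that still requires the proc WAL
        closeProcWal();          // 3) errant threads can't create more procs
        remainingCleanup();      // 4) everything else
        deleteMasterZnode();     // 5) release mastership last
    }

    void stopRpcServer()         { steps.add("rpc"); }
    void cleanupNeedingProcWal() { steps.add("procCleanup"); }
    void closeProcWal()          { steps.add("procWal"); }
    void remainingCleanup()      { steps.add("cleanup"); }
    void deleteMasterZnode()     { steps.add("znode"); }
}
```

The key invariants are that the RPC server stops before anything else and the znode goes away last, which is the inverse of the early-znode/late-RpcServer ordering the comment criticizes.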
[jira] [Commented] (HBASE-22081) master shutdown: close RpcServer and procWAL first thing
[ https://issues.apache.org/jira/browse/HBASE-22081?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16830659#comment-16830659 ] Sergey Shelukhin commented on HBASE-22081: -- Interesting... tests pass in the JIRA and locally, but not in the PR. > master shutdown: close RpcServer and procWAL first thing > > > Key: HBASE-22081 > URL: https://issues.apache.org/jira/browse/HBASE-22081 > Project: HBase > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin >Priority: Major > Attachments: HBASE-22081.01.patch, HBASE-22081.02.patch, > HBASE-22081.03.patch, HBASE-22081.patch > > > I had a master get stuck due to HBASE-22079 and noticed it was logging RS > abort messages during shutdown. > [~bahramch] found some issues where messages are processed by old master > during shutdown due to a race condition in RS cache (or it could also happen > due to a network race). > Previously I found some bug where SCP was created during master shutdown that > had incorrect state (because some structures already got cleaned). > I think before master fencing is implemented we can at least make these > issues much less likely by thinking about shutdown order. > 1) First kill RCP server so we don't receive any more messages. There's no > need to receive messages when we are shutting down. Server heartbeats could > be impacted I guess, but I don't think they will be cause we currently only > kill RS on ZK timeout. > 2) Then do whatever cleanup we think is needed that requires proc wal. > 3) Then close proc WAL so no errant threads can create more procs. > 4) Then do whatever other cleanup. > 5) Finally delete znode. > Right now znode is deleted somewhat early I think, and RpcServer is closed > very late. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HBASE-22081) master shutdown: close RpcServer and procWAL first thing
[ https://issues.apache.org/jira/browse/HBASE-22081?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HBASE-22081: - Attachment: HBASE-22081.03.patch > master shutdown: close RpcServer and procWAL first thing > > > Key: HBASE-22081 > URL: https://issues.apache.org/jira/browse/HBASE-22081 > Project: HBase > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin >Priority: Major > Attachments: HBASE-22081.01.patch, HBASE-22081.02.patch, > HBASE-22081.03.patch, HBASE-22081.patch > > > I had a master get stuck due to HBASE-22079 and noticed it was logging RS > abort messages during shutdown. > [~bahramch] found some issues where messages are processed by old master > during shutdown due to a race condition in RS cache (or it could also happen > due to a network race). > Previously I found some bug where SCP was created during master shutdown that > had incorrect state (because some structures already got cleaned). > I think before master fencing is implemented we can at least make these > issues much less likely by thinking about shutdown order. > 1) First kill RCP server so we don't receive any more messages. There's no > need to receive messages when we are shutting down. Server heartbeats could > be impacted I guess, but I don't think they will be cause we currently only > kill RS on ZK timeout. > 2) Then do whatever cleanup we think is needed that requires proc wal. > 3) Then close proc WAL so no errant threads can create more procs. > 4) Then do whatever other cleanup. > 5) Finally delete znode. > Right now znode is deleted somewhat early I think, and RpcServer is closed > very late. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HBASE-22346) scanner priorities/deadline units are invalid for non-huge scanners
[ https://issues.apache.org/jira/browse/HBASE-22346?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HBASE-22346: - Description: I was looking at using the priority (deadline) queue for scanner requests; what I see is that AnnotationReadingPriorityFunction, the only impl of the deadline function available, implements getDeadline as sqrt of the number of next() calls, from HBASE-10993. However, CallPriorityComparator.compare, its only caller, adds that "deadline" value to the callA.getReceiveTime() in milliseconds... That results in some sort of a meaningless value that I assume only makes sense "by coincidence" for telling apart broad and specific classes of scanners... in practice, next() calls must number in the 1000s before it becomes meaningful vs. small differences in ReceivedTime. When there's contention from many scanners, e.g. small scanners for meta, or just users creating tons of scanners to the point where requests queue up, the actual deadline is not accounted for and the priority function itself is meaningless... In fact, as queueing increases, it becomes worse because receive-time differences grow. was: I was looking at using the priority (deadline) queue for scanner requests; what I see is that AnnotationReadingPriorityFunction, the only impl of the deadline function available, implements getDeadline as sqrt of the number of next() calls, from HBASE-10993. However, CallPriorityComparator.compare, its only caller, adds that "deadline" value to the callA.getReceiveTime() in milliseconds... That results in some sort of a meaningless value that I assume only make sense "by coincidence" for telling apart broad and specific classes of scanners... in practice next calls must be in the 1000s before it becomes meaningful vs small differences in ReceivedTime When there's contention for many scanners, e.g. 
small scanners for meta, or just users creating tons of scanners to the point where requests queue up, the actual deadline is not accounted for and the priority function itself is meaningless... In fact as queueing increases, it becomes worse because receivedtime differences grow. > scanner priorities/deadline units are invalid for non-huge scanners > --- > > Key: HBASE-22346 > URL: https://issues.apache.org/jira/browse/HBASE-22346 > Project: HBase > Issue Type: Bug >Reporter: Sergey Shelukhin >Priority: Major > > I was looking at using the priority (deadline) queue for scanner requests; > what I see is that AnnotationReadingPriorityFunction, the only impl of the > deadline function available, implements getDeadline as sqrt of the number of > next() calls, from HBASE-10993. > However, CallPriorityComparator.compare, its only caller, adds that > "deadline" value to the callA.getReceiveTime() in milliseconds... > That results in some sort of a meaningless value that I assume only make > sense "by coincidence" for telling apart broad and specific classes of > scanners... in practice next calls must be in the 1000s before it becomes > meaningful vs small differences in ReceivedTime > When there's contention from many scanners, e.g. small scanners for meta, or > just users creating tons of scanners to the point where requests queue up, > the actual deadline is not accounted for and the priority function itself is > meaningless... In fact as queueing increases, it becomes worse because > receivedtime differences grow. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HBASE-22346) scanner priorities/deadline units are invalid for non-huge scanners
[ https://issues.apache.org/jira/browse/HBASE-22346?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HBASE-22346: - Description: I was looking at using the priority (deadline) queue for scanner requests; what I see is that AnnotationReadingPriorityFunction, the only impl of the deadline function available, implements getDeadline as sqrt of the number of next() calls, from HBASE-10993. However, CallPriorityComparator.compare, its only caller, adds that "deadline" value to the callA.getReceiveTime() in milliseconds... That results in some sort of a meaningless value that I assume only make sense "by coincidence" for telling apart broad and specific classes of scanners... in practice next calls must be in the 1000s before it becomes meaningful vs small differences in ReceivedTime When there's contention for many scanners, e.g. small scanners for meta, or just users creating tons of scanners to the point where requests queue up, the actual deadline is not accounted for and the priority function itself is meaningless... In fact as queueing increases, it becomes worse because receivedtime differences grow. was: I was looking at using the priority (deadline) queue for scanner requests; what I see is that AnnotationReadingPriorityFunction, the only impl of the deadline function available, implements getDeadline as sqrt of the number of next() calls, from HBASE-10993. However, CallPriorityComparator.compare, its only caller, adds that "deadline" value to the callA.getReceiveTime() in milliseconds... That results in some sort of a meaningless value that I assume only make sense by coincidence for telling apart broad and specific classes of scanners... in practice next calls must be in the 1000s before it becomes meaningful vs small differences in ReceivedTime When there's contention for many scanners, e.g. 
small scanners for meta, or just users creating tons of scanners to the point where requests queue up, the actual deadline is not accounted for and the priority function itself is meaningless... In fact as queueing increases, it becomes worse because receivedtime differences grow. > scanner priorities/deadline units are invalid for non-huge scanners > --- > > Key: HBASE-22346 > URL: https://issues.apache.org/jira/browse/HBASE-22346 > Project: HBase > Issue Type: Bug >Reporter: Sergey Shelukhin >Priority: Major > > I was looking at using the priority (deadline) queue for scanner requests; > what I see is that AnnotationReadingPriorityFunction, the only impl of the > deadline function available, implements getDeadline as sqrt of the number of > next() calls, from HBASE-10993. > However, CallPriorityComparator.compare, its only caller, adds that > "deadline" value to the callA.getReceiveTime() in milliseconds... > That results in some sort of a meaningless value that I assume only make > sense "by coincidence" for telling apart broad and specific classes of > scanners... in practice next calls must be in the 1000s before it becomes > meaningful vs small differences in ReceivedTime > When there's contention for many scanners, e.g. small scanners for meta, or > just users creating tons of scanners to the point where requests queue up, > the actual deadline is not accounted for and the priority function itself is > meaningless... In fact as queueing increases, it becomes worse because > receivedtime differences grow. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-22346) scanner priorities/deadline units are invalid for non-huge scanners
[ https://issues.apache.org/jira/browse/HBASE-22346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16830647#comment-16830647 ] Sergey Shelukhin commented on HBASE-22346: -- cc [~mbertozzi] was adding the number to received time intentional? > scanner priorities/deadline units are invalid for non-huge scanners > --- > > Key: HBASE-22346 > URL: https://issues.apache.org/jira/browse/HBASE-22346 > Project: HBase > Issue Type: Bug >Reporter: Sergey Shelukhin >Priority: Major > > I was looking at using the priority (deadline) queue for scanner requests; > what I see is that AnnotationReadingPriorityFunction, the only impl of the > deadline function available, implements getDeadline as sqrt of the number of > next() calls, from HBASE-10993. > However, CallPriorityComparator.compare, its only caller, adds that > "deadline" value to the callA.getReceiveTime() in milliseconds... > That results in some sort of a meaningless value that I assume only make > sense "by coincidence" for telling apart broad and specific classes of > scanners... in practice next calls must be in the 1000s before it becomes > meaningful vs small differences in ReceivedTime > When there's contention from many scanners, e.g. small scanners for meta, or > just users creating tons of scanners to the point where requests queue up, > the actual deadline is not accounted for and the priority function itself is > meaningless... In fact as queueing increases, it becomes worse because > receivedtime differences grow. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HBASE-22346) scanner priorities/deadline units are invalid for non-huge scanners
Sergey Shelukhin created HBASE-22346: Summary: scanner priorities/deadline units are invalid for non-huge scanners Key: HBASE-22346 URL: https://issues.apache.org/jira/browse/HBASE-22346 Project: HBase Issue Type: Bug Reporter: Sergey Shelukhin I was looking at using the priority (deadline) queue for scanner requests; what I see is that AnnotationReadingPriorityFunction, the only impl of the deadline function available, implements getDeadline as sqrt of the number of next() calls, from HBASE-10993. However, CallPriorityComparator.compare, its only caller, adds that "deadline" value to the callA.getReceiveTime() in milliseconds... That results in some sort of a meaningless value that I assume only makes sense by coincidence for telling apart broad and specific classes of scanners... in practice, next() calls must number in the 1000s before it becomes meaningful vs. small differences in ReceivedTime. When there's contention for many scanners, e.g. small scanners for meta, or just users creating tons of scanners to the point where requests queue up, the actual deadline is not accounted for and the priority function itself is meaningless... In fact, as queueing increases, it becomes worse because receive-time differences grow. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-22301) Consider rolling the WAL if the HDFS write pipeline is slow
[ https://issues.apache.org/jira/browse/HBASE-22301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16830463#comment-16830463 ] Sergey Shelukhin commented on HBASE-22301: -- It may do so anyway, due to HDFS-14387. What is the thing that prevents it from picking the local node? +1 > Consider rolling the WAL if the HDFS write pipeline is slow > --- > > Key: HBASE-22301 > URL: https://issues.apache.org/jira/browse/HBASE-22301 > Project: HBase > Issue Type: Improvement > Components: wal >Reporter: Andrew Purtell >Assignee: Andrew Purtell >Priority: Minor > Fix For: 3.0.0, 1.5.0, 2.3.0 > > Attachments: HBASE-22301-branch-1.patch, HBASE-22301-branch-1.patch, > HBASE-22301-branch-1.patch, HBASE-22301-branch-1.patch, > HBASE-22301-branch-1.patch, HBASE-22301-branch-1.patch > > > Consider the case when a subset of the HDFS fleet is unhealthy but suffering > a gray failure not an outright outage. HDFS operations, notably syncs, are > abnormally slow on pipelines which include this subset of hosts. If the > regionserver's WAL is backed by an impacted pipeline, all WAL handlers can be > consumed waiting for acks from the datanodes in the pipeline (recall that > some of them are sick). Imagine a write heavy application distributing load > uniformly over the cluster at a fairly high rate. With the WAL subsystem > slowed by HDFS level issues, all handlers can be blocked waiting to append to > the WAL. Once all handlers are blocked, the application will experience > backpressure. All (HBase) clients eventually have too many outstanding writes > and block. > Because the application is distributing writes near uniformly in the > keyspace, the probability any given service endpoint will dispatch a request > to an impacted regionserver, even a single regionserver, approaches 1.0. So > the probability that all service endpoints will be affected approaches 1.0. > In order to break the logjam, we need to remove the slow datanodes. 
Although > there is HDFS level monitoring, mechanisms, and procedures for this, we > should also attempt to take mitigating action at the HBase layer as soon as > we find ourselves in trouble. It would be enough to remove the affected > datanodes from the writer pipelines. A super simple strategy that can be > effective is described below: > This is with branch-1 code. I think branch-2's async WAL can mitigate but > still can be susceptible. branch-2 sync WAL is susceptible. > We already roll the WAL writer if the pipeline suffers the failure of a > datanode and the replication factor on the pipeline is too low. We should > also consider how much time it took for the write pipeline to complete a sync > the last time we measured it, or the max over the interval from now to the > last time we checked. If the sync time exceeds a configured threshold, roll > the log writer then too. Fortunately we don't need to know which datanode is > making the WAL write pipeline slow, only that syncs on the pipeline are too > slow and exceeding a threshold. This is enough information to know when to > roll it. Once we roll it, we will get three new randomly selected datanodes. > On most clusters the probability the new pipeline includes the slow datanode > will be low. (And if for some reason it does end up with a problematic > datanode again, we roll again.) > This is not a silver bullet but this can be a reasonably effective mitigation. > Provide a metric for tracking when log roll is requested (and for what > reason). > Emit a log line at log roll time that includes datanode pipeline details for > further debugging and analysis, similar to the existing slow FSHLog sync log > line. > If we roll too many times within a short interval of time this probably means > there is a widespread problem with the fleet and so our mitigation is not > helping and may be exacerbating those problems or operator difficulties. 
> Ensure log roll requests triggered by this new feature happen infrequently > enough to not cause difficulties under either normal or abnormal conditions. > A very simple strategy that could work well under both normal and abnormal > conditions is to define a fairly lengthy interval, default 5 minutes, and > then ensure we do not roll more than once during this interval for this > reason. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
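The strategy above (roll when a sync exceeds a threshold, but at most once per interval) can be sketched as follows. This is a minimal Python sketch, not the actual branch-1 patch; the class name and constant values are hypothetical:

```python
import time

SLOW_SYNC_ROLL_THRESHOLD_MS = 10_000  # hypothetical slow-sync threshold
MIN_ROLL_INTERVAL_SEC = 300           # "fairly lengthy interval, default 5 minutes"

class SlowSyncRoller:
    """Roll the WAL writer when the last measured sync exceeds a
    threshold, but at most once per interval to avoid roll storms."""

    def __init__(self, now=time.monotonic):
        self.now = now                     # injectable clock for testing
        self.last_roll = float("-inf")     # never rolled yet

    def on_sync_completed(self, sync_duration_ms):
        """Return True if the caller should request a log roll,
        which yields a new, randomly selected datanode pipeline."""
        if sync_duration_ms <= SLOW_SYNC_ROLL_THRESHOLD_MS:
            return False                   # pipeline looks healthy enough
        if self.now() - self.last_roll < MIN_ROLL_INTERVAL_SEC:
            return False                   # rolled too recently; back off
        self.last_roll = self.now()
        return True
```

Note that only the sync duration is needed; as the description says, we never have to identify which datanode is slow, because rolling re-picks the pipeline at random.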
[jira] [Commented] (HBASE-22081) master shutdown: close RpcServer and procWAL first thing
[ https://issues.apache.org/jira/browse/HBASE-22081?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16829874#comment-16829874 ] Sergey Shelukhin commented on HBASE-22081: -- This patch is getting more and more interesting. Looks like some procedures do not handle InterruptedIOException correctly, retrying it forever, which, in the case of the minicluster, prevents it from shutting down. Not sure how the order of termination affected it; probably the procWAL terminating early just catches the proc in the test in a different state than it did before. > master shutdown: close RpcServer and procWAL first thing > > > Key: HBASE-22081 > URL: https://issues.apache.org/jira/browse/HBASE-22081 > Project: HBase > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin >Priority: Major > Attachments: HBASE-22081.01.patch, HBASE-22081.02.patch, > HBASE-22081.patch > > > I had a master get stuck due to HBASE-22079 and noticed it was logging RS > abort messages during shutdown. > [~bahramch] found some issues where messages are processed by the old master > during shutdown due to a race condition in the RS cache (or it could also happen > due to a network race). > Previously I found a bug where an SCP was created during master shutdown that > had incorrect state (because some structures had already been cleaned up). > I think before master fencing is implemented we can at least make these > issues much less likely by thinking about shutdown order. > 1) First kill the RPC server so we don't receive any more messages. There's no > need to receive messages when we are shutting down. Server heartbeats could > be impacted I guess, but I don't think they will be, because we currently only > kill an RS on ZK timeout. > 2) Then do whatever cleanup we think is needed that requires the proc WAL. > 3) Then close the proc WAL so no errant threads can create more procs. > 4) Then do whatever other cleanup. > 5) Finally delete the znode. 
> Right now znode is deleted somewhat early I think, and RpcServer is closed > very late. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
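The five-step ordering above can be sketched as a simple sequence. The names below are illustrative only, not actual HMaster fields or methods; the recorder just demonstrates the intended order:

```python
class ShutdownRecorder:
    """Records the order in which shutdown steps run, mirroring the
    ordering proposed in the issue (names are hypothetical)."""

    def __init__(self):
        self.steps = []

    def stop_rpc_server(self):        self.steps.append("rpc_server")
    def cleanup_with_proc_wal(self):  self.steps.append("procwal_cleanup")
    def close_proc_wal(self):         self.steps.append("procwal_close")
    def cleanup_remaining(self):      self.steps.append("other_cleanup")
    def delete_master_znode(self):    self.steps.append("znode")

def shutdown(m):
    m.stop_rpc_server()        # 1) stop accepting RS/client messages first
    m.cleanup_with_proc_wal()  # 2) cleanup that still needs the proc WAL
    m.close_proc_wal()         # 3) no errant thread can create new procs
    m.cleanup_remaining()      # 4) everything that needs no proc WAL
    m.delete_master_znode()    # 5) give up mastership last

m = ShutdownRecorder()
shutdown(m)
print(m.steps)
```

The key inversion relative to the current behavior is that the RPC server stops first and the znode is deleted last, whereas today the znode goes early and the RpcServer closes very late.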
[jira] [Updated] (HBASE-22081) master shutdown: close RpcServer and procWAL first thing
[ https://issues.apache.org/jira/browse/HBASE-22081?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HBASE-22081: - Attachment: HBASE-22081.02.patch -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Comment Edited] (HBASE-22301) Consider rolling the WAL if the HDFS write pipeline is slow
[ https://issues.apache.org/jira/browse/HBASE-22301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16829803#comment-16829803 ] Sergey Shelukhin edited comment on HBASE-22301 at 4/29/19 10:59 PM: Well I meant -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Comment Edited] (HBASE-22301) Consider rolling the WAL if the HDFS write pipeline is slow
[ https://issues.apache.org/jira/browse/HBASE-22301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16829795#comment-16829795 ] Sergey Shelukhin edited comment on HBASE-22301 at 4/29/19 10:50 PM: In our case, though, the problem was that each slow sync would take up to tens of seconds, so with the current DEFAULT_SLOW_SYNC_ROLL_THRESHOLD, as far as I can tell from the patch, the condition would not trigger for a very long time. Should the rolling simply be based on a single value that is a total/weighted sync time accumulated over the N latest syncs, with no minimum threshold by count? That way it can accumulate the offending amount of sync time over a single bad sync or multiple somewhat-bad ones. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
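The alternative proposed in the comment above, rolling on total sync time accumulated over the N latest syncs with no minimum count, can be sketched as follows. This is a hypothetical Python illustration; the constants, class name, and window size are assumptions, not anything from the patch:

```python
from collections import deque

SYNC_TIME_BUDGET_MS = 30_000  # hypothetical accumulated-sync-time budget
WINDOW = 16                   # N latest syncs to consider

class AccumulatedSyncRoller:
    """Decide to roll based on total sync time over the N latest syncs,
    so a single very bad sync or several somewhat-bad ones can both
    trip the same budget -- no per-sync threshold, no minimum count."""

    def __init__(self):
        self.recent = deque(maxlen=WINDOW)  # oldest samples fall off

    def on_sync_completed(self, sync_duration_ms):
        """Return True if the caller should request a log roll."""
        self.recent.append(sync_duration_ms)
        return sum(self.recent) > SYNC_TIME_BUDGET_MS

r = AccumulatedSyncRoller()
assert r.on_sync_completed(40_000)          # one terrible sync trips it

r2 = AccumulatedSyncRoller()
for _ in range(15):
    assert not r2.on_sync_completed(2_000)  # 15 * 2s = 30s, at the budget
assert r2.on_sync_completed(2_000)          # 16 * 2s = 32s, over budget
```

Unlike a count-of-slow-syncs threshold, this design reacts quickly when individual syncs take tens of seconds, which is exactly the scenario the comment describes.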
[jira] [Commented] (HBASE-22301) Consider rolling the WAL if the HDFS write pipeline is slow
[ https://issues.apache.org/jira/browse/HBASE-22301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16829797#comment-16829797 ] Sergey Shelukhin commented on HBASE-22301: -- Sorry about committing to master only. I just started contributing to HBase again and was assuming that we should move forwards, not backwards ;) I saw that branch-2 is still very much alive now, so I'm committing recent fixes there too. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-22301) Consider rolling the WAL if the HDFS write pipeline is slow
[ https://issues.apache.org/jira/browse/HBASE-22301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16829784#comment-16829784 ] Sergey Shelukhin commented on HBASE-22301: -- Should this augment/be similar to HBASE-21806? > Consider rolling the WAL if the HDFS write pipeline is slow > --- > > Key: HBASE-22301 > URL: https://issues.apache.org/jira/browse/HBASE-22301 > Project: HBase > Issue Type: Improvement > Components: wal >Reporter: Andrew Purtell >Assignee: Andrew Purtell >Priority: Minor > Fix For: 3.0.0, 1.5.0, 2.3.0 > > Attachments: HBASE-22301-branch-1.patch, HBASE-22301-branch-1.patch, > HBASE-22301-branch-1.patch, HBASE-22301-branch-1.patch, > HBASE-22301-branch-1.patch > > > Consider the case when a subset of the HDFS fleet is unhealthy but suffering > a gray failure not an outright outage. HDFS operations, notably syncs, are > abnormally slow on pipelines which include this subset of hosts. If the > regionserver's WAL is backed by an impacted pipeline, all WAL handlers can be > consumed waiting for acks from the datanodes in the pipeline (recall that > some of them are sick). Imagine a write heavy application distributing load > uniformly over the cluster at a fairly high rate. With the WAL subsystem > slowed by HDFS level issues, all handlers can be blocked waiting to append to > the WAL. Once all handlers are blocked, the application will experience > backpressure. All (HBase) clients eventually have too many outstanding writes > and block. > Because the application is distributing writes near uniformly in the > keyspace, the probability any given service endpoint will dispatch a request > to an impacted regionserver, even a single regionserver, approaches 1.0. So > the probability that all service endpoints will be affected approaches 1.0. > In order to break the logjam, we need to remove the slow datanodes. 
Although > there is HDFS level monitoring, mechanisms, and procedures for this, we > should also attempt to take mitigating action at the HBase layer as soon as > we find ourselves in trouble. It would be enough to remove the affected > datanodes from the writer pipelines. A super simple strategy that can be > effective is described below: > This is with branch-1 code. I think branch-2's async WAL can mitigate but > still can be susceptible. branch-2 sync WAL is susceptible. > We already roll the WAL writer if the pipeline suffers the failure of a > datanode and the replication factor on the pipeline is too low. We should > also consider how much time it took for the write pipeline to complete a sync > the last time we measured it, or the max over the interval from now to the > last time we checked. If the sync time exceeds a configured threshold, roll > the log writer then too. Fortunately we don't need to know which datanode is > making the WAL write pipeline slow, only that syncs on the pipeline are too > slow and exceeding a threshold. This is enough information to know when to > roll it. Once we roll it, we will get three new randomly selected datanodes. > On most clusters the probability the new pipeline includes the slow datanode > will be low. (And if for some reason it does end up with a problematic > datanode again, we roll again.) > This is not a silver bullet but this can be a reasonably effective mitigation. > Provide a metric for tracking when log roll is requested (and for what > reason). > Emit a log line at log roll time that includes datanode pipeline details for > further debugging and analysis, similar to the existing slow FSHLog sync log > line. > If we roll too many times within a short interval of time this probably means > there is a widespread problem with the fleet and so our mitigation is not > helping and may be exacerbating those problems or operator difficulties. 
> Ensure log roll requests triggered by this new feature happen infrequently > enough to not cause difficulties under either normal or abnormal conditions. > A very simple strategy that could work well under both normal and abnormal > conditions is to define a fairly lengthy interval, default 5 minutes, and > then ensure we do not roll more than once during this interval for this > reason. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
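The threshold-plus-interval strategy described above can be sketched as a small, self-contained policy class. This is only an illustration in plain Java; the class and field names are hypothetical and do not correspond to HBase's actual FSHLog/LogRoller configuration keys:

```java
// Sketch of the mitigation described above: request a WAL roll when a sync
// exceeds a threshold, but never more than once per throttle interval.
// All names and defaults here are illustrative, not HBase's real config.
public class SlowSyncRollPolicy {
    private final long slowSyncThresholdMs; // e.g. 10 seconds
    private final long rollIntervalMs;      // e.g. 5 minutes, per the issue
    private long lastRollTimeMs;

    public SlowSyncRollPolicy(long slowSyncThresholdMs, long rollIntervalMs) {
        this.slowSyncThresholdMs = slowSyncThresholdMs;
        this.rollIntervalMs = rollIntervalMs;
        // Start "one interval in the past" so the first slow sync may roll.
        this.lastRollTimeMs = -rollIntervalMs;
    }

    /** Returns true if a roll should be requested for this sync observation. */
    public synchronized boolean shouldRoll(long syncDurationMs, long nowMs) {
        if (syncDurationMs < slowSyncThresholdMs) {
            return false; // pipeline looks healthy enough
        }
        if (nowMs - lastRollTimeMs < rollIntervalMs) {
            return false; // rolled too recently; throttle to avoid roll storms
        }
        lastRollTimeMs = nowMs;
        return true;
    }
}
```

The throttle is what keeps the mitigation from making a fleet-wide problem worse: if every sync is slow because HDFS itself is sick, the policy still rolls at most once per interval.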
[jira] [Created] (HBASE-22334) handle blocking RPC threads better (time out calls? )
Sergey Shelukhin created HBASE-22334: Summary: handle blocking RPC threads better (time out calls? ) Key: HBASE-22334 URL: https://issues.apache.org/jira/browse/HBASE-22334 Project: HBase Issue Type: Bug Reporter: Sergey Shelukhin Combined with HBASE-22333, we had the case where a user sent lots of create table requests with pre-split for the same table (because the tasks of some job would try to create the table opportunistically if it doesn't exist, and there were many such tasks); these requests took up all the RPC threads and caused a large call queue to form; then, the first call got stuck because RS calls to report an opened region were stuck in the queue. All the other calls were stuck here: {noformat} submitProcedure( new CreateTableProcedure(procedureExecutor.getEnvironment(), desc, newRegions, latch)); latch.await(); {noformat} The procedures in this case were stuck for hours; even if the other issue were resolved, assigning 1000s of regions can take a long time and cause lots of delay before it unblocks the other procedures and allows them to release the latch. In general, waiting on an RPC thread is not a good idea. I wonder if it would make sense to fail client requests taking up the RPC thread based on a timeout; or if they are not making progress (e.g. in this case, the procedure is not getting updated; might need to be handled on a case-by-case basis). -- This message was sent by Atlassian JIRA (v7.6.3#76005)
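One way to avoid parking an RPC handler indefinitely, as the issue suggests, is a timed wait on the completion latch. A minimal sketch using `java.util.concurrent` directly (this mirrors the JDK API only; HBase's actual procedure latch wiring differs):

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Sketch of failing an RPC call instead of blocking its handler thread
// forever: wait on the procedure-completion latch with a timeout and
// surface a retryable error to the client if the procedure is still running.
public class TimedProcedureWait {
    public static void awaitOrFail(CountDownLatch latch, long timeoutMs)
            throws TimeoutException, InterruptedException {
        if (!latch.await(timeoutMs, TimeUnit.MILLISECONDS)) {
            // The handler thread is freed; the procedure keeps running in
            // the background and the client can poll or retry later.
            throw new TimeoutException(
                "procedure still running after " + timeoutMs + " ms; retry later");
        }
    }
}
```

This trades a blocked handler for a retryable client-side error, which matches the "fail client requests taking up the RPC thread based on a timeout" idea above; detecting lack of progress (the procedure not getting updated) would need extra bookkeeping per procedure type.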
[jira] [Updated] (HBASE-22333) move certain internal RPCs to high priority threadpool
[ https://issues.apache.org/jira/browse/HBASE-22333?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HBASE-22333: - Summary: move certain internal RPCs to high priority threadpool (was: move certain internal RPCs to high priority level) > move certain internal RPCs to high priority threadpool > -- > > Key: HBASE-22333 > URL: https://issues.apache.org/jira/browse/HBASE-22333 > Project: HBase > Issue Type: Bug >Reporter: Sergey Shelukhin >Priority: Major > > User calls can inadvertently DDoS master (and potentially RS), causing issues > (e.g. CallQueueTooBig) for important system calls like > reportRegionStateTransition. > These calls should be moved to high pri level... I wonder if all the > low-volume internal calls (i.e. except heartbeats and maybe WAL splitting > stuff) should have higher pri (e.g. 20 QoS in HConstants). -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HBASE-22333) move certain internal RPCs to high priority level
Sergey Shelukhin created HBASE-22333: Summary: move certain internal RPCs to high priority level Key: HBASE-22333 URL: https://issues.apache.org/jira/browse/HBASE-22333 Project: HBase Issue Type: Bug Reporter: Sergey Shelukhin User calls can inadvertently DDoS master (and potentially RS), causing issues (e.g. CallQueueTooBig) for important system calls like reportRegionStateTransition. These calls should be moved to high pri level... I wonder if all the low-volume internal calls (i.e. except heartbeats and maybe WAL splitting stuff) should have higher pri (e.g. 20 QoS in HConstants). -- This message was sent by Atlassian JIRA (v7.6.3#76005)
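The isolation the issue asks for can be illustrated with a toy dispatcher that routes calls to separate pools by priority. Everything here is hypothetical: the constant value 20 is taken from the issue text, and the pool sizes and class names do not reflect HBase's actual RpcScheduler implementation:

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Toy sketch of priority-based RPC dispatch: low-volume system calls such as
// reportRegionStateTransition get a dedicated pool, so a flood of user
// requests cannot starve them or fill their call queue.
public class PriorityDispatcher {
    static final int HIGH_QOS = 20; // placeholder value from the issue text

    private final ExecutorService highPriPool = Executors.newFixedThreadPool(2);
    private final ExecutorService normalPool  = Executors.newFixedThreadPool(8);

    public void dispatch(int callPriority, Runnable call) {
        if (callPriority >= HIGH_QOS) {
            highPriPool.execute(call); // internal/system calls: never queued
        } else {                       //   behind user traffic
            normalPool.execute(call);  // user calls; may back up under load
        }
    }

    /** Drain both pools; used here only to make the sketch testable. */
    public void shutdown() throws InterruptedException {
        highPriPool.shutdown();
        normalPool.shutdown();
        highPriPool.awaitTermination(5, TimeUnit.SECONDS);
        normalPool.awaitTermination(5, TimeUnit.SECONDS);
    }
}
```

The key property is that a CallQueueTooBig condition on the user pool leaves the high-priority pool's queue untouched, which is exactly the failure mode described above.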
[jira] [Commented] (HBASE-22333) move certain internal RPCs to high priority level
[ https://issues.apache.org/jira/browse/HBASE-22333?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16829761#comment-16829761 ] Sergey Shelukhin commented on HBASE-22333: -- cc [~bahramch] > move certain internal RPCs to high priority level > - > > Key: HBASE-22333 > URL: https://issues.apache.org/jira/browse/HBASE-22333 > Project: HBase > Issue Type: Bug >Reporter: Sergey Shelukhin >Priority: Major > > User calls can inadvertently DDoS master (and potentially RS), causing issues > (e.g. CallQueueTooBig) for important system calls like > reportRegionStateTransition. > These calls should be moved to high pri level... I wonder if all the > low-volume internal calls (i.e. except heartbeats and maybe WAL splitting > stuff) should have higher pri (e.g. 20 QoS in HConstants). -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-22081) master shutdown: close RpcServer and procWAL first thing
[ https://issues.apache.org/jira/browse/HBASE-22081?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16827407#comment-16827407 ] Sergey Shelukhin commented on HBASE-22081: -- Before the patch, it is mere coincidence that while all the other stuff is shutting down, the RPC that caused it has a chance to return. The caller could get unlucky and the stop RPC would fail because the RPC server was closed... now that we shut down the RPC server first thing, it happens almost all the time. Added a small sleep before starting shutdown if it was triggered by an RPC request 0_o Unfortunately it doesn't seem to be possible externally to wait for RPC responses to finish. > master shutdown: close RpcServer and procWAL first thing > > > Key: HBASE-22081 > URL: https://issues.apache.org/jira/browse/HBASE-22081 > Project: HBase > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin >Priority: Major > Attachments: HBASE-22081.01.patch, HBASE-22081.patch > > > I had a master get stuck due to HBASE-22079 and noticed it was logging RS > abort messages during shutdown. > [~bahramch] found some issues where messages are processed by the old master > during shutdown due to a race condition in the RS cache (or it could also happen > due to a network race). > Previously I found some bug where an SCP was created during master shutdown that > had incorrect state (because some structures already got cleaned). > I think before master fencing is implemented we can at least make these > issues much less likely by thinking about shutdown order. > 1) First kill the RPC server so we don't receive any more messages. There's no > need to receive messages when we are shutting down. Server heartbeats could > be impacted I guess, but I don't think they will be, because we currently only > kill RS on ZK timeout. > 2) Then do whatever cleanup we think is needed that requires proc wal. > 3) Then close proc WAL so no errant threads can create more procs. > 4) Then do whatever other cleanup. 
> 5) Finally delete znode. > Right now znode is deleted somewhat early I think, and RpcServer is closed > very late. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
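The five-step ordering proposed above can be sketched as a single shutdown method whose call order is the whole point. Every method name below is a placeholder for the corresponding HMaster step, not a real HBase API; the list exists only so the ordering is observable:

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of the shutdown ordering proposed in the issue. Each step is a
// stub that records its name, so the intended sequence is explicit.
public class MasterShutdownOrder {
    final List<String> steps = new ArrayList<>();

    public void shutdown() {
        stopRpcServer();           // 1. stop receiving messages first
        cleanupRequiringProcWal(); // 2. cleanup that still needs the proc WAL
        closeProcWal();            // 3. so no errant thread creates more procs
        cleanupRemaining();        // 4. whatever other cleanup
        deleteZnode();             // 5. only now give up mastership in ZK
    }

    void stopRpcServer()           { steps.add("rpc"); }
    void cleanupRequiringProcWal() { steps.add("procCleanup"); }
    void closeProcWal()            { steps.add("procWal"); }
    void cleanupRemaining()        { steps.add("cleanup"); }
    void deleteZnode()             { steps.add("znode"); }
}
```

Moving the znode deletion to the end, after the RPC server and proc WAL are gone, is what narrows the window in which a not-quite-dead master can still act on incoming messages.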