[jira] [Updated] (HBASE-22408) add a metric for regions OPEN on non-live servers

2019-06-10 Thread Sergey Shelukhin (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-22408?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HBASE-22408:
-
   Resolution: Fixed
Fix Version/s: 2.3.0
               3.0.0
       Status: Resolved  (was: Patch Available)

Committed to master and branch-2.
Thanks [~Apache9] for the review. 


> add a metric for regions OPEN on non-live servers
> -
>
> Key: HBASE-22408
> URL: https://issues.apache.org/jira/browse/HBASE-22408
> Project: HBase
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
> Fix For: 3.0.0, 2.3.0
>
>
> This serves 2 purposes for monitoring:
> 1) Catching when regions are on dead servers due to long WAL splitting or 
> other delays in SCP. At that time, the regions are not listed as RITs; we'd 
> like to be able to have alerts in such cases.
> 2) Catching various bugs in assignment and procWAL corruption, etc. that 
> leave region "OPEN" on a server that no longer exists, again to alert the 
> administrator via a metric.
> Later, it might be possible to add more logic to distinguish 1 and 2, and to 
> mitigate 2 automatically and also set some metric to alert the administrator 
> to investigate later.
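
To make the intent of the metric concrete, here is a minimal sketch of how such a
count could be derived from the master's view of region states and live servers.
The class and method names below are hypothetical stand-ins for illustration, not
the code in the actual patch.

{code:java}
// Hypothetical sketch: count regions that are OPEN on servers the master no
// longer considers live. Names below are illustrative, not HBase internals.
import java.util.Map;
import java.util.Set;

public final class OrphanOpenRegionMetric {

  /**
   * @param openRegionToServer map of region name to server name for regions in OPEN state
   * @param liveServers        set of server names currently known to be live
   * @return number of regions reported OPEN on a server that is not live
   */
  public static long countOpenOnNonLiveServers(Map<String, String> openRegionToServer,
      Set<String> liveServers) {
    long orphaned = 0;
    for (Map.Entry<String, String> e : openRegionToServer.entrySet()) {
      // A region stuck here is not a RIT, so it would otherwise be invisible to alerting.
      if (!liveServers.contains(e.getValue())) {
        orphaned++;
      }
    }
    return orphaned;  // exported as a gauge so operators can alert on it
  }
}
{code}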



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-22289) WAL-based log splitting resubmit threshold may result in a task being stuck forever

2019-05-20 Thread Sergey Shelukhin (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-22289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16844157#comment-16844157
 ] 

Sergey Shelukhin commented on HBASE-22289:
--

Thanks for taking it over the finish line! 
It shouldn't affect 2.2 and later versions, at least in this form, because this 
code has been replaced by procedures. They might have a similar bug but it 
would require a different fix.

> WAL-based log splitting resubmit threshold may result in a task being stuck 
> forever
> ---
>
> Key: HBASE-22289
> URL: https://issues.apache.org/jira/browse/HBASE-22289
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.1.0, 1.5.0
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
> Fix For: 2.0.6, 2.1.5
>
> Attachments: HBASE-22289.01-branch-2.1.patch, 
> HBASE-22289.02-branch-2.1.patch, HBASE-22289.03-branch-2.1.patch, 
> HBASE-22289.branch-2.1.001.patch, HBASE-22289.branch-2.1.001.patch, 
> HBASE-22289.branch-2.1.001.patch
>
>
> Not sure if this is handled better in procedure-based WAL splitting; in any 
> case it affects versions before that.
> The problem is not in ZK as such but in internal state tracking in the master, 
> it seems.
> Master:
> {noformat}
> 2019-04-21 01:49:49,584 INFO  
> [master/:17000.splitLogManager..Chore.1] 
> coordination.SplitLogManagerCoordination: Resubmitting task 
> .1555831286638
> {noformat}
> worker-rs, split fails 
> {noformat}
> 
> 2019-04-21 02:05:31,774 INFO  
> [RS_LOG_REPLAY_OPS-regionserver/:17020-1] wal.WALSplitter: 
> Processed 24 edits across 2 regions; edits skipped=457; log 
> file=.1555831286638, length=2156363702, corrupted=false, progress 
> failed=true
> {noformat}
> Master (not sure about the delay of the acquired-message; at any rate it 
> seems to detect the failure fine from this server)
> {noformat}
> 2019-04-21 02:11:14,928 INFO  [main-EventThread] 
> coordination.SplitLogManagerCoordination: Task .1555831286638 acquired 
> by ,17020,139815097
> 2019-04-21 02:19:41,264 INFO  
> [master/:17000.splitLogManager..Chore.1] 
> coordination.SplitLogManagerCoordination: Skipping resubmissions of task 
> .1555831286638 because threshold 3 reached
> {noformat}
> After that, this task is stuck in limbo forever with the old worker and is 
> never resubmitted. 
> The RS never logs anything else for this task.
> Killing the RS on the worker unblocked the task and some other server did the 
> split very quickly, so it seems like the master doesn't clear the worker name 
> in its internal state when hitting the threshold... The master was never 
> restarted, so restarting the master might have also cleared it.
> This is extracted from SplitLogManager log messages; note the times.
> {noformat}
> 2019-04-21 02:2   1555831286638=last_update = 1555837874928 last_version = 11 
> cur_worker_name = ,17020,139815097 status = in_progress 
> incarnation = 3 resubmits = 3 batch = installed = 24 done = 3 error = 20, 
> 
> 2019-04-22 11:1   1555831286638=last_update = 1555837874928 last_version = 11 
> cur_worker_name = ,17020,139815097 status = in_progress 
> incarnation = 3 resubmits = 3 batch = installed = 24 done = 3 error = 20}
> {noformat}
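
A minimal sketch of the suspected failure mode described above: once the resubmit
threshold is hit, the task keeps its stale worker and is never handed to another
server. The Task class, field names, and method below are simplified illustrations,
not the actual SplitLogManager code.

{code:java}
// Illustrative sketch of the resubmit-threshold behavior described above.
// Names are simplified; this is not the HBase implementation.
public class SplitTaskTrackerSketch {

  static class Task {
    String curWorkerName;   // worker currently holding the task
    int resubmits;          // how many times the task was resubmitted
  }

  static final int RESUBMIT_THRESHOLD = 3;

  /** Called periodically by the manager chore for a task whose worker failed. */
  boolean maybeResubmit(Task task) {
    if (task.resubmits >= RESUBMIT_THRESHOLD) {
      // Observed behavior: the task is skipped but curWorkerName is left pointing
      // at the failed worker, so the task stays "in_progress" forever unless that
      // RS dies or the master restarts. Clearing the stale worker here (or letting
      // the task return to the unassigned pool) is the kind of fix implied above.
      return false;
    }
    task.resubmits++;
    task.curWorkerName = null;  // task goes back to the pool for any worker to grab
    return true;
  }
}
{code}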



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Comment Edited] (HBASE-22432) HRegionServer rssStub handling is incorrect and inconsistent

2019-05-15 Thread Sergey Shelukhin (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-22432?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16840884#comment-16840884
 ] 

Sergey Shelukhin edited comment on HBASE-22432 at 5/16/19 12:46 AM:


Let's see how many tests this breaks. If the change in behavior breaks too 
many, I will add an explicit flag to distinguish refresh vs. shutdown and read 
that in ensure... and replace the best-effort stuff with an atomic or something 
like that.


was (Author: sershe):
Let's see how many tests this breaks. If the change in behavior breaks too 
many, I will add explicit flag to distinguish refresh vs shutdown and read that 
in ensure...

> HRegionServer rssStub handling is incorrect and inconsistent
> 
>
> Key: HBASE-22432
> URL: https://issues.apache.org/jira/browse/HBASE-22432
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Critical
>
> Some methods refresh stub on null, some assume (incorrectly) server is 
> shutting down.
> Most methods reset stub to null on error, and also now we do it when ZK 
> changes.
> One of the latter can cause server reports to not be sent until one of the 
> former methods (e.g. reporting procedure completion) executes and happens to 
> restore the stub for them.
> Also, reset on error is done sometimes with and sometimes without a check.
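
To illustrate the inconsistency being described, here is a hedged sketch of the two
patterns and one way to make them uniform. The names (rssStub, the report methods,
the stub interface) are simplified stand-ins, not the actual HRegionServer code.

{code:java}
// Simplified illustration of the rssStub handling patterns described above.
// Names are stand-ins; this is not the HRegionServer implementation.
import java.util.concurrent.atomic.AtomicReference;

public class RssStubHandlingSketch {

  interface MasterStub { void report(String payload) throws Exception; }

  private final AtomicReference<MasterStub> rssStub = new AtomicReference<>();

  // Pattern A (desired): if the stub is null, try to recreate it before giving up.
  void reportWithRefresh(String payload) throws Exception {
    MasterStub stub = rssStub.get();
    if (stub == null) {
      stub = recreateStub();           // refresh instead of assuming shutdown
      rssStub.compareAndSet(null, stub);
    }
    callAndResetOnError(stub, payload);
  }

  // Pattern B (problematic): a null stub is treated as "server is shutting down",
  // so reports are silently dropped until some other method recreates the stub.
  void reportAssumingShutdown(String payload) throws Exception {
    MasterStub stub = rssStub.get();
    if (stub == null) {
      return;                          // report lost; nothing restores the stub here
    }
    callAndResetOnError(stub, payload);
  }

  private void callAndResetOnError(MasterStub stub, String payload) throws Exception {
    try {
      stub.report(payload);
    } catch (Exception e) {
      // Reset consistently (with a check), so the next caller refreshes the stub.
      rssStub.compareAndSet(stub, null);
      throw e;
    }
  }

  private MasterStub recreateStub() {
    return payload -> { /* placeholder: reconnect to the active master */ };
  }
}
{code}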



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-22432) HRegionServer rssStub handling is incorrect and inconsistent

2019-05-15 Thread Sergey Shelukhin (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-22432?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HBASE-22432:
-
Description: 
Some methods refresh stub on null, some assume (incorrectly) server is shutting 
down.
Most methods reset stub to null on error, and also now we do it when ZK changes.
One of the latter can cause server reports to not be sent until one of the 
former methods (e.g. reporting procedure completion) executes and happens to 
restore the stub for them.
Also, reset on error is done sometimes with and sometimes without a check.

  was:
Some methods refresh stub on null, some assume (incorrectly) server is shutting 
down.
Most methods reset stub to null on error, and also now we do it when ZK changes.
One of the latter can cause server reports to not be sent until one of the 
former methods (e.g. reporting procedure completion) executes and happens to 
restore the stub for them.
Reset is done sometimes with and sometimes without a check.


> HRegionServer rssStub handling is incorrect and inconsistent
> 
>
> Key: HBASE-22432
> URL: https://issues.apache.org/jira/browse/HBASE-22432
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Critical
>
> Some methods refresh stub on null, some assume (incorrectly) server is 
> shutting down.
> Most methods reset stub to null on error, and also now we do it when ZK 
> changes.
> One of the latter can cause server reports to not be sent until one of the 
> former methods (e.g. reporting procedure completion) executes and happens to 
> restore the stub for them.
> Also, reset on error is done sometimes with and sometimes without a check.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-22432) HRegionServer rssStub handling is incorrect and inconsistent

2019-05-15 Thread Sergey Shelukhin (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-22432?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HBASE-22432:
-
Affects Version/s: 3.0.0

> HRegionServer rssStub handling is incorrect and inconsistent
> 
>
> Key: HBASE-22432
> URL: https://issues.apache.org/jira/browse/HBASE-22432
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Critical
>
> Some methods refresh stub on null, some assume (incorrectly) server is 
> shutting down.
> Most methods reset stub to null on error, and also now we do it when ZK 
> changes.
> One of the latter can cause server reports to not be sent until one of the 
> former methods executes and happens to restore the stub.
> Reset is done sometimes with and sometimes without a check.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-22432) HRegionServer rssStub handling is incorrect and inconsistent

2019-05-15 Thread Sergey Shelukhin (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-22432?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HBASE-22432:
-
Description: 
Some methods refresh stub on null, some assume (incorrectly) server is shutting 
down.
Most methods reset stub to null on error, and also now we do it when ZK changes.
One of the latter can cause server reports to not be sent until one of the 
former methods (e.g. reporting procedure completion) executes and happens to 
restore the stub for them.
Reset is done sometimes with and sometimes without a check.

  was:
Some methods refresh stub on null, some assume (incorrectly) server is shutting 
down.
Most methods reset stub to null on error, and also now we do it when ZK changes.
One of the latter can cause server reports to not be sent until one of the 
former methods executes and happens to restore the stub.
Reset is done sometimes with and sometimes without a check.


> HRegionServer rssStub handling is incorrect and inconsistent
> 
>
> Key: HBASE-22432
> URL: https://issues.apache.org/jira/browse/HBASE-22432
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Critical
>
> Some methods refresh stub on null, some assume (incorrectly) server is 
> shutting down.
> Most methods reset stub to null on error, and also now we do it when ZK 
> changes.
> One of the latter can cause server reports to not be sent until one of the 
> former methods (e.g. reporting procedure completion) executes and happens to 
> restore the stub for them.
> Reset is done sometimes with and sometimes without a check.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Comment Edited] (HBASE-22432) HRegionServer rssStub handling is incorrect and inconsistent

2019-05-15 Thread Sergey Shelukhin (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-22432?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16840884#comment-16840884
 ] 

Sergey Shelukhin edited comment on HBASE-22432 at 5/16/19 12:44 AM:


Let's see how many tests this break. If the change in behavior breaks too many, 
I will add explicit flag to distinguish refresh vs shutdown and read that in 
ensure...


was (Author: sershe):
Let's see how many tests this break. If the change in behavior breaks too many, 
I will add explicit flag to distinguish refresh vs shutdown and read that in 
ensure...

> HRegionServer rssStub handling is incorrect and inconsistent
> 
>
> Key: HBASE-22432
> URL: https://issues.apache.org/jira/browse/HBASE-22432
> Project: HBase
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Critical
>
> Some methods refresh stub on null, some assume (incorrectly) server is 
> shutting down.
> Most methods reset stub to null on error, and also now we do it when ZK 
> changes.
> One of the latter can cause server reports to not be sent until one of the 
> former methods executes and happens to restore the stub.
> Reset is done sometimes with and sometimes without a check.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-22432) HRegionServer rssStub handling is incorrect and inconsistent

2019-05-15 Thread Sergey Shelukhin (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-22432?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HBASE-22432:
-
Description: 
Some methods refresh stub on null, some assume (incorrectly) server is shutting 
down.
Most methods reset stub to null on error, and also now we do it when ZK changes.
One of the latter can cause server reports to not be sent until one of the 
former methods executes and happens to restore the stub.
Reset is done sometimes with and sometimes without a check.

  was:
Some methods refresh stub on null, some assume (incorrectly) server is shutting 
down.
The latter can cause server reports to not be sent until one of the former 
methods executes and happens to restore the stub.
Reset is done sometimes with and sometimes without a check.


> HRegionServer rssStub handling is incorrect and inconsistent
> 
>
> Key: HBASE-22432
> URL: https://issues.apache.org/jira/browse/HBASE-22432
> Project: HBase
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
>
> Some methods refresh stub on null, some assume (incorrectly) server is 
> shutting down.
> Most methods reset stub to null on error, and also now we do it when ZK 
> changes.
> One of the latter can cause server reports to not be sent until one of the 
> former methods executes and happens to restore the stub.
> Reset is done sometimes with and sometimes without a check.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-22432) HRegionServer rssStub handling is incorrect and inconsistent

2019-05-15 Thread Sergey Shelukhin (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-22432?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HBASE-22432:
-
Status: Patch Available  (was: Open)

Let's see how many tests this breaks. If the change in behavior breaks too many, 
I will add an explicit flag to distinguish refresh vs. shutdown and read that in 
ensure...

> HRegionServer rssStub handling is incorrect and inconsistent
> 
>
> Key: HBASE-22432
> URL: https://issues.apache.org/jira/browse/HBASE-22432
> Project: HBase
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
>
> Some methods refresh stub on null, some assume (incorrectly) server is 
> shutting down.
> The latter can cause server reports to not be sent until one of the former 
> methods executes and happens to restore the stub.
> Reset is done sometimes with and sometimes without a check.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Comment Edited] (HBASE-22432) HRegionServer rssStub handling is incorrect and inconsistent

2019-05-15 Thread Sergey Shelukhin (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-22432?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16840884#comment-16840884
 ] 

Sergey Shelukhin edited comment on HBASE-22432 at 5/16/19 12:45 AM:


Let's see how many tests this breaks. If the change in behavior breaks too 
many, I will add explicit flag to distinguish refresh vs shutdown and read that 
in ensure...


was (Author: sershe):
Let's see how many tests this break. If the change in behavior breaks too many, 
I will add explicit flag to distinguish refresh vs shutdown and read that in 
ensure...

> HRegionServer rssStub handling is incorrect and inconsistent
> 
>
> Key: HBASE-22432
> URL: https://issues.apache.org/jira/browse/HBASE-22432
> Project: HBase
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Critical
>
> Some methods refresh stub on null, some assume (incorrectly) server is 
> shutting down.
> Most methods reset stub to null on error, and also now we do it when ZK 
> changes.
> One of the latter can cause server reports to not be sent until one of the 
> former methods executes and happens to restore the stub.
> Reset is done sometimes with and sometimes without a check.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-22432) HRegionServer rssStub handling is incorrect and inconsistent

2019-05-15 Thread Sergey Shelukhin (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-22432?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HBASE-22432:
-
Description: 
Some methods refresh stub on null, some assume (incorrectly) server is shutting 
down.
Most methods reset stub to null on error, and also now we do it when ZK changes.
One of the latter can cause server reports to not be sent until one of the 
former methods executes and happens to restore the stub.
Reset is done sometimes with and sometimes without a check.

  was:
Some methods refresh stub on null, some assume (incorrectly) server is shutting 
down.
Most methods reset stub to null on error, and also now we do it when ZK changes.
On of the latter can cause server reports to not be sent until one of the 
former methods executes and happens to restore the stub.
Reset is done sometimes with and sometimes without a check.


> HRegionServer rssStub handling is incorrect and inconsistent
> 
>
> Key: HBASE-22432
> URL: https://issues.apache.org/jira/browse/HBASE-22432
> Project: HBase
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Critical
>
> Some methods refresh stub on null, some assume (incorrectly) server is 
> shutting down.
> Most methods reset stub to null on error, and also now we do it when ZK 
> changes.
> One of the latter can cause server reports to not be sent until one of the 
> former methods executes and happens to restore the stub.
> Reset is done sometimes with and sometimes without a check.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-22432) HRegionServer rssStub handling is incorrect and inconsistent

2019-05-15 Thread Sergey Shelukhin (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-22432?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HBASE-22432:
-
Priority: Critical  (was: Major)

> HRegionServer rssStub handling is incorrect and inconsistent
> 
>
> Key: HBASE-22432
> URL: https://issues.apache.org/jira/browse/HBASE-22432
> Project: HBase
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Critical
>
> Some methods refresh stub on null, some assume (incorrectly) server is 
> shutting down.
> Most methods reset stub to null on error, and also now we do it when ZK 
> changes.
> One of the latter can cause server reports to not be sent until one of the 
> former methods executes and happens to restore the stub.
> Reset is done sometimes with and sometimes without a check.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (HBASE-22432) HRegionServer rssStub handling is incorrect and inconsistent

2019-05-15 Thread Sergey Shelukhin (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-22432?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin reassigned HBASE-22432:


Assignee: Sergey Shelukhin

> HRegionServer rssStub handling is incorrect and inconsistent
> 
>
> Key: HBASE-22432
> URL: https://issues.apache.org/jira/browse/HBASE-22432
> Project: HBase
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
>
> Some methods refresh stub on null, some assume (incorrectly) server is 
> shutting down.
> The latter can cause server reports to not be sent until one of the former 
> methods executes and happens to restore the stub.
> Reset is done sometimes with and sometimes without a check.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HBASE-22432) HRegionServer rssStub handling is incorrect and inconsistent

2019-05-15 Thread Sergey Shelukhin (JIRA)
Sergey Shelukhin created HBASE-22432:


 Summary: HRegionServer rssStub handling is incorrect and 
inconsistent
 Key: HBASE-22432
 URL: https://issues.apache.org/jira/browse/HBASE-22432
 Project: HBase
  Issue Type: Bug
Reporter: Sergey Shelukhin


Some methods refresh stub on null, some assume (incorrectly) server is shutting 
down.
The latter can cause server reports to not be sent until one of the former 
methods executes and happens to restore the stub.
Reset is done sometimes with and sometimes without a check.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HBASE-22428) better client-side throttling for dropped calls

2019-05-15 Thread Sergey Shelukhin (JIRA)
Sergey Shelukhin created HBASE-22428:


 Summary: better client-side throttling for dropped calls
 Key: HBASE-22428
 URL: https://issues.apache.org/jira/browse/HBASE-22428
 Project: HBase
  Issue Type: Bug
Reporter: Sergey Shelukhin


Not sure yet how to implement this better. Either when we get 
CallTimeoutException on the client, or by having the timeout on the server be 
less than the RPC timeout so it can actually respond to the client, we could do 
a better job of throttling retries.
Right now, if multiple clients are overloading a server and calls start to be 
dropped, they all just retry and keep the server overloaded. The server might 
have to track when requests from a client timed out, so it can fail more 
aggressively when processing time is high. 
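
As a rough illustration of the client-side piece, here is a hedged sketch of backing
off more aggressively when a call appears to have been dropped due to timeout. The
exception placeholder and backoff constants are made up for illustration; this is not
an HBase API.

{code:java}
// Hypothetical sketch: back off harder when calls time out (i.e. were likely
// dropped by an overloaded server) than on ordinary retriable errors.
import java.util.concurrent.ThreadLocalRandom;

public final class DroppedCallBackoffSketch {

  /** Returns how long to sleep before the next retry attempt. */
  static long backoffMillis(int attempt, boolean timedOut) {
    long base = timedOut ? 1000 : 100;               // penalize dropped calls more
    long cap = timedOut ? 60_000 : 10_000;
    long exp = Math.min(cap, base * (1L << Math.min(attempt, 16)));
    // Full jitter so many overloaded clients don't retry in lockstep.
    return ThreadLocalRandom.current().nextLong(exp + 1);
  }

  public static void main(String[] args) {
    for (int attempt = 0; attempt < 5; attempt++) {
      System.out.println("attempt " + attempt + ": sleep up to "
          + backoffMillis(attempt, true) + " ms after a timed-out call");
    }
  }
}
{code}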



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-22410) add the notion of the expected # of servers for non-fixed server sets; report an alternative dead server metric

2019-05-13 Thread Sergey Shelukhin (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-22410?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HBASE-22410:
-
Status: Patch Available  (was: Open)

[~andrew.purt...@gmail.com] [~busbey] I cloned HBASE-22107 to add a better 
metric for the compute/etc. scenarios... Cleaning the dead region server list 
after a timeout still wouldn't provide a reliable number, although it could 
still be done for maintainability in the original JIRA.

> add the notion of the expected # of servers for non-fixed server sets; report 
> an alternative dead server metric
> ---
>
> Key: HBASE-22410
> URL: https://issues.apache.org/jira/browse/HBASE-22410
> Project: HBase
>  Issue Type: Improvement
>  Components: Operability
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
>
> dead servers appear to only be cleaned up when a server comes up on the same 
> host and port; however, if HBase is running on smth like YARN with many more 
> hosts than RSes, RS may come up on a different server and the dead one will 
> never be cleaned.
> The metric should be improved to account for that... it will potentially 
> require configuring master with expected number of region servers, so that 
> the metric could be output based on that.
> Dead server list should also be expired based on timestamp in such cases.
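
A minimal sketch of the kind of metric the description above suggests, assuming the
master can be configured with an expected region server count. The config key, class,
and numbers are made up for illustration, not part of the actual proposal.

{code:java}
// Hypothetical sketch of the proposed metric: compare the configured expected
// number of region servers with the live count, instead of relying on a dead
// server list that may never be cleaned on elastic clusters. Names are made up.
public final class ExpectedServersMetricSketch {

  // Illustrative config key, not a real HBase setting.
  static final String EXPECTED_SERVERS_KEY = "hbase.master.expected.regionservers";

  /** @return how many region servers are missing relative to the expected count. */
  static int missingRegionServers(int expectedServers, int liveServers) {
    if (expectedServers <= 0) {
      return 0;  // not configured; metric stays at zero rather than guessing
    }
    return Math.max(0, expectedServers - liveServers);
  }

  public static void main(String[] args) {
    // e.g. expecting 50 RSes (say, containers on YARN) but only 47 reported in.
    System.out.println("missing=" + missingRegionServers(50, 47));
  }
}
{code}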



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-22410) add the notion of the expected # of servers for non-fixed server sets; report an alternative dead server metric

2019-05-13 Thread Sergey Shelukhin (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-22410?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HBASE-22410:
-
Summary: add the notion of the expected # of servers for non-fixed server 
sets; report an alternative dead server metric  (was: add the notion of the 
expected # of servers and report a metric as an alternative to dead server 
metric for non-fixed server sets )

> add the notion of the expected # of servers for non-fixed server sets; report 
> an alternative dead server metric
> ---
>
> Key: HBASE-22410
> URL: https://issues.apache.org/jira/browse/HBASE-22410
> Project: HBase
>  Issue Type: Improvement
>  Components: Operability
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
>
> dead servers appear to only be cleaned up when a server comes up on the same 
> host and port; however, if HBase is running on smth like YARN with many more 
> hosts than RSes, RS may come up on a different server and the dead one will 
> never be cleaned.
> The metric should be improved to account for that... it will potentially 
> require configuring master with expected number of region servers, so that 
> the metric could be output based on that.
> Dead server list should also be expired based on timestamp in such cases.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HBASE-22410) add the notion of the expected # of servers and report a metric as an alternative to dead server metric for non-fixed server sets

2019-05-13 Thread Sergey Shelukhin (JIRA)
Sergey Shelukhin created HBASE-22410:


 Summary: add the notion of the expected # of servers and report a 
metric as an alternative to dead server metric for non-fixed server sets 
 Key: HBASE-22410
 URL: https://issues.apache.org/jira/browse/HBASE-22410
 Project: HBase
  Issue Type: Improvement
  Components: Operability
Reporter: Sergey Shelukhin


dead servers appear to only be cleaned up when a server comes up on the same 
host and port; however, if HBase is running on smth like YARN with many more 
hosts than RSes, RS may come up on a different server and the dead one will 
never be cleaned.
The metric should be improved to account for that... it will potentially 
require configuring master with expected number of region servers, so that the 
metric could be output based on that.
Dead server list should also be expired based on timestamp in such cases.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-22410) add the notion of the expected # of servers and report a metric as an alternative to dead server metric for non-fixed server sets

2019-05-13 Thread Sergey Shelukhin (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-22410?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HBASE-22410:
-
Priority: Major  (was: Minor)

> add the notion of the expected # of servers and report a metric as an 
> alternative to dead server metric for non-fixed server sets 
> --
>
> Key: HBASE-22410
> URL: https://issues.apache.org/jira/browse/HBASE-22410
> Project: HBase
>  Issue Type: Improvement
>  Components: Operability
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
>
> dead servers appear to only be cleaned up when a server comes up on the same 
> host and port; however, if HBase is running on smth like YARN with many more 
> hosts than RSes, RS may come up on a different server and the dead one will 
> never be cleaned.
> The metric should be improved to account for that... it will potentially 
> require configuring master with expected number of region servers, so that 
> the metric could be output based on that.
> Dead server list should also be expired based on timestamp in such cases.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (HBASE-22410) add the notion of the expected # of servers and report a metric as an alternative to dead server metric for non-fixed server sets

2019-05-13 Thread Sergey Shelukhin (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-22410?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin reassigned HBASE-22410:


Assignee: Sergey Shelukhin

> add the notion of the expected # of servers and report a metric as an 
> alternative to dead server metric for non-fixed server sets 
> --
>
> Key: HBASE-22410
> URL: https://issues.apache.org/jira/browse/HBASE-22410
> Project: HBase
>  Issue Type: Improvement
>  Components: Operability
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Minor
>
> dead servers appear to only be cleaned up when a server comes up on the same 
> host and port; however, if HBase is running on smth like YARN with many more 
> hosts than RSes, RS may come up on a different server and the dead one will 
> never be cleaned.
> The metric should be improved to account for that... it will potentially 
> require configuring master with expected number of region servers, so that 
> the metric could be output based on that.
> Dead server list should also be expired based on timestamp in such cases.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-22408) add a metric for regions OPEN on non-live servers

2019-05-13 Thread Sergey Shelukhin (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-22408?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HBASE-22408:
-
Status: Patch Available  (was: Open)

Posted a PR. [~Apache9], do you mind taking a look? These metrics should be 
useful for catching assignment issues and delays.

> add a metric for regions OPEN on non-live servers
> -
>
> Key: HBASE-22408
> URL: https://issues.apache.org/jira/browse/HBASE-22408
> Project: HBase
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
>
> This serves 2 purposes for monitoring:
> 1) Catching when regions are on dead servers due to long WAL splitting or 
> other delays in SCP. At that time, the regions are not listed as RITs; we'd 
> like to be able to have alerts in such cases.
> 2) Catching various bugs in assignment and procWAL corruption, etc. that 
> leave region "OPEN" on a server that no longer exists, again to alert the 
> administrator via a metric.
> Later, it might be possible to add more logic to distinguish 1 and 2, and to 
> mitigate 2 automatically and also set some metric to alert the administrator 
> to investigate later.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HBASE-22408) add a metric for regions OPEN on non-live servers

2019-05-13 Thread Sergey Shelukhin (JIRA)
Sergey Shelukhin created HBASE-22408:


 Summary: add a metric for regions OPEN on non-live servers
 Key: HBASE-22408
 URL: https://issues.apache.org/jira/browse/HBASE-22408
 Project: HBase
  Issue Type: Bug
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin


This serves 2 purposes for monitoring:
1) Catching when regions are on dead servers due to long WAL splitting or other 
delays in SCP; at that time, the regions are not listed as RITs; we'd like to 
be able to have alerts in such cases.
2) Catching various bugs in assignment and procWAL corruption, etc. that leave 
region "OPEN" on a server that no longer exists, again to alert the 
administrator via a metric.

Later, it might be possible to add more logic to distinguish 1 and 2, and add 
logic to mitigate 2 automatically and also set some metric to alert the 
administrator to investigate later.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-22408) add a metric for regions OPEN on non-live servers

2019-05-13 Thread Sergey Shelukhin (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-22408?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HBASE-22408:
-
Description: 
This serves 2 purposes for monitoring:
1) Catching when regions are on dead servers due to long WAL splitting or other 
delays in SCP. At that time, the regions are not listed as RITs; we'd like to 
be able to have alerts in such cases.
2) Catching various bugs in assignment and procWAL corruption, etc. that leave 
region "OPEN" on a server that no longer exists, again to alert the 
administrator via a metric.

Later, it might be possible to add more logic to distinguish 1 and 2, and add 
logic to mitigate 2 automatically and also set some metric to alert the 
administrator to investigate later.

  was:
This serves 2 purposes for monitoring:
1) Catching when regions are on dead servers due to long WAL splitting or other 
delays in SCP; at that time, the regions are not listed as RITs; we'd like to 
be able to have alerts in such cases.
2) Catching various bugs in assignment and procWAL corruption, etc. that leave 
region "OPEN" on a server that no longer exists, again to alert the 
administrator via a metric.

Later, it might be possible to add more logic to distinguish 1 and 2, and add 
logic to mitigate 2 automatically and also set some metric to alert the 
administrator to investigate later.


> add a metric for regions OPEN on non-live servers
> -
>
> Key: HBASE-22408
> URL: https://issues.apache.org/jira/browse/HBASE-22408
> Project: HBase
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
>
> This serves 2 purposes for monitoring:
> 1) Catching when regions are on dead servers due to long WAL splitting or 
> other delays in SCP. At that time, the regions are not listed as RITs; we'd 
> like to be able to have alerts in such cases.
> 2) Catching various bugs in assignment and procWAL corruption, etc. that 
> leave region "OPEN" on a server that no longer exists, again to alert the 
> administrator via a metric.
> Later, it might be possible to add more logic to distinguish 1 and 2, and add 
> logic to mitigate 2 automatically and also set some metric to alert the 
> administrator to investigate later.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-22408) add a metric for regions OPEN on non-live servers

2019-05-13 Thread Sergey Shelukhin (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-22408?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HBASE-22408:
-
Description: 
This serves 2 purposes for monitoring:
1) Catching when regions are on dead servers due to long WAL splitting or other 
delays in SCP. At that time, the regions are not listed as RITs; we'd like to 
be able to have alerts in such cases.
2) Catching various bugs in assignment and procWAL corruption, etc. that leave 
region "OPEN" on a server that no longer exists, again to alert the 
administrator via a metric.

Later, it might be possible to add more logic to distinguish 1 and 2, and to 
mitigate 2 automatically and also set some metric to alert the administrator to 
investigate later.

  was:
This serves 2 purposes for monitoring:
1) Catching when regions are on dead servers due to long WAL splitting or other 
delays in SCP. At that time, the regions are not listed as RITs; we'd like to 
be able to have alerts in such cases.
2) Catching various bugs in assignment and procWAL corruption, etc. that leave 
region "OPEN" on a server that no longer exists, again to alert the 
administrator via a metric.

Later, it might be possible to add more logic to distinguish 1 and 2, and add 
logic to mitigate 2 automatically and also set some metric to alert the 
administrator to investigate later.


> add a metric for regions OPEN on non-live servers
> -
>
> Key: HBASE-22408
> URL: https://issues.apache.org/jira/browse/HBASE-22408
> Project: HBase
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
>
> This serves 2 purposes for monitoring:
> 1) Catching when regions are on dead servers due to long WAL splitting or 
> other delays in SCP. At that time, the regions are not listed as RITs; we'd 
> like to be able to have alerts in such cases.
> 2) Catching various bugs in assignment and procWAL corruption, etc. that 
> leave region "OPEN" on a server that no longer exists, again to alert the 
> administrator via a metric.
> Later, it might be possible to add more logic to distinguish 1 and 2, and to 
> mitigate 2 automatically and also set some metric to alert the administrator 
> to investigate later.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-22407) add an option to use Hadoop metrics tags for table metrics (and fix some issues in metrics)

2019-05-13 Thread Sergey Shelukhin (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-22407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16838856#comment-16838856
 ] 

Sergey Shelukhin commented on HBASE-22407:
--

Most of the changes are actually just refactoring, like moving code into 
overridable methods so it can be overridden.

> add an option to use Hadoop metrics tags for table metrics (and fix some 
> issues in metrics)
> ---
>
> Key: HBASE-22407
> URL: https://issues.apache.org/jira/browse/HBASE-22407
> Project: HBase
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
> Attachments: HBASE-22407.01.patch
>
>
> Currently table metrics are output using custom metrics names that clutter 
> various metrics lists and are impossible to (sanely) aggregate.
> We can use Hadoop MetricsTag to instead use tagging on a single metric (for a 
> given logical metric), allowing both per-table display and cross-table 
> aggregation on the other end.
> In this JIRA (patch coming) I'd like to add the ability to do that
> 1) Actual tagging in multiple paths that output table metrics.
> 2) The ugliest part - preventing server-level metrics from being output in 
> tags case to avoid duplicate metrics. Seems like a large refactor of the 
> metrics is in order (not included)...
> 3) Fixes for some issues where wrong metrics are output, metrics are not 
> output at all, exceptions like null Optional cause table metrics to not be 
> output forever, etc.
> 4) Renaming several table-level latency metrics to be consistent with 
> server-level latency metrics.
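
To make the tagging idea concrete, here is a hedged sketch using the Hadoop metrics2
MetricsTag mechanism: one record per table, with the table carried as a tag rather
than baked into the metric name. The source class, record name, and counter values
are illustrative only and do not reflect the actual patch.

{code:java}
// Sketch only: emit per-table metrics as tags on a shared metric name, so a
// downstream system can aggregate across tables or filter by the "table" tag.
// The source name and counter values are placeholders, not the HBase patch.
import java.util.Map;
import org.apache.hadoop.metrics2.MetricsCollector;
import org.apache.hadoop.metrics2.MetricsSource;
import org.apache.hadoop.metrics2.lib.Interns;

public class TaggedTableMetricsSourceSketch implements MetricsSource {

  private final Map<String, Long> readRequestsPerTable;

  public TaggedTableMetricsSourceSketch(Map<String, Long> readRequestsPerTable) {
    this.readRequestsPerTable = readRequestsPerTable;
  }

  @Override
  public void getMetrics(MetricsCollector collector, boolean all) {
    for (Map.Entry<String, Long> e : readRequestsPerTable.entrySet()) {
      collector.addRecord("TableMetrics")
          // The table name becomes a tag instead of part of the metric name...
          .tag(Interns.info("table", "Table name"), e.getKey())
          // ...so "readRequestCount" is a single logical metric across tables.
          .addCounter(Interns.info("readRequestCount", "Read requests"), e.getValue());
    }
  }
}
{code}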



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-22407) add an option to use Hadoop metrics tags for table metrics (and fix some issues in metrics)

2019-05-13 Thread Sergey Shelukhin (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-22407?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HBASE-22407:
-
Status: Patch Available  (was: Open)

> add an option to use Hadoop metrics tags for table metrics (and fix some 
> issues in metrics)
> ---
>
> Key: HBASE-22407
> URL: https://issues.apache.org/jira/browse/HBASE-22407
> Project: HBase
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
> Attachments: HBASE-22407.01.patch
>
>
> Currently table metrics are output using custom metrics names that clutter 
> various metrics lists and are impossible to (sanely) aggregate.
> We can use Hadoop MetricsTag to instead use tagging on a single metric (for a 
> given logical metric), allowing both per-table display and cross-table 
> aggregation on the other end.
> In this JIRA (patch coming) I'd like to add the ability to do that
> 1) Actual tagging in multiple paths that output table metrics.
> 2) The ugliest part - preventing server-level metrics from being output in 
> tags case to avoid duplicate metrics. Seems like a large refactor of the 
> metrics is in order (not included)...
> 3) Fixes for some issues where wrong metrics are output, metrics are not 
> output at all, exceptions like null Optional cause table metrics to not be 
> output forever, etc.
> 4) Renaming several table-level latency metrics to be consistent with 
> server-level latency metrics.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-22407) add an option to use Hadoop metrics tags for table metrics (and fix some issues in metrics)

2019-05-13 Thread Sergey Shelukhin (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-22407?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HBASE-22407:
-
Attachment: HBASE-22407.01.patch

> add an option to use Hadoop metrics tags for table metrics (and fix some 
> issues in metrics)
> ---
>
> Key: HBASE-22407
> URL: https://issues.apache.org/jira/browse/HBASE-22407
> Project: HBase
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
> Attachments: HBASE-22407.01.patch
>
>
> Currently table metrics are output using custom metrics names that clutter 
> various metrics lists and are impossible to (sanely) aggregate.
> We can use Hadoop MetricsTag to instead use tagging on a single metric (for a 
> given logical metric), allowing both per-table display and cross-table 
> aggregation on the other end.
> In this JIRA (patch coming) I'd like to add the ability to do that
> 1) Actual tagging in multiple paths that output table metrics.
> 2) The ugliest part - preventing server-level metrics from being output in 
> tags case to avoid duplicate metrics. Seems like a large refactor of the 
> metrics is in order (not included)...
> 3) Fixes for some issues where wrong metrics are output, metrics are not 
> output at all, exceptions like null Optional cause table metrics to not be 
> output forever, etc.
> 4) Renaming several table-level latency metrics to be consistent with 
> server-level latency metrics.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HBASE-22407) add an option to use Hadoop metrics tags for table metrics (and fix some issues in metrics)

2019-05-13 Thread Sergey Shelukhin (JIRA)
Sergey Shelukhin created HBASE-22407:


 Summary: add an option to use Hadoop metrics tags for table 
metrics (and fix some issues in metrics)
 Key: HBASE-22407
 URL: https://issues.apache.org/jira/browse/HBASE-22407
 Project: HBase
  Issue Type: Bug
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin


Currently table metrics are output using custom metrics names that clutter 
various metrics lists and are impossible to (sanely) aggregate.
We can use Hadoop MetricsTag to instead use tagging on a single metric (for a 
given logical metric), allowing both per-table display and cross-table 
aggregation on the other end.

In this JIRA (patch coming) I'd like to add the ability to do that
1) Actual tagging in multiple paths that output table metrics.
2) The ugliest part - preventing server-level metrics from being output in tags 
case to avoid duplicate metrics. Seems like a large refactor of the metrics is 
in order (not included)...
3) Fixes for some issues where wrong metrics are output, metrics are not output 
at all, exceptions like null Optional cause table metrics to not be output 
forever, etc.
4) Renaming several table-level latency metrics to be consistent with 
server-level latency metrics.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-22254) refactor and improve decommissioning logic

2019-05-13 Thread Sergey Shelukhin (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-22254?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16838786#comment-16838786
 ] 

Sergey Shelukhin commented on HBASE-22254:
--

The tests are now passing

> refactor and improve decommissioning logic
> --
>
> Key: HBASE-22254
> URL: https://issues.apache.org/jira/browse/HBASE-22254
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 3.0.0
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
> Attachments: HBASE-22254.01.patch, HBASE-22254.02.patch, 
> HBASE-22254.03.patch, HBASE-22254.patch
>
>
> Making some changes needed to support better decommissioning on large 
> clusters and with container mode; to test those and add clarity, I moved parts 
> of decommissioning logic from HMaster, Draining tracker, and ServerManager 
> into a separate class.
> Features added/improvements:
> 1) More resilient off-loading; right now off-loading fails for a subset of 
> regions in case of a single region failure; is never done on master restart, 
> etc.
> 2) Option to kill RS after off-loading (good for container mode HBase, e.g. 
> on YARN).
> 3) Option to specify machine names only to decommission, for the API to be 
> usable for an external system that doesn't care about HBase server names, or 
> e.g. multiple RS in containers on the same node.
> 4) Option to replace existing decommissioning list instead of adding to it 
> (the same; to avoid additionally remembering what was previously sent to 
> HBase).
> 5) Tests, comments ;)



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-22376) master can fail to start w/NPE if lastflushedseqids file is empty

2019-05-10 Thread Sergey Shelukhin (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-22376?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HBASE-22376:
-
Resolution: Fixed
Status: Resolved  (was: Patch Available)

Committed to master only; the code from HBASE-20727 only exists on master.
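
A hedged sketch of the kind of guard such a fix implies: skip deserialization when
the persisted file is empty (for example after an interrupted write) instead of
letting a parse of nothing fail with an NPE later. The loader class and file handling
below are placeholders, not the actual master code.

{code:java}
// Illustrative guard only: if the persisted sequence-id file is empty, skip
// parsing instead of failing master startup. Placeholder code, not HBase's.
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Optional;

public final class LastFlushedSeqIdLoaderSketch {

  static Optional<byte[]> readIfNonEmpty(Path file) throws IOException {
    if (!Files.exists(file) || Files.size(file) == 0) {
      return Optional.empty();   // nothing to restore; start with empty state
    }
    return Optional.of(Files.readAllBytes(file));
  }
}
{code}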

> master can fail to start w/NPE if lastflushedseqids file is empty
> -
>
> Key: HBASE-22376
> URL: https://issues.apache.org/jira/browse/HBASE-22376
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
> Fix For: 3.0.0
>
> Attachments: HBASE-22376.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-22376) master can fail to start w/NPE if lastflushedseqids file is empty

2019-05-10 Thread Sergey Shelukhin (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-22376?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HBASE-22376:
-
Affects Version/s: (was: 2.2.0)

> master can fail to start w/NPE if lastflushedseqids file is empty
> -
>
> Key: HBASE-22376
> URL: https://issues.apache.org/jira/browse/HBASE-22376
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
> Fix For: 3.0.0
>
> Attachments: HBASE-22376.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-22376) master can fail to start w/NPE if lastflushedseqids file is empty

2019-05-10 Thread Sergey Shelukhin (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-22376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16837672#comment-16837672
 ] 

Sergey Shelukhin commented on HBASE-22376:
--

Thanks for the review!

> master can fail to start w/NPE if lastflushedseqids file is empty
> -
>
> Key: HBASE-22376
> URL: https://issues.apache.org/jira/browse/HBASE-22376
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
> Fix For: 3.0.0
>
> Attachments: HBASE-22376.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-22376) master can fail to start w/NPE if lastflushedseqids file is empty

2019-05-10 Thread Sergey Shelukhin (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-22376?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HBASE-22376:
-
Fix Version/s: (was: 2.2.0)

> master can fail to start w/NPE if lastflushedseqids file is empty
> -
>
> Key: HBASE-22376
> URL: https://issues.apache.org/jira/browse/HBASE-22376
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 3.0.0, 2.2.0
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
> Fix For: 3.0.0
>
> Attachments: HBASE-22376.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-22254) refactor and improve decommissioning logic

2019-05-10 Thread Sergey Shelukhin (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-22254?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HBASE-22254:
-
Attachment: (was: HBASE-22254.03.patch)

> refactor and improve decommissioning logic
> --
>
> Key: HBASE-22254
> URL: https://issues.apache.org/jira/browse/HBASE-22254
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 3.0.0
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
> Attachments: HBASE-22254.01.patch, HBASE-22254.02.patch, 
> HBASE-22254.03.patch, HBASE-22254.patch
>
>
> Making some changes needed to support better decommissioning on large 
> clusters and with container mode; to test those and add clarity, I moved parts 
> of decommissioning logic from HMaster, Draining tracker, and ServerManager 
> into a separate class.
> Features added/improvements:
> 1) More resilient off-loading; right now off-loading fails for a subset of 
> regions in case of a single region failure; is never done on master restart, 
> etc.
> 2) Option to kill RS after off-loading (good for container mode HBase, e.g. 
> on YARN).
> 3) Option to specify machine names only to decommission, for the API to be 
> usable for an external system that doesn't care about HBase server names, or 
> e.g. multiple RS in containers on the same node.
> 4) Option to replace existing decommissioning list instead of adding to it 
> (the same; to avoid additionally remembering what was previously sent to 
> HBase).
> 5) Tests, comments ;)



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-22254) refactor and improve decommissioning logic

2019-05-10 Thread Sergey Shelukhin (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-22254?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HBASE-22254:
-
Attachment: HBASE-22254.03.patch

> refactor and improve decommissioning logic
> --
>
> Key: HBASE-22254
> URL: https://issues.apache.org/jira/browse/HBASE-22254
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 3.0.0
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
> Attachments: HBASE-22254.01.patch, HBASE-22254.02.patch, 
> HBASE-22254.03.patch, HBASE-22254.patch
>
>
> Making some changes needed to support better decommissioning on large 
> clusters and with container mode; to test those and add clarity, I moved parts 
> of decommissioning logic from HMaster, Draining tracker, and ServerManager 
> into a separate class.
> Features added/improvements:
> 1) More resilient off-loading; right now off-loading fails for a subset of 
> regions in case of a single region failure; is never done on master restart, 
> etc.
> 2) Option to kill RS after off-loading (good for container mode HBase, e.g. 
> on YARN).
> 3) Option to specify machine names only to decommission, for the API to be 
> usable for an external system that doesn't care about HBase server names, or 
> e.g. multiple RS in containers on the same node.
> 4) Option to replace existing decommissioning list instead of adding to it 
> (the same; to avoid additionally remembering what was previously sent to 
> HBase).
> 5) Tests, comments ;)



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-22254) refactor and improve decommissioning logic

2019-05-10 Thread Sergey Shelukhin (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-22254?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16837641#comment-16837641
 ] 

Sergey Shelukhin commented on HBASE-22254:
--

Fixed the admin test.

> refactor and improve decommissioning logic
> --
>
> Key: HBASE-22254
> URL: https://issues.apache.org/jira/browse/HBASE-22254
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 3.0.0
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
> Attachments: HBASE-22254.01.patch, HBASE-22254.02.patch, 
> HBASE-22254.03.patch, HBASE-22254.patch
>
>
> Making some changes needed to support better decommissioning on large 
> clusters and with container mode; to test those and add clarity, I moved parts 
> of decommissioning logic from HMaster, Draining tracker, and ServerManager 
> into a separate class.
> Features added/improvements:
> 1) More resilient off-loading; right now off-loading fails for a subset of 
> regions in case of a single region failure; is never done on master restart, 
> etc.
> 2) Option to kill RS after off-loading (good for container mode HBase, e.g. 
> on YARN).
> 3) Option to specify machine names only to decommission, for the API to be 
> usable for an external system that doesn't care about HBase server names, or 
> e.g. multiple RS in containers on the same node.
> 4) Option to replace existing decommissioning list instead of adding to it 
> (the same; to avoid additionally remembering what was previously sent to 
> HBase).
> 5) Tests, comments ;)



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-22254) refactor and improve decommissioning logic

2019-05-10 Thread Sergey Shelukhin (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-22254?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HBASE-22254:
-
Attachment: HBASE-22254.03.patch

> refactor and improve decommissioning logic
> --
>
> Key: HBASE-22254
> URL: https://issues.apache.org/jira/browse/HBASE-22254
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 3.0.0
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
> Attachments: HBASE-22254.01.patch, HBASE-22254.02.patch, 
> HBASE-22254.03.patch, HBASE-22254.patch
>
>
> Making some changes needed to support better decommissioning on large 
> clusters and with container mode; to test those and add clarity, I moved parts 
> of decommissioning logic from HMaster, Draining tracker, and ServerManager 
> into a separate class.
> Features added/improvements:
> 1) More resilient off-loading; right now off-loading fails for a subset of 
> regions in case of a single region failure; is never done on master restart, 
> etc.
> 2) Option to kill RS after off-loading (good for container mode HBase, e.g. 
> on YARN).
> 3) Option to specify machine names only to decommission, for the API to be 
> usable for an external system that doesn't care about HBase server names, or 
> e.g. multiple RS in containers on the same node.
> 4) Option to replace existing decommissioning list instead of adding to it 
> (the same; to avoid additionally remembering what was previously sent to 
> HBase).
> 5) Tests, comments ;)



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-22254) refactor and improve decommissioning logic

2019-05-10 Thread Sergey Shelukhin (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-22254?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16837585#comment-16837585
 ] 

Sergey Shelukhin commented on HBASE-22254:
--

Most test failures look spurious; the admin one looks real. Apparently offload 
cannot be tested (and there's no existing test for it) because there's only one 
server.

> refactor and improve decommissioning logic
> --
>
> Key: HBASE-22254
> URL: https://issues.apache.org/jira/browse/HBASE-22254
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 3.0.0
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
> Attachments: HBASE-22254.01.patch, HBASE-22254.02.patch, 
> HBASE-22254.patch
>
>
> Making some changes needed to support better decommissioning on large 
> clusters and with container mode; to test those and add clarity, I moved parts 
> of decommissioning logic from HMaster, Draining tracker, and ServerManager 
> into a separate class.
> Features added/improvements:
> 1) More resilient off-loading; right now off-loading fails for a subset of 
> regions in case of a single region failure; is never done on master restart, 
> etc.
> 2) Option to kill RS after off-loading (good for container mode HBase, e.g. 
> on YARN).
> 3) Option to specify machine names only to decommission, for the API to be 
> usable for an external system that doesn't care about HBase server names, or 
> e.g. multiple RS in containers on the same node.
> 4) Option to replace existing decommissioning list instead of adding to it 
> (the same; to avoid additionally remembering what was previously sent to 
> HBase).
> 5) Tests, comments ;)



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-22254) refactor and improve decommissioning logic

2019-05-10 Thread Sergey Shelukhin (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-22254?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16837491#comment-16837491
 ] 

Sergey Shelukhin commented on HBASE-22254:
--

The Ruby warnings are in code that I basically copy-pasted, so I'm going to 
ignore most of them.
Looking at the rest...

> refactor and improve decommissioning logic
> --
>
> Key: HBASE-22254
> URL: https://issues.apache.org/jira/browse/HBASE-22254
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 3.0.0
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
> Attachments: HBASE-22254.01.patch, HBASE-22254.02.patch, 
> HBASE-22254.patch
>
>
> Making some changes needed to support better decommissioning on large 
> clusters and with container mode; to test those and add clarity, I moved parts 
> of decommissioning logic from HMaster, Draining tracker, and ServerManager 
> into a separate class.
> Features added/improvements:
> 1) More resilient off-loading; right now off-loading fails for a subset of 
> regions in case of a single region failure; is never done on master restart, 
> etc.
> 2) Option to kill RS after off-loading (good for container mode HBase, e.g. 
> on YARN).
> 3) Option to specify machine names only to decommission, for the API to be 
> usable for an external system that doesn't care about HBase server names, or 
> e.g. multiple RS in containers on the same node.
> 4) Option to replace existing decommissioning list instead of adding to it 
> (the same; to avoid additionally remembering what was previously sent to 
> HBase).
> 5) Tests, comments ;)



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-22254) refactor and improve decommissioning logic

2019-05-09 Thread Sergey Shelukhin (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-22254?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16836817#comment-16836817
 ] 

Sergey Shelukhin commented on HBASE-22254:
--

Btw, these APIs were added in HBASE-17370 but the book wasn't updated; it still 
relies on znode creation there... I wonder if the book should be updated w/this 
patch.

> refactor and improve decommissioning logic
> --
>
> Key: HBASE-22254
> URL: https://issues.apache.org/jira/browse/HBASE-22254
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 3.0.0
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
> Attachments: HBASE-22254.01.patch, HBASE-22254.02.patch, 
> HBASE-22254.patch
>
>
> Making some changes needed to support better decommissioning on large 
> clusters and with container mode; to test those and add clarity, I moved parts 
> of decommissioning logic from HMaster, Draining tracker, and ServerManager 
> into a separate class.
> Features added/improvements:
> 1) More resilient off-loading; right now off-loading fails for a subset of 
> regions in case of a single region failure; is never done on master restart, 
> etc.
> 2) Option to kill RS after off-loading (good for container mode HBase, e.g. 
> on YARN).
> 3) Option to specify machine names only to decommission, for the API to be 
> usable for an external system that doesn't care about HBase server names, or 
> e.g. multiple RS in containers on the same node.
> 4) Option to replace existing decommissioning list instead of adding to it 
> (the same; to avoid additionally remembering what was previously sent to 
> HBase).
> 5) Tests, comments ;)



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-22254) refactor and improve decommissioning logic

2019-05-09 Thread Sergey Shelukhin (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-22254?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HBASE-22254:
-
Status: Patch Available  (was: Open)

Fixed the test; also made the client changes to make the new API features usable 
not just directly via a request.

> refactor and improve decommissioning logic
> --
>
> Key: HBASE-22254
> URL: https://issues.apache.org/jira/browse/HBASE-22254
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 3.0.0
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
> Attachments: HBASE-22254.01.patch, HBASE-22254.02.patch, 
> HBASE-22254.patch
>
>
> Making some changes needed to support better decommissioning on large 
> clusters and with container mode; to test those and add clarity, I moved parts 
> of decommissioning logic from HMaster, Draining tracker, and ServerManager 
> into a separate class.
> Features added/improvements:
> 1) More resilient off-loading; right now off-loading fails for a subset of 
> regions in case of a single region failure; is never done on master restart, 
> etc.
> 2) Option to kill RS after off-loading (good for container mode HBase, e.g. 
> on YARN).
> 3) Option to specify machine names only to decommission, for the API to be 
> usable for an external system that doesn't care about HBase server names, or 
> e.g. multiple RS in containers on the same node.
> 4) Option to replace existing decommissioning list instead of adding to it 
> (the same; to avoid additionally remembering what was previously sent to 
> HBase).
> 5) Tests, comments ;)



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-22254) refactor and improve decommissioning logic

2019-05-09 Thread Sergey Shelukhin (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-22254?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HBASE-22254:
-
Attachment: HBASE-22254.02.patch

> refactor and improve decommissioning logic
> --
>
> Key: HBASE-22254
> URL: https://issues.apache.org/jira/browse/HBASE-22254
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 3.0.0
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
> Attachments: HBASE-22254.01.patch, HBASE-22254.02.patch, 
> HBASE-22254.patch
>
>
> Making some changes needed to support better decommissioning on large 
> clusters and with container mode; to test those and add clarity, I moved parts 
> of decommissioning logic from HMaster, Draining tracker, and ServerManager 
> into a separate class.
> Features added/improvements:
> 1) More resilient off-loading; right now off-loading fails for a subset of 
> regions in case of a single region failure; is never done on master restart, 
> etc.
> 2) Option to kill RS after off-loading (good for container mode HBase, e.g. 
> on YARN).
> 3) Option to specify machine names only to decommission, for the API to be 
> usable for an external system that doesn't care about HBase server names, or 
> e.g. multiple RS in containers on the same node.
> 4) Option to replace existing decommissioning list instead of adding to it 
> (the same; to avoid additionally remembering what was previously sent to 
> HBase).
> 5) Tests, comments ;)



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-17370) Fix or provide shell scripts to drain and decommission region server

2019-05-09 Thread Sergey Shelukhin (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-17370?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16836783#comment-16836783
 ] 

Sergey Shelukhin commented on HBASE-17370:
--

Should the book be updated for this? Looks like it still suggests creating 
znodes to decommission region servers.
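
For a book update, the API-based flow is just a few Admin calls; a minimal sketch (the host name, port and startcode below are placeholders for a real server name):

{noformat}
import java.util.Collections;
import java.util.List;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.ServerName;
import org.apache.hadoop.hbase.client.Admin;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;

public class DecommissionExample {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    try (Connection conn = ConnectionFactory.createConnection(conf);
         Admin admin = conn.getAdmin()) {
      // Placeholder server name: host, port, startcode.
      ServerName rs = ServerName.valueOf("rs-host.example.com", 16020, 1557000000000L);
      // Mark the server as draining and offload its regions (offload = true).
      admin.decommissionRegionServers(Collections.singletonList(rs), true);
      // Servers currently marked as decommissioned.
      List<ServerName> drained = admin.listDecommissionedRegionServers();
      System.out.println("Decommissioned: " + drained);
      // Undo it; an empty region list just clears the draining state.
      admin.recommissionRegionServer(rs, Collections.emptyList());
    }
  }
}
{noformat}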

> Fix or provide shell scripts to drain and decommission region server
> 
>
> Key: HBASE-17370
> URL: https://issues.apache.org/jira/browse/HBASE-17370
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Jerry He
>Assignee: Nihal Jain
>Priority: Major
>  Labels: operability
> Fix For: 3.0.0, 2.2.0
>
> Attachments: HBASE-17370.branch-2.001.patch, 
> HBASE-17370.master.001.patch, HBASE-17370.master.002.patch
>
>
> 1. Update the existing shell scripts to use the new drain-related API.
> 2. Or provide new shell scripts.
> 3. Provide a 'decommission' shell tool that puts the server in drain mode and 
> offloads the server.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-22376) master can fail to start w/NPE if lastflushedseqids file is empty

2019-05-08 Thread Sergey Shelukhin (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-22376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16835951#comment-16835951
 ] 

Sergey Shelukhin commented on HBASE-22376:
--

[~psomogyi] [~Apache9] can you take a look? Tiny fix. The caller of this code 
already catches and ignores IOEx, but in the case of an empty file PB returns null.
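
A self-contained illustration of that protobuf behavior (the message type is arbitrary; any generated message has the same parseDelimitedFrom method):

{noformat}
import java.io.ByteArrayInputStream;
import com.google.protobuf.BoolValue;

public class ParseDelimitedDemo {
  public static void main(String[] args) throws Exception {
    // An empty stream stands in for an empty lastflushedseqids file.
    ByteArrayInputStream empty = new ByteArrayInputStream(new byte[0]);
    // parseDelimitedFrom signals end-of-stream by returning null rather than
    // throwing, so the caller has to null-check before touching the message.
    BoolValue msg = BoolValue.parseDelimitedFrom(empty);
    System.out.println(msg == null ? "empty file -> null message" : msg.toString());
  }
}
{noformat}

So the fix is to treat a null parse result the same way as a missing file, instead of relying on the IOException path.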

> master can fail to start w/NPE if lastflushedseqids file is empty
> -
>
> Key: HBASE-22376
> URL: https://issues.apache.org/jira/browse/HBASE-22376
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 3.0.0, 2.2.0
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
> Fix For: 3.0.0, 2.2.0
>
> Attachments: HBASE-22376.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-22385) Consider "programmatic" HFiles

2019-05-08 Thread Sergey Shelukhin (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-22385?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16835830#comment-16835830
 ] 

Sergey Shelukhin commented on HBASE-22385:
--

Could the refactoring be used to allow multi-level splits by splitting 
references (preferably via a new modified reference, not multi-level 
references)?

> Consider "programmatic" HFiles
> --
>
> Key: HBASE-22385
> URL: https://issues.apache.org/jira/browse/HBASE-22385
> Project: HBase
>  Issue Type: Brainstorming
>Reporter: Lars Hofhansl
>Priority: Major
>
> For various use cases (among others there is mass deletes) it would be great 
> if HBase had a mechanism for programmatic HFiles. I.e. HFiles (Reader) that 
> produce KeyValues just like any other old HFile, but the key values produced 
> are generated or produced by some other means rather than being physically 
> read from some storage medium.
> In fact this could be a generalization for the various HFiles we have: 
> (Normal) HFiles, HFileLinks, HalfStoreFiles, etc.
> A simple way could be to allow for storing a classname into the HFile. Upon 
> reading the HFile HBase would instantiate an instance of that class and that 
> instance is responsible for all further interaction with that HFile. For 
> normal HFiles it would just be the normal HFileReaderVx. For that we'd also 
> need to turn StoreFile.Reader into an interface (or a more basic base class) that 
> can be properly implemented.
> (Remember this is Brainstorming :) )
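
To make the classname idea concrete, a purely hypothetical sketch (none of these types exist; a real version would hang off the StoreFile.Reader abstraction mentioned above):

{noformat}
import java.io.IOException;

/** Hypothetical contract for a reader whose cells are produced programmatically. */
interface ProgrammaticReader {
  /** Called once after instantiation with whatever metadata the file carries. */
  void init(byte[] fileMetadata) throws IOException;

  /** Hands out cells exactly like a normal HFile scanner would. */
  CellSource getScanner() throws IOException;

  interface CellSource {
    boolean next() throws IOException;
    byte[] currentCell();
  }
}

/** Hypothetical loader: resolve the class name stored in the file and instantiate it. */
final class ProgrammaticReaders {
  static ProgrammaticReader load(String className, byte[] fileMetadata) throws IOException {
    try {
      ProgrammaticReader reader =
          (ProgrammaticReader) Class.forName(className).getDeclaredConstructor().newInstance();
      reader.init(fileMetadata);
      return reader;
    } catch (ReflectiveOperationException e) {
      throw new IOException("Cannot instantiate programmatic reader " + className, e);
    }
  }
}
{noformat}

A normal HFile would then just be the degenerate case whose stored class name points at the regular reader.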



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-22360) Abort timer doesn't set when abort is called during graceful shutdown process

2019-05-07 Thread Sergey Shelukhin (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-22360?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HBASE-22360:
-
   Resolution: Fixed
Fix Version/s: 2.2.0
   3.0.0
   Status: Resolved  (was: Patch Available)

Committed to master and branch-2

> Abort timer doesn't set when abort is called during graceful shutdown process
> -
>
> Key: HBASE-22360
> URL: https://issues.apache.org/jira/browse/HBASE-22360
> Project: HBase
>  Issue Type: Bug
>  Components: regionserver
>Affects Versions: 3.0.0, 2.2.0
>Reporter: Bahram Chehrazy
>Assignee: Bahram Chehrazy
>Priority: Major
> Fix For: 3.0.0, 2.2.0
>
> Attachments: Set-the-abortMonitor-timer-in-the-abort-function-01.patch
>
>
> The abort timer only gets set when the server is aborted. But if the server is 
> being gracefully stopped and something goes wrong causing an abort, the timer 
> may not get set, and the shutdown process could take a very long time or 
> leave the server completely stuck.
>  
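
The idea of the fix, as a generic sketch rather than the actual patch (the names and the timeout value are illustrative; the real timeout comes from the abort-timeout configuration):

{noformat}
import java.util.Timer;
import java.util.TimerTask;
import java.util.concurrent.atomic.AtomicBoolean;

public class AbortWatchdogSketch {
  private static final long ABORT_TIMEOUT_MS = 120_000L; // illustrative value
  private final AtomicBoolean abortTimerArmed = new AtomicBoolean(false);

  void abort(String reason, Throwable cause) {
    // Arm the watchdog unconditionally, not only when abort precedes stop.
    if (abortTimerArmed.compareAndSet(false, true)) {
      Timer watchdog = new Timer("abort-watchdog", true);
      watchdog.schedule(new TimerTask() {
        @Override public void run() {
          // If the orderly shutdown wedges, give up and kill the process.
          System.err.println("Abort timed out after " + ABORT_TIMEOUT_MS + " ms, exiting");
          Runtime.getRuntime().halt(1);
        }
      }, ABORT_TIMEOUT_MS);
    }
    // ... proceed with the normal abort/shutdown sequence ...
  }
}
{noformat}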



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-22376) master can fail to start w/NPE if lastflushedseqids file is empty

2019-05-07 Thread Sergey Shelukhin (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-22376?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HBASE-22376:
-
Attachment: HBASE-22376.patch

> master can fail to start w/NPE if lastflushedseqids file is empty
> -
>
> Key: HBASE-22376
> URL: https://issues.apache.org/jira/browse/HBASE-22376
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 3.0.0, 2.2.0
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
> Fix For: 3.0.0, 2.2.0
>
> Attachments: HBASE-22376.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-22376) master can fail to start w/NPE if lastflushedseqids file is empty

2019-05-07 Thread Sergey Shelukhin (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-22376?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HBASE-22376:
-
Status: Patch Available  (was: Open)

> master can fail to start w/NPE if lastflushedseqids file is empty
> -
>
> Key: HBASE-22376
> URL: https://issues.apache.org/jira/browse/HBASE-22376
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 3.0.0, 2.2.0
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
> Fix For: 3.0.0, 2.2.0
>
> Attachments: HBASE-22376.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-22376) master can fail to start w/NPE if lastflushedseqids file is empty

2019-05-07 Thread Sergey Shelukhin (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-22376?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HBASE-22376:
-
Affects Version/s: 2.2.0
   3.0.0

> master can fail to start w/NPE if lastflushedseqids file is empty
> -
>
> Key: HBASE-22376
> URL: https://issues.apache.org/jira/browse/HBASE-22376
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 3.0.0, 2.2.0
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-22376) master can fail to start w/NPE if lastflushedseqids file is empty

2019-05-07 Thread Sergey Shelukhin (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-22376?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HBASE-22376:
-
Fix Version/s: 2.2.0
   3.0.0

> master can fail to start w/NPE if lastflushedseqids file is empty
> -
>
> Key: HBASE-22376
> URL: https://issues.apache.org/jira/browse/HBASE-22376
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 3.0.0, 2.2.0
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
> Fix For: 3.0.0, 2.2.0
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HBASE-22376) master can fail to start w/NPE if lastflushedseqids file is empty

2019-05-07 Thread Sergey Shelukhin (JIRA)
Sergey Shelukhin created HBASE-22376:


 Summary: master can fail to start w/NPE if lastflushedseqids file 
is empty
 Key: HBASE-22376
 URL: https://issues.apache.org/jira/browse/HBASE-22376
 Project: HBase
  Issue Type: Bug
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-22360) Abort timer doesn't set when abort is called during graceful shutdown process

2019-05-06 Thread Sergey Shelukhin (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-22360?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16834289#comment-16834289
 ] 

Sergey Shelukhin commented on HBASE-22360:
--

+1

> Abort timer doesn't set when abort is called during graceful shutdown process
> -
>
> Key: HBASE-22360
> URL: https://issues.apache.org/jira/browse/HBASE-22360
> Project: HBase
>  Issue Type: Bug
>  Components: regionserver
>Affects Versions: 3.0.0, 2.2.0
>Reporter: Bahram Chehrazy
>Assignee: Bahram Chehrazy
>Priority: Major
> Attachments: Set-the-abortMonitor-timer-in-the-abort-function-01.patch
>
>
> The abort timer only gets set when the server is aborted. But if the server is 
> being gracefully stopped and something goes wrong causing an abort, the timer 
> may not get set, and the shutdown process could take a very long time or 
> leave the server completely stuck.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-22346) scanner priorities/deadline units are invalid for non-huge scanners

2019-05-06 Thread Sergey Shelukhin (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-22346?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HBASE-22346:
-
Attachment: HBASE-22346.01.patch

> scanner priorities/deadline units are invalid for non-huge scanners
> ---
>
> Key: HBASE-22346
> URL: https://issues.apache.org/jira/browse/HBASE-22346
> Project: HBase
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
> Attachments: HBASE-22346.01.patch, HBASE-22346.patch
>
>
> I was looking at using the priority (deadline) queue for scanner requests; 
> what I see is that AnnotationReadingPriorityFunction, the only impl of the 
> deadline function available, implements getDeadline as sqrt of the number of 
> next() calls, from HBASE-10993.
> However, CallPriorityComparator.compare, its only caller, adds that 
> "deadline" value to the callA.getReceiveTime() in milliseconds...
> That results in some sort of a meaningless value that I assume only makes 
> sense "by coincidence" for telling apart broad and specific classes of 
> scanners... in practice next calls must be in the 1000s before it becomes 
> meaningful vs small differences in ReceivedTime
> When there's contention from many scanners, e.g. small scanners for meta, or 
> just users creating tons of scanners to the point where requests queue up, 
> the actual deadline is not accounted for and the priority function itself is 
> meaningless... In fact as queueing increases, it becomes worse because 
> receivedtime differences grow.
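
A tiny numeric illustration of the mismatch (the numbers are made up; the point is only the relative scale of sqrt(next calls) versus receive-time gaps in milliseconds):

{noformat}
public class DeadlineUnitsDemo {
  public static void main(String[] args) {
    long receiveTimeA = 1_000_000_000L;      // ms timestamp of an older call
    long receiveTimeB = receiveTimeA + 50;   // a call that arrived 50 ms later
    long nextCallsA = 400;                   // scanner A has already done 400 next() calls
    long nextCallsB = 0;                     // scanner B is brand new

    // "deadline" = sqrt(next calls), added to receive time as in the comparator
    double priorityA = receiveTimeA + Math.sqrt(nextCallsA); // +20
    double priorityB = receiveTimeB + Math.sqrt(nextCallsB); // +0

    // The 400-call penalty (+20) loses to a mere 50 ms arrival gap, so the
    // long-running scanner is still scheduled first; the sqrt term only starts
    // to matter once next() counts reach the thousands.
    System.out.println("A before B? " + (priorityA < priorityB)); // true
  }
}
{noformat}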



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-22354) master never sets abortRequested, and thus abort timeout doesn't work for it

2019-05-02 Thread Sergey Shelukhin (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-22354?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HBASE-22354:
-
   Resolution: Fixed
Fix Version/s: 3.0.0
   Status: Resolved  (was: Patch Available)

Committed to master and branch-2. Thanks for the review!

> master never sets abortRequested, and thus abort timeout doesn't work for it
> 
>
> Key: HBASE-22354
> URL: https://issues.apache.org/jira/browse/HBASE-22354
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.2.0
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
> Fix For: 3.0.0, 2.2.0
>
> Attachments: HBASE-22354.patch
>
>
> Discovered w/HBASE-22353 netty deadlock.
> The property is not set, so the abort timer is not started.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (HBASE-22353) update non-shaded netty for Hadoop 2 to a more recent version of 3.6

2019-05-02 Thread Sergey Shelukhin (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-22353?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin reassigned HBASE-22353:


Assignee: Sergey Shelukhin

> update non-shaded netty for Hadoop 2 to a more recent version of 3.6
> 
>
> Key: HBASE-22353
> URL: https://issues.apache.org/jira/browse/HBASE-22353
> Project: HBase
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
> Attachments: HBASE-22353.patch
>
>
> When using Netty socket for ZK, we got this deadlock.
> Appears to be https://github.com/netty/netty/issues/1181 (or one of similar 
> tickets before that). 
> We are using Netty 3.6.2 for Hadoop 2; it seems like it should be safe to 
> upgrade to 3.6.10, assuming it's purely a bugfix release for 3.6.2 and they 
> are compatible?
> {noformat}
> Java stack information for the threads listed above:
> ===
> "main-SendThread(...)":
>at org.jboss.netty.handler.ssl.SslHandler.wrap(SslHandler.java:958)
>- waiting to lock <0xc91d8848> (a java.lang.Object)
>- locked <0xcdcc7740> (a java.util.LinkedList)
>at 
> org.jboss.netty.handler.ssl.SslHandler.handleDownstream(SslHandler.java:627)
>at 
> org.jboss.netty.channel.DefaultChannelPipeline.sendDownstream(DefaultChannelPipeline.java:587)
>at 
> org.jboss.netty.channel.DefaultChannelPipeline.sendDownstream(DefaultChannelPipeline.java:578)
>at org.jboss.netty.channel.Channels.write(Channels.java:704)
>at org.jboss.netty.channel.Channels.write(Channels.java:671)
>at 
> org.jboss.netty.channel.AbstractChannel.write(AbstractChannel.java:248)
>at 
> org.apache.zookeeper.ClientCnxnSocketNetty.sendPkt(ClientCnxnSocketNetty.java:268)
>at 
> org.apache.zookeeper.ClientCnxnSocketNetty.doWrite(ClientCnxnSocketNetty.java:291)
>at 
> org.apache.zookeeper.ClientCnxnSocketNetty.doTransport(ClientCnxnSocketNetty.java:249)
>at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1146)
> "New I/O worker #3":
>at 
> org.jboss.netty.handler.ssl.SslHandler.channelClosed(SslHandler.java:1554)
>- waiting to lock <0xcdcc7740> (a java.util.LinkedList)
>at 
> org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:88)
>at 
> org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:560)
>at 
> org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:555)
>at 
> org.jboss.netty.channel.Channels.fireChannelClosed(Channels.java:468)
>at 
> org.jboss.netty.channel.socket.nio.AbstractNioWorker.close(AbstractNioWorker.java:351)
>at 
> org.jboss.netty.channel.socket.nio.AbstractNioWorker.write0(AbstractNioWorker.java:254)
>- locked <0xc91d8770> (a java.lang.Object)
>at 
> org.jboss.netty.channel.socket.nio.AbstractNioWorker.writeFromUserCode(AbstractNioWorker.java:145)
>at 
> org.jboss.netty.channel.socket.nio.NioClientSocketPipelineSink.eventSunk(NioClientSocketPipelineSink.java:83)
>at 
> org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendDownstream(DefaultChannelPipeline.java:775)
>at org.jboss.netty.channel.Channels.write(Channels.java:725)
>at org.jboss.netty.channel.Channels.write(Channels.java:686)
>at 
> org.jboss.netty.handler.ssl.SslHandler.wrapNonAppData(SslHandler.java:1140)
>at org.jboss.netty.handler.ssl.SslHandler.unwrap(SslHandler.java:1229)
>- locked <0xc91d8848> (a java.lang.Object)
>at org.jboss.netty.handler.ssl.SslHandler.decode(SslHandler.java:910)
>at 
> org.jboss.netty.handler.codec.frame.FrameDecoder.callDecode(FrameDecoder.java:425)
>at 
> org.jboss.netty.handler.codec.frame.FrameDecoder.messageReceived(FrameDecoder.java:303)
>at 
> org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
>at 
> org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:560)
>at 
> org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:555)
>at 
> org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:268)
>at 
> org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:255)
>at org.jboss.netty.channel.socket.nio.NioWorker.read(NioWorker.java:88)
>at 
> org.jboss.netty.channel.socket.nio.AbstractNioWorker.process(AbstractNioWorker.java:107)
>at 
> 

[jira] [Updated] (HBASE-22353) update non-shaded netty for Hadoop 2 to a more recent version of 3.6

2019-05-02 Thread Sergey Shelukhin (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-22353?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HBASE-22353:
-
Attachment: HBASE-22353.patch

> update non-shaded netty for Hadoop 2 to a more recent version of 3.6
> 
>
> Key: HBASE-22353
> URL: https://issues.apache.org/jira/browse/HBASE-22353
> Project: HBase
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Priority: Major
> Attachments: HBASE-22353.patch
>
>
> When using Netty socket for ZK, we got this deadlock.
> Appears to be https://github.com/netty/netty/issues/1181 (or one of similar 
> tickets before that). 
> We are using Netty 3.6.2 for Hadoop 2; it seems like it should be safe to 
> upgrade to 3.6.10, assuming it's purely a bugfix release for 3.6.2 and they 
> are compatible?
> {noformat}
> Java stack information for the threads listed above:
> ===
> "main-SendThread(...)":
>at org.jboss.netty.handler.ssl.SslHandler.wrap(SslHandler.java:958)
>- waiting to lock <0xc91d8848> (a java.lang.Object)
>- locked <0xcdcc7740> (a java.util.LinkedList)
>at 
> org.jboss.netty.handler.ssl.SslHandler.handleDownstream(SslHandler.java:627)
>at 
> org.jboss.netty.channel.DefaultChannelPipeline.sendDownstream(DefaultChannelPipeline.java:587)
>at 
> org.jboss.netty.channel.DefaultChannelPipeline.sendDownstream(DefaultChannelPipeline.java:578)
>at org.jboss.netty.channel.Channels.write(Channels.java:704)
>at org.jboss.netty.channel.Channels.write(Channels.java:671)
>at 
> org.jboss.netty.channel.AbstractChannel.write(AbstractChannel.java:248)
>at 
> org.apache.zookeeper.ClientCnxnSocketNetty.sendPkt(ClientCnxnSocketNetty.java:268)
>at 
> org.apache.zookeeper.ClientCnxnSocketNetty.doWrite(ClientCnxnSocketNetty.java:291)
>at 
> org.apache.zookeeper.ClientCnxnSocketNetty.doTransport(ClientCnxnSocketNetty.java:249)
>at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1146)
> "New I/O worker #3":
>at 
> org.jboss.netty.handler.ssl.SslHandler.channelClosed(SslHandler.java:1554)
>- waiting to lock <0xcdcc7740> (a java.util.LinkedList)
>at 
> org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:88)
>at 
> org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:560)
>at 
> org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:555)
>at 
> org.jboss.netty.channel.Channels.fireChannelClosed(Channels.java:468)
>at 
> org.jboss.netty.channel.socket.nio.AbstractNioWorker.close(AbstractNioWorker.java:351)
>at 
> org.jboss.netty.channel.socket.nio.AbstractNioWorker.write0(AbstractNioWorker.java:254)
>- locked <0xc91d8770> (a java.lang.Object)
>at 
> org.jboss.netty.channel.socket.nio.AbstractNioWorker.writeFromUserCode(AbstractNioWorker.java:145)
>at 
> org.jboss.netty.channel.socket.nio.NioClientSocketPipelineSink.eventSunk(NioClientSocketPipelineSink.java:83)
>at 
> org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendDownstream(DefaultChannelPipeline.java:775)
>at org.jboss.netty.channel.Channels.write(Channels.java:725)
>at org.jboss.netty.channel.Channels.write(Channels.java:686)
>at 
> org.jboss.netty.handler.ssl.SslHandler.wrapNonAppData(SslHandler.java:1140)
>at org.jboss.netty.handler.ssl.SslHandler.unwrap(SslHandler.java:1229)
>- locked <0xc91d8848> (a java.lang.Object)
>at org.jboss.netty.handler.ssl.SslHandler.decode(SslHandler.java:910)
>at 
> org.jboss.netty.handler.codec.frame.FrameDecoder.callDecode(FrameDecoder.java:425)
>at 
> org.jboss.netty.handler.codec.frame.FrameDecoder.messageReceived(FrameDecoder.java:303)
>at 
> org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
>at 
> org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:560)
>at 
> org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:555)
>at 
> org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:268)
>at 
> org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:255)
>at org.jboss.netty.channel.socket.nio.NioWorker.read(NioWorker.java:88)
>at 
> org.jboss.netty.channel.socket.nio.AbstractNioWorker.process(AbstractNioWorker.java:107)
>at 
> org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:312)
>at 
> 

[jira] [Updated] (HBASE-22353) update non-shaded netty for Hadoop 2 to a more recent version of 3.6

2019-05-02 Thread Sergey Shelukhin (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-22353?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HBASE-22353:
-
Status: Patch Available  (was: Open)

> update non-shaded netty for Hadoop 2 to a more recent version of 3.6
> 
>
> Key: HBASE-22353
> URL: https://issues.apache.org/jira/browse/HBASE-22353
> Project: HBase
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
> Attachments: HBASE-22353.patch
>
>
> When using Netty socket for ZK, we got this deadlock.
> Appears to be https://github.com/netty/netty/issues/1181 (or one of similar 
> tickets before that). 
> We are using Netty 3.6.2 for Hadoop 2; it seems like it should be safe to 
> upgrade to 3.6.10, assuming it's purely a bugfix release for 3.6.2 and they 
> are compatible?
> {noformat}
> Java stack information for the threads listed above:
> ===
> "main-SendThread(...)":
>at org.jboss.netty.handler.ssl.SslHandler.wrap(SslHandler.java:958)
>- waiting to lock <0xc91d8848> (a java.lang.Object)
>- locked <0xcdcc7740> (a java.util.LinkedList)
>at 
> org.jboss.netty.handler.ssl.SslHandler.handleDownstream(SslHandler.java:627)
>at 
> org.jboss.netty.channel.DefaultChannelPipeline.sendDownstream(DefaultChannelPipeline.java:587)
>at 
> org.jboss.netty.channel.DefaultChannelPipeline.sendDownstream(DefaultChannelPipeline.java:578)
>at org.jboss.netty.channel.Channels.write(Channels.java:704)
>at org.jboss.netty.channel.Channels.write(Channels.java:671)
>at 
> org.jboss.netty.channel.AbstractChannel.write(AbstractChannel.java:248)
>at 
> org.apache.zookeeper.ClientCnxnSocketNetty.sendPkt(ClientCnxnSocketNetty.java:268)
>at 
> org.apache.zookeeper.ClientCnxnSocketNetty.doWrite(ClientCnxnSocketNetty.java:291)
>at 
> org.apache.zookeeper.ClientCnxnSocketNetty.doTransport(ClientCnxnSocketNetty.java:249)
>at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1146)
> "New I/O worker #3":
>at 
> org.jboss.netty.handler.ssl.SslHandler.channelClosed(SslHandler.java:1554)
>- waiting to lock <0xcdcc7740> (a java.util.LinkedList)
>at 
> org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:88)
>at 
> org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:560)
>at 
> org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:555)
>at 
> org.jboss.netty.channel.Channels.fireChannelClosed(Channels.java:468)
>at 
> org.jboss.netty.channel.socket.nio.AbstractNioWorker.close(AbstractNioWorker.java:351)
>at 
> org.jboss.netty.channel.socket.nio.AbstractNioWorker.write0(AbstractNioWorker.java:254)
>- locked <0xc91d8770> (a java.lang.Object)
>at 
> org.jboss.netty.channel.socket.nio.AbstractNioWorker.writeFromUserCode(AbstractNioWorker.java:145)
>at 
> org.jboss.netty.channel.socket.nio.NioClientSocketPipelineSink.eventSunk(NioClientSocketPipelineSink.java:83)
>at 
> org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendDownstream(DefaultChannelPipeline.java:775)
>at org.jboss.netty.channel.Channels.write(Channels.java:725)
>at org.jboss.netty.channel.Channels.write(Channels.java:686)
>at 
> org.jboss.netty.handler.ssl.SslHandler.wrapNonAppData(SslHandler.java:1140)
>at org.jboss.netty.handler.ssl.SslHandler.unwrap(SslHandler.java:1229)
>- locked <0xc91d8848> (a java.lang.Object)
>at org.jboss.netty.handler.ssl.SslHandler.decode(SslHandler.java:910)
>at 
> org.jboss.netty.handler.codec.frame.FrameDecoder.callDecode(FrameDecoder.java:425)
>at 
> org.jboss.netty.handler.codec.frame.FrameDecoder.messageReceived(FrameDecoder.java:303)
>at 
> org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
>at 
> org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:560)
>at 
> org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:555)
>at 
> org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:268)
>at 
> org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:255)
>at org.jboss.netty.channel.socket.nio.NioWorker.read(NioWorker.java:88)
>at 
> org.jboss.netty.channel.socket.nio.AbstractNioWorker.process(AbstractNioWorker.java:107)
>at 
> 

[jira] [Updated] (HBASE-22348) allow one to actually disable replication svc

2019-05-02 Thread Sergey Shelukhin (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-22348?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HBASE-22348:
-
Fix Version/s: 2.2.0

> allow one to actually disable replication svc
> -
>
> Key: HBASE-22348
> URL: https://issues.apache.org/jira/browse/HBASE-22348
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.2.0
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
> Fix For: 2.2.0
>
> Attachments: HBASE-22348.patch
>
>
> Minor, but it does create extra ZK traffic for no reason and there's no way 
> to disable that, it appears. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-22354) master never sets abortRequested, and thus abort timeout doesn't work for it

2019-05-02 Thread Sergey Shelukhin (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-22354?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HBASE-22354:
-
Fix Version/s: 2.2.0

> master never sets abortRequested, and thus abort timeout doesn't work for it
> 
>
> Key: HBASE-22354
> URL: https://issues.apache.org/jira/browse/HBASE-22354
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.2.0
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
> Fix For: 2.2.0
>
> Attachments: HBASE-22354.patch
>
>
> Discovered w/HBASE-22353 netty deadlock.
> The property is not set, so the abort timer is not started.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-22348) allow one to actually disable replication svc

2019-05-02 Thread Sergey Shelukhin (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-22348?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HBASE-22348:
-
Affects Version/s: 2.2.0

> allow one to actually disable replication svc
> -
>
> Key: HBASE-22348
> URL: https://issues.apache.org/jira/browse/HBASE-22348
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.2.0
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
> Attachments: HBASE-22348.patch
>
>
> Minor, but it does create extra ZK traffic for no reason and there's no way 
> to disable that, it appears. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-22354) master never sets abortRequested, and thus abort timeout doesn't work for it

2019-05-02 Thread Sergey Shelukhin (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-22354?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HBASE-22354:
-
Affects Version/s: 2.2.0

> master never sets abortRequested, and thus abort timeout doesn't work for it
> 
>
> Key: HBASE-22354
> URL: https://issues.apache.org/jira/browse/HBASE-22354
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.2.0
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
> Attachments: HBASE-22354.patch
>
>
> Discovered w/HBASE-22353 netty deadlock.
> The property is not set, so the abort timer is not started.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HBASE-22354) master never sets abortRequested, and thus abort timeout doesn't work for it

2019-05-02 Thread Sergey Shelukhin (JIRA)
Sergey Shelukhin created HBASE-22354:


 Summary: master never sets abortRequested, and thus abort timeout 
doesn't work for it
 Key: HBASE-22354
 URL: https://issues.apache.org/jira/browse/HBASE-22354
 Project: HBase
  Issue Type: Bug
Reporter: Sergey Shelukhin


Discovered w/HBASE-22353 netty deadlock.
The property is not set, so the abort timer is not started.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (HBASE-22354) master never sets abortRequested, and thus abort timeout doesn't work for it

2019-05-02 Thread Sergey Shelukhin (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-22354?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin reassigned HBASE-22354:


Assignee: Sergey Shelukhin

> master never sets abortRequested, and thus abort timeout doesn't work for it
> 
>
> Key: HBASE-22354
> URL: https://issues.apache.org/jira/browse/HBASE-22354
> Project: HBase
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
>
> Discovered w/HBASE-22353 netty deadlock.
> The property is not set, so the abort timer is not started.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-22354) master never sets abortRequested, and thus abort timeout doesn't work for it

2019-05-02 Thread Sergey Shelukhin (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-22354?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HBASE-22354:
-
Attachment: HBASE-22354.patch

> master never sets abortRequested, and thus abort timeout doesn't work for it
> 
>
> Key: HBASE-22354
> URL: https://issues.apache.org/jira/browse/HBASE-22354
> Project: HBase
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
> Attachments: HBASE-22354.patch
>
>
> Discovered w/HBASE-22353 netty deadlock.
> The property is not set, so the abort timer is not started.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-22354) master never sets abortRequested, and thus abort timeout doesn't work for it

2019-05-02 Thread Sergey Shelukhin (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-22354?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HBASE-22354:
-
Status: Patch Available  (was: Open)

Tiny patch...

> master never sets abortRequested, and thus abort timeout doesn't work for it
> 
>
> Key: HBASE-22354
> URL: https://issues.apache.org/jira/browse/HBASE-22354
> Project: HBase
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
> Attachments: HBASE-22354.patch
>
>
> Discovered w/HBASE-22353 netty deadlock.
> The property is not set, so the abort timer is not started.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HBASE-22353) update non-shaded netty for Hadoop 2 to a more recent version of 3.6

2019-05-02 Thread Sergey Shelukhin (JIRA)
Sergey Shelukhin created HBASE-22353:


 Summary: update non-shaded netty for Hadoop 2 to a more recent 
version of 3.6
 Key: HBASE-22353
 URL: https://issues.apache.org/jira/browse/HBASE-22353
 Project: HBase
  Issue Type: Bug
Reporter: Sergey Shelukhin


When using Netty socket for ZK, we got this deadlock.
Appears to be https://github.com/netty/netty/issues/1181 (or one of similar 
tickets before that). 
We are using Netty 3.6.2 for Hadoop 2; it seems like it should be safe to upgrade 
to 3.6.10, assuming it's purely a bugfix release for 3.6.2 and they are 
compatible?

{noformat}
Java stack information for the threads listed above:
===
"main-SendThread(...)":
   at org.jboss.netty.handler.ssl.SslHandler.wrap(SslHandler.java:958)
   - waiting to lock <0xc91d8848> (a java.lang.Object)
   - locked <0xcdcc7740> (a java.util.LinkedList)
   at 
org.jboss.netty.handler.ssl.SslHandler.handleDownstream(SslHandler.java:627)
   at 
org.jboss.netty.channel.DefaultChannelPipeline.sendDownstream(DefaultChannelPipeline.java:587)
   at 
org.jboss.netty.channel.DefaultChannelPipeline.sendDownstream(DefaultChannelPipeline.java:578)
   at org.jboss.netty.channel.Channels.write(Channels.java:704)
   at org.jboss.netty.channel.Channels.write(Channels.java:671)
   at 
org.jboss.netty.channel.AbstractChannel.write(AbstractChannel.java:248)
   at 
org.apache.zookeeper.ClientCnxnSocketNetty.sendPkt(ClientCnxnSocketNetty.java:268)
   at 
org.apache.zookeeper.ClientCnxnSocketNetty.doWrite(ClientCnxnSocketNetty.java:291)
   at 
org.apache.zookeeper.ClientCnxnSocketNetty.doTransport(ClientCnxnSocketNetty.java:249)
   at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1146)
"New I/O worker #3":
   at 
org.jboss.netty.handler.ssl.SslHandler.channelClosed(SslHandler.java:1554)
   - waiting to lock <0xcdcc7740> (a java.util.LinkedList)
   at 
org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:88)
   at 
org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:560)
   at 
org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:555)
   at org.jboss.netty.channel.Channels.fireChannelClosed(Channels.java:468)
   at 
org.jboss.netty.channel.socket.nio.AbstractNioWorker.close(AbstractNioWorker.java:351)
   at 
org.jboss.netty.channel.socket.nio.AbstractNioWorker.write0(AbstractNioWorker.java:254)
   - locked <0xc91d8770> (a java.lang.Object)
   at 
org.jboss.netty.channel.socket.nio.AbstractNioWorker.writeFromUserCode(AbstractNioWorker.java:145)
   at 
org.jboss.netty.channel.socket.nio.NioClientSocketPipelineSink.eventSunk(NioClientSocketPipelineSink.java:83)
   at 
org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendDownstream(DefaultChannelPipeline.java:775)
   at org.jboss.netty.channel.Channels.write(Channels.java:725)
   at org.jboss.netty.channel.Channels.write(Channels.java:686)
   at 
org.jboss.netty.handler.ssl.SslHandler.wrapNonAppData(SslHandler.java:1140)
   at org.jboss.netty.handler.ssl.SslHandler.unwrap(SslHandler.java:1229)
   - locked <0xc91d8848> (a java.lang.Object)
   at org.jboss.netty.handler.ssl.SslHandler.decode(SslHandler.java:910)
   at 
org.jboss.netty.handler.codec.frame.FrameDecoder.callDecode(FrameDecoder.java:425)
   at 
org.jboss.netty.handler.codec.frame.FrameDecoder.messageReceived(FrameDecoder.java:303)
   at 
org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
   at 
org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:560)
   at 
org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:555)
   at 
org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:268)
   at 
org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:255)
   at org.jboss.netty.channel.socket.nio.NioWorker.read(NioWorker.java:88)
   at 
org.jboss.netty.channel.socket.nio.AbstractNioWorker.process(AbstractNioWorker.java:107)
   at 
org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:312)
   at 
org.jboss.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:88)
   at org.jboss.netty.channel.socket.nio.NioWorker.run(NioWorker.java:178)
   at 
org.jboss.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108)
   at 
org.jboss.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42)
   at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
   at 

[jira] [Moved] (HBASE-22352) use a system table as an alternative proc store

2019-05-02 Thread Sergey Shelukhin (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-22352?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin moved HIVE-21676 to HBASE-22352:
-

Key: HBASE-22352  (was: HIVE-21676)
Project: HBase  (was: Hive)

> use a system table as an alternative proc store
> ---
>
> Key: HBASE-22352
> URL: https://issues.apache.org/jira/browse/HBASE-22352
> Project: HBase
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Priority: Major
>
> We keep hitting these issues:
> {noformat}
> 2019-04-30 23:41:52,164 INFO  [master/master:17000:becomeActiveMaster] 
> procedure2.ProcedureExecutor: Starting 16 core workers (bigger of cpus/4 or 
> 16) with max (burst) worker count=160
> 2019-04-30 23:41:52,171 INFO  [master/master:17000:becomeActiveMaster] 
> util.FSHDFSUtils: Recover lease on dfs file 
> .../MasterProcWALs/pv2-0481.log
> 2019-04-30 23:41:52,176 INFO  [master/master:17000:becomeActiveMaster] 
> util.FSHDFSUtils: Recovered lease, attempt=0 on 
> file=.../MasterProcWALs/pv2-0481.log after 5ms
> 2019-04-30 23:41:52,288 INFO  [master/master:17000:becomeActiveMaster] 
> util.FSHDFSUtils: Recover lease on dfs file 
> .../MasterProcWALs/pv2-0482.log
> 2019-04-30 23:41:52,289 INFO  [master/master:17000:becomeActiveMaster] 
> util.FSHDFSUtils: Recovered lease, attempt=0 on 
> file=.../MasterProcWALs/pv2-0482.log after 1ms
> 2019-04-30 23:41:52,373 INFO  [master/master:17000:becomeActiveMaster] 
> wal.WALProcedureStore: Rolled new Procedure Store WAL, id=483
> 2019-04-30 23:41:52,375 INFO  [master/master:17000:becomeActiveMaster] 
> procedure2.ProcedureExecutor: Recovered WALProcedureStore lease in 206msec
> 2019-04-30 23:41:52,782 INFO  [master/master:17000:becomeActiveMaster] 
> wal.ProcedureWALFormatReader: Read 1556 entries in 
> .../MasterProcWALs/pv2-0482.log
> 2019-04-30 23:41:55,370 INFO  [master/master:17000:becomeActiveMaster] 
> wal.ProcedureWALFormatReader: Read 28113 entries in 
> .../MasterProcWALs/pv2-0481.log
> 2019-04-30 23:41:55,384 ERROR [master/master:17000:becomeActiveMaster] 
> wal.WALProcedureTree: Missing stack id 166, max stack id is 181, root 
> procedure is Procedure(pid=289380, ppid=-1, 
> class=org.apache.hadoop.hbase.master.procedure.ServerCrashProcedure)
> 2019-04-30 23:41:55,384 ERROR [master/master:17000:becomeActiveMaster] 
> wal.WALProcedureTree: Missing stack id 178, max stack id is 181, root 
> procedure is Procedure(pid=289380, ppid=-1, 
> class=org.apache.hadoop.hbase.master.procedure.ServerCrashProcedure)
> 2019-04-30 23:41:55,389 ERROR [master/master:17000:becomeActiveMaster] 
> wal.WALProcedureTree: Missing stack id 359, max stack id is 360, root 
> procedure is Procedure(pid=285640, ppid=-1, 
> class=org.apache.hadoop.hbase.master.procedure.ServerCrashProcedure)
> {noformat}
> After which the procedure(s) are lost and the cluster is stuck permanently.
> There were no errors writing these files in the log, and no issues reading 
> them from HDFS, so it's purely a data loss issue in the structure. 
> I was thinking about debugging it, but on 2nd thought what we are trying to 
> store is some PB blob, by key.
> Coincidentally, we have an "HBase" facility that we already deploy, that does 
> just that... and it even has a WAL implementation. I don't know why we cannot 
> use it for procedure state and have to invent another complex implementation 
> of a KV store inside a KV store.
> In all/most cases, we don't even support rollback and use the latest state, 
> but if we need multiple versions, this HBase product even supports that! 
> I think we should add an hbase:proc table that would be maintained similarly to 
> meta. The latter part, especially given the existing code for meta, should be 
> much simpler than a separate store impl.
> This should be pluggable and optional via the ProcStore interface (made more 
> abstract as relevant: update state, scan state, get).
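
Roughly what that abstraction could look like (hypothetical names; this just restates the description's "update state, scan state, get" as an interface):

{noformat}
import java.io.IOException;
import java.util.Map;

interface ProcedureStateStore {
  /** Insert or overwrite the serialized state of one procedure. */
  void update(long procId, byte[] serializedProc) throws IOException;

  /** Remove a finished procedure. */
  void delete(long procId) throws IOException;

  /** Point lookup, e.g. while resolving a parent procedure. */
  byte[] get(long procId) throws IOException;

  /** Full scan used on master startup to rebuild executor state. */
  Map<Long, byte[]> scanAll() throws IOException;
}
{noformat}

A table-backed implementation would map these calls onto Put/Delete/Get/Scan against an hbase:proc table maintained the way meta is, while the existing WAL-based store could stay as the default behind the same interface.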



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-22348) allow one to actually disable replication svc

2019-05-01 Thread Sergey Shelukhin (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-22348?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HBASE-22348:
-
Description: Minor, but it does create extra ZK traffic for no reason and 
there's no way to disable that it appears.   (was: Minor, but it does create 
extra ZK traffic for now reason and there's no way to disable that it appears. )

> allow one to actually disable replication svc
> -
>
> Key: HBASE-22348
> URL: https://issues.apache.org/jira/browse/HBASE-22348
> Project: HBase
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
> Attachments: HBASE-22348.patch
>
>
> Minor, but it does create extra ZK traffic for no reason and there's no way 
> to disable that, it appears. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-22348) allow one to actually disable replication svc

2019-05-01 Thread Sergey Shelukhin (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-22348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16831225#comment-16831225
 ] 

Sergey Shelukhin commented on HBASE-22348:
--

Other places appear to either have null checks already, or are within things 
like bulk replication coprocessor, so they don't need one because they have to 
be enabled explicitly when using replication.

> allow one to actually disable replication svc
> -
>
> Key: HBASE-22348
> URL: https://issues.apache.org/jira/browse/HBASE-22348
> Project: HBase
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Priority: Major
> Attachments: HBASE-22348.patch
>
>
> Minor, but it does create extra ZK traffic for no reason and there's no way 
> to disable that, it appears. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (HBASE-22348) allow one to actually disable replication svc

2019-05-01 Thread Sergey Shelukhin (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-22348?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin reassigned HBASE-22348:


Assignee: Sergey Shelukhin

> allow one to actually disable replication svc
> -
>
> Key: HBASE-22348
> URL: https://issues.apache.org/jira/browse/HBASE-22348
> Project: HBase
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
> Attachments: HBASE-22348.patch
>
>
> Minor, but it does create extra ZK traffic for no reason and there's no way 
> to disable that, it appears. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-22348) allow one to actually disable replication svc

2019-05-01 Thread Sergey Shelukhin (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-22348?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HBASE-22348:
-
Status: Patch Available  (was: Open)

> allow one to actually disable replication svc
> -
>
> Key: HBASE-22348
> URL: https://issues.apache.org/jira/browse/HBASE-22348
> Project: HBase
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Priority: Major
> Attachments: HBASE-22348.patch
>
>
> Minor, but it does create extra ZK traffic for no reason and there's no way 
> to disable that, it appears. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-22348) allow one to actually disable replication svc

2019-05-01 Thread Sergey Shelukhin (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-22348?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HBASE-22348:
-
Attachment: HBASE-22348.patch

> allow one to actually disable replication svc
> -
>
> Key: HBASE-22348
> URL: https://issues.apache.org/jira/browse/HBASE-22348
> Project: HBase
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Priority: Major
> Attachments: HBASE-22348.patch
>
>
> Minor, but it does create extra ZK traffic for no reason and there's no way 
> to disable that, it appears. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HBASE-22348) allow one to actually disable replication svc

2019-05-01 Thread Sergey Shelukhin (JIRA)
Sergey Shelukhin created HBASE-22348:


 Summary: allow one to actually disable replication svc
 Key: HBASE-22348
 URL: https://issues.apache.org/jira/browse/HBASE-22348
 Project: HBase
  Issue Type: Bug
Reporter: Sergey Shelukhin


Minor, but it does create extra ZK traffic for no reason and there's no way to 
disable that, it appears. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-22347) try to archive WALs when closing a region or when shutting down RS

2019-05-01 Thread Sergey Shelukhin (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-22347?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HBASE-22347:
-
Description: 
When RS shuts down in an orderly manner due to an upgrade or decom, even if it has 
0 regions (discovered when testing HBASE-22254), it still dies with some active 
WALs. 
WALs are then split by master, and in the 0-region case the recovered edits are 
not used for anything.  This splitting is a waste of time... if some region is 
moved away from the server it might also make sense to archive the WALs to 
avoid reading the extras.
RS shutdown should archive WALs if possible after flushing/closing regions; 
given that the latter can fail, perhaps once before, and once after.
Closing a region via an RPC should also try to archive WAL.

  was:
When RS shuts down in an orderly manner due to an upgrade or decom, even if it has 
0 regions (discovered when testing HBASE-22254), it still dies with some active 
WALs. 
WALs are then split by master, and in the 0-region case the recovered edits are 
not used for anything.
RS shutdown should archive WALs if possible after flushing/closing regions; 
given that the latter can fail, perhaps once before, and once after.
Closing a region via an RPC should also try to archive WAL.


> try to archive WALs when closing a region or when shutting down RS
> --
>
> Key: HBASE-22347
> URL: https://issues.apache.org/jira/browse/HBASE-22347
> Project: HBase
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Priority: Major
>
> When RS shuts down in an orderly manner due to an upgrade or decom, even if it 
> has 0 regions (discovered when testing HBASE-22254), it still dies with some 
> active WALs. 
> WALs are then split by master, and in the 0-region case the recovered edits 
> are not used for anything.  This splitting is a waste of time... if some 
> region is moved away from the server it might also make sense to archive the 
> WALs to avoid reading the extras.
> RS shutdown should archive WALs if possible after flushing/closing regions; 
> given that the latter can fail, perhaps once before, and once after.
> Closing a region via an RPC should also try to archive WAL.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HBASE-22347) try to archive WALs when closing a region or when shutting down RS

2019-05-01 Thread Sergey Shelukhin (JIRA)
Sergey Shelukhin created HBASE-22347:


 Summary: try to archive WALs when closing a region or when 
shutting down RS
 Key: HBASE-22347
 URL: https://issues.apache.org/jira/browse/HBASE-22347
 Project: HBase
  Issue Type: Bug
Reporter: Sergey Shelukhin


When RS shuts down in an orderly manner due to an upgrade or decom, even if it has 
0 regions (discovered when testing HBASE-22254), it still dies with some active 
WALs. 
WALs are then split by master, and in the 0-region case the recovered edits are 
not used for anything.
RS shutdown should archive WALs if possible after flushing/closing regions; 
given that the latter can fail, perhaps once before, and once after.
Closing a region via an RPC should also try to archive WAL.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-22346) scanner priorities/deadline units are invalid for non-huge scanners

2019-05-01 Thread Sergey Shelukhin (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-22346?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HBASE-22346:
-
Status: Patch Available  (was: Open)

> scanner priorities/deadline units are invalid for non-huge scanners
> ---
>
> Key: HBASE-22346
> URL: https://issues.apache.org/jira/browse/HBASE-22346
> Project: HBase
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
> Attachments: HBASE-22346.patch
>
>
> I was looking at using the priority (deadline) queue for scanner requests; 
> what I see is that AnnotationReadingPriorityFunction, the only impl of the 
> deadline function available, implements getDeadline as sqrt of the number of 
> next() calls, from HBASE-10993.
> However, CallPriorityComparator.compare, its only caller, adds that 
> "deadline" value to the callA.getReceiveTime() in milliseconds...
> That results in some sort of a meaningless value that I assume only makes 
> sense "by coincidence" for telling apart broad and specific classes of 
> scanners... in practice next calls must be in the 1000s before it becomes 
> meaningful vs small differences in ReceivedTime
> When there's contention from many scanners, e.g. small scanners for meta, or 
> just users creating tons of scanners to the point where requests queue up, 
> the actual deadline is not accounted for and the priority function itself is 
> meaningless... In fact as queueing increases, it becomes worse because 
> receivedtime differences grow.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-22346) scanner priorities/deadline units are invalid for non-huge scanners

2019-05-01 Thread Sergey Shelukhin (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-22346?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HBASE-22346:
-
Attachment: HBASE-22346.patch

> scanner priorities/deadline units are invalid for non-huge scanners
> ---
>
> Key: HBASE-22346
> URL: https://issues.apache.org/jira/browse/HBASE-22346
> Project: HBase
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
> Attachments: HBASE-22346.patch
>
>
> I was looking at using the priority (deadline) queue for scanner requests; 
> what I see is that AnnotationReadingPriorityFunction, the only impl of the 
> deadline function available, implements getDeadline as sqrt of the number of 
> next() calls, from HBASE-10993.
> However, CallPriorityComparator.compare, its only caller, adds that 
> "deadline" value to the callA.getReceiveTime() in milliseconds...
> That results in some sort of a meaningless value that I assume only makes 
> sense "by coincidence" for telling apart broad and specific classes of 
> scanners... in practice next calls must be in the 1000s before it becomes 
> meaningful vs small differences in ReceivedTime
> When there's contention from many scanners, e.g. small scanners for meta, or 
> just users creating tons of scanners to the point where requests queue up, 
> the actual deadline is not accounted for and the priority function itself is 
> meaningless... In fact as queueing increases, it becomes worse because 
> receivedtime differences grow.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-22346) scanner priorities/deadline units are invalid for non-huge scanners

2019-04-30 Thread Sergey Shelukhin (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-22346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16830826#comment-16830826
 ] 

Sergey Shelukhin commented on HBASE-22346:
--

[~stack] [~mbertozzi] does this make sense to you? It preserves the old behavior 
with low/no overhead when unset. We will probably run this for meta only on our 
cluster and see how it goes.

> scanner priorities/deadline units are invalid for non-huge scanners
> ---
>
> Key: HBASE-22346
> URL: https://issues.apache.org/jira/browse/HBASE-22346
> Project: HBase
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
>
> I was looking at using the priority (deadline) queue for scanner requests; 
> what I see is that AnnotationReadingPriorityFunction, the only impl of the 
> deadline function available, implements getDeadline as sqrt of the number of 
> next() calls, from HBASE-10993.
> However, CallPriorityComparator.compare, its only caller, adds that 
> "deadline" value to the callA.getReceiveTime() in milliseconds...
> That results in some sort of a meaningless value that I assume only makes 
> sense "by coincidence" for telling apart broad and specific classes of 
> scanners... in practice next calls must be in the 1000s before it becomes 
> meaningful vs small differences in ReceivedTime
> When there's contention from many scanners, e.g. small scanners for meta, or 
> just users creating tons of scanners to the point where requests queue up, 
> the actual deadline is not accounted for and the priority function itself is 
> meaningless... In fact as queueing increases, it becomes worse because 
> receivedtime differences grow.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (HBASE-22346) scanner priorities/deadline units are invalid for non-huge scanners

2019-04-30 Thread Sergey Shelukhin (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-22346?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin reassigned HBASE-22346:


Assignee: Sergey Shelukhin

> scanner priorities/deadline units are invalid for non-huge scanners
> ---
>
> Key: HBASE-22346
> URL: https://issues.apache.org/jira/browse/HBASE-22346
> Project: HBase
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
>
> I was looking at using the priority (deadline) queue for scanner requests; 
> what I see is that AnnotationReadingPriorityFunction, the only impl of the 
> deadline function available, implements getDeadline as sqrt of the number of 
> next() calls, from HBASE-10993.
> However, CallPriorityComparator.compare, its only caller, adds that 
> "deadline" value to the callA.getReceiveTime() in milliseconds...
> That results in some sort of a meaningless value that I assume only makes 
> sense "by coincidence" for telling apart broad and specific classes of 
> scanners... in practice next calls must be in the 1000s before it becomes 
> meaningful vs small differences in ReceivedTime
> When there's contention from many scanners, e.g. small scanners for meta, or 
> just users creating tons of scanners to the point where requests queue up, 
> the actual deadline is not accounted for and the priority function itself is 
> meaningless... In fact as queueing increases, it becomes worse because 
> receivedtime differences grow.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-22081) master shutdown: close RpcServer and procWAL first thing

2019-04-30 Thread Sergey Shelukhin (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-22081?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16830810#comment-16830810
 ] 

Sergey Shelukhin commented on HBASE-22081:
--

[~Apache9] does this patch make sense to you? It moves RPC server and proc WAL 
closing to the beginning of the shutdown to limit potential race conditions 
with incorrect state or new requests.
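
For reference, the ordering being argued for, as a sketch (all types and method names here are placeholders, not the real HMaster/ProcedureStore/ZooKeeper classes):

{noformat}
// Sketch of the proposed shutdown order; names are illustrative only.
final class MasterShutdownOrderSketch {
  interface RpcServer { void stop(); }
  interface ProcedureStore { void stop(boolean abort); }
  interface ZkCoordinator { void deleteMasterZNode(); }

  void shutdown(RpcServer rpc, ProcedureStore procStore, ZkCoordinator zk,
      Runnable procWalCleanup, Runnable otherCleanup) {
    rpc.stop();              // 1) stop serving RPCs first: no new reports or admin calls
    procWalCleanup.run();    // 2) cleanup that still needs the procedure WAL
    procStore.stop(false);   // 3) close the proc WAL so errant threads cannot create more procs
    otherCleanup.run();      // 4) remaining cleanup
    zk.deleteMasterZNode();  // 5) give up mastership last, once no more state can change
  }
}
{noformat}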

> master shutdown: close RpcServer and procWAL first thing
> 
>
> Key: HBASE-22081
> URL: https://issues.apache.org/jira/browse/HBASE-22081
> Project: HBase
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
> Attachments: HBASE-22081.01.patch, HBASE-22081.02.patch, 
> HBASE-22081.03.patch, HBASE-22081.patch
>
>
> I had a master get stuck due to HBASE-22079 and noticed it was logging RS 
> abort messages during shutdown.
> [~bahramch] found some issues where messages are processed by old master 
> during shutdown due to a race condition in RS cache (or it could also happen 
> due to a network race).
> Previously I found some bug where SCP was created during master shutdown that 
> had incorrect state (because some structures already got cleaned).
> I think before master fencing is implemented we can at least make these 
> issues much less likely by thinking about shutdown order.
> 1) First kill the RPC server so we don't receive any more messages. There's no 
> need to receive messages when we are shutting down. Server heartbeats could 
> be impacted I guess, but I don't think they will be, because we currently only 
> kill RS on ZK timeout.
> 2) Then do whatever cleanup we think is needed that requires proc wal.
> 3) Then close proc WAL so no errant threads can create more procs.
> 4) Then do whatever other cleanup.
> 5) Finally delete znode.
> Right now znode is deleted somewhat early I think, and RpcServer is closed 
> very late.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-22081) master shutdown: close RpcServer and procWAL first thing

2019-04-30 Thread Sergey Shelukhin (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-22081?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16830659#comment-16830659
 ] 

Sergey Shelukhin commented on HBASE-22081:
--

Interesting... tests pass in the JIRA and locally, but not in the PR.

> master shutdown: close RpcServer and procWAL first thing
> 
>
> Key: HBASE-22081
> URL: https://issues.apache.org/jira/browse/HBASE-22081
> Project: HBase
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
> Attachments: HBASE-22081.01.patch, HBASE-22081.02.patch, 
> HBASE-22081.03.patch, HBASE-22081.patch
>
>
> I had a master get stuck due to HBASE-22079 and noticed it was logging RS 
> abort messages during shutdown.
> [~bahramch] found some issues where messages are processed by old master 
> during shutdown due to a race condition in RS cache (or it could also happen 
> due to a network race).
> Previously I found some bug where SCP was created during master shutdown that 
> had incorrect state (because some structures already got cleaned).
> I think before master fencing is implemented we can at least make these 
> issues much less likely by thinking about shutdown order.
> 1) First kill the RPC server so we don't receive any more messages. There's no 
> need to receive messages when we are shutting down. Server heartbeats could 
> be impacted I guess, but I don't think they will be, because we currently only 
> kill RS on ZK timeout.
> 2) Then do whatever cleanup we think is needed that requires proc wal.
> 3) Then close proc WAL so no errant threads can create more procs.
> 4) Then do whatever other cleanup.
> 5) Finally delete znode.
> Right now znode is deleted somewhat early I think, and RpcServer is closed 
> very late.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-22081) master shutdown: close RpcServer and procWAL first thing

2019-04-30 Thread Sergey Shelukhin (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-22081?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HBASE-22081:
-
Attachment: HBASE-22081.03.patch

> master shutdown: close RpcServer and procWAL first thing
> 
>
> Key: HBASE-22081
> URL: https://issues.apache.org/jira/browse/HBASE-22081
> Project: HBase
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
> Attachments: HBASE-22081.01.patch, HBASE-22081.02.patch, 
> HBASE-22081.03.patch, HBASE-22081.patch
>
>
> I had a master get stuck due to HBASE-22079 and noticed it was logging RS 
> abort messages during shutdown.
> [~bahramch] found some issues where messages are processed by old master 
> during shutdown due to a race condition in RS cache (or it could also happen 
> due to a network race).
> Previously I found some bug where SCP was created during master shutdown that 
> had incorrect state (because some structures already got cleaned).
> I think before master fencing is implemented we can at least make these 
> issues much less likely by thinking about shutdown order.
> 1) First kill the RPC server so we don't receive any more messages. There's no 
> need to receive messages when we are shutting down. Server heartbeats could 
> be impacted I guess, but I don't think they will be, because we currently only 
> kill RS on ZK timeout.
> 2) Then do whatever cleanup we think is needed that requires proc wal.
> 3) Then close proc WAL so no errant threads can create more procs.
> 4) Then do whatever other cleanup.
> 5) Finally delete znode.
> Right now znode is deleted somewhat early I think, and RpcServer is closed 
> very late.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-22346) scanner priorities/deadline units are invalid for non-huge scanners

2019-04-30 Thread Sergey Shelukhin (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-22346?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HBASE-22346:
-
Description: 
I was looking at using the priority (deadline) queue for scanner requests; what 
I see is that AnnotationReadingPriorityFunction, the only impl of the deadline 
function available, implements getDeadline as sqrt of the number of next() 
calls, from HBASE-10993.
However, CallPriorityComparator.compare, its only caller, adds that "deadline" 
value to the callA.getReceiveTime() in milliseconds...

That results in some sort of a meaningless value that I assume only makes sense 
"by coincidence" for telling apart broad and specific classes of scanners... in 
practice next calls must be in the 1000s before it becomes meaningful vs small 
differences in ReceivedTime

When there's contention from many scanners, e.g. small scanners for meta, or 
just users creating tons of scanners to the point where requests queue up, the 
actual deadline is not accounted for and the priority function itself is 
meaningless... In fact as queueing increases, it becomes worse because 
receivedtime differences grow.

  was:
I was looking at using the priority (deadline) queue for scanner requests; what 
I see is that AnnotationReadingPriorityFunction, the only impl of the deadline 
function available, implements getDeadline as sqrt of the number of next() 
calls, from HBASE-10993.
However, CallPriorityComparator.compare, its only caller, adds that "deadline" 
value to the callA.getReceiveTime() in milliseconds...

That results in some sort of a meaningless value that I assume only makes sense 
"by coincidence" for telling apart broad and specific classes of scanners... in 
practice next calls must be in the 1000s before it becomes meaningful vs small 
differences in ReceivedTime

When there's contention for many scanners, e.g. small scanners for meta, or 
just users creating tons of scanners to the point where requests queue up, the 
actual deadline is not accounted for and the priority function itself is 
meaningless... In fact as queueing increases, it becomes worse because 
receivedtime differences grow.


> scanner priorities/deadline units are invalid for non-huge scanners
> ---
>
> Key: HBASE-22346
> URL: https://issues.apache.org/jira/browse/HBASE-22346
> Project: HBase
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Priority: Major
>
> I was looking at using the priority (deadline) queue for scanner requests; 
> what I see is that AnnotationReadingPriorityFunction, the only impl of the 
> deadline function available, implements getDeadline as sqrt of the number of 
> next() calls, from HBASE-10993.
> However, CallPriorityComparator.compare, its only caller, adds that 
> "deadline" value to the callA.getReceiveTime() in milliseconds...
> That results in some sort of a meaningless value that I assume only makes 
> sense "by coincidence" for telling apart broad and specific classes of 
> scanners... in practice next calls must be in the 1000s before it becomes 
> meaningful vs small differences in ReceivedTime
> When there's contention from many scanners, e.g. small scanners for meta, or 
> just users creating tons of scanners to the point where requests queue up, 
> the actual deadline is not accounted for and the priority function itself is 
> meaningless... In fact as queueing increases, it becomes worse because 
> receivedtime differences grow.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-22346) scanner priorities/deadline units are invalid for non-huge scanners

2019-04-30 Thread Sergey Shelukhin (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-22346?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HBASE-22346:
-
Description: 
I was looking at using the priority (deadline) queue for scanner requests; what 
I see is that AnnotationReadingPriorityFunction, the only impl of the deadline 
function available, implements getDeadline as sqrt of the number of next() 
calls, from HBASE-10993.
However, CallPriorityComparator.compare, its only caller, adds that "deadline" 
value to the callA.getReceiveTime() in milliseconds...

That results in some sort of a meaningless value that I assume only makes sense 
"by coincidence" for telling apart broad and specific classes of scanners... in 
practice next calls must be in the 1000s before it becomes meaningful vs small 
differences in ReceivedTime

When there's contention for many scanners, e.g. small scanners for meta, or 
just users creating tons of scanners to the point where requests queue up, the 
actual deadline is not accounted for and the priority function itself is 
meaningless... In fact as queueing increases, it becomes worse because 
receivedtime differences grow.

  was:
I was looking at using the priority (deadline) queue for scanner requests; what 
I see is that AnnotationReadingPriorityFunction, the only impl of the deadline 
function available, implements getDeadline as sqrt of the number of next() 
calls, from HBASE-10993.
However, CallPriorityComparator.compare, its only caller, adds that "deadline" 
value to the callA.getReceiveTime() in milliseconds...

That results in some sort of a meaningless value that I assume only makes sense 
by coincidence for telling apart broad and specific classes of scanners... in 
practice next calls must be in the 1000s before it becomes meaningful vs small 
differences in ReceivedTime

When there's contention for many scanners, e.g. small scanners for meta, or 
just users creating tons of scanners to the point where requests queue up, the 
actual deadline is not accounted for and the priority function itself is 
meaningless... In fact as queueing increases, it becomes worse because 
receivedtime differences grow.


> scanner priorities/deadline units are invalid for non-huge scanners
> ---
>
> Key: HBASE-22346
> URL: https://issues.apache.org/jira/browse/HBASE-22346
> Project: HBase
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Priority: Major
>
> I was looking at using the priority (deadline) queue for scanner requests; 
> what I see is that AnnotationReadingPriorityFunction, the only impl of the 
> deadline function available, implements getDeadline as sqrt of the number of 
> next() calls, from HBASE-10993.
> However, CallPriorityComparator.compare, its only caller, adds that 
> "deadline" value to the callA.getReceiveTime() in milliseconds...
> That results in some sort of a meaningless value that I assume only makes 
> sense "by coincidence" for telling apart broad and specific classes of 
> scanners... in practice next calls must be in the 1000s before it becomes 
> meaningful vs small differences in ReceivedTime
> When there's contention for many scanners, e.g. small scanners for meta, or 
> just users creating tons of scanners to the point where requests queue up, 
> the actual deadline is not accounted for and the priority function itself is 
> meaningless... In fact as queueing increases, it becomes worse because 
> receivedtime differences grow.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-22346) scanner priorities/deadline units are invalid for non-huge scanners

2019-04-30 Thread Sergey Shelukhin (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-22346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16830647#comment-16830647
 ] 

Sergey Shelukhin commented on HBASE-22346:
--

cc [~mbertozzi] was adding the number to received time intentional?

> scanner priorities/deadline units are invalid for non-huge scanners
> ---
>
> Key: HBASE-22346
> URL: https://issues.apache.org/jira/browse/HBASE-22346
> Project: HBase
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Priority: Major
>
> I was looking at using the priority (deadline) queue for scanner requests; 
> what I see is that AnnotationReadingPriorityFunction, the only impl of the 
> deadline function available, implements getDeadline as sqrt of the number of 
> next() calls, from HBASE-10993.
> However, CallPriorityComparator.compare, its only caller, adds that 
> "deadline" value to the callA.getReceiveTime() in milliseconds...
> That results in some sort of a meaningless value that I assume only makes 
> sense "by coincidence" for telling apart broad and specific classes of 
> scanners... in practice next calls must be in the 1000s before it becomes 
> meaningful vs small differences in ReceivedTime
> When there's contention from many scanners, e.g. small scanners for meta, or 
> just users creating tons of scanners to the point where requests queue up, 
> the actual deadline is not accounted for and the priority function itself is 
> meaningless... In fact as queueing increases, it becomes worse because 
> receivedtime differences grow.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HBASE-22346) scanner priorities/deadline units are invalid for non-huge scanners

2019-04-30 Thread Sergey Shelukhin (JIRA)
Sergey Shelukhin created HBASE-22346:


 Summary: scanner priorities/deadline units are invalid for 
non-huge scanners
 Key: HBASE-22346
 URL: https://issues.apache.org/jira/browse/HBASE-22346
 Project: HBase
  Issue Type: Bug
Reporter: Sergey Shelukhin


I was looking at using the priority (deadline) queue for scanner requests; what 
I see is that AnnotationReadingPriorityFunction, the only impl of the deadline 
function available, implements getDeadline as sqrt of the number of next() 
calls, from HBASE-10993.
However, CallPriorityComparator.compare, its only caller, adds that "deadline" 
value to the callA.getReceiveTime() in milliseconds...

That results in some sort of a meaningless value that I assume only makes sense 
by coincidence for telling apart broad and specific classes of scanners... in 
practice next calls must be in the 1000s before it becomes meaningful vs small 
differences in ReceivedTime

When there's contention for many scanners, e.g. small scanners for meta, or 
just users creating tons of scanners to the point where requests queue up, the 
actual deadline is not accounted for and the priority function itself is 
meaningless... In fact as queueing increases, it becomes worse because 
receivedtime differences grow.
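
A toy illustration of the unit mismatch (the class and numbers are made up; it just mirrors the "receive time in ms plus sqrt of next() calls" ordering described above):

{noformat}
// Illustrative only: shows why sqrt(nextCalls) is negligible next to millisecond
// receive-time differences under queueing.
final class DeadlineUnitsSketch {
  static long effectiveOrderingValue(long receiveTimeMs, long nextCallSeq) {
    return receiveTimeMs + (long) Math.sqrt(nextCallSeq);   // lower value = served earlier
  }

  public static void main(String[] args) {
    long now = System.currentTimeMillis();
    long heavyOldScanner = effectiveOrderingValue(now, 10_000);   // sqrt(10000) adds only 100
    long freshNewScanner = effectiveOrderingValue(now + 200, 0);  // arrived 200 ms later
    // true: the heavy scanner still sorts ahead purely because it arrived earlier, so the
    // intended deprioritization has no effect until next() counts reach the many thousands.
    System.out.println(heavyOldScanner < freshNewScanner);
  }
}
{noformat}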



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-22301) Consider rolling the WAL if the HDFS write pipeline is slow

2019-04-30 Thread Sergey Shelukhin (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-22301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16830463#comment-16830463
 ] 

Sergey Shelukhin commented on HBASE-22301:
--

It may do so anyway, due to HDFS-14387.
What is the thing that prevents it from picking the local node?
+1

> Consider rolling the WAL if the HDFS write pipeline is slow
> ---
>
> Key: HBASE-22301
> URL: https://issues.apache.org/jira/browse/HBASE-22301
> Project: HBase
>  Issue Type: Improvement
>  Components: wal
>Reporter: Andrew Purtell
>Assignee: Andrew Purtell
>Priority: Minor
> Fix For: 3.0.0, 1.5.0, 2.3.0
>
> Attachments: HBASE-22301-branch-1.patch, HBASE-22301-branch-1.patch, 
> HBASE-22301-branch-1.patch, HBASE-22301-branch-1.patch, 
> HBASE-22301-branch-1.patch, HBASE-22301-branch-1.patch
>
>
> Consider the case when a subset of the HDFS fleet is unhealthy but suffering 
> a gray failure not an outright outage. HDFS operations, notably syncs, are 
> abnormally slow on pipelines which include this subset of hosts. If the 
> regionserver's WAL is backed by an impacted pipeline, all WAL handlers can be 
> consumed waiting for acks from the datanodes in the pipeline (recall that 
> some of them are sick). Imagine a write heavy application distributing load 
> uniformly over the cluster at a fairly high rate. With the WAL subsystem 
> slowed by HDFS level issues, all handlers can be blocked waiting to append to 
> the WAL. Once all handlers are blocked, the application will experience 
> backpressure. All (HBase) clients eventually have too many outstanding writes 
> and block.
> Because the application is distributing writes near uniformly in the 
> keyspace, the probability any given service endpoint will dispatch a request 
> to an impacted regionserver, even a single regionserver, approaches 1.0. So 
> the probability that all service endpoints will be affected approaches 1.0.
> In order to break the logjam, we need to remove the slow datanodes. Although 
> there is HDFS level monitoring, mechanisms, and procedures for this, we 
> should also attempt to take mitigating action at the HBase layer as soon as 
> we find ourselves in trouble. It would be enough to remove the affected 
> datanodes from the writer pipelines. A super simple strategy that can be 
> effective is described below:
> This is with branch-1 code. I think branch-2's async WAL can mitigate but 
> still can be susceptible. branch-2 sync WAL is susceptible. 
> We already roll the WAL writer if the pipeline suffers the failure of a 
> datanode and the replication factor on the pipeline is too low. We should 
> also consider how much time it took for the write pipeline to complete a sync 
> the last time we measured it, or the max over the interval from now to the 
> last time we checked. If the sync time exceeds a configured threshold, roll 
> the log writer then too. Fortunately we don't need to know which datanode is 
> making the WAL write pipeline slow, only that syncs on the pipeline are too 
> slow and exceeding a threshold. This is enough information to know when to 
> roll it. Once we roll it, we will get three new randomly selected datanodes. 
> On most clusters the probability the new pipeline includes the slow datanode 
> will be low. (And if for some reason it does end up with a problematic 
> datanode again, we roll again.)
> This is not a silver bullet but this can be a reasonably effective mitigation.
> Provide a metric for tracking when log roll is requested (and for what 
> reason).
> Emit a log line at log roll time that includes datanode pipeline details for 
> further debugging and analysis, similar to the existing slow FSHLog sync log 
> line.
> If we roll too many times within a short interval of time this probably means 
> there is a widespread problem with the fleet and so our mitigation is not 
> helping and may be exacerbating those problems or operator difficulties. 
> Ensure log roll requests triggered by this new feature happen infrequently 
> enough to not cause difficulties under either normal or abnormal conditions. 
> A very simple strategy that could work well under both normal and abnormal 
> conditions is to define a fairly lengthy interval, default 5 minutes, and 
> then ensure we do not roll more than once during this interval for this 
> reason.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-22081) master shutdown: close RpcServer and procWAL first thing

2019-04-29 Thread Sergey Shelukhin (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-22081?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16829874#comment-16829874
 ] 

Sergey Shelukhin commented on HBASE-22081:
--

This patch is getting more and more interesting.
Looks like some procedures do not handle InterruptedIOException correctly, 
retrying it forever, which in the case of the minicluster prevents it from 
shutting down. Not sure how the order of termination affected it; probably the 
proc WAL terminating early just catches the proc in the test in a different 
state than it did before.
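
The pattern being described, as a sketch (illustrative only, not an actual Procedure implementation): a retry loop that treats InterruptedIOException like any other IOException keeps spinning after the thread is interrupted, so the distinction has to be explicit.

{noformat}
import java.io.IOException;
import java.io.InterruptedIOException;

// Illustrative retry helper; assumes maxAttempts >= 1.
final class RetryLoopSketch {
  interface Op { void run() throws IOException; }

  static void runWithRetries(Op op, int maxAttempts) throws IOException {
    IOException last = null;
    for (int attempt = 0; attempt < maxAttempts; attempt++) {
      try {
        op.run();
        return;
      } catch (InterruptedIOException e) {
        // The thread was asked to stop (e.g. minicluster/master shutdown):
        // propagate immediately instead of retrying forever.
        throw e;
      } catch (IOException e) {
        last = e;   // other IO errors may be retried (backoff omitted in this sketch)
      }
    }
    throw last;
  }
}
{noformat}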

> master shutdown: close RpcServer and procWAL first thing
> 
>
> Key: HBASE-22081
> URL: https://issues.apache.org/jira/browse/HBASE-22081
> Project: HBase
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
> Attachments: HBASE-22081.01.patch, HBASE-22081.02.patch, 
> HBASE-22081.patch
>
>
> I had a master get stuck due to HBASE-22079 and noticed it was logging RS 
> abort messages during shutdown.
> [~bahramch] found some issues where messages are processed by old master 
> during shutdown due to a race condition in RS cache (or it could also happen 
> due to a network race).
> Previously I found some bug where SCP was created during master shutdown that 
> had incorrect state (because some structures already got cleaned).
> I think before master fencing is implemented we can at least make these 
> issues much less likely by thinking about shutdown order.
> 1) First kill the RPC server so we don't receive any more messages. There's no 
> need to receive messages when we are shutting down. Server heartbeats could 
> be impacted I guess, but I don't think they will be, because we currently only 
> kill RS on ZK timeout.
> 2) Then do whatever cleanup we think is needed that requires proc wal.
> 3) Then close proc WAL so no errant threads can create more procs.
> 4) Then do whatever other cleanup.
> 5) Finally delete znode.
> Right now znode is deleted somewhat early I think, and RpcServer is closed 
> very late.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-22081) master shutdown: close RpcServer and procWAL first thing

2019-04-29 Thread Sergey Shelukhin (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-22081?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HBASE-22081:
-
Attachment: HBASE-22081.02.patch

> master shutdown: close RpcServer and procWAL first thing
> 
>
> Key: HBASE-22081
> URL: https://issues.apache.org/jira/browse/HBASE-22081
> Project: HBase
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
> Attachments: HBASE-22081.01.patch, HBASE-22081.02.patch, 
> HBASE-22081.patch
>
>
> I had a master get stuck due to HBASE-22079 and noticed it was logging RS 
> abort messages during shutdown.
> [~bahramch] found some issues where messages are processed by old master 
> during shutdown due to a race condition in RS cache (or it could also happen 
> due to a network race).
> Previously I found some bug where SCP was created during master shutdown that 
> had incorrect state (because some structures already got cleaned).
> I think before master fencing is implemented we can at least make these 
> issues much less likely by thinking about shutdown order.
> 1) First kill the RPC server so we don't receive any more messages. There's no 
> need to receive messages when we are shutting down. Server heartbeats could 
> be impacted I guess, but I don't think they will be, because we currently only 
> kill RS on ZK timeout.
> 2) Then do whatever cleanup we think is needed that requires proc wal.
> 3) Then close proc WAL so no errant threads can create more procs.
> 4) Then do whatever other cleanup.
> 5) Finally delete znode.
> Right now znode is deleted somewhat early I think, and RpcServer is closed 
> very late.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Comment Edited] (HBASE-22301) Consider rolling the WAL if the HDFS write pipeline is slow

2019-04-29 Thread Sergey Shelukhin (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-22301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16829803#comment-16829803
 ] 

Sergey Shelukhin edited comment on HBASE-22301 at 4/29/19 10:59 PM:


Well I meant 

> Consider rolling the WAL if the HDFS write pipeline is slow
> ---
>
> Key: HBASE-22301
> URL: https://issues.apache.org/jira/browse/HBASE-22301
> Project: HBase
>  Issue Type: Improvement
>  Components: wal
>Reporter: Andrew Purtell
>Assignee: Andrew Purtell
>Priority: Minor
> Fix For: 3.0.0, 1.5.0, 2.3.0
>
> Attachments: HBASE-22301-branch-1.patch, HBASE-22301-branch-1.patch, 
> HBASE-22301-branch-1.patch, HBASE-22301-branch-1.patch, 
> HBASE-22301-branch-1.patch
>
>
> Consider the case when a subset of the HDFS fleet is unhealthy but suffering 
> a gray failure not an outright outage. HDFS operations, notably syncs, are 
> abnormally slow on pipelines which include this subset of hosts. If the 
> regionserver's WAL is backed by an impacted pipeline, all WAL handlers can be 
> consumed waiting for acks from the datanodes in the pipeline (recall that 
> some of them are sick). Imagine a write heavy application distributing load 
> uniformly over the cluster at a fairly high rate. With the WAL subsystem 
> slowed by HDFS level issues, all handlers can be blocked waiting to append to 
> the WAL. Once all handlers are blocked, the application will experience 
> backpressure. All (HBase) clients eventually have too many outstanding writes 
> and block.
> Because the application is distributing writes near uniformly in the 
> keyspace, the probability any given service endpoint will dispatch a request 
> to an impacted regionserver, even a single regionserver, approaches 1.0. So 
> the probability that all service endpoints will be affected approaches 1.0.
> In order to break the logjam, we need to remove the slow datanodes. Although 
> there is HDFS level monitoring, mechanisms, and procedures for this, we 
> should also attempt to take mitigating action at the HBase layer as soon as 
> we find ourselves in trouble. It would be enough to remove the affected 
> datanodes from the writer pipelines. A super simple strategy that can be 
> effective is described below:
> This is with branch-1 code. I think branch-2's async WAL can mitigate but 
> still can be susceptible. branch-2 sync WAL is susceptible. 
> We already roll the WAL writer if the pipeline suffers the failure of a 
> datanode and the replication factor on the pipeline is too low. We should 
> also consider how much time it took for the write pipeline to complete a sync 
> the last time we measured it, or the max over the interval from now to the 
> last time we checked. If the sync time exceeds a configured threshold, roll 
> the log writer then too. Fortunately we don't need to know which datanode is 
> making the WAL write pipeline slow, only that syncs on the pipeline are too 
> slow and exceeding a threshold. This is enough information to know when to 
> roll it. Once we roll it, we will get three new randomly selected datanodes. 
> On most clusters the probability the new pipeline includes the slow datanode 
> will be low. (And if for some reason it does end up with a problematic 
> datanode again, we roll again.)
> This is not a silver bullet but this can be a reasonably effective mitigation.
> Provide a metric for tracking when log roll is requested (and for what 
> reason).
> Emit a log line at log roll time that includes datanode pipeline details for 
> further debugging and analysis, similar to the existing slow FSHLog sync log 
> line.
> If we roll too many times within a short interval of time this probably means 
> there is a widespread problem with the fleet and so our mitigation is not 
> helping and may be exacerbating those problems or operator difficulties. 
> Ensure log roll requests triggered by this new feature happen infrequently 
> enough to not cause difficulties under either normal or abnormal conditions. 
> A very simple strategy that could work well under both normal and abnormal 
> conditions is to define a fairly lengthy interval, default 5 minutes, and 
> then ensure we do not roll more than once during this interval for this 
> reason.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-22301) Consider rolling the WAL if the HDFS write pipeline is slow

2019-04-29 Thread Sergey Shelukhin (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-22301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16829803#comment-16829803
 ] 

Sergey Shelukhin commented on HBASE-22301:
--

Well I meant 

> Consider rolling the WAL if the HDFS write pipeline is slow
> ---
>
> Key: HBASE-22301
> URL: https://issues.apache.org/jira/browse/HBASE-22301
> Project: HBase
>  Issue Type: Improvement
>  Components: wal
>Reporter: Andrew Purtell
>Assignee: Andrew Purtell
>Priority: Minor
> Fix For: 3.0.0, 1.5.0, 2.3.0
>
> Attachments: HBASE-22301-branch-1.patch, HBASE-22301-branch-1.patch, 
> HBASE-22301-branch-1.patch, HBASE-22301-branch-1.patch, 
> HBASE-22301-branch-1.patch
>
>
> Consider the case when a subset of the HDFS fleet is unhealthy but suffering 
> a gray failure not an outright outage. HDFS operations, notably syncs, are 
> abnormally slow on pipelines which include this subset of hosts. If the 
> regionserver's WAL is backed by an impacted pipeline, all WAL handlers can be 
> consumed waiting for acks from the datanodes in the pipeline (recall that 
> some of them are sick). Imagine a write heavy application distributing load 
> uniformly over the cluster at a fairly high rate. With the WAL subsystem 
> slowed by HDFS level issues, all handlers can be blocked waiting to append to 
> the WAL. Once all handlers are blocked, the application will experience 
> backpressure. All (HBase) clients eventually have too many outstanding writes 
> and block.
> Because the application is distributing writes near uniformly in the 
> keyspace, the probability any given service endpoint will dispatch a request 
> to an impacted regionserver, even a single regionserver, approaches 1.0. So 
> the probability that all service endpoints will be affected approaches 1.0.
> In order to break the logjam, we need to remove the slow datanodes. Although 
> there is HDFS level monitoring, mechanisms, and procedures for this, we 
> should also attempt to take mitigating action at the HBase layer as soon as 
> we find ourselves in trouble. It would be enough to remove the affected 
> datanodes from the writer pipelines. A super simple strategy that can be 
> effective is described below:
> This is with branch-1 code. I think branch-2's async WAL can mitigate but 
> still can be susceptible. branch-2 sync WAL is susceptible. 
> We already roll the WAL writer if the pipeline suffers the failure of a 
> datanode and the replication factor on the pipeline is too low. We should 
> also consider how much time it took for the write pipeline to complete a sync 
> the last time we measured it, or the max over the interval from now to the 
> last time we checked. If the sync time exceeds a configured threshold, roll 
> the log writer then too. Fortunately we don't need to know which datanode is 
> making the WAL write pipeline slow, only that syncs on the pipeline are too 
> slow and exceeding a threshold. This is enough information to know when to 
> roll it. Once we roll it, we will get three new randomly selected datanodes. 
> On most clusters the probability the new pipeline includes the slow datanode 
> will be low. (And if for some reason it does end up with a problematic 
> datanode again, we roll again.)
> This is not a silver bullet but this can be a reasonably effective mitigation.
> Provide a metric for tracking when log roll is requested (and for what 
> reason).
> Emit a log line at log roll time that includes datanode pipeline details for 
> further debugging and analysis, similar to the existing slow FSHLog sync log 
> line.
> If we roll too many times within a short interval of time this probably means 
> there is a widespread problem with the fleet and so our mitigation is not 
> helping and may be exacerbating those problems or operator difficulties. 
> Ensure log roll requests triggered by this new feature happen infrequently 
> enough to not cause difficulties under either normal or abnormal conditions. 
> A very simple strategy that could work well under both normal and abnormal 
> conditions is to define a fairly lengthy interval, default 5 minutes, and 
> then ensure we do not roll more than once during this interval for this 
> reason.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Comment Edited] (HBASE-22301) Consider rolling the WAL if the HDFS write pipeline is slow

2019-04-29 Thread Sergey Shelukhin (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-22301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16829795#comment-16829795
 ] 

Sergey Shelukhin edited comment on HBASE-22301 at 4/29/19 10:50 PM:


In our case though the problem was that each slow sync would take (edit: up to) 
10s of seconds, so with current DEFAULT_SLOW_SYNC_ROLL_THRESHOLD as far as I 
can tell from the patch the condition would not trigger for a very long time.
Should the rolling simply be based on a single value that is a total/weighted 
sync time accumulated over the N latest syncs, with no minimum threshold by 
count? That way it can accumulate the offending amount of sync time over a single 
bad one or multiple somewhat-bad ones.


was (Author: sershe):
In our case though the problem was that each slow sync would take 10s of 
seconds, so with current DEFAULT_SLOW_SYNC_ROLL_THRESHOLD as far as I can tell 
from the patch the condition would not trigger for a very long time.
Should the rolling simply based on a single value that is a total/weighted sync 
time accumulated over a N latest syncs, with no minimum threshold by count? 
That way it can accumulate the offending amount of sync over a single bad one 
or multiple somewhat-bad ones.

> Consider rolling the WAL if the HDFS write pipeline is slow
> ---
>
> Key: HBASE-22301
> URL: https://issues.apache.org/jira/browse/HBASE-22301
> Project: HBase
>  Issue Type: Improvement
>  Components: wal
>Reporter: Andrew Purtell
>Assignee: Andrew Purtell
>Priority: Minor
> Fix For: 3.0.0, 1.5.0, 2.3.0
>
> Attachments: HBASE-22301-branch-1.patch, HBASE-22301-branch-1.patch, 
> HBASE-22301-branch-1.patch, HBASE-22301-branch-1.patch, 
> HBASE-22301-branch-1.patch
>
>
> Consider the case when a subset of the HDFS fleet is unhealthy but suffering 
> a gray failure not an outright outage. HDFS operations, notably syncs, are 
> abnormally slow on pipelines which include this subset of hosts. If the 
> regionserver's WAL is backed by an impacted pipeline, all WAL handlers can be 
> consumed waiting for acks from the datanodes in the pipeline (recall that 
> some of them are sick). Imagine a write heavy application distributing load 
> uniformly over the cluster at a fairly high rate. With the WAL subsystem 
> slowed by HDFS level issues, all handlers can be blocked waiting to append to 
> the WAL. Once all handlers are blocked, the application will experience 
> backpressure. All (HBase) clients eventually have too many outstanding writes 
> and block.
> Because the application is distributing writes near uniformly in the 
> keyspace, the probability any given service endpoint will dispatch a request 
> to an impacted regionserver, even a single regionserver, approaches 1.0. So 
> the probability that all service endpoints will be affected approaches 1.0.
> In order to break the logjam, we need to remove the slow datanodes. Although 
> there is HDFS level monitoring, mechanisms, and procedures for this, we 
> should also attempt to take mitigating action at the HBase layer as soon as 
> we find ourselves in trouble. It would be enough to remove the affected 
> datanodes from the writer pipelines. A super simple strategy that can be 
> effective is described below:
> This is with branch-1 code. I think branch-2's async WAL can mitigate but 
> still can be susceptible. branch-2 sync WAL is susceptible. 
> We already roll the WAL writer if the pipeline suffers the failure of a 
> datanode and the replication factor on the pipeline is too low. We should 
> also consider how much time it took for the write pipeline to complete a sync 
> the last time we measured it, or the max over the interval from now to the 
> last time we checked. If the sync time exceeds a configured threshold, roll 
> the log writer then too. Fortunately we don't need to know which datanode is 
> making the WAL write pipeline slow, only that syncs on the pipeline are too 
> slow and exceeding a threshold. This is enough information to know when to 
> roll it. Once we roll it, we will get three new randomly selected datanodes. 
> On most clusters the probability the new pipeline includes the slow datanode 
> will be low. (And if for some reason it does end up with a problematic 
> datanode again, we roll again.)
> This is not a silver bullet but this can be a reasonably effective mitigation.
> Provide a metric for tracking when log roll is requested (and for what 
> reason).
> Emit a log line at log roll time that includes datanode pipeline details for 
> further debugging and analysis, similar to the existing slow FSHLog sync log 
> line.
> If we roll too many times within a short interval of time this probably means 
> there is a widespread problem with the fleet and so our mitigation is not 
> helping and may be exacerbating those problems or operator difficulties. 
> Ensure log roll requests triggered by this new feature happen infrequently 
> enough to not cause difficulties under either normal or abnormal conditions. 
> A very simple strategy that could work well under both normal and abnormal 
> conditions is to define a fairly lengthy interval, default 5 minutes, and 
> then ensure we do not roll more than once during this interval for this 
> reason.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Commented] (HBASE-22301) Consider rolling the WAL if the HDFS write pipeline is slow

2019-04-29 Thread Sergey Shelukhin (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-22301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16829797#comment-16829797
 ] 

Sergey Shelukhin commented on HBASE-22301:
--

Sorry about committing to master only. I just started contributing to HBase again and 
was assuming that we should move forwards, not backwards ;)
I saw that branch-2 is still very much alive, so I'm committing recent fixes 
there too.

> Consider rolling the WAL if the HDFS write pipeline is slow
> ---
>
> Key: HBASE-22301
> URL: https://issues.apache.org/jira/browse/HBASE-22301
> Project: HBase
>  Issue Type: Improvement
>  Components: wal
>Reporter: Andrew Purtell
>Assignee: Andrew Purtell
>Priority: Minor
> Fix For: 3.0.0, 1.5.0, 2.3.0
>
> Attachments: HBASE-22301-branch-1.patch, HBASE-22301-branch-1.patch, 
> HBASE-22301-branch-1.patch, HBASE-22301-branch-1.patch, 
> HBASE-22301-branch-1.patch
>
>
> Consider the case when a subset of the HDFS fleet is unhealthy but suffering 
> a gray failure not an outright outage. HDFS operations, notably syncs, are 
> abnormally slow on pipelines which include this subset of hosts. If the 
> regionserver's WAL is backed by an impacted pipeline, all WAL handlers can be 
> consumed waiting for acks from the datanodes in the pipeline (recall that 
> some of them are sick). Imagine a write heavy application distributing load 
> uniformly over the cluster at a fairly high rate. With the WAL subsystem 
> slowed by HDFS level issues, all handlers can be blocked waiting to append to 
> the WAL. Once all handlers are blocked, the application will experience 
> backpressure. All (HBase) clients eventually have too many outstanding writes 
> and block.
> Because the application is distributing writes near uniformly in the 
> keyspace, the probability any given service endpoint will dispatch a request 
> to an impacted regionserver, even a single regionserver, approaches 1.0. So 
> the probability that all service endpoints will be affected approaches 1.0.
> In order to break the logjam, we need to remove the slow datanodes. Although 
> there is HDFS level monitoring, mechanisms, and procedures for this, we 
> should also attempt to take mitigating action at the HBase layer as soon as 
> we find ourselves in trouble. It would be enough to remove the affected 
> datanodes from the writer pipelines. A super simple strategy that can be 
> effective is described below:
> This is with branch-1 code. I think branch-2's async WAL can mitigate but 
> still can be susceptible. branch-2 sync WAL is susceptible. 
> We already roll the WAL writer if the pipeline suffers the failure of a 
> datanode and the replication factor on the pipeline is too low. We should 
> also consider how much time it took for the write pipeline to complete a sync 
> the last time we measured it, or the max over the interval from now to the 
> last time we checked. If the sync time exceeds a configured threshold, roll 
> the log writer then too. Fortunately we don't need to know which datanode is 
> making the WAL write pipeline slow, only that syncs on the pipeline are too 
> slow and exceeding a threshold. This is enough information to know when to 
> roll it. Once we roll it, we will get three new randomly selected datanodes. 
> On most clusters the probability the new pipeline includes the slow datanode 
> will be low. (And if for some reason it does end up with a problematic 
> datanode again, we roll again.)
> This is not a silver bullet but this can be a reasonably effective mitigation.
> Provide a metric for tracking when log roll is requested (and for what 
> reason).
> Emit a log line at log roll time that includes datanode pipeline details for 
> further debugging and analysis, similar to the existing slow FSHLog sync log 
> line.
> If we roll too many times within a short interval of time this probably means 
> there is a widespread problem with the fleet and so our mitigation is not 
> helping and may be exacerbating those problems or operator difficulties. 
> Ensure log roll requests triggered by this new feature happen infrequently 
> enough to not cause difficulties under either normal or abnormal conditions. 
> A very simple strategy that could work well under both normal and abnormal 
> conditions is to define a fairly lengthy interval, default 5 minutes, and 
> then ensure we do not roll more than once during this interval for this 
> reason.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-22301) Consider rolling the WAL if the HDFS write pipeline is slow

2019-04-29 Thread Sergey Shelukhin (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-22301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16829795#comment-16829795
 ] 

Sergey Shelukhin commented on HBASE-22301:
--

In our case though the problem was that each slow sync would take 10s of 
seconds, so with current DEFAULT_SLOW_SYNC_ROLL_THRESHOLD as far as I can tell 
from the patch the condition would not trigger for a very long time.
Should the rolling simply based on a single value that is a total/weighted sync 
time accumulated over a N latest syncs, with no minimum threshold by count? 
That way it can accumulate the offending amount of sync over a single bad one 
or multiple somewhat-bad ones.

> Consider rolling the WAL if the HDFS write pipeline is slow
> ---
>
> Key: HBASE-22301
> URL: https://issues.apache.org/jira/browse/HBASE-22301
> Project: HBase
>  Issue Type: Improvement
>  Components: wal
>Reporter: Andrew Purtell
>Assignee: Andrew Purtell
>Priority: Minor
> Fix For: 3.0.0, 1.5.0, 2.3.0
>
> Attachments: HBASE-22301-branch-1.patch, HBASE-22301-branch-1.patch, 
> HBASE-22301-branch-1.patch, HBASE-22301-branch-1.patch, 
> HBASE-22301-branch-1.patch
>
>
> Consider the case when a subset of the HDFS fleet is unhealthy but suffering 
> a gray failure not an outright outage. HDFS operations, notably syncs, are 
> abnormally slow on pipelines which include this subset of hosts. If the 
> regionserver's WAL is backed by an impacted pipeline, all WAL handlers can be 
> consumed waiting for acks from the datanodes in the pipeline (recall that 
> some of them are sick). Imagine a write heavy application distributing load 
> uniformly over the cluster at a fairly high rate. With the WAL subsystem 
> slowed by HDFS level issues, all handlers can be blocked waiting to append to 
> the WAL. Once all handlers are blocked, the application will experience 
> backpressure. All (HBase) clients eventually have too many outstanding writes 
> and block.
> Because the application is distributing writes near uniformly in the 
> keyspace, the probability any given service endpoint will dispatch a request 
> to an impacted regionserver, even a single regionserver, approaches 1.0. So 
> the probability that all service endpoints will be affected approaches 1.0.
> In order to break the logjam, we need to remove the slow datanodes. Although 
> there is HDFS level monitoring, mechanisms, and procedures for this, we 
> should also attempt to take mitigating action at the HBase layer as soon as 
> we find ourselves in trouble. It would be enough to remove the affected 
> datanodes from the writer pipelines. A super simple strategy that can be 
> effective is described below:
> This is with branch-1 code. I think branch-2's async WAL can mitigate but 
> still can be susceptible. branch-2 sync WAL is susceptible. 
> We already roll the WAL writer if the pipeline suffers the failure of a 
> datanode and the replication factor on the pipeline is too low. We should 
> also consider how much time it took for the write pipeline to complete a sync 
> the last time we measured it, or the max over the interval from now to the 
> last time we checked. If the sync time exceeds a configured threshold, roll 
> the log writer then too. Fortunately we don't need to know which datanode is 
> making the WAL write pipeline slow, only that syncs on the pipeline are too 
> slow and exceeding a threshold. This is enough information to know when to 
> roll it. Once we roll it, we will get three new randomly selected datanodes. 
> On most clusters the probability the new pipeline includes the slow datanode 
> will be low. (And if for some reason it does end up with a problematic 
> datanode again, we roll again.)
> This is not a silver bullet but this can be a reasonably effective mitigation.
> Provide a metric for tracking when log roll is requested (and for what 
> reason).
> Emit a log line at log roll time that includes datanode pipeline details for 
> further debugging and analysis, similar to the existing slow FSHLog sync log 
> line.
> If we roll too many times within a short interval of time this probably means 
> there is a widespread problem with the fleet and so our mitigation is not 
> helping and may be exacerbating those problems or operator difficulties. 
> Ensure log roll requests triggered by this new feature happen infrequently 
> enough to not cause difficulties under either normal or abnormal conditions. 
> A very simple strategy that could work well under both normal and abnormal 
> conditions is to define a fairly lengthy interval, default 5 minutes, and 
> then ensure we do not roll more than once during this interval for this 
> reason.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-22301) Consider rolling the WAL if the HDFS write pipeline is slow

2019-04-29 Thread Sergey Shelukhin (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-22301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16829784#comment-16829784
 ] 

Sergey Shelukhin commented on HBASE-22301:
--

Should this augment/be similar to HBASE-21806?

> Consider rolling the WAL if the HDFS write pipeline is slow
> ---
>
> Key: HBASE-22301
> URL: https://issues.apache.org/jira/browse/HBASE-22301
> Project: HBase
>  Issue Type: Improvement
>  Components: wal
>Reporter: Andrew Purtell
>Assignee: Andrew Purtell
>Priority: Minor
> Fix For: 3.0.0, 1.5.0, 2.3.0
>
> Attachments: HBASE-22301-branch-1.patch, HBASE-22301-branch-1.patch, 
> HBASE-22301-branch-1.patch, HBASE-22301-branch-1.patch, 
> HBASE-22301-branch-1.patch
>
>
> Consider the case when a subset of the HDFS fleet is unhealthy but suffering 
> a gray failure not an outright outage. HDFS operations, notably syncs, are 
> abnormally slow on pipelines which include this subset of hosts. If the 
> regionserver's WAL is backed by an impacted pipeline, all WAL handlers can be 
> consumed waiting for acks from the datanodes in the pipeline (recall that 
> some of them are sick). Imagine a write heavy application distributing load 
> uniformly over the cluster at a fairly high rate. With the WAL subsystem 
> slowed by HDFS level issues, all handlers can be blocked waiting to append to 
> the WAL. Once all handlers are blocked, the application will experience 
> backpressure. All (HBase) clients eventually have too many outstanding writes 
> and block.
> Because the application is distributing writes near uniformly in the 
> keyspace, the probability any given service endpoint will dispatch a request 
> to an impacted regionserver, even a single regionserver, approaches 1.0. So 
> the probability that all service endpoints will be affected approaches 1.0.
> In order to break the logjam, we need to remove the slow datanodes. Although 
> there is HDFS level monitoring, mechanisms, and procedures for this, we 
> should also attempt to take mitigating action at the HBase layer as soon as 
> we find ourselves in trouble. It would be enough to remove the affected 
> datanodes from the writer pipelines. A super simple strategy that can be 
> effective is described below:
> This is with branch-1 code. I think branch-2's async WAL can mitigate but 
> still can be susceptible. branch-2 sync WAL is susceptible. 
> We already roll the WAL writer if the pipeline suffers the failure of a 
> datanode and the replication factor on the pipeline is too low. We should 
> also consider how much time it took for the write pipeline to complete a sync 
> the last time we measured it, or the max over the interval from now to the 
> last time we checked. If the sync time exceeds a configured threshold, roll 
> the log writer then too. Fortunately we don't need to know which datanode is 
> making the WAL write pipeline slow, only that syncs on the pipeline are too 
> slow and exceeding a threshold. This is enough information to know when to 
> roll it. Once we roll it, we will get three new randomly selected datanodes. 
> On most clusters the probability the new pipeline includes the slow datanode 
> will be low. (And if for some reason it does end up with a problematic 
> datanode again, we roll again.)
> This is not a silver bullet but this can be a reasonably effective mitigation.
> Provide a metric for tracking when log roll is requested (and for what 
> reason).
> Emit a log line at log roll time that includes datanode pipeline details for 
> further debugging and analysis, similar to the existing slow FSHLog sync log 
> line.
> If we roll too many times within a short interval of time, this probably means 
> there is a widespread problem with the fleet, so our mitigation is not helping 
> and may be exacerbating those problems or operator difficulties. Ensure log 
> roll requests triggered by this new feature happen infrequently enough to not 
> cause difficulties under either normal or abnormal conditions. A very simple 
> strategy that could work well under both normal and abnormal conditions is to 
> define a fairly lengthy interval, default 5 minutes, and then ensure we do not 
> roll more than once during this interval for this reason.
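
To illustrate how the pieces above could fit together, here is a hedged sketch 
of the slow-sync check feeding a roll request through the interval guard 
sketched earlier in this thread; all names (SlowSyncDetector, 
slowSyncThresholdMs, the requestLogRoll hook) are hypothetical, not the actual 
branch-1 patch:

{noformat}
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

/** Sketch only: times WAL syncs and requests a roll past a threshold. */
public class SlowSyncDetector {
  private static final Logger LOG = LoggerFactory.getLogger(SlowSyncDetector.class);

  private final long slowSyncThresholdMs;    // hypothetical configured threshold
  private final SlowSyncRollLimiter limiter; // once-per-interval guard (earlier sketch)
  private final Runnable requestLogRoll;     // hypothetical hook into the log roller

  public SlowSyncDetector(long slowSyncThresholdMs, SlowSyncRollLimiter limiter,
      Runnable requestLogRoll) {
    this.slowSyncThresholdMs = slowSyncThresholdMs;
    this.limiter = limiter;
    this.requestLogRoll = requestLogRoll;
  }

  /** Call after each WAL sync completes, with the observed duration. */
  public void onSyncCompleted(long syncDurationMs) {
    if (syncDurationMs > slowSyncThresholdMs
        && limiter.tryAcquireRoll(System.currentTimeMillis())) {
      LOG.warn("WAL sync took {} ms (threshold {} ms); requesting log roll",
          syncDurationMs, slowSyncThresholdMs);
      requestLogRoll.run(); // the new pipeline gets freshly chosen datanodes
    }
  }
}
{noformat}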



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HBASE-22334) handle blocking RPC threads better (time out calls? )

2019-04-29 Thread Sergey Shelukhin (JIRA)
Sergey Shelukhin created HBASE-22334:


 Summary: handle blocking RPC threads better (time out calls? )
 Key: HBASE-22334
 URL: https://issues.apache.org/jira/browse/HBASE-22334
 Project: HBase
  Issue Type: Bug
Reporter: Sergey Shelukhin


Combined with HBASE-22333, we had a case where a user sent lots of create-table 
requests with pre-splits for the same table (the tasks of some job would try to 
create the table opportunistically if it didn't exist, and there were many such 
tasks); these requests took up all the RPC threads and caused a large call 
queue to form; then the first call got stuck because the RS calls to report an 
opened region were stuck in the queue. All the other calls were stuck here:
{noformat}
  submitProcedure(
      new CreateTableProcedure(procedureExecutor.getEnvironment(), desc,
          newRegions, latch));
  latch.await();
{noformat}

The procedures in this case were stuck for hours; even if the other issue were 
resolved, assigning thousands of regions can take a long time and cause a lot 
of delay before it unblocks the other procedures and allows them to release the 
latch.

In general, blocking an RPC thread is not a good idea. I wonder if it would make 
sense to fail client requests that are occupying an RPC thread based on a 
timeout, or when they are not making progress (e.g. in this case, the procedure 
is not getting updated; this might need to be handled on a case-by-case basis).
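
One possible direction for the timeout idea, sketched below under the 
assumption that a bounded wait plus an error to the client is acceptable; the 
helper, the exception choice, and the timeout parameter are all illustrative, 
not a concrete proposal:

{noformat}
import java.io.IOException;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

/** Sketch only: bound how long an RPC handler blocks on a procedure latch. */
final class BoundedLatchWait {
  private BoundedLatchWait() {}

  static void awaitOrReleaseHandler(CountDownLatch latch, long procId, long timeoutMs)
      throws IOException, InterruptedException {
    if (!latch.await(timeoutMs, TimeUnit.MILLISECONDS)) {
      // Free the RPC handler; the procedure keeps running in the background and
      // the client can poll its state or retry the (idempotent) call later.
      throw new IOException("Procedure " + procId + " still running after "
          + timeoutMs + " ms; releasing the RPC handler");
    }
  }
}
{noformat}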





--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-22333) move certain internal RPCs to high priority threadpool

2019-04-29 Thread Sergey Shelukhin (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-22333?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HBASE-22333:
-
Summary: move certain internal RPCs to high priority threadpool  (was: move 
certain internal RPCs to high priority level)

> move certain internal RPCs to high priority threadpool
> --
>
> Key: HBASE-22333
> URL: https://issues.apache.org/jira/browse/HBASE-22333
> Project: HBase
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Priority: Major
>
> User calls can inadvertently DDoS the master (and potentially the RS), causing 
> issues (e.g. CallQueueTooBig) for important system calls like 
> reportRegionStateTransition.
> These calls should be moved to a high-priority level... I wonder if all the 
> low-volume internal calls (i.e. everything except heartbeats and maybe the WAL 
> splitting stuff) should have a higher priority (e.g. a QoS of 20 in 
> HConstants). 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HBASE-22333) move certain internal RPCs to high priority level

2019-04-29 Thread Sergey Shelukhin (JIRA)
Sergey Shelukhin created HBASE-22333:


 Summary: move certain internal RPCs to high priority level
 Key: HBASE-22333
 URL: https://issues.apache.org/jira/browse/HBASE-22333
 Project: HBase
  Issue Type: Bug
Reporter: Sergey Shelukhin


User calls can inadvertently DDoS the master (and potentially the RS), causing 
issues (e.g. CallQueueTooBig) for important system calls like 
reportRegionStateTransition.
These calls should be moved to a high-priority level... I wonder if all the 
low-volume internal calls (i.e. everything except heartbeats and maybe the WAL 
splitting stuff) should have a higher priority (e.g. a QoS of 20 in HConstants). 
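
As a rough illustration of the direction above, assuming the existing 
@QosPriority / AnnotationReadingPriorityFunction mechanism can be applied to 
these calls (the class, the method, and the choice of HConstants.ADMIN_QOS 
below are placeholders, not the proposed values):

{noformat}
import org.apache.hadoop.hbase.HConstants;
import org.apache.hadoop.hbase.ipc.QosPriority;

/** Sketch only: tag a low-volume internal call so it is dispatched to the
 *  priority handler pool instead of competing with user calls. */
public class ExampleInternalRpcServices {
  @QosPriority(priority = HConstants.ADMIN_QOS) // placeholder priority constant
  public void reportRegionStateTransitionExample() {
    // Handle the region transition report; the annotation only affects which
    // handler pool the call is queued on.
  }
}
{noformat}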



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-22333) move certain internal RPCs to high priority level

2019-04-29 Thread Sergey Shelukhin (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-22333?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16829761#comment-16829761
 ] 

Sergey Shelukhin commented on HBASE-22333:
--

cc [~bahramch]

> move certain internal RPCs to high priority level
> -
>
> Key: HBASE-22333
> URL: https://issues.apache.org/jira/browse/HBASE-22333
> Project: HBase
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Priority: Major
>
> User calls can inadvertently DDoS the master (and potentially the RS), causing 
> issues (e.g. CallQueueTooBig) for important system calls like 
> reportRegionStateTransition.
> These calls should be moved to a high-priority level... I wonder if all the 
> low-volume internal calls (i.e. everything except heartbeats and maybe the WAL 
> splitting stuff) should have a higher priority (e.g. a QoS of 20 in 
> HConstants). 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-22081) master shutdown: close RpcServer and procWAL first thing

2019-04-26 Thread Sergey Shelukhin (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-22081?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16827407#comment-16827407
 ] 

Sergey Shelukhin commented on HBASE-22081:
--


Before the patch, it was mere coincidence that, while all the other stuff was 
shutting down, the RPC that caused the shutdown had a chance to return. 
The caller could get unlucky and the stop RPC would fail because the RPC server 
was closed... now that we shut down the RPC server first thing, that happens 
almost all the time.
Added a small sleep before starting shutdown if it was triggered by an RPC 
request 0_o Unfortunately it doesn't seem possible to externally wait for the 
RPC response(s) to finish.

> master shutdown: close RpcServer and procWAL first thing
> 
>
> Key: HBASE-22081
> URL: https://issues.apache.org/jira/browse/HBASE-22081
> Project: HBase
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
> Attachments: HBASE-22081.01.patch, HBASE-22081.patch
>
>
> I had a master get stuck due to HBASE-22079 and noticed it was logging RS 
> abort messages during shutdown.
> [~bahramch] found some issues where messages are processed by the old master 
> during shutdown due to a race condition in the RS cache (or it could also 
> happen due to a network race).
> Previously I found a bug where an SCP created during master shutdown had 
> incorrect state (because some structures had already been cleaned up).
> I think before master fencing is implemented we can at least make these 
> issues much less likely by thinking about shutdown order:
> 1) First kill the RPC server so we don't receive any more messages. There's 
> no need to receive messages when we are shutting down. Server heartbeats 
> could be impacted I guess, but I don't think they will be, because we 
> currently only kill an RS on ZK timeout.
> 2) Then do whatever cleanup we think is needed that requires the proc WAL.
> 3) Then close the proc WAL so no errant threads can create more procs.
> 4) Then do whatever other cleanup.
> 5) Finally, delete the znode.
> Right now the znode is deleted somewhat early I think, and the RpcServer is 
> closed very late.
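
A compact sketch of the ordering proposed in the description above; the method 
names are hypothetical stand-ins for the corresponding HMaster internals, not 
the committed patch:

{noformat}
/** Sketch only: the shutdown ordering from steps 1-5 above. */
abstract class OrderedMasterShutdown {
  // Placeholders for the real master subsystems.
  abstract void stopRpcServer();
  abstract void cleanupRequiringProcWal();
  abstract void closeProcedureWal();
  abstract void remainingCleanup();
  abstract void deleteMasterZNode();

  final void shutdown() {
    stopRpcServer();           // 1) stop taking RPCs so no new work arrives
    cleanupRequiringProcWal(); // 2) cleanup that still needs the proc WAL
    closeProcedureWal();       // 3) close the proc WAL so no errant thread adds procs
    remainingCleanup();        // 4) everything else
    deleteMasterZNode();       // 5) delete the znode last, handing off mastership
  }
}
{noformat}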



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

