[jira] [Updated] (HBASE-17114) Add an option to set special retry pause when encountering CallQueueTooBigException

2016-12-01 Thread Yu Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-17114?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yu Li updated HBASE-17114:
--
  Resolution: Fixed
Hadoop Flags: Reviewed
  Status: Resolved  (was: Patch Available)

Pushed into branch-1. All work done here, closing JIRA.

> Add an option to set special retry pause when encountering 
> CallQueueTooBigException
> ---
>
> Key: HBASE-17114
> URL: https://issues.apache.org/jira/browse/HBASE-17114
> Project: HBase
> Issue Type: Bug
> Reporter: Yu Li
> Assignee: Yu Li
> Attachments: HBASE-17114.branch-1.patch, HBASE-17114.patch, 
> HBASE-17114.v2.patch, HBASE-17114.v3.patch, HBASE-17114.v3.patch, 
> HBASE-17114.v4.patch, HBASE-17114.v5.patch
>
>
> As titled, after HBASE-15146 we throw {{CallQueueTooBigException}} instead 
> of dead-waiting. This is good for performance in most cases, but it has a 
> side effect: if too many clients connect to a busy RS, retry requests may 
> keep arriving and the RS never gets a chance to recover. The issue becomes 
> especially critical when the target region is META.
> So in this JIRA we propose adding a new property named 
> {{hbase.client.pause.cqtbe}} that makes it possible to set a special, longer 
> pause for CallQueueTooBigException; by default it uses the setting of 
> {{hbase.client.pause}}.
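
To make the retry-storm concern in the description concrete, here is a rough back-of-the-envelope sketch. It is not HBase code: `RetryLoadSketch` and `retriesPerSecond` are hypothetical names, the client count is an illustrative number, and the model assumes every client retries at a fixed interval with no backoff or jitter, which the real client does not do.

```java
// Hypothetical helper, not part of HBase: estimates how many retry requests
// per second reach a busy RegionServer while it keeps rejecting calls.
final class RetryLoadSketch {

    // Simplifying assumption: every client retries at a fixed interval of
    // pauseMs, with no exponential backoff and no jitter.
    static double retriesPerSecond(int clients, long pauseMs) {
        if (clients < 0 || pauseMs <= 0) {
            throw new IllegalArgumentException("clients must be >= 0 and pauseMs > 0");
        }
        return clients * 1000.0 / pauseMs;
    }

    public static void main(String[] args) {
        int clients = 2000; // illustrative number, not taken from the issue
        System.out.println("pause=100ms: " + retriesPerSecond(clients, 100) + " retries/s");
        System.out.println("pause=500ms: " + retriesPerSecond(clients, 500) + " retries/s");
    }
}
```

Under these assumptions, raising the pause from 100 ms to 500 ms cuts the retry rate seen by the RS by a factor of five, which is the kind of breathing room a CQTBE-specific pause is meant to provide.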



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-17114) Add an option to set special retry pause when encountering CallQueueTooBigException

2016-11-30 Thread Yu Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-17114?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yu Li updated HBASE-17114:
--
Attachment: HBASE-17114.branch-1.patch

Uploading patch for branch-1






[jira] [Updated] (HBASE-17114) Add an option to set special retry pause when encountering CallQueueTooBigException

2016-11-30 Thread Yu Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-17114?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yu Li updated HBASE-17114:
--
Attachment: HBASE-17114.v5.patch

New patch addresses new review comments.






[jira] [Updated] (HBASE-17114) Add an option to set special retry pause when encountering CallQueueTooBigException

2016-11-29 Thread Yu Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-17114?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yu Li updated HBASE-17114:
--
Attachment: HBASE-17114.v4.patch

Fix white space.






[jira] [Updated] (HBASE-17114) Add an option to set special retry pause when encountering CallQueueTooBigException

2016-11-29 Thread Yu Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-17114?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yu Li updated HBASE-17114:
--
Attachment: HBASE-17114.v3.patch






[jira] [Updated] (HBASE-17114) Add an option to set special retry pause when encountering CallQueueTooBigException

2016-11-29 Thread Yu Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-17114?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yu Li updated HBASE-17114:
--
Release Note: In HBASE-17114 we introduced a new property, 
hbase.client.pause.cqtbe, which makes it possible to set a longer pause for 
CallQueueTooBigException (CQTBE). It is disabled by default, in which case 
hbase.client.pause is still used for CQTBE. Set this property to a higher 
value if you observe frequent CQTBE from the same RegionServer and the call 
queue there stays full.  (was: In HBASE-17114 we introduced a new property, 
hbase.client.pause.cqtbe, which makes it possible to set a longer pause for 
CallQueueTooBigException (CQTBE). It is disabled by default, in which case 
{{hbase.client.pause}} is still used for CQTBE. Set this property to a higher 
value if you observe frequent CQTBE from the same RegionServer and the call 
queue there stays full.)
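
The fallback behaviour in the release note can be sketched roughly as follows. This is a self-contained illustration, not the patch itself: `PauseSelectionSketch`, `pauseFor`, and the nested stand-in exception are hypothetical, `java.util.Properties` stands in for the HBase configuration object, and the 100 ms default is inferred from the thread's remark that 500 ms is five times the `hbase.client.pause` default.

```java
import java.io.IOException;
import java.util.Properties;

// Hypothetical sketch, not the actual HBase client code: shows one way a
// client could choose its retry pause depending on the failure it saw.
final class PauseSelectionSketch {

    static final String PAUSE_KEY = "hbase.client.pause";
    static final String PAUSE_CQTBE_KEY = "hbase.client.pause.cqtbe";
    // Assumed default: the thread describes 500 ms as 5 times the default.
    static final long DEFAULT_PAUSE_MS = 100L;

    // Stand-in for the exception thrown when the server call queue is full;
    // the real class lives in HBase.
    static class CallQueueTooBigException extends IOException {
        private static final long serialVersionUID = 1L;
    }

    static long pauseFor(Properties conf, Throwable error) {
        long pause = Long.parseLong(
            conf.getProperty(PAUSE_KEY, Long.toString(DEFAULT_PAUSE_MS)));
        if (error instanceof CallQueueTooBigException) {
            // When the special pause is unset, fall back to the normal pause.
            return Long.parseLong(
                conf.getProperty(PAUSE_CQTBE_KEY, Long.toString(pause)));
        }
        return pause;
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        // Special pause unset: CQTBE uses the normal pause.
        System.out.println(pauseFor(conf, new CallQueueTooBigException()));
        conf.setProperty(PAUSE_CQTBE_KEY, "500");
        // Special pause set: only CQTBE retries wait longer.
        System.out.println(pauseFor(conf, new CallQueueTooBigException()));
        System.out.println(pauseFor(conf, new IOException("other failure")));
    }
}
```

Keeping the special pause separate means other retriable failures still use the shorter `hbase.client.pause`, so only clients hitting a full call queue slow down.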






[jira] [Updated] (HBASE-17114) Add an option to set special retry pause when encountering CallQueueTooBigException

2016-11-29 Thread Yu Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-17114?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yu Li updated HBASE-17114:
--
Release Note: In HBASE-17114 we introduced a new property, 
hbase.client.pause.cqtbe, which makes it possible to set a longer pause for 
CallQueueTooBigException (CQTBE). It is disabled by default, in which case 
{{hbase.client.pause}} is still used for CQTBE. Set this property to a higher 
value if you observe frequent CQTBE from the same RegionServer and the call 
queue there stays full.
 Description: 
As titled, after HBASE-15146 we throw {{CallQueueTooBigException}} instead of 
dead-waiting. This is good for performance in most cases, but it has a side 
effect: if too many clients connect to a busy RS, retry requests may keep 
arriving and the RS never gets a chance to recover. The issue becomes 
especially critical when the target region is META.

So in this JIRA we propose adding a new property named 
{{hbase.client.pause.cqtbe}} that makes it possible to set a special, longer 
pause for CallQueueTooBigException; by default it uses the setting of 
{{hbase.client.pause}}.

  was:
As titled, after HBASE-15146 we throw {{CallQueueTooBigException}} instead of 
dead-waiting. This is good for performance in most cases, but it has a side 
effect: if too many clients connect to a busy RS, retry requests may keep 
arriving and the RS never gets a chance to recover. The issue becomes 
especially critical when the target region is META.

So in this JIRA we propose supplying a special retry pause for CQTBE via a 
property named {{hbase.client.pause.special}}; by default it will be 500ms (5 
times the default value of {{hbase.client.pause}}).


Update description and add release note






[jira] [Updated] (HBASE-17114) Add an option to set special retry pause when encountering CallQueueTooBigException

2016-11-29 Thread Yu Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-17114?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yu Li updated HBASE-17114:
--
Attachment: HBASE-17114.v3.patch

Added the property and its description to hbase-default.xml; running HadoopQA 
again to see how the newly added UT fares.






[jira] [Updated] (HBASE-17114) Add an option to set special retry pause when encountering CallQueueTooBigException

2016-11-25 Thread Yu Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-17114?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yu Li updated HBASE-17114:
--
Status: Patch Available  (was: Open)






[jira] [Updated] (HBASE-17114) Add an option to set special retry pause when encountering CallQueueTooBigException

2016-11-25 Thread Yu Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-17114?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yu Li updated HBASE-17114:
--
Attachment: HBASE-17114.v2.patch

OK, back on this one, sorry for the lag.

Updated the patch to address review comments.

Regarding the UT design, I think we still need to check the *real* execution 
time, and the test already includes measures to keep it from being flaky. The 
same UT case has been run daily in our private Jenkins with no intermittent 
failures observed. Let's see what HadoopQA says.






[jira] [Updated] (HBASE-17114) Add an option to set special retry pause when encountering CallQueueTooBigException

2016-11-17 Thread Yu Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-17114?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yu Li updated HBASE-17114:
--
Attachment: HBASE-17114.patch

Here comes the patch for review.






[jira] [Updated] (HBASE-17114) Add an option to set special retry pause when encountering CallQueueTooBigException

2016-11-16 Thread Yu Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-17114?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yu Li updated HBASE-17114:
--
Description: 
As titled, after HBASE-15146 we throw {{CallQueueTooBigException}} instead of 
dead-waiting. This is good for performance in most cases, but it has a side 
effect: if too many clients connect to a busy RS, retry requests may keep 
arriving and the RS never gets a chance to recover. The issue becomes 
especially critical when the target region is META.

So in this JIRA we propose supplying a special retry pause for CQTBE via a 
property named {{hbase.client.pause.special}}; by default it will be 500ms (5 
times the default value of {{hbase.client.pause}}).

  was:
As titled, after HBASE-15146 we throw {{CallQueueTooBigException}} instead of 
dead-waiting. This is good for performance in most cases, but it has a side 
effect: if too many clients connect to the busy, retry requests may keep 
arriving and the RS never gets a chance to recover. The issue becomes 
especially critical when the target region is META.

So in this JIRA we propose supplying a special retry pause for CQTBE via a 
property named {{hbase.client.pause.special}}; by default it will be 500ms (5 
times the default value of {{hbase.client.pause}}).




