[jira] [Updated] (HBASE-17114) Add an option to set special retry pause when encountering CallQueueTooBigException
[ https://issues.apache.org/jira/browse/HBASE-17114?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yu Li updated HBASE-17114:
--------------------------
    Resolution: Fixed
    Hadoop Flags: Reviewed
    Status: Resolved  (was: Patch Available)

Pushed into branch-1. All work done here, closing JIRA.

> Add an option to set special retry pause when encountering
> CallQueueTooBigException
> ------------------------------------------------------------
>
>                 Key: HBASE-17114
>                 URL: https://issues.apache.org/jira/browse/HBASE-17114
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Yu Li
>            Assignee: Yu Li
>         Attachments: HBASE-17114.branch-1.patch, HBASE-17114.patch,
> HBASE-17114.v2.patch, HBASE-17114.v3.patch, HBASE-17114.v3.patch,
> HBASE-17114.v4.patch, HBASE-17114.v5.patch
>
>
> As titled, after HBASE-15146 we will throw {{CallQueueTooBigException}}
> instead of dead-wait. This is good for performance for most cases but might
> cause a side-effect that if too many clients connect to the busy RS, that the
> retry requests may come over and over again and RS never got the chance for
> recovering, and the issue will become especially critical when the target
> region is META.
> So here in this JIRA we propose to add a new property in name of
> {{hbase.client.pause.cqtbe}} to make it possible to set a special-longer
> pause for CallQueueTooBigException, and by default it will use the setting of
> {{hbase.client.pause}}

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
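The fallback described in the issue (use hbase.client.pause.cqtbe for CallQueueTooBigException when it is set, otherwise reuse hbase.client.pause) can be sketched in plain Java. This is a minimal illustration, not the code from the patch: `PauseSelector`, its `Map`-based configuration, and the stand-in exception class are hypothetical names introduced here so the example compiles without HBase on the classpath. Only the two property names and the fallback behavior come from the issue text. The 100 ms base default is an assumption, taken from an earlier revision of this issue that describes 500 ms as five times the hbase.client.pause default.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for HBase's CallQueueTooBigException, declared here
// only so the sketch is self-contained.
class CallQueueTooBigException extends java.io.IOException {
    CallQueueTooBigException(String msg) {
        super(msg);
    }
}

// Hypothetical helper, not part of HBase: picks the base retry pause
// according to the kind of failure that triggered the retry.
class PauseSelector {
    static final String PAUSE_KEY = "hbase.client.pause";
    static final String PAUSE_CQTBE_KEY = "hbase.client.pause.cqtbe";
    // Assumed default for hbase.client.pause, in milliseconds.
    static final long ASSUMED_DEFAULT_PAUSE_MS = 100L;

    private final long pauseMs;
    private final long pauseForCqtbeMs;

    PauseSelector(Map<String, String> conf) {
        this.pauseMs = readLong(conf, PAUSE_KEY, ASSUMED_DEFAULT_PAUSE_MS);
        // When the CQTBE-specific property is unset, reuse the normal pause.
        this.pauseForCqtbeMs = readLong(conf, PAUSE_CQTBE_KEY, pauseMs);
    }

    private static long readLong(Map<String, String> conf, String key, long dflt) {
        String value = conf.get(key);
        return value == null ? dflt : Long.parseLong(value.trim());
    }

    // Returns the base pause in milliseconds for the given failure.
    long pauseFor(Throwable error) {
        return (error instanceof CallQueueTooBigException) ? pauseForCqtbeMs : pauseMs;
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put(PAUSE_CQTBE_KEY, "500");
        PauseSelector selector = new PauseSelector(conf);
        System.out.println("CQTBE pause (ms): "
            + selector.pauseFor(new CallQueueTooBigException("call queue is full")));
        System.out.println("Other pause (ms): "
            + selector.pauseFor(new java.io.IOException("some other failure")));
    }
}
```

With the CQTBE property left unset, both kinds of failure resolve to the same pause, which matches the "disabled by default" behavior the issue describes.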
[jira] [Updated] (HBASE-17114) Add an option to set special retry pause when encountering CallQueueTooBigException
[ https://issues.apache.org/jira/browse/HBASE-17114?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yu Li updated HBASE-17114:
--------------------------
    Attachment: HBASE-17114.branch-1.patch

Uploading patch for branch-1
[jira] [Updated] (HBASE-17114) Add an option to set special retry pause when encountering CallQueueTooBigException
[ https://issues.apache.org/jira/browse/HBASE-17114?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yu Li updated HBASE-17114:
--------------------------
    Attachment: HBASE-17114.v5.patch

New patch addresses new review comments.
[jira] [Updated] (HBASE-17114) Add an option to set special retry pause when encountering CallQueueTooBigException
[ https://issues.apache.org/jira/browse/HBASE-17114?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yu Li updated HBASE-17114:
--------------------------
    Attachment: HBASE-17114.v4.patch

Fix white space.
[jira] [Updated] (HBASE-17114) Add an option to set special retry pause when encountering CallQueueTooBigException
[ https://issues.apache.org/jira/browse/HBASE-17114?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yu Li updated HBASE-17114:
--------------------------
    Attachment: HBASE-17114.v3.patch
[jira] [Updated] (HBASE-17114) Add an option to set special retry pause when encountering CallQueueTooBigException
[ https://issues.apache.org/jira/browse/HBASE-17114?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yu Li updated HBASE-17114:
--------------------------
    Release Note: In HBASE-17114 we introduced a new property hbase.client.pause.cqtbe which makes it possible to set a longer pause for CallQueueTooBigException (CQTBE), it's disabled by default and hbase.client.pause will still be used for CQTBE. Set this property to a higher value if you observe frequent CQTBE from the same RegionServer and the call queue there keeps full  (was: In HBASE-17114 we introduced a new property hbase.client.pause.cqtbe which makes it possible to set a longer pause for CallQueueTooBigException (CQTBE), it's disabled by default and {{hbase.client.pause}} will still be used for CQTBE. Set this property to a higher value if you observe frequent CQTBE from the same RegionServer and the call queue there keeps full)
[jira] [Updated] (HBASE-17114) Add an option to set special retry pause when encountering CallQueueTooBigException
[ https://issues.apache.org/jira/browse/HBASE-17114?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yu Li updated HBASE-17114:
--------------------------
    Release Note: In HBASE-17114 we introduced a new property hbase.client.pause.cqtbe which makes it possible to set a longer pause for CallQueueTooBigException (CQTBE), it's disabled by default and {{hbase.client.pause}} will still be used for CQTBE. Set this property to a higher value if you observe frequent CQTBE from the same RegionServer and the call queue there keeps full
    Description:
As titled, after HBASE-15146 we will throw {{CallQueueTooBigException}} instead of dead-wait. This is good for performance for most cases but might cause a side-effect that if too many clients connect to the busy RS, that the retry requests may come over and over again and RS never got the chance for recovering, and the issue will become especially critical when the target region is META.

So here in this JIRA we propose to add a new property in name of {{hbase.client.pause.cqtbe}} to make it possible to set a special-longer pause for CallQueueTooBigException, and by default it will use the setting of {{hbase.client.pause}}

  was:
As titled, after HBASE-15146 we will throw {{CallQueueTooBigException}} instead of dead-wait. This is good for performance for most cases but might cause a side-effect that if too many clients connect to the busy RS, that the retry requests may come over and over again and RS never got the chance for recovering, and the issue will become especially critical when the target region is META.

So here in this JIRA we propose to supply some special retry pause for CQTBE in name of {{hbase.client.pause.special}}, and by default it will be 500ms (5 times of {{hbase.client.pause}} default value)


Update description and add release note
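To make the retry-storm concern in the description concrete, here is a rough back-of-the-envelope calculation in Java. It compares the 100 ms pause implied by the earlier description (500 ms being five times the hbase.client.pause default) with the 500 ms special pause. The model is deliberately crude and is an assumption of this sketch: every client retries at a fixed pause with no backoff, and a rejected call returns immediately. The client count is made up for illustration, and `RetryLoadEstimate` is a hypothetical name, not part of HBase.

```java
// Hypothetical helper for a rough estimate; not part of HBase.
class RetryLoadEstimate {
    // Retries per second reaching one RegionServer when `clients` callers
    // each retry once every `pauseMs` milliseconds (fixed pause, no backoff,
    // rejection assumed to be instantaneous).
    static double retriesPerSecond(int clients, long pauseMs) {
        if (clients < 0 || pauseMs <= 0) {
            throw new IllegalArgumentException("clients must be >= 0 and pauseMs > 0");
        }
        return clients * (1000.0 / pauseMs);
    }

    public static void main(String[] args) {
        int clients = 1000; // illustrative figure, not taken from the issue
        System.out.println("pause 100 ms: " + retriesPerSecond(clients, 100) + " retries/s");
        System.out.println("pause 500 ms: " + retriesPerSecond(clients, 500) + " retries/s");
    }
}
```

Under these simplifying assumptions, a five times longer pause cuts the retry rate seen by the busy server to one fifth, which is the breathing room the special pause is meant to provide.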
[jira] [Updated] (HBASE-17114) Add an option to set special retry pause when encountering CallQueueTooBigException
[ https://issues.apache.org/jira/browse/HBASE-17114?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yu Li updated HBASE-17114:
--------------------------
    Attachment: HBASE-17114.v3.patch

Add property and description into hbase-default.xml, and check HadoopQA again to see how the newly-added UT is going.
[jira] [Updated] (HBASE-17114) Add an option to set special retry pause when encountering CallQueueTooBigException
[ https://issues.apache.org/jira/browse/HBASE-17114?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yu Li updated HBASE-17114:
--------------------------
    Status: Patch Available  (was: Open)
[jira] [Updated] (HBASE-17114) Add an option to set special retry pause when encountering CallQueueTooBigException
[ https://issues.apache.org/jira/browse/HBASE-17114?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yu Li updated HBASE-17114:
--------------------------
    Attachment: HBASE-17114.v2.patch

OK, back for this one, sorry for the lag.

Update patch to address review comments.

Regarding UT design, I think we still need to check the *real* execution time, and there's already some design to avoid it to be flaky. The same UT case has been executed daily in our private Jenkins and no intermittent failure observed. Let's see what HadoopQA will say.
[jira] [Updated] (HBASE-17114) Add an option to set special retry pause when encountering CallQueueTooBigException
[ https://issues.apache.org/jira/browse/HBASE-17114?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yu Li updated HBASE-17114:
--------------------------
    Attachment: HBASE-17114.patch

Here comes the patch for review.
[jira] [Updated] (HBASE-17114) Add an option to set special retry pause when encountering CallQueueTooBigException
[ https://issues.apache.org/jira/browse/HBASE-17114?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yu Li updated HBASE-17114:
--------------------------
    Description:
As titled, after HBASE-15146 we will throw {{CallQueueTooBigException}} instead of dead-wait. This is good for performance for most cases but might cause a side-effect that if too many clients connect to the busy RS, that the retry requests may come over and over again and RS never got the chance for recovering, and the issue will become especially critical when the target region is META.

So here in this JIRA we propose to supply some special retry pause for CQTBE in name of {{hbase.client.pause.special}}, and by default it will be 500ms (5 times of {{hbase.client.pause}} default value)

  was:
As titled, after HBASE-15146 we will throw {{CallQueueTooBigException}} instead of dead-wait. This is good for performance for most cases but might cause a side-effect that if too many clients connect to the busy, the retry requests may come over and over again and RS never got the chance for recovering, and the issue will become especially critical when the target region is META.

So here in this JIRA we propose to supply some special retry pause for CQTBE in name of {{hbase.client.pause.special}}, and by default it will be 500ms (5 times of {{hbase.client.pause}} default value)