[GitHub] [hbase] shahrs87 commented on a change in pull request #2322: [HBASE-24956] ConnectionManager#locateRegionInMeta waits for user region lock indefinitely.
shahrs87 commented on a change in pull request #2322: URL: https://github.com/apache/hbase/pull/2322#discussion_r489018057 ## File path: hbase-client/src/main/java/org/apache/hadoop/hbase/client/ConnectionImplementation.java ## @@ -968,6 +968,19 @@ private RegionLocations locateRegionInMeta(TableName tableName, byte[] row, bool } } + private void takeUserRegionLock() throws IOException { +try { + long waitTime = connectionConfig.getScannerTimeoutPeriod(); + if (!userRegionLock.tryLock(waitTime, TimeUnit.MILLISECONDS)) { Review comment: @infraio could you please review again ? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [hbase] shahrs87 commented on a change in pull request #2322: [HBASE-24956] ConnectionManager#locateRegionInMeta waits for user region lock indefinitely.
shahrs87 commented on a change in pull request #2322: URL: https://github.com/apache/hbase/pull/2322#discussion_r487273056 ## File path: hbase-client/src/main/java/org/apache/hadoop/hbase/client/ConnectionImplementation.java ## @@ -968,6 +974,19 @@ private RegionLocations locateRegionInMeta(TableName tableName, byte[] row, bool } } + void takeUserRegionLock() throws IOException { +try { + long waitTime = connectionConfig.getScannerTimeoutPeriod(); Review comment: > Operation timeout is not the best choice too but better @infraio In scan operation, there are 2 operations. One is to wait for lock and other is to wait for rpc to complete. On top of that we have retries. The problem we are trying to solve here is what is the timeout to use for lock. If we wait for operation timeout period and if it can't get the lock after the timeout, it will not have any time remaining for next attempts. So I am confused when you suggest to use operation timeout, are you suggesting to wait for operation timeout period while trying to get lock ? ## File path: hbase-client/src/main/java/org/apache/hadoop/hbase/client/ConnectionImplementation.java ## @@ -968,6 +974,19 @@ private RegionLocations locateRegionInMeta(TableName tableName, byte[] row, bool } } + void takeUserRegionLock() throws IOException { +try { + long waitTime = connectionConfig.getScannerTimeoutPeriod(); Review comment: @saintstack Could you please chime in with your inputs. I think we are going back and forth on which timeout to use. Also I have created https://issues.apache.org/jira/browse/HBASE-24983 to wrap the whole scan operation within operation timeout but is outside the scope of this jira. Thank you ! ## File path: hbase-client/src/main/java/org/apache/hadoop/hbase/client/ConnectionImplementation.java ## @@ -968,6 +974,19 @@ private RegionLocations locateRegionInMeta(TableName tableName, byte[] row, bool } } + void takeUserRegionLock() throws IOException { +try { + long waitTime = connectionConfig.getScannerTimeoutPeriod(); Review comment: @saintstack Could you please chime in with your inputs. I think we are going back and forth on which timeout to use. Also I have created https://issues.apache.org/jira/browse/HBASE-24983 to wrap the whole scan operation within operation timeout but is outside the scope of this jira. Thank you ! Cc @SukumarMaddineni ## File path: hbase-client/src/main/java/org/apache/hadoop/hbase/client/ConnectionImplementation.java ## @@ -968,6 +974,19 @@ private RegionLocations locateRegionInMeta(TableName tableName, byte[] row, bool } } + void takeUserRegionLock() throws IOException { +try { + long waitTime = connectionConfig.getScannerTimeoutPeriod(); Review comment: > Operation timeout is not the best choice too but better @infraio In scan operation, there are 2 operations. One is to wait for lock and other is to wait for rpc to complete. On top of that we have retries. The problem we are trying to solve here is what is the timeout to use for lock. If we wait for operation timeout period and if it can't get the lock after the timeout, it will not have any time remaining for next attempts. So I am confused when you suggest to use operation timeout, are you suggesting to wait for operation timeout period while trying to get lock ? ## File path: hbase-client/src/main/java/org/apache/hadoop/hbase/client/ConnectionImplementation.java ## @@ -968,6 +974,19 @@ private RegionLocations locateRegionInMeta(TableName tableName, byte[] row, bool } } + void takeUserRegionLock() throws IOException { +try { + long waitTime = connectionConfig.getScannerTimeoutPeriod(); Review comment: @saintstack Could you please chime in with your inputs. I think we are going back and forth on which timeout to use. Also I have created https://issues.apache.org/jira/browse/HBASE-24983 to wrap the whole scan operation within operation timeout but is outside the scope of this jira. Thank you ! ## File path: hbase-client/src/main/java/org/apache/hadoop/hbase/client/ConnectionImplementation.java ## @@ -968,6 +974,19 @@ private RegionLocations locateRegionInMeta(TableName tableName, byte[] row, bool } } + void takeUserRegionLock() throws IOException { +try { + long waitTime = connectionConfig.getScannerTimeoutPeriod(); Review comment: @saintstack Could you please chime in with your inputs. I think we are going back and forth on which timeout to use. Also I have created https://issues.apache.org/jira/browse/HBASE-24983 to wrap the whole scan operation within operation timeout but is outside the scope of this jira. Thank you ! Cc @SukumarMaddineni ## File path:
[GitHub] [hbase] shahrs87 commented on a change in pull request #2322: [HBASE-24956] ConnectionManager#locateRegionInMeta waits for user region lock indefinitely.
shahrs87 commented on a change in pull request #2322: URL: https://github.com/apache/hbase/pull/2322#discussion_r487273056 ## File path: hbase-client/src/main/java/org/apache/hadoop/hbase/client/ConnectionImplementation.java ## @@ -968,6 +974,19 @@ private RegionLocations locateRegionInMeta(TableName tableName, byte[] row, bool } } + void takeUserRegionLock() throws IOException { +try { + long waitTime = connectionConfig.getScannerTimeoutPeriod(); Review comment: > Operation timeout is not the best choice too but better @infraio In scan operation, there are 2 operations. One is to wait for lock and other is to wait for rpc to complete. On top of that we have retries. The problem we are trying to solve here is what is the timeout to use for lock. If we wait for operation timeout period and if it can't get the lock after the timeout, it will not have any time remaining for next attempts. So I am confused when you suggest to use operation timeout, are you suggesting to wait for operation timeout period while trying to get lock ? ## File path: hbase-client/src/main/java/org/apache/hadoop/hbase/client/ConnectionImplementation.java ## @@ -968,6 +974,19 @@ private RegionLocations locateRegionInMeta(TableName tableName, byte[] row, bool } } + void takeUserRegionLock() throws IOException { +try { + long waitTime = connectionConfig.getScannerTimeoutPeriod(); Review comment: @saintstack Could you please chime in with your inputs. I think we are going back and forth on which timeout to use. Also I have created https://issues.apache.org/jira/browse/HBASE-24983 to wrap the whole scan operation within operation timeout but is outside the scope of this jira. Thank you ! ## File path: hbase-client/src/main/java/org/apache/hadoop/hbase/client/ConnectionImplementation.java ## @@ -968,6 +974,19 @@ private RegionLocations locateRegionInMeta(TableName tableName, byte[] row, bool } } + void takeUserRegionLock() throws IOException { +try { + long waitTime = connectionConfig.getScannerTimeoutPeriod(); Review comment: @saintstack Could you please chime in with your inputs. I think we are going back and forth on which timeout to use. Also I have created https://issues.apache.org/jira/browse/HBASE-24983 to wrap the whole scan operation within operation timeout but is outside the scope of this jira. Thank you ! Cc @SukumarMaddineni ## File path: hbase-client/src/main/java/org/apache/hadoop/hbase/client/ConnectionImplementation.java ## @@ -968,6 +974,19 @@ private RegionLocations locateRegionInMeta(TableName tableName, byte[] row, bool } } + void takeUserRegionLock() throws IOException { +try { + long waitTime = connectionConfig.getScannerTimeoutPeriod(); Review comment: > Operation timeout is not the best choice too but better @infraio In scan operation, there are 2 operations. One is to wait for lock and other is to wait for rpc to complete. On top of that we have retries. The problem we are trying to solve here is what is the timeout to use for lock. If we wait for operation timeout period and if it can't get the lock after the timeout, it will not have any time remaining for next attempts. So I am confused when you suggest to use operation timeout, are you suggesting to wait for operation timeout period while trying to get lock ? ## File path: hbase-client/src/main/java/org/apache/hadoop/hbase/client/ConnectionImplementation.java ## @@ -968,6 +974,19 @@ private RegionLocations locateRegionInMeta(TableName tableName, byte[] row, bool } } + void takeUserRegionLock() throws IOException { +try { + long waitTime = connectionConfig.getScannerTimeoutPeriod(); Review comment: @saintstack Could you please chime in with your inputs. I think we are going back and forth on which timeout to use. Also I have created https://issues.apache.org/jira/browse/HBASE-24983 to wrap the whole scan operation within operation timeout but is outside the scope of this jira. Thank you ! ## File path: hbase-client/src/main/java/org/apache/hadoop/hbase/client/ConnectionImplementation.java ## @@ -968,6 +974,19 @@ private RegionLocations locateRegionInMeta(TableName tableName, byte[] row, bool } } + void takeUserRegionLock() throws IOException { +try { + long waitTime = connectionConfig.getScannerTimeoutPeriod(); Review comment: @saintstack Could you please chime in with your inputs. I think we are going back and forth on which timeout to use. Also I have created https://issues.apache.org/jira/browse/HBASE-24983 to wrap the whole scan operation within operation timeout but is outside the scope of this jira. Thank you ! Cc @SukumarMaddineni This is an automated
[GitHub] [hbase] shahrs87 commented on a change in pull request #2322: [HBASE-24956] ConnectionManager#locateRegionInMeta waits for user region lock indefinitely.
shahrs87 commented on a change in pull request #2322: URL: https://github.com/apache/hbase/pull/2322#discussion_r487274061 ## File path: hbase-client/src/main/java/org/apache/hadoop/hbase/client/ConnectionImplementation.java ## @@ -968,6 +974,19 @@ private RegionLocations locateRegionInMeta(TableName tableName, byte[] row, bool } } + void takeUserRegionLock() throws IOException { +try { + long waitTime = connectionConfig.getScannerTimeoutPeriod(); Review comment: @saintstack Could you please chime in with your inputs. I think we are going back and forth on which timeout to use. Also I have created https://issues.apache.org/jira/browse/HBASE-24983 to wrap the whole scan operation within operation timeout but is outside the scope of this jira. Thank you ! Cc @SukumarMaddineni This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [hbase] shahrs87 commented on a change in pull request #2322: [HBASE-24956] ConnectionManager#locateRegionInMeta waits for user region lock indefinitely.
shahrs87 commented on a change in pull request #2322: URL: https://github.com/apache/hbase/pull/2322#discussion_r487274061 ## File path: hbase-client/src/main/java/org/apache/hadoop/hbase/client/ConnectionImplementation.java ## @@ -968,6 +974,19 @@ private RegionLocations locateRegionInMeta(TableName tableName, byte[] row, bool } } + void takeUserRegionLock() throws IOException { +try { + long waitTime = connectionConfig.getScannerTimeoutPeriod(); Review comment: @saintstack Could you please chime in with your inputs. I think we are going back and forth on which timeout to use. Also I have created https://issues.apache.org/jira/browse/HBASE-24983 to wrap the whole scan operation within operation timeout but is outside the scope of this jira. Thank you ! This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [hbase] shahrs87 commented on a change in pull request #2322: [HBASE-24956] ConnectionManager#locateRegionInMeta waits for user region lock indefinitely.
shahrs87 commented on a change in pull request #2322: URL: https://github.com/apache/hbase/pull/2322#discussion_r487273056 ## File path: hbase-client/src/main/java/org/apache/hadoop/hbase/client/ConnectionImplementation.java ## @@ -968,6 +974,19 @@ private RegionLocations locateRegionInMeta(TableName tableName, byte[] row, bool } } + void takeUserRegionLock() throws IOException { +try { + long waitTime = connectionConfig.getScannerTimeoutPeriod(); Review comment: > Operation timeout is not the best choice too but better @infraio In scan operation, there are 2 operations. One is to wait for lock and other is to wait for rpc to complete. On top of that we have retries. The problem we are trying to solve here is what is the timeout to use for lock. If we wait for operation timeout period and if it can't get the lock after the timeout, it will not have any time remaining for next attempts. So I am confused when you suggest to use operation timeout, are you suggesting to wait for operation timeout period while trying to get lock ? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [hbase] shahrs87 commented on a change in pull request #2322: [HBASE-24956] ConnectionManager#locateRegionInMeta waits for user region lock indefinitely.
shahrs87 commented on a change in pull request #2322: URL: https://github.com/apache/hbase/pull/2322#discussion_r486558569 ## File path: hbase-client/src/main/java/org/apache/hadoop/hbase/client/ConnectionImplementation.java ## @@ -968,6 +968,19 @@ private RegionLocations locateRegionInMeta(TableName tableName, byte[] row, bool } } + private void takeUserRegionLock() throws IOException { +try { + long waitTime = connectionConfig.getScannerTimeoutPeriod(); + if (!userRegionLock.tryLock(waitTime, TimeUnit.MILLISECONDS)) { Review comment: > What is the resolution here @shahrs87 + @infraio ? @infraio pointed out one bug in my patch. Fixed it yesterday. Waiting for his feedback. Thank you for being so patient ! This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [hbase] shahrs87 commented on a change in pull request #2322: [HBASE-24956] ConnectionManager#locateRegionInMeta waits for user region lock indefinitely.
shahrs87 commented on a change in pull request #2322: URL: https://github.com/apache/hbase/pull/2322#discussion_r486492130 ## File path: hbase-client/src/main/java/org/apache/hadoop/hbase/client/ConnectionImplementation.java ## @@ -968,6 +974,19 @@ private RegionLocations locateRegionInMeta(TableName tableName, byte[] row, bool } } + void takeUserRegionLock() throws IOException { +try { + long waitTime = connectionConfig.getScannerTimeoutPeriod(); Review comment: @infraio I moved the acquiring of lock inside try catch block. https://github.com/apache/hbase/blob/58cb1cdaba1dbf952647934be0f34b4bad2e71db/hbase-client/src/main/java/org/apache/hadoop/hbase/client/ConnectionImplementation.java#L868 Also added a test case for that. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [hbase] shahrs87 commented on a change in pull request #2322: [HBASE-24956] ConnectionManager#locateRegionInMeta waits for user region lock indefinitely.
shahrs87 commented on a change in pull request #2322: URL: https://github.com/apache/hbase/pull/2322#discussion_r486075626 ## File path: hbase-client/src/main/java/org/apache/hadoop/hbase/client/ConnectionImplementation.java ## @@ -968,6 +968,19 @@ private RegionLocations locateRegionInMeta(TableName tableName, byte[] row, bool } } + private void takeUserRegionLock() throws IOException { +try { + long waitTime = connectionConfig.getScannerTimeoutPeriod(); + if (!userRegionLock.tryLock(waitTime, TimeUnit.MILLISECONDS)) { Review comment: > Please check the code. The LockTimeoutException will be throwed out and terminate the loop. It is not in the try...catch code block. @infraio you are right. Fixed that in latest commit. Please review again. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [hbase] shahrs87 commented on a change in pull request #2322: [HBASE-24956] ConnectionManager#locateRegionInMeta waits for user region lock indefinitely.
shahrs87 commented on a change in pull request #2322: URL: https://github.com/apache/hbase/pull/2322#discussion_r485216698 ## File path: hbase-client/src/main/java/org/apache/hadoop/hbase/client/ConnectionImplementation.java ## @@ -968,6 +968,19 @@ private RegionLocations locateRegionInMeta(TableName tableName, byte[] row, bool } } + private void takeUserRegionLock() throws IOException { +try { + long waitTime = connectionConfig.getScannerTimeoutPeriod(); + if (!userRegionLock.tryLock(waitTime, TimeUnit.MILLISECONDS)) { Review comment: > If callable.prepare throw the LockTimeoutException, IIUC the code, callable.prepare will never throw LockTimeoutException. We throw LockTimeoutException [here](https://github.com/apache/hbase/blob/f3210c8908809fa154118454e6a23f57bc222db1/hbase-client/src/main/java/org/apache/hadoop/hbase/client/ConnectionImplementation.java#L866) At this place we haven't created ReversedClientScanner object also. There is one retry loop [here](https://github.com/apache/hbase/blob/f3210c8908809fa154118454e6a23f57bc222db1/hbase-client/src/main/java/org/apache/hadoop/hbase/client/ConnectionImplementation.java#L848) which will retry if exception is not TNFE or retries are not exhausted. @infraio please correct me if I misunderstood anything. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [hbase] shahrs87 commented on a change in pull request #2322: [HBASE-24956] ConnectionManager#locateRegionInMeta waits for user region lock indefinitely.
shahrs87 commented on a change in pull request #2322: URL: https://github.com/apache/hbase/pull/2322#discussion_r483976537 ## File path: hbase-client/src/main/java/org/apache/hadoop/hbase/client/ConnectionImplementation.java ## @@ -968,6 +968,19 @@ private RegionLocations locateRegionInMeta(TableName tableName, byte[] row, bool } } + private void takeUserRegionLock() throws IOException { +try { + long waitTime = connectionConfig.getScannerTimeoutPeriod(); + if (!userRegionLock.tryLock(waitTime, TimeUnit.MILLISECONDS)) { Review comment: Given that we have 4 +1s on using hbase.client.scanner.timeout.period, can we commit this patch ? Also I have created https://issues.apache.org/jira/browse/HBASE-24983 for wrapping scan operation under operation timeout. @apurtell @saintstack @bharathv @virajjasani @infraio what do you guys think ? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [hbase] shahrs87 commented on a change in pull request #2322: [HBASE-24956] ConnectionManager#locateRegionInMeta waits for user region lock indefinitely.
shahrs87 commented on a change in pull request #2322: URL: https://github.com/apache/hbase/pull/2322#discussion_r483859942 ## File path: hbase-client/src/main/java/org/apache/hadoop/hbase/client/ConnectionImplementation.java ## @@ -968,6 +968,19 @@ private RegionLocations locateRegionInMeta(TableName tableName, byte[] row, bool } } + private void takeUserRegionLock() throws IOException { +try { + long waitTime = connectionConfig.getScannerTimeoutPeriod(); + if (!userRegionLock.tryLock(waitTime, TimeUnit.MILLISECONDS)) { Review comment: Hi @infraio, I am not able to understand your last comment. Could you please elaborate ? Thank you ! This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [hbase] shahrs87 commented on a change in pull request #2322: [HBASE-24956] ConnectionManager#locateRegionInMeta waits for user region lock indefinitely.
shahrs87 commented on a change in pull request #2322: URL: https://github.com/apache/hbase/pull/2322#discussion_r482175115 ## File path: hbase-client/src/main/java/org/apache/hadoop/hbase/client/ConnectionImplementation.java ## @@ -968,6 +968,19 @@ private RegionLocations locateRegionInMeta(TableName tableName, byte[] row, bool } } + private void takeUserRegionLock() throws IOException { +try { + long waitTime = connectionConfig.getScannerTimeoutPeriod(); + if (!userRegionLock.tryLock(waitTime, TimeUnit.MILLISECONDS)) { Review comment: Let me explain my POV with one of our client example. We have an internal customer who wants a strict 15 seconds SLA for every operation. Since operation timeout is end to end timeout which includes all the retries, sleep within retries so we suggested them to use operation timeout as 15 seconds. Also we recommended them to set scanner timeout period to 7 seconds with retries config (hbase.client.retries.number) set to 2 . There is some sleep interval between each attempt and we expect the call to complete within 15 seconds and if it doesn't then operation timeout will kick in and fail the call. But we found out that getRegionLocations call is not bounded by operation timeout. Now if we set the lock timeout to same as operation timeout, then in worst case scenario call will fail in 15 (lock timeout) + 15 (lock timeout 2nd try) + 1 (assuming sleep of 1 second) = 31 seconds If we set the lock timeout to same as scanner timeout, then in worst case scenario the call will fail in 7 (scanner timeout) + 7 (scanner timeout) + 1 (sleep between tries) = 16 seconds which is closer to SLA that we promised. Hope this makes sense. If still the community wants to go forward with operation timeout, I will change it to operation timeout. @infraio @bharathv @virajjasani @saintstack @apurtell This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [hbase] shahrs87 commented on a change in pull request #2322: [HBASE-24956] ConnectionManager#locateRegionInMeta waits for user region lock indefinitely.
shahrs87 commented on a change in pull request #2322: URL: https://github.com/apache/hbase/pull/2322#discussion_r482175115 ## File path: hbase-client/src/main/java/org/apache/hadoop/hbase/client/ConnectionImplementation.java ## @@ -968,6 +968,19 @@ private RegionLocations locateRegionInMeta(TableName tableName, byte[] row, bool } } + private void takeUserRegionLock() throws IOException { +try { + long waitTime = connectionConfig.getScannerTimeoutPeriod(); + if (!userRegionLock.tryLock(waitTime, TimeUnit.MILLISECONDS)) { Review comment: We have an internal customer who wants a strict 15 seconds SLA for every operation. Since operation timeout is end to end timeout which includes all the retries, sleep within retries so we suggested them to use operation timeout as 15 seconds. Also we recommended them to set scanner timeout period to 7 seconds with retries config (hbase.client.retries.number) set to 2 . There is some sleep interval between each attempt and we expect the call to complete within 15 seconds and if it doesn't then operation timeout will kick in and fail the call. But we found out that getRegionLocations call is not bounded by operation timeout. Now if we set the lock timeout to same as operation timeout, then in worst case scenario call will fail in 15 (lock timeout) + 15 (lock timeout 2nd try) + 1 (assuming sleep of 1 second) = 31 seconds If we set the lock timeout to same as scanner timeout, then in worst case scenario the call will fail in 7 (scanner timeout) + 7 (scanner timeout) + 1 (sleep between tries) = 16 seconds which is closer to SLA that we promised. Hope this makes sense. If still the community wants to go forward with operation timeout, I will change it to operation timeout. @infraio @bharathv @virajjasani @saintstack @apurtell This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [hbase] shahrs87 commented on a change in pull request #2322: [HBASE-24956] ConnectionManager#locateRegionInMeta waits for user region lock indefinitely.
shahrs87 commented on a change in pull request #2322: URL: https://github.com/apache/hbase/pull/2322#discussion_r480476762 ## File path: hbase-client/src/main/java/org/apache/hadoop/hbase/client/ConnectionImplementation.java ## @@ -968,6 +968,19 @@ private RegionLocations locateRegionInMeta(TableName tableName, byte[] row, bool } } + private void takeUserRegionLock() throws IOException { +try { + long waitTime = connectionConfig.getScannerTimeoutPeriod(); + if (!userRegionLock.tryLock(waitTime, TimeUnit.MILLISECONDS)) { Review comment: @saintstack @apurtell @infraio We need to come to consensus on which timeout property to use ? IMHO don't think we should use operation timeout. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [hbase] shahrs87 commented on a change in pull request #2322: [HBASE-24956] ConnectionManager#locateRegionInMeta waits for user region lock indefinitely.
shahrs87 commented on a change in pull request #2322: URL: https://github.com/apache/hbase/pull/2322#discussion_r480240959 ## File path: hbase-client/src/main/java/org/apache/hadoop/hbase/client/LockTimeoutException.java ## @@ -0,0 +1,32 @@ +/** + * + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.hadoop.hbase.client; + +import org.apache.hadoop.hbase.HBaseIOException; +import org.apache.yetus.audience.InterfaceAudience; + +/* + Thrown whenever we are not able to get the lock within the specified wait time. + */ +@InterfaceAudience.Public Review comment: Maybe when I create a new PR for branch-1, I can re-use the existing exception class. Would that work ? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [hbase] shahrs87 commented on a change in pull request #2322: [HBASE-24956] ConnectionManager#locateRegionInMeta waits for user region lock indefinitely.
shahrs87 commented on a change in pull request #2322: URL: https://github.com/apache/hbase/pull/2322#discussion_r480239457 ## File path: hbase-client/src/main/java/org/apache/hadoop/hbase/client/LockTimeoutException.java ## @@ -0,0 +1,32 @@ +/** + * + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.hadoop.hbase.client; + +import org.apache.hadoop.hbase.HBaseIOException; +import org.apache.yetus.audience.InterfaceAudience; + +/* + Thrown whenever we are not able to get the lock within the specified wait time. + */ +@InterfaceAudience.Public Review comment: > I just realized there is an exact same class LockTimeoutException under org.apache.hadoop.hbase.exceptions, switch to that? This class only exists in branch-1 and not in master/branch-2. :( This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [hbase] shahrs87 commented on a change in pull request #2322: [HBASE-24956] ConnectionManager#locateRegionInMeta waits for user region lock indefinitely.
shahrs87 commented on a change in pull request #2322: URL: https://github.com/apache/hbase/pull/2322#discussion_r480197093 ## File path: hbase-client/src/main/java/org/apache/hadoop/hbase/client/ConnectionImplementation.java ## @@ -968,6 +968,19 @@ private RegionLocations locateRegionInMeta(TableName tableName, byte[] row, bool } } + private void takeUserRegionLock() throws IOException { +try { + long waitTime = connectionConfig.getScannerTimeoutPeriod(); + if (!userRegionLock.tryLock(waitTime, TimeUnit.MILLISECONDS)) { Review comment: > I am +1 to use operation timeout here. @infraio the default value for operation timeout is 20 minutes. Do you want clients to wait for 20 minutes to acquire this lock ? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [hbase] shahrs87 commented on a change in pull request #2322: [HBASE-24956] ConnectionManager#locateRegionInMeta waits for user region lock indefinitely.
shahrs87 commented on a change in pull request #2322: URL: https://github.com/apache/hbase/pull/2322#discussion_r479713606 ## File path: hbase-client/src/main/java/org/apache/hadoop/hbase/client/LockTimeoutException.java ## @@ -0,0 +1,32 @@ +/** + * + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.hadoop.hbase.client; + +import org.apache.hadoop.hbase.HBaseIOException; +import org.apache.yetus.audience.InterfaceAudience; + +/* + Thrown whenever we are not able to get the lock within the specified wait time. + */ +@InterfaceAudience.Public Review comment: Actually I don't know. Since this will thrown back all the way to client so thought to make it public. But open for suggestions. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [hbase] shahrs87 commented on a change in pull request #2322: [HBASE-24956] ConnectionManager#locateRegionInMeta waits for user region lock indefinitely.
shahrs87 commented on a change in pull request #2322: URL: https://github.com/apache/hbase/pull/2322#discussion_r479713547 ## File path: hbase-client/src/main/java/org/apache/hadoop/hbase/client/ConnectionConfiguration.java ## @@ -117,6 +119,11 @@ this.writeRpcTimeout = conf.getInt(HConstants.HBASE_RPC_WRITE_TIMEOUT_KEY, conf.getInt(HConstants.HBASE_RPC_TIMEOUT_KEY, HConstants.DEFAULT_HBASE_RPC_TIMEOUT)); + +this.scannerTimeoutPeriod = HBaseConfiguration.getInt(conf, Review comment: Looks like HConstants.HBASE_REGIONSERVER_LEASE_PERIOD_KEY config property is deprecated after 0.96 release. So remove the deprecated config property altogether. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [hbase] shahrs87 commented on a change in pull request #2322: [HBASE-24956] ConnectionManager#locateRegionInMeta waits for user region lock indefinitely.
shahrs87 commented on a change in pull request #2322: URL: https://github.com/apache/hbase/pull/2322#discussion_r479540958 ## File path: hbase-client/src/main/java/org/apache/hadoop/hbase/client/ConnectionImplementation.java ## @@ -863,13 +863,15 @@ private RegionLocations locateRegionInMeta(TableName tableName, byte[] row, bool } // Query the meta region long pauseBase = this.pause; - userRegionLock.lock(); + takeUserRegionLock(); try { -if (useCache) {// re-check cache after get lock - RegionLocations locations = getCachedLocation(tableName, row); - if (locations != null && locations.getRegionLocation(replicaId) != null) { -return locations; - } +// We don't need to check if useCache is enabled or not. Even if useCache is false Review comment: On my local system I created patch for this on top of 1.3 branch but while porting the patch to branch-2, I missed applying this hunk. @saintstack @apurtell Since you guys already +1 the previous diff, wanted to bring this change to your attention. Sorry for confusion. Cc @bharathv @infraio This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [hbase] shahrs87 commented on a change in pull request #2322: [HBASE-24956] ConnectionManager#locateRegionInMeta waits for user region lock indefinitely.
shahrs87 commented on a change in pull request #2322: URL: https://github.com/apache/hbase/pull/2322#discussion_r478767046 ## File path: hbase-client/src/main/java/org/apache/hadoop/hbase/client/ConnectionImplementation.java ## @@ -968,6 +968,19 @@ private RegionLocations locateRegionInMeta(TableName tableName, byte[] row, bool } } + private void takeUserRegionLock() throws IOException { +try { + long waitTime = connectionConfig.getScannerTimeoutPeriod(); + if (!userRegionLock.tryLock(waitTime, TimeUnit.MILLISECONDS)) { Review comment: @bharathv @infraio Thank you for your feedback. > Specifically I was wondering if hbase.client.operation.timeout would be the right one to use for this. I don't feel hbase.client.operation.timeout is the right choice here too. This config is meant for the whole end to end operation timeout which includes all layers of retries and the default value is 20 mins. If we use this timeout then we are not gaining anything. We can introduce a new config property (something like hbase.client.lock.timeout.period) and default it to something like 10 seconds. That way we don't depend on existing scanner/operation timeout periods. Let me know what you guys think. Thank you ! This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [hbase] shahrs87 commented on a change in pull request #2322: [HBASE-24956] ConnectionManager#locateRegionInMeta waits for user region lock indefinitely.
shahrs87 commented on a change in pull request #2322: URL: https://github.com/apache/hbase/pull/2322#discussion_r478701500 ## File path: hbase-client/src/main/java/org/apache/hadoop/hbase/client/LockTimeoutException.java ## @@ -0,0 +1,32 @@ +/** + * + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.hadoop.hbase.client; + +import java.io.IOException; +import org.apache.yetus.audience.InterfaceAudience; + +/* + Thrown whenever we are not able to get the lock within the specified wait time. + */ +@InterfaceAudience.Public +public class LockTimeoutException extends IOException { Review comment: Fixed in latest commit. Please review again This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [hbase] shahrs87 commented on a change in pull request #2322: [HBASE-24956] ConnectionManager#locateRegionInMeta waits for user region lock indefinitely.
shahrs87 commented on a change in pull request #2322: URL: https://github.com/apache/hbase/pull/2322#discussion_r478701500 ## File path: hbase-client/src/main/java/org/apache/hadoop/hbase/client/LockTimeoutException.java ## @@ -0,0 +1,32 @@ +/** + * + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.hadoop.hbase.client; + +import java.io.IOException; +import org.apache.yetus.audience.InterfaceAudience; + +/* + Thrown whenever we are not able to get the lock within the specified wait time. + */ +@InterfaceAudience.Public +public class LockTimeoutException extends IOException { Review comment: Fixed in latest commit. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org