[jira] [Commented] (HBASE-10277) refactor AsyncProcess
[ https://issues.apache.org/jira/browse/HBASE-10277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13893199#comment-13893199 ] Hudson commented on HBASE-10277: FAILURE: Integrated in HBase-TRUNK-on-Hadoop-1.1 #81 (See [https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-1.1/81/]) HBASE-10277 refactor AsyncProcess ADDENDUM fix javadoc warnings (sershe: rev 1565031) * /hbase/trunk/hbase-client/src/main/java/org/apache/hadoop/hbase/client/MultiResponse.java refactor AsyncProcess - Key: HBASE-10277 URL: https://issues.apache.org/jira/browse/HBASE-10277 Project: HBase Issue Type: Improvement Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Fix For: 0.99.0 Attachments: HBASE-10277-addendum.patch, HBASE-10277.01.patch, HBASE-10277.02.patch, HBASE-10277.03.patch, HBASE-10277.04.patch, HBASE-10277.05.patch, HBASE-10277.06.patch, HBASE-10277.07.patch, HBASE-10277.patch AsyncProcess currently has two patterns of usage, one from HTable flush w/o callback and with reuse, and one from HCM/HTable batch call, with callback and w/o reuse. In the former case (but not the latter), it also does some throttling of actions on initial submit call, limiting the number of outstanding actions per server. The latter case is relatively straightforward. The former appears to be error prone due to reuse - if, as javadoc claims should be safe, multiple submit calls are performed without waiting for the async part of the previous call to finish, fields like hasError become ambiguous and can be used for the wrong call; callback for success/failure is called based on original index of an action in submitted list, but with only one callback supplied to AP in ctor it's not clear to which submit call the index belongs, if several are outstanding. I was going to add support for HBASE-10070 to AP, and found that it might be difficult to do cleanly. It would be nice to normalize AP usage patterns; in particular, separate the global part (load tracking) from per-submit-call part. Per-submit part can more conveniently track stuff like initialActions, mapping of indexes and retry information, that is currently passed around the method calls. -I am not sure yet, but maybe sending of the original index to server in ClientProtos.MultiAction can also be avoided.- Cannot be avoided because the API to server doesn't have one-to-one correspondence between requests and responses in an individual call to multi (retries/rearrangement have nothing to do with it) -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HBASE-10277) refactor AsyncProcess
[ https://issues.apache.org/jira/browse/HBASE-10277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13892662#comment-13892662 ] Hudson commented on HBASE-10277: FAILURE: Integrated in HBase-TRUNK #4889 (See [https://builds.apache.org/job/HBase-TRUNK/4889/]) HBASE-10277 refactor AsyncProcess (sershe: rev 1564832) * /hbase/trunk/hbase-client/src/main/java/org/apache/hadoop/hbase/client/AsyncProcess.java * /hbase/trunk/hbase-client/src/main/java/org/apache/hadoop/hbase/client/HConnection.java * /hbase/trunk/hbase-client/src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java * /hbase/trunk/hbase-client/src/main/java/org/apache/hadoop/hbase/client/HTable.java * /hbase/trunk/hbase-client/src/main/java/org/apache/hadoop/hbase/client/MultiResponse.java * /hbase/trunk/hbase-client/src/main/java/org/apache/hadoop/hbase/client/RetriesExhaustedWithDetailsException.java * /hbase/trunk/hbase-client/src/main/java/org/apache/hadoop/hbase/protobuf/ResponseConverter.java * /hbase/trunk/hbase-client/src/test/java/org/apache/hadoop/hbase/client/TestAsyncProcess.java * /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/client/CoprocessorHConnection.java * /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/client/HConnectionTestingUtility.java * /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/master/TestCatalogJanitor.java refactor AsyncProcess - Key: HBASE-10277 URL: https://issues.apache.org/jira/browse/HBASE-10277 Project: HBase Issue Type: Improvement Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Attachments: HBASE-10277.01.patch, HBASE-10277.02.patch, HBASE-10277.03.patch, HBASE-10277.04.patch, HBASE-10277.05.patch, HBASE-10277.06.patch, HBASE-10277.07.patch, HBASE-10277.patch AsyncProcess currently has two patterns of usage, one from HTable flush w/o callback and with reuse, and one from HCM/HTable batch call, with callback and w/o reuse. In the former case (but not the latter), it also does some throttling of actions on initial submit call, limiting the number of outstanding actions per server. The latter case is relatively straightforward. The former appears to be error prone due to reuse - if, as javadoc claims should be safe, multiple submit calls are performed without waiting for the async part of the previous call to finish, fields like hasError become ambiguous and can be used for the wrong call; callback for success/failure is called based on original index of an action in submitted list, but with only one callback supplied to AP in ctor it's not clear to which submit call the index belongs, if several are outstanding. I was going to add support for HBASE-10070 to AP, and found that it might be difficult to do cleanly. It would be nice to normalize AP usage patterns; in particular, separate the global part (load tracking) from per-submit-call part. Per-submit part can more conveniently track stuff like initialActions, mapping of indexes and retry information, that is currently passed around the method calls. -I am not sure yet, but maybe sending of the original index to server in ClientProtos.MultiAction can also be avoided.- Cannot be avoided because the API to server doesn't have one-to-one correspondence between requests and responses in an individual call to multi (retries/rearrangement have nothing to do with it) -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HBASE-10277) refactor AsyncProcess
[ https://issues.apache.org/jira/browse/HBASE-10277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13892769#comment-13892769 ] Enis Soztutar commented on HBASE-10277: --- Sergey did you commit this? An update would have been nice. Also there seems to be javadoc warnings. Can you do an addendum patch for fixing those. refactor AsyncProcess - Key: HBASE-10277 URL: https://issues.apache.org/jira/browse/HBASE-10277 Project: HBase Issue Type: Improvement Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Fix For: 0.99.0 Attachments: HBASE-10277.01.patch, HBASE-10277.02.patch, HBASE-10277.03.patch, HBASE-10277.04.patch, HBASE-10277.05.patch, HBASE-10277.06.patch, HBASE-10277.07.patch, HBASE-10277.patch AsyncProcess currently has two patterns of usage, one from HTable flush w/o callback and with reuse, and one from HCM/HTable batch call, with callback and w/o reuse. In the former case (but not the latter), it also does some throttling of actions on initial submit call, limiting the number of outstanding actions per server. The latter case is relatively straightforward. The former appears to be error prone due to reuse - if, as javadoc claims should be safe, multiple submit calls are performed without waiting for the async part of the previous call to finish, fields like hasError become ambiguous and can be used for the wrong call; callback for success/failure is called based on original index of an action in submitted list, but with only one callback supplied to AP in ctor it's not clear to which submit call the index belongs, if several are outstanding. I was going to add support for HBASE-10070 to AP, and found that it might be difficult to do cleanly. It would be nice to normalize AP usage patterns; in particular, separate the global part (load tracking) from per-submit-call part. Per-submit part can more conveniently track stuff like initialActions, mapping of indexes and retry information, that is currently passed around the method calls. -I am not sure yet, but maybe sending of the original index to server in ClientProtos.MultiAction can also be avoided.- Cannot be avoided because the API to server doesn't have one-to-one correspondence between requests and responses in an individual call to multi (retries/rearrangement have nothing to do with it) -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HBASE-10277) refactor AsyncProcess
[ https://issues.apache.org/jira/browse/HBASE-10277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13892770#comment-13892770 ] Enis Soztutar commented on HBASE-10277: --- Can you also resolve the issue. refactor AsyncProcess - Key: HBASE-10277 URL: https://issues.apache.org/jira/browse/HBASE-10277 Project: HBase Issue Type: Improvement Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Fix For: 0.99.0 Attachments: HBASE-10277.01.patch, HBASE-10277.02.patch, HBASE-10277.03.patch, HBASE-10277.04.patch, HBASE-10277.05.patch, HBASE-10277.06.patch, HBASE-10277.07.patch, HBASE-10277.patch AsyncProcess currently has two patterns of usage, one from HTable flush w/o callback and with reuse, and one from HCM/HTable batch call, with callback and w/o reuse. In the former case (but not the latter), it also does some throttling of actions on initial submit call, limiting the number of outstanding actions per server. The latter case is relatively straightforward. The former appears to be error prone due to reuse - if, as javadoc claims should be safe, multiple submit calls are performed without waiting for the async part of the previous call to finish, fields like hasError become ambiguous and can be used for the wrong call; callback for success/failure is called based on original index of an action in submitted list, but with only one callback supplied to AP in ctor it's not clear to which submit call the index belongs, if several are outstanding. I was going to add support for HBASE-10070 to AP, and found that it might be difficult to do cleanly. It would be nice to normalize AP usage patterns; in particular, separate the global part (load tracking) from per-submit-call part. Per-submit part can more conveniently track stuff like initialActions, mapping of indexes and retry information, that is currently passed around the method calls. -I am not sure yet, but maybe sending of the original index to server in ClientProtos.MultiAction can also be avoided.- Cannot be avoided because the API to server doesn't have one-to-one correspondence between requests and responses in an individual call to multi (retries/rearrangement have nothing to do with it) -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HBASE-10277) refactor AsyncProcess
[ https://issues.apache.org/jira/browse/HBASE-10277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13892829#comment-13892829 ] Hudson commented on HBASE-10277: FAILURE: Integrated in HBase-TRUNK-on-Hadoop-1.1 #80 (See [https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-1.1/80/]) HBASE-10277 refactor AsyncProcess (sershe: rev 1564832) * /hbase/trunk/hbase-client/src/main/java/org/apache/hadoop/hbase/client/AsyncProcess.java * /hbase/trunk/hbase-client/src/main/java/org/apache/hadoop/hbase/client/HConnection.java * /hbase/trunk/hbase-client/src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java * /hbase/trunk/hbase-client/src/main/java/org/apache/hadoop/hbase/client/HTable.java * /hbase/trunk/hbase-client/src/main/java/org/apache/hadoop/hbase/client/MultiResponse.java * /hbase/trunk/hbase-client/src/main/java/org/apache/hadoop/hbase/client/RetriesExhaustedWithDetailsException.java * /hbase/trunk/hbase-client/src/main/java/org/apache/hadoop/hbase/protobuf/ResponseConverter.java * /hbase/trunk/hbase-client/src/test/java/org/apache/hadoop/hbase/client/TestAsyncProcess.java * /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/client/CoprocessorHConnection.java * /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/client/HConnectionTestingUtility.java * /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/master/TestCatalogJanitor.java refactor AsyncProcess - Key: HBASE-10277 URL: https://issues.apache.org/jira/browse/HBASE-10277 Project: HBase Issue Type: Improvement Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Fix For: 0.99.0 Attachments: HBASE-10277.01.patch, HBASE-10277.02.patch, HBASE-10277.03.patch, HBASE-10277.04.patch, HBASE-10277.05.patch, HBASE-10277.06.patch, HBASE-10277.07.patch, HBASE-10277.patch AsyncProcess currently has two patterns of usage, one from HTable flush w/o callback and with reuse, and one from HCM/HTable batch call, with callback and w/o reuse. In the former case (but not the latter), it also does some throttling of actions on initial submit call, limiting the number of outstanding actions per server. The latter case is relatively straightforward. The former appears to be error prone due to reuse - if, as javadoc claims should be safe, multiple submit calls are performed without waiting for the async part of the previous call to finish, fields like hasError become ambiguous and can be used for the wrong call; callback for success/failure is called based on original index of an action in submitted list, but with only one callback supplied to AP in ctor it's not clear to which submit call the index belongs, if several are outstanding. I was going to add support for HBASE-10070 to AP, and found that it might be difficult to do cleanly. It would be nice to normalize AP usage patterns; in particular, separate the global part (load tracking) from per-submit-call part. Per-submit part can more conveniently track stuff like initialActions, mapping of indexes and retry information, that is currently passed around the method calls. -I am not sure yet, but maybe sending of the original index to server in ClientProtos.MultiAction can also be avoided.- Cannot be avoided because the API to server doesn't have one-to-one correspondence between requests and responses in an individual call to multi (retries/rearrangement have nothing to do with it) -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HBASE-10277) refactor AsyncProcess
[ https://issues.apache.org/jira/browse/HBASE-10277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13892876#comment-13892876 ] Enis Soztutar commented on HBASE-10277: --- +1 on addendum. Thanks Sergey! refactor AsyncProcess - Key: HBASE-10277 URL: https://issues.apache.org/jira/browse/HBASE-10277 Project: HBase Issue Type: Improvement Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Fix For: 0.99.0 Attachments: HBASE-10277-addendum.patch, HBASE-10277.01.patch, HBASE-10277.02.patch, HBASE-10277.03.patch, HBASE-10277.04.patch, HBASE-10277.05.patch, HBASE-10277.06.patch, HBASE-10277.07.patch, HBASE-10277.patch AsyncProcess currently has two patterns of usage, one from HTable flush w/o callback and with reuse, and one from HCM/HTable batch call, with callback and w/o reuse. In the former case (but not the latter), it also does some throttling of actions on initial submit call, limiting the number of outstanding actions per server. The latter case is relatively straightforward. The former appears to be error prone due to reuse - if, as javadoc claims should be safe, multiple submit calls are performed without waiting for the async part of the previous call to finish, fields like hasError become ambiguous and can be used for the wrong call; callback for success/failure is called based on original index of an action in submitted list, but with only one callback supplied to AP in ctor it's not clear to which submit call the index belongs, if several are outstanding. I was going to add support for HBASE-10070 to AP, and found that it might be difficult to do cleanly. It would be nice to normalize AP usage patterns; in particular, separate the global part (load tracking) from per-submit-call part. Per-submit part can more conveniently track stuff like initialActions, mapping of indexes and retry information, that is currently passed around the method calls. -I am not sure yet, but maybe sending of the original index to server in ClientProtos.MultiAction can also be avoided.- Cannot be avoided because the API to server doesn't have one-to-one correspondence between requests and responses in an individual call to multi (retries/rearrangement have nothing to do with it) -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HBASE-10277) refactor AsyncProcess
[ https://issues.apache.org/jira/browse/HBASE-10277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13893110#comment-13893110 ] Hudson commented on HBASE-10277: FAILURE: Integrated in HBase-TRUNK #4893 (See [https://builds.apache.org/job/HBase-TRUNK/4893/]) HBASE-10277 refactor AsyncProcess ADDENDUM fix javadoc warnings (sershe: rev 1565031) * /hbase/trunk/hbase-client/src/main/java/org/apache/hadoop/hbase/client/MultiResponse.java refactor AsyncProcess - Key: HBASE-10277 URL: https://issues.apache.org/jira/browse/HBASE-10277 Project: HBase Issue Type: Improvement Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Fix For: 0.99.0 Attachments: HBASE-10277-addendum.patch, HBASE-10277.01.patch, HBASE-10277.02.patch, HBASE-10277.03.patch, HBASE-10277.04.patch, HBASE-10277.05.patch, HBASE-10277.06.patch, HBASE-10277.07.patch, HBASE-10277.patch AsyncProcess currently has two patterns of usage, one from HTable flush w/o callback and with reuse, and one from HCM/HTable batch call, with callback and w/o reuse. In the former case (but not the latter), it also does some throttling of actions on initial submit call, limiting the number of outstanding actions per server. The latter case is relatively straightforward. The former appears to be error prone due to reuse - if, as javadoc claims should be safe, multiple submit calls are performed without waiting for the async part of the previous call to finish, fields like hasError become ambiguous and can be used for the wrong call; callback for success/failure is called based on original index of an action in submitted list, but with only one callback supplied to AP in ctor it's not clear to which submit call the index belongs, if several are outstanding. I was going to add support for HBASE-10070 to AP, and found that it might be difficult to do cleanly. It would be nice to normalize AP usage patterns; in particular, separate the global part (load tracking) from per-submit-call part. Per-submit part can more conveniently track stuff like initialActions, mapping of indexes and retry information, that is currently passed around the method calls. -I am not sure yet, but maybe sending of the original index to server in ClientProtos.MultiAction can also be avoided.- Cannot be avoided because the API to server doesn't have one-to-one correspondence between requests and responses in an individual call to multi (retries/rearrangement have nothing to do with it) -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HBASE-10277) refactor AsyncProcess
[ https://issues.apache.org/jira/browse/HBASE-10277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13891116#comment-13891116 ] Hadoop QA commented on HBASE-10277: --- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12626923/HBASE-10277.05.patch against trunk revision . ATTACHMENT ID: 12626923 {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 9 new or modified tests. {color:green}+1 hadoop1.0{color}. The patch compiles against the hadoop 1.0 profile. {color:green}+1 hadoop1.1{color}. The patch compiles against the hadoop 1.1 profile. {color:red}-1 javadoc{color}. The javadoc tool appears to have generated 2 warning messages. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 lineLengths{color}. The patch introduces the following lines longer than 100: + // This action failed before creating ars. Add it to retained but do not add to submit list. + Retrying. Server is + server.getServerName() + , tableName= + tableName, t); + Throwable error, long backOffTime, boolean willRetry, String startTime){ {color:green}+1 site{color}. The mvn site goal succeeds with this patch. {color:green}+1 core tests{color}. The patch passed unit tests in . Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/8591//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8591//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8591//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8591//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8591//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8591//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8591//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8591//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8591//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-thrift.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8591//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/8591//console This message is automatically generated. refactor AsyncProcess - Key: HBASE-10277 URL: https://issues.apache.org/jira/browse/HBASE-10277 Project: HBase Issue Type: Improvement Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Attachments: HBASE-10277.01.patch, HBASE-10277.02.patch, HBASE-10277.03.patch, HBASE-10277.04.patch, HBASE-10277.05.patch, HBASE-10277.patch AsyncProcess currently has two patterns of usage, one from HTable flush w/o callback and with reuse, and one from HCM/HTable batch call, with callback and w/o reuse. In the former case (but not the latter), it also does some throttling of actions on initial submit call, limiting the number of outstanding actions per server. The latter case is relatively straightforward. The former appears to be error prone due to reuse - if, as javadoc claims should be safe, multiple submit calls are performed without waiting for the async part of the previous call to finish, fields like hasError become ambiguous and can be used for the wrong call; callback for success/failure is called based on original index of an action in submitted list, but with only one callback supplied to AP in ctor it's not clear to which submit call the index belongs, if several are outstanding. I was going to add support for HBASE-10070 to AP, and found that it might be difficult to do cleanly. It would be nice to normalize AP usage patterns; in particular, separate the global part (load tracking) from per-submit-call
[jira] [Commented] (HBASE-10277) refactor AsyncProcess
[ https://issues.apache.org/jira/browse/HBASE-10277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13891152#comment-13891152 ] Nicolas Liochon commented on HBASE-10277: - I published the review. My comments are cosmetic mainly, we discussed all the main points previously. I'm +1 (with my comments taken into account obviously ;- ). Thanks, Sergey. refactor AsyncProcess - Key: HBASE-10277 URL: https://issues.apache.org/jira/browse/HBASE-10277 Project: HBase Issue Type: Improvement Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Attachments: HBASE-10277.01.patch, HBASE-10277.02.patch, HBASE-10277.03.patch, HBASE-10277.04.patch, HBASE-10277.05.patch, HBASE-10277.patch AsyncProcess currently has two patterns of usage, one from HTable flush w/o callback and with reuse, and one from HCM/HTable batch call, with callback and w/o reuse. In the former case (but not the latter), it also does some throttling of actions on initial submit call, limiting the number of outstanding actions per server. The latter case is relatively straightforward. The former appears to be error prone due to reuse - if, as javadoc claims should be safe, multiple submit calls are performed without waiting for the async part of the previous call to finish, fields like hasError become ambiguous and can be used for the wrong call; callback for success/failure is called based on original index of an action in submitted list, but with only one callback supplied to AP in ctor it's not clear to which submit call the index belongs, if several are outstanding. I was going to add support for HBASE-10070 to AP, and found that it might be difficult to do cleanly. It would be nice to normalize AP usage patterns; in particular, separate the global part (load tracking) from per-submit-call part. Per-submit part can more conveniently track stuff like initialActions, mapping of indexes and retry information, that is currently passed around the method calls. -I am not sure yet, but maybe sending of the original index to server in ClientProtos.MultiAction can also be avoided.- Cannot be avoided because the API to server doesn't have one-to-one correspondence between requests and responses in an individual call to multi (retries/rearrangement have nothing to do with it) -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HBASE-10277) refactor AsyncProcess
[ https://issues.apache.org/jira/browse/HBASE-10277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13891422#comment-13891422 ] Enis Soztutar commented on HBASE-10277: --- bq. Wrt moving error management for streaming mode out of AsyncProcess, for example into HTable, I think it might hurt performance because HTable will have to manage all these objects. Let me try to think if this can be avoided. Ok, we can address that as a follow up if there is benefit in abstracting that away. One small question in RB, other than that I'm +1 as well. refactor AsyncProcess - Key: HBASE-10277 URL: https://issues.apache.org/jira/browse/HBASE-10277 Project: HBase Issue Type: Improvement Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Attachments: HBASE-10277.01.patch, HBASE-10277.02.patch, HBASE-10277.03.patch, HBASE-10277.04.patch, HBASE-10277.05.patch, HBASE-10277.06.patch, HBASE-10277.patch AsyncProcess currently has two patterns of usage, one from HTable flush w/o callback and with reuse, and one from HCM/HTable batch call, with callback and w/o reuse. In the former case (but not the latter), it also does some throttling of actions on initial submit call, limiting the number of outstanding actions per server. The latter case is relatively straightforward. The former appears to be error prone due to reuse - if, as javadoc claims should be safe, multiple submit calls are performed without waiting for the async part of the previous call to finish, fields like hasError become ambiguous and can be used for the wrong call; callback for success/failure is called based on original index of an action in submitted list, but with only one callback supplied to AP in ctor it's not clear to which submit call the index belongs, if several are outstanding. I was going to add support for HBASE-10070 to AP, and found that it might be difficult to do cleanly. It would be nice to normalize AP usage patterns; in particular, separate the global part (load tracking) from per-submit-call part. Per-submit part can more conveniently track stuff like initialActions, mapping of indexes and retry information, that is currently passed around the method calls. -I am not sure yet, but maybe sending of the original index to server in ClientProtos.MultiAction can also be avoided.- Cannot be avoided because the API to server doesn't have one-to-one correspondence between requests and responses in an individual call to multi (retries/rearrangement have nothing to do with it) -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HBASE-10277) refactor AsyncProcess
[ https://issues.apache.org/jira/browse/HBASE-10277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13891455#comment-13891455 ] Sergey Shelukhin commented on HBASE-10277: -- I will file a follow-up JIRA. Responded on RB, but it now gives me errors, not sure if it got posted. The answer is yes. refactor AsyncProcess - Key: HBASE-10277 URL: https://issues.apache.org/jira/browse/HBASE-10277 Project: HBase Issue Type: Improvement Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Attachments: HBASE-10277.01.patch, HBASE-10277.02.patch, HBASE-10277.03.patch, HBASE-10277.04.patch, HBASE-10277.05.patch, HBASE-10277.06.patch, HBASE-10277.patch AsyncProcess currently has two patterns of usage, one from HTable flush w/o callback and with reuse, and one from HCM/HTable batch call, with callback and w/o reuse. In the former case (but not the latter), it also does some throttling of actions on initial submit call, limiting the number of outstanding actions per server. The latter case is relatively straightforward. The former appears to be error prone due to reuse - if, as javadoc claims should be safe, multiple submit calls are performed without waiting for the async part of the previous call to finish, fields like hasError become ambiguous and can be used for the wrong call; callback for success/failure is called based on original index of an action in submitted list, but with only one callback supplied to AP in ctor it's not clear to which submit call the index belongs, if several are outstanding. I was going to add support for HBASE-10070 to AP, and found that it might be difficult to do cleanly. It would be nice to normalize AP usage patterns; in particular, separate the global part (load tracking) from per-submit-call part. Per-submit part can more conveniently track stuff like initialActions, mapping of indexes and retry information, that is currently passed around the method calls. -I am not sure yet, but maybe sending of the original index to server in ClientProtos.MultiAction can also be avoided.- Cannot be avoided because the API to server doesn't have one-to-one correspondence between requests and responses in an individual call to multi (retries/rearrangement have nothing to do with it) -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HBASE-10277) refactor AsyncProcess
[ https://issues.apache.org/jira/browse/HBASE-10277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13891572#comment-13891572 ] Hadoop QA commented on HBASE-10277: --- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12627021/HBASE-10277.06.patch against trunk revision . ATTACHMENT ID: 12627021 {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 9 new or modified tests. {color:green}+1 hadoop1.0{color}. The patch compiles against the hadoop 1.0 profile. {color:green}+1 hadoop1.1{color}. The patch compiles against the hadoop 1.1 profile. {color:red}-1 javadoc{color}. The javadoc tool appears to have generated 3 warning messages. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 lineLengths{color}. The patch introduces the following lines longer than 100: + * See {@link #submit(ExecutorService, TableName, List, boolean, org.apache.hadoop.hbase.client.coprocessor.Batch.Callback, boolean)}. + boolean atLeastOne, Batch.CallbackCResult callback, boolean needResults) throws InterruptedIOException { + // This action failed before creating ars. Add it to retained but do not add to submit list. + * See {@link #submitAll(ExecutorService, TableName, List, org.apache.hadoop.hbase.client.coprocessor.Batch.Callback, Object[])}. + Retrying. Server is + server.getServerName() + , tableName= + tableName, t); + Throwable error, long backOffTime, boolean willRetry, String startTime){ {color:green}+1 site{color}. The mvn site goal succeeds with this patch. {color:red}-1 core tests{color}. The patch failed these unit tests: org.apache.hadoop.hbase.master.TestCatalogJanitor Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/8598//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8598//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8598//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8598//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8598//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8598//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8598//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8598//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8598//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-thrift.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8598//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/8598//console This message is automatically generated. refactor AsyncProcess - Key: HBASE-10277 URL: https://issues.apache.org/jira/browse/HBASE-10277 Project: HBase Issue Type: Improvement Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Attachments: HBASE-10277.01.patch, HBASE-10277.02.patch, HBASE-10277.03.patch, HBASE-10277.04.patch, HBASE-10277.05.patch, HBASE-10277.06.patch, HBASE-10277.patch AsyncProcess currently has two patterns of usage, one from HTable flush w/o callback and with reuse, and one from HCM/HTable batch call, with callback and w/o reuse. In the former case (but not the latter), it also does some throttling of actions on initial submit call, limiting the number of outstanding actions per server. The latter case is relatively straightforward. The former appears to be error prone due to reuse - if, as javadoc claims should be safe, multiple submit calls are performed without waiting for the async part of the previous call to finish, fields like hasError become ambiguous and can be used for the
[jira] [Commented] (HBASE-10277) refactor AsyncProcess
[ https://issues.apache.org/jira/browse/HBASE-10277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13891584#comment-13891584 ] Sergey Shelukhin commented on HBASE-10277: -- probably another mock-related failure... let me investigate refactor AsyncProcess - Key: HBASE-10277 URL: https://issues.apache.org/jira/browse/HBASE-10277 Project: HBase Issue Type: Improvement Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Attachments: HBASE-10277.01.patch, HBASE-10277.02.patch, HBASE-10277.03.patch, HBASE-10277.04.patch, HBASE-10277.05.patch, HBASE-10277.06.patch, HBASE-10277.patch AsyncProcess currently has two patterns of usage, one from HTable flush w/o callback and with reuse, and one from HCM/HTable batch call, with callback and w/o reuse. In the former case (but not the latter), it also does some throttling of actions on initial submit call, limiting the number of outstanding actions per server. The latter case is relatively straightforward. The former appears to be error prone due to reuse - if, as javadoc claims should be safe, multiple submit calls are performed without waiting for the async part of the previous call to finish, fields like hasError become ambiguous and can be used for the wrong call; callback for success/failure is called based on original index of an action in submitted list, but with only one callback supplied to AP in ctor it's not clear to which submit call the index belongs, if several are outstanding. I was going to add support for HBASE-10070 to AP, and found that it might be difficult to do cleanly. It would be nice to normalize AP usage patterns; in particular, separate the global part (load tracking) from per-submit-call part. Per-submit part can more conveniently track stuff like initialActions, mapping of indexes and retry information, that is currently passed around the method calls. -I am not sure yet, but maybe sending of the original index to server in ClientProtos.MultiAction can also be avoided.- Cannot be avoided because the API to server doesn't have one-to-one correspondence between requests and responses in an individual call to multi (retries/rearrangement have nothing to do with it) -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HBASE-10277) refactor AsyncProcess
[ https://issues.apache.org/jira/browse/HBASE-10277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13891597#comment-13891597 ] Sergey Shelukhin commented on HBASE-10277: -- yeah NPE because mocj doesn't override newly added reusable AP. I'll see what can be done, probably just create new API in the mock. I'd assume +1s would stand with that change, so I'd commit late today or tomorrow refactor AsyncProcess - Key: HBASE-10277 URL: https://issues.apache.org/jira/browse/HBASE-10277 Project: HBase Issue Type: Improvement Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Attachments: HBASE-10277.01.patch, HBASE-10277.02.patch, HBASE-10277.03.patch, HBASE-10277.04.patch, HBASE-10277.05.patch, HBASE-10277.06.patch, HBASE-10277.patch AsyncProcess currently has two patterns of usage, one from HTable flush w/o callback and with reuse, and one from HCM/HTable batch call, with callback and w/o reuse. In the former case (but not the latter), it also does some throttling of actions on initial submit call, limiting the number of outstanding actions per server. The latter case is relatively straightforward. The former appears to be error prone due to reuse - if, as javadoc claims should be safe, multiple submit calls are performed without waiting for the async part of the previous call to finish, fields like hasError become ambiguous and can be used for the wrong call; callback for success/failure is called based on original index of an action in submitted list, but with only one callback supplied to AP in ctor it's not clear to which submit call the index belongs, if several are outstanding. I was going to add support for HBASE-10070 to AP, and found that it might be difficult to do cleanly. It would be nice to normalize AP usage patterns; in particular, separate the global part (load tracking) from per-submit-call part. Per-submit part can more conveniently track stuff like initialActions, mapping of indexes and retry information, that is currently passed around the method calls. -I am not sure yet, but maybe sending of the original index to server in ClientProtos.MultiAction can also be avoided.- Cannot be avoided because the API to server doesn't have one-to-one correspondence between requests and responses in an individual call to multi (retries/rearrangement have nothing to do with it) -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HBASE-10277) refactor AsyncProcess
[ https://issues.apache.org/jira/browse/HBASE-10277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13891822#comment-13891822 ] Hadoop QA commented on HBASE-10277: --- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12627062/HBASE-10277.07.patch against trunk revision . ATTACHMENT ID: 12627062 {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 9 new or modified tests. {color:green}+1 hadoop1.0{color}. The patch compiles against the hadoop 1.0 profile. {color:green}+1 hadoop1.1{color}. The patch compiles against the hadoop 1.1 profile. {color:red}-1 javadoc{color}. The javadoc tool appears to have generated 2 warning messages. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 lineLengths{color}. The patch introduces the following lines longer than 100: + * See {@link #submit(ExecutorService, TableName, List, boolean, org.apache.hadoop.hbase.client.coprocessor.Batch.Callback, boolean)}. + boolean atLeastOne, Batch.CallbackCResult callback, boolean needResults) throws InterruptedIOException { + // This action failed before creating ars. Add it to retained but do not add to submit list. + * See {@link #submitAll(ExecutorService, TableName, List, org.apache.hadoop.hbase.client.coprocessor.Batch.Callback, Object[])}. + Retrying. Server is + server.getServerName() + , tableName= + tableName, t); + Throwable error, long backOffTime, boolean willRetry, String startTime){ {color:green}+1 site{color}. The mvn site goal succeeds with this patch. {color:green}+1 core tests{color}. The patch passed unit tests in . Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/8600//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8600//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8600//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-thrift.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8600//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8600//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8600//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8600//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8600//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8600//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8600//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/8600//console This message is automatically generated. refactor AsyncProcess - Key: HBASE-10277 URL: https://issues.apache.org/jira/browse/HBASE-10277 Project: HBase Issue Type: Improvement Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Attachments: HBASE-10277.01.patch, HBASE-10277.02.patch, HBASE-10277.03.patch, HBASE-10277.04.patch, HBASE-10277.05.patch, HBASE-10277.06.patch, HBASE-10277.07.patch, HBASE-10277.patch AsyncProcess currently has two patterns of usage, one from HTable flush w/o callback and with reuse, and one from HCM/HTable batch call, with callback and w/o reuse. In the former case (but not the latter), it also does some throttling of actions on initial submit call, limiting the number of outstanding actions per server. The latter case is relatively straightforward. The former appears to be error prone due to reuse - if, as javadoc claims should be safe, multiple submit calls are performed without waiting for the async part of the previous call to finish, fields like hasError become ambiguous and can be used for the wrong call; callback for success/failure is called
[jira] [Commented] (HBASE-10277) refactor AsyncProcess
[ https://issues.apache.org/jira/browse/HBASE-10277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13888063#comment-13888063 ] Sergey Shelukhin commented on HBASE-10277: -- TestEncodedSeekers stuff happens in other JIRAs and is probably unrelated refactor AsyncProcess - Key: HBASE-10277 URL: https://issues.apache.org/jira/browse/HBASE-10277 Project: HBase Issue Type: Improvement Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Attachments: HBASE-10277.01.patch, HBASE-10277.02.patch, HBASE-10277.03.patch, HBASE-10277.04.patch, HBASE-10277.patch AsyncProcess currently has two patterns of usage, one from HTable flush w/o callback and with reuse, and one from HCM/HTable batch call, with callback and w/o reuse. In the former case (but not the latter), it also does some throttling of actions on initial submit call, limiting the number of outstanding actions per server. The latter case is relatively straightforward. The former appears to be error prone due to reuse - if, as javadoc claims should be safe, multiple submit calls are performed without waiting for the async part of the previous call to finish, fields like hasError become ambiguous and can be used for the wrong call; callback for success/failure is called based on original index of an action in submitted list, but with only one callback supplied to AP in ctor it's not clear to which submit call the index belongs, if several are outstanding. I was going to add support for HBASE-10070 to AP, and found that it might be difficult to do cleanly. It would be nice to normalize AP usage patterns; in particular, separate the global part (load tracking) from per-submit-call part. Per-submit part can more conveniently track stuff like initialActions, mapping of indexes and retry information, that is currently passed around the method calls. -I am not sure yet, but maybe sending of the original index to server in ClientProtos.MultiAction can also be avoided.- Cannot be avoided because the API to server doesn't have one-to-one correspondence between requests and responses in an individual call to multi (retries/rearrangement have nothing to do with it) -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HBASE-10277) refactor AsyncProcess
[ https://issues.apache.org/jira/browse/HBASE-10277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13887400#comment-13887400 ] Sergey Shelukhin commented on HBASE-10277: -- I think I know what the issue is with these tests... I will probably fix tomorrow, or later today; the patch is still ready for review :) refactor AsyncProcess - Key: HBASE-10277 URL: https://issues.apache.org/jira/browse/HBASE-10277 Project: HBase Issue Type: Improvement Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Attachments: HBASE-10277.01.patch, HBASE-10277.02.patch, HBASE-10277.03.patch, HBASE-10277.patch AsyncProcess currently has two patterns of usage, one from HTable flush w/o callback and with reuse, and one from HCM/HTable batch call, with callback and w/o reuse. In the former case (but not the latter), it also does some throttling of actions on initial submit call, limiting the number of outstanding actions per server. The latter case is relatively straightforward. The former appears to be error prone due to reuse - if, as javadoc claims should be safe, multiple submit calls are performed without waiting for the async part of the previous call to finish, fields like hasError become ambiguous and can be used for the wrong call; callback for success/failure is called based on original index of an action in submitted list, but with only one callback supplied to AP in ctor it's not clear to which submit call the index belongs, if several are outstanding. I was going to add support for HBASE-10070 to AP, and found that it might be difficult to do cleanly. It would be nice to normalize AP usage patterns; in particular, separate the global part (load tracking) from per-submit-call part. Per-submit part can more conveniently track stuff like initialActions, mapping of indexes and retry information, that is currently passed around the method calls. -I am not sure yet, but maybe sending of the original index to server in ClientProtos.MultiAction can also be avoided.- Cannot be avoided because the API to server doesn't have one-to-one correspondence between requests and responses in an individual call to multi (retries/rearrangement have nothing to do with it) -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HBASE-10277) refactor AsyncProcess
[ https://issues.apache.org/jira/browse/HBASE-10277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13887478#comment-13887478 ] Hadoop QA commented on HBASE-10277: --- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12626252/HBASE-10277.04.patch against trunk revision . ATTACHMENT ID: 12626252 {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 9 new or modified tests. {color:green}+1 hadoop1.0{color}. The patch compiles against the hadoop 1.0 profile. {color:green}+1 hadoop1.1{color}. The patch compiles against the hadoop 1.1 profile. {color:red}-1 javadoc{color}. The javadoc tool appears to have generated 2 warning messages. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 lineLengths{color}. The patch introduces the following lines longer than 100: + // This action failed before creating ars. Add it to retained but do not add to submit list. + Retrying. Server is + server.getServerName() + , tableName= + tableName, t); + Throwable error, long backOffTime, boolean willRetry, String startTime){ {color:green}+1 site{color}. The mvn site goal succeeds with this patch. {color:red}-1 core tests{color}. The patch failed these unit tests: {color:red}-1 core zombie tests{color}. There are 1 zombie test(s): at org.apache.hadoop.hbase.io.encoding.TestEncodedSeekers.testEncodedSeeker(TestEncodedSeekers.java:129) Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/8564//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8564//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8564//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8564//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8564//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8564//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8564//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8564//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8564//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-thrift.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8564//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/8564//console This message is automatically generated. refactor AsyncProcess - Key: HBASE-10277 URL: https://issues.apache.org/jira/browse/HBASE-10277 Project: HBase Issue Type: Improvement Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Attachments: HBASE-10277.01.patch, HBASE-10277.02.patch, HBASE-10277.03.patch, HBASE-10277.04.patch, HBASE-10277.patch AsyncProcess currently has two patterns of usage, one from HTable flush w/o callback and with reuse, and one from HCM/HTable batch call, with callback and w/o reuse. In the former case (but not the latter), it also does some throttling of actions on initial submit call, limiting the number of outstanding actions per server. The latter case is relatively straightforward. The former appears to be error prone due to reuse - if, as javadoc claims should be safe, multiple submit calls are performed without waiting for the async part of the previous call to finish, fields like hasError become ambiguous and can be used for the wrong call; callback for success/failure is called based on original index of an action in submitted list, but with only one callback supplied to AP in ctor it's not clear to which submit call the index belongs, if several are outstanding. I was going to add support for HBASE-10070 to AP, and found
[jira] [Commented] (HBASE-10277) refactor AsyncProcess
[ https://issues.apache.org/jira/browse/HBASE-10277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13885851#comment-13885851 ] Sergey Shelukhin commented on HBASE-10277: -- RB at https://reviews.apache.org/r/17510/ refactor AsyncProcess - Key: HBASE-10277 URL: https://issues.apache.org/jira/browse/HBASE-10277 Project: HBase Issue Type: Improvement Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Attachments: HBASE-10277.01.patch, HBASE-10277.02.patch, HBASE-10277.patch AsyncProcess currently has two patterns of usage, one from HTable flush w/o callback and with reuse, and one from HCM/HTable batch call, with callback and w/o reuse. In the former case (but not the latter), it also does some throttling of actions on initial submit call, limiting the number of outstanding actions per server. The latter case is relatively straightforward. The former appears to be error prone due to reuse - if, as javadoc claims should be safe, multiple submit calls are performed without waiting for the async part of the previous call to finish, fields like hasError become ambiguous and can be used for the wrong call; callback for success/failure is called based on original index of an action in submitted list, but with only one callback supplied to AP in ctor it's not clear to which submit call the index belongs, if several are outstanding. I was going to add support for HBASE-10070 to AP, and found that it might be difficult to do cleanly. It would be nice to normalize AP usage patterns; in particular, separate the global part (load tracking) from per-submit-call part. Per-submit part can more conveniently track stuff like initialActions, mapping of indexes and retry information, that is currently passed around the method calls. -I am not sure yet, but maybe sending of the original index to server in ClientProtos.MultiAction can also be avoided.- Cannot be avoided because the API to server doesn't have one-to-one correspondence between requests and responses in an individual call to multi (retries/rearrangement have nothing to do with it) -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HBASE-10277) refactor AsyncProcess
[ https://issues.apache.org/jira/browse/HBASE-10277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13886232#comment-13886232 ] Hadoop QA commented on HBASE-10277: --- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12626043/HBASE-10277.03.patch against trunk revision . ATTACHMENT ID: 12626043 {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 9 new or modified tests. {color:green}+1 hadoop1.0{color}. The patch compiles against the hadoop 1.0 profile. {color:green}+1 hadoop1.1{color}. The patch compiles against the hadoop 1.1 profile. {color:red}-1 javadoc{color}. The javadoc tool appears to have generated 2 warning messages. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 lineLengths{color}. The patch introduces the following lines longer than 100: + // This action failed before creating ars. Add it to retained but do not add to submit list. + Retrying. Server is + server.getServerName() + , tableName= + tableName, t); + Throwable error, long backOffTime, boolean willRetry, String startTime){ {color:green}+1 site{color}. The mvn site goal succeeds with this patch. {color:red}-1 core tests{color}. The patch failed these unit tests: org.apache.hadoop.hbase.io.encoding.TestChangingEncoding org.apache.hadoop.hbase.rest.client.TestRemoteTable org.apache.hadoop.hbase.client.TestAdmin org.apache.hadoop.hbase.regionserver.TestHRegionOnCluster org.apache.hadoop.hbase.rest.TestGetAndPutResource org.apache.hadoop.hbase.rest.TestDeleteRow org.apache.hadoop.hbase.coprocessor.TestRegionObserverInterface org.apache.hadoop.hbase.regionserver.TestSplitTransactionOnCluster {color:red}-1 core zombie tests{color}. There are 2 zombie test(s): at org.apache.hadoop.hbase.io.encoding.TestEncodedSeekers.testEncodedSeeker(TestEncodedSeekers.java:129) at org.apache.hadoop.hbase.client.TestHCM.testMulti(TestHCM.java:858) Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/8558//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8558//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8558//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8558//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8558//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8558//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8558//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8558//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8558//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-thrift.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8558//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/8558//console This message is automatically generated. refactor AsyncProcess - Key: HBASE-10277 URL: https://issues.apache.org/jira/browse/HBASE-10277 Project: HBase Issue Type: Improvement Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Attachments: HBASE-10277.01.patch, HBASE-10277.02.patch, HBASE-10277.03.patch, HBASE-10277.patch AsyncProcess currently has two patterns of usage, one from HTable flush w/o callback and with reuse, and one from HCM/HTable batch call, with callback and w/o reuse. In the former case (but not the latter), it also does some throttling of actions on initial submit call, limiting the number of outstanding
[jira] [Commented] (HBASE-10277) refactor AsyncProcess
[ https://issues.apache.org/jira/browse/HBASE-10277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13878494#comment-13878494 ] Nicolas Liochon commented on HBASE-10277: - For #3, flushCommit may be not called before close time. The write buffer manage the size of the message for the user, it allows him to stream its writes to the HTable. The issue here is really the error management., the feature itself is nice when everything works properly. I would propose an option #4: add a callback for error management. If the callback is set, we use it. If not, we raise an exception as we used to do. We could as well stream the gets/increments as the puts, and use a callback to return the result as well. This would save the creation of the Object[], and it would make the interface consistent. The code itself is already there imho. We can consider that changing the HTable semantic is for another jira btw, as you like. refactor AsyncProcess - Key: HBASE-10277 URL: https://issues.apache.org/jira/browse/HBASE-10277 Project: HBase Issue Type: Improvement Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Attachments: HBASE-10277.01.patch, HBASE-10277.patch AsyncProcess currently has two patterns of usage, one from HTable flush w/o callback and with reuse, and one from HCM/HTable batch call, with callback and w/o reuse. In the former case (but not the latter), it also does some throttling of actions on initial submit call, limiting the number of outstanding actions per server. The latter case is relatively straightforward. The former appears to be error prone due to reuse - if, as javadoc claims should be safe, multiple submit calls are performed without waiting for the async part of the previous call to finish, fields like hasError become ambiguous and can be used for the wrong call; callback for success/failure is called based on original index of an action in submitted list, but with only one callback supplied to AP in ctor it's not clear to which submit call the index belongs, if several are outstanding. I was going to add support for HBASE-10070 to AP, and found that it might be difficult to do cleanly. It would be nice to normalize AP usage patterns; in particular, separate the global part (load tracking) from per-submit-call part. Per-submit part can more conveniently track stuff like initialActions, mapping of indexes and retry information, that is currently passed around the method calls. -I am not sure yet, but maybe sending of the original index to server in ClientProtos.MultiAction can also be avoided.- Cannot be avoided because the API to server doesn't have one-to-one correspondence between requests and responses in an individual call to multi (retries/rearrangement have nothing to do with it) -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HBASE-10277) refactor AsyncProcess
[ https://issues.apache.org/jira/browse/HBASE-10277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13879329#comment-13879329 ] Sergey Shelukhin commented on HBASE-10277: -- Callback for each put call would still need to be maintained separately for each put call, and if you need results (like from get) you'd still need to have place to store them. Current patch bypasses the creation of Object[] when results are not needed, e.g. for streaming puts. Previous code does the same thing, but the Object[] for get was still created inside the callback object. One problem with current callback is that as soon as you have multiple submits the index argument to callback becomes ambiguous, so you no longer know for which action you receive the error or result. So, HCM batch code creates AP for each single submit, that way it knows the index is always from that submit when the callback populates the array; and HTable just doesn't use callback for streaming puts because it doesn't need the results... if HTable were to use current callback for error management (or streaming gets when you need results and there are multiple submit calls), it becomes a real problem. We can instead add /per-call/ callback in the context. It's a hybrid between #3 and #4; AP can avoid global error support; we can add async call with callback to HTable which would use the regular path; current streaming put can have the same semantics but maintain the contexts in HTable rather than AP. Let me think more about the latter case. refactor AsyncProcess - Key: HBASE-10277 URL: https://issues.apache.org/jira/browse/HBASE-10277 Project: HBase Issue Type: Improvement Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Attachments: HBASE-10277.01.patch, HBASE-10277.patch AsyncProcess currently has two patterns of usage, one from HTable flush w/o callback and with reuse, and one from HCM/HTable batch call, with callback and w/o reuse. In the former case (but not the latter), it also does some throttling of actions on initial submit call, limiting the number of outstanding actions per server. The latter case is relatively straightforward. The former appears to be error prone due to reuse - if, as javadoc claims should be safe, multiple submit calls are performed without waiting for the async part of the previous call to finish, fields like hasError become ambiguous and can be used for the wrong call; callback for success/failure is called based on original index of an action in submitted list, but with only one callback supplied to AP in ctor it's not clear to which submit call the index belongs, if several are outstanding. I was going to add support for HBASE-10070 to AP, and found that it might be difficult to do cleanly. It would be nice to normalize AP usage patterns; in particular, separate the global part (load tracking) from per-submit-call part. Per-submit part can more conveniently track stuff like initialActions, mapping of indexes and retry information, that is currently passed around the method calls. -I am not sure yet, but maybe sending of the original index to server in ClientProtos.MultiAction can also be avoided.- Cannot be avoided because the API to server doesn't have one-to-one correspondence between requests and responses in an individual call to multi (retries/rearrangement have nothing to do with it) -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HBASE-10277) refactor AsyncProcess
[ https://issues.apache.org/jira/browse/HBASE-10277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13877390#comment-13877390 ] Nicolas Liochon commented on HBASE-10277: - bq. Does it mean that an AsyncProcess can now be shared between Tables? bq. Yes This seems great. Would it be possible then to have a single AsyncProcess per HConnection, shared between the different htables objects? This would make Side question: would it make sense to use the multiget path for a single get, instead of having two different paths? bq. When we have a scenario to use some callback, we can add it, under YAGNI principle The scenario is already there: it's how to manage the errors with the write buffer. I didn't want to make the interface public (as once it's public you should not change it), but at the end of the day, the callback is the most obvious solution to the problem. Having it here sets a base for the discussion. If your patch allows to have a common resource management per HTable, I'm happy to lose the callbacks as a side effect of the patch, but having both would be better imho. bq. IIRC most of these paths are deprecated. What's deprecated is mainly that the batch interfaces were in HConnection instead of HTable. The Object[] is ugly, but is still the 'recommended' way. refactor AsyncProcess - Key: HBASE-10277 URL: https://issues.apache.org/jira/browse/HBASE-10277 Project: HBase Issue Type: Improvement Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Attachments: HBASE-10277.patch AsyncProcess currently has two patterns of usage, one from HTable flush w/o callback and with reuse, and one from HCM/HTable batch call, with callback and w/o reuse. In the former case (but not the latter), it also does some throttling of actions on initial submit call, limiting the number of outstanding actions per server. The latter case is relatively straightforward. The former appears to be error prone due to reuse - if, as javadoc claims should be safe, multiple submit calls are performed without waiting for the async part of the previous call to finish, fields like hasError become ambiguous and can be used for the wrong call; callback for success/failure is called based on original index of an action in submitted list, but with only one callback supplied to AP in ctor it's not clear to which submit call the index belongs, if several are outstanding. I was going to add support for HBASE-10070 to AP, and found that it might be difficult to do cleanly. It would be nice to normalize AP usage patterns; in particular, separate the global part (load tracking) from per-submit-call part. Per-submit part can more conveniently track stuff like initialActions, mapping of indexes and retry information, that is currently passed around the method calls. -I am not sure yet, but maybe sending of the original index to server in ClientProtos.MultiAction can also be avoided.- Cannot be avoided because the API to server doesn't have one-to-one correspondence between requests and responses in an individual call to multi (retries/rearrangement have nothing to do with it) -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HBASE-10277) refactor AsyncProcess
[ https://issues.apache.org/jira/browse/HBASE-10277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13878076#comment-13878076 ] Sergey Shelukhin commented on HBASE-10277: -- bq. This seems great. Would it be possible then to have a single AsyncProcess per HConnection, shared between the different htables objects? This would make Not for legacy mode, because then the cross-put behavior will also be cross-HTable. For individual requests, yeah, that can be done. Also the sentence appears to be unfinished. bq. Side question: would it make sense to use the multiget path for a single get, instead of having two different paths? Yeah, that is possible, but it is in scope of different JIRA. bq. The scenario is already there: it's how to manage the errors with the write buffer. I didn't want to make the interface public (as once it's public you should not change it), but at the end of the day, the callback is the most obvious solution to the problem. Having it here sets a base for the discussion. If your patch allows to have a common resource management per HTable, I'm happy to lose the callbacks as a side effect of the patch, but having both would be better imho. Can you elaborate on the error management? Right now the patch preserves the cross-put-errors mode for HTable, without the callback. bq. What's deprecated is mainly that the batch interfaces were in HConnection instead of HTable. The Object[] is ugly, but is still the 'recommended' way Yeah, for these paths and without the custom pool is where we reuse the same AsyncProcess. refactor AsyncProcess - Key: HBASE-10277 URL: https://issues.apache.org/jira/browse/HBASE-10277 Project: HBase Issue Type: Improvement Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Attachments: HBASE-10277.patch AsyncProcess currently has two patterns of usage, one from HTable flush w/o callback and with reuse, and one from HCM/HTable batch call, with callback and w/o reuse. In the former case (but not the latter), it also does some throttling of actions on initial submit call, limiting the number of outstanding actions per server. The latter case is relatively straightforward. The former appears to be error prone due to reuse - if, as javadoc claims should be safe, multiple submit calls are performed without waiting for the async part of the previous call to finish, fields like hasError become ambiguous and can be used for the wrong call; callback for success/failure is called based on original index of an action in submitted list, but with only one callback supplied to AP in ctor it's not clear to which submit call the index belongs, if several are outstanding. I was going to add support for HBASE-10070 to AP, and found that it might be difficult to do cleanly. It would be nice to normalize AP usage patterns; in particular, separate the global part (load tracking) from per-submit-call part. Per-submit part can more conveniently track stuff like initialActions, mapping of indexes and retry information, that is currently passed around the method calls. -I am not sure yet, but maybe sending of the original index to server in ClientProtos.MultiAction can also be avoided.- Cannot be avoided because the API to server doesn't have one-to-one correspondence between requests and responses in an individual call to multi (retries/rearrangement have nothing to do with it) -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HBASE-10277) refactor AsyncProcess
[ https://issues.apache.org/jira/browse/HBASE-10277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13878100#comment-13878100 ] stack commented on HBASE-10277: --- Pardon the dumb question: Why we need to support 'legacy' behavior? Strip it? refactor AsyncProcess - Key: HBASE-10277 URL: https://issues.apache.org/jira/browse/HBASE-10277 Project: HBase Issue Type: Improvement Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Attachments: HBASE-10277.patch AsyncProcess currently has two patterns of usage, one from HTable flush w/o callback and with reuse, and one from HCM/HTable batch call, with callback and w/o reuse. In the former case (but not the latter), it also does some throttling of actions on initial submit call, limiting the number of outstanding actions per server. The latter case is relatively straightforward. The former appears to be error prone due to reuse - if, as javadoc claims should be safe, multiple submit calls are performed without waiting for the async part of the previous call to finish, fields like hasError become ambiguous and can be used for the wrong call; callback for success/failure is called based on original index of an action in submitted list, but with only one callback supplied to AP in ctor it's not clear to which submit call the index belongs, if several are outstanding. I was going to add support for HBASE-10070 to AP, and found that it might be difficult to do cleanly. It would be nice to normalize AP usage patterns; in particular, separate the global part (load tracking) from per-submit-call part. Per-submit part can more conveniently track stuff like initialActions, mapping of indexes and retry information, that is currently passed around the method calls. -I am not sure yet, but maybe sending of the original index to server in ClientProtos.MultiAction can also be avoided.- Cannot be avoided because the API to server doesn't have one-to-one correspondence between requests and responses in an individual call to multi (retries/rearrangement have nothing to do with it) -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HBASE-10277) refactor AsyncProcess
[ https://issues.apache.org/jira/browse/HBASE-10277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13878106#comment-13878106 ] Sergey Shelukhin commented on HBASE-10277: -- 94 compat... HTable put is currently async but does not have any means to return errors. flushCommits can flush multiple puts. Errors are eventually thrown thru some put call or flushCommits. We can either break HTable::put interface (doesn't seem viable), make put-s sync and add separate async put (that is possible but may also be surprising), or remove old pattern from AP, but keep track of all the puts inside HTable itself, and aggregate all errors only when flushCommits is called, for example (with some client performance loss because multiple requests will be tracked on higher level than in AP). Overall, I can see merit in scenario where you do bunch of puts and then flush... it could be replaced with user issuing multi-puts explicitly, but now that API is such as it is, we cannot simply remove it I think. Maybe the 3rd approach above is viable, if we add some javadocs/notes. What do you think? refactor AsyncProcess - Key: HBASE-10277 URL: https://issues.apache.org/jira/browse/HBASE-10277 Project: HBase Issue Type: Improvement Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Attachments: HBASE-10277.patch AsyncProcess currently has two patterns of usage, one from HTable flush w/o callback and with reuse, and one from HCM/HTable batch call, with callback and w/o reuse. In the former case (but not the latter), it also does some throttling of actions on initial submit call, limiting the number of outstanding actions per server. The latter case is relatively straightforward. The former appears to be error prone due to reuse - if, as javadoc claims should be safe, multiple submit calls are performed without waiting for the async part of the previous call to finish, fields like hasError become ambiguous and can be used for the wrong call; callback for success/failure is called based on original index of an action in submitted list, but with only one callback supplied to AP in ctor it's not clear to which submit call the index belongs, if several are outstanding. I was going to add support for HBASE-10070 to AP, and found that it might be difficult to do cleanly. It would be nice to normalize AP usage patterns; in particular, separate the global part (load tracking) from per-submit-call part. Per-submit part can more conveniently track stuff like initialActions, mapping of indexes and retry information, that is currently passed around the method calls. -I am not sure yet, but maybe sending of the original index to server in ClientProtos.MultiAction can also be avoided.- Cannot be avoided because the API to server doesn't have one-to-one correspondence between requests and responses in an individual call to multi (retries/rearrangement have nothing to do with it) -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HBASE-10277) refactor AsyncProcess
[ https://issues.apache.org/jira/browse/HBASE-10277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13878111#comment-13878111 ] stack commented on HBASE-10277: --- So, this redo has to be backportable to 0.94? Or you mean that the 0.94 APIs must work as they did in 0.94 though you are in a 0.96+ context? Which is option #3? I do not see a #3 above. Do you mean 'remove old pattern from AP'? If so, that sounds good to me. AP is done 'right' (but you have to add hackery to handle the ugly stuff a while). Old API is deprecated and IMO, it is find if deprecated API loses perf -- it is incentive to move to the new way. refactor AsyncProcess - Key: HBASE-10277 URL: https://issues.apache.org/jira/browse/HBASE-10277 Project: HBase Issue Type: Improvement Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Attachments: HBASE-10277.patch AsyncProcess currently has two patterns of usage, one from HTable flush w/o callback and with reuse, and one from HCM/HTable batch call, with callback and w/o reuse. In the former case (but not the latter), it also does some throttling of actions on initial submit call, limiting the number of outstanding actions per server. The latter case is relatively straightforward. The former appears to be error prone due to reuse - if, as javadoc claims should be safe, multiple submit calls are performed without waiting for the async part of the previous call to finish, fields like hasError become ambiguous and can be used for the wrong call; callback for success/failure is called based on original index of an action in submitted list, but with only one callback supplied to AP in ctor it's not clear to which submit call the index belongs, if several are outstanding. I was going to add support for HBASE-10070 to AP, and found that it might be difficult to do cleanly. It would be nice to normalize AP usage patterns; in particular, separate the global part (load tracking) from per-submit-call part. Per-submit part can more conveniently track stuff like initialActions, mapping of indexes and retry information, that is currently passed around the method calls. -I am not sure yet, but maybe sending of the original index to server in ClientProtos.MultiAction can also be avoided.- Cannot be avoided because the API to server doesn't have one-to-one correspondence between requests and responses in an individual call to multi (retries/rearrangement have nothing to do with it) -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HBASE-10277) refactor AsyncProcess
[ https://issues.apache.org/jira/browse/HBASE-10277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13878137#comment-13878137 ] Sergey Shelukhin commented on HBASE-10277: -- Yeah, I mean the API behavior compat; not backporting. Options are (inserting numbers into the above) bq. (1) either break HTable::put interface (doesn't seem viable), (2) make put-s sync and add separate async put (that is possible but may also be surprising), or (3) remove old pattern from AP, but keep track of all the puts inside HTable itself, and aggregate all errors only when flushCommits is called, for example (with some client performance loss because multiple requests will be tracked on higher level than in AP). HTable::put is not deprecated though, it just has very idiosyncratic behavior compared to usual sync, async or batching APIs. refactor AsyncProcess - Key: HBASE-10277 URL: https://issues.apache.org/jira/browse/HBASE-10277 Project: HBase Issue Type: Improvement Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Attachments: HBASE-10277.patch AsyncProcess currently has two patterns of usage, one from HTable flush w/o callback and with reuse, and one from HCM/HTable batch call, with callback and w/o reuse. In the former case (but not the latter), it also does some throttling of actions on initial submit call, limiting the number of outstanding actions per server. The latter case is relatively straightforward. The former appears to be error prone due to reuse - if, as javadoc claims should be safe, multiple submit calls are performed without waiting for the async part of the previous call to finish, fields like hasError become ambiguous and can be used for the wrong call; callback for success/failure is called based on original index of an action in submitted list, but with only one callback supplied to AP in ctor it's not clear to which submit call the index belongs, if several are outstanding. I was going to add support for HBASE-10070 to AP, and found that it might be difficult to do cleanly. It would be nice to normalize AP usage patterns; in particular, separate the global part (load tracking) from per-submit-call part. Per-submit part can more conveniently track stuff like initialActions, mapping of indexes and retry information, that is currently passed around the method calls. -I am not sure yet, but maybe sending of the original index to server in ClientProtos.MultiAction can also be avoided.- Cannot be avoided because the API to server doesn't have one-to-one correspondence between requests and responses in an individual call to multi (retries/rearrangement have nothing to do with it) -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HBASE-10277) refactor AsyncProcess
[ https://issues.apache.org/jira/browse/HBASE-10277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13878276#comment-13878276 ] Hadoop QA commented on HBASE-10277: --- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12624266/HBASE-10277.01.patch against trunk revision . ATTACHMENT ID: 12624266 {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 3 new or modified tests. {color:green}+1 hadoop1.0{color}. The patch compiles against the hadoop 1.0 profile. {color:green}+1 hadoop1.1{color}. The patch compiles against the hadoop 1.1 profile. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:red}-1 findbugs{color}. The patch appears to introduce 2 new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 lineLengths{color}. The patch introduces the following lines longer than 100: + // This action failed before creating ars. Add it to retained but do not add to submit list. + Throwable error, long backOffTime, boolean willRetry, String startTime){ {color:red}-1 site{color}. The patch appears to cause mvn site goal to fail. {color:red}-1 core tests{color}. The patch failed these unit tests: org.apache.hadoop.hbase.master.TestCatalogJanitor Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/8487//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8487//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8487//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8487//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8487//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8487//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8487//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8487//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8487//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-thrift.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8487//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/8487//console This message is automatically generated. refactor AsyncProcess - Key: HBASE-10277 URL: https://issues.apache.org/jira/browse/HBASE-10277 Project: HBase Issue Type: Improvement Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Attachments: HBASE-10277.01.patch, HBASE-10277.patch AsyncProcess currently has two patterns of usage, one from HTable flush w/o callback and with reuse, and one from HCM/HTable batch call, with callback and w/o reuse. In the former case (but not the latter), it also does some throttling of actions on initial submit call, limiting the number of outstanding actions per server. The latter case is relatively straightforward. The former appears to be error prone due to reuse - if, as javadoc claims should be safe, multiple submit calls are performed without waiting for the async part of the previous call to finish, fields like hasError become ambiguous and can be used for the wrong call; callback for success/failure is called based on original index of an action in submitted list, but with only one callback supplied to AP in ctor it's not clear to which submit call the index belongs, if several are outstanding. I was going to add support for HBASE-10070 to AP, and found that it might be difficult to do cleanly. It would be nice to normalize AP usage patterns; in particular, separate the global part (load tracking) from per-submit-call part. Per-submit part can more conveniently track stuff like initialActions, mapping of indexes and retry
[jira] [Commented] (HBASE-10277) refactor AsyncProcess
[ https://issues.apache.org/jira/browse/HBASE-10277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13876520#comment-13876520 ] Nicolas Liochon commented on HBASE-10277: - It's a difficult patch to read: 1) Some changes are cosmetics: some protected becomes private, some this. are removed. I'm not against these changes, but it makes the real meat more difficult to find. 2) The javadoc has not been updated, so when the code differs from the javadoc, the reader has to sort out himself if it's just that the javadoc is now out dated or if there is a regression. I haven't yet reviewed it globally, but here is a set of questions/comments. AsyncProcess#submit. Why does it take a tableName? Does it mean that an AsyncProcess can now be shared between Tables? AsyncRequestSet#waitUntilDone Same responsibility as AsyncProcess#waitUntilDone, but less features (no logs. These logs are useful). bq. It would be nice to normalize AP usage patterns; in particular, separate the global part (load tracking) from per-submit-call part. This part should go in HConnection, as we should manage the load tracking globally and not only for a single call. Iit would be a change in bahavior compared to the 0.94, but I think we should do it. Would it make you life easier here? bq. I ran some perf test using YCSB and table with write-dropping coproc Great, really. We should do that much more often... bq. Probably this perf difference will not be noticeable on real requests (remains to be tested). Let me be more pessimistic here :-). bq. Also got rid of callback that was mostly used for tests, tests can check results without it. I'm not a big fan of this part of the change. Callbacks can be reused in different context (for example to have a different policy, such as ignoring errors as in HTableMultiplexer). As well, we now have a hardRetryLimit, but this attribute is used only in tests. More globally, This patch allows to reuse a single AsyncProcess between independant streams of writes. Would that be necessary if it was cheaper to create ? The cost is reading the configuration, as when we do a HTable#get and create a RegionServerCallable. The problem is that with this patch, we still create a AsyncProcess in some cases, for example on the batchCallback path... refactor AsyncProcess - Key: HBASE-10277 URL: https://issues.apache.org/jira/browse/HBASE-10277 Project: HBase Issue Type: Improvement Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Attachments: HBASE-10277.patch AsyncProcess currently has two patterns of usage, one from HTable flush w/o callback and with reuse, and one from HCM/HTable batch call, with callback and w/o reuse. In the former case (but not the latter), it also does some throttling of actions on initial submit call, limiting the number of outstanding actions per server. The latter case is relatively straightforward. The former appears to be error prone due to reuse - if, as javadoc claims should be safe, multiple submit calls are performed without waiting for the async part of the previous call to finish, fields like hasError become ambiguous and can be used for the wrong call; callback for success/failure is called based on original index of an action in submitted list, but with only one callback supplied to AP in ctor it's not clear to which submit call the index belongs, if several are outstanding. I was going to add support for HBASE-10070 to AP, and found that it might be difficult to do cleanly. It would be nice to normalize AP usage patterns; in particular, separate the global part (load tracking) from per-submit-call part. Per-submit part can more conveniently track stuff like initialActions, mapping of indexes and retry information, that is currently passed around the method calls. -I am not sure yet, but maybe sending of the original index to server in ClientProtos.MultiAction can also be avoided.- Cannot be avoided because the API to server doesn't have one-to-one correspondence between requests and responses in an individual call to multi (retries/rearrangement have nothing to do with it) -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HBASE-10277) refactor AsyncProcess
[ https://issues.apache.org/jira/browse/HBASE-10277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13876760#comment-13876760 ] Sergey Shelukhin commented on HBASE-10277: -- bq. Could AsyncRequests be done as a 'Future'? Seems to have a bunch in common. I'll take a look at it... it's more like a multi-future. Maybe even FutureTask can be used. bq. We have something called Set but it does not implement java.util.Set: Will fix. bq. We make an ap and a multiap each time? Once per HTable. The difference is the mode of operation - the legacy mode or the normal one. bq. Batch should return an array of objects? It does? Don't quite understand the comment. Some of the existing interfaces accept the array for results that fill it. Backward compat. bq. 1) Some changes are cosmetics: some protected becomes private, some this. are removed. I'm not against these changes, but it makes the real meat more difficult to find. Removing this. is not cosmetic (at least in the cases I am aware of) - methods moved to a non-static nested class, there's no more this.. Changing to private can be removed, although it's a good change to have. bq. 2) The javadoc has not been updated, so when the code differs from the javadoc, the reader has to sort out himself if it's just that the javadoc is now out dated or if there is a regression. Will update. bq. AsyncProcess#submit. Why does it take a tableName? Does it mean that an AsyncProcess can now be shared between Tables? Yes. bq. AsyncRequestSet#waitUntilDone bq. Same responsibility as AsyncProcess#waitUntilDone, but less features (no logs. These logs are useful). Some logs were preserved. Previous waitUntilDone had two wait conditions and loops in an effort not to loop often, whereas only one seems to be necessary, so I don't think features were lost... I can add more logs. bq. This part should go in HConnection, as we should manage the load tracking globally and not only for a single call. Iit would be a change in bahavior compared to the 0.94, but I think we should do it. Would it make you life easier here? It may, actually... but HTable uses AP directly. In fact batch calls from HCM currently don't use throttling at all (it calls submitAll), so only HTable-direct-usage uses this code. {quote} bq. Probably this perf difference will not be noticeable on real requests (remains to be tested). Let me be more pessimistic here . {quote} Yeah, I'd probably need to test. I was not able to figure out what exactly is the cause of slowdown - YourKit gives very low % numbers for AP, CPU or wall clock, and almost identical between old and equivalent new methods across runs. {quote} bq. Also got rid of callback that was mostly used for tests, tests can check results without it. I'm not a big fan of this part of the change. Callbacks can be reused in different context (for example to have a different policy, such as ignoring errors as in HTableMultiplexer). As well, we now have a hardRetryLimit, but this attribute is used only in tests. {quote} Callback was already only used for tests, for practical purposes (also for array filling, but that is no longer necessary). I am not a big fan of having test-oriented code in production code; thus I replaced the only necessary test usage (stopping retries) with a field. I wanted to get rid of that too, but that would be too convoluted it seems. When we have a scenario to use some callback, we can add it, under YAGNI principle :) bq. More globally, This patch allows to reuse a single AsyncProcess between independant streams of writes. Would that be necessary if it was cheaper to create ? The cost is reading the configuration, as when we do a HTable#get and create a RegionServerCallable. The main reason is to have well-defined context for single call for normal (as opposed to HTable cross-put stuff) usage pattern. So for example replica calls could group requests differently and synchronize/populate the same result, cancel other calls when they are no longer useful, etc. It also separates the two patterns better (by ctor argument); see potential problems outlined in JIRA description. bq. The problem is that with this patch, we still create a AsyncProcess in some cases, for example on the batchCallback path... Yeah, that is due to ability to pass a custom execution pool (could be changed to accomodate it by making pool per request, but I am not sure it's necessary...) and limitations of java generics + current result type handling (AP will have to be per result type). IIRC most of these paths are deprecated. refactor AsyncProcess - Key: HBASE-10277 URL: https://issues.apache.org/jira/browse/HBASE-10277 Project: HBase Issue Type: Improvement Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Attachments:
[jira] [Commented] (HBASE-10277) refactor AsyncProcess
[ https://issues.apache.org/jira/browse/HBASE-10277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13872639#comment-13872639 ] Nicolas Liochon commented on HBASE-10277: - It's on my plate :-). I'm going to do the review. Thanks for working on this Sergey. refactor AsyncProcess - Key: HBASE-10277 URL: https://issues.apache.org/jira/browse/HBASE-10277 Project: HBase Issue Type: Improvement Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Attachments: HBASE-10277.patch AsyncProcess currently has two patterns of usage, one from HTable flush w/o callback and with reuse, and one from HCM/HTable batch call, with callback and w/o reuse. In the former case (but not the latter), it also does some throttling of actions on initial submit call, limiting the number of outstanding actions per server. The latter case is relatively straightforward. The former appears to be error prone due to reuse - if, as javadoc claims should be safe, multiple submit calls are performed without waiting for the async part of the previous call to finish, fields like hasError become ambiguous and can be used for the wrong call; callback for success/failure is called based on original index of an action in submitted list, but with only one callback supplied to AP in ctor it's not clear to which submit call the index belongs, if several are outstanding. I was going to add support for HBASE-10070 to AP, and found that it might be difficult to do cleanly. It would be nice to normalize AP usage patterns; in particular, separate the global part (load tracking) from per-submit-call part. Per-submit part can more conveniently track stuff like initialActions, mapping of indexes and retry information, that is currently passed around the method calls. I am not sure yet, but maybe sending of the original index to server in ClientProtos.MultiAction can also be avoided. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HBASE-10277) refactor AsyncProcess
[ https://issues.apache.org/jira/browse/HBASE-10277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13871439#comment-13871439 ] stack commented on HBASE-10277: --- Thanks for having a go at this [~sershe]. bq. The (ugly) behavior for HTable where e.g. next put will give you errors from previous put was lovingly preserved. Smile. Radical! Here are a couple of comments: Could AsyncRequests be done as a 'Future'? Seems to have a bunch in common. We have something called Set but it does not implement java.util.Set: + protected class AsyncRequestSet implements AsyncRequests { We make an ap and a multiap each time? +ap = new AsyncProcessObject(connection, pool, configuration, rpcCallerFactory, true); +multiAp = new AsyncProcessObject(connection, pool, configuration, rpcCallerFactory, false); Batch should return an array of objects? +Object[] results = new Object[actions.size()]; +batch(actions, results); +return results; Then you can do return batch(actions, new Object[actions.size()]); Needs a closer review. Whats our [~nkeywal] think? refactor AsyncProcess - Key: HBASE-10277 URL: https://issues.apache.org/jira/browse/HBASE-10277 Project: HBase Issue Type: Improvement Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Attachments: HBASE-10277.patch AsyncProcess currently has two patterns of usage, one from HTable flush w/o callback and with reuse, and one from HCM/HTable batch call, with callback and w/o reuse. In the former case (but not the latter), it also does some throttling of actions on initial submit call, limiting the number of outstanding actions per server. The latter case is relatively straightforward. The former appears to be error prone due to reuse - if, as javadoc claims should be safe, multiple submit calls are performed without waiting for the async part of the previous call to finish, fields like hasError become ambiguous and can be used for the wrong call; callback for success/failure is called based on original index of an action in submitted list, but with only one callback supplied to AP in ctor it's not clear to which submit call the index belongs, if several are outstanding. I was going to add support for HBASE-10070 to AP, and found that it might be difficult to do cleanly. It would be nice to normalize AP usage patterns; in particular, separate the global part (load tracking) from per-submit-call part. Per-submit part can more conveniently track stuff like initialActions, mapping of indexes and retry information, that is currently passed around the method calls. I am not sure yet, but maybe sending of the original index to server in ClientProtos.MultiAction can also be avoided. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HBASE-10277) refactor AsyncProcess
[ https://issues.apache.org/jira/browse/HBASE-10277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13869762#comment-13869762 ] Sergey Shelukhin commented on HBASE-10277: -- I started the mailing thread on this. refactor AsyncProcess - Key: HBASE-10277 URL: https://issues.apache.org/jira/browse/HBASE-10277 Project: HBase Issue Type: Improvement Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Attachments: HBASE-10277.patch AsyncProcess currently has two patterns of usage, one from HTable flush w/o callback and with reuse, and one from HCM/HTable batch call, with callback and w/o reuse. In the former case (but not the latter), it also does some throttling of actions on initial submit call, limiting the number of outstanding actions per server. The latter case is relatively straightforward. The former appears to be error prone due to reuse - if, as javadoc claims should be safe, multiple submit calls are performed without waiting for the async part of the previous call to finish, fields like hasError become ambiguous and can be used for the wrong call; callback for success/failure is called based on original index of an action in submitted list, but with only one callback supplied to AP in ctor it's not clear to which submit call the index belongs, if several are outstanding. I was going to add support for HBASE-10070 to AP, and found that it might be difficult to do cleanly. It would be nice to normalize AP usage patterns; in particular, separate the global part (load tracking) from per-submit-call part. Per-submit part can more conveniently track stuff like initialActions, mapping of indexes and retry information, that is currently passed around the method calls. I am not sure yet, but maybe sending of the original index to server in ClientProtos.MultiAction can also be avoided. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HBASE-10277) refactor AsyncProcess
[ https://issues.apache.org/jira/browse/HBASE-10277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13868855#comment-13868855 ] Andrew Purtell commented on HBASE-10277: bq. The (ugly) behavior for HTable where e.g. next put will give you errors from previous put was lovingly preserved. To my mind this is a serious problem because the application is getting incorrect feedback on what operation actually failed. refactor AsyncProcess - Key: HBASE-10277 URL: https://issues.apache.org/jira/browse/HBASE-10277 Project: HBase Issue Type: Improvement Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Attachments: HBASE-10277.patch AsyncProcess currently has two patterns of usage, one from HTable flush w/o callback and with reuse, and one from HCM/HTable batch call, with callback and w/o reuse. In the former case (but not the latter), it also does some throttling of actions on initial submit call, limiting the number of outstanding actions per server. The latter case is relatively straightforward. The former appears to be error prone due to reuse - if, as javadoc claims should be safe, multiple submit calls are performed without waiting for the async part of the previous call to finish, fields like hasError become ambiguous and can be used for the wrong call; callback for success/failure is called based on original index of an action in submitted list, but with only one callback supplied to AP in ctor it's not clear to which submit call the index belongs, if several are outstanding. I was going to add support for HBASE-10070 to AP, and found that it might be difficult to do cleanly. It would be nice to normalize AP usage patterns; in particular, separate the global part (load tracking) from per-submit-call part. Per-submit part can more conveniently track stuff like initialActions, mapping of indexes and retry information, that is currently passed around the method calls. I am not sure yet, but maybe sending of the original index to server in ClientProtos.MultiAction can also be avoided. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HBASE-10277) refactor AsyncProcess
[ https://issues.apache.org/jira/browse/HBASE-10277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13867493#comment-13867493 ] Enis Soztutar commented on HBASE-10277: --- [~nkeywal] FYI. refactor AsyncProcess - Key: HBASE-10277 URL: https://issues.apache.org/jira/browse/HBASE-10277 Project: HBase Issue Type: Improvement Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Attachments: HBASE-10277.patch AsyncProcess currently has two patterns of usage, one from HTable flush w/o callback and with reuse, and one from HCM/HTable batch call, with callback and w/o reuse. In the former case (but not the latter), it also does some throttling of actions on initial submit call, limiting the number of outstanding actions per server. The latter case is relatively straightforward. The former appears to be error prone due to reuse - if, as javadoc claims should be safe, multiple submit calls are performed without waiting for the async part of the previous call to finish, fields like hasError become ambiguous and can be used for the wrong call; callback for success/failure is called based on original index of an action in submitted list, but with only one callback supplied to AP in ctor it's not clear to which submit call the index belongs, if several are outstanding. I was going to add support for HBASE-10070 to AP, and found that it might be difficult to do cleanly. It would be nice to normalize AP usage patterns; in particular, separate the global part (load tracking) from per-submit-call part. Per-submit part can more conveniently track stuff like initialActions, mapping of indexes and retry information, that is currently passed around the method calls. I am not sure yet, but maybe sending of the original index to server in ClientProtos.MultiAction can also be avoided. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HBASE-10277) refactor AsyncProcess
[ https://issues.apache.org/jira/browse/HBASE-10277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13867511#comment-13867511 ] Hadoop QA commented on HBASE-10277: --- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12622325/HBASE-10277.patch against trunk revision . ATTACHMENT ID: 12622325 {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 3 new or modified tests. {color:green}+1 hadoop1.0{color}. The patch compiles against the hadoop 1.0 profile. {color:green}+1 hadoop1.1{color}. The patch compiles against the hadoop 1.1 profile. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 lineLengths{color}. The patch introduces the following lines longer than 100: + // This action failed before creating ars. Add it to retained but do not add to submit list. + Throwable error, long backOffTime, boolean willRetry, String startTime){ {color:red}-1 site{color}. The patch appears to cause mvn site goal to fail. {color:red}-1 core tests{color}. The patch failed these unit tests: org.apache.hadoop.hbase.master.TestCatalogJanitor Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/8382//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8382//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8382//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-thrift.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8382//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8382//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8382//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8382//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8382//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8382//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/8382//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/8382//console This message is automatically generated. refactor AsyncProcess - Key: HBASE-10277 URL: https://issues.apache.org/jira/browse/HBASE-10277 Project: HBase Issue Type: Improvement Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Attachments: HBASE-10277.patch AsyncProcess currently has two patterns of usage, one from HTable flush w/o callback and with reuse, and one from HCM/HTable batch call, with callback and w/o reuse. In the former case (but not the latter), it also does some throttling of actions on initial submit call, limiting the number of outstanding actions per server. The latter case is relatively straightforward. The former appears to be error prone due to reuse - if, as javadoc claims should be safe, multiple submit calls are performed without waiting for the async part of the previous call to finish, fields like hasError become ambiguous and can be used for the wrong call; callback for success/failure is called based on original index of an action in submitted list, but with only one callback supplied to AP in ctor it's not clear to which submit call the index belongs, if several are outstanding. I was going to add support for HBASE-10070 to AP, and found that it might be difficult to do cleanly. It would be nice to normalize AP usage patterns; in particular, separate the global part (load tracking) from per-submit-call part. Per-submit part can more conveniently track stuff like initialActions, mapping of indexes and retry information, that is currently