[jira] [Commented] (HBASE-10277) refactor AsyncProcess

2014-02-06 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13893199#comment-13893199
 ] 

Hudson commented on HBASE-10277:


FAILURE: Integrated in HBase-TRUNK-on-Hadoop-1.1 #81 (See 
[https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-1.1/81/])
HBASE-10277 refactor AsyncProcess ADDENDUM fix javadoc warnings (sershe: rev 
1565031)
* 
/hbase/trunk/hbase-client/src/main/java/org/apache/hadoop/hbase/client/MultiResponse.java


 refactor AsyncProcess
 -

 Key: HBASE-10277
 URL: https://issues.apache.org/jira/browse/HBASE-10277
 Project: HBase
  Issue Type: Improvement
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
 Fix For: 0.99.0

 Attachments: HBASE-10277-addendum.patch, HBASE-10277.01.patch, 
 HBASE-10277.02.patch, HBASE-10277.03.patch, HBASE-10277.04.patch, 
 HBASE-10277.05.patch, HBASE-10277.06.patch, HBASE-10277.07.patch, 
 HBASE-10277.patch


 AsyncProcess currently has two patterns of usage, one from HTable flush w/o 
 callback and with reuse, and one from HCM/HTable batch call, with callback 
 and w/o reuse. In the former case (but not the latter), it also does some 
 throttling of actions on initial submit call, limiting the number of 
 outstanding actions per server.
 The latter case is relatively straightforward. The former appears to be error 
 prone due to reuse - if, as javadoc claims should be safe, multiple submit 
 calls are performed without waiting for the async part of the previous call 
 to finish, fields like hasError become ambiguous and can be used for the 
 wrong call; callback for success/failure is called based on original index 
 of an action in submitted list, but with only one callback supplied to AP in 
 ctor it's not clear to which submit call the index belongs, if several are 
 outstanding.
 I was going to add support for HBASE-10070 to AP, and found that it might be 
 difficult to do cleanly.
 It would be nice to normalize AP usage patterns; in particular, separate the 
 global part (load tracking) from per-submit-call part.
 Per-submit part can more conveniently track stuff like initialActions, 
 mapping of indexes and retry information, that is currently passed around the 
 method calls.
 -I am not sure yet, but maybe sending of the original index to server in 
 ClientProtos.MultiAction can also be avoided.- Cannot be avoided because 
 the API to server doesn't have one-to-one correspondence between requests and 
 responses in an individual call to multi (retries/rearrangement have nothing 
 to do with it)



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-10277) refactor AsyncProcess

2014-02-05 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13892662#comment-13892662
 ] 

Hudson commented on HBASE-10277:


FAILURE: Integrated in HBase-TRUNK #4889 (See 
[https://builds.apache.org/job/HBase-TRUNK/4889/])
HBASE-10277 refactor AsyncProcess (sershe: rev 1564832)
* 
/hbase/trunk/hbase-client/src/main/java/org/apache/hadoop/hbase/client/AsyncProcess.java
* 
/hbase/trunk/hbase-client/src/main/java/org/apache/hadoop/hbase/client/HConnection.java
* 
/hbase/trunk/hbase-client/src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java
* 
/hbase/trunk/hbase-client/src/main/java/org/apache/hadoop/hbase/client/HTable.java
* 
/hbase/trunk/hbase-client/src/main/java/org/apache/hadoop/hbase/client/MultiResponse.java
* 
/hbase/trunk/hbase-client/src/main/java/org/apache/hadoop/hbase/client/RetriesExhaustedWithDetailsException.java
* 
/hbase/trunk/hbase-client/src/main/java/org/apache/hadoop/hbase/protobuf/ResponseConverter.java
* 
/hbase/trunk/hbase-client/src/test/java/org/apache/hadoop/hbase/client/TestAsyncProcess.java
* 
/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/client/CoprocessorHConnection.java
* 
/hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/client/HConnectionTestingUtility.java
* 
/hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/master/TestCatalogJanitor.java


 refactor AsyncProcess
 -

 Key: HBASE-10277
 URL: https://issues.apache.org/jira/browse/HBASE-10277
 Project: HBase
  Issue Type: Improvement
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
 Attachments: HBASE-10277.01.patch, HBASE-10277.02.patch, 
 HBASE-10277.03.patch, HBASE-10277.04.patch, HBASE-10277.05.patch, 
 HBASE-10277.06.patch, HBASE-10277.07.patch, HBASE-10277.patch


 AsyncProcess currently has two patterns of usage, one from HTable flush w/o 
 callback and with reuse, and one from HCM/HTable batch call, with callback 
 and w/o reuse. In the former case (but not the latter), it also does some 
 throttling of actions on initial submit call, limiting the number of 
 outstanding actions per server.
 The latter case is relatively straightforward. The former appears to be error 
 prone due to reuse - if, as javadoc claims should be safe, multiple submit 
 calls are performed without waiting for the async part of the previous call 
 to finish, fields like hasError become ambiguous and can be used for the 
 wrong call; callback for success/failure is called based on original index 
 of an action in submitted list, but with only one callback supplied to AP in 
 ctor it's not clear to which submit call the index belongs, if several are 
 outstanding.
 I was going to add support for HBASE-10070 to AP, and found that it might be 
 difficult to do cleanly.
 It would be nice to normalize AP usage patterns; in particular, separate the 
 global part (load tracking) from per-submit-call part.
 Per-submit part can more conveniently track stuff like initialActions, 
 mapping of indexes and retry information, that is currently passed around the 
 method calls.
 -I am not sure yet, but maybe sending of the original index to server in 
 ClientProtos.MultiAction can also be avoided.- Cannot be avoided because 
 the API to server doesn't have one-to-one correspondence between requests and 
 responses in an individual call to multi (retries/rearrangement have nothing 
 to do with it)



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-10277) refactor AsyncProcess

2014-02-05 Thread Enis Soztutar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13892769#comment-13892769
 ] 

Enis Soztutar commented on HBASE-10277:
---

Sergey did you commit this? An update would have been nice. Also there seems to 
be javadoc warnings. Can you do an addendum patch for fixing those. 

 refactor AsyncProcess
 -

 Key: HBASE-10277
 URL: https://issues.apache.org/jira/browse/HBASE-10277
 Project: HBase
  Issue Type: Improvement
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
 Fix For: 0.99.0

 Attachments: HBASE-10277.01.patch, HBASE-10277.02.patch, 
 HBASE-10277.03.patch, HBASE-10277.04.patch, HBASE-10277.05.patch, 
 HBASE-10277.06.patch, HBASE-10277.07.patch, HBASE-10277.patch


 AsyncProcess currently has two patterns of usage, one from HTable flush w/o 
 callback and with reuse, and one from HCM/HTable batch call, with callback 
 and w/o reuse. In the former case (but not the latter), it also does some 
 throttling of actions on initial submit call, limiting the number of 
 outstanding actions per server.
 The latter case is relatively straightforward. The former appears to be error 
 prone due to reuse - if, as javadoc claims should be safe, multiple submit 
 calls are performed without waiting for the async part of the previous call 
 to finish, fields like hasError become ambiguous and can be used for the 
 wrong call; callback for success/failure is called based on original index 
 of an action in submitted list, but with only one callback supplied to AP in 
 ctor it's not clear to which submit call the index belongs, if several are 
 outstanding.
 I was going to add support for HBASE-10070 to AP, and found that it might be 
 difficult to do cleanly.
 It would be nice to normalize AP usage patterns; in particular, separate the 
 global part (load tracking) from per-submit-call part.
 Per-submit part can more conveniently track stuff like initialActions, 
 mapping of indexes and retry information, that is currently passed around the 
 method calls.
 -I am not sure yet, but maybe sending of the original index to server in 
 ClientProtos.MultiAction can also be avoided.- Cannot be avoided because 
 the API to server doesn't have one-to-one correspondence between requests and 
 responses in an individual call to multi (retries/rearrangement have nothing 
 to do with it)



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-10277) refactor AsyncProcess

2014-02-05 Thread Enis Soztutar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13892770#comment-13892770
 ] 

Enis Soztutar commented on HBASE-10277:
---

Can you also resolve the issue. 

 refactor AsyncProcess
 -

 Key: HBASE-10277
 URL: https://issues.apache.org/jira/browse/HBASE-10277
 Project: HBase
  Issue Type: Improvement
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
 Fix For: 0.99.0

 Attachments: HBASE-10277.01.patch, HBASE-10277.02.patch, 
 HBASE-10277.03.patch, HBASE-10277.04.patch, HBASE-10277.05.patch, 
 HBASE-10277.06.patch, HBASE-10277.07.patch, HBASE-10277.patch


 AsyncProcess currently has two patterns of usage, one from HTable flush w/o 
 callback and with reuse, and one from HCM/HTable batch call, with callback 
 and w/o reuse. In the former case (but not the latter), it also does some 
 throttling of actions on initial submit call, limiting the number of 
 outstanding actions per server.
 The latter case is relatively straightforward. The former appears to be error 
 prone due to reuse - if, as javadoc claims should be safe, multiple submit 
 calls are performed without waiting for the async part of the previous call 
 to finish, fields like hasError become ambiguous and can be used for the 
 wrong call; callback for success/failure is called based on original index 
 of an action in submitted list, but with only one callback supplied to AP in 
 ctor it's not clear to which submit call the index belongs, if several are 
 outstanding.
 I was going to add support for HBASE-10070 to AP, and found that it might be 
 difficult to do cleanly.
 It would be nice to normalize AP usage patterns; in particular, separate the 
 global part (load tracking) from per-submit-call part.
 Per-submit part can more conveniently track stuff like initialActions, 
 mapping of indexes and retry information, that is currently passed around the 
 method calls.
 -I am not sure yet, but maybe sending of the original index to server in 
 ClientProtos.MultiAction can also be avoided.- Cannot be avoided because 
 the API to server doesn't have one-to-one correspondence between requests and 
 responses in an individual call to multi (retries/rearrangement have nothing 
 to do with it)



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-10277) refactor AsyncProcess

2014-02-05 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13892829#comment-13892829
 ] 

Hudson commented on HBASE-10277:


FAILURE: Integrated in HBase-TRUNK-on-Hadoop-1.1 #80 (See 
[https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-1.1/80/])
HBASE-10277 refactor AsyncProcess (sershe: rev 1564832)
* 
/hbase/trunk/hbase-client/src/main/java/org/apache/hadoop/hbase/client/AsyncProcess.java
* 
/hbase/trunk/hbase-client/src/main/java/org/apache/hadoop/hbase/client/HConnection.java
* 
/hbase/trunk/hbase-client/src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java
* 
/hbase/trunk/hbase-client/src/main/java/org/apache/hadoop/hbase/client/HTable.java
* 
/hbase/trunk/hbase-client/src/main/java/org/apache/hadoop/hbase/client/MultiResponse.java
* 
/hbase/trunk/hbase-client/src/main/java/org/apache/hadoop/hbase/client/RetriesExhaustedWithDetailsException.java
* 
/hbase/trunk/hbase-client/src/main/java/org/apache/hadoop/hbase/protobuf/ResponseConverter.java
* 
/hbase/trunk/hbase-client/src/test/java/org/apache/hadoop/hbase/client/TestAsyncProcess.java
* 
/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/client/CoprocessorHConnection.java
* 
/hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/client/HConnectionTestingUtility.java
* 
/hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/master/TestCatalogJanitor.java


 refactor AsyncProcess
 -

 Key: HBASE-10277
 URL: https://issues.apache.org/jira/browse/HBASE-10277
 Project: HBase
  Issue Type: Improvement
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
 Fix For: 0.99.0

 Attachments: HBASE-10277.01.patch, HBASE-10277.02.patch, 
 HBASE-10277.03.patch, HBASE-10277.04.patch, HBASE-10277.05.patch, 
 HBASE-10277.06.patch, HBASE-10277.07.patch, HBASE-10277.patch


 AsyncProcess currently has two patterns of usage, one from HTable flush w/o 
 callback and with reuse, and one from HCM/HTable batch call, with callback 
 and w/o reuse. In the former case (but not the latter), it also does some 
 throttling of actions on initial submit call, limiting the number of 
 outstanding actions per server.
 The latter case is relatively straightforward. The former appears to be error 
 prone due to reuse - if, as javadoc claims should be safe, multiple submit 
 calls are performed without waiting for the async part of the previous call 
 to finish, fields like hasError become ambiguous and can be used for the 
 wrong call; callback for success/failure is called based on original index 
 of an action in submitted list, but with only one callback supplied to AP in 
 ctor it's not clear to which submit call the index belongs, if several are 
 outstanding.
 I was going to add support for HBASE-10070 to AP, and found that it might be 
 difficult to do cleanly.
 It would be nice to normalize AP usage patterns; in particular, separate the 
 global part (load tracking) from per-submit-call part.
 Per-submit part can more conveniently track stuff like initialActions, 
 mapping of indexes and retry information, that is currently passed around the 
 method calls.
 -I am not sure yet, but maybe sending of the original index to server in 
 ClientProtos.MultiAction can also be avoided.- Cannot be avoided because 
 the API to server doesn't have one-to-one correspondence between requests and 
 responses in an individual call to multi (retries/rearrangement have nothing 
 to do with it)



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-10277) refactor AsyncProcess

2014-02-05 Thread Enis Soztutar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13892876#comment-13892876
 ] 

Enis Soztutar commented on HBASE-10277:
---

+1 on addendum. Thanks Sergey! 

 refactor AsyncProcess
 -

 Key: HBASE-10277
 URL: https://issues.apache.org/jira/browse/HBASE-10277
 Project: HBase
  Issue Type: Improvement
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
 Fix For: 0.99.0

 Attachments: HBASE-10277-addendum.patch, HBASE-10277.01.patch, 
 HBASE-10277.02.patch, HBASE-10277.03.patch, HBASE-10277.04.patch, 
 HBASE-10277.05.patch, HBASE-10277.06.patch, HBASE-10277.07.patch, 
 HBASE-10277.patch


 AsyncProcess currently has two patterns of usage, one from HTable flush w/o 
 callback and with reuse, and one from HCM/HTable batch call, with callback 
 and w/o reuse. In the former case (but not the latter), it also does some 
 throttling of actions on initial submit call, limiting the number of 
 outstanding actions per server.
 The latter case is relatively straightforward. The former appears to be error 
 prone due to reuse - if, as javadoc claims should be safe, multiple submit 
 calls are performed without waiting for the async part of the previous call 
 to finish, fields like hasError become ambiguous and can be used for the 
 wrong call; callback for success/failure is called based on original index 
 of an action in submitted list, but with only one callback supplied to AP in 
 ctor it's not clear to which submit call the index belongs, if several are 
 outstanding.
 I was going to add support for HBASE-10070 to AP, and found that it might be 
 difficult to do cleanly.
 It would be nice to normalize AP usage patterns; in particular, separate the 
 global part (load tracking) from per-submit-call part.
 Per-submit part can more conveniently track stuff like initialActions, 
 mapping of indexes and retry information, that is currently passed around the 
 method calls.
 -I am not sure yet, but maybe sending of the original index to server in 
 ClientProtos.MultiAction can also be avoided.- Cannot be avoided because 
 the API to server doesn't have one-to-one correspondence between requests and 
 responses in an individual call to multi (retries/rearrangement have nothing 
 to do with it)



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-10277) refactor AsyncProcess

2014-02-05 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13893110#comment-13893110
 ] 

Hudson commented on HBASE-10277:


FAILURE: Integrated in HBase-TRUNK #4893 (See 
[https://builds.apache.org/job/HBase-TRUNK/4893/])
HBASE-10277 refactor AsyncProcess ADDENDUM fix javadoc warnings (sershe: rev 
1565031)
* 
/hbase/trunk/hbase-client/src/main/java/org/apache/hadoop/hbase/client/MultiResponse.java


 refactor AsyncProcess
 -

 Key: HBASE-10277
 URL: https://issues.apache.org/jira/browse/HBASE-10277
 Project: HBase
  Issue Type: Improvement
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
 Fix For: 0.99.0

 Attachments: HBASE-10277-addendum.patch, HBASE-10277.01.patch, 
 HBASE-10277.02.patch, HBASE-10277.03.patch, HBASE-10277.04.patch, 
 HBASE-10277.05.patch, HBASE-10277.06.patch, HBASE-10277.07.patch, 
 HBASE-10277.patch


 AsyncProcess currently has two patterns of usage, one from HTable flush w/o 
 callback and with reuse, and one from HCM/HTable batch call, with callback 
 and w/o reuse. In the former case (but not the latter), it also does some 
 throttling of actions on initial submit call, limiting the number of 
 outstanding actions per server.
 The latter case is relatively straightforward. The former appears to be error 
 prone due to reuse - if, as javadoc claims should be safe, multiple submit 
 calls are performed without waiting for the async part of the previous call 
 to finish, fields like hasError become ambiguous and can be used for the 
 wrong call; callback for success/failure is called based on original index 
 of an action in submitted list, but with only one callback supplied to AP in 
 ctor it's not clear to which submit call the index belongs, if several are 
 outstanding.
 I was going to add support for HBASE-10070 to AP, and found that it might be 
 difficult to do cleanly.
 It would be nice to normalize AP usage patterns; in particular, separate the 
 global part (load tracking) from per-submit-call part.
 Per-submit part can more conveniently track stuff like initialActions, 
 mapping of indexes and retry information, that is currently passed around the 
 method calls.
 -I am not sure yet, but maybe sending of the original index to server in 
 ClientProtos.MultiAction can also be avoided.- Cannot be avoided because 
 the API to server doesn't have one-to-one correspondence between requests and 
 responses in an individual call to multi (retries/rearrangement have nothing 
 to do with it)



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-10277) refactor AsyncProcess

2014-02-04 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13891116#comment-13891116
 ] 

Hadoop QA commented on HBASE-10277:
---

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12626923/HBASE-10277.05.patch
  against trunk revision .
  ATTACHMENT ID: 12626923

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 9 new 
or modified tests.

{color:green}+1 hadoop1.0{color}.  The patch compiles against the hadoop 
1.0 profile.

{color:green}+1 hadoop1.1{color}.  The patch compiles against the hadoop 
1.1 profile.

{color:red}-1 javadoc{color}.  The javadoc tool appears to have generated 2 
warning messages.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 lineLengths{color}.  The patch introduces the following lines 
longer than 100:
+  // This action failed before creating ars. Add it to retained 
but do not add to submit list.
+ Retrying. Server is  + server.getServerName() + , 
tableName= + tableName, t);
+ Throwable error, long backOffTime, boolean 
willRetry, String startTime){

  {color:green}+1 site{color}.  The mvn site goal succeeds with this patch.

{color:green}+1 core tests{color}.  The patch passed unit tests in .

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8591//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8591//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8591//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8591//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8591//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8591//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8591//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8591//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8591//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-thrift.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8591//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8591//console

This message is automatically generated.

 refactor AsyncProcess
 -

 Key: HBASE-10277
 URL: https://issues.apache.org/jira/browse/HBASE-10277
 Project: HBase
  Issue Type: Improvement
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
 Attachments: HBASE-10277.01.patch, HBASE-10277.02.patch, 
 HBASE-10277.03.patch, HBASE-10277.04.patch, HBASE-10277.05.patch, 
 HBASE-10277.patch


 AsyncProcess currently has two patterns of usage, one from HTable flush w/o 
 callback and with reuse, and one from HCM/HTable batch call, with callback 
 and w/o reuse. In the former case (but not the latter), it also does some 
 throttling of actions on initial submit call, limiting the number of 
 outstanding actions per server.
 The latter case is relatively straightforward. The former appears to be error 
 prone due to reuse - if, as javadoc claims should be safe, multiple submit 
 calls are performed without waiting for the async part of the previous call 
 to finish, fields like hasError become ambiguous and can be used for the 
 wrong call; callback for success/failure is called based on original index 
 of an action in submitted list, but with only one callback supplied to AP in 
 ctor it's not clear to which submit call the index belongs, if several are 
 outstanding.
 I was going to add support for HBASE-10070 to AP, and found that it might be 
 difficult to do cleanly.
 It would be nice to normalize AP usage patterns; in particular, separate the 
 global part (load tracking) from per-submit-call 

[jira] [Commented] (HBASE-10277) refactor AsyncProcess

2014-02-04 Thread Nicolas Liochon (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13891152#comment-13891152
 ] 

Nicolas Liochon commented on HBASE-10277:
-

I published the review. My comments are cosmetic mainly, we discussed all the 
main points previously. I'm +1 (with my comments taken into account obviously 
;- ). Thanks, Sergey.

 refactor AsyncProcess
 -

 Key: HBASE-10277
 URL: https://issues.apache.org/jira/browse/HBASE-10277
 Project: HBase
  Issue Type: Improvement
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
 Attachments: HBASE-10277.01.patch, HBASE-10277.02.patch, 
 HBASE-10277.03.patch, HBASE-10277.04.patch, HBASE-10277.05.patch, 
 HBASE-10277.patch


 AsyncProcess currently has two patterns of usage, one from HTable flush w/o 
 callback and with reuse, and one from HCM/HTable batch call, with callback 
 and w/o reuse. In the former case (but not the latter), it also does some 
 throttling of actions on initial submit call, limiting the number of 
 outstanding actions per server.
 The latter case is relatively straightforward. The former appears to be error 
 prone due to reuse - if, as javadoc claims should be safe, multiple submit 
 calls are performed without waiting for the async part of the previous call 
 to finish, fields like hasError become ambiguous and can be used for the 
 wrong call; callback for success/failure is called based on original index 
 of an action in submitted list, but with only one callback supplied to AP in 
 ctor it's not clear to which submit call the index belongs, if several are 
 outstanding.
 I was going to add support for HBASE-10070 to AP, and found that it might be 
 difficult to do cleanly.
 It would be nice to normalize AP usage patterns; in particular, separate the 
 global part (load tracking) from per-submit-call part.
 Per-submit part can more conveniently track stuff like initialActions, 
 mapping of indexes and retry information, that is currently passed around the 
 method calls.
 -I am not sure yet, but maybe sending of the original index to server in 
 ClientProtos.MultiAction can also be avoided.- Cannot be avoided because 
 the API to server doesn't have one-to-one correspondence between requests and 
 responses in an individual call to multi (retries/rearrangement have nothing 
 to do with it)



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-10277) refactor AsyncProcess

2014-02-04 Thread Enis Soztutar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13891422#comment-13891422
 ] 

Enis Soztutar commented on HBASE-10277:
---

bq. Wrt moving error management for streaming mode out of AsyncProcess, for 
example into HTable, I think it might hurt performance because HTable will have 
to manage all these objects. Let me try to think if this can be avoided.
Ok, we can address that as a follow up if there is benefit in abstracting that 
away. 

One small question in RB, other than that I'm +1 as well. 

 refactor AsyncProcess
 -

 Key: HBASE-10277
 URL: https://issues.apache.org/jira/browse/HBASE-10277
 Project: HBase
  Issue Type: Improvement
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
 Attachments: HBASE-10277.01.patch, HBASE-10277.02.patch, 
 HBASE-10277.03.patch, HBASE-10277.04.patch, HBASE-10277.05.patch, 
 HBASE-10277.06.patch, HBASE-10277.patch


 AsyncProcess currently has two patterns of usage, one from HTable flush w/o 
 callback and with reuse, and one from HCM/HTable batch call, with callback 
 and w/o reuse. In the former case (but not the latter), it also does some 
 throttling of actions on initial submit call, limiting the number of 
 outstanding actions per server.
 The latter case is relatively straightforward. The former appears to be error 
 prone due to reuse - if, as javadoc claims should be safe, multiple submit 
 calls are performed without waiting for the async part of the previous call 
 to finish, fields like hasError become ambiguous and can be used for the 
 wrong call; callback for success/failure is called based on original index 
 of an action in submitted list, but with only one callback supplied to AP in 
 ctor it's not clear to which submit call the index belongs, if several are 
 outstanding.
 I was going to add support for HBASE-10070 to AP, and found that it might be 
 difficult to do cleanly.
 It would be nice to normalize AP usage patterns; in particular, separate the 
 global part (load tracking) from per-submit-call part.
 Per-submit part can more conveniently track stuff like initialActions, 
 mapping of indexes and retry information, that is currently passed around the 
 method calls.
 -I am not sure yet, but maybe sending of the original index to server in 
 ClientProtos.MultiAction can also be avoided.- Cannot be avoided because 
 the API to server doesn't have one-to-one correspondence between requests and 
 responses in an individual call to multi (retries/rearrangement have nothing 
 to do with it)



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-10277) refactor AsyncProcess

2014-02-04 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13891455#comment-13891455
 ] 

Sergey Shelukhin commented on HBASE-10277:
--

I will file a follow-up JIRA. Responded on RB, but it now gives me errors, not 
sure if it got posted. The answer is yes.

 refactor AsyncProcess
 -

 Key: HBASE-10277
 URL: https://issues.apache.org/jira/browse/HBASE-10277
 Project: HBase
  Issue Type: Improvement
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
 Attachments: HBASE-10277.01.patch, HBASE-10277.02.patch, 
 HBASE-10277.03.patch, HBASE-10277.04.patch, HBASE-10277.05.patch, 
 HBASE-10277.06.patch, HBASE-10277.patch


 AsyncProcess currently has two patterns of usage, one from HTable flush w/o 
 callback and with reuse, and one from HCM/HTable batch call, with callback 
 and w/o reuse. In the former case (but not the latter), it also does some 
 throttling of actions on initial submit call, limiting the number of 
 outstanding actions per server.
 The latter case is relatively straightforward. The former appears to be error 
 prone due to reuse - if, as javadoc claims should be safe, multiple submit 
 calls are performed without waiting for the async part of the previous call 
 to finish, fields like hasError become ambiguous and can be used for the 
 wrong call; callback for success/failure is called based on original index 
 of an action in submitted list, but with only one callback supplied to AP in 
 ctor it's not clear to which submit call the index belongs, if several are 
 outstanding.
 I was going to add support for HBASE-10070 to AP, and found that it might be 
 difficult to do cleanly.
 It would be nice to normalize AP usage patterns; in particular, separate the 
 global part (load tracking) from per-submit-call part.
 Per-submit part can more conveniently track stuff like initialActions, 
 mapping of indexes and retry information, that is currently passed around the 
 method calls.
 -I am not sure yet, but maybe sending of the original index to server in 
 ClientProtos.MultiAction can also be avoided.- Cannot be avoided because 
 the API to server doesn't have one-to-one correspondence between requests and 
 responses in an individual call to multi (retries/rearrangement have nothing 
 to do with it)



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-10277) refactor AsyncProcess

2014-02-04 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13891572#comment-13891572
 ] 

Hadoop QA commented on HBASE-10277:
---

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12627021/HBASE-10277.06.patch
  against trunk revision .
  ATTACHMENT ID: 12627021

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 9 new 
or modified tests.

{color:green}+1 hadoop1.0{color}.  The patch compiles against the hadoop 
1.0 profile.

{color:green}+1 hadoop1.1{color}.  The patch compiles against the hadoop 
1.1 profile.

{color:red}-1 javadoc{color}.  The javadoc tool appears to have generated 3 
warning messages.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 lineLengths{color}.  The patch introduces the following lines 
longer than 100:
+   * See {@link #submit(ExecutorService, TableName, List, boolean, 
org.apache.hadoop.hbase.client.coprocessor.Batch.Callback, boolean)}.
+  boolean atLeastOne, Batch.CallbackCResult callback, boolean 
needResults) throws InterruptedIOException {
+  // This action failed before creating ars. Add it to retained but do 
not add to submit list.
+   * See {@link #submitAll(ExecutorService, TableName, List, 
org.apache.hadoop.hbase.client.coprocessor.Batch.Callback, Object[])}.
+ Retrying. Server is  + server.getServerName() + , 
tableName= + tableName, t);
+ Throwable error, long backOffTime, boolean 
willRetry, String startTime){

  {color:green}+1 site{color}.  The mvn site goal succeeds with this patch.

 {color:red}-1 core tests{color}.  The patch failed these unit tests:
   org.apache.hadoop.hbase.master.TestCatalogJanitor

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8598//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8598//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8598//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8598//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8598//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8598//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8598//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8598//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8598//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-thrift.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8598//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8598//console

This message is automatically generated.

 refactor AsyncProcess
 -

 Key: HBASE-10277
 URL: https://issues.apache.org/jira/browse/HBASE-10277
 Project: HBase
  Issue Type: Improvement
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
 Attachments: HBASE-10277.01.patch, HBASE-10277.02.patch, 
 HBASE-10277.03.patch, HBASE-10277.04.patch, HBASE-10277.05.patch, 
 HBASE-10277.06.patch, HBASE-10277.patch


 AsyncProcess currently has two patterns of usage, one from HTable flush w/o 
 callback and with reuse, and one from HCM/HTable batch call, with callback 
 and w/o reuse. In the former case (but not the latter), it also does some 
 throttling of actions on initial submit call, limiting the number of 
 outstanding actions per server.
 The latter case is relatively straightforward. The former appears to be error 
 prone due to reuse - if, as javadoc claims should be safe, multiple submit 
 calls are performed without waiting for the async part of the previous call 
 to finish, fields like hasError become ambiguous and can be used for the 

[jira] [Commented] (HBASE-10277) refactor AsyncProcess

2014-02-04 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13891584#comment-13891584
 ] 

Sergey Shelukhin commented on HBASE-10277:
--

probably another mock-related failure... let me investigate

 refactor AsyncProcess
 -

 Key: HBASE-10277
 URL: https://issues.apache.org/jira/browse/HBASE-10277
 Project: HBase
  Issue Type: Improvement
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
 Attachments: HBASE-10277.01.patch, HBASE-10277.02.patch, 
 HBASE-10277.03.patch, HBASE-10277.04.patch, HBASE-10277.05.patch, 
 HBASE-10277.06.patch, HBASE-10277.patch


 AsyncProcess currently has two patterns of usage, one from HTable flush w/o 
 callback and with reuse, and one from HCM/HTable batch call, with callback 
 and w/o reuse. In the former case (but not the latter), it also does some 
 throttling of actions on initial submit call, limiting the number of 
 outstanding actions per server.
 The latter case is relatively straightforward. The former appears to be error 
 prone due to reuse - if, as javadoc claims should be safe, multiple submit 
 calls are performed without waiting for the async part of the previous call 
 to finish, fields like hasError become ambiguous and can be used for the 
 wrong call; callback for success/failure is called based on original index 
 of an action in submitted list, but with only one callback supplied to AP in 
 ctor it's not clear to which submit call the index belongs, if several are 
 outstanding.
 I was going to add support for HBASE-10070 to AP, and found that it might be 
 difficult to do cleanly.
 It would be nice to normalize AP usage patterns; in particular, separate the 
 global part (load tracking) from per-submit-call part.
 Per-submit part can more conveniently track stuff like initialActions, 
 mapping of indexes and retry information, that is currently passed around the 
 method calls.
 -I am not sure yet, but maybe sending of the original index to server in 
 ClientProtos.MultiAction can also be avoided.- Cannot be avoided because 
 the API to server doesn't have one-to-one correspondence between requests and 
 responses in an individual call to multi (retries/rearrangement have nothing 
 to do with it)



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-10277) refactor AsyncProcess

2014-02-04 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13891597#comment-13891597
 ] 

Sergey Shelukhin commented on HBASE-10277:
--

yeah NPE because mocj doesn't override newly added reusable AP. I'll see what 
can be done, probably just create new API in the mock. I'd assume +1s would 
stand with that change, so I'd commit late today or tomorrow

 refactor AsyncProcess
 -

 Key: HBASE-10277
 URL: https://issues.apache.org/jira/browse/HBASE-10277
 Project: HBase
  Issue Type: Improvement
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
 Attachments: HBASE-10277.01.patch, HBASE-10277.02.patch, 
 HBASE-10277.03.patch, HBASE-10277.04.patch, HBASE-10277.05.patch, 
 HBASE-10277.06.patch, HBASE-10277.patch


 AsyncProcess currently has two patterns of usage, one from HTable flush w/o 
 callback and with reuse, and one from HCM/HTable batch call, with callback 
 and w/o reuse. In the former case (but not the latter), it also does some 
 throttling of actions on initial submit call, limiting the number of 
 outstanding actions per server.
 The latter case is relatively straightforward. The former appears to be error 
 prone due to reuse - if, as javadoc claims should be safe, multiple submit 
 calls are performed without waiting for the async part of the previous call 
 to finish, fields like hasError become ambiguous and can be used for the 
 wrong call; callback for success/failure is called based on original index 
 of an action in submitted list, but with only one callback supplied to AP in 
 ctor it's not clear to which submit call the index belongs, if several are 
 outstanding.
 I was going to add support for HBASE-10070 to AP, and found that it might be 
 difficult to do cleanly.
 It would be nice to normalize AP usage patterns; in particular, separate the 
 global part (load tracking) from per-submit-call part.
 Per-submit part can more conveniently track stuff like initialActions, 
 mapping of indexes and retry information, that is currently passed around the 
 method calls.
 -I am not sure yet, but maybe sending of the original index to server in 
 ClientProtos.MultiAction can also be avoided.- Cannot be avoided because 
 the API to server doesn't have one-to-one correspondence between requests and 
 responses in an individual call to multi (retries/rearrangement have nothing 
 to do with it)



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-10277) refactor AsyncProcess

2014-02-04 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13891822#comment-13891822
 ] 

Hadoop QA commented on HBASE-10277:
---

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12627062/HBASE-10277.07.patch
  against trunk revision .
  ATTACHMENT ID: 12627062

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 9 new 
or modified tests.

{color:green}+1 hadoop1.0{color}.  The patch compiles against the hadoop 
1.0 profile.

{color:green}+1 hadoop1.1{color}.  The patch compiles against the hadoop 
1.1 profile.

{color:red}-1 javadoc{color}.  The javadoc tool appears to have generated 2 
warning messages.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 lineLengths{color}.  The patch introduces the following lines 
longer than 100:
+   * See {@link #submit(ExecutorService, TableName, List, boolean, 
org.apache.hadoop.hbase.client.coprocessor.Batch.Callback, boolean)}.
+  boolean atLeastOne, Batch.CallbackCResult callback, boolean 
needResults) throws InterruptedIOException {
+  // This action failed before creating ars. Add it to retained but do 
not add to submit list.
+   * See {@link #submitAll(ExecutorService, TableName, List, 
org.apache.hadoop.hbase.client.coprocessor.Batch.Callback, Object[])}.
+ Retrying. Server is  + server.getServerName() + , 
tableName= + tableName, t);
+ Throwable error, long backOffTime, boolean 
willRetry, String startTime){

  {color:green}+1 site{color}.  The mvn site goal succeeds with this patch.

{color:green}+1 core tests{color}.  The patch passed unit tests in .

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8600//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8600//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8600//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-thrift.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8600//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8600//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8600//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8600//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8600//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8600//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8600//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8600//console

This message is automatically generated.

 refactor AsyncProcess
 -

 Key: HBASE-10277
 URL: https://issues.apache.org/jira/browse/HBASE-10277
 Project: HBase
  Issue Type: Improvement
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
 Attachments: HBASE-10277.01.patch, HBASE-10277.02.patch, 
 HBASE-10277.03.patch, HBASE-10277.04.patch, HBASE-10277.05.patch, 
 HBASE-10277.06.patch, HBASE-10277.07.patch, HBASE-10277.patch


 AsyncProcess currently has two patterns of usage, one from HTable flush w/o 
 callback and with reuse, and one from HCM/HTable batch call, with callback 
 and w/o reuse. In the former case (but not the latter), it also does some 
 throttling of actions on initial submit call, limiting the number of 
 outstanding actions per server.
 The latter case is relatively straightforward. The former appears to be error 
 prone due to reuse - if, as javadoc claims should be safe, multiple submit 
 calls are performed without waiting for the async part of the previous call 
 to finish, fields like hasError become ambiguous and can be used for the 
 wrong call; callback for success/failure is called 

[jira] [Commented] (HBASE-10277) refactor AsyncProcess

2014-01-31 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13888063#comment-13888063
 ] 

Sergey Shelukhin commented on HBASE-10277:
--

TestEncodedSeekers stuff happens in other JIRAs and is probably unrelated

 refactor AsyncProcess
 -

 Key: HBASE-10277
 URL: https://issues.apache.org/jira/browse/HBASE-10277
 Project: HBase
  Issue Type: Improvement
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
 Attachments: HBASE-10277.01.patch, HBASE-10277.02.patch, 
 HBASE-10277.03.patch, HBASE-10277.04.patch, HBASE-10277.patch


 AsyncProcess currently has two patterns of usage, one from HTable flush w/o 
 callback and with reuse, and one from HCM/HTable batch call, with callback 
 and w/o reuse. In the former case (but not the latter), it also does some 
 throttling of actions on initial submit call, limiting the number of 
 outstanding actions per server.
 The latter case is relatively straightforward. The former appears to be error 
 prone due to reuse - if, as javadoc claims should be safe, multiple submit 
 calls are performed without waiting for the async part of the previous call 
 to finish, fields like hasError become ambiguous and can be used for the 
 wrong call; callback for success/failure is called based on original index 
 of an action in submitted list, but with only one callback supplied to AP in 
 ctor it's not clear to which submit call the index belongs, if several are 
 outstanding.
 I was going to add support for HBASE-10070 to AP, and found that it might be 
 difficult to do cleanly.
 It would be nice to normalize AP usage patterns; in particular, separate the 
 global part (load tracking) from per-submit-call part.
 Per-submit part can more conveniently track stuff like initialActions, 
 mapping of indexes and retry information, that is currently passed around the 
 method calls.
 -I am not sure yet, but maybe sending of the original index to server in 
 ClientProtos.MultiAction can also be avoided.- Cannot be avoided because 
 the API to server doesn't have one-to-one correspondence between requests and 
 responses in an individual call to multi (retries/rearrangement have nothing 
 to do with it)



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-10277) refactor AsyncProcess

2014-01-30 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13887400#comment-13887400
 ] 

Sergey Shelukhin commented on HBASE-10277:
--

I think I know what the issue is with these tests... I will probably fix 
tomorrow, or later today; the patch is still ready for review :)

 refactor AsyncProcess
 -

 Key: HBASE-10277
 URL: https://issues.apache.org/jira/browse/HBASE-10277
 Project: HBase
  Issue Type: Improvement
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
 Attachments: HBASE-10277.01.patch, HBASE-10277.02.patch, 
 HBASE-10277.03.patch, HBASE-10277.patch


 AsyncProcess currently has two patterns of usage, one from HTable flush w/o 
 callback and with reuse, and one from HCM/HTable batch call, with callback 
 and w/o reuse. In the former case (but not the latter), it also does some 
 throttling of actions on initial submit call, limiting the number of 
 outstanding actions per server.
 The latter case is relatively straightforward. The former appears to be error 
 prone due to reuse - if, as javadoc claims should be safe, multiple submit 
 calls are performed without waiting for the async part of the previous call 
 to finish, fields like hasError become ambiguous and can be used for the 
 wrong call; callback for success/failure is called based on original index 
 of an action in submitted list, but with only one callback supplied to AP in 
 ctor it's not clear to which submit call the index belongs, if several are 
 outstanding.
 I was going to add support for HBASE-10070 to AP, and found that it might be 
 difficult to do cleanly.
 It would be nice to normalize AP usage patterns; in particular, separate the 
 global part (load tracking) from per-submit-call part.
 Per-submit part can more conveniently track stuff like initialActions, 
 mapping of indexes and retry information, that is currently passed around the 
 method calls.
 -I am not sure yet, but maybe sending of the original index to server in 
 ClientProtos.MultiAction can also be avoided.- Cannot be avoided because 
 the API to server doesn't have one-to-one correspondence between requests and 
 responses in an individual call to multi (retries/rearrangement have nothing 
 to do with it)



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-10277) refactor AsyncProcess

2014-01-30 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13887478#comment-13887478
 ] 

Hadoop QA commented on HBASE-10277:
---

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12626252/HBASE-10277.04.patch
  against trunk revision .
  ATTACHMENT ID: 12626252

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 9 new 
or modified tests.

{color:green}+1 hadoop1.0{color}.  The patch compiles against the hadoop 
1.0 profile.

{color:green}+1 hadoop1.1{color}.  The patch compiles against the hadoop 
1.1 profile.

{color:red}-1 javadoc{color}.  The javadoc tool appears to have generated 2 
warning messages.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 lineLengths{color}.  The patch introduces the following lines 
longer than 100:
+  // This action failed before creating ars. Add it to retained 
but do not add to submit list.
+ Retrying. Server is  + server.getServerName() + , 
tableName= + tableName, t);
+ Throwable error, long backOffTime, boolean 
willRetry, String startTime){

  {color:green}+1 site{color}.  The mvn site goal succeeds with this patch.

 {color:red}-1 core tests{color}.  The patch failed these unit tests:
 

 {color:red}-1 core zombie tests{color}.  There are 1 zombie test(s):   
at 
org.apache.hadoop.hbase.io.encoding.TestEncodedSeekers.testEncodedSeeker(TestEncodedSeekers.java:129)

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8564//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8564//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8564//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8564//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8564//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8564//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8564//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8564//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8564//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-thrift.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8564//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8564//console

This message is automatically generated.

 refactor AsyncProcess
 -

 Key: HBASE-10277
 URL: https://issues.apache.org/jira/browse/HBASE-10277
 Project: HBase
  Issue Type: Improvement
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
 Attachments: HBASE-10277.01.patch, HBASE-10277.02.patch, 
 HBASE-10277.03.patch, HBASE-10277.04.patch, HBASE-10277.patch


 AsyncProcess currently has two patterns of usage, one from HTable flush w/o 
 callback and with reuse, and one from HCM/HTable batch call, with callback 
 and w/o reuse. In the former case (but not the latter), it also does some 
 throttling of actions on initial submit call, limiting the number of 
 outstanding actions per server.
 The latter case is relatively straightforward. The former appears to be error 
 prone due to reuse - if, as javadoc claims should be safe, multiple submit 
 calls are performed without waiting for the async part of the previous call 
 to finish, fields like hasError become ambiguous and can be used for the 
 wrong call; callback for success/failure is called based on original index 
 of an action in submitted list, but with only one callback supplied to AP in 
 ctor it's not clear to which submit call the index belongs, if several are 
 outstanding.
 I was going to add support for HBASE-10070 to AP, and found 

[jira] [Commented] (HBASE-10277) refactor AsyncProcess

2014-01-29 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13885851#comment-13885851
 ] 

Sergey Shelukhin commented on HBASE-10277:
--

RB at https://reviews.apache.org/r/17510/

 refactor AsyncProcess
 -

 Key: HBASE-10277
 URL: https://issues.apache.org/jira/browse/HBASE-10277
 Project: HBase
  Issue Type: Improvement
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
 Attachments: HBASE-10277.01.patch, HBASE-10277.02.patch, 
 HBASE-10277.patch


 AsyncProcess currently has two patterns of usage, one from HTable flush w/o 
 callback and with reuse, and one from HCM/HTable batch call, with callback 
 and w/o reuse. In the former case (but not the latter), it also does some 
 throttling of actions on initial submit call, limiting the number of 
 outstanding actions per server.
 The latter case is relatively straightforward. The former appears to be error 
 prone due to reuse - if, as javadoc claims should be safe, multiple submit 
 calls are performed without waiting for the async part of the previous call 
 to finish, fields like hasError become ambiguous and can be used for the 
 wrong call; callback for success/failure is called based on original index 
 of an action in submitted list, but with only one callback supplied to AP in 
 ctor it's not clear to which submit call the index belongs, if several are 
 outstanding.
 I was going to add support for HBASE-10070 to AP, and found that it might be 
 difficult to do cleanly.
 It would be nice to normalize AP usage patterns; in particular, separate the 
 global part (load tracking) from per-submit-call part.
 Per-submit part can more conveniently track stuff like initialActions, 
 mapping of indexes and retry information, that is currently passed around the 
 method calls.
 -I am not sure yet, but maybe sending of the original index to server in 
 ClientProtos.MultiAction can also be avoided.- Cannot be avoided because 
 the API to server doesn't have one-to-one correspondence between requests and 
 responses in an individual call to multi (retries/rearrangement have nothing 
 to do with it)



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-10277) refactor AsyncProcess

2014-01-29 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13886232#comment-13886232
 ] 

Hadoop QA commented on HBASE-10277:
---

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12626043/HBASE-10277.03.patch
  against trunk revision .
  ATTACHMENT ID: 12626043

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 9 new 
or modified tests.

{color:green}+1 hadoop1.0{color}.  The patch compiles against the hadoop 
1.0 profile.

{color:green}+1 hadoop1.1{color}.  The patch compiles against the hadoop 
1.1 profile.

{color:red}-1 javadoc{color}.  The javadoc tool appears to have generated 2 
warning messages.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 lineLengths{color}.  The patch introduces the following lines 
longer than 100:
+  // This action failed before creating ars. Add it to retained 
but do not add to submit list.
+ Retrying. Server is  + server.getServerName() + , 
tableName= + tableName, t);
+ Throwable error, long backOffTime, boolean 
willRetry, String startTime){

  {color:green}+1 site{color}.  The mvn site goal succeeds with this patch.

 {color:red}-1 core tests{color}.  The patch failed these unit tests:
   org.apache.hadoop.hbase.io.encoding.TestChangingEncoding
  org.apache.hadoop.hbase.rest.client.TestRemoteTable
  org.apache.hadoop.hbase.client.TestAdmin
  org.apache.hadoop.hbase.regionserver.TestHRegionOnCluster
  org.apache.hadoop.hbase.rest.TestGetAndPutResource
  org.apache.hadoop.hbase.rest.TestDeleteRow
  
org.apache.hadoop.hbase.coprocessor.TestRegionObserverInterface
  
org.apache.hadoop.hbase.regionserver.TestSplitTransactionOnCluster

 {color:red}-1 core zombie tests{color}.  There are 2 zombie test(s):   
at 
org.apache.hadoop.hbase.io.encoding.TestEncodedSeekers.testEncodedSeeker(TestEncodedSeekers.java:129)
at org.apache.hadoop.hbase.client.TestHCM.testMulti(TestHCM.java:858)

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8558//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8558//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8558//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8558//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8558//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8558//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8558//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8558//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8558//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-thrift.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8558//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8558//console

This message is automatically generated.

 refactor AsyncProcess
 -

 Key: HBASE-10277
 URL: https://issues.apache.org/jira/browse/HBASE-10277
 Project: HBase
  Issue Type: Improvement
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
 Attachments: HBASE-10277.01.patch, HBASE-10277.02.patch, 
 HBASE-10277.03.patch, HBASE-10277.patch


 AsyncProcess currently has two patterns of usage, one from HTable flush w/o 
 callback and with reuse, and one from HCM/HTable batch call, with callback 
 and w/o reuse. In the former case (but not the latter), it also does some 
 throttling of actions on initial submit call, limiting the number of 
 outstanding 

[jira] [Commented] (HBASE-10277) refactor AsyncProcess

2014-01-22 Thread Nicolas Liochon (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13878494#comment-13878494
 ] 

Nicolas Liochon commented on HBASE-10277:
-

For #3, flushCommit may be not called before close time. The write buffer 
manage the size of the message for the user, it allows him to stream its writes 
to the HTable. The issue here is really the error management., the feature 
itself is nice when everything works properly.

I would propose an option #4: add a callback for error management. If the 
callback is set, we use it. If not, we raise an exception as we used to do. We 
could as well stream the gets/increments as the puts, and use a callback to 
return the result as well. This would save the creation of the Object[], and it 
would make the interface consistent. The code itself is already there imho. We 
can consider that changing the HTable semantic is for another jira btw, as you 
like.


 refactor AsyncProcess
 -

 Key: HBASE-10277
 URL: https://issues.apache.org/jira/browse/HBASE-10277
 Project: HBase
  Issue Type: Improvement
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
 Attachments: HBASE-10277.01.patch, HBASE-10277.patch


 AsyncProcess currently has two patterns of usage, one from HTable flush w/o 
 callback and with reuse, and one from HCM/HTable batch call, with callback 
 and w/o reuse. In the former case (but not the latter), it also does some 
 throttling of actions on initial submit call, limiting the number of 
 outstanding actions per server.
 The latter case is relatively straightforward. The former appears to be error 
 prone due to reuse - if, as javadoc claims should be safe, multiple submit 
 calls are performed without waiting for the async part of the previous call 
 to finish, fields like hasError become ambiguous and can be used for the 
 wrong call; callback for success/failure is called based on original index 
 of an action in submitted list, but with only one callback supplied to AP in 
 ctor it's not clear to which submit call the index belongs, if several are 
 outstanding.
 I was going to add support for HBASE-10070 to AP, and found that it might be 
 difficult to do cleanly.
 It would be nice to normalize AP usage patterns; in particular, separate the 
 global part (load tracking) from per-submit-call part.
 Per-submit part can more conveniently track stuff like initialActions, 
 mapping of indexes and retry information, that is currently passed around the 
 method calls.
 -I am not sure yet, but maybe sending of the original index to server in 
 ClientProtos.MultiAction can also be avoided.- Cannot be avoided because 
 the API to server doesn't have one-to-one correspondence between requests and 
 responses in an individual call to multi (retries/rearrangement have nothing 
 to do with it)



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-10277) refactor AsyncProcess

2014-01-22 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13879329#comment-13879329
 ] 

Sergey Shelukhin commented on HBASE-10277:
--

Callback for each put call would still need to be maintained separately for 
each put call, and if you need results (like from get) you'd still need to have 
place to store them. Current patch bypasses the creation of Object[] when 
results are not needed, e.g. for streaming puts. Previous code does the same 
thing, but the Object[] for get was still created inside the callback object.
One problem with current callback is that as soon as you have multiple submits 
the index argument to callback becomes ambiguous, so you no longer know for 
which action you receive the error or result. So, HCM batch code creates AP for 
each single submit, that way it knows the index is always from that submit when 
the callback populates the array; and HTable just doesn't use callback for 
streaming puts because it doesn't need the results... if HTable were to use 
current callback for error management (or streaming gets when you need results 
and there are multiple submit calls), it becomes a real problem.

We can instead add /per-call/ callback in the context. It's a hybrid between #3 
and #4; AP can avoid global error support; we can add async call with callback 
to HTable which would use the regular path; current streaming put can have 
the same semantics but maintain the contexts in HTable rather than AP. Let me 
think more about the latter case.

 refactor AsyncProcess
 -

 Key: HBASE-10277
 URL: https://issues.apache.org/jira/browse/HBASE-10277
 Project: HBase
  Issue Type: Improvement
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
 Attachments: HBASE-10277.01.patch, HBASE-10277.patch


 AsyncProcess currently has two patterns of usage, one from HTable flush w/o 
 callback and with reuse, and one from HCM/HTable batch call, with callback 
 and w/o reuse. In the former case (but not the latter), it also does some 
 throttling of actions on initial submit call, limiting the number of 
 outstanding actions per server.
 The latter case is relatively straightforward. The former appears to be error 
 prone due to reuse - if, as javadoc claims should be safe, multiple submit 
 calls are performed without waiting for the async part of the previous call 
 to finish, fields like hasError become ambiguous and can be used for the 
 wrong call; callback for success/failure is called based on original index 
 of an action in submitted list, but with only one callback supplied to AP in 
 ctor it's not clear to which submit call the index belongs, if several are 
 outstanding.
 I was going to add support for HBASE-10070 to AP, and found that it might be 
 difficult to do cleanly.
 It would be nice to normalize AP usage patterns; in particular, separate the 
 global part (load tracking) from per-submit-call part.
 Per-submit part can more conveniently track stuff like initialActions, 
 mapping of indexes and retry information, that is currently passed around the 
 method calls.
 -I am not sure yet, but maybe sending of the original index to server in 
 ClientProtos.MultiAction can also be avoided.- Cannot be avoided because 
 the API to server doesn't have one-to-one correspondence between requests and 
 responses in an individual call to multi (retries/rearrangement have nothing 
 to do with it)



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-10277) refactor AsyncProcess

2014-01-21 Thread Nicolas Liochon (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13877390#comment-13877390
 ] 

Nicolas Liochon commented on HBASE-10277:
-

bq. Does it mean that an AsyncProcess can now be shared between Tables?
bq. Yes
This seems great. Would it be possible then to have a single AsyncProcess per 
HConnection, shared between the different htables objects? This would make
Side question: would it make sense to use the multiget path for a single get, 
instead of having two different paths?

bq.  When we have a scenario to use some callback, we can add it, under YAGNI 
principle 
The scenario is already there: it's how to manage the errors with the write 
buffer. I didn't want to make the interface public (as once it's public you 
should not change it), but at the end of the day, the callback is the most 
obvious solution to the problem. Having it here sets a base for the discussion. 
If your patch allows to have a common resource management per HTable, I'm happy 
to lose the callbacks as a side effect of the patch, but having both would be 
better imho.

bq. IIRC most of these paths are deprecated.
What's deprecated is mainly that the batch interfaces were in HConnection 
instead of HTable. The Object[] is ugly, but is still the 'recommended' way.


 refactor AsyncProcess
 -

 Key: HBASE-10277
 URL: https://issues.apache.org/jira/browse/HBASE-10277
 Project: HBase
  Issue Type: Improvement
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
 Attachments: HBASE-10277.patch


 AsyncProcess currently has two patterns of usage, one from HTable flush w/o 
 callback and with reuse, and one from HCM/HTable batch call, with callback 
 and w/o reuse. In the former case (but not the latter), it also does some 
 throttling of actions on initial submit call, limiting the number of 
 outstanding actions per server.
 The latter case is relatively straightforward. The former appears to be error 
 prone due to reuse - if, as javadoc claims should be safe, multiple submit 
 calls are performed without waiting for the async part of the previous call 
 to finish, fields like hasError become ambiguous and can be used for the 
 wrong call; callback for success/failure is called based on original index 
 of an action in submitted list, but with only one callback supplied to AP in 
 ctor it's not clear to which submit call the index belongs, if several are 
 outstanding.
 I was going to add support for HBASE-10070 to AP, and found that it might be 
 difficult to do cleanly.
 It would be nice to normalize AP usage patterns; in particular, separate the 
 global part (load tracking) from per-submit-call part.
 Per-submit part can more conveniently track stuff like initialActions, 
 mapping of indexes and retry information, that is currently passed around the 
 method calls.
 -I am not sure yet, but maybe sending of the original index to server in 
 ClientProtos.MultiAction can also be avoided.- Cannot be avoided because 
 the API to server doesn't have one-to-one correspondence between requests and 
 responses in an individual call to multi (retries/rearrangement have nothing 
 to do with it)



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-10277) refactor AsyncProcess

2014-01-21 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13878076#comment-13878076
 ] 

Sergey Shelukhin commented on HBASE-10277:
--

bq. This seems great. Would it be possible then to have a single AsyncProcess 
per HConnection, shared between the different htables objects? This would make
Not for legacy mode, because then the cross-put behavior will also be 
cross-HTable.
For individual requests, yeah, that can be done.
Also the sentence appears to be unfinished.

bq. Side question: would it make sense to use the multiget path for a single 
get, instead of having two different paths?
Yeah, that is possible, but it is in scope of different JIRA.

bq. The scenario is already there: it's how to manage the errors with the write 
buffer. I didn't want to make the interface public (as once it's public you 
should not change it), but at the end of the day, the callback is the most 
obvious solution to the problem. Having it here sets a base for the discussion. 
If your patch allows to have a common resource management per HTable, I'm happy 
to lose the callbacks as a side effect of the patch, but having both would be 
better imho.
Can you elaborate on the error management? Right now the patch preserves the 
cross-put-errors mode for HTable, without the callback.


bq. What's deprecated is mainly that the batch interfaces were in HConnection 
instead of HTable. The Object[] is ugly, but is still the 'recommended' way
Yeah, for these paths and without the custom pool is where we reuse the same 
AsyncProcess.

 refactor AsyncProcess
 -

 Key: HBASE-10277
 URL: https://issues.apache.org/jira/browse/HBASE-10277
 Project: HBase
  Issue Type: Improvement
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
 Attachments: HBASE-10277.patch


 AsyncProcess currently has two patterns of usage, one from HTable flush w/o 
 callback and with reuse, and one from HCM/HTable batch call, with callback 
 and w/o reuse. In the former case (but not the latter), it also does some 
 throttling of actions on initial submit call, limiting the number of 
 outstanding actions per server.
 The latter case is relatively straightforward. The former appears to be error 
 prone due to reuse - if, as javadoc claims should be safe, multiple submit 
 calls are performed without waiting for the async part of the previous call 
 to finish, fields like hasError become ambiguous and can be used for the 
 wrong call; callback for success/failure is called based on original index 
 of an action in submitted list, but with only one callback supplied to AP in 
 ctor it's not clear to which submit call the index belongs, if several are 
 outstanding.
 I was going to add support for HBASE-10070 to AP, and found that it might be 
 difficult to do cleanly.
 It would be nice to normalize AP usage patterns; in particular, separate the 
 global part (load tracking) from per-submit-call part.
 Per-submit part can more conveniently track stuff like initialActions, 
 mapping of indexes and retry information, that is currently passed around the 
 method calls.
 -I am not sure yet, but maybe sending of the original index to server in 
 ClientProtos.MultiAction can also be avoided.- Cannot be avoided because 
 the API to server doesn't have one-to-one correspondence between requests and 
 responses in an individual call to multi (retries/rearrangement have nothing 
 to do with it)



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-10277) refactor AsyncProcess

2014-01-21 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13878100#comment-13878100
 ] 

stack commented on HBASE-10277:
---

Pardon the dumb question: Why we need to support 'legacy' behavior?   Strip it?

 refactor AsyncProcess
 -

 Key: HBASE-10277
 URL: https://issues.apache.org/jira/browse/HBASE-10277
 Project: HBase
  Issue Type: Improvement
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
 Attachments: HBASE-10277.patch


 AsyncProcess currently has two patterns of usage, one from HTable flush w/o 
 callback and with reuse, and one from HCM/HTable batch call, with callback 
 and w/o reuse. In the former case (but not the latter), it also does some 
 throttling of actions on initial submit call, limiting the number of 
 outstanding actions per server.
 The latter case is relatively straightforward. The former appears to be error 
 prone due to reuse - if, as javadoc claims should be safe, multiple submit 
 calls are performed without waiting for the async part of the previous call 
 to finish, fields like hasError become ambiguous and can be used for the 
 wrong call; callback for success/failure is called based on original index 
 of an action in submitted list, but with only one callback supplied to AP in 
 ctor it's not clear to which submit call the index belongs, if several are 
 outstanding.
 I was going to add support for HBASE-10070 to AP, and found that it might be 
 difficult to do cleanly.
 It would be nice to normalize AP usage patterns; in particular, separate the 
 global part (load tracking) from per-submit-call part.
 Per-submit part can more conveniently track stuff like initialActions, 
 mapping of indexes and retry information, that is currently passed around the 
 method calls.
 -I am not sure yet, but maybe sending of the original index to server in 
 ClientProtos.MultiAction can also be avoided.- Cannot be avoided because 
 the API to server doesn't have one-to-one correspondence between requests and 
 responses in an individual call to multi (retries/rearrangement have nothing 
 to do with it)



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-10277) refactor AsyncProcess

2014-01-21 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13878106#comment-13878106
 ] 

Sergey Shelukhin commented on HBASE-10277:
--

94 compat... HTable put is currently async but does not have any means to 
return errors. flushCommits can flush multiple puts. Errors are eventually 
thrown thru some put call or flushCommits. We can either break HTable::put 
interface (doesn't seem viable), make put-s sync and add separate async put 
(that is possible but may also be surprising), or remove old pattern from AP, 
but keep track of all the puts inside HTable itself, and aggregate all errors 
only when flushCommits is called, for example (with some client performance 
loss because multiple requests will be tracked on higher level than in AP). 
Overall, I can see merit in scenario where you do bunch of puts and then 
flush... it could be replaced with user issuing multi-puts explicitly, but now 
that API is such as it is, we cannot simply remove it I think. Maybe the 3rd 
approach above is viable, if we add some javadocs/notes.
What do you think?

 refactor AsyncProcess
 -

 Key: HBASE-10277
 URL: https://issues.apache.org/jira/browse/HBASE-10277
 Project: HBase
  Issue Type: Improvement
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
 Attachments: HBASE-10277.patch


 AsyncProcess currently has two patterns of usage, one from HTable flush w/o 
 callback and with reuse, and one from HCM/HTable batch call, with callback 
 and w/o reuse. In the former case (but not the latter), it also does some 
 throttling of actions on initial submit call, limiting the number of 
 outstanding actions per server.
 The latter case is relatively straightforward. The former appears to be error 
 prone due to reuse - if, as javadoc claims should be safe, multiple submit 
 calls are performed without waiting for the async part of the previous call 
 to finish, fields like hasError become ambiguous and can be used for the 
 wrong call; callback for success/failure is called based on original index 
 of an action in submitted list, but with only one callback supplied to AP in 
 ctor it's not clear to which submit call the index belongs, if several are 
 outstanding.
 I was going to add support for HBASE-10070 to AP, and found that it might be 
 difficult to do cleanly.
 It would be nice to normalize AP usage patterns; in particular, separate the 
 global part (load tracking) from per-submit-call part.
 Per-submit part can more conveniently track stuff like initialActions, 
 mapping of indexes and retry information, that is currently passed around the 
 method calls.
 -I am not sure yet, but maybe sending of the original index to server in 
 ClientProtos.MultiAction can also be avoided.- Cannot be avoided because 
 the API to server doesn't have one-to-one correspondence between requests and 
 responses in an individual call to multi (retries/rearrangement have nothing 
 to do with it)



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-10277) refactor AsyncProcess

2014-01-21 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13878111#comment-13878111
 ] 

stack commented on HBASE-10277:
---

So, this redo has to be backportable to 0.94?  Or you mean that the 0.94 APIs 
must work as they did in 0.94 though you are in a 0.96+ context?

Which is option #3?  I do not see a #3 above.  Do you mean 'remove old pattern 
from AP'?  If so, that sounds good to me.  AP is done 'right' (but you have to 
add hackery to handle the ugly stuff a while).  Old API is deprecated and IMO, 
it is find if deprecated API loses perf -- it is incentive to move to the new 
way.



 refactor AsyncProcess
 -

 Key: HBASE-10277
 URL: https://issues.apache.org/jira/browse/HBASE-10277
 Project: HBase
  Issue Type: Improvement
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
 Attachments: HBASE-10277.patch


 AsyncProcess currently has two patterns of usage, one from HTable flush w/o 
 callback and with reuse, and one from HCM/HTable batch call, with callback 
 and w/o reuse. In the former case (but not the latter), it also does some 
 throttling of actions on initial submit call, limiting the number of 
 outstanding actions per server.
 The latter case is relatively straightforward. The former appears to be error 
 prone due to reuse - if, as javadoc claims should be safe, multiple submit 
 calls are performed without waiting for the async part of the previous call 
 to finish, fields like hasError become ambiguous and can be used for the 
 wrong call; callback for success/failure is called based on original index 
 of an action in submitted list, but with only one callback supplied to AP in 
 ctor it's not clear to which submit call the index belongs, if several are 
 outstanding.
 I was going to add support for HBASE-10070 to AP, and found that it might be 
 difficult to do cleanly.
 It would be nice to normalize AP usage patterns; in particular, separate the 
 global part (load tracking) from per-submit-call part.
 Per-submit part can more conveniently track stuff like initialActions, 
 mapping of indexes and retry information, that is currently passed around the 
 method calls.
 -I am not sure yet, but maybe sending of the original index to server in 
 ClientProtos.MultiAction can also be avoided.- Cannot be avoided because 
 the API to server doesn't have one-to-one correspondence between requests and 
 responses in an individual call to multi (retries/rearrangement have nothing 
 to do with it)



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-10277) refactor AsyncProcess

2014-01-21 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13878137#comment-13878137
 ] 

Sergey Shelukhin commented on HBASE-10277:
--

Yeah, I mean the API behavior compat; not backporting.
Options are (inserting numbers into the above)
bq. (1) either break HTable::put interface (doesn't seem viable), (2) make 
put-s sync and add separate async put (that is possible but may also be 
surprising), or (3) remove old pattern from AP, but keep track of all the puts 
inside HTable itself, and aggregate all errors only when flushCommits is 
called, for example (with some client performance loss because multiple 
requests will be tracked on higher level than in AP). 
HTable::put is not deprecated though, it just has very idiosyncratic behavior 
compared to usual sync, async or batching APIs.



 refactor AsyncProcess
 -

 Key: HBASE-10277
 URL: https://issues.apache.org/jira/browse/HBASE-10277
 Project: HBase
  Issue Type: Improvement
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
 Attachments: HBASE-10277.patch


 AsyncProcess currently has two patterns of usage, one from HTable flush w/o 
 callback and with reuse, and one from HCM/HTable batch call, with callback 
 and w/o reuse. In the former case (but not the latter), it also does some 
 throttling of actions on initial submit call, limiting the number of 
 outstanding actions per server.
 The latter case is relatively straightforward. The former appears to be error 
 prone due to reuse - if, as javadoc claims should be safe, multiple submit 
 calls are performed without waiting for the async part of the previous call 
 to finish, fields like hasError become ambiguous and can be used for the 
 wrong call; callback for success/failure is called based on original index 
 of an action in submitted list, but with only one callback supplied to AP in 
 ctor it's not clear to which submit call the index belongs, if several are 
 outstanding.
 I was going to add support for HBASE-10070 to AP, and found that it might be 
 difficult to do cleanly.
 It would be nice to normalize AP usage patterns; in particular, separate the 
 global part (load tracking) from per-submit-call part.
 Per-submit part can more conveniently track stuff like initialActions, 
 mapping of indexes and retry information, that is currently passed around the 
 method calls.
 -I am not sure yet, but maybe sending of the original index to server in 
 ClientProtos.MultiAction can also be avoided.- Cannot be avoided because 
 the API to server doesn't have one-to-one correspondence between requests and 
 responses in an individual call to multi (retries/rearrangement have nothing 
 to do with it)



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-10277) refactor AsyncProcess

2014-01-21 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13878276#comment-13878276
 ] 

Hadoop QA commented on HBASE-10277:
---

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12624266/HBASE-10277.01.patch
  against trunk revision .
  ATTACHMENT ID: 12624266

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 3 new 
or modified tests.

{color:green}+1 hadoop1.0{color}.  The patch compiles against the hadoop 
1.0 profile.

{color:green}+1 hadoop1.1{color}.  The patch compiles against the hadoop 
1.1 profile.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:red}-1 findbugs{color}.  The patch appears to introduce 2 new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 lineLengths{color}.  The patch introduces the following lines 
longer than 100:
+  // This action failed before creating ars. Add it to retained 
but do not add to submit list.
+ Throwable error, long backOffTime, boolean 
willRetry, String startTime){

{color:red}-1 site{color}.  The patch appears to cause mvn site goal to 
fail.

 {color:red}-1 core tests{color}.  The patch failed these unit tests:
   org.apache.hadoop.hbase.master.TestCatalogJanitor

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8487//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8487//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8487//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8487//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8487//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8487//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8487//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8487//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8487//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-thrift.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8487//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8487//console

This message is automatically generated.

 refactor AsyncProcess
 -

 Key: HBASE-10277
 URL: https://issues.apache.org/jira/browse/HBASE-10277
 Project: HBase
  Issue Type: Improvement
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
 Attachments: HBASE-10277.01.patch, HBASE-10277.patch


 AsyncProcess currently has two patterns of usage, one from HTable flush w/o 
 callback and with reuse, and one from HCM/HTable batch call, with callback 
 and w/o reuse. In the former case (but not the latter), it also does some 
 throttling of actions on initial submit call, limiting the number of 
 outstanding actions per server.
 The latter case is relatively straightforward. The former appears to be error 
 prone due to reuse - if, as javadoc claims should be safe, multiple submit 
 calls are performed without waiting for the async part of the previous call 
 to finish, fields like hasError become ambiguous and can be used for the 
 wrong call; callback for success/failure is called based on original index 
 of an action in submitted list, but with only one callback supplied to AP in 
 ctor it's not clear to which submit call the index belongs, if several are 
 outstanding.
 I was going to add support for HBASE-10070 to AP, and found that it might be 
 difficult to do cleanly.
 It would be nice to normalize AP usage patterns; in particular, separate the 
 global part (load tracking) from per-submit-call part.
 Per-submit part can more conveniently track stuff like initialActions, 
 mapping of indexes and retry 

[jira] [Commented] (HBASE-10277) refactor AsyncProcess

2014-01-20 Thread Nicolas Liochon (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13876520#comment-13876520
 ] 

Nicolas Liochon commented on HBASE-10277:
-

It's a difficult patch to read:
1) Some changes are cosmetics: some protected becomes private, some this. are 
removed. I'm not against these changes, but it makes the real meat more 
difficult to find.
2) The javadoc has not been updated, so when the code differs from the javadoc, 
the reader has to sort out himself if it's just that the javadoc is now out 
dated or if there is a regression.
I haven't yet reviewed it globally, but here is a set of questions/comments.

AsyncProcess#submit. Why does it take a tableName? Does it mean that an 
AsyncProcess can now be shared between Tables?

AsyncRequestSet#waitUntilDone
Same responsibility as AsyncProcess#waitUntilDone, but less features (no logs. 
These logs are useful).
bq. It would be nice to normalize AP usage patterns; in particular, separate 
the global part (load tracking) from per-submit-call part.
This part should go in HConnection, as we should manage the load tracking 
globally and not only for a single call. Iit would be a change in bahavior 
compared to the 0.94, but I think we should do it. Would it make you life 
easier here?

bq. I ran some perf test using YCSB and table with write-dropping coproc
Great, really. We should do that much more often...

bq. Probably this perf difference will not be noticeable on real requests 
(remains to be tested).
Let me be more pessimistic here :-).

bq. Also got rid of callback that was mostly used for tests, tests can check 
results without it.
I'm not a big fan of this part of the change. Callbacks can be reused in 
different context (for example to have a different policy, such as ignoring 
errors as in HTableMultiplexer). As well, we now have a hardRetryLimit, but 
this attribute is used only in tests.


More globally, This patch allows to reuse a single AsyncProcess between 
independant streams of writes. Would that be necessary if it was cheaper to 
create ? The cost is reading the configuration, as when we do a HTable#get and 
create a RegionServerCallable. The problem is that with this patch, we still 
create a AsyncProcess in some cases, for example on the batchCallback path...

 refactor AsyncProcess
 -

 Key: HBASE-10277
 URL: https://issues.apache.org/jira/browse/HBASE-10277
 Project: HBase
  Issue Type: Improvement
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
 Attachments: HBASE-10277.patch


 AsyncProcess currently has two patterns of usage, one from HTable flush w/o 
 callback and with reuse, and one from HCM/HTable batch call, with callback 
 and w/o reuse. In the former case (but not the latter), it also does some 
 throttling of actions on initial submit call, limiting the number of 
 outstanding actions per server.
 The latter case is relatively straightforward. The former appears to be error 
 prone due to reuse - if, as javadoc claims should be safe, multiple submit 
 calls are performed without waiting for the async part of the previous call 
 to finish, fields like hasError become ambiguous and can be used for the 
 wrong call; callback for success/failure is called based on original index 
 of an action in submitted list, but with only one callback supplied to AP in 
 ctor it's not clear to which submit call the index belongs, if several are 
 outstanding.
 I was going to add support for HBASE-10070 to AP, and found that it might be 
 difficult to do cleanly.
 It would be nice to normalize AP usage patterns; in particular, separate the 
 global part (load tracking) from per-submit-call part.
 Per-submit part can more conveniently track stuff like initialActions, 
 mapping of indexes and retry information, that is currently passed around the 
 method calls.
 -I am not sure yet, but maybe sending of the original index to server in 
 ClientProtos.MultiAction can also be avoided.- Cannot be avoided because 
 the API to server doesn't have one-to-one correspondence between requests and 
 responses in an individual call to multi (retries/rearrangement have nothing 
 to do with it)



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-10277) refactor AsyncProcess

2014-01-20 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13876760#comment-13876760
 ] 

Sergey Shelukhin commented on HBASE-10277:
--


bq. Could AsyncRequests be done as a 'Future'? Seems to have a bunch in common.
I'll take a look at it... it's more like a multi-future. Maybe even FutureTask 
can be used.

bq. We have something called Set but it does not implement java.util.Set:
Will fix.

bq. We make an ap and a multiap each time?
Once per HTable. The difference is the mode of operation - the legacy mode or 
the normal one.

bq. Batch should return an array of objects?
It does? Don't quite understand the comment. Some of the existing interfaces 
accept the array for results that fill it. Backward compat.


bq. 1) Some changes are cosmetics: some protected becomes private, some this. 
are removed. I'm not against these changes, but it makes the real meat more 
difficult to find.
Removing this. is not cosmetic (at least in the cases I am aware of) - 
methods moved to a non-static nested class, there's no more this..
Changing to private can be removed, although it's a good change to have.

bq. 2) The javadoc has not been updated, so when the code differs from the 
javadoc, the reader has to sort out himself if it's just that the javadoc is 
now out dated or if there is a regression. 
Will update.

bq. AsyncProcess#submit. Why does it take a tableName? Does it mean that an 
AsyncProcess can now be shared between Tables?
Yes.

bq. AsyncRequestSet#waitUntilDone
bq. Same responsibility as AsyncProcess#waitUntilDone, but less features (no 
logs. These logs are useful).
Some logs were preserved. 
Previous waitUntilDone had two wait conditions and loops in an effort not to 
loop often, whereas only one seems to be necessary, so I don't think features 
were lost... I can add more logs.

bq. This part should go in HConnection, as we should manage the load tracking 
globally and not only for a single call. Iit would be a change in bahavior 
compared to the 0.94, but I think we should do it. Would it make you life 
easier here?
It may, actually... but HTable uses AP directly. In fact batch calls from HCM 
currently don't use throttling at all (it calls submitAll), so only 
HTable-direct-usage uses this code.

{quote}
bq. Probably this perf difference will not be noticeable on real requests 
(remains to be tested).
Let me be more pessimistic here .
{quote}
Yeah, I'd probably need to test. I was not able to figure out what exactly is 
the cause of slowdown - YourKit gives very low % numbers for AP, CPU or wall 
clock, and almost identical between old and equivalent new methods across runs.

{quote}
bq. Also got rid of callback that was mostly used for tests, tests can check 
results without it.
I'm not a big fan of this part of the change. Callbacks can be reused in 
different context (for example to have a different policy, such as ignoring 
errors as in HTableMultiplexer). As well, we now have a hardRetryLimit, but 
this attribute is used only in tests.
{quote}
Callback was already only used for tests, for practical purposes (also for 
array filling, but that is no longer necessary).
I am not a big fan of having test-oriented code in production code; thus I 
replaced the only necessary test usage (stopping retries) with a field. I 
wanted to get rid of that too, but that would be too convoluted it seems.
When we have a scenario to use some callback, we can add it, under YAGNI 
principle :)

bq. More globally, This patch allows to reuse a single AsyncProcess between 
independant streams of writes. Would that be necessary if it was cheaper to 
create ? The cost is reading the configuration, as when we do a HTable#get and 
create a RegionServerCallable.
The main reason is to have well-defined context for single call for normal 
(as opposed to HTable cross-put stuff) usage pattern. So for example replica 
calls could group requests differently and synchronize/populate the same 
result, cancel other calls when they are no longer useful, etc.
It also separates the two patterns better (by ctor argument); see potential 
problems outlined in JIRA description.

bq. The problem is that with this patch, we still create a AsyncProcess in some 
cases, for example on the batchCallback path...
Yeah, that is due to ability to pass a custom execution pool (could be changed 
to accomodate it by making pool per request, but I am not sure it's 
necessary...) and limitations of java generics + current result type handling 
(AP will have to be per result type). IIRC most of these paths are deprecated.



 refactor AsyncProcess
 -

 Key: HBASE-10277
 URL: https://issues.apache.org/jira/browse/HBASE-10277
 Project: HBase
  Issue Type: Improvement
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
 Attachments: 

[jira] [Commented] (HBASE-10277) refactor AsyncProcess

2014-01-15 Thread Nicolas Liochon (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13872639#comment-13872639
 ] 

Nicolas Liochon commented on HBASE-10277:
-

It's on my plate :-). I'm going to do the review. Thanks for working on this 
Sergey.

 refactor AsyncProcess
 -

 Key: HBASE-10277
 URL: https://issues.apache.org/jira/browse/HBASE-10277
 Project: HBase
  Issue Type: Improvement
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
 Attachments: HBASE-10277.patch


 AsyncProcess currently has two patterns of usage, one from HTable flush w/o 
 callback and with reuse, and one from HCM/HTable batch call, with callback 
 and w/o reuse. In the former case (but not the latter), it also does some 
 throttling of actions on initial submit call, limiting the number of 
 outstanding actions per server.
 The latter case is relatively straightforward. The former appears to be error 
 prone due to reuse - if, as javadoc claims should be safe, multiple submit 
 calls are performed without waiting for the async part of the previous call 
 to finish, fields like hasError become ambiguous and can be used for the 
 wrong call; callback for success/failure is called based on original index 
 of an action in submitted list, but with only one callback supplied to AP in 
 ctor it's not clear to which submit call the index belongs, if several are 
 outstanding.
 I was going to add support for HBASE-10070 to AP, and found that it might be 
 difficult to do cleanly.
 It would be nice to normalize AP usage patterns; in particular, separate the 
 global part (load tracking) from per-submit-call part.
 Per-submit part can more conveniently track stuff like initialActions, 
 mapping of indexes and retry information, that is currently passed around the 
 method calls.
 I am not sure yet, but maybe sending of the original index to server in 
 ClientProtos.MultiAction can also be avoided.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-10277) refactor AsyncProcess

2014-01-14 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13871439#comment-13871439
 ] 

stack commented on HBASE-10277:
---

Thanks for having a go at this [~sershe].  

 bq. The (ugly) behavior for HTable where e.g. next put will give you errors 
from previous put was lovingly preserved.

Smile.

Radical!

Here are a couple of comments:

Could AsyncRequests be done as a 'Future'?  Seems to have a bunch in common.

We have something called Set but it does not implement java.util.Set: 

+  protected class AsyncRequestSet implements AsyncRequests {

We make an ap and a multiap each time?

+ap = new AsyncProcessObject(connection, pool, configuration, 
rpcCallerFactory, true);
+multiAp = new AsyncProcessObject(connection, pool, configuration, 
rpcCallerFactory, false);

Batch should return an array of objects?

+Object[] results = new Object[actions.size()];
+batch(actions, results);
+return results;

Then you can do return batch(actions, new Object[actions.size()]);

Needs a closer review.

Whats our [~nkeywal] think?

 refactor AsyncProcess
 -

 Key: HBASE-10277
 URL: https://issues.apache.org/jira/browse/HBASE-10277
 Project: HBase
  Issue Type: Improvement
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
 Attachments: HBASE-10277.patch


 AsyncProcess currently has two patterns of usage, one from HTable flush w/o 
 callback and with reuse, and one from HCM/HTable batch call, with callback 
 and w/o reuse. In the former case (but not the latter), it also does some 
 throttling of actions on initial submit call, limiting the number of 
 outstanding actions per server.
 The latter case is relatively straightforward. The former appears to be error 
 prone due to reuse - if, as javadoc claims should be safe, multiple submit 
 calls are performed without waiting for the async part of the previous call 
 to finish, fields like hasError become ambiguous and can be used for the 
 wrong call; callback for success/failure is called based on original index 
 of an action in submitted list, but with only one callback supplied to AP in 
 ctor it's not clear to which submit call the index belongs, if several are 
 outstanding.
 I was going to add support for HBASE-10070 to AP, and found that it might be 
 difficult to do cleanly.
 It would be nice to normalize AP usage patterns; in particular, separate the 
 global part (load tracking) from per-submit-call part.
 Per-submit part can more conveniently track stuff like initialActions, 
 mapping of indexes and retry information, that is currently passed around the 
 method calls.
 I am not sure yet, but maybe sending of the original index to server in 
 ClientProtos.MultiAction can also be avoided.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-10277) refactor AsyncProcess

2014-01-13 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13869762#comment-13869762
 ] 

Sergey Shelukhin commented on HBASE-10277:
--

I started the mailing thread on this. 

 refactor AsyncProcess
 -

 Key: HBASE-10277
 URL: https://issues.apache.org/jira/browse/HBASE-10277
 Project: HBase
  Issue Type: Improvement
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
 Attachments: HBASE-10277.patch


 AsyncProcess currently has two patterns of usage, one from HTable flush w/o 
 callback and with reuse, and one from HCM/HTable batch call, with callback 
 and w/o reuse. In the former case (but not the latter), it also does some 
 throttling of actions on initial submit call, limiting the number of 
 outstanding actions per server.
 The latter case is relatively straightforward. The former appears to be error 
 prone due to reuse - if, as javadoc claims should be safe, multiple submit 
 calls are performed without waiting for the async part of the previous call 
 to finish, fields like hasError become ambiguous and can be used for the 
 wrong call; callback for success/failure is called based on original index 
 of an action in submitted list, but with only one callback supplied to AP in 
 ctor it's not clear to which submit call the index belongs, if several are 
 outstanding.
 I was going to add support for HBASE-10070 to AP, and found that it might be 
 difficult to do cleanly.
 It would be nice to normalize AP usage patterns; in particular, separate the 
 global part (load tracking) from per-submit-call part.
 Per-submit part can more conveniently track stuff like initialActions, 
 mapping of indexes and retry information, that is currently passed around the 
 method calls.
 I am not sure yet, but maybe sending of the original index to server in 
 ClientProtos.MultiAction can also be avoided.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-10277) refactor AsyncProcess

2014-01-11 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13868855#comment-13868855
 ] 

Andrew Purtell commented on HBASE-10277:


bq. The (ugly) behavior for HTable where e.g. next put will give you errors 
from previous put was lovingly preserved.

To my mind this is a serious problem because the application is getting 
incorrect feedback on what operation actually failed.

 refactor AsyncProcess
 -

 Key: HBASE-10277
 URL: https://issues.apache.org/jira/browse/HBASE-10277
 Project: HBase
  Issue Type: Improvement
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
 Attachments: HBASE-10277.patch


 AsyncProcess currently has two patterns of usage, one from HTable flush w/o 
 callback and with reuse, and one from HCM/HTable batch call, with callback 
 and w/o reuse. In the former case (but not the latter), it also does some 
 throttling of actions on initial submit call, limiting the number of 
 outstanding actions per server.
 The latter case is relatively straightforward. The former appears to be error 
 prone due to reuse - if, as javadoc claims should be safe, multiple submit 
 calls are performed without waiting for the async part of the previous call 
 to finish, fields like hasError become ambiguous and can be used for the 
 wrong call; callback for success/failure is called based on original index 
 of an action in submitted list, but with only one callback supplied to AP in 
 ctor it's not clear to which submit call the index belongs, if several are 
 outstanding.
 I was going to add support for HBASE-10070 to AP, and found that it might be 
 difficult to do cleanly.
 It would be nice to normalize AP usage patterns; in particular, separate the 
 global part (load tracking) from per-submit-call part.
 Per-submit part can more conveniently track stuff like initialActions, 
 mapping of indexes and retry information, that is currently passed around the 
 method calls.
 I am not sure yet, but maybe sending of the original index to server in 
 ClientProtos.MultiAction can also be avoided.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-10277) refactor AsyncProcess

2014-01-09 Thread Enis Soztutar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13867493#comment-13867493
 ] 

Enis Soztutar commented on HBASE-10277:
---

[~nkeywal] FYI. 

 refactor AsyncProcess
 -

 Key: HBASE-10277
 URL: https://issues.apache.org/jira/browse/HBASE-10277
 Project: HBase
  Issue Type: Improvement
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
 Attachments: HBASE-10277.patch


 AsyncProcess currently has two patterns of usage, one from HTable flush w/o 
 callback and with reuse, and one from HCM/HTable batch call, with callback 
 and w/o reuse. In the former case (but not the latter), it also does some 
 throttling of actions on initial submit call, limiting the number of 
 outstanding actions per server.
 The latter case is relatively straightforward. The former appears to be error 
 prone due to reuse - if, as javadoc claims should be safe, multiple submit 
 calls are performed without waiting for the async part of the previous call 
 to finish, fields like hasError become ambiguous and can be used for the 
 wrong call; callback for success/failure is called based on original index 
 of an action in submitted list, but with only one callback supplied to AP in 
 ctor it's not clear to which submit call the index belongs, if several are 
 outstanding.
 I was going to add support for HBASE-10070 to AP, and found that it might be 
 difficult to do cleanly.
 It would be nice to normalize AP usage patterns; in particular, separate the 
 global part (load tracking) from per-submit-call part.
 Per-submit part can more conveniently track stuff like initialActions, 
 mapping of indexes and retry information, that is currently passed around the 
 method calls.
 I am not sure yet, but maybe sending of the original index to server in 
 ClientProtos.MultiAction can also be avoided.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-10277) refactor AsyncProcess

2014-01-09 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13867511#comment-13867511
 ] 

Hadoop QA commented on HBASE-10277:
---

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12622325/HBASE-10277.patch
  against trunk revision .
  ATTACHMENT ID: 12622325

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 3 new 
or modified tests.

{color:green}+1 hadoop1.0{color}.  The patch compiles against the hadoop 
1.0 profile.

{color:green}+1 hadoop1.1{color}.  The patch compiles against the hadoop 
1.1 profile.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 lineLengths{color}.  The patch introduces the following lines 
longer than 100:
+  // This action failed before creating ars. Add it to retained 
but do not add to submit list.
+ Throwable error, long backOffTime, boolean 
willRetry, String startTime){

{color:red}-1 site{color}.  The patch appears to cause mvn site goal to 
fail.

 {color:red}-1 core tests{color}.  The patch failed these unit tests:
   org.apache.hadoop.hbase.master.TestCatalogJanitor

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8382//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8382//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8382//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-thrift.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8382//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8382//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8382//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8382//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8382//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8382//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8382//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8382//console

This message is automatically generated.

 refactor AsyncProcess
 -

 Key: HBASE-10277
 URL: https://issues.apache.org/jira/browse/HBASE-10277
 Project: HBase
  Issue Type: Improvement
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
 Attachments: HBASE-10277.patch


 AsyncProcess currently has two patterns of usage, one from HTable flush w/o 
 callback and with reuse, and one from HCM/HTable batch call, with callback 
 and w/o reuse. In the former case (but not the latter), it also does some 
 throttling of actions on initial submit call, limiting the number of 
 outstanding actions per server.
 The latter case is relatively straightforward. The former appears to be error 
 prone due to reuse - if, as javadoc claims should be safe, multiple submit 
 calls are performed without waiting for the async part of the previous call 
 to finish, fields like hasError become ambiguous and can be used for the 
 wrong call; callback for success/failure is called based on original index 
 of an action in submitted list, but with only one callback supplied to AP in 
 ctor it's not clear to which submit call the index belongs, if several are 
 outstanding.
 I was going to add support for HBASE-10070 to AP, and found that it might be 
 difficult to do cleanly.
 It would be nice to normalize AP usage patterns; in particular, separate the 
 global part (load tracking) from per-submit-call part.
 Per-submit part can more conveniently track stuff like initialActions, 
 mapping of indexes and retry information, that is currently