subject:"\[jira\] \[Commented\] \(HBASE\-5924\) In the client code, don't wait for all the requests to be executed before resubmitting a request in error."

[jira] [Commented] (HBASE-5924) In the client code, don't wait for all the requests to be executed before resubmitting a request in error.

2012-06-14 Thread Ted Yu (JIRA)

[
https://issues.apache.org/jira/browse/HBASE-5924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13294841#comment-13294841
]

Ted Yu commented on HBASE-5924:
---

Strange that TestDrainingServer failed Hadoop QA.
It passed locally.

Patch integrated to trunk.

Thanks for the patch, N.

Thanks for the review, Stack.

In the client code, don't wait for all the requests to be executed before
resubmitting a request in error.
--

Key: HBASE-5924
URL: https://issues.apache.org/jira/browse/HBASE-5924
Project: HBase
Issue Type: Improvement
Components: client
Affects Versions: 0.96.0
Reporter: nkeywal
Assignee: nkeywal
Fix For: 0.96.0

Attachments: 5924.v11.patch, 5924.v14.patch, 5924.v19.patch,
5924.v5.patch, 5924.v9.patch

The client (in the function HConnectionManager#processBatchCallback) works in
two steps:
- make the requests
- collect the failures and successes and prepare for retry
It means that when there is an immediate error (region moved, split, dead
server, ...) we still wait for all the initial requests to be executed before
submitting again the failed request. If we have a scenario with all the
requests taking 5 seconds we have a final execution time of: 5 (initial
requests) + 1 (wait time) + 5 (final request) = 11s.
We could improve this by analyzing immediately the results. This would lead
us, for the scenario mentioned above, to 6 seconds.
So we could have a performance improvement of nearly 50% in many cases, and
much more than 50% if the request execution time is different.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators:
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HBASE-5924) In the client code, don't wait for all the requests to be executed before resubmitting a request in error.

2012-06-14 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-5924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13294928#comment-13294928
 ] 

Hudson commented on HBASE-5924:
---

Integrated in HBase-TRUNK #3023 (See 
[https://builds.apache.org/job/HBase-TRUNK/3023/])
HBASE-5924 In the client code, don't wait for all the requests to be 
executed before resubmitting a request in error (N Keywal)

Submitted by:   N Keywal
Reviewed by:Stack, Ted (Revision 1350105)

 Result = FAILURE
tedyu : 
Files : 
* 
/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/client/HConnection.java
* 
/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java
* 
/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/client/HTable.java
* 
/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/client/HTableInterface.java
* 
/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/coprocessor/CoprocessorHost.java
* 
/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/rest/client/RemoteHTable.java
* 
/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/util/Triple.java
* 
/hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/client/TestHCM.java
* 
/hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/coprocessor/TestMasterCoprocessorExceptionWithRemove.java
* 
/hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/coprocessor/TestRegionServerCoprocessorExceptionWithAbort.java


 In the client code, don't wait for all the requests to be executed before 
 resubmitting a request in error.
 --

 Key: HBASE-5924
 URL: https://issues.apache.org/jira/browse/HBASE-5924
 Project: HBase
  Issue Type: Improvement
  Components: client
Affects Versions: 0.96.0
Reporter: nkeywal
Assignee: nkeywal
 Fix For: 0.96.0

 Attachments: 5924.v11.patch, 5924.v14.patch, 5924.v19.patch, 
 5924.v5.patch, 5924.v9.patch


 The client (in the function HConnectionManager#processBatchCallback) works in 
 two steps:
  - make the requests
  - collect the failures and successes and prepare for retry
 It means that when there is an immediate error (region moved, split, dead 
 server, ...) we still wait for all the initial requests to be executed before 
 submitting again the failed request. If we have a scenario with all the 
 requests taking 5 seconds we have a final execution time of: 5 (initial 
 requests) + 1 (wait time) + 5 (final request) = 11s.
 We could improve this by analyzing immediately the results. This would lead 
 us, for the scenario mentioned above, to 6 seconds. 
 So we could have a performance improvement of nearly 50% in many cases, and 
 much more than 50% if the request execution time is different.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HBASE-5924) In the client code, don't wait for all the requests to be executed before resubmitting a request in error.

2012-06-14 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-5924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13294977#comment-13294977
 ] 

Hudson commented on HBASE-5924:
---

Integrated in HBase-TRUNK-on-Hadoop-2.0.0 #53 (See 
[https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/53/])
HBASE-5924 In the client code, don't wait for all the requests to be 
executed before resubmitting a request in error (N Keywal)

Submitted by:   N Keywal
Reviewed by:Stack, Ted (Revision 1350105)

 Result = FAILURE
tedyu : 
Files : 
* 
/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/client/HConnection.java
* 
/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java
* 
/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/client/HTable.java
* 
/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/client/HTableInterface.java
* 
/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/coprocessor/CoprocessorHost.java
* 
/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/rest/client/RemoteHTable.java
* 
/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/util/Triple.java
* 
/hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/client/TestHCM.java
* 
/hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/coprocessor/TestMasterCoprocessorExceptionWithRemove.java
* 
/hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/coprocessor/TestRegionServerCoprocessorExceptionWithAbort.java


 In the client code, don't wait for all the requests to be executed before 
 resubmitting a request in error.
 --

 Key: HBASE-5924
 URL: https://issues.apache.org/jira/browse/HBASE-5924
 Project: HBase
  Issue Type: Improvement
  Components: client
Affects Versions: 0.96.0
Reporter: nkeywal
Assignee: nkeywal
 Fix For: 0.96.0

 Attachments: 5924.v11.patch, 5924.v14.patch, 5924.v19.patch, 
 5924.v5.patch, 5924.v9.patch


 The client (in the function HConnectionManager#processBatchCallback) works in 
 two steps:
  - make the requests
  - collect the failures and successes and prepare for retry
 It means that when there is an immediate error (region moved, split, dead 
 server, ...) we still wait for all the initial requests to be executed before 
 submitting again the failed request. If we have a scenario with all the 
 requests taking 5 seconds we have a final execution time of: 5 (initial 
 requests) + 1 (wait time) + 5 (final request) = 11s.
 We could improve this by analyzing immediately the results. This would lead 
 us, for the scenario mentioned above, to 6 seconds. 
 So we could have a performance improvement of nearly 50% in many cases, and 
 much more than 50% if the request execution time is different.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HBASE-5924) In the client code, don't wait for all the requests to be executed before resubmitting a request in error.

2012-06-14 Thread stack (JIRA)

[
https://issues.apache.org/jira/browse/HBASE-5924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13295204#comment-13295204
]

stack commented on HBASE-5924:
--

I took a look at committed patch. This looks great.

In the client code, don't wait for all the requests to be executed before
resubmitting a request in error.
--

Attachments: 5924.v11.patch, 5924.v14.patch, 5924.v19.patch,
5924.v5.patch, 5924.v9.patch

[jira] [Commented] (HBASE-5924) In the client code, don't wait for all the requests to be executed before resubmitting a request in error.

2012-06-13 Thread nkeywal (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-5924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13294587#comment-13294587
 ] 

nkeywal commented on HBASE-5924:


v19 with all the comments taken into account. I will create an another jira to 
rearrange the coprocessors tests on the znode.

 In the client code, don't wait for all the requests to be executed before 
 resubmitting a request in error.
 --

 Key: HBASE-5924
 URL: https://issues.apache.org/jira/browse/HBASE-5924
 Project: HBase
  Issue Type: Improvement
  Components: client
Affects Versions: 0.96.0
Reporter: nkeywal
Assignee: nkeywal
 Fix For: 0.96.0

 Attachments: 5924.v11.patch, 5924.v14.patch, 5924.v19.patch, 
 5924.v5.patch, 5924.v9.patch


 The client (in the function HConnectionManager#processBatchCallback) works in 
 two steps:
  - make the requests
  - collect the failures and successes and prepare for retry
 It means that when there is an immediate error (region moved, split, dead 
 server, ...) we still wait for all the initial requests to be executed before 
 submitting again the failed request. If we have a scenario with all the 
 requests taking 5 seconds we have a final execution time of: 5 (initial 
 requests) + 1 (wait time) + 5 (final request) = 11s.
 We could improve this by analyzing immediately the results. This would lead 
 us, for the scenario mentioned above, to 6 seconds. 
 So we could have a performance improvement of nearly 50% in many cases, and 
 much more than 50% if the request execution time is different.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HBASE-5924) In the client code, don't wait for all the requests to be executed before resubmitting a request in error.

2012-06-13 Thread Hadoop QA (JIRA)

[
https://issues.apache.org/jira/browse/HBASE-5924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13294612#comment-13294612
]

Hadoop QA commented on HBASE-5924:
--

-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12531976/5924.v19.patch
against trunk revision .

+1 @author. The patch does not contain any @author tags.

+1 tests included. The patch appears to include 9 new or modified tests.

+1 hadoop2.0. The patch compiles against the hadoop 2.0 profile.

+1 javadoc. The javadoc tool did not generate any warning messages.

+1 javac. The applied patch does not increase the total number of javac
compiler warnings.

-1 findbugs. The patch appears to introduce 6 new Findbugs (version 1.3.9)
warnings.

+1 release audit. The applied patch does not increase the total number of
release audit warnings.

-1 core tests. The patch failed these unit tests:
org.apache.hadoop.hbase.TestDrainingServer

Test results:
https://builds.apache.org/job/PreCommit-HBASE-Build/2163//testReport/
Findbugs warnings:
https://builds.apache.org/job/PreCommit-HBASE-Build/2163//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings:
https://builds.apache.org/job/PreCommit-HBASE-Build/2163//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Console output:
https://builds.apache.org/job/PreCommit-HBASE-Build/2163//console

This message is automatically generated.

In the client code, don't wait for all the requests to be executed before
resubmitting a request in error.
--

Attachments: 5924.v11.patch, 5924.v14.patch, 5924.v19.patch,
5924.v5.patch, 5924.v9.patch

[jira] [Commented] (HBASE-5924) In the client code, don't wait for all the requests to be executed before resubmitting a request in error.

2012-06-12 Thread nkeywal (JIRA)

[
https://issues.apache.org/jira/browse/HBASE-5924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13293775#comment-13293775
]

nkeywal commented on HBASE-5924:

@ted: Ok. These issues were already in my initial patch. Could you confirm that
you have finished the review? I would like to deliver 'the' final patch. Thank
you.

In the client code, don't wait for all the requests to be executed before
resubmitting a request in error.
--

Attachments: 5924.v11.patch, 5924.v14.patch, 5924.v5.patch,
5924.v9.patch

[jira] [Commented] (HBASE-5924) In the client code, don't wait for all the requests to be executed before resubmitting a request in error.

2012-06-12 Thread Zhihong Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-5924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13293778#comment-13293778
 ] 

Zhihong Ted Yu commented on HBASE-5924:
---

I don't have further comments.

Thanks

 In the client code, don't wait for all the requests to be executed before 
 resubmitting a request in error.
 --

 Key: HBASE-5924
 URL: https://issues.apache.org/jira/browse/HBASE-5924
 Project: HBase
  Issue Type: Improvement
  Components: client
Affects Versions: 0.96.0
Reporter: nkeywal
Assignee: nkeywal
 Fix For: 0.96.0

 Attachments: 5924.v11.patch, 5924.v14.patch, 5924.v5.patch, 
 5924.v9.patch


 The client (in the function HConnectionManager#processBatchCallback) works in 
 two steps:
  - make the requests
  - collect the failures and successes and prepare for retry
 It means that when there is an immediate error (region moved, split, dead 
 server, ...) we still wait for all the initial requests to be executed before 
 submitting again the failed request. If we have a scenario with all the 
 requests taking 5 seconds we have a final execution time of: 5 (initial 
 requests) + 1 (wait time) + 5 (final request) = 11s.
 We could improve this by analyzing immediately the results. This would lead 
 us, for the scenario mentioned above, to 6 seconds. 
 So we could have a performance improvement of nearly 50% in many cases, and 
 much more than 50% if the request execution time is different.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HBASE-5924) In the client code, don't wait for all the requests to be executed before resubmitting a request in error.

2012-06-08 Thread nkeywal (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-5924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13291624#comment-13291624
 ] 

nkeywal commented on HBASE-5924:


I've done some changes to TestRegionServerCoprocessorExceptionWithAbort. The 
new implementation is very different but I think closer to what the test really 
wants to check...

 In the client code, don't wait for all the requests to be executed before 
 resubmitting a request in error.
 --

 Key: HBASE-5924
 URL: https://issues.apache.org/jira/browse/HBASE-5924
 Project: HBase
  Issue Type: Improvement
  Components: client
Affects Versions: 0.96.0
Reporter: nkeywal
Assignee: nkeywal
 Attachments: 5924.v11.patch, 5924.v14.patch, 5924.v5.patch, 
 5924.v9.patch


 The client (in the function HConnectionManager#processBatchCallback) works in 
 two steps:
  - make the requests
  - collect the failures and successes and prepare for retry
 It means that when there is an immediate error (region moved, split, dead 
 server, ...) we still wait for all the initial requests to be executed before 
 submitting again the failed request. If we have a scenario with all the 
 requests taking 5 seconds we have a final execution time of: 5 (initial 
 requests) + 1 (wait time) + 5 (final request) = 11s.
 We could improve this by analyzing immediately the results. This would lead 
 us, for the scenario mentioned above, to 6 seconds. 
 So we could have a performance improvement of nearly 50% in many cases, and 
 much more than 50% if the request execution time is different.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HBASE-5924) In the client code, don't wait for all the requests to be executed before resubmitting a request in error.

2012-06-08 Thread Hadoop QA (JIRA)

[
https://issues.apache.org/jira/browse/HBASE-5924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13291644#comment-13291644
]

Hadoop QA commented on HBASE-5924:
--

-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12531380/5924.v14.patch
against trunk revision .

+1 @author. The patch does not contain any @author tags.

+1 tests included. The patch appears to include 6 new or modified tests.

+1 hadoop2.0. The patch compiles against the hadoop 2.0 profile.

+1 javadoc. The javadoc tool did not generate any warning messages.

+1 javac. The applied patch does not increase the total number of javac
compiler warnings.

-1 findbugs. The patch appears to introduce 5 new Findbugs (version 1.3.9)
warnings.

+1 release audit. The applied patch does not increase the total number of
release audit warnings.

-1 core tests. The patch failed these unit tests:

org.apache.hadoop.hbase.regionserver.TestServerCustomProtocol

Test results:
https://builds.apache.org/job/PreCommit-HBASE-Build/2129//testReport/
Findbugs warnings:
https://builds.apache.org/job/PreCommit-HBASE-Build/2129//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings:
https://builds.apache.org/job/PreCommit-HBASE-Build/2129//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Console output:
https://builds.apache.org/job/PreCommit-HBASE-Build/2129//console

This message is automatically generated.

In the client code, don't wait for all the requests to be executed before
resubmitting a request in error.
--

[jira] [Commented] (HBASE-5924) In the client code, don't wait for all the requests to be executed before resubmitting a request in error.

2012-06-08 Thread nkeywal (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-5924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13291656#comment-13291656
 ] 

nkeywal commented on HBASE-5924:


I don't reproduce locally the issue on TestServerCustomProtocol, seems ok to me.

 In the client code, don't wait for all the requests to be executed before 
 resubmitting a request in error.
 --

 Key: HBASE-5924
 URL: https://issues.apache.org/jira/browse/HBASE-5924
 Project: HBase
  Issue Type: Improvement
  Components: client
Affects Versions: 0.96.0
Reporter: nkeywal
Assignee: nkeywal
 Fix For: 0.96.0

 Attachments: 5924.v11.patch, 5924.v14.patch, 5924.v5.patch, 
 5924.v9.patch


 The client (in the function HConnectionManager#processBatchCallback) works in 
 two steps:
  - make the requests
  - collect the failures and successes and prepare for retry
 It means that when there is an immediate error (region moved, split, dead 
 server, ...) we still wait for all the initial requests to be executed before 
 submitting again the failed request. If we have a scenario with all the 
 requests taking 5 seconds we have a final execution time of: 5 (initial 
 requests) + 1 (wait time) + 5 (final request) = 11s.
 We could improve this by analyzing immediately the results. This would lead 
 us, for the scenario mentioned above, to 6 seconds. 
 So we could have a performance improvement of nearly 50% in many cases, and 
 much more than 50% if the request execution time is different.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HBASE-5924) In the client code, don't wait for all the requests to be executed before resubmitting a request in error.

2012-06-08 Thread Zhihong Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-5924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13291657#comment-13291657
 ] 

Zhihong Ted Yu commented on HBASE-5924:
---

With RSTracker gone, the following flag is no longer checked:
{code}
-public synchronized void nodeDeleted(String path) {
-  if (path.equals(rsNode)) {
-regionZKNodeWasDeleted = true;
{code}
Can we keep the check ?
{code}
-assertTrue(RegionServer aborted on coprocessor exception, as expected.,
-rsTracker.regionZKNodeWasDeleted);
{code}
I think this should be kept:
{code}
-table.close();
{code}

 In the client code, don't wait for all the requests to be executed before 
 resubmitting a request in error.
 --

 Key: HBASE-5924
 URL: https://issues.apache.org/jira/browse/HBASE-5924
 Project: HBase
  Issue Type: Improvement
  Components: client
Affects Versions: 0.96.0
Reporter: nkeywal
Assignee: nkeywal
 Fix For: 0.96.0

 Attachments: 5924.v11.patch, 5924.v14.patch, 5924.v5.patch, 
 5924.v9.patch


 The client (in the function HConnectionManager#processBatchCallback) works in 
 two steps:
  - make the requests
  - collect the failures and successes and prepare for retry
 It means that when there is an immediate error (region moved, split, dead 
 server, ...) we still wait for all the initial requests to be executed before 
 submitting again the failed request. If we have a scenario with all the 
 requests taking 5 seconds we have a final execution time of: 5 (initial 
 requests) + 1 (wait time) + 5 (final request) = 11s.
 We could improve this by analyzing immediately the results. This would lead 
 us, for the scenario mentioned above, to 6 seconds. 
 So we could have a performance improvement of nearly 50% in many cases, and 
 much more than 50% if the request execution time is different.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HBASE-5924) In the client code, don't wait for all the requests to be executed before resubmitting a request in error.

2012-06-08 Thread nkeywal (JIRA)

[
https://issues.apache.org/jira/browse/HBASE-5924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13291663#comment-13291663
]

nkeywal commented on HBASE-5924:

bq. With RSTracker gone, the following flag is no longer checked:
Aggreed, but this unit test is supposed to test if the region server aborted
when the coprocessor bugged, not if the regionserver znode is deleted on
regionserver abort. I propose to check if there is an existing test on this in
the tests suite and if not add it in the regionserver test package. I will
comment in this jira if there is already a test, create a new one if I need to
extend an existing test case.

bq. table.close();
Yeah, I checked and it's still works, with or without the close.

In the client code, don't wait for all the requests to be executed before
resubmitting a request in error.
--

Attachments: 5924.v11.patch, 5924.v14.patch, 5924.v5.patch,
5924.v9.patch

[jira] [Commented] (HBASE-5924) In the client code, don't wait for all the requests to be executed before resubmitting a request in error.

2012-06-08 Thread Zhihong Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-5924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13291811#comment-13291811
 ] 

Zhihong Ted Yu commented on HBASE-5924:
---

{code}
$ find hbase-server/src/test -name '*.java' -exec grep 'nized void 
nodeDeleted(Str' {} \; -print
public synchronized void nodeDeleted(String path) {
hbase-server/src/test/java/org/apache/hadoop/hbase/coprocessor/TestMasterCoprocessorExceptionWithAbort.java
public synchronized void nodeDeleted(String path) {
hbase-server/src/test/java/org/apache/hadoop/hbase/coprocessor/TestMasterCoprocessorExceptionWithRemove.java
public synchronized void nodeDeleted(String path) {
hbase-server/src/test/java/org/apache/hadoop/hbase/coprocessor/TestRegionServerCoprocessorExceptionWithAbort.java
{code}
The other two checks are for master znode.

w.r.t. table.close(), it is good programming practice of cleaning up resources.

 In the client code, don't wait for all the requests to be executed before 
 resubmitting a request in error.
 --

 Key: HBASE-5924
 URL: https://issues.apache.org/jira/browse/HBASE-5924
 Project: HBase
  Issue Type: Improvement
  Components: client
Affects Versions: 0.96.0
Reporter: nkeywal
Assignee: nkeywal
 Fix For: 0.96.0

 Attachments: 5924.v11.patch, 5924.v14.patch, 5924.v5.patch, 
 5924.v9.patch


 The client (in the function HConnectionManager#processBatchCallback) works in 
 two steps:
  - make the requests
  - collect the failures and successes and prepare for retry
 It means that when there is an immediate error (region moved, split, dead 
 server, ...) we still wait for all the initial requests to be executed before 
 submitting again the failed request. If we have a scenario with all the 
 requests taking 5 seconds we have a final execution time of: 5 (initial 
 requests) + 1 (wait time) + 5 (final request) = 11s.
 We could improve this by analyzing immediately the results. This would lead 
 us, for the scenario mentioned above, to 6 seconds. 
 So we could have a performance improvement of nearly 50% in many cases, and 
 much more than 50% if the request execution time is different.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HBASE-5924) In the client code, don't wait for all the requests to be executed before resubmitting a request in error.

2012-06-08 Thread nkeywal (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-5924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13291815#comment-13291815
 ] 

nkeywal commented on HBASE-5924:


bq. w.r.t. table.close(), it is good programming practice of cleaning up 
resources.
Yes, I agree. I wanted to say in my previous answer: I tested, it works, it can 
be added. 

 In the client code, don't wait for all the requests to be executed before 
 resubmitting a request in error.
 --

 Key: HBASE-5924
 URL: https://issues.apache.org/jira/browse/HBASE-5924
 Project: HBase
  Issue Type: Improvement
  Components: client
Affects Versions: 0.96.0
Reporter: nkeywal
Assignee: nkeywal
 Fix For: 0.96.0

 Attachments: 5924.v11.patch, 5924.v14.patch, 5924.v5.patch, 
 5924.v9.patch


 The client (in the function HConnectionManager#processBatchCallback) works in 
 two steps:
  - make the requests
  - collect the failures and successes and prepare for retry
 It means that when there is an immediate error (region moved, split, dead 
 server, ...) we still wait for all the initial requests to be executed before 
 submitting again the failed request. If we have a scenario with all the 
 requests taking 5 seconds we have a final execution time of: 5 (initial 
 requests) + 1 (wait time) + 5 (final request) = 11s.
 We could improve this by analyzing immediately the results. This would lead 
 us, for the scenario mentioned above, to 6 seconds. 
 So we could have a performance improvement of nearly 50% in many cases, and 
 much more than 50% if the request execution time is different.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HBASE-5924) In the client code, don't wait for all the requests to be executed before resubmitting a request in error.

2012-06-08 Thread Zhihong Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-5924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13291889#comment-13291889
 ] 

Zhihong Ted Yu commented on HBASE-5924:
---

{code}
+if (loc == null)
+  throw new IOException();
{code}
Without braces, the throw statement should be on the same line as if. Please 
include a brief message for the exception.
Some long lines should be wrapped:
{code}
+final MapHRegionLocation, MultiActionR actionsByServer = new 
HashMapHRegionLocation, MultiActionR();
...
+new TripleMultiActionR, HRegionLocation, 
FutureMultiResponse(e.getValue(), e.getKey(), this.pool.submit(callable));
{code}

 In the client code, don't wait for all the requests to be executed before 
 resubmitting a request in error.
 --

 Key: HBASE-5924
 URL: https://issues.apache.org/jira/browse/HBASE-5924
 Project: HBase
  Issue Type: Improvement
  Components: client
Affects Versions: 0.96.0
Reporter: nkeywal
Assignee: nkeywal
 Fix For: 0.96.0

 Attachments: 5924.v11.patch, 5924.v14.patch, 5924.v5.patch, 
 5924.v9.patch


 The client (in the function HConnectionManager#processBatchCallback) works in 
 two steps:
  - make the requests
  - collect the failures and successes and prepare for retry
 It means that when there is an immediate error (region moved, split, dead 
 server, ...) we still wait for all the initial requests to be executed before 
 submitting again the failed request. If we have a scenario with all the 
 requests taking 5 seconds we have a final execution time of: 5 (initial 
 requests) + 1 (wait time) + 5 (final request) = 11s.
 We could improve this by analyzing immediately the results. This would lead 
 us, for the scenario mentioned above, to 6 seconds. 
 So we could have a performance improvement of nearly 50% in many cases, and 
 much more than 50% if the request execution time is different.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HBASE-5924) In the client code, don't wait for all the requests to be executed before resubmitting a request in error.

2012-06-07 Thread nkeywal (JIRA)

[
https://issues.apache.org/jira/browse/HBASE-5924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13290899#comment-13290899
]

nkeywal commented on HBASE-5924:

@ted

bq. 'origin' - 'original', 'what are the actions to replay' - 'what actions
to replay'
Done.

bq. InterruptedIOException should be thrown.
Done.

bq. The above is hard to read. A period between 'records' and 'results' ? A
period between 'list' and 'walk' ?
It was already there previously :-). But I aggree, it's better with some
periods or dots. Done.

bq. hbase-server/src/main/java/org/apache/hadoop/hbase/util/Triple.java was not
included in patch v9.
Done

@stack

bq. Did you add the history in the first place? Why is it safe to remove it now?
In the previous code we were updating the locations cache multiple times for
the same row, and the second time without the RegionMovedException. So it was
necessary to store that we had already taken the error into account for this
row... We now update the locations cache only once, so we don't need to store
the history anymore.

bq. On your three comments above, on 1., on the unused code, it may not be
triggered by the test suite – that could just be bad test coverage – but
independent, there may have been a reason for it. If your review of
processBatchCallback has it making no sense, by all means purge it (as you have
done).

Yep, for this one removing it allows to simplify the algorithm as I can find
the original actions.

bq. On 2., the callback, it looks like you kept it. I think that sensible. On
3., can we move it to HTable? Deprecate the current version in favor of the new
HTable/HTableInterface version? Would that be too disruptive?

We can keep the existing interface, deprecate it, and add the new one in
HTable, making it call the old one.
Then in the future remove if from HConnection and move the code in HTable.

I've done it in v10.

bq. Any way you can add tests to prove your claims of improvement above (its
hard to review for that...

It's hard. Testing that we restart immediately instead of waiting for all
results is difficult without adding sleeps and/or mocking a lot of things,
because it's not visible at all outside of the method: its interface has not
changed, just the internal algorithm.

Functionally, it's tested through testRegionCaching (with some extra checks in
it in this patch), and it proves that:
- it works on nominal case (and you can't start the mini cluster when the
nominal case does not work).
- it retries when one RS fails
- it stops to retry when the number of retries is reached, and throws the right
exception with the right content

For the performance improvement on nominal case, unfortunately it does not make
a big difference. It's cleaner, but the tests done show that it's not important
vs. the remaining time.

In the client code, don't wait for all the requests to be executed before
resubmitting a request in error.
--

[jira] [Commented] (HBASE-5924) In the client code, don't wait for all the requests to be executed before resubmitting a request in error.

2012-06-07 Thread stack (JIRA)

[
https://issues.apache.org/jira/browse/HBASE-5924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13291069#comment-13291069
]

stack commented on HBASE-5924:
--

Upload v10 nkeywal and we'll commit it. Thanks for responses above.

In the client code, don't wait for all the requests to be executed before
resubmitting a request in error.
--

[jira] [Commented] (HBASE-5924) In the client code, don't wait for all the requests to be executed before resubmitting a request in error.

2012-06-07 Thread nkeywal (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-5924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13291370#comment-13291370
 ] 

nkeywal commented on HBASE-5924:


v11: I changed the names to match HTableInterface#batch, so instead of 
HTableInterface#processBatchCallback I created HTableInterface#batchCallback. 
Local tests in progress.

 In the client code, don't wait for all the requests to be executed before 
 resubmitting a request in error.
 --

 Key: HBASE-5924
 URL: https://issues.apache.org/jira/browse/HBASE-5924
 Project: HBase
  Issue Type: Improvement
  Components: client
Affects Versions: 0.96.0
Reporter: nkeywal
Assignee: nkeywal
 Attachments: 5924.v11.patch, 5924.v5.patch, 5924.v9.patch


 The client (in the function HConnectionManager#processBatchCallback) works in 
 two steps:
  - make the requests
  - collect the failures and successes and prepare for retry
 It means that when there is an immediate error (region moved, split, dead 
 server, ...) we still wait for all the initial requests to be executed before 
 submitting again the failed request. If we have a scenario with all the 
 requests taking 5 seconds we have a final execution time of: 5 (initial 
 requests) + 1 (wait time) + 5 (final request) = 11s.
 We could improve this by analyzing immediately the results. This would lead 
 us, for the scenario mentioned above, to 6 seconds. 
 So we could have a performance improvement of nearly 50% in many cases, and 
 much more than 50% if the request execution time is different.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HBASE-5924) In the client code, don't wait for all the requests to be executed before resubmitting a request in error.

2012-06-07 Thread nkeywal (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-5924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13291375#comment-13291375
 ] 

nkeywal commented on HBASE-5924:


Ok, I got a random failure in a test I touched 
(TestRegionServerCoprocessorExceptionWithAbort), but it's because the test is 
flaky I think (I can be wrong :-) ). I will have a look tomorrow to be sure, 
but I think the patch v11 is reasonable enough to be committed.

 In the client code, don't wait for all the requests to be executed before 
 resubmitting a request in error.
 --

 Key: HBASE-5924
 URL: https://issues.apache.org/jira/browse/HBASE-5924
 Project: HBase
  Issue Type: Improvement
  Components: client
Affects Versions: 0.96.0
Reporter: nkeywal
Assignee: nkeywal
 Attachments: 5924.v11.patch, 5924.v5.patch, 5924.v9.patch


 The client (in the function HConnectionManager#processBatchCallback) works in 
 two steps:
  - make the requests
  - collect the failures and successes and prepare for retry
 It means that when there is an immediate error (region moved, split, dead 
 server, ...) we still wait for all the initial requests to be executed before 
 submitting again the failed request. If we have a scenario with all the 
 requests taking 5 seconds we have a final execution time of: 5 (initial 
 requests) + 1 (wait time) + 5 (final request) = 11s.
 We could improve this by analyzing immediately the results. This would lead 
 us, for the scenario mentioned above, to 6 seconds. 
 So we could have a performance improvement of nearly 50% in many cases, and 
 much more than 50% if the request execution time is different.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HBASE-5924) In the client code, don't wait for all the requests to be executed before resubmitting a request in error.

2012-06-07 Thread Hadoop QA (JIRA)

[
https://issues.apache.org/jira/browse/HBASE-5924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13291434#comment-13291434
]

Hadoop QA commented on HBASE-5924:
--

-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12531326/5924.v11.patch
against trunk revision .

+1 @author. The patch does not contain any @author tags.

+1 tests included. The patch appears to include 6 new or modified tests.

+1 hadoop2.0. The patch compiles against the hadoop 2.0 profile.

+1 javadoc. The javadoc tool did not generate any warning messages.

+1 javac. The applied patch does not increase the total number of javac
compiler warnings.

-1 findbugs. The patch appears to introduce 5 new Findbugs (version 1.3.9)
warnings.

+1 release audit. The applied patch does not increase the total number of
release audit warnings.

-1 core tests. The patch failed these unit tests:

org.apache.hadoop.hbase.coprocessor.TestRegionServerCoprocessorExceptionWithAbort
org.apache.hadoop.hbase.client.TestAdmin

Test results:
https://builds.apache.org/job/PreCommit-HBASE-Build/2125//testReport/
Findbugs warnings:
https://builds.apache.org/job/PreCommit-HBASE-Build/2125//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings:
https://builds.apache.org/job/PreCommit-HBASE-Build/2125//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Console output:
https://builds.apache.org/job/PreCommit-HBASE-Build/2125//console

This message is automatically generated.

In the client code, don't wait for all the requests to be executed before
resubmitting a request in error.
--

[jira] [Commented] (HBASE-5924) In the client code, don't wait for all the requests to be executed before resubmitting a request in error.

2012-06-07 Thread Zhihong Ted Yu (JIRA)

[
https://issues.apache.org/jira/browse/HBASE-5924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13291476#comment-13291476
]

Zhihong Ted Yu commented on HBASE-5924:
---

TestRegionServerCoprocessorExceptionWithAbort failed on QA machine.
Should investigate.

In the client code, don't wait for all the requests to be executed before
resubmitting a request in error.
--

[jira] [Commented] (HBASE-5924) In the client code, don't wait for all the requests to be executed before resubmitting a request in error.

2012-06-06 Thread nkeywal (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-5924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13290339#comment-13290339
 ] 

nkeywal commented on HBASE-5924:


v9, with:
- Ted's comment taken into account
- Full removal of UpdateHistory stuff
- Fix for HBASE-6156

Can be committed imho.

 In the client code, don't wait for all the requests to be executed before 
 resubmitting a request in error.
 --

 Key: HBASE-5924
 URL: https://issues.apache.org/jira/browse/HBASE-5924
 Project: HBase
  Issue Type: Improvement
  Components: client
Affects Versions: 0.96.0
Reporter: nkeywal
Assignee: nkeywal
 Attachments: 5924.v5.patch, 5924.v9.patch


 The client (in the function HConnectionManager#processBatchCallback) works in 
 two steps:
  - make the requests
  - collect the failures and successes and prepare for retry
 It means that when there is an immediate error (region moved, split, dead 
 server, ...) we still wait for all the initial requests to be executed before 
 submitting again the failed request. If we have a scenario with all the 
 requests taking 5 seconds we have a final execution time of: 5 (initial 
 requests) + 1 (wait time) + 5 (final request) = 11s.
 We could improve this by analyzing immediately the results. This would lead 
 us, for the scenario mentioned above, to 6 seconds. 
 So we could have a performance improvement of nearly 50% in many cases, and 
 much more than 50% if the request execution time is different.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HBASE-5924) In the client code, don't wait for all the requests to be executed before resubmitting a request in error.

2012-06-06 Thread stack (JIRA)

[
https://issues.apache.org/jira/browse/HBASE-5924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13290443#comment-13290443
]

stack commented on HBASE-5924:
--

@nkeywal Did you add the history in the first place? Why is it safe to remove
it now?

Thanks for redoing processBatchCallback.

On your three comments above, on 1., on the unused code, it may not be
triggered by the test suite -- that could just be bad test coverage -- but
independent, there may have been a reason for it. If your review of
processBatchCallback has it making no sense, by all means purge it (as you have
done).

On 2., the callback, it looks like you kept it. I think that sensible.

On 3., can we move it to HTable? Deprecate the current version in favor of the
new HTable/HTableInterface version? Would that be too disruptive?

Any way you can add tests to prove your claims of improvement above (its hard
to review for that... but sounds like it at least works... going by it passing
all unit tests... good on you nkeywal)

In the client code, don't wait for all the requests to be executed before
resubmitting a request in error.
--

[jira] [Commented] (HBASE-5924) In the client code, don't wait for all the requests to be executed before resubmitting a request in error.

2012-06-06 Thread Zhihong Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-5924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13290483#comment-13290483
 ] 

Zhihong Ted Yu commented on HBASE-5924:
---

{code}
+  // We need the origin multi action to find out what are the actions 
to replay if
{code}
'origin' - 'original', 'what are the actions to replay' - 'what actions to 
replay'
{code}
+} catch (InterruptedException e) {
+  throw new IOException(e);
{code}
InterruptedIOException should be thrown.
{code}
+  // mutate list so that it is empty for complete success, or contains
+  // only failed records results are returned in the same order as the
+  // requests in list walk the list backwards, so we can remove from list
{code}
The above is hard to read. A period between 'records' and 'results' ? A period 
between 'list' and 'walk' ?

Hadoop QA didn't run tests:
https://builds.apache.org/job/PreCommit-HBASE-Build/2116/console

 In the client code, don't wait for all the requests to be executed before 
 resubmitting a request in error.
 --

 Key: HBASE-5924
 URL: https://issues.apache.org/jira/browse/HBASE-5924
 Project: HBase
  Issue Type: Improvement
  Components: client
Affects Versions: 0.96.0
Reporter: nkeywal
Assignee: nkeywal
 Attachments: 5924.v5.patch, 5924.v9.patch


 The client (in the function HConnectionManager#processBatchCallback) works in 
 two steps:
  - make the requests
  - collect the failures and successes and prepare for retry
 It means that when there is an immediate error (region moved, split, dead 
 server, ...) we still wait for all the initial requests to be executed before 
 submitting again the failed request. If we have a scenario with all the 
 requests taking 5 seconds we have a final execution time of: 5 (initial 
 requests) + 1 (wait time) + 5 (final request) = 11s.
 We could improve this by analyzing immediately the results. This would lead 
 us, for the scenario mentioned above, to 6 seconds. 
 So we could have a performance improvement of nearly 50% in many cases, and 
 much more than 50% if the request execution time is different.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HBASE-5924) In the client code, don't wait for all the requests to be executed before resubmitting a request in error.

2012-06-06 Thread Zhihong Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-5924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13290509#comment-13290509
 ] 

Zhihong Ted Yu commented on HBASE-5924:
---

hbase-server/src/main/java/org/apache/hadoop/hbase/util/Triple.java was not 
included in patch v9.
Hence:
{code}
[ERROR] 
/home/hduser/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java:[2061,23]
 cannot find symbol
[ERROR] symbol  : class Triple
[ERROR] location: class 
org.apache.hadoop.hbase.client.HConnectionManager.HConnectionImplementation.ProcessR
{code}

 In the client code, don't wait for all the requests to be executed before 
 resubmitting a request in error.
 --

 Key: HBASE-5924
 URL: https://issues.apache.org/jira/browse/HBASE-5924
 Project: HBase
  Issue Type: Improvement
  Components: client
Affects Versions: 0.96.0
Reporter: nkeywal
Assignee: nkeywal
 Attachments: 5924.v5.patch, 5924.v9.patch


 The client (in the function HConnectionManager#processBatchCallback) works in 
 two steps:
  - make the requests
  - collect the failures and successes and prepare for retry
 It means that when there is an immediate error (region moved, split, dead 
 server, ...) we still wait for all the initial requests to be executed before 
 submitting again the failed request. If we have a scenario with all the 
 requests taking 5 seconds we have a final execution time of: 5 (initial 
 requests) + 1 (wait time) + 5 (final request) = 11s.
 We could improve this by analyzing immediately the results. This would lead 
 us, for the scenario mentioned above, to 6 seconds. 
 So we could have a performance improvement of nearly 50% in many cases, and 
 much more than 50% if the request execution time is different.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HBASE-5924) In the client code, don't wait for all the requests to be executed before resubmitting a request in error.

2012-06-05 Thread nkeywal (JIRA)

[
https://issues.apache.org/jira/browse/HBASE-5924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13289591#comment-13289591
]

nkeywal commented on HBASE-5924:

Bumped the priority to major to make clear it's a complete rewriting. Tests are
in progress, I will push a version soon anyway to get feedback.
Changes:
Analyze the replies as they come, not in the initial request order
Replay the failed request immediately, not when we have all the replies
Reuse the actions in case of errors instead of recreating the objects
Don't iterate on the results list to find the errors
Don't reiterate on the results list to detail the errors.

Note that I removed the 'updateHistory' list but not the code in case the
feedback shows it should still be used.
Even if it's a one to one implementation, it's preferable to add specific
tests. Will do in a later update. And the current implementation stayed in
HConnectionManager and kept its callback. Happy to change this.

In the client code, don't wait for all the requests to be executed before
resubmitting a request in error.
--

Key: HBASE-5924
URL: https://issues.apache.org/jira/browse/HBASE-5924
Project: HBase
Issue Type: Improvement
Components: client
Affects Versions: 0.96.0
Reporter: nkeywal
Assignee: nkeywal

[jira] [Commented] (HBASE-5924) In the client code, don't wait for all the requests to be executed before resubmitting a request in error.

2012-06-05 Thread Hadoop QA (JIRA)

[
https://issues.apache.org/jira/browse/HBASE-5924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13289646#comment-13289646
]

Hadoop QA commented on HBASE-5924:
--

-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12530992/5924.v5.patch
against trunk revision .

+1 @author. The patch does not contain any @author tags.

-1 tests included. The patch doesn't appear to include any new or modified
tests.
Please justify why no new tests are needed for this
patch.
Also please list what manual steps were performed to
verify this patch.

+1 hadoop2.0. The patch compiles against the hadoop 2.0 profile.

+1 javadoc. The javadoc tool did not generate any warning messages.

+1 javac. The applied patch does not increase the total number of javac
compiler warnings.

-1 findbugs. The patch appears to introduce 6 new Findbugs (version 1.3.9)
warnings.

+1 release audit. The applied patch does not increase the total number of
release audit warnings.

+1 core tests. The patch passed unit tests in .

Test results:
https://builds.apache.org/job/PreCommit-HBASE-Build/2110//testReport/
Findbugs warnings:
https://builds.apache.org/job/PreCommit-HBASE-Build/2110//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings:
https://builds.apache.org/job/PreCommit-HBASE-Build/2110//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Console output:
https://builds.apache.org/job/PreCommit-HBASE-Build/2110//console

This message is automatically generated.

In the client code, don't wait for all the requests to be executed before
resubmitting a request in error.
--

[jira] [Commented] (HBASE-5924) In the client code, don't wait for all the requests to be executed before resubmitting a request in error.

2012-06-05 Thread Zhihong Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-5924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13289703#comment-13289703
 ] 

Zhihong Ted Yu commented on HBASE-5924:
---

Nice work.
{code}
private static class ProcessR {
{code}
Can we name the above class more meaningfully ? javadoc is desirable.
{code}
   * @param sleepTime - sleep time befora actually executing the actions. 
Can be zero.
{code}
'befora' - 'before'
{code}
   for (ActionR aTodo : actionsList) {
{code}
aToDo - anAction ?
{code}
  CallableMultiResponse callable = createDelayedCallable(sleepTime, 
e.getKey(), e.getValue());
  TripleMultiActionR, HRegionLocation, FutureMultiResponse p =
new TripleMultiActionR, HRegionLocation, 
FutureMultiResponse(e.getValue(), e.getKey(), this.pool.submit(callable));
{code}
Wrap the two long lines above.
{code}
  throw new IllegalArgumentException(
argument results must be the same size as argument list);
{code}
It would be nice to include the sizes in exception message.


 In the client code, don't wait for all the requests to be executed before 
 resubmitting a request in error.
 --

 Key: HBASE-5924
 URL: https://issues.apache.org/jira/browse/HBASE-5924
 Project: HBase
  Issue Type: Improvement
  Components: client
Affects Versions: 0.96.0
Reporter: nkeywal
Assignee: nkeywal
 Attachments: 5924.v5.patch


 The client (in the function HConnectionManager#processBatchCallback) works in 
 two steps:
  - make the requests
  - collect the failures and successes and prepare for retry
 It means that when there is an immediate error (region moved, split, dead 
 server, ...) we still wait for all the initial requests to be executed before 
 submitting again the failed request. If we have a scenario with all the 
 requests taking 5 seconds we have a final execution time of: 5 (initial 
 requests) + 1 (wait time) + 5 (final request) = 11s.
 We could improve this by analyzing immediately the results. This would lead 
 us, for the scenario mentioned above, to 6 seconds. 
 So we could have a performance improvement of nearly 50% in many cases, and 
 much more than 50% if the request execution time is different.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HBASE-5924) In the client code, don't wait for all the requests to be executed before resubmitting a request in error.

2012-06-01 Thread nkeywal (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-5924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13287582#comment-13287582
 ] 

nkeywal commented on HBASE-5924:


This leads to a complete rewriting of the processBatchCallback function.
3 comments:
1) I don't see how this piece of code can happen, and I ran the complete test 
suite without getting into this part. Do I miss anything?
{noformat}
  for (PairInteger, Object regionResult : regionResults) {
if (regionResult == null) {
  // if the first/only record is 'null' the entire region 
failed.
  LOG.debug(Failures for region:  +
  Bytes.toStringBinary(regionName) +
  , removing from cache);
} else {
{noformat}

2) The callback is never used internally. Is this something we should keep for 
customer code?

3) Do I move it to HTable? There is a comment saying that it does not belong to 
Connection, and it's true. But it's public, so...



 In the client code, don't wait for all the requests to be executed before 
 resubmitting a request in error.
 --

 Key: HBASE-5924
 URL: https://issues.apache.org/jira/browse/HBASE-5924
 Project: HBase
  Issue Type: Improvement
  Components: client
Affects Versions: 0.96.0
Reporter: nkeywal
Assignee: nkeywal
Priority: Minor

 The client (in the function HConnectionManager#processBatchCallback) works in 
 two steps:
  - make the requests
  - collect the failures and successes and prepare for retry
 It means that when there is an immediate error (region moved, split, dead 
 server, ...) we still wait for all the initial requests to be executed before 
 submitting again the failed request. If we have a scenario with all the 
 requests taking 5 seconds we have a final execution time of: 5 (initial 
 requests) + 1 (wait time) + 5 (final request) = 11s.
 We could improve this by analyzing immediately the results. This would lead 
 us, for the scenario mentioned above, to 6 seconds. 
 So we could have a performance improvement of nearly 50% in many cases, and 
 much more than 50% if the request execution time is different.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HBASE-5924) In the client code, don't wait for all the requests to be executed before resubmitting a request in error.

2012-06-01 Thread Zhihong Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-5924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13287595#comment-13287595
 ] 

Zhihong Yu commented on HBASE-5924:
---

For #2 above, I think we can remove the callback in 0.96

 In the client code, don't wait for all the requests to be executed before 
 resubmitting a request in error.
 --

 Key: HBASE-5924
 URL: https://issues.apache.org/jira/browse/HBASE-5924
 Project: HBase
  Issue Type: Improvement
  Components: client
Affects Versions: 0.96.0
Reporter: nkeywal
Assignee: nkeywal
Priority: Minor

 The client (in the function HConnectionManager#processBatchCallback) works in 
 two steps:
  - make the requests
  - collect the failures and successes and prepare for retry
 It means that when there is an immediate error (region moved, split, dead 
 server, ...) we still wait for all the initial requests to be executed before 
 submitting again the failed request. If we have a scenario with all the 
 requests taking 5 seconds we have a final execution time of: 5 (initial 
 requests) + 1 (wait time) + 5 (final request) = 11s.
 We could improve this by analyzing immediately the results. This would lead 
 us, for the scenario mentioned above, to 6 seconds. 
 So we could have a performance improvement of nearly 50% in many cases, and 
 much more than 50% if the request execution time is different.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HBASE-5924) In the client code, don't wait for all the requests to be executed before resubmitting a request in error.

2012-05-03 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-5924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13267934#comment-13267934
 ] 

stack commented on HBASE-5924:
--

+1

 In the client code, don't wait for all the requests to be executed before 
 resubmitting a request in error.
 --

 Key: HBASE-5924
 URL: https://issues.apache.org/jira/browse/HBASE-5924
 Project: HBase
  Issue Type: Improvement
  Components: client
Affects Versions: 0.96.0
Reporter: nkeywal
Assignee: nkeywal
Priority: Minor

 The client (in the function HConnectionManager#processBatchCallback) works in 
 two steps:
  - make the requests
  - collect the failures and successes and prepare for retry
 It means that when there is an immediate error (region moved, split, dead 
 server, ...) we still wait for all the initial requests to be executed before 
 submitting again the failed request. If we have a scenario with all the 
 requests taking 5 seconds we have a final execution time of: 5 (initial 
 requests) + 1 (wait time) + 5 (final request) = 11s.
 We could improve this by analyzing immediately the results. This would lead 
 us, for the scenario mentioned above, to 6 seconds. 
 So we could have a performance improvement of nearly 50% in many cases, and 
 much more than 50% if the request execution time is different.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HBASE-5924) In the client code, don't wait for all the requests to be executed before resubmitting a request in error.

[jira] [Commented] (HBASE-5924) In the client code, don't wait for all the requests to be executed before resubmitting a request in error.

[jira] [Commented] (HBASE-5924) In the client code, don't wait for all the requests to be executed before resubmitting a request in error.

[jira] [Commented] (HBASE-5924) In the client code, don't wait for all the requests to be executed before resubmitting a request in error.

[jira] [Commented] (HBASE-5924) In the client code, don't wait for all the requests to be executed before resubmitting a request in error.

[jira] [Commented] (HBASE-5924) In the client code, don't wait for all the requests to be executed before resubmitting a request in error.

[jira] [Commented] (HBASE-5924) In the client code, don't wait for all the requests to be executed before resubmitting a request in error.

[jira] [Commented] (HBASE-5924) In the client code, don't wait for all the requests to be executed before resubmitting a request in error.

[jira] [Commented] (HBASE-5924) In the client code, don't wait for all the requests to be executed before resubmitting a request in error.

[jira] [Commented] (HBASE-5924) In the client code, don't wait for all the requests to be executed before resubmitting a request in error.

[jira] [Commented] (HBASE-5924) In the client code, don't wait for all the requests to be executed before resubmitting a request in error.

[jira] [Commented] (HBASE-5924) In the client code, don't wait for all the requests to be executed before resubmitting a request in error.

[jira] [Commented] (HBASE-5924) In the client code, don't wait for all the requests to be executed before resubmitting a request in error.

[jira] [Commented] (HBASE-5924) In the client code, don't wait for all the requests to be executed before resubmitting a request in error.

[jira] [Commented] (HBASE-5924) In the client code, don't wait for all the requests to be executed before resubmitting a request in error.

[jira] [Commented] (HBASE-5924) In the client code, don't wait for all the requests to be executed before resubmitting a request in error.

[jira] [Commented] (HBASE-5924) In the client code, don't wait for all the requests to be executed before resubmitting a request in error.

[jira] [Commented] (HBASE-5924) In the client code, don't wait for all the requests to be executed before resubmitting a request in error.

[jira] [Commented] (HBASE-5924) In the client code, don't wait for all the requests to be executed before resubmitting a request in error.

[jira] [Commented] (HBASE-5924) In the client code, don't wait for all the requests to be executed before resubmitting a request in error.

[jira] [Commented] (HBASE-5924) In the client code, don't wait for all the requests to be executed before resubmitting a request in error.

[jira] [Commented] (HBASE-5924) In the client code, don't wait for all the requests to be executed before resubmitting a request in error.

[jira] [Commented] (HBASE-5924) In the client code, don't wait for all the requests to be executed before resubmitting a request in error.

[jira] [Commented] (HBASE-5924) In the client code, don't wait for all the requests to be executed before resubmitting a request in error.

[jira] [Commented] (HBASE-5924) In the client code, don't wait for all the requests to be executed before resubmitting a request in error.

[jira] [Commented] (HBASE-5924) In the client code, don't wait for all the requests to be executed before resubmitting a request in error.

[jira] [Commented] (HBASE-5924) In the client code, don't wait for all the requests to be executed before resubmitting a request in error.

[jira] [Commented] (HBASE-5924) In the client code, don't wait for all the requests to be executed before resubmitting a request in error.

[jira] [Commented] (HBASE-5924) In the client code, don't wait for all the requests to be executed before resubmitting a request in error.

[jira] [Commented] (HBASE-5924) In the client code, don't wait for all the requests to be executed before resubmitting a request in error.

[jira] [Commented] (HBASE-5924) In the client code, don't wait for all the requests to be executed before resubmitting a request in error.

[jira] [Commented] (HBASE-5924) In the client code, don't wait for all the requests to be executed before resubmitting a request in error.

32 matches

Site Navigation

Mail list logo

Footer information