[jira] [Commented] (HBASE-18061) [C++] Fix retry logic in multi-get calls

2017-07-14 Thread Enis Soztutar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16087897#comment-16087897
 ] 

Enis Soztutar commented on HBASE-18061:
---

Thanks Sudeep for the patch. 

I've run the unit tests which pass. I was also be to run simple-client doing 
multi-gets against 500K rows with batch sizes 1,10,100,1000,1 and 10.  

A batch size of 500K results in  
{code}
 what():  LifoSemMPMCQueue full, can't add item
{code}
but it is fine for now. 

Why are we doing try lock here? What happens when we cannot acquire the lock? 
{code}
-std::lock_guard lock(multi_mutex_);
-for (uint64_t num = 0; num < completed_responses.size(); ++num) {
+std::unique_lock lock(multi_mutex_, std::try_to_lock);
{code}
We are doing this in multiple places actually, and we never check the results 
of whether the lock was acquired or not. Seems something to fix before commit.  

remove these from the patch: 
{code}
+//#include 
+//using ::testing::Return;
+//using ::testing::_;
{code}

I was double checking the logic again against the java implementation. It seems 
that we are missing the function: 
{code}
  private List removeErrors(Action action) {
synchronized (action2Errors) {
  return action2Errors.remove(action);
}
  }
{code} 
which is called from failOne() etc, and removes the error from the 
actions2Errors map. However, logically, I think it is fine to not have this 
removal logic. Just making a note here in case we run into it later. 
- Mainly a code cleanness issue, but you should move this logic: 
{code}
auto itr = search->second->actions_by_region().find(region_loc->region_name());
  if (itr != search->second->actions_by_region().end()) {
search->second->AddActionsByRegion(region_loc, action);
  } else {
search->second = std::make_shared(region_loc);
search->second->AddActionsByRegion(region_loc, action);
  }
{code}
to be inside ServerRequest::AddActionsByRegion(). The logic for "add the action 
to the list of actions by region by finding the region" belongs in this method, 
and should not be exposed outside normally. 

I think there is no multi-region test with failure conditions. We can look at 
adding that in a follow up issue if needed. 

In {{::Send()}}, this logic: 
{code}
if (!ExceptionUtil::ShouldRetry(ew)) {
  std::vector failed_actions;
  for (auto  : action_by_server.second->actions_by_region()) {
for (const auto _action : value.second->actions()) {
  failed_actions.push_back(failed_action);
}
  }
  FailAll(failed_actions, tries);
} else {
  OnError(action_by_server.second->actions_by_region(), tries, ew,
  action_by_server.first);
}
{code}
can be replaced by a simple OnError() call, no? OnError() already checks 
whether exception should retry and calls FailAll() with the same vector of 
actions. Can you please double check. 


Other than these patch looks pretty good. 

> [C++] Fix retry logic in multi-get calls
> 
>
> Key: HBASE-18061
> URL: https://issues.apache.org/jira/browse/HBASE-18061
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Enis Soztutar
>Assignee: Sudeep Sunthankar
> Fix For: HBASE-14850
>
> Attachments: HBASE-18061.HBASE-14850.v1.patch, 
> HBASE-18061.HBASE-14850.v3.patch, HBASE-18061.HBASE-14850.v5.patch, 
> HBASE-18061.HBASE-14850.v6.patch, HBASE-18061.HBASE-14850.v7.patch, 
> hbase-18061-v8.patch
>
>
> HBASE-17576 adds multi-gets. There are a couple of todos to fix in the retry 
> logic, and some unit testing to be done for the multi-gets. We'll do these in 
> this issue. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-18061) [C++] Fix retry logic in multi-get calls

2017-07-13 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16086830#comment-16086830
 ] 

Hadoop QA commented on HBASE-18061:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m  
0s{color} | {color:blue} Docker mode activated. {color} |
| {color:red}-1{color} | {color:red} patch {color} | {color:red}  0m  4s{color} 
| {color:red} HBASE-18061 does not apply to master. Rebase required? Wrong 
Branch? See https://yetus.apache.org/documentation/0.4.0/precommit-patchnames 
for help. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Issue | HBASE-18061 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12877223/hbase-18061-v8.patch |
| Console output | 
https://builds.apache.org/job/PreCommit-HBASE-Build/7654/console |
| Powered by | Apache Yetus 0.4.0   http://yetus.apache.org |


This message was automatically generated.



> [C++] Fix retry logic in multi-get calls
> 
>
> Key: HBASE-18061
> URL: https://issues.apache.org/jira/browse/HBASE-18061
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Enis Soztutar
>Assignee: Sudeep Sunthankar
> Fix For: HBASE-14850
>
> Attachments: HBASE-18061.HBASE-14850.v1.patch, 
> HBASE-18061.HBASE-14850.v3.patch, HBASE-18061.HBASE-14850.v5.patch, 
> HBASE-18061.HBASE-14850.v6.patch, HBASE-18061.HBASE-14850.v7.patch, 
> hbase-18061-v8.patch
>
>
> HBASE-17576 adds multi-gets. There are a couple of todos to fix in the retry 
> logic, and some unit testing to be done for the multi-gets. We'll do these in 
> this issue. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-18061) [C++] Fix retry logic in multi-get calls

2017-07-12 Thread Sudeep Sunthankar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16084716#comment-16084716
 ] 

Sudeep Sunthankar commented on HBASE-18061:
---

Thanks [~tedyu], yes I can see there are some issues. I have rectified the same 
and testing the patch. 

> [C++] Fix retry logic in multi-get calls
> 
>
> Key: HBASE-18061
> URL: https://issues.apache.org/jira/browse/HBASE-18061
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Enis Soztutar
>Assignee: Sudeep Sunthankar
> Fix For: HBASE-14850
>
> Attachments: HBASE-18061.HBASE-14850.v1.patch, 
> HBASE-18061.HBASE-14850.v3.patch, HBASE-18061.HBASE-14850.v5.patch, 
> HBASE-18061.HBASE-14850.v6.patch, HBASE-18061.HBASE-14850.v7.patch
>
>
> HBASE-17576 adds multi-gets. There are a couple of todos to fix in the retry 
> logic, and some unit testing to be done for the multi-gets. We'll do these in 
> this issue. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-18061) [C++] Fix retry logic in multi-get calls

2017-07-12 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16084639#comment-16084639
 ] 

Ted Yu commented on HBASE-18061:


After resolving the above conflict, I got the following compilation errors:
{code}
core/response-converter.cc:154:55: error: no viable conversion from 
'shared_ptr' to 'shared_ptr'
  multi_response->AddRegionException(region_name, ew);
  ^~
core/response-converter.cc:174:57: error: no viable conversion from 
'shared_ptr' to 'shared_ptr'
multi_response->AddRegionException(region_name, ew);
^~
{code}

> [C++] Fix retry logic in multi-get calls
> 
>
> Key: HBASE-18061
> URL: https://issues.apache.org/jira/browse/HBASE-18061
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Enis Soztutar
>Assignee: Sudeep Sunthankar
> Fix For: HBASE-14850
>
> Attachments: HBASE-18061.HBASE-14850.v1.patch, 
> HBASE-18061.HBASE-14850.v3.patch, HBASE-18061.HBASE-14850.v5.patch, 
> HBASE-18061.HBASE-14850.v6.patch, HBASE-18061.HBASE-14850.v7.patch
>
>
> HBASE-17576 adds multi-gets. There are a couple of todos to fix in the retry 
> logic, and some unit testing to be done for the multi-gets. We'll do these in 
> this issue. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-18061) [C++] Fix retry logic in multi-get calls

2017-07-12 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16084209#comment-16084209
 ] 

Ted Yu commented on HBASE-18061:


I tried to apply patch v7 but saw:

1 out of 16 hunks FAILED -- saving rejects to file 
hbase-native-client/core/async-batch-rpc-retrying-caller.cc.rej

> [C++] Fix retry logic in multi-get calls
> 
>
> Key: HBASE-18061
> URL: https://issues.apache.org/jira/browse/HBASE-18061
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Enis Soztutar
>Assignee: Sudeep Sunthankar
> Fix For: HBASE-14850
>
> Attachments: HBASE-18061.HBASE-14850.v1.patch, 
> HBASE-18061.HBASE-14850.v3.patch, HBASE-18061.HBASE-14850.v5.patch, 
> HBASE-18061.HBASE-14850.v6.patch, HBASE-18061.HBASE-14850.v7.patch
>
>
> HBASE-17576 adds multi-gets. There are a couple of todos to fix in the retry 
> logic, and some unit testing to be done for the multi-gets. We'll do these in 
> this issue. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-18061) [C++] Fix retry logic in multi-get calls

2017-07-11 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16083346#comment-16083346
 ] 

Hadoop QA commented on HBASE-18061:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m  
0s{color} | {color:blue} Docker mode activated. {color} |
| {color:red}-1{color} | {color:red} patch {color} | {color:red}  0m  8s{color} 
| {color:red} HBASE-18061 does not apply to HBASE-14850. Rebase required? Wrong 
Branch? See https://yetus.apache.org/documentation/0.4.0/precommit-patchnames 
for help. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Issue | HBASE-18061 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12876747/HBASE-18061.HBASE-14850.v7.patch
 |
| Console output | 
https://builds.apache.org/job/PreCommit-HBASE-Build/7626/console |
| Powered by | Apache Yetus 0.4.0   http://yetus.apache.org |


This message was automatically generated.



> [C++] Fix retry logic in multi-get calls
> 
>
> Key: HBASE-18061
> URL: https://issues.apache.org/jira/browse/HBASE-18061
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Enis Soztutar
>Assignee: Sudeep Sunthankar
> Fix For: HBASE-14850
>
> Attachments: HBASE-18061.HBASE-14850.v1.patch, 
> HBASE-18061.HBASE-14850.v3.patch, HBASE-18061.HBASE-14850.v5.patch, 
> HBASE-18061.HBASE-14850.v6.patch, HBASE-18061.HBASE-14850.v7.patch
>
>
> HBASE-17576 adds multi-gets. There are a couple of todos to fix in the retry 
> logic, and some unit testing to be done for the multi-gets. We'll do these in 
> this issue. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-18061) [C++] Fix retry logic in multi-get calls

2017-07-11 Thread Enis Soztutar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16083256#comment-16083256
 ] 

Enis Soztutar commented on HBASE-18061:
---

Thanks Sudeep. Can you please rebase the patch. It does not apply cleanly to 
the top of HBASE-14850 branch. Also, you are missing the file for the test in 
the patch. 

> [C++] Fix retry logic in multi-get calls
> 
>
> Key: HBASE-18061
> URL: https://issues.apache.org/jira/browse/HBASE-18061
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Enis Soztutar
>Assignee: Sudeep Sunthankar
> Fix For: HBASE-14850
>
> Attachments: HBASE-18061.HBASE-14850.v1.patch, 
> HBASE-18061.HBASE-14850.v3.patch, HBASE-18061.HBASE-14850.v5.patch, 
> HBASE-18061.HBASE-14850.v6.patch
>
>
> HBASE-17576 adds multi-gets. There are a couple of todos to fix in the retry 
> logic, and some unit testing to be done for the multi-gets. We'll do these in 
> this issue. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-18061) [C++] Fix retry logic in multi-get calls

2017-07-10 Thread Enis Soztutar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16081243#comment-16081243
 ] 

Enis Soztutar commented on HBASE-18061:
---

Thanks Sudeep for the updated patch. 
Can you please clean up this part: 
{code}
 if (search != actions_by_server.end()) {
+  // Multiple regions might be hosted on the same server 
(ServerName)
+  // Check if a Region is already present in the map or not. If 
not add it
+  // auto actions_by_region_name = 
search->second->actions_by_region();
+  auto itr = 
search->second->actions_by_region().find(region_loc->region_name());
+  if (itr != search->second->actions_by_region().end()) {
 search->second->AddActionsByRegion(region_loc, action);
   } else {
-  // Create new key
+// Create new key for the region location region_loc
+search->second = std::make_shared(region_loc);
+search->second->AddActionsByRegion(region_loc, action);
+// search->second = server_request;
+  }
+} else {
+  // Create new key for this ServerName
{code} 

- Can you also cleap up the VLOG's a little bit. We can use VLOG(5) for very 
excessive stuff, VLOG(1) for most of the exceptions, and anything in between. 
VLOG(8) is too much. 
- Please remove all commented out code like this: 
{code}
+//#include "connection/request.h"
..
//LOG(INFO) << "tresults size :- " << tresults.size();
{code}
- Did you run bin/format-code.sh and make lint ? 
- remove these: 
{code}
//TODO This shouldn't be in THROW, but we are getting incorrect RL's at the 
moment
{code}
Some of these tests were expected to fail, no? 
- You need to enable this test? 
{code}
+TEST_F(AsyncBatchRpcRetryTest, TestFailWithException) {
+  std::shared_ptr region_locator(
+  std::make_shared(5));
+  EXPECT_ANY_THROW(runMultiTest(region_locator, "table3"));
+  //TODO will be in EXPECT_ANY_THROW
+  // Not failing.
+  //runMultiTest(region_locator, "table3");
+}
{code}
- Undo this: 
{code}
-test_util->StartMiniCluster(2);
+test_util->StartMiniCluster(10);
{code}
- Please go over the patch again, and make sure that there is no commented-out 
code lying around. 
- This test is good {{+TEST_F(ClientTest, MultiGetsWithRegionSplits)}}, but 
there is a lot of code duplication. Maybe move the logic to a method, and call 
it from the MutlGets and MultiGetsWithRegionSplits tests. 


> [C++] Fix retry logic in multi-get calls
> 
>
> Key: HBASE-18061
> URL: https://issues.apache.org/jira/browse/HBASE-18061
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Enis Soztutar
>Assignee: Sudeep Sunthankar
> Fix For: HBASE-14850
>
> Attachments: HBASE-18061.HBASE-14850.v1.patch, 
> HBASE-18061.HBASE-14850.v3.patch, HBASE-18061.HBASE-14850.v5.patch
>
>
> HBASE-17576 adds multi-gets. There are a couple of todos to fix in the retry 
> logic, and some unit testing to be done for the multi-gets. We'll do these in 
> this issue. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-18061) [C++] Fix retry logic in multi-get calls

2017-06-06 Thread Sudeep Sunthankar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16039869#comment-16039869
 ] 

Sudeep Sunthankar commented on HBASE-18061:
---

Thanks for the review [~enis]. I will have a look at HBASE-17907. Till I get 
the mocks working, I have changed some part of the code to send requests to a 
dummy region. This results in generation of exceptions as well as retries. 

> [C++] Fix retry logic in multi-get calls
> 
>
> Key: HBASE-18061
> URL: https://issues.apache.org/jira/browse/HBASE-18061
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Enis Soztutar
>Assignee: Sudeep Sunthankar
> Fix For: HBASE-14850
>
> Attachments: HBASE-18061.HBASE-14850.v1.patch
>
>
> HBASE-17576 adds multi-gets. There are a couple of todos to fix in the retry 
> logic, and some unit testing to be done for the multi-gets. We'll do these in 
> this issue. 



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HBASE-18061) [C++] Fix retry logic in multi-get calls

2017-06-06 Thread Enis Soztutar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16039859#comment-16039859
 ] 

Enis Soztutar commented on HBASE-18061:
---

For this: 
{code}
+  // TODO We don't get an info on whether to retry an exception or not. Should 
have a method to
{code}
It is unfortunate that the wire protocol for this does not contain information 
about whether the exception should be retried or not. We are just handling that 
in the java client layer by relying on the exception hierarchy to catch 
DoNotRetryExceptions in onError(). 

I have actually added bunch of java class names, and specific logic in the 
patch for HBASE-17907 (look at exception.h). Let me compile a list of all 
exceptions and enhance the ExceptionUtils::ShouldRetry(). 

bq. +  // We are just appending stack trace as NameBytesPair.value contains the 
entire info
Looking at ProtobufUtil.java #toException(), it seems that the value is the 
description rather than the stack trace. But we can re-use the stack trace 
field for this. 

exception-utils.h -> Sorry, in HBASE-17907 patch, I had to do the same thing. 
So we have this already in exception.h file. When you rebase you can use it 
instead. 

How are you testing this? 



> [C++] Fix retry logic in multi-get calls
> 
>
> Key: HBASE-18061
> URL: https://issues.apache.org/jira/browse/HBASE-18061
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Enis Soztutar
>Assignee: Sudeep Sunthankar
> Fix For: HBASE-14850
>
> Attachments: HBASE-18061.HBASE-14850.v1.patch
>
>
> HBASE-17576 adds multi-gets. There are a couple of todos to fix in the retry 
> logic, and some unit testing to be done for the multi-gets. We'll do these in 
> this issue. 



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)