[ 
https://issues.apache.org/jira/browse/HBASE-18796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16176299#comment-16176299
 ] 

Abhishek Singh Chouhan commented on HBASE-18796:
------------------------------------------------

[~apurtell] The addendum might not be the correct solution to this problem and 
might cause problems elsewhere. I think we should hold off on committing it.
I looked into the behavior a bit more. ResultScanner#next is documented to 
return either a Result or null if the scanner is exhausted. That doesn't seem 
to hold here, which is why we hit this issue.
In the client scanner, after
    values = call(callable, caller, scannerTimeout, true);
we get a single element in the Result[] values array. We then proceed to
    Result[] resultsToAddToCache =
        scanResultCache.addAndGet(values, callable.isHeartbeatMessage());
    int numberOfCompleteRows =
        scanResultCache.numberOfCompleteRows() - numberOfCompleteRowsBefore;

In CompleteScanResultCache#addAndGet we have the code:
{code}
Result last = results[results.length - 1];
if (last.mayHaveMoreCellsInRow()) {
  if (partialResults.isEmpty()) {
    partialResults.add(last);
    return updateNumberOfCompleteResultsAndReturn(
        Arrays.copyOf(results, results.length - 1));
      }
{code}
Here, since results.length == 1 and last.mayHaveMoreCellsInRow() is true, we 
add the result we got to partialResults, but the number of completed results 
is 0. We then end up in (I have to look more into this part)
    if (scan.getLimit() == 0 || scanExhausted(values)) {
      closeScanner();
      closed = true;
      break;
    }
and return null to the client, which should not be the case. We should 
open another JIRA for this. Setting allowpartial in ConnectionImplementation 
might be the wrong thing to do: it has other side effects and might hide the 
actual problem.
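
To make the state described above concrete, here is a minimal, self-contained 
sketch. It is not the HBase code: FakeResult and FakeCompleteCache are 
hypothetical stand-ins that mirror only the quoted branch of 
CompleteScanResultCache#addAndGet, and the sketch makes no claim about what the 
client scanner loop does afterwards.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Simplified model for illustration only; these are NOT the real HBase classes.
final class PartialResultSketch {

  /** Hypothetical minimal Result: a row key plus a "more cells may follow" flag. */
  static final class FakeResult {
    final String row;
    final boolean mayHaveMoreCellsInRow;

    FakeResult(String row, boolean mayHaveMoreCellsInRow) {
      this.row = row;
      this.mayHaveMoreCellsInRow = mayHaveMoreCellsInRow;
    }
  }

  /** Hypothetical cache that mirrors only the branch quoted in the comment. */
  static final class FakeCompleteCache {
    final List<FakeResult> partialResults = new ArrayList<>();

    FakeResult[] addAndGet(FakeResult[] results) {
      if (results.length == 0) {
        return results;
      }
      FakeResult last = results[results.length - 1];
      if (last.mayHaveMoreCellsInRow && partialResults.isEmpty()) {
        // Buffer the trailing partial result and return everything before it.
        partialResults.add(last);
        return Arrays.copyOf(results, results.length - 1);
      }
      return results;
    }
  }

  public static void main(String[] args) {
    FakeCompleteCache cache = new FakeCompleteCache();
    // A single result that may have more cells in its row, as in the comment.
    FakeResult[] values = { new FakeResult("row1", true) };
    FakeResult[] complete = cache.addAndGet(values);
    System.out.println("complete results: " + complete.length);          // 0
    System.out.println("buffered partials: " + cache.partialResults.size()); // 1
  }
}
```

With a single partial result, the caller gets zero complete results back even 
though data was received, which is the situation the scanner loop then has to 
handle.
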
Someone more familiar with the partial scanning side should have more 
insight. I'll dig into this more tomorrow. In the meantime, if we just want to 
fix the test, we can add isTableEnabled() to the wait for the split, which 
would make it more robust, since checking only the number of regions can 
falsely indicate a successful split.
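
A rough sketch of the more robust wait suggested above, assuming the test can 
poll both the region count and the enabled state. TableState, waitFor and 
waitForSplit are hypothetical names for illustration, not the HBase test 
utilities; a real test would call the Admin API instead.

```java
import java.util.function.BooleanSupplier;

// Illustrative sketch only; names here are assumptions, not HBase APIs.
final class SplitWaitSketch {

  /** Hypothetical view of the table state that a test could poll. */
  interface TableState {
    int regionCount();
    boolean isTableEnabled();
  }

  /** Polls the condition until it holds or the timeout elapses. */
  static boolean waitFor(long timeoutMs, long intervalMs, BooleanSupplier condition) {
    long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
    while (true) {
      if (condition.getAsBoolean()) {
        return true;
      }
      if (System.nanoTime() >= deadline) {
        return false;
      }
      try {
        Thread.sleep(intervalMs);
      } catch (InterruptedException e) {
        // Restore the interrupt flag and give up waiting.
        Thread.currentThread().interrupt();
        return false;
      }
    }
  }

  /** Waits for the expected region count AND for the table to report enabled. */
  static boolean waitForSplit(TableState table, int expectedRegions, long timeoutMs) {
    return waitFor(timeoutMs, 10,
        () -> table.regionCount() >= expectedRegions && table.isTableEnabled());
  }
}
```

The point of combining the two checks is that the region count alone can reach 
the expected value before the split has actually completed.
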

[~tedyu] [~apurtell] [~lhofhansl]

> Admin#isTableAvailable returns incorrect result before daughter regions are 
> opened
> ----------------------------------------------------------------------------------
>
>                 Key: HBASE-18796
>                 URL: https://issues.apache.org/jira/browse/HBASE-18796
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 1.3.1
>            Reporter: Abhishek Singh Chouhan
>            Assignee: Abhishek Singh Chouhan
>             Fix For: 2.0.0, 3.0.0, 1.4.0, 1.3.2, 1.5.0
>
>         Attachments: HBASE-18796-addendum.branch-1.patch, 
> HBASE-18796-addendum.master.patch, HBASE-18796.branch-1.001.patch, 
> HBASE-18796.branch-1.001.patch, HBASE-18796.branch-1.002.patch, 
> HBASE-18796.branch-1.003.patch, HBASE-18796.master.001.patch
>
>
> Admin#isTableAvailable checks if it can getServerName for the meta entries it 
> reads. During a split, server locations are added to the meta entries in 
> MetaTableAccessor#splitRegion, although the description of the method says 
> "Does not add the location information to the daughter regions since they are 
> not open yet.". At this point during the split the daughter regions are not 
> actually open, so we can get to a state where the parent is offline and the 
> daughters are not yet open, but isTableAvailable returns true.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
