[
https://issues.apache.org/jira/browse/HBASE-18796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16174755#comment-16174755
]
Abhishek Singh Chouhan commented on HBASE-18796:
------------------------------------------------
Spent some time looking at the failure. It looks like a problem elsewhere that
this change surfaced.
The test does a split and then tries a batch get, which fails with table not
found although the table exists. This happens because, now that we do not put
daughter locations in meta before they're actually opened on the regionserver,
we run into NoServerForRegionException in
ConnectionImplementation#locateRegionInMeta. That should be fine, since there
are retries which should succeed as soon as the region is opened. However, our
retry fails on a TableNotFoundException here
{code}
try (ReversedClientScanner rcs =
    new ReversedClientScanner(conf, s, TableName.META_TABLE_NAME, this,
        rpcCallerFactory, rpcControllerFactory, getMetaLookupPool(),
        metaReplicaCallTimeoutScanInMicroSecond)) {
  regionInfoRow = rcs.next();
}
if (regionInfoRow == null) {
  throw new TableNotFoundException(tableName);
}
{code}
During one of the retries, the result that we get has mayHaveMoreCellsInRow()
true. Since we don't have setAllowPartialResults(true) set on our scan, we use
CompleteScanResultCache, which won't return a partial result to the client. We
got only 1 row, and it has mayHaveMoreCellsInRow() true, so regionInfoRow ends
up null. After I add
{code}
  s.addFamily(HConstants.CATALOG_FAMILY);
  s.setOneRowLimit();
+ s.setAllowPartialResults(true);
  if (this.useMetaReplicas) {
    s.setConsistency(Consistency.TIMELINE);
  }
{code}
the client is able to ride over the split during its retries and the test
passes.
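To illustrate the behaviour described above, here is a minimal, self-contained
sketch. It is a simplified model and not HBase's actual CompleteScanResultCache:
the Chunk class and the firstVisibleRow method are hypothetical names invented
for this example. It only shows why a scanner that insists on complete rows
hands back nothing when the single row it received is flagged as possibly
having more cells.

```java
import java.util.Collections;
import java.util.List;

public class PartialResultDemo {
    /** Hypothetical stand-in for one scan result returned by the server. */
    static final class Chunk {
        final String row;
        final boolean mayHaveMoreCellsInRow;

        Chunk(String row, boolean mayHaveMoreCellsInRow) {
            this.row = row;
            this.mayHaveMoreCellsInRow = mayHaveMoreCellsInRow;
        }
    }

    /**
     * Simplified model of what the caller sees from one response.
     * With allowPartialResults == false, a chunk flagged as partial is held
     * back, so the caller may see null even though a row was received.
     */
    static String firstVisibleRow(List<Chunk> response, boolean allowPartialResults) {
        for (Chunk c : response) {
            if (allowPartialResults || !c.mayHaveMoreCellsInRow) {
                return c.row;
            }
            // Complete-row mode: keep waiting for the rest of this row.
        }
        return null;
    }

    public static void main(String[] args) {
        // One row came back, flagged as possibly incomplete.
        List<Chunk> response =
            Collections.singletonList(new Chunk("meta-row", true));

        // Complete-row mode: nothing is visible, which the caller in the
        // snippet above would treat as "table not found".
        System.out.println(firstVisibleRow(response, false));

        // Partial results allowed: the row is visible immediately.
        System.out.println(firstVisibleRow(response, true));
    }
}
```

In this model the null in the first case corresponds to regionInfoRow being
null in locateRegionInMeta.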
[~tedyu] [~apurtell] This issue seems to be something that can be hit during
any other retry in locateRegionInMeta too: when mayHaveMoreCellsInRow() is true
for the meta scan, the client gets TableNotFoundException and does not retry. I
can open another JIRA for this if that sounds good.
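As a rough sketch of why this matters for retries, assume a lookup loop that
retries most IOExceptions but rethrows a not-found failure immediately. All
names below (RetryDemo, FatalLookupException, TransientLookupException, Lookup,
locateWithRetries) are hypothetical and only model the control flow; this is
not the HBase client code.

```java
import java.io.IOException;

public class RetryDemo {
    /** Hypothetical failure the loop treats as fatal (not-found style). */
    static class FatalLookupException extends IOException {
        private static final long serialVersionUID = 1L;
    }

    /** Hypothetical failure the loop treats as transient. */
    static class TransientLookupException extends IOException {
        private static final long serialVersionUID = 1L;
    }

    /** Hypothetical lookup callback; attempt numbers start at 0. */
    interface Lookup {
        String locate(int attempt) throws IOException;
    }

    static String locateWithRetries(Lookup lookup, int maxAttempts) throws IOException {
        IOException last = null;
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            try {
                return lookup.locate(attempt);
            } catch (FatalLookupException e) {
                // Rethrown at once: the remaining attempts are never used.
                throw e;
            } catch (IOException e) {
                // Anything else is remembered and retried.
                last = e;
            }
        }
        throw last != null ? last : new IOException("no attempts made");
    }

    public static void main(String[] args) throws IOException {
        // A transient failure on the first two attempts is ridden over.
        String server = locateWithRetries(attempt -> {
            if (attempt < 2) {
                throw new TransientLookupException();
            }
            return "regionserver-1";
        }, 5);
        System.out.println(server);

        // A fatal failure on the first attempt ends the lookup immediately.
        try {
            locateWithRetries(attempt -> {
                throw new FatalLookupException();
            }, 5);
        } catch (FatalLookupException e) {
            System.out.println("gave up without retrying");
        }
    }
}
```

Under this assumption, a spurious not-found result during a split is never
given the chance to succeed on a later attempt.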
> Admin#isTableAvailable returns incorrect result before daughter regions are
> opened
> ----------------------------------------------------------------------------------
>
> Key: HBASE-18796
> URL: https://issues.apache.org/jira/browse/HBASE-18796
> Project: HBase
> Issue Type: Bug
> Affects Versions: 1.3.1
> Reporter: Abhishek Singh Chouhan
> Assignee: Abhishek Singh Chouhan
> Fix For: 2.0.0, 3.0.0, 1.4.0, 1.3.2, 1.5.0
>
> Attachments: HBASE-18796.branch-1.001.patch,
> HBASE-18796.branch-1.001.patch, HBASE-18796.branch-1.002.patch,
> HBASE-18796.branch-1.003.patch, HBASE-18796.master.001.patch
>
>
> Admin#isTableAvailable checks whether it can getServerName for the meta
> entries it reads. During a split, server locations are added to the meta
> entries in MetaTableAccessor#splitRegion, although the description of the
> method says "Does not add the location information to the daughter regions
> since they are not open yet.". At this point in the split the daughter
> regions are not actually open, so we can get to a state where the parent is
> offline and the daughters are not yet open, but isTableAvailable returns true.