Could someone help me review this problem?
At 2017-05-13 12:10:55, "程磊" <[email protected]> wrote:

Sorry, the code formatting in my previous email was broken, so I have reformatted it here.

While reading the Index.preWALRestore method, I found that RecoveryIndexWriter.write is used to write the indexUpdates, at line 565 below:

547   @Override
548   public void preWALRestore(ObserverContext<RegionCoprocessorEnvironment> env, HRegionInfo info,
549       HLogKey logKey, WALEdit logEdit) throws IOException {
550     if (this.disabled) {
551       super.preWALRestore(env, info, logKey, logEdit);
552       return;
553     }
554     // TODO check the regions in transition. If the server on which the region lives is this one,
555     // then we should rety that write later in postOpen.
556     // we might be able to get even smarter here and pre-split the edits that are server-local
557     // into their own recovered.edits file. This then lets us do a straightforward recovery of each
558     // region (and more efficiently as we aren't writing quite as hectically from this one place).
559
560     /*
561      * Basically, we let the index regions recover for a little whilelong before retrying in the
562      * hopes they come up before the primary table finishes.
563      */
564     Collection<Pair<Mutation, byte[]>> indexUpdates = extractIndexUpdate(logEdit);
565     recoveryWriter.write(indexUpdates, true);
566   }

However, RecoveryIndexWriter.write, shown below, rethrows the exception for every failed table except non-existing ones. RecoveryIndexWriter's failurePolicy (StoreFailuresInCachePolicy by default) therefore never gets a chance to run, and as a result Index.failedIndexEdits, which is filled by StoreFailuresInCachePolicy, is always empty.

  @Override
  public void write(Collection<Pair<Mutation, byte[]>> toWrite, boolean allowLocalUpdates)
      throws IOException {
    try {
      write(resolveTableReferences(toWrite), allowLocalUpdates);
    } catch (MultiIndexWriteFailureException e) {
      for (HTableInterfaceReference table : e.getFailedTables()) {
        if (!admin.tableExists(table.getTableName())) {
          LOG.warn("Failure due to non existing table: " + table.getTableName());
          nonExistingTablesList.add(table);
        } else {
          throw e;
        }
      }
    }
  }

So the Index.postOpen method seems to be useless, because the updates Multimap at line 522 below, which is obtained from Index.failedIndexEdits, is always empty:

520   @Override
521   public void postOpen(final ObserverContext<RegionCoprocessorEnvironment> c) {
522     Multimap<HTableInterfaceReference, Mutation> updates = failedIndexEdits.getEdits(c.getEnvironment().getRegion());
523
524     if (this.disabled) {
525       super.postOpen(c);
526       return;
527     }
528     LOG.info("Found some outstanding index updates that didn't succeed during"
529         + " WAL replay - attempting to replay now.");
530     //if we have no pending edits to complete, then we are done
531     if (updates == null || updates.size() == 0) {
532       return;
533     }
534
535     // do the usual writer stuff, killing the server again, if we can't manage to make the index
536     // writes succeed again
537     try {
538       writer.writeAndKillYourselfOnFailure(updates, true);
539     } catch (IOException e) {
540       LOG.error("During WAL replay of outstanding index updates, "
541           + "Exception is thrown instead of killing server during index writing", e);
542     }
543   }

I think that in Index.preWALRestore we should write the indexUpdates by calling writeAndKillYourselfOnFailure on the recovery writer, not RecoveryIndexWriter.write, so that a failed write is handed to the failurePolicy and the failed edits are cached for postOpen.
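
To make the difference between the two call paths concrete, here is a minimal, self-contained sketch. It does not use the Phoenix classes: SketchWriter, FailurePolicy and CachingPolicy are hypothetical stand-ins, edits are plain strings, and it assumes that writeAndKillYourselfOnFailure hands a failed write to the configured failure policy rather than letting the exception propagate.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class RecoveryWriteSketch {

    /** Hypothetical stand-in for a failure policy such as StoreFailuresInCachePolicy. */
    interface FailurePolicy {
        void handleFailure(List<String> attempted, Exception cause);
    }

    /** Caches failed edits so that a later hook (postOpen in the mail) could replay them. */
    static class CachingPolicy implements FailurePolicy {
        final List<String> failedEdits = new ArrayList<>();

        @Override
        public void handleFailure(List<String> attempted, Exception cause) {
            failedEdits.addAll(attempted);
        }
    }

    /** Simplified writer exposing the two entry points discussed above. */
    static class SketchWriter {
        private final FailurePolicy policy;
        private final boolean indexTableAvailable;

        SketchWriter(FailurePolicy policy, boolean indexTableAvailable) {
            this.policy = policy;
            this.indexTableAvailable = indexTableAvailable;
        }

        /** Plain write: a failure propagates to the caller and the policy is never consulted. */
        void write(List<String> edits) throws IOException {
            if (!indexTableAvailable) {
                throw new IOException("index table not available");
            }
        }

        /** Assumption: failures are routed to the policy instead of propagating. */
        void writeAndKillYourselfOnFailure(List<String> edits) {
            try {
                write(edits);
            } catch (IOException e) {
                policy.handleFailure(edits, e);
            }
        }
    }

    /** Number of edits cached when the caller uses write() and the write fails. */
    static int cachedAfterPlainWrite() {
        CachingPolicy policy = new CachingPolicy();
        SketchWriter writer = new SketchWriter(policy, false);
        try {
            writer.write(Arrays.asList("edit-1", "edit-2"));
        } catch (IOException expected) {
            // The exception escapes, so nothing was handed to the policy.
        }
        return policy.failedEdits.size();
    }

    /** Number of edits cached when the caller uses writeAndKillYourselfOnFailure(). */
    static int cachedAfterPolicyWrite() {
        CachingPolicy policy = new CachingPolicy();
        SketchWriter writer = new SketchWriter(policy, false);
        writer.writeAndKillYourselfOnFailure(Arrays.asList("edit-1", "edit-2"));
        return policy.failedEdits.size();
    }

    public static void main(String[] args) {
        System.out.println("write(): cached edits = " + cachedAfterPlainWrite());
        System.out.println("writeAndKillYourselfOnFailure(): cached edits = " + cachedAfterPolicyWrite());
    }
}
```

In this sketch the plain write() path leaves the cache empty, which mirrors the always-empty failedIndexEdits described above, while the policy-aware path caches both edits for a later replay.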
