[
https://issues.apache.org/jira/browse/SOLR-11278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16150114#comment-16150114
]
Amrit Sarkar edited comment on SOLR-11278 at 9/1/17 6:37 AM:
-------------------------------------------------------------
Rambling again:
What is the use of bootstrapFuture, essentially? To get the status of the
current operation, right?
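For reference, a minimal standalone sketch of what a {{Future}} handle offers besides the blocking {{get()}}: non-blocking status checks and cancellation. The class name {{FutureStatusSketch}} and the helper {{describe}} are made up for illustration and are not part of Solr; the tasks are stand-ins for a long-running bootstrap.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class FutureStatusSketch {

  /** Reports task state from the Future handle alone, without blocking. */
  static String describe(Future<Boolean> future) {
    if (future.isCancelled()) return "cancelled";
    if (future.isDone()) return "done";
    return "in progress";
  }

  public static void main(String[] args) throws Exception {
    ExecutorService executor = Executors.newSingleThreadExecutor();
    CountDownLatch release = new CountDownLatch(1);

    // Hypothetical stand-in for a bootstrap: blocks until released.
    Future<Boolean> future = executor.submit(() -> {
      release.await();
      return true;
    });
    System.out.println(describe(future)); // in progress

    release.countDown();
    System.out.println(future.get());     // true (get() blocks until completion)
    System.out.println(describe(future)); // done

    // A second task that is cancelled instead of awaited.
    Future<Boolean> cancelled = executor.submit(() -> {
      new CountDownLatch(1).await(); // never released
      return true;
    });
    cancelled.cancel(true);
    System.out.println(describe(cancelled)); // cancelled

    executor.shutdownNow();
  }
}
```

So a stored future can be polled or cancelled by another request, whereas calling {{get()}} right after {{submit()}} simply blocks the submitting thread until the task finishes.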
In CdcrRequestHandler.java (there are some custom log lines; ignore them):
{code}
Runnable runnable = () -> {
  Lock recoveryLock = req.getCore().getSolrCoreState().getRecoveryLock();
  boolean locked = recoveryLock.tryLock();
  SolrCoreState coreState = core.getSolrCoreState();
  try {
    if (!locked) {
      log.info("we reached this point :: CANCEL BOOTSTRAP, locked :: " + locked);
      handleCancelBootstrap(req, rsp);
    } else if (leaderStateManager.amILeader()) {
      coreState.setCdcrBootstrapRunning(true);
      //running.set(true);
      String masterUrl = req.getParams().get(ReplicationHandler.MASTER_URL);
      BootstrapCallable bootstrapCallable = new BootstrapCallable(masterUrl, core);
      coreState.setCdcrBootstrapCallable(bootstrapCallable);
      Future<Boolean> bootstrapFuture =
          core.getCoreContainer().getUpdateShardHandler().getRecoveryExecutor()
              .submit(bootstrapCallable);
      try {
        log.info("we reached this point :: all good, bootstrapFuture.get :: " + bootstrapFuture.get());
      } catch (Exception e) {
        log.error("bootstrapFuture.get :: ", e);
      }
      coreState.setCdcrBootstrapFuture(bootstrapFuture);
      try {
        bootstrapFuture.get();
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        log.warn("Bootstrap was interrupted", e);
      } catch (ExecutionException e) {
        log.error("Bootstrap operation failed", e);
      }
    } else {
      log.error("Action {} sent to non-leader replica @ {}:{}. Aborting bootstrap.",
          CdcrParams.CdcrAction.BOOTSTRAP, collectionName, shard);
    }
  } finally {
    if (locked) {
      coreState.setCdcrBootstrapRunning(false);
      recoveryLock.unlock();
    }
  }
};
{code}
*bootstrapFuture.get()* throws:
{quote}
[beaster] 2> 43072 ERROR (updateExecutor-39-thread-1-processing-n:127.0.0.1:41488_solr x:cdcr-target_shard1_replica_n1 s:shard1 c:cdcr-target r:core_node2) [n:127.0.0.1:41488_solr c:cdcr-target s:shard1 r:core_node2 x:cdcr-target_shard1_replica_n1] o.a.s.h.CdcrRequestHandler Bootstrap operation failed
[beaster] 2> java.util.concurrent.ExecutionException: java.lang.AssertionError
[beaster] 2> at java.util.concurrent.FutureTask.report(FutureTask.java:122)
[beaster] 2> at java.util.concurrent.FutureTask.get(FutureTask.java:192)
[beaster] 2> at org.apache.solr.handler.CdcrRequestHandler.lambda$handleBootstrapAction$0(CdcrRequestHandler.java:653)
[beaster] 2> at com.codahale.metrics.InstrumentedExecutorService$InstrumentedRunnable.run(InstrumentedExecutorService.java:176)
[beaster] 2> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
[beaster] 2> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
[beaster] 2> at org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$0(ExecutorUtil.java:188)
[beaster] 2> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
[beaster] 2> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
[beaster] 2> at java.lang.Thread.run(Thread.java:748)
[beaster] 2> Caused by: java.lang.AssertionError
[beaster] 2> at org.apache.solr.handler.CdcrRequestHandler$BootstrapCallable.call(CdcrRequestHandler.java:804)
[beaster] 2> at org.apache.solr.handler.CdcrRequestHandler$BootstrapCallable.call(CdcrRequestHandler.java:723)
[beaster] 2> at com.codahale.metrics.InstrumentedExecutorService$InstrumentedCallable.call(InstrumentedExecutorService.java:197)
[beaster] 2> ... 5 more
{quote}
and the bootstrap operation fails.
FutureTask.java ::
{code}
/**
 * Returns result or throws exception for completed task.
 * @param s completed state value
 */
@SuppressWarnings("unchecked")
private V report(int s) throws ExecutionException {
    Object x = outcome;
    if (s == NORMAL)
        return (V)x;
    if (s >= CANCELLED)
        throw new CancellationException();
    throw new ExecutionException((Throwable)x);
}
{code}
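The last line of {{report}} is what produces the trace above: anything thrown inside the callable, including an {{Error}} such as {{AssertionError}}, is captured by the task and rethrown from {{get()}} wrapped in an {{ExecutionException}}. A small standalone sketch of that behaviour, using only {{java.util.concurrent}}; the class name {{ExecutionExceptionSketch}} and the helper {{causeOf}} are hypothetical:

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class ExecutionExceptionSketch {

  /**
   * Runs the task on a throwaway executor and returns the Throwable that
   * Future.get() wrapped in an ExecutionException, or null if the task succeeded.
   */
  static Throwable causeOf(Callable<Boolean> task) {
    ExecutorService executor = Executors.newSingleThreadExecutor();
    try {
      executor.submit(task).get();
      return null;
    } catch (ExecutionException e) {
      return e.getCause(); // the original failure from inside call()
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException("interrupted while waiting", e);
    } finally {
      executor.shutdown();
    }
  }

  public static void main(String[] args) {
    // Stand-in for a callable whose assertion fails.
    Throwable cause = causeOf(() -> {
      throw new AssertionError("dropped was false");
    });
    System.out.println(cause); // java.lang.AssertionError: dropped was false
  }
}
```

So the {{ExecutionException}} itself is only the wrapper; the line to look at is the {{Caused by}} frame inside {{BootstrapCallable.call}}.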
and the assertion failure is in the {{finally}} block of the same function, {{BootstrapCallable.call()}}:
{code}
if (closed || !success) {
  // we cannot apply the buffer in this case because it will introduce newer versions in the
  // update log and then the source cluster will get those versions via collectioncheckpoint
  // causing the versions in between to be completely missed
  boolean dropped = ulog.dropBufferedUpdates();
  assert dropped;
}
{code}
{{dropped}} is false, so the buffered updates are not being cleared / dropped? I
understand it is calling its own function, but it is difficult to follow who is
calling what and what is being returned.
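One possible explanation, sketched with a toy class: if the drop method only succeeds while the log is in a buffering state, then a call made after buffering has already ended (or before it started) returns false, and {{assert dropped}} fails when assertions are enabled ({{-Dtests.asserts=true}} in the beast run). This is an assumption about how {{UpdateLog.dropBufferedUpdates()}} behaves, not something confirmed in this comment; {{ToyUpdateLog}} and its state names are hypothetical and are not Solr code.

```java
import java.util.ArrayList;
import java.util.List;

public class DropBufferSketch {

  enum State { ACTIVE, BUFFERING }

  /** Hypothetical stand-in for an update log; NOT Solr's UpdateLog. */
  static class ToyUpdateLog {
    private State state = State.ACTIVE;
    private final List<String> buffered = new ArrayList<>();

    void bufferUpdates() {
      state = State.BUFFERING;
    }

    void add(String update) {
      if (state == State.BUFFERING) {
        buffered.add(update);
      }
    }

    /** Assumed contract: drops the buffer only if currently buffering. */
    boolean dropBufferedUpdates() {
      if (state != State.BUFFERING) {
        return false; // nothing to drop: buffering never started or already ended
      }
      buffered.clear();
      state = State.ACTIVE;
      return true;
    }
  }

  public static void main(String[] args) {
    ToyUpdateLog ulog = new ToyUpdateLog();
    ulog.bufferUpdates();
    ulog.add("doc1");
    System.out.println(ulog.dropBufferedUpdates()); // true
    System.out.println(ulog.dropBufferedUpdates()); // false: no longer buffering
  }
}
```

Under that assumption, a false return would mean some other code path changed the state before the {{finally}} block ran, which would be the thing to trace.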
> CdcrBootstrapTest failing in branch_6_6
> ---------------------------------------
>
> Key: SOLR-11278
> URL: https://issues.apache.org/jira/browse/SOLR-11278
> Project: Solr
> Issue Type: Bug
> Security Level: Public(Default Security Level. Issues are Public)
> Components: CDCR
> Reporter: Amrit Sarkar
> Assignee: Varun Thacker
> Attachments: SOLR-11278-cancel-bootstrap-on-stop.patch,
> SOLR-11278.patch, test_results
>
>
> I ran beast for 10 rounds:
> ant beast -Dtestcase=CdcrBootstrapTest -Dtests.multiplier=2 -Dtests.slow=true
> -Dtests.locale=vi -Dtests.timezone=Asia/Yekaterinburg -Dtests.asserts=true
> -Dtests.file.encoding=US-ASCII -Dbeast.iters=10
> and I am seeing the following failure:
> {code}
> [beaster] [01:37:16.282] FAILURE 153s | CdcrBootstrapTest.testBootstrapWithSourceCluster <<<
> [beaster] > Throwable #1: java.lang.AssertionError: Document mismatch on target after sync expected:<2000> but was:<1000>
> {code}