[
https://issues.apache.org/jira/browse/IGNITE-14006?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Mikhail Petrov updated IGNITE-14006:
------------------------------------
Description:
A node fails with an assertion error during event listener registration from a
client node if the remote filter class is missing on one or more server nodes:
{code:java}
[2021-01-17 15:49:00,313][ERROR][disco-notifier-worker-#83%continuous.CacheContinuousQueryExternalNodeFilterTest1%][IgniteTestResources] Critical system error detected. Will be handled accordingly to configured handler [hnd=StopNodeOrHaltFailureHandler [tryStop=false, timeout=0, super=AbstractFailureHandler [ignoredFailureTypes=UnmodifiableSet [SYSTEM_WORKER_BLOCKED, SYSTEM_CRITICAL_OPERATION_TIMEOUT]]], failureCtx=FailureContext [type=SYSTEM_WORKER_TERMINATION, err=java.lang.AssertionError]]
java.lang.AssertionError
    at org.apache.ignite.internal.processors.continuous.GridContinuousProcessor$LocalRoutineInfo.<init>(GridContinuousProcessor.java:2117)
    at org.apache.ignite.internal.processors.continuous.GridContinuousProcessor.processStartRequest(GridContinuousProcessor.java:1447)
    at org.apache.ignite.internal.processors.continuous.GridContinuousProcessor.access$400(GridContinuousProcessor.java:117)
    at org.apache.ignite.internal.processors.continuous.GridContinuousProcessor$2.onCustomEvent(GridContinuousProcessor.java:220)
    at org.apache.ignite.internal.processors.continuous.GridContinuousProcessor$2.onCustomEvent(GridContinuousProcessor.java:211)
    at org.apache.ignite.internal.managers.discovery.GridDiscoveryManager$4.onDiscovery0(GridDiscoveryManager.java:670)
    at org.apache.ignite.internal.managers.discovery.GridDiscoveryManager$4.lambda$onDiscovery$0(GridDiscoveryManager.java:533)
    at org.apache.ignite.internal.managers.discovery.GridDiscoveryManager$DiscoveryMessageNotifierWorker.body0(GridDiscoveryManager.java:2635)
    at org.apache.ignite.internal.managers.discovery.GridDiscoveryManager$DiscoveryMessageNotifierWorker.body(GridDiscoveryManager.java:2673)
    at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:120)
    at java.lang.Thread.run(Thread.java:748)
{code}
Reproducer:
{code:java}
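import java.net.MalformedURLException;
import java.net.URL;
import java.net.URLClassLoader;
import org.apache.ignite.configuration.IgniteConfiguration;
import org.apache.ignite.events.Event;
import org.apache.ignite.failure.StopNodeOrHaltFailureHandler;
import org.apache.ignite.internal.IgniteEx;
import org.apache.ignite.internal.util.typedef.internal.U;
import org.apache.ignite.lang.IgnitePredicate;
import org.apache.ignite.testframework.junits.common.GridCommonAbstractTest;
import org.junit.Test;

import static org.apache.ignite.events.EventType.EVT_CACHE_OBJECT_PUT;
import static org.apache.ignite.testframework.config.GridTestProperties.getProperty;

/** */
public class CacheContinuousQueryExternalNodeFilterTest extends GridCommonAbstractTest {
    /** Name of the filter class that is available only through the external class loader. */
    private static final String EXT_EVT_FILTER_CLS = "org.apache.ignite.tests.p2p.GridEventConsumeFilter";

    /** URLs of the second external class path, which does not contain the filter class. */
    private static final URL[] URLS;

    static {
        try {
            URLS = new URL[] {new URL(getProperty("p2p.uri.cls.second"))};
        }
        catch (MalformedURLException e) {
            throw new RuntimeException(e);
        }
    }

    /** Class loader that can resolve the filter class. */
    private final ClassLoader extLdr = getExternalClassLoader();

    /** Class loader that cannot resolve the filter class. */
    private final ClassLoader secondExtLdr = new URLClassLoader(URLS, U.gridClassLoader());

    /** {@inheritDoc} */
    @Override protected IgniteConfiguration getConfiguration(String igniteInstanceName) throws Exception {
        IgniteConfiguration cfg = super.getConfiguration(igniteInstanceName);

        cfg.setPeerClassLoadingEnabled(false);
        cfg.setFailureHandler(new StopNodeOrHaltFailureHandler());

        // Only the first server node lacks the filter class.
        if (getTestIgniteInstanceName(0).equals(igniteInstanceName))
            cfg.setClassLoader(secondExtLdr);
        else
            cfg.setClassLoader(extLdr);

        return cfg;
    }

    /** */
    @Test
    public void test() throws Exception {
        startGrids(2);

        IgniteEx cli = startClientGrid(2);

        Class<IgnitePredicate<Event>> rmtFilter =
            (Class<IgnitePredicate<Event>>)extLdr.loadClass(EXT_EVT_FILTER_CLS);

        try {
            cli.events().remoteListen(null, rmtFilter.newInstance(), EVT_CACHE_OBJECT_PUT);
        }
        catch (Throwable ignored) {
            // No-op.
        }

        // Wait for all nodes to handle the error that occurred while processing
        // StartRoutineDiscoveryMessage.
        U.sleep(3000);
    }
}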
{code}
Root cause of the behavior described above:
The first node that cannot unmarshal StartRoutineDiscoveryMessage leaves the
GridContinuousHandler property null and, because StartRoutineDiscoveryMessage
is mutable, marshals the message again and sends it to the next node. As a
result, every node in the ring after that node cannot obtain the
GridContinuousHandler (even if it could unmarshal the original message
properly) and fails with an assertion error.
To solve this problem, it is proposed to skip registration of the continuous
query (CQ) on a node if some node has already failed to register it. The CQ
will be stopped anyway once the query initiator receives the ACK message.
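The propagation described above, and the effect of the proposed skip, can be illustrated with a small standalone model. This is only a sketch and not Ignite code: StartMsg, Node, process, run and the skipOnErrors flag are hypothetical names, and "marshalling" is modelled as simply carrying the filter class name in the message.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class RingPropagationSketch {
    /** Hypothetical filter class name carried by the message. */
    static final String FILTER = "GridEventConsumeFilter";

    /** Hypothetical stand-in for the mutable discovery message. */
    static final class StartMsg {
        /** "Marshalled" handler: just a class name here; null once the handler is lost. */
        String hndBytes;

        /** Names of the nodes that failed to unmarshal the handler. */
        final List<String> errs = new ArrayList<>();

        StartMsg(String hndBytes) {
            this.hndBytes = hndBytes;
        }
    }

    /** Hypothetical node: a name plus the classes its class loader can resolve. */
    static final class Node {
        final String name;
        final Set<String> knownClasses;

        Node(String name, String... knownClasses) {
            this.name = name;
            this.knownClasses = new HashSet<>(Arrays.asList(knownClasses));
        }
    }

    /** Processes the message on one node and returns what happened there. */
    static String process(Node node, StartMsg msg, boolean skipOnErrors) {
        String hnd = null;
        boolean unmarshalFailed = false;

        if (msg.hndBytes != null) {
            if (node.knownClasses.contains(msg.hndBytes))
                hnd = msg.hndBytes;
            else {
                msg.errs.add(node.name);
                unmarshalFailed = true;
            }
        }

        // The message is mutable, so it is marshalled again before the next hop.
        // A null handler on this node therefore reaches every later node as null.
        msg.hndBytes = hnd;

        if (unmarshalFailed)
            return "UNMARSHAL_ERROR";

        // Modelled fix: do not try to register if an earlier node already failed.
        if (skipOnErrors && !msg.errs.isEmpty())
            return "SKIPPED";

        return hnd == null ? "ASSERTION_FAILED" : "REGISTERED";
    }

    /** Sends one message around a three-node ring in which only node B lacks the class. */
    public static List<String> run(boolean skipOnErrors) {
        List<Node> ring = Arrays.asList(new Node("A", FILTER), new Node("B"), new Node("C", FILTER));
        StartMsg msg = new StartMsg(FILTER);
        List<String> res = new ArrayList<>();

        for (Node n : ring)
            res.add(n.name + "=" + process(n, msg, skipOnErrors));

        return res;
    }

    public static void main(String[] args) {
        System.out.println(run(false)); // [A=REGISTERED, B=UNMARSHAL_ERROR, C=ASSERTION_FAILED]
        System.out.println(run(true));  // [A=REGISTERED, B=UNMARSHAL_ERROR, C=SKIPPED]
    }
}
```

In this model node C can resolve the filter class but still fails without the skip, because the handler was already lost when node B forwarded the message.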
> Node fails with assertion error during event listener registration
> ------------------------------------------------------------------
>
> Key: IGNITE-14006
> URL: https://issues.apache.org/jira/browse/IGNITE-14006
> Project: Ignite
> Issue Type: Bug
> Reporter: Mikhail Petrov
> Assignee: Mikhail Petrov
> Priority: Major
>