[ https://issues.apache.org/jira/browse/STORM-3687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Ethan Li reassigned STORM-3687:
-------------------------------

    Assignee: Ethan Li

> Add a warning about possible issues on a mixed cluster if the StormCommon.systemTopology implementation is changed
> ------------------------------------------------------------------------------------------------------------------
>
>                 Key: STORM-3687
>                 URL: https://issues.apache.org/jira/browse/STORM-3687
>             Project: Apache Storm
>          Issue Type: Task
>            Reporter: Ethan Li
>            Assignee: Ethan Li
>            Priority: Minor
>             Fix For: 2.3.0
>
>          Time Spent: 20m
>  Remaining Estimate: 0h
>
> During a rolling upgrade, some supervisors are still on 2.2 while others are
> already running 2.3, so some workers run with storm-client-2.2.jar and
> others are on 2.3.
> Because of
> https://github.com/apache/storm/commit/93a7f770d508668bc7af183e08535813fff6f805
> (STORM-3660)
> the "__credentials" stream is removed.
> So the map of streamId to streamName for the system component has changed;
> refer to the code at:
> https://github.com/apache/storm/blob/v2.2.0/storm-client/src/jvm/org/apache/storm/serialization/SerializationFactory.java#L218-L222
> So a worker running on storm-client-2.2 might send out __metrics from its
> systembolt, while the MetricsConsumer running with storm-client-2.3
> interprets it as "__metrics_tick" because the mapping is different (id 3 is
> __metrics in the first map below and __metrics_tick in the second):
> {code:java}
> o.a.s.s.SerializationFactory Thread-15-__system-executor[-1, -1] [INFO] idmap for system comp {__tick=6, __system=5, __metrics_tick=4, __credentials=1, __flush=2, __metrics=3}
> {code}
> vs
> {code:java}
> o.a.s.s.SerializationFactory Netty-server-localhost-6703-worker-1 [INFO] idmap for system comp {__tick=5, __system=4, __metrics_tick=3, __flush=1, __metrics=2}
> {code}
> Hence we see
> {code:java}
> Caused by: java.lang.RuntimeException: java.lang.ClassCastException: org.apache.storm.metric.api.IMetricsConsumer$TaskInfo cannot be cast to java.lang.Integer
>     at org.apache.storm.executor.Executor.accept(Executor.java:293) ~[storm-client-2.3.0.y.jar:2.3.0.y]
>     at org.apache.storm.utils.JCQueue.consumeImpl(JCQueue.java:113) ~[storm-client-2.3.0.y.jar:2.3.0.y]
>     at org.apache.storm.utils.JCQueue.consume(JCQueue.java:89) ~[storm-client-2.3.0.y.jar:2.3.0.y]
>     at org.apache.storm.executor.bolt.BoltExecutor$1.call(BoltExecutor.java:167) ~[storm-client-2.3.0.y.jar:2.3.0.y]
>     at org.apache.storm.executor.bolt.BoltExecutor$1.call(BoltExecutor.java:153) ~[storm-client-2.3.0.y.jar:2.3.0.y]
>     at org.apache.storm.utils.Utils$1.run(Utils.java:398) ~[storm-client-2.3.0.y.jar:2.3.0.y]
>     ... 1 more
> Caused by: java.lang.ClassCastException: org.apache.storm.metric.api.IMetricsConsumer$TaskInfo cannot be cast to java.lang.Integer
>     at org.apache.storm.tuple.TupleImpl.getInteger(TupleImpl.java:121) ~[storm-client-2.3.0.y.jar:2.3.0.y]
>     at org.apache.storm.executor.Executor.metricsTick(Executor.java:320) ~[storm-client-2.3.0.y.jar:2.3.0.y]
>     at org.apache.storm.executor.bolt.BoltExecutor.tupleActionFn(BoltExecutor.java:213) ~[storm-client-2.3.0.y.jar:2.3.0.y]
>     at org.apache.storm.executor.Executor.accept(Executor.java:286) ~[storm-client-2.3.0.y.jar:2.3.0.y]
>     at org.apache.storm.utils.JCQueue.consumeImpl(JCQueue.java:113) ~[storm-client-2.3.0.y.jar:2.3.0.y]
>     at org.apache.storm.utils.JCQueue.consume(JCQueue.java:89) ~[storm-client-2.3.0.y.jar:2.3.0.y]
>     at org.apache.storm.executor.bolt.BoltExecutor$1.call(BoltExecutor.java:167) ~[storm-client-2.3.0.y.jar:2.3.0.y]
>     at org.apache.storm.executor.bolt.BoltExecutor$1.call(BoltExecutor.java:153) ~[storm-client-2.3.0.y.jar:2.3.0.y]
>     at org.apache.storm.utils.Utils$1.run(Utils.java:398) ~[storm-client-2.3.0.y.jar:2.3.0.y]
>     ... 1 more
> 2020-08-04 18:26:08.017 o.a.s.m.c
> 2020-08-04 18:26:08.027 o.a.s.u.Utils Thread-19-__metrics_org.apache.storm.metric.LoggingMetricsConsumer-executor[1555, 1555] [ERROR] Halting process: Worker died
> java.lang.RuntimeException: Halting process: Worker died
>     at org.apache.storm.utils.Utils.exitProcess(Utils.java:518) [storm-client-2.3.0.y.jar:2.3.0.y]
>     at org.apache.storm.utils.Utils$3.run(Utils.java:870) [storm-client-2.3.0.y.jar:2.3.0.y]
>     at org.apache.storm.executor.error.ReportErrorAndDie.uncaughtException(ReportErrorAndDie.java:41) [storm-client-2.3.0.y.jar:2.3.0.y]
>     at java.lang.Thread.dispatchUncaughtException(Thread.java:1959) [?:1.8.0_242]
> 2020-08-04 18:26:08.027 o.a.s.u.Utils Thread-18-__metrics_org.apache.storm.metric.LoggingMetricsConsumer-executor[1553, 1553] [ERROR] Halting process: Worker died
> java.lang.RuntimeException: Halting process: Worker died
>     at org.apache.storm.utils.Utils.exitProcess(Utils.java:518) [storm-client-2.3.0.y.jar:2.3.0.y]
>     at org.apache.storm.utils.Utils$3.run(Utils.java:870) [storm-client-2.3.0.y.jar:2.3.0.y]
>     at org.apache.storm.executor.error.ReportErrorAndDie.uncaughtException(ReportErrorAndDie.java:41) [storm-client-2.3.0.y.jar:2.3.0.y]
>     at java.lang.Thread.dispatchUncaughtException(Thread.java:1959) [?:1.8.0_242]
> 2020-08-04 18:26:08.028 o.a.s.u.Utils ShutdownHook-sleepKill-3s [INFO] Halting after 3 seconds
> {code}
> Since a mixed cluster is not guaranteed to work, I am not going to fix this
> for now, but I will add some warnings in the code so people are aware of it.
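
The id mismatch described in this issue can be reproduced with a small standalone sketch. It is not Storm's code: the class `StreamIdMapSketch` and its methods `idify` and `decode` are hypothetical names chosen for illustration, and it assumes that stream ids are assigned in sorted-name order starting at 1, which is consistent with the two idmaps logged above. The stream lists are copied from those log lines.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; not part of storm-client.
public class StreamIdMapSketch {
    // System-component streams as they appear in the logged 2.2 idmap.
    static final List<String> STREAMS_2_2 = Arrays.asList(
        "__tick", "__system", "__metrics_tick", "__credentials", "__flush", "__metrics");

    // Same list without "__credentials", as in the logged 2.3 idmap.
    static final List<String> STREAMS_2_3 = Arrays.asList(
        "__tick", "__system", "__metrics_tick", "__flush", "__metrics");

    // Assumption: ids are assigned by sorted name order, starting at 1.
    static Map<String, Integer> idify(List<String> names) {
        List<String> sorted = new ArrayList<>(names);
        Collections.sort(sorted);
        Map<String, Integer> ids = new HashMap<>();
        int next = 1;
        for (String name : sorted) {
            ids.put(name, next++);
        }
        return ids;
    }

    // Reverse lookup: which stream name does a receiver associate with this id?
    static String decode(int id, Map<String, Integer> idmap) {
        for (Map.Entry<String, Integer> e : idmap.entrySet()) {
            if (e.getValue() == id) {
                return e.getKey();
            }
        }
        return null;
    }

    public static void main(String[] args) {
        Map<String, Integer> sender = idify(STREAMS_2_2);   // worker on 2.2
        Map<String, Integer> receiver = idify(STREAMS_2_3); // worker on 2.3

        int wireId = sender.get("__metrics");
        System.out.println("2.2 sends __metrics as id " + wireId
            + "; 2.3 reads id " + wireId + " as " + decode(wireId, receiver));
    }
}
```

Under these assumptions, removing one stream that sorts first shifts every later id down by one, so a tuple sent on `__metrics` by a 2.2 worker is read as `__metrics_tick` by a 2.3 worker, which matches the `ClassCastException` in `Executor.metricsTick` shown in the trace.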