[ 
https://issues.apache.org/jira/browse/STORM-3687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ethan Li resolved STORM-3687.
-----------------------------
    Fix Version/s: 2.3.0
       Resolution: Fixed

> Add a warning about possible issues on a mixed cluster if the 
> StormCommon.systemTopology implementation is changed
> ------------------------------------------------------------------------------------------------------------------
>
>                 Key: STORM-3687
>                 URL: https://issues.apache.org/jira/browse/STORM-3687
>             Project: Apache Storm
>          Issue Type: Task
>            Reporter: Ethan Li
>            Priority: Minor
>             Fix For: 2.3.0
>
>          Time Spent: 20m
>  Remaining Estimate: 0h
>
> During a rolling upgrade, some supervisors are still on 2.2 while others are 
> already running 2.3, so some workers run with storm-client-2.2.jar and 
> others with the 2.3 jar.
> Because of 
> https://github.com/apache/storm/commit/93a7f770d508668bc7af183e08535813fff6f805
>  (STORM-3660)
> the "__credentials" stream was removed.
> This changes the mapping between stream names and stream IDs for the system 
> component; refer to the code at:
> https://github.com/apache/storm/blob/v2.2.0/storm-client/src/jvm/org/apache/storm/serialization/SerializationFactory.java#L218-L222
> As a result, a worker running storm-client-2.2 might send a __metrics tuple 
> from its system bolt, while a MetricsConsumer running storm-client-2.3 
> interprets it as "__metrics_tick" because the mapping is different:
> {code:java}
> o.a.s.s.SerializationFactory Thread-15-__system-executor[-1, -1] [INFO] idmap 
> for system comp {__tick=6, __system=5, __metrics_tick=4, __credentials=1, 
> __flush=2, __metrics=3}
> {code}
> vs
> {code:java}
> o.a.s.s.SerializationFactory Netty-server-localhost-6703-worker-1 [INFO] 
> idmap for system comp {__tick=5, __system=4, __metrics_tick=3, __flush=1, 
> __metrics=2}
> {code}
> Hence we see
> {code:java}
> Caused by: java.lang.RuntimeException: java.lang.ClassCastException: 
> org.apache.storm.metric.api.IMetricsConsumer$TaskInfo cannot be cast to 
> java.lang.Integer
>         at org.apache.storm.executor.Executor.accept(Executor.java:293) 
> ~[storm-client-2.3.0.y.jar:2.3.0.y]
>         at org.apache.storm.utils.JCQueue.consumeImpl(JCQueue.java:113) 
> ~[storm-client-2.3.0.y.jar:2.3.0.y]
>         at org.apache.storm.utils.JCQueue.consume(JCQueue.java:89) 
> ~[storm-client-2.3.0.y.jar:2.3.0.y]
>         at 
> org.apache.storm.executor.bolt.BoltExecutor$1.call(BoltExecutor.java:167) 
> ~[storm-client-2.3.0.y.jar:2.3.0.y]
>         at 
> org.apache.storm.executor.bolt.BoltExecutor$1.call(BoltExecutor.java:153) 
> ~[storm-client-2.3.0.y.jar:2.3.0.y]
>         at org.apache.storm.utils.Utils$1.run(Utils.java:398) 
> ~[storm-client-2.3.0.y.jar:2.3.0.y]
>         ... 1 more
> Caused by: java.lang.ClassCastException: 
> org.apache.storm.metric.api.IMetricsConsumer$TaskInfo cannot be cast to 
> java.lang.Integer
>         at org.apache.storm.tuple.TupleImpl.getInteger(TupleImpl.java:121) 
> ~[storm-client-2.3.0.y.jar:2.3.0.y]
>         at org.apache.storm.executor.Executor.metricsTick(Executor.java:320) 
> ~[storm-client-2.3.0.y.jar:2.3.0.y]
>         at 
> org.apache.storm.executor.bolt.BoltExecutor.tupleActionFn(BoltExecutor.java:213)
>  ~[storm-client-2.3.0.y.jar:2.3.0.y]
>         at org.apache.storm.executor.Executor.accept(Executor.java:286) 
> ~[storm-client-2.3.0.y.jar:2.3.0.y]
>         at org.apache.storm.utils.JCQueue.consumeImpl(JCQueue.java:113) 
> ~[storm-client-2.3.0.y.jar:2.3.0.y]
>         at org.apache.storm.utils.JCQueue.consume(JCQueue.java:89) 
> ~[storm-client-2.3.0.y.jar:2.3.0.y]
>         at 
> org.apache.storm.executor.bolt.BoltExecutor$1.call(BoltExecutor.java:167) 
> ~[storm-client-2.3.0.y.jar:2.3.0.y]
>         at 
> org.apache.storm.executor.bolt.BoltExecutor$1.call(BoltExecutor.java:153) 
> ~[storm-client-2.3.0.y.jar:2.3.0.y]
>         at org.apache.storm.utils.Utils$1.run(Utils.java:398) 
> ~[storm-client-2.3.0.y.jar:2.3.0.y]
>         ... 1 more
> 2020-08-04 18:26:08.017 o.a.s.m.c
> 2020-08-04 18:26:08.027 o.a.s.u.Utils 
> Thread-19-__metrics_org.apache.storm.metric.LoggingMetricsConsumer-executor[1555,
>  1555] [ERROR] Halting process: Worker died
> java.lang.RuntimeException: Halting process: Worker died
>         at org.apache.storm.utils.Utils.exitProcess(Utils.java:518) 
> [storm-client-2.3.0.y.jar:2.3.0.y]
>         at org.apache.storm.utils.Utils$3.run(Utils.java:870) 
> [storm-client-2.3.0.y.jar:2.3.0.y]
>         at 
> org.apache.storm.executor.error.ReportErrorAndDie.uncaughtException(ReportErrorAndDie.java:41)
>  [storm-client-2.3.0.y.jar:2.3.0.y]
>         at java.lang.Thread.dispatchUncaughtException(Thread.java:1959) 
> [?:1.8.0_242]
> 2020-08-04 18:26:08.027 o.a.s.u.Utils 
> Thread-18-__metrics_org.apache.storm.metric.LoggingMetricsConsumer-executor[1553,
>  1553] [ERROR] Halting process: Worker died
> java.lang.RuntimeException: Halting process: Worker died
>         at org.apache.storm.utils.Utils.exitProcess(Utils.java:518) 
> [storm-client-2.3.0.y.jar:2.3.0.y]
>         at org.apache.storm.utils.Utils$3.run(Utils.java:870) 
> [storm-client-2.3.0.y.jar:2.3.0.y]
>         at 
> org.apache.storm.executor.error.ReportErrorAndDie.uncaughtException(ReportErrorAndDie.java:41)
>  [storm-client-2.3.0.y.jar:2.3.0.y]
>         at java.lang.Thread.dispatchUncaughtException(Thread.java:1959) 
> [?:1.8.0_242]
> 2020-08-04 18:26:08.028 o.a.s.u.Utils ShutdownHook-sleepKill-3s [INFO] 
> Halting after 3 seconds
> {code}
> Since a mixed cluster is not guaranteed to work, I am not going to fix this 
> for now, but I will add warnings in the code so that people are aware of it.
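
The ID shift described above can be reproduced with a minimal sketch. It assumes, based only on the two logged idmaps, that IDs are assigned by sorting the stream names and numbering them from 1. The class `StreamIdShiftDemo` and its methods `idify` and `nameFor` are hypothetical illustration names, not Storm API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical demo class; not part of Storm.
class StreamIdShiftDemo {

    // Assumption: IDs come from sorting the names and numbering from 1,
    // which matches both idmaps shown in the logs.
    static Map<String, Integer> idify(List<String> names) {
        List<String> sorted = new ArrayList<>(names);
        Collections.sort(sorted);
        Map<String, Integer> ids = new HashMap<>();
        int next = 1;
        for (String name : sorted) {
            ids.put(name, next++);
        }
        return ids;
    }

    // Reverse lookup: which stream name does a receiver associate with an ID?
    static String nameFor(Map<String, Integer> ids, int id) {
        for (Map.Entry<String, Integer> e : ids.entrySet()) {
            if (e.getValue() == id) {
                return e.getKey();
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // System streams as logged by the 2.2 worker.
        Map<String, Integer> ids22 = idify(Arrays.asList(
            "__tick", "__system", "__metrics_tick", "__credentials", "__flush", "__metrics"));
        // System streams as logged by the 2.3 worker (no __credentials).
        Map<String, Integer> ids23 = idify(Arrays.asList(
            "__tick", "__system", "__metrics_tick", "__flush", "__metrics"));

        int wireId = ids22.get("__metrics");          // ID the 2.2 sender puts on the wire
        String seenAs = nameFor(ids23, wireId);       // name the 2.3 receiver resolves it to
        System.out.println("__metrics sent as id " + wireId + ", received as " + seenAs);
    }
}
```

Under that assumption, the 2.2 side encodes `__metrics` as 3 and the 2.3 side decodes 3 as `__metrics_tick`, which is consistent with the `ClassCastException` raised in `Executor.metricsTick` in the trace above.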



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
