[ 
https://issues.apache.org/jira/browse/STORM-3735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17268047#comment-17268047
 ] 

Ethan Li edited comment on STORM-3735 at 1/19/21, 5:13 PM:
-----------------------------------------------------------

With STORM-3682 (code: https://github.com/apache/storm/pull/3371/), I believe 
NodeInfo no longer needs to be registered with Kryo, since the related code was 
removed by PR #3371. We can evaluate this and possibly remove NodeInfo from the 
Kryo registration in the future.


was (Author: ethanli):
With STORM-3682 (code: https://github.com/apache/storm/pull/3371/), I believe 
NodeInfo is not longer needed in kryo registration since the related code is 
removed with this code change #3371. We can evaluate it and maybe remove the 
NodeInfo from kryo registration in the future.

> Kryo serialization fails on some metric tuples when 
> topology.fall.back.on.java.serialization is false
> -----------------------------------------------------------------------------------------------------
>
>                 Key: STORM-3735
>                 URL: https://issues.apache.org/jira/browse/STORM-3735
>             Project: Apache Storm
>          Issue Type: Bug
>    Affects Versions: 2.0.0, 2.1.0, 2.2.0
>            Reporter: Ethan Li
>            Assignee: Ethan Li
>            Priority: Major
>             Fix For: 2.3.0
>
>          Time Spent: 20m
>  Remaining Estimate: 0h
>
> When a metric consumer is used, metrics are sent from all executors to 
> the consumer. Some of these metrics include a NodeInfo object, and Kryo 
> serialization fails on it if topology.fall.back.on.java.serialization is false.
> {code:title=worker logs}
> 2021-01-13 20:16:37.017 o.a.s.e.ExecutorTransfer 
> Thread-16-__system-executor[-1, -1] [INFO] TRANSFERRING tuple [dest: 5 tuple: 
> source: __system:-1, stream: __metrics, id: {}, [TASK_INFO: { host: 
> openstorm14blue-n4.blue.ygrid.yahoo.com:6703 comp: __system[-1]}, [
> [CGroupCpuStat = {nr.throttled-percentage=46.544980443285525, 
> nr.period-count=767, nr.throttled-count=357, throttled.time-ms=27208}], 
> [CGroupMemoryLimit = 1342177280], [__recv-iconnection = {dequeuedMessages=0, 
> enqueued={/10.215.73.210:47038=3169}}], 
> [__send-iconnection = {NodeInfo(node:149a917b-bc75-49c8-b351-f74b8ae0fbed-10.215.73.210, 
> port:[6701])={reconnects=1, src=/10.215.73.210:34938, pending=0, 
> dest=openstorm14blue-n4.blue.ygrid.yahoo.com/10.215.73.210:6701, sent=1896, 
> lostOnSend=0}, 
> NodeInfo(node:149a917b-bc75-49c8-b351-f74b8ae0fbed-10.215.73.210, port:[6702])={reconnects=8, 
> src=/10.215.73.210:39476, pending=0, 
> dest=openstorm14blue-n4.blue.ygrid.yahoo.com/10.215.73.210:6702, sent=2115, 
> lostOnSend=0}, 
> NodeInfo(node:b77b5ec6-15ee-4bd2-a9b8-12fcadde7744-10.215.73.211, port:[6700])={reconnects=125, pending=0, 
> dest=openstorm14blue-n5.blue.ygrid.yahoo.com/10.215.73.211:6700, sent=108, 
> lostOnSend=1331}}], [CGroupMemory = 316485632], [CGroupCpu = {user-ms=36960, 
> sys-ms=25860}], [memory.pools.Metaspace.usage = 0.9695890907929322], 
> [memory.heap.max = 1073741824], [receive-queue-overflow = 0], 
> [memory.pools.Compressed-Class-Space.used = 6237424], 
> [memory.pools.Compressed-Class-Space.max = 1073741824], [memory.non-heap.init 
> = 2555904], [worker-transfer-queue-overflow = 0], 
> [memory.pools.Metaspace.committed = 42074112], [receive-queue-sojourn_time_ms = 0.0], 
> [threads.waiting.count = 5], [memory.pools.G1-Eden-Space.usage = 
> 0.2777777777777778], [memory.pools.Metaspace.used = 40798320], 
> [memory.total.used = 101783888], [memory.pools.Code-Cache.init = 2555904], 
> [memory.non-heap.committed = 63832064], [GC.G1-Young-Generation.time = 
> 677], [receive-queue-insert_failures = 0.0], [memory.total.init = 130482176], 
> [GC.G1-Old-Generation.count = 0], [memory.pools.Metaspace.init = 0], 
> [memory.pools.G1-Survivor-Space.committed = 5242880], [worker-transfer-queue-population = 0], 
> [memory.pools.Compressed-Class-Space.committed = 6684672], 
> [threads.timed_waiting.count = 31], [memory.pools.G1-Eden-Space.init = 
> 7340032], [memory.pools.Metaspace.max = -1], 
> [memory.pools.G1-Survivor-Space.used = 5242880], [memory.heap.init = 127926272], 
> [memory.pools.G1-Old-Gen.used-after-gc = 0], [worker-transfer-queue-capacity 
> = 1024], [memory.pools.G1-Survivor-Space.used-after-gc = 5242880], 
> [memory.pools.G1-Old-Gen.committed = 47185920], 
> [memory.pools.G1-Eden-Space.committed = 75497472], [receive-queue-arrival_rate_secs = 
> 0.109421162052741], [memory.pools.Compressed-Class-Space.usage = 
> 0.0058090537786483765], [TGT-TimeToExpiryMsecs = 71282993], 
> [threads.runnable.count = 15], [worker-transfer-queue-insert_failures
> = 0.0], [worker-transfer-queue-sojourn_time_ms = 0.0], [memory.heap.committed 
> = 127926272], [memory.non-heap.max = -1], [threads.daemon.count = 29], 
> [memory.pools.Code-Cache.max = 251658240], 
> [worker-transfer-queue-arrival_rate_secs = 90.47776674390379], 
> [memory.heap.usage = 0.037109360098838806], [memory.pools.G1-Old-Gen.init = 
> 120586240], [memory.pools.Code-Cache.committed = 15138816], 
> [receive-queue-pct_full = 0.0], [worker-transfer-queue-pct_full = 0.0], 
> [receive-queue-population = 0], [memory.pools.Compressed-Class-Space.init = 0], 
> [memory.pools.Code-Cache.usage = 0.059299468994140625], 
> [worker-transfer-queue-dropped_messages = 0], [GC.G1-Young-Generation.count = 
> 18], [memory.pools.Code-Cache.used = 14923200], 
> [memory.pools.G1-Old-Gen.usage = 0.012695297598838806], 
> [memory.non-heap.usage = -6.196368E7], [memory.total.max = 1073741823], 
> [threads.count = 51], [memory.heap.used = 39845872], 
> [memory.pools.G1-Survivor-Space.init = 0], [memory.pools.G1-Old-Gen.used = 
> 13631472], [receive-queue-dropped_messages = 0], 
> [threads.terminated.count = 0], [memory.pools.G1-Eden-Space.max = -1], [uptimeSecs = 76], 
> [threads.deadlock.count = 0], [threads.blocked.count = 0], [newWorkerEvent = 
> 1], [receive-queue-capacity = 32768], [threads.new.count = 0], [startTimeSecs 
> = 1610568920], [memory.pools.G1-Eden-Space.used-after-gc = 0], 
> [memory.pools.G1-Eden-Space.used = 20971520], 
> [GC.G1-Old-Generation.time = 0], [memory.non-heap.used = 61964384], 
> [memory.pools.G1-Old-Gen.max = 1073741824], 
> [memory.pools.G1-Survivor-Space.max = -1], 
> [memory.pools.G1-Survivor-Space.usage = 1.0], [memory.total.committed = 191823872], [doHeartbeat-calls.count = 
> 64], [doHeartbeat-calls.m1_rate = 1.0730202200365234E-6], 
> [doHeartbeat-calls.m5_rate = 1.1636999000665182E-6], 
> [doHeartbeat-calls.m15_rate = 1.1870955900857726E-6], 
> [doHeartbeat-calls.mean_rate = 1.0067076836696486E-6]]] PROC_START_TIME(sampled): null 
> EXEC_START_TIME(sampled): null]
> ...
> 2021-01-13 20:16:37.030 o.a.s.u.Utils Thread-16-__system-executor[-1, -1] 
> [ERROR] Async loop died!
> java.lang.RuntimeException: com.esotericsoftware.kryo.KryoException: 
> java.lang.IllegalArgumentException: Class is not registered: 
> org.apache.storm.generated.NodeInfo
> Note: To register this class use: 
> kryo.register(org.apache.storm.generated.NodeInfo.class);
> Serialization trace:
> value (org.apache.storm.metric.api.IMetricsConsumer$DataPoint)
>         at org.apache.storm.executor.Executor.accept(Executor.java:294) 
> ~[storm-client-2.3.0.y.jar:2.3.0.y]
>         at org.apache.storm.utils.JCQueue.consumeImpl(JCQueue.java:113) 
> ~[storm-client-2.3.0.y.jar:2.3.0.y]
>         at org.apache.storm.utils.JCQueue.consume(JCQueue.java:89) 
> ~[storm-client-2.3.0.y.jar:2.3.0.y]
>         at 
> org.apache.storm.executor.bolt.BoltExecutor$1.call(BoltExecutor.java:159) 
> ~[storm-client-2.3.0.y.jar:2.3.0.y]
>         at 
> org.apache.storm.executor.bolt.BoltExecutor$1.call(BoltExecutor.java:145) 
> ~[storm-client-2.3.0.y.jar:2.3.0.y]
>         at org.apache.storm.utils.Utils$1.run(Utils.java:401) 
> [storm-client-2.3.0.y.jar:2.3.0.y]
>         at java.lang.Thread.run(Thread.java:748) [?:1.8.0_262]
> Caused by: com.esotericsoftware.kryo.KryoException: 
> java.lang.IllegalArgumentException: Class is not registered: 
> org.apache.storm.generated.NodeInfo
> Note: To register this class use: 
> kryo.register(org.apache.storm.generated.NodeInfo.class);
> Serialization trace:
> value (org.apache.storm.metric.api.IMetricsConsumer$DataPoint)
>         at 
> com.esotericsoftware.kryo.serializers.ObjectField.write(ObjectField.java:101) 
> ~[kryo-3.0.3.jar:?]
>         at 
> com.esotericsoftware.kryo.serializers.FieldSerializer.write(FieldSerializer.java:518)
>  ~[kryo-3.0.3.jar:?]
>         at com.esotericsoftware.kryo.Kryo.writeClassAndObject(Kryo.java:628) 
> ~[kryo-3.0.3.jar:?]
>         at 
> com.esotericsoftware.kryo.serializers.CollectionSerializer.write(CollectionSerializer.java:100)
>  ~[kryo-3.0.3.jar:?]
>         at 
> com.esotericsoftware.kryo.serializers.CollectionSerializer.write(CollectionSerializer.java:40)
>  ~[kryo-3.0.3.jar:?]
>         at com.esotericsoftware.kryo.Kryo.writeClassAndObject(Kryo.java:628) 
> ~[kryo-3.0.3.jar:?]
>         at 
> com.esotericsoftware.kryo.serializers.CollectionSerializer.write(CollectionSerializer.java:100)
>  ~[kryo-3.0.3.jar:?]
>         at 
> com.esotericsoftware.kryo.serializers.CollectionSerializer.write(CollectionSerializer.java:40)
>  ~[kryo-3.0.3.jar:?]
>         at com.esotericsoftware.kryo.Kryo.writeObject(Kryo.java:534) 
> ~[kryo-3.0.3.jar:?]
>         at 
> org.apache.storm.serialization.KryoValuesSerializer.serializeInto(KryoValuesSerializer.java:38)
>  ~[storm-client-2.3.0.y.jar:2.3.0.y]
>         at 
> org.apache.storm.serialization.KryoTupleSerializer.serialize(KryoTupleSerializer.java:40)
>  ~[storm-client-2.3.0.y.jar:2.3.0.y]
>         at 
> org.apache.storm.daemon.worker.WorkerTransfer.tryTransferRemote(WorkerTransfer.java:118)
>  ~[storm-client-2.3.0.y.jar:2.3.0.y]
>         at 
> org.apache.storm.daemon.worker.WorkerState.tryTransferRemote(WorkerState.java:553)
>  ~[storm-client-2.3.0.y.jar:2.3.0.y]
>         at 
> org.apache.storm.executor.ExecutorTransfer.tryTransfer(ExecutorTransfer.java:68)
>  ~[storm-client-2.3.0.y.jar:2.3.0.y]
>         at org.apache.storm.daemon.Task.sendUnanchored(Task.java:215) 
> ~[storm-client-2.3.0.y.jar:2.3.0.y]
>         at org.apache.storm.executor.Executor.metricsTick(Executor.java:345) 
> ~[storm-client-2.3.0.y.jar:2.3.0.y]
>         at 
> org.apache.storm.executor.bolt.BoltExecutor.tupleActionFn(BoltExecutor.java:205)
>  ~[storm-client-2.3.0.y.jar:2.3.0.y]
>         at org.apache.storm.executor.Executor.accept(Executor.java:290) 
> ~[storm-client-2.3.0.y.jar:2.3.0.y]
>         ... 6 more
> Caused by: java.lang.IllegalArgumentException: Class is not registered: 
> org.apache.storm.generated.NodeInfo
> Note: To register this class use: 
> kryo.register(org.apache.storm.generated.NodeInfo.class);
>         at com.esotericsoftware.kryo.Kryo.getRegistration(Kryo.java:488) 
> ~[kryo-3.0.3.jar:?]
>         at 
> com.esotericsoftware.kryo.util.DefaultClassResolver.writeClass(DefaultClassResolver.java:97)
>  ~[kryo-3.0.3.jar:?]
>         at com.esotericsoftware.kryo.Kryo.writeClass(Kryo.java:517) 
> ~[kryo-3.0.3.jar:?]
>         at com.esotericsoftware.kryo.Kryo.writeClassAndObject(Kryo.java:622) 
> ~[kryo-3.0.3.jar:?]
>         at 
> com.esotericsoftware.kryo.serializers.MapSerializer.write(MapSerializer.java:106)
>  ~[kryo-3.0.3.jar:?]
>         at 
> com.esotericsoftware.kryo.serializers.MapSerializer.write(MapSerializer.java:39)
>  ~[kryo-3.0.3.jar:?]
>         at com.esotericsoftware.kryo.Kryo.writeObject(Kryo.java:552) 
> ~[kryo-3.0.3.jar:?]
>         at 
> com.esotericsoftware.kryo.serializers.ObjectField.write(ObjectField.java:80) 
> ~[kryo-3.0.3.jar:?]
>         at 
> com.esotericsoftware.kryo.serializers.FieldSerializer.write(FieldSerializer.java:518)
>  ~[kryo-3.0.3.jar:?]
>         at com.esotericsoftware.kryo.Kryo.writeClassAndObject(Kryo.java:628) 
> ~[kryo-3.0.3.jar:?]
>         at 
> com.esotericsoftware.kryo.serializers.CollectionSerializer.write(CollectionSerializer.java:100)
>  ~[kryo-3.0.3.jar:?]
>         at 
> com.esotericsoftware.kryo.serializers.CollectionSerializer.write(CollectionSerializer.java:40)
>  ~[kryo-3.0.3.jar:?]
>         at com.esotericsoftware.kryo.Kryo.writeClassAndObject(Kryo.java:628) 
> ~[kryo-3.0.3.jar:?]
>         at 
> com.esotericsoftware.kryo.serializers.CollectionSerializer.write(CollectionSerializer.java:100)
>  ~[kryo-3.0.3.jar:?]
>         at 
> com.esotericsoftware.kryo.serializers.CollectionSerializer.write(CollectionSerializer.java:40)
>  ~[kryo-3.0.3.jar:?]
>         at com.esotericsoftware.kryo.Kryo.writeObject(Kryo.java:534) 
> ~[kryo-3.0.3.jar:?]
>         at 
> org.apache.storm.serialization.KryoValuesSerializer.serializeInto(KryoValuesSerializer.java:38)
>  ~[storm-client-2.3.0.y.jar:2.3.0.y]
>         at 
> org.apache.storm.serialization.KryoTupleSerializer.serialize(KryoTupleSerializer.java:40)
>  ~[storm-client-2.3.0.y.jar:2.3.0.y]
>         at 
> org.apache.storm.daemon.worker.WorkerTransfer.tryTransferRemote(WorkerTransfer.java:118)
>  ~[storm-client-2.3.0.y.jar:2.3.0.y]
>         at 
> org.apache.storm.daemon.worker.WorkerState.tryTransferRemote(WorkerState.java:553)
>  ~[storm-client-2.3.0.y.jar:2.3.0.y]
>         at 
> org.apache.storm.executor.ExecutorTransfer.tryTransfer(ExecutorTransfer.java:68)
>  ~[storm-client-2.3.0.y.jar:2.3.0.y]
>         at org.apache.storm.daemon.Task.sendUnanchored(Task.java:215) 
> ~[storm-client-2.3.0.y.jar:2.3.0.y]
>         at org.apache.storm.executor.Executor.metricsTick(Executor.java:345) 
> ~[storm-client-2.3.0.y.jar:2.3.0.y]
>         at 
> org.apache.storm.executor.bolt.BoltExecutor.tupleActionFn(BoltExecutor.java:205)
>  ~[storm-client-2.3.0.y.jar:2.3.0.y]
>         at org.apache.storm.executor.Executor.accept(Executor.java:290) 
> ~[storm-client-2.3.0.y.jar:2.3.0.y]
>         ... 6 more
> {code}
> The related metric is "__send-iconnection" from 
> https://github.com/apache/storm/blob/7bef73a6faa14558ef254efe74cbe4bfef81c2e2/storm-client/src/jvm/org/apache/storm/daemon/metrics/BuiltinMetricsUtil.java#L40-L43
> Note that this can only be reproduced when metrics are sent across workers 
> (otherwise there is no serialization).
> The workaround is one of the following:
> 1) Add org.apache.storm.generated.NodeInfo to topology.kryo.register in the 
> topology conf.
> 2) Set topology.fall.back.on.java.serialization to true, or leave it unset, 
> since the default is true.
> The fix is to register the NodeInfo class in Kryo:
> https://github.com/apache/storm/blob/7bef73a6faa14558ef254efe74cbe4bfef81c2e2/storm-client/src/jvm/org/apache/storm/serialization/SerializationFactory.java#L67-L77
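
The two workarounds listed above can be sketched as plain configuration changes. This is a minimal illustration only: it uses a java.util.Map as a stand-in for the topology conf so that it runs without Storm on the classpath, and the class and method names (KryoWorkaroundConf, registerNodeInfo, enableJavaFallback) are hypothetical. The config keys and the NodeInfo class name are the ones quoted in the issue description.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Sketch of the two workarounds; a plain Map stands in for the topology conf (assumption). */
public class KryoWorkaroundConf {
    // Keys and class name as quoted in the issue description.
    static final String KRYO_REGISTER = "topology.kryo.register";
    static final String FALL_BACK = "topology.fall.back.on.java.serialization";
    static final String NODE_INFO = "org.apache.storm.generated.NodeInfo";

    /** Workaround 1: add NodeInfo to the list of classes registered with Kryo. */
    public static Map<String, Object> registerNodeInfo(Map<String, Object> conf) {
        Map<String, Object> copy = new HashMap<>(conf);
        List<Object> registered = new ArrayList<>();
        Object existing = copy.get(KRYO_REGISTER);
        if (existing instanceof List) {
            // Keep whatever the topology already registers.
            registered.addAll((List<?>) existing);
        }
        if (!registered.contains(NODE_INFO)) {
            registered.add(NODE_INFO);
        }
        copy.put(KRYO_REGISTER, registered);
        return copy;
    }

    /** Workaround 2: let unregistered classes fall back to Java serialization. */
    public static Map<String, Object> enableJavaFallback(Map<String, Object> conf) {
        Map<String, Object> copy = new HashMap<>(conf);
        copy.put(FALL_BACK, Boolean.TRUE);
        return copy;
    }

    public static void main(String[] args) {
        // The failing setup from the report: fallback explicitly disabled.
        Map<String, Object> conf = new HashMap<>();
        conf.put(FALL_BACK, Boolean.FALSE);

        System.out.println(registerNodeInfo(conf));
        System.out.println(enableJavaFallback(conf));
    }
}
```

In a real topology the same keys would be set on the topology conf passed at submission time. Workaround 1 keeps the strict Kryo-only behavior, while workaround 2 trades it for the slower Java serialization path on unregistered classes.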



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
