[ https://issues.apache.org/jira/browse/STORM-3735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17268047#comment-17268047 ]
Ethan Li edited comment on STORM-3735 at 1/19/21, 5:13 PM: ----------------------------------------------------------- With STORM-3682 (code: https://github.com/apache/storm/pull/3371/), I believe NodeInfo is no longer needed in kryo registration since the related code is removed with this code change #3371. We can evaluate it and maybe remove the NodeInfo from kryo registration in the future. was (Author: ethanli): With STORM-3682 (code: https://github.com/apache/storm/pull/3371/), I believe NodeInfo is not longer needed in kryo registration since the related code is removed with this code change #3371. We can evaluate it and maybe remove the NodeInfo from kryo registration in the future. > Kyro serialization fails on some metric tuples when > topology.fall.back.on.java.serialization is false > ----------------------------------------------------------------------------------------------------- > > Key: STORM-3735 > URL: https://issues.apache.org/jira/browse/STORM-3735 > Project: Apache Storm > Issue Type: Bug > Affects Versions: 2.0.0, 2.1.0, 2.2.0 > Reporter: Ethan Li > Assignee: Ethan Li > Priority: Major > Fix For: 2.3.0 > > Time Spent: 20m > Remaining Estimate: 0h > > When a metric consumer is used, metrics will be sent from all executors to > the consumer. In some of the metrics, it includes NodeInfo object, and kryo > serialization will fail if topology.fall.back.on.java.serialization is false. > {code:title=worker logs} > 2021-01-13 20:16:37.017 o.a.s.e.ExecutorTransfer > Thread-16-__system-executor[-1, -1] [INFO] TRANSFERRING tuple [dest: 5 tuple: > source: __system:-1, stream: __metrics, id: {}, [TASK_INFO: { host: > openstorm14blue-n4.blue.ygrid.yahoo.com:6703 comp: __system[-1]}, [ > [CGroupCpuStat = {nr.throttled-percentage=46.544980443285525, > nr.period-count=767, nr.throttled-count=357, throttled.time-ms=27208}], > [CGroupMemoryLimit = 1342177280], [__recv-iconnection = {dequeuedMessages=0, > enqueued={/10.215.73.210:47038=3169}}], [__send-ico > nnection = {NodeInfo(node:149a917b-bc75-49c8-b351-f74b8ae0fbed-10.215.73.210, > port:[6701])={reconnects=1, src=/10.215.73.210:34938, pending=0, > dest=openstorm14blue-n4.blue.ygrid.yahoo.com/10.215.73.210:6701, sent=1896, > lostOnSend=0}, NodeInfo(node:149a917b-bc75- > 49c8-b351-f74b8ae0fbed-10.215.73.210, port:[6702])={reconnects=8, > src=/10.215.73.210:39476, pending=0, > dest=openstorm14blue-n4.blue.ygrid.yahoo.com/10.215.73.210:6702, sent=2115, > lostOnSend=0}, > NodeInfo(node:b77b5ec6-15ee-4bd2-a9b8-12fcadde7744-10.215.73.211, po > rt:[6700])={reconnects=125, pending=0, > dest=openstorm14blue-n5.blue.ygrid.yahoo.com/10.215.73.211:6700, sent=108, > lostOnSend=1331}}], [CGroupMemory = 316485632], [CGroupCpu = {user-ms=36960, > sys-ms=25860}], [memory.pools.Metaspace.usage = 0.9695890907929322], [m > emory.heap.max = 1073741824], [receive-queue-overflow = 0], > [memory.pools.Compressed-Class-Space.used = 6237424], > [memory.pools.Compressed-Class-Space.max = 1073741824], [memory.non-heap.init > = 2555904], [worker-transfer-queue-overflow = 0], [memory.pools.Metasp > ace.committed = 42074112], [receive-queue-sojourn_time_ms = 0.0], > [threads.waiting.count = 5], [memory.pools.G1-Eden-Space.usage = > 0.2777777777777778], [memory.pools.Metaspace.used = 40798320], > [memory.total.used = 101783888], [memory.pools.Code-Cache.init = 255 > 5904], [memory.non-heap.committed = 63832064], [GC.G1-Young-Generation.time = > 677], [receive-queue-insert_failures = 0.0], [memory.total.init = 130482176], > [GC.G1-Old-Generation.count = 0], [memory.pools.Metaspace.init = 0], > [memory.pools.G1-Survivor-Space.commi > tted = 5242880], [worker-transfer-queue-population = 0], > [memory.pools.Compressed-Class-Space.committed = 6684672], > [threads.timed_waiting.count = 31], [memory.pools.G1-Eden-Space.init = > 7340032], [memory.pools.Metaspace.max = -1], [memory.pools.G1-Survivor-Spac > e.used = 5242880], [memory.heap.init = 127926272], > [memory.pools.G1-Old-Gen.used-after-gc = 0], [worker-transfer-queue-capacity > = 1024], [memory.pools.G1-Survivor-Space.used-after-gc = 5242880], > [memory.pools.G1-Old-Gen.committed = 47185920], [memory.pools.G1-Ed > en-Space.committed = 75497472], [receive-queue-arrival_rate_secs = > 0.109421162052741], [memory.pools.Compressed-Class-Space.usage = > 0.0058090537786483765], [TGT-TimeToExpiryMsecs = 71282993], > [threads.runnable.count = 15], [worker-transfer-queue-insert_failures > = 0.0], [worker-transfer-queue-sojourn_time_ms = 0.0], [memory.heap.committed > = 127926272], [memory.non-heap.max = -1], [threads.daemon.count = 29], > [memory.pools.Code-Cache.max = 251658240], > [worker-transfer-queue-arrival_rate_secs = 90.47776674390379], [memory > .heap.usage = 0.037109360098838806], [memory.pools.G1-Old-Gen.init = > 120586240], [memory.pools.Code-Cache.committed = 15138816], > [receive-queue-pct_full = 0.0], [worker-transfer-queue-pct_full = 0.0], > [receive-queue-population = 0], [memory.pools.Compressed-Clas > s-Space.init = 0], [memory.pools.Code-Cache.usage = 0.059299468994140625], > [worker-transfer-queue-dropped_messages = 0], [GC.G1-Young-Generation.count = > 18], [memory.pools.Code-Cache.used = 14923200], > [memory.pools.G1-Old-Gen.usage = 0.012695297598838806], [memo > ry.non-heap.usage = -6.196368E7], [memory.total.max = 1073741823], > [threads.count = 51], [memory.heap.used = 39845872], > [memory.pools.G1-Survivor-Space.init = 0], [memory.pools.G1-Old-Gen.used = > 13631472], [receive-queue-dropped_messages = 0], [threads.terminate > d.count = 0], [memory.pools.G1-Eden-Space.max = -1], [uptimeSecs = 76], > [threads.deadlock.count = 0], [threads.blocked.count = 0], [newWorkerEvent = > 1], [receive-queue-capacity = 32768], [threads.new.count = 0], [startTimeSecs > = 1610568920], [memory.pools.G1-Ede > n-Space.used-after-gc = 0], [memory.pools.G1-Eden-Space.used = 20971520], > [GC.G1-Old-Generation.time = 0], [memory.non-heap.used = 61964384], > [memory.pools.G1-Old-Gen.max = 1073741824], > [memory.pools.G1-Survivor-Space.max = -1], [memory.pools.G1-Survivor-Space.u > sage = 1.0], [memory.total.committed = 191823872], [doHeartbeat-calls.count = > 64], [doHeartbeat-calls.m1_rate = 1.0730202200365234E-6], > [doHeartbeat-calls.m5_rate = 1.1636999000665182E-6], > [doHeartbeat-calls.m15_rate = 1.1870955900857726E-6], [doHeartbeat-calls. > mean_rate = 1.0067076836696486E-6]]] PROC_START_TIME(sampled): null > EXEC_START_TIME(sampled): null] > ... > 2021-01-13 20:16:37.030 o.a.s.u.Utils Thread-16-__system-executor[-1, -1] > [ERROR] Async loop died! > java.lang.RuntimeException: com.esotericsoftware.kryo.KryoException: > java.lang.IllegalArgumentException: Class is not registered: > org.apache.storm.generated.NodeInfo > Note: To register this class use: > kryo.register(org.apache.storm.generated.NodeInfo.class); > Serialization trace: > value (org.apache.storm.metric.api.IMetricsConsumer$DataPoint) > at org.apache.storm.executor.Executor.accept(Executor.java:294) > ~[storm-client-2.3.0.y.jar:2.3.0.y] > at org.apache.storm.utils.JCQueue.consumeImpl(JCQueue.java:113) > ~[storm-client-2.3.0.y.jar:2.3.0.y] > at org.apache.storm.utils.JCQueue.consume(JCQueue.java:89) > ~[storm-client-2.3.0.y.jar:2.3.0.y] > at > org.apache.storm.executor.bolt.BoltExecutor$1.call(BoltExecutor.java:159) > ~[storm-client-2.3.0.y.jar:2.3.0.y] > at > org.apache.storm.executor.bolt.BoltExecutor$1.call(BoltExecutor.java:145) > ~[storm-client-2.3.0.y.jar:2.3.0.y] > at org.apache.storm.utils.Utils$1.run(Utils.java:401) > [storm-client-2.3.0.y.jar:2.3.0.y] > at java.lang.Thread.run(Thread.java:748) [?:1.8.0_262] > Caused by: com.esotericsoftware.kryo.KryoException: > java.lang.IllegalArgumentException: Class is not registered: > org.apache.storm.generated.NodeInfo > Note: To register this class use: > kryo.register(org.apache.storm.generated.NodeInfo.class); > Serialization trace: > value (org.apache.storm.metric.api.IMetricsConsumer$DataPoint) > at > com.esotericsoftware.kryo.serializers.ObjectField.write(ObjectField.java:101) > ~[kryo-3.0.3.jar:?] > at > com.esotericsoftware.kryo.serializers.FieldSerializer.write(FieldSerializer.java:518) > ~[kryo-3.0.3.jar:?] > at com.esotericsoftware.kryo.Kryo.writeClassAndObject(Kryo.java:628) > ~[kryo-3.0.3.jar:?] > Serialization trace: > value (org.apache.storm.metric.api.IMetricsConsumer$DataPoint) > at org.apache.storm.executor.Executor.accept(Executor.java:294) > ~[storm-client-2.3.0.y.jar:2.3.0.y] > at org.apache.storm.utils.JCQueue.consumeImpl(JCQueue.java:113) > ~[storm-client-2.3.0.y.jar:2.3.0.y] > at org.apache.storm.utils.JCQueue.consume(JCQueue.java:89) > ~[storm-client-2.3.0.y.jar:2.3.0.y] > at > org.apache.storm.executor.bolt.BoltExecutor$1.call(BoltExecutor.java:159) > ~[storm-client-2.3.0.y.jar:2.3.0.y] > at > org.apache.storm.executor.bolt.BoltExecutor$1.call(BoltExecutor.java:145) > ~[storm-client-2.3.0.y.jar:2.3.0.y] > at org.apache.storm.utils.Utils$1.run(Utils.java:401) > [storm-client-2.3.0.y.jar:2.3.0.y] > at java.lang.Thread.run(Thread.java:748) [?:1.8.0_262] > Caused by: com.esotericsoftware.kryo.KryoException: > java.lang.IllegalArgumentException: Class is not registered: > org.apache.storm.generated.NodeInfo > Note: To register this class use: > kryo.register(org.apache.storm.generated.NodeInfo.class); > Serialization trace: > value (org.apache.storm.metric.api.IMetricsConsumer$DataPoint) > at > com.esotericsoftware.kryo.serializers.ObjectField.write(ObjectField.java:101) > ~[kryo-3.0.3.jar:?] > at > com.esotericsoftware.kryo.serializers.FieldSerializer.write(FieldSerializer.java:518) > ~[kryo-3.0.3.jar:?] > at com.esotericsoftware.kryo.Kryo.writeClassAndObject(Kryo.java:628) > ~[kryo-3.0.3.jar:?] > at > com.esotericsoftware.kryo.serializers.CollectionSerializer.write(CollectionSerializer.java:100) > ~[kryo-3.0.3.jar:?] > at > com.esotericsoftware.kryo.serializers.CollectionSerializer.write(CollectionSerializer.java:40) > ~[kryo-3.0.3.jar:?] > at com.esotericsoftware.kryo.Kryo.writeClassAndObject(Kryo.java:628) > ~[kryo-3.0.3.jar:?] > at > com.esotericsoftware.kryo.serializers.CollectionSerializer.write(CollectionSerializer.java:100) > ~[kryo-3.0.3.jar:?] > at > com.esotericsoftware.kryo.serializers.CollectionSerializer.write(CollectionSerializer.java:40) > ~[kryo-3.0.3.jar:?] > at com.esotericsoftware.kryo.Kryo.writeObject(Kryo.java:534) > ~[kryo-3.0.3.jar:?] > at > org.apache.storm.serialization.KryoValuesSerializer.serializeInto(KryoValuesSerializer.java:38) > ~[storm-client-2.3.0.y.jar:2.3.0.y] > at > org.apache.storm.serialization.KryoTupleSerializer.serialize(KryoTupleSerializer.java:40) > ~[storm-client-2.3.0.y.jar:2.3.0.y] > at > org.apache.storm.daemon.worker.WorkerTransfer.tryTransferRemote(WorkerTransfer.java:118) > ~[storm-client-2.3.0.y.jar:2.3.0.y] > at > org.apache.storm.daemon.worker.WorkerState.tryTransferRemote(WorkerState.java:553) > ~[storm-client-2.3.0.y.jar:2.3.0.y] > at > org.apache.storm.executor.ExecutorTransfer.tryTransfer(ExecutorTransfer.java:68) > ~[storm-client-2.3.0.y.jar:2.3.0.y] > at org.apache.storm.daemon.Task.sendUnanchored(Task.java:215) > ~[storm-client-2.3.0.y.jar:2.3.0.y] > at org.apache.storm.executor.Executor.metricsTick(Executor.java:345) > ~[storm-client-2.3.0.y.jar:2.3.0.y] > at > org.apache.storm.executor.bolt.BoltExecutor.tupleActionFn(BoltExecutor.java:205) > ~[storm-client-2.3.0.y.jar:2.3.0.y] > at org.apache.storm.executor.Executor.accept(Executor.java:290) > ~[storm-client-2.3.0.y.jar:2.3.0.y] > ... 6 more > Caused by: java.lang.IllegalArgumentException: Class is not registered: > org.apache.storm.generated.NodeInfo > Note: To register this class use: > kryo.register(org.apache.storm.generated.NodeInfo.class); > at com.esotericsoftware.kryo.Kryo.getRegistration(Kryo.java:488) > ~[kryo-3.0.3.jar:?] > at > com.esotericsoftware.kryo.util.DefaultClassResolver.writeClass(DefaultClassResolver.java:97) > ~[kryo-3.0.3.jar:?] > at com.esotericsoftware.kryo.Kryo.writeClass(Kryo.java:517) > ~[kryo-3.0.3.jar:?] > at com.esotericsoftware.kryo.Kryo.writeClassAndObject(Kryo.java:622) > ~[kryo-3.0.3.jar:?] > at > com.esotericsoftware.kryo.serializers.MapSerializer.write(MapSerializer.java:106) > ~[kryo-3.0.3.jar:?] > at > com.esotericsoftware.kryo.serializers.MapSerializer.write(MapSerializer.java:39) > ~[kryo-3.0.3.jar:?] > at com.esotericsoftware.kryo.Kryo.writeObject(Kryo.java:552) > ~[kryo-3.0.3.jar:?] > at > com.esotericsoftware.kryo.serializers.ObjectField.write(ObjectField.java:80) > ~[kryo-3.0.3.jar:?] > at > com.esotericsoftware.kryo.serializers.FieldSerializer.write(FieldSerializer.java:518) > ~[kryo-3.0.3.jar:?] > at com.esotericsoftware.kryo.Kryo.writeClassAndObject(Kryo.java:628) > ~[kryo-3.0.3.jar:?] > at > com.esotericsoftware.kryo.serializers.CollectionSerializer.write(CollectionSerializer.java:100) > ~[kryo-3.0.3.jar:?] > at > com.esotericsoftware.kryo.serializers.CollectionSerializer.write(CollectionSerializer.java:40) > ~[kryo-3.0.3.jar:?] > at com.esotericsoftware.kryo.Kryo.writeClassAndObject(Kryo.java:628) > ~[kryo-3.0.3.jar:?] > at > com.esotericsoftware.kryo.serializers.CollectionSerializer.write(CollectionSerializer.java:100) > ~[kryo-3.0.3.jar:?] > at > com.esotericsoftware.kryo.serializers.CollectionSerializer.write(CollectionSerializer.java:40) > ~[kryo-3.0.3.jar:?] > at com.esotericsoftware.kryo.Kryo.writeObject(Kryo.java:534) > ~[kryo-3.0.3.jar:?] > at > org.apache.storm.serialization.KryoValuesSerializer.serializeInto(KryoValuesSerializer.java:38) > ~[storm-client-2.3.0.y.jar:2.3.0.y] > at > org.apache.storm.serialization.KryoTupleSerializer.serialize(KryoTupleSerializer.java:40) > ~[storm-client-2.3.0.y.jar:2.3.0.y] > at > org.apache.storm.daemon.worker.WorkerTransfer.tryTransferRemote(WorkerTransfer.java:118) > ~[storm-client-2.3.0.y.jar:2.3.0.y] > at > org.apache.storm.daemon.worker.WorkerState.tryTransferRemote(WorkerState.java:553) > ~[storm-client-2.3.0.y.jar:2.3.0.y] > at > org.apache.storm.executor.ExecutorTransfer.tryTransfer(ExecutorTransfer.java:68) > ~[storm-client-2.3.0.y.jar:2.3.0.y] > at org.apache.storm.daemon.Task.sendUnanchored(Task.java:215) > ~[storm-client-2.3.0.y.jar:2.3.0.y] > at org.apache.storm.executor.Executor.metricsTick(Executor.java:345) > ~[storm-client-2.3.0.y.jar:2.3.0.y] > at > org.apache.storm.executor.bolt.BoltExecutor.tupleActionFn(BoltExecutor.java:205) > ~[storm-client-2.3.0.y.jar:2.3.0.y] > at org.apache.storm.executor.Executor.accept(Executor.java:290) > ~[storm-client-2.3.0.y.jar:2.3.0.y] > ... 6 more > {code} > The related metric is "__send-iconnection" from > https://github.com/apache/storm/blob/7bef73a6faa14558ef254efe74cbe4bfef81c2e2/storm-client/src/jvm/org/apache/storm/daemon/metrics/BuiltinMetricsUtil.java#L40-L43 > Note that this can only be reproduced when metrics are sent across workers > (otherwise there is no serialization). > The work around is one of the following > 1) add org.apache.storm.generated.NodeInfo to topology.kryo.register in > topology conf > 2) set topology.fall.back.on.java.serialization true or unset > topology.fall.back.on.java.serialization since the default is true > The fix is to register NodeInfo class in kryo. > https://github.com/apache/storm/blob/7bef73a6faa14558ef254efe74cbe4bfef81c2e2/storm-client/src/jvm/org/apache/storm/serialization/SerializationFactory.java#L67-L77 -- This message was sent by Atlassian Jira (v8.3.4#803005)