Ethan Li created STORM-3735:
-------------------------------

             Summary: Kyro serialization fails on some metric tuples when 
topology.fall.back.on.java.serialization is false
                 Key: STORM-3735
                 URL: https://issues.apache.org/jira/browse/STORM-3735
             Project: Apache Storm
          Issue Type: Bug
            Reporter: Ethan Li


When a metric consumer is used, metrics will be sent from all executors to the 
consumer. In some of the metrics,  it includes NodeInfo object, and kryo 
serialization will fail if topology.fall.back.on.java.serialization is false.

{code:title=worker logs}
2021-01-13 20:16:37.017 o.a.s.e.ExecutorTransfer 
Thread-16-__system-executor[-1, -1] [INFO] TRANSFERRING tuple [dest: 5 tuple: 
source: __system:-1, stream: __metrics, id: {}, [TASK_INFO: { host: 
openstorm14blue-n4.blue.ygrid.yahoo.com:6703 comp: __system[-1]}, [
[CGroupCpuStat = {nr.throttled-percentage=46.544980443285525, 
nr.period-count=767, nr.throttled-count=357, throttled.time-ms=27208}], 
[CGroupMemoryLimit = 1342177280], [__recv-iconnection = {dequeuedMessages=0, 
enqueued={/10.215.73.210:47038=3169}}], [__send-ico
nnection = {NodeInfo(node:149a917b-bc75-49c8-b351-f74b8ae0fbed-10.215.73.210, 
port:[6701])={reconnects=1, src=/10.215.73.210:34938, pending=0, 
dest=openstorm14blue-n4.blue.ygrid.yahoo.com/10.215.73.210:6701, sent=1896, 
lostOnSend=0}, NodeInfo(node:149a917b-bc75-
49c8-b351-f74b8ae0fbed-10.215.73.210, port:[6702])={reconnects=8, 
src=/10.215.73.210:39476, pending=0, 
dest=openstorm14blue-n4.blue.ygrid.yahoo.com/10.215.73.210:6702, sent=2115, 
lostOnSend=0}, 
NodeInfo(node:b77b5ec6-15ee-4bd2-a9b8-12fcadde7744-10.215.73.211, po
rt:[6700])={reconnects=125, pending=0, 
dest=openstorm14blue-n5.blue.ygrid.yahoo.com/10.215.73.211:6700, sent=108, 
lostOnSend=1331}}], [CGroupMemory = 316485632], [CGroupCpu = {user-ms=36960, 
sys-ms=25860}], [memory.pools.Metaspace.usage = 0.9695890907929322], [m
emory.heap.max = 1073741824], [receive-queue-overflow = 0], 
[memory.pools.Compressed-Class-Space.used = 6237424], 
[memory.pools.Compressed-Class-Space.max = 1073741824], [memory.non-heap.init = 
2555904], [worker-transfer-queue-overflow = 0], [memory.pools.Metasp
ace.committed = 42074112], [receive-queue-sojourn_time_ms = 0.0], 
[threads.waiting.count = 5], [memory.pools.G1-Eden-Space.usage = 
0.2777777777777778], [memory.pools.Metaspace.used = 40798320], 
[memory.total.used = 101783888], [memory.pools.Code-Cache.init = 255
5904], [memory.non-heap.committed = 63832064], [GC.G1-Young-Generation.time = 
677], [receive-queue-insert_failures = 0.0], [memory.total.init = 130482176], 
[GC.G1-Old-Generation.count = 0], [memory.pools.Metaspace.init = 0], 
[memory.pools.G1-Survivor-Space.commi
tted = 5242880], [worker-transfer-queue-population = 0], 
[memory.pools.Compressed-Class-Space.committed = 6684672], 
[threads.timed_waiting.count = 31], [memory.pools.G1-Eden-Space.init = 
7340032], [memory.pools.Metaspace.max = -1], [memory.pools.G1-Survivor-Spac
e.used = 5242880], [memory.heap.init = 127926272], 
[memory.pools.G1-Old-Gen.used-after-gc = 0], [worker-transfer-queue-capacity = 
1024], [memory.pools.G1-Survivor-Space.used-after-gc = 5242880], 
[memory.pools.G1-Old-Gen.committed = 47185920], [memory.pools.G1-Ed
en-Space.committed = 75497472], [receive-queue-arrival_rate_secs = 
0.109421162052741], [memory.pools.Compressed-Class-Space.usage = 
0.0058090537786483765], [TGT-TimeToExpiryMsecs = 71282993], 
[threads.runnable.count = 15], [worker-transfer-queue-insert_failures
= 0.0], [worker-transfer-queue-sojourn_time_ms = 0.0], [memory.heap.committed = 
127926272], [memory.non-heap.max = -1], [threads.daemon.count = 29], 
[memory.pools.Code-Cache.max = 251658240], 
[worker-transfer-queue-arrival_rate_secs = 90.47776674390379], [memory
.heap.usage = 0.037109360098838806], [memory.pools.G1-Old-Gen.init = 
120586240], [memory.pools.Code-Cache.committed = 15138816], 
[receive-queue-pct_full = 0.0], [worker-transfer-queue-pct_full = 0.0], 
[receive-queue-population = 0], [memory.pools.Compressed-Clas
s-Space.init = 0], [memory.pools.Code-Cache.usage = 0.059299468994140625], 
[worker-transfer-queue-dropped_messages = 0], [GC.G1-Young-Generation.count = 
18], [memory.pools.Code-Cache.used = 14923200], [memory.pools.G1-Old-Gen.usage 
= 0.012695297598838806], [memo
ry.non-heap.usage = -6.196368E7], [memory.total.max = 1073741823], 
[threads.count = 51], [memory.heap.used = 39845872], 
[memory.pools.G1-Survivor-Space.init = 0], [memory.pools.G1-Old-Gen.used = 
13631472], [receive-queue-dropped_messages = 0], [threads.terminate
d.count = 0], [memory.pools.G1-Eden-Space.max = -1], [uptimeSecs = 76], 
[threads.deadlock.count = 0], [threads.blocked.count = 0], [newWorkerEvent = 
1], [receive-queue-capacity = 32768], [threads.new.count = 0], [startTimeSecs = 
1610568920], [memory.pools.G1-Ede
n-Space.used-after-gc = 0], [memory.pools.G1-Eden-Space.used = 20971520], 
[GC.G1-Old-Generation.time = 0], [memory.non-heap.used = 61964384], 
[memory.pools.G1-Old-Gen.max = 1073741824], [memory.pools.G1-Survivor-Space.max 
= -1], [memory.pools.G1-Survivor-Space.u
sage = 1.0], [memory.total.committed = 191823872], [doHeartbeat-calls.count = 
64], [doHeartbeat-calls.m1_rate = 1.0730202200365234E-6], 
[doHeartbeat-calls.m5_rate = 1.1636999000665182E-6], 
[doHeartbeat-calls.m15_rate = 1.1870955900857726E-6], [doHeartbeat-calls.
mean_rate = 1.0067076836696486E-6]]] PROC_START_TIME(sampled): null 
EXEC_START_TIME(sampled): null]

...

2021-01-13 20:16:37.030 o.a.s.u.Utils Thread-16-__system-executor[-1, -1] 
[ERROR] Async loop died!
java.lang.RuntimeException: com.esotericsoftware.kryo.KryoException: 
java.lang.IllegalArgumentException: Class is not registered: 
org.apache.storm.generated.NodeInfo
Note: To register this class use: 
kryo.register(org.apache.storm.generated.NodeInfo.class);
Serialization trace:
value (org.apache.storm.metric.api.IMetricsConsumer$DataPoint)
        at org.apache.storm.executor.Executor.accept(Executor.java:294) 
~[storm-client-2.3.0.y.jar:2.3.0.y]
        at org.apache.storm.utils.JCQueue.consumeImpl(JCQueue.java:113) 
~[storm-client-2.3.0.y.jar:2.3.0.y]
        at org.apache.storm.utils.JCQueue.consume(JCQueue.java:89) 
~[storm-client-2.3.0.y.jar:2.3.0.y]
        at 
org.apache.storm.executor.bolt.BoltExecutor$1.call(BoltExecutor.java:159) 
~[storm-client-2.3.0.y.jar:2.3.0.y]
        at 
org.apache.storm.executor.bolt.BoltExecutor$1.call(BoltExecutor.java:145) 
~[storm-client-2.3.0.y.jar:2.3.0.y]
        at org.apache.storm.utils.Utils$1.run(Utils.java:401) 
[storm-client-2.3.0.y.jar:2.3.0.y]
        at java.lang.Thread.run(Thread.java:748) [?:1.8.0_262]
Caused by: com.esotericsoftware.kryo.KryoException: 
java.lang.IllegalArgumentException: Class is not registered: 
org.apache.storm.generated.NodeInfo
Note: To register this class use: 
kryo.register(org.apache.storm.generated.NodeInfo.class);
Serialization trace:
value (org.apache.storm.metric.api.IMetricsConsumer$DataPoint)
        at 
com.esotericsoftware.kryo.serializers.ObjectField.write(ObjectField.java:101) 
~[kryo-3.0.3.jar:?]
        at 
com.esotericsoftware.kryo.serializers.FieldSerializer.write(FieldSerializer.java:518)
 ~[kryo-3.0.3.jar:?]
        at com.esotericsoftware.kryo.Kryo.writeClassAndObject(Kryo.java:628) 
~[kryo-3.0.3.jar:?]
Serialization trace:
value (org.apache.storm.metric.api.IMetricsConsumer$DataPoint)
        at org.apache.storm.executor.Executor.accept(Executor.java:294) 
~[storm-client-2.3.0.y.jar:2.3.0.y]
        at org.apache.storm.utils.JCQueue.consumeImpl(JCQueue.java:113) 
~[storm-client-2.3.0.y.jar:2.3.0.y]
        at org.apache.storm.utils.JCQueue.consume(JCQueue.java:89) 
~[storm-client-2.3.0.y.jar:2.3.0.y]
        at 
org.apache.storm.executor.bolt.BoltExecutor$1.call(BoltExecutor.java:159) 
~[storm-client-2.3.0.y.jar:2.3.0.y]
        at 
org.apache.storm.executor.bolt.BoltExecutor$1.call(BoltExecutor.java:145) 
~[storm-client-2.3.0.y.jar:2.3.0.y]
        at org.apache.storm.utils.Utils$1.run(Utils.java:401) 
[storm-client-2.3.0.y.jar:2.3.0.y]
        at java.lang.Thread.run(Thread.java:748) [?:1.8.0_262]
Caused by: com.esotericsoftware.kryo.KryoException: 
java.lang.IllegalArgumentException: Class is not registered: 
org.apache.storm.generated.NodeInfo
Note: To register this class use: 
kryo.register(org.apache.storm.generated.NodeInfo.class);
Serialization trace:
value (org.apache.storm.metric.api.IMetricsConsumer$DataPoint)
        at 
com.esotericsoftware.kryo.serializers.ObjectField.write(ObjectField.java:101) 
~[kryo-3.0.3.jar:?]
        at 
com.esotericsoftware.kryo.serializers.FieldSerializer.write(FieldSerializer.java:518)
 ~[kryo-3.0.3.jar:?]
        at com.esotericsoftware.kryo.Kryo.writeClassAndObject(Kryo.java:628) 
~[kryo-3.0.3.jar:?]
        at 
com.esotericsoftware.kryo.serializers.CollectionSerializer.write(CollectionSerializer.java:100)
 ~[kryo-3.0.3.jar:?]
        at 
com.esotericsoftware.kryo.serializers.CollectionSerializer.write(CollectionSerializer.java:40)
 ~[kryo-3.0.3.jar:?]
        at com.esotericsoftware.kryo.Kryo.writeClassAndObject(Kryo.java:628) 
~[kryo-3.0.3.jar:?]
        at 
com.esotericsoftware.kryo.serializers.CollectionSerializer.write(CollectionSerializer.java:100)
 ~[kryo-3.0.3.jar:?]
        at 
com.esotericsoftware.kryo.serializers.CollectionSerializer.write(CollectionSerializer.java:40)
 ~[kryo-3.0.3.jar:?]
        at com.esotericsoftware.kryo.Kryo.writeObject(Kryo.java:534) 
~[kryo-3.0.3.jar:?]
        at 
org.apache.storm.serialization.KryoValuesSerializer.serializeInto(KryoValuesSerializer.java:38)
 ~[storm-client-2.3.0.y.jar:2.3.0.y]
        at 
org.apache.storm.serialization.KryoTupleSerializer.serialize(KryoTupleSerializer.java:40)
 ~[storm-client-2.3.0.y.jar:2.3.0.y]
        at 
org.apache.storm.daemon.worker.WorkerTransfer.tryTransferRemote(WorkerTransfer.java:118)
 ~[storm-client-2.3.0.y.jar:2.3.0.y]
        at 
org.apache.storm.daemon.worker.WorkerState.tryTransferRemote(WorkerState.java:553)
 ~[storm-client-2.3.0.y.jar:2.3.0.y]
        at 
org.apache.storm.executor.ExecutorTransfer.tryTransfer(ExecutorTransfer.java:68)
 ~[storm-client-2.3.0.y.jar:2.3.0.y]
        at org.apache.storm.daemon.Task.sendUnanchored(Task.java:215) 
~[storm-client-2.3.0.y.jar:2.3.0.y]
        at org.apache.storm.executor.Executor.metricsTick(Executor.java:345) 
~[storm-client-2.3.0.y.jar:2.3.0.y]
        at 
org.apache.storm.executor.bolt.BoltExecutor.tupleActionFn(BoltExecutor.java:205)
 ~[storm-client-2.3.0.y.jar:2.3.0.y]
        at org.apache.storm.executor.Executor.accept(Executor.java:290) 
~[storm-client-2.3.0.y.jar:2.3.0.y]
        ... 6 more
Caused by: java.lang.IllegalArgumentException: Class is not registered: 
org.apache.storm.generated.NodeInfo
Note: To register this class use: 
kryo.register(org.apache.storm.generated.NodeInfo.class);
        at com.esotericsoftware.kryo.Kryo.getRegistration(Kryo.java:488) 
~[kryo-3.0.3.jar:?]
        at 
com.esotericsoftware.kryo.util.DefaultClassResolver.writeClass(DefaultClassResolver.java:97)
 ~[kryo-3.0.3.jar:?]
        at com.esotericsoftware.kryo.Kryo.writeClass(Kryo.java:517) 
~[kryo-3.0.3.jar:?]
        at com.esotericsoftware.kryo.Kryo.writeClassAndObject(Kryo.java:622) 
~[kryo-3.0.3.jar:?]
        at 
com.esotericsoftware.kryo.serializers.MapSerializer.write(MapSerializer.java:106)
 ~[kryo-3.0.3.jar:?]
        at 
com.esotericsoftware.kryo.serializers.MapSerializer.write(MapSerializer.java:39)
 ~[kryo-3.0.3.jar:?]
        at com.esotericsoftware.kryo.Kryo.writeObject(Kryo.java:552) 
~[kryo-3.0.3.jar:?]
        at 
com.esotericsoftware.kryo.serializers.ObjectField.write(ObjectField.java:80) 
~[kryo-3.0.3.jar:?]
        at 
com.esotericsoftware.kryo.serializers.FieldSerializer.write(FieldSerializer.java:518)
 ~[kryo-3.0.3.jar:?]
        at com.esotericsoftware.kryo.Kryo.writeClassAndObject(Kryo.java:628) 
~[kryo-3.0.3.jar:?]
        at 
com.esotericsoftware.kryo.serializers.CollectionSerializer.write(CollectionSerializer.java:100)
 ~[kryo-3.0.3.jar:?]
        at 
com.esotericsoftware.kryo.serializers.CollectionSerializer.write(CollectionSerializer.java:40)
 ~[kryo-3.0.3.jar:?]
        at com.esotericsoftware.kryo.Kryo.writeClassAndObject(Kryo.java:628) 
~[kryo-3.0.3.jar:?]
        at 
com.esotericsoftware.kryo.serializers.CollectionSerializer.write(CollectionSerializer.java:100)
 ~[kryo-3.0.3.jar:?]
        at 
com.esotericsoftware.kryo.serializers.CollectionSerializer.write(CollectionSerializer.java:40)
 ~[kryo-3.0.3.jar:?]
        at com.esotericsoftware.kryo.Kryo.writeObject(Kryo.java:534) 
~[kryo-3.0.3.jar:?]
        at 
org.apache.storm.serialization.KryoValuesSerializer.serializeInto(KryoValuesSerializer.java:38)
 ~[storm-client-2.3.0.y.jar:2.3.0.y]
        at 
org.apache.storm.serialization.KryoTupleSerializer.serialize(KryoTupleSerializer.java:40)
 ~[storm-client-2.3.0.y.jar:2.3.0.y]
        at 
org.apache.storm.daemon.worker.WorkerTransfer.tryTransferRemote(WorkerTransfer.java:118)
 ~[storm-client-2.3.0.y.jar:2.3.0.y]
        at 
org.apache.storm.daemon.worker.WorkerState.tryTransferRemote(WorkerState.java:553)
 ~[storm-client-2.3.0.y.jar:2.3.0.y]
        at 
org.apache.storm.executor.ExecutorTransfer.tryTransfer(ExecutorTransfer.java:68)
 ~[storm-client-2.3.0.y.jar:2.3.0.y]
        at org.apache.storm.daemon.Task.sendUnanchored(Task.java:215) 
~[storm-client-2.3.0.y.jar:2.3.0.y]
        at org.apache.storm.executor.Executor.metricsTick(Executor.java:345) 
~[storm-client-2.3.0.y.jar:2.3.0.y]
        at 
org.apache.storm.executor.bolt.BoltExecutor.tupleActionFn(BoltExecutor.java:205)
 ~[storm-client-2.3.0.y.jar:2.3.0.y]
        at org.apache.storm.executor.Executor.accept(Executor.java:290) 
~[storm-client-2.3.0.y.jar:2.3.0.y]
        ... 6 more
{code}

The related metric is "__send-iconnection" from 
https://github.com/apache/storm/blob/7bef73a6faa14558ef254efe74cbe4bfef81c2e2/storm-client/src/jvm/org/apache/storm/daemon/metrics/BuiltinMetricsUtil.java#L40-L43

Note that this can only be reproduced when metrics are sent across workers 
(otherwise there is no serialization).

The work around is one of the following
1) add org.apache.storm.generated.NodeInfo to topology.kryo.register in 
topology conf
2) set topology.fall.back.on.java.serialization true or unset 
topology.fall.back.on.java.serialization since the default is true


The fix is to register NodeInfo class in kryo.
https://github.com/apache/storm/blob/7bef73a6faa14558ef254efe74cbe4bfef81c2e2/storm-client/src/jvm/org/apache/storm/serialization/SerializationFactory.java#L67-L77




--
This message was sent by Atlassian Jira
(v8.3.4#803005)

Reply via email to