devinbost edited a comment on issue #6332: (Flaky-test) Intermittent failure of ProxyParserTest.testRegexSubscription URL: https://github.com/apache/pulsar/issues/6332#issuecomment-590708692 At a closer look, that result only occurred after the first failure. During the first failure, the logs revealed something a little different. With the additional log messages, they looked like this: ``` 04:35:52.434 [pulsar-client-io-304-1:org.apache.pulsar.client.impl.MultiTopicsConsumerImpl@882] INFO org.apache.pulsar.client.impl.MultiTopicsConsumerImpl - [MultiTopicsConsumer-70113] [regex-sub-proxy-parser-test-1582605352335] Success subscribe new topic persistent://sample/test/local/topic2 in topics consumer, partitions: 0, allTopicPartitionsNumber: 2 04:35:52.436 [pulsar-client-io-304-1:org.apache.pulsar.client.impl.ConsumerImpl@632] INFO org.apache.pulsar.client.impl.ConsumerImpl - ConsumerImpl.connectionOpened(..)15 04:35:52.438 [pulsar-proxy-io-39-1:org.apache.pulsar.proxy.server.ProxyConnection@122] INFO org.apache.pulsar.proxy.server.ProxyConnection - ProxyConnection.channelRegistered(..)01 04:35:52.438 [pulsar-proxy-io-39-1:org.apache.pulsar.proxy.server.ProxyConnection@124] INFO org.apache.pulsar.proxy.server.ProxyConnection - ProxyConnection.channelRegistered(..)02 04:35:52.438 [pulsar-proxy-io-39-1:org.apache.pulsar.proxy.server.ProxyConnection@143] INFO org.apache.pulsar.proxy.server.ProxyConnection - ProxyConnection.channelActive(..)01 04:35:52.438 [pulsar-proxy-io-39-1:org.apache.pulsar.proxy.server.ProxyConnection@145] INFO org.apache.pulsar.proxy.server.ProxyConnection - ProxyConnection.channelActive(..)02 04:35:52.438 [pulsar-proxy-io-39-1:org.apache.pulsar.proxy.server.ProxyConnection@147] INFO org.apache.pulsar.proxy.server.ProxyConnection - [/10.1.0.4:40490] New connection opened 04:35:52.439 [pulsar-proxy-io-39-1:org.apache.pulsar.proxy.server.ProxyConnection@211] INFO org.apache.pulsar.proxy.server.ProxyConnection - [/10.1.0.4:40490] complete connection, init proxy handler. authenticated with none role null, hasProxyToBrokerUrl: false 04:35:52.440 [pulsar-client-io-304-1:org.apache.pulsar.client.impl.ConnectionPool@156] INFO org.apache.pulsar.client.impl.ConnectionPool - [[id: 0xc9ddb5f2, L:/10.1.0.4:40490 - R:fv-az60.ct13ep5w0hzutbi1ytkrtqhxsg.cx.internal.cloudapp.net/10.1.0.4:37359]] Connected to server 04:35:52.440 [pulsar-client-io-304-1:org.apache.pulsar.client.impl.ProducerImpl@1067] INFO org.apache.pulsar.client.impl.ProducerImpl - [persistent://sample/test/local/topic1] [null] Creating producer on cnx [id: 0xc9ddb5f2, L:/10.1.0.4:40490 - R:fv-az60.ct13ep5w0hzutbi1ytkrtqhxsg.cx.internal.cloudapp.net/10.1.0.4:37359] 04:35:52.440 [pulsar-proxy-io-39-1:io.netty.channel.DefaultChannelPipeline@1164] WARN io.netty.channel.DefaultChannelPipeline - An exceptionCaught() event was fired, and it reached at the tail of the pipeline. It usually means the last handler in the pipeline did not handle the exception. java.lang.UnsupportedOperationException: null ``` So, we reach this line: `ConsumerImpl.connectionOpened(..)15` which implies that `ConsumerImpl.connectionOpened(..)` was able to successfully run to completion. After more careful inspection, I also discovered that the UnsupportedOperationException actually alternates (non-deterministically) between the exception farther above (https://github.com/apache/pulsar/issues/6332#issuecomment-587319767) and this one: ``` 04:35:48.906 [pulsar-proxy-io-39-1:org.apache.pulsar.proxy.server.ProxyConnection@168] WARN org.apache.pulsar.proxy.server.ProxyConnection - [/10.1.0.4:40182] Got exception UnsupportedOperationException : null java.lang.UnsupportedOperationException: null at org.apache.pulsar.common.protocol.PulsarDecoder.handleProducer(PulsarDecoder.java:461) ~[pulsar-common-2.6.0-SNAPSHOT.jar:2.6.0-SNAPSHOT] at org.apache.pulsar.common.protocol.PulsarDecoder.channelRead(PulsarDecoder.java:189) ~[pulsar-common-2.6.0-SNAPSHOT.jar:2.6.0-SNAPSHOT] at org.apache.pulsar.proxy.server.ProxyConnection.channelRead(ProxyConnection.java:180) ~[classes/:?] at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:374) [netty-transport-4.1.43.Final.jar:4.1.43.Final] at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:360) [netty-transport-4.1.43.Final.jar:4.1.43.Final] at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:352) [netty-transport-4.1.43.Final.jar:4.1.43.Final] at io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:326) [netty-codec-4.1.43.Final.jar:4.1.43.Final] at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:300) [netty-codec-4.1.43.Final.jar:4.1.43.Final] at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:374) [netty-transport-4.1.43.Final.jar:4.1.43.Final] at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:360) [netty-transport-4.1.43.Final.jar:4.1.43.Final] at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:352) [netty-transport-4.1.43.Final.jar:4.1.43.Final] at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1422) [netty-transport-4.1.43.Final.jar:4.1.43.Final] at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:374) [netty-transport-4.1.43.Final.jar:4.1.43.Final] at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:360) [netty-transport-4.1.43.Final.jar:4.1.43.Final] at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:931) [netty-transport-4.1.43.Final.jar:4.1.43.Final] at io.netty.channel.epoll.AbstractEpollStreamChannel$EpollStreamUnsafe.epollInReady(AbstractEpollStreamChannel.java:792) [netty-transport-native-epoll-4.1.43.Final-linux-x86_64.jar:4.1.43.Final] at io.netty.channel.epoll.EpollEventLoop.processReady(EpollEventLoop.java:502) [netty-transport-native-epoll-4.1.43.Final-linux-x86_64.jar:4.1.43.Final] at io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:407) [netty-transport-native-epoll-4.1.43.Final-linux-x86_64.jar:4.1.43.Final] at io.netty.util.concurrent.SingleThreadEventExecutor$6.run(SingleThreadEventExecutor.java:1050) [netty-common-4.1.43.Final.jar:4.1.43.Final] at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74) [netty-common-4.1.43.Final.jar:4.1.43.Final] at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) [netty-common-4.1.43.Final.jar:4.1.43.Final] at java.lang.Thread.run(Thread.java:748) [?:1.8.0_242] ``` There is one key difference between the two methods that trigger the `java.lang.UnsupportedOperationException: null` result: 1. at org.apache.pulsar.common.protocol.PulsarDecoder.handleProducer(PulsarDecoder.java:461) ~[pulsar-common-2.6.0-SNAPSHOT.jar:2.6.0-SNAPSHOT] 2. at org.apache.pulsar.common.protocol.PulsarDecoder.handleSubscribe(PulsarDecoder.java:457) ~[pulsar-common-2.6.0-SNAPSHOT.jar:2.6.0-SNAPSHOT] Notice the first one is `handleProducer` and the second one is `handleSubscribe`. This may be an important clue to identifying the cause of this problem. Notably, ProxyConnection, ServerCnx, ClientCnx, ProxyClientCnx, and ServerConnection all extend PulsarHandler, which extends PulsarDecoder. So, perhaps we're trying to call methods on an object that could be one or the other of that type. Perhaps there's non-determinism in which of these objects we end up with at some point in the process.
---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [email protected] With regards, Apache Git Services
