Hi all, I apologize for flooding the list with questions lately. I guess I’m having a rough week.
I thought my app was finally running fine after Damian’s help on Monday, but it turns out that it hasn’t been (successfully) consuming 2 of the topics it should be (out of 11 total). I’ve been trying to debug this all day but I’m out of ideas, hence this message.

Some background on my streams app:

* Kafka Streams 0.10.0.1, Java 8, JRuby 9.1.5.0
* currently running a single instance
* consuming from 11 topics with a total of 130 partitions
* num.stream.threads is 130
* constructing a single KafkaStreams and KStreamBuilder
* calling KStreamBuilder.stream once for each topic

That last is per this message from Damian on Monday:
https://lists.apache.org/thread.html/727ed4e6fba9bf350e500e0d3d1087f868337d7abccad1a38d06500f@%3Cusers.kafka.apache.org%3E

The app is successfully consuming from 9 of the topics, but for some mysterious reason it is _not_ consuming from 2 of the topics. Two specific topics, consistently.

I first noticed the problem when looking at my lag graph in Datadog, which is broken out by topic; I noticed that certain topics seemed missing. So I ran kafka-consumer-groups to get a closer look, and both topics show up in the list, with all their partitions, but their CURRENT-OFFSET value is “unknown” for every thread/consumer.

I’ve been trying to figure out what’s different about these 2 topics, but so far I’ve had no luck; I just can’t find any differences.

What I’ve tried so far:

* consuming from the topics with kafkacat → looks good
* scrutinizing my app’s logs for errors or warnings related to these topics → see nothing
* stopping the app, changing its config to consume from only these 2 topics → nothing, same result
* running a different instance of the app with different IDs, consuming from 1 of the problematic topics only → nothing, it just sits there

In that last case, I took a look at the threads with jconsole.
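(In case anyone wants to reproduce this check without jconsole: below is a minimal sketch of the same inspection done in-process with the JDK's ThreadMXBean. The class name StreamThreadDump and its dump method are made up for illustration, and filtering on "StreamThread" in the thread name is an assumption about how the threads are named in a given setup.)

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: not part of Kafka Streams, just a JDK-only thread inspector.
public class StreamThreadDump {

    // Returns one formatted entry (name, state, counters, stack frames)
    // for every live thread whose name contains nameFilter.
    public static List<String> dump(String nameFilter) {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        List<String> entries = new ArrayList<>();
        // false, false: skip lock monitor/synchronizer details to keep the dump cheap.
        for (ThreadInfo info : mx.dumpAllThreads(false, false)) {
            if (info == null || !info.getThreadName().contains(nameFilter)) {
                continue;
            }
            StringBuilder sb = new StringBuilder();
            sb.append(info.getThreadName())
              .append(" State: ").append(info.getThreadState())
              .append(" Total blocked: ").append(info.getBlockedCount())
              .append(" Total waited: ").append(info.getWaitedCount())
              .append('\n');
            for (StackTraceElement frame : info.getStackTrace()) {
                sb.append("  ").append(frame).append('\n');
            }
            entries.add(sb.toString());
        }
        return entries;
    }

    public static void main(String[] args) {
        // Assumed name fragment; adjust to whatever the stream threads are called.
        for (String entry : dump("StreamThread")) {
            System.out.print(entry);
        }
    }
}
```

Calling StreamThreadDump.dump("StreamThread") from inside the running app (for example on a timer) should give roughly the same state and stack information that jconsole displays per thread.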
All the StreamThreads are just sitting there with this status:

State: RUNNABLE
Total blocked: 23  Total waited: 0

Stack trace:
sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:269)
sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:93)
sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:86)
   - locked sun.nio.ch.Util$3@17e90d96
   - locked java.util.Collections$UnmodifiableSet@2a14b4cc
   - locked sun.nio.ch.EPollSelectorImpl@5ffc44a4
sun.nio.ch.SelectorImpl.select(SelectorImpl.java:97)
org.apache.kafka.common.network.Selector.select(Selector.java:454)
org.apache.kafka.common.network.Selector.poll(Selector.java:277)
org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:260)
org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.clientPoll(ConsumerNetworkClient.java:360)
...

At this point I’m at the end of my rope; I’m out of ideas. I would very much appreciate any suggestions for how to proceed. I’d be happy to supply log files, etc.

Thank you!
Avi

----
Software Architect @ Park Assist
We’re hiring! http://tech.parkassist.com/jobs/