Hi all,

I apologize for flooding the list with questions lately. I guess I’m having a 
rough week.

I thought my app was finally running fine after Damian’s help on Monday, but it 
turns out that it hasn’t been consuming 2 of the 11 topics it should be 
consuming.

I’ve been trying to debug this all day but I’m out of ideas, hence this message.

Some background on my streams app:

* Kafka Streams 0.10.0.1, Java 8, JRuby 9.1.5.0
* currently running a single instance
* consuming from 11 topics with a total of 130 partitions
* num.stream.threads is 130
* constructing a single KafkaStreams and KStreamBuilder
* calling KStreamBuilder.stream once for each topic

That last point follows this message from Damian on Monday:

https://lists.apache.org/thread.html/727ed4e6fba9bf350e500e0d3d1087f868337d7abccad1a38d06500f@%3Cusers.kafka.apache.org%3E

The app is successfully consuming from 9 of the topics, but for some mysterious 
reason it is _not_ consuming from 2 of the topics. Two specific topics, 
consistently.

I first noticed the problem when looking at my lag graph in Datadog, which is 
broken out by topic: certain topics seemed to be missing.

So I ran kafka-consumer-groups to get a closer look, and both topics show up in 
the list, with all their partitions, but their CURRENT-OFFSET value is 
“unknown” for every thread/consumer.
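
For anyone who wants to run the same check on their own group, here is a minimal sketch of how the "unknown" rows could be picked out of that output. It assumes the tool prints whitespace-separated columns under a header row that contains TOPIC and CURRENT-OFFSET, which is what I see here; the class and method names are made up for illustration.

```java
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper, not part of Kafka: scans kafka-consumer-groups output
// and reports topics with at least one "unknown" CURRENT-OFFSET.
final class UnknownOffsets {

    static Set<String> topicsWithUnknownOffsets(List<String> lines) {
        Set<String> topics = new TreeSet<>();
        int topicCol = -1;
        int offsetCol = -1;
        for (String line : lines) {
            String[] cols = line.trim().split("\\s+");
            if (topicCol < 0) {
                // Still looking for the header row; remember the column positions.
                List<String> header = Arrays.asList(cols);
                if (header.contains("TOPIC") && header.contains("CURRENT-OFFSET")) {
                    topicCol = header.indexOf("TOPIC");
                    offsetCol = header.indexOf("CURRENT-OFFSET");
                }
                continue;
            }
            // Data row: record the topic if its committed offset is unknown.
            if (cols.length > Math.max(topicCol, offsetCol)
                    && cols[offsetCol].equalsIgnoreCase("unknown")) {
                topics.add(cols[topicCol]);
            }
        }
        return topics;
    }

    public static void main(String[] args) throws Exception {
        // Reads the tool's output from stdin.
        List<String> lines = new ArrayList<>();
        try (BufferedReader in = new BufferedReader(new InputStreamReader(System.in))) {
            String line;
            while ((line = in.readLine()) != null) {
                lines.add(line);
            }
        }
        System.out.println(topicsWithUnknownOffsets(lines));
    }
}
```

Piping the kafka-consumer-groups output into this prints the set of affected topics. In my case that would be exactly the 2 problematic ones.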

I’ve been trying to figure out what’s different about these 2 topics, but so 
far I’ve had no luck; I just can’t find any differences.

What I’ve tried so far:

* consuming from the topics with kafkacat → looks good

* scrutinizing my app’s logs for errors or warnings related to these topics → 
see nothing

* stopping the app, changing its config to consume from only these 2 topics → 
nothing, same result

* running a different instance of the app with different IDs, consuming from 1 
of the problematic topics only → nothing, it just sits there

In that last case, I took a look at the threads with jconsole. All the 
StreamThreads are just sitting there with this status:

State: RUNNABLE
Total blocked: 23  Total waited: 0

Stack trace: 
sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:269)
sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:93)
sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:86)
   - locked sun.nio.ch.Util$3@17e90d96
   - locked java.util.Collections$UnmodifiableSet@2a14b4cc
   - locked sun.nio.ch.EPollSelectorImpl@5ffc44a4
sun.nio.ch.SelectorImpl.select(SelectorImpl.java:97)
org.apache.kafka.common.network.Selector.select(Selector.java:454)
org.apache.kafka.common.network.Selector.poll(Selector.java:277)
org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:260)
org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.clientPoll(ConsumerNetworkClient.java:360)
...
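
If it helps to capture the same information without jconsole, something like the sketch below could be called from inside the app's JVM (it only sees threads in its own process). It uses only the standard Thread API. The class name is made up, and it assumes the stream threads have "StreamThread" in their names.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper: summarizes the state and top stack frame of every
// live thread in this JVM whose name contains the given fragment.
final class StreamThreadStates {

    static Map<String, String> summarize(String nameFragment) {
        Map<String, String> summary = new TreeMap<>();
        for (Map.Entry<Thread, StackTraceElement[]> e : Thread.getAllStackTraces().entrySet()) {
            Thread t = e.getKey();
            if (!t.getName().contains(nameFragment)) {
                continue;
            }
            StackTraceElement[] stack = e.getValue();
            String top = stack.length > 0 ? stack[0].toString() : "(no frames)";
            summary.put(t.getName(), t.getState() + " @ " + top);
        }
        return summary;
    }

    public static void main(String[] args) {
        // Prints one line per matching thread, e.g. its state and where it is parked.
        summarize("StreamThread").forEach((name, s) -> System.out.println(name + ": " + s));
    }
}
```

Calling it a few seconds apart would show whether the threads ever leave the selector poll.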

At this point I’m at the end of my rope and out of ideas. I would very much 
appreciate any suggestions for how to proceed. I’d be happy to supply log 
files, etc.

Thank you!
Avi


————
Software Architect @ Park Assist
We’re hiring! http://tech.parkassist.com/jobs/
