[jira] [Commented] (KAFKA-7088) Kafka streams thread waits infinitely on transaction init

2018-06-26 Thread Lukasz Gluchowski (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-7088?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16523725#comment-16523725
 ] 

Lukasz Gluchowski commented on KAFKA-7088:
--

Sorry but DEBUG is still to verbose. It is impacting resources used by the 
application

> Kafka streams thread waits infinitely on transaction init
> -
>
> Key: KAFKA-7088
> URL: https://issues.apache.org/jira/browse/KAFKA-7088
> Project: Kafka
>  Issue Type: Bug
>  Components: clients
>Affects Versions: 1.0.1
> Environment: Linux 4.14.33-51.37.amzn1.x86_64 #1 SMP Thu May 3 
> 20:07:43 UTC 2018 
> kafka-streams (client) 1.0.1
> kafka broker 1.1.0
> Java version:
> OpenJDK Runtime Environment (build 1.8.0_171-b10)
> OpenJDK 64-Bit Server VM (build 25.171-b10, mixed mode)
> kakfa config overrides:
> num.stream.threads: 6
> session.timeout.ms: 1
> request.timeout.ms: 11000
> fetch.max.wait.ms: 500
> max.poll.records: 1000
> topic has 24 partitions
>Reporter: Lukasz Gluchowski
>Priority: Major
>  Labels: eos
>
> A kafka stream application thread stops processing without any feedback. The 
> topic has 24 partitions and I noticed that processing stopped only for some 
> partitions. I will describe what happened to partition:10. The application is 
> still running (now for about 8 hours) and that thread is hanging there and no 
> rebalancing that took place.
> There is no error (we have a custom `Thread.UncaughtExceptionHandler` which 
> was not called). I noticed that after couple of minutes stream stopped 
> processing (at offset 32606948 where log-end-offset is 33472402). 
> Broker itself is not reporting any active consumer in that consumer group and 
> the only info I was able to gather was from thread dump:
> {code:java}
> "mp_ads_publisher_pro_madstorage-web-corotos-prod-9db804ae-2a7a-431f-be09-392ab38cd8a2-StreamThread-33"
>  #113 prio=5 os_prio=0 tid=0x7fe07c6349b0 nid=0xf7a waiting on condition 
> [0x7fe0215d4000]
> java.lang.Thread.State: WAITING (parking)
> at sun.misc.Unsafe.park(Native Method)
> - parking to wait for <0xfec6a2f8> (a 
> java.util.concurrent.CountDownLatch$Sync)
> at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
> at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
> at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:997)
> at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1304)
> at java.util.concurrent.CountDownLatch.await(CountDownLatch.java:231)
> at 
> org.apache.kafka.clients.producer.internals.TransactionalRequestResult.await(TransactionalRequestResult.java:50)
> at 
> org.apache.kafka.clients.producer.KafkaProducer.initTransactions(KafkaProducer.java:554)
> at 
> org.apache.kafka.streams.processor.internals.StreamTask.(StreamTask.java:151)
> at 
> org.apache.kafka.streams.processor.internals.StreamThread$TaskCreator.createTask(StreamThread.java:404)
> at 
> org.apache.kafka.streams.processor.internals.StreamThread$TaskCreator.createTask(StreamThread.java:365)
> at 
> org.apache.kafka.streams.processor.internals.StreamThread$AbstractTaskCreator.createTasks(StreamThread.java:350)
> at 
> org.apache.kafka.streams.processor.internals.TaskManager.addStreamTasks(TaskManager.java:137)
> at 
> org.apache.kafka.streams.processor.internals.TaskManager.createTasks(TaskManager.java:88)
> at 
> org.apache.kafka.streams.processor.internals.StreamThread$RebalanceListener.onPartitionsAssigned(StreamThread.java:259)
> at 
> org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.onJoinComplete(ConsumerCoordinator.java:264)
> at 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.joinGroupIfNeeded(AbstractCoordinator.java:367)
> at 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.ensureActiveGroup(AbstractCoordinator.java:316)
> at 
> org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.poll(ConsumerCoordinator.java:295)
> at 
> org.apache.kafka.clients.consumer.KafkaConsumer.pollOnce(KafkaConsumer.java:1146)
> at 
> org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:)
> at 
> org.apache.kafka.streams.processor.internals.StreamThread.pollRequests(StreamThread.java:851)
> at 
> org.apache.kafka.streams.processor.internals.StreamThread.runOnce(StreamThread.java:808)
> at 
> org.apache.kafka.streams.processor.internals.StreamThread.runLoop(StreamThread.java:774)
> at 
> org.apache.kafka.streams.processor.internals.StreamThread.run(StreamThread.java:744){code}
>  
> I tried restarting application once but the situation repeated. Thread read 
> some data, committed offset and 

[jira] [Commented] (KAFKA-7088) Kafka streams thread waits infinitely on transaction init

2018-06-25 Thread Lukasz Gluchowski (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-7088?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16522004#comment-16522004
 ] 

Lukasz Gluchowski commented on KAFKA-7088:
--

[~yuzhih...@gmail.com] I turned TRACE logging on non production environment to 
try it out and it was filling the disk very quickly. Sorry but I can't turn it 
on on production.

> Kafka streams thread waits infinitely on transaction init
> -
>
> Key: KAFKA-7088
> URL: https://issues.apache.org/jira/browse/KAFKA-7088
> Project: Kafka
>  Issue Type: Bug
>  Components: clients
>Affects Versions: 1.0.1
> Environment: Linux 4.14.33-51.37.amzn1.x86_64 #1 SMP Thu May 3 
> 20:07:43 UTC 2018 
> kafka-streams (client) 1.0.1
> kafka broker 1.1.0
> Java version:
> OpenJDK Runtime Environment (build 1.8.0_171-b10)
> OpenJDK 64-Bit Server VM (build 25.171-b10, mixed mode)
> kakfa config overrides:
> num.stream.threads: 6
> session.timeout.ms: 1
> request.timeout.ms: 11000
> fetch.max.wait.ms: 500
> max.poll.records: 1000
> topic has 24 partitions
>Reporter: Lukasz Gluchowski
>Priority: Major
>  Labels: eos
>
> A kafka stream application thread stops processing without any feedback. The 
> topic has 24 partitions and I noticed that processing stopped only for some 
> partitions. I will describe what happened to partition:10. The application is 
> still running (now for about 8 hours) and that thread is hanging there and no 
> rebalancing that took place.
> There is no error (we have a custom `Thread.UncaughtExceptionHandler` which 
> was not called). I noticed that after couple of minutes stream stopped 
> processing (at offset 32606948 where log-end-offset is 33472402). 
> Broker itself is not reporting any active consumer in that consumer group and 
> the only info I was able to gather was from thread dump:
> {code:java}
> "mp_ads_publisher_pro_madstorage-web-corotos-prod-9db804ae-2a7a-431f-be09-392ab38cd8a2-StreamThread-33"
>  #113 prio=5 os_prio=0 tid=0x7fe07c6349b0 nid=0xf7a waiting on condition 
> [0x7fe0215d4000]
> java.lang.Thread.State: WAITING (parking)
> at sun.misc.Unsafe.park(Native Method)
> - parking to wait for <0xfec6a2f8> (a 
> java.util.concurrent.CountDownLatch$Sync)
> at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
> at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
> at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:997)
> at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1304)
> at java.util.concurrent.CountDownLatch.await(CountDownLatch.java:231)
> at 
> org.apache.kafka.clients.producer.internals.TransactionalRequestResult.await(TransactionalRequestResult.java:50)
> at 
> org.apache.kafka.clients.producer.KafkaProducer.initTransactions(KafkaProducer.java:554)
> at 
> org.apache.kafka.streams.processor.internals.StreamTask.(StreamTask.java:151)
> at 
> org.apache.kafka.streams.processor.internals.StreamThread$TaskCreator.createTask(StreamThread.java:404)
> at 
> org.apache.kafka.streams.processor.internals.StreamThread$TaskCreator.createTask(StreamThread.java:365)
> at 
> org.apache.kafka.streams.processor.internals.StreamThread$AbstractTaskCreator.createTasks(StreamThread.java:350)
> at 
> org.apache.kafka.streams.processor.internals.TaskManager.addStreamTasks(TaskManager.java:137)
> at 
> org.apache.kafka.streams.processor.internals.TaskManager.createTasks(TaskManager.java:88)
> at 
> org.apache.kafka.streams.processor.internals.StreamThread$RebalanceListener.onPartitionsAssigned(StreamThread.java:259)
> at 
> org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.onJoinComplete(ConsumerCoordinator.java:264)
> at 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.joinGroupIfNeeded(AbstractCoordinator.java:367)
> at 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.ensureActiveGroup(AbstractCoordinator.java:316)
> at 
> org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.poll(ConsumerCoordinator.java:295)
> at 
> org.apache.kafka.clients.consumer.KafkaConsumer.pollOnce(KafkaConsumer.java:1146)
> at 
> org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:)
> at 
> org.apache.kafka.streams.processor.internals.StreamThread.pollRequests(StreamThread.java:851)
> at 
> org.apache.kafka.streams.processor.internals.StreamThread.runOnce(StreamThread.java:808)
> at 
> org.apache.kafka.streams.processor.internals.StreamThread.runLoop(StreamThread.java:774)
> at 
> org.apache.kafka.streams.processor.internals.StreamThread.run(StreamThread.java:744){code}
>  
> I tried restarting 

[jira] [Comment Edited] (KAFKA-7088) Kafka streams thread waits infinitely on transaction init

2018-06-22 Thread Lukasz Gluchowski (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-7088?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16520396#comment-16520396
 ] 

Lukasz Gluchowski edited comment on KAFKA-7088 at 6/22/18 2:55 PM:
---

[~yuzhih...@gmail.com] It was set to EXACTLY_ONCE. I switched to "at least 
once" and the problem went away.


was (Author: lgluchowski):
I switched from "exactly once" to "at least once" and the problem went away.

> Kafka streams thread waits infinitely on transaction init
> -
>
> Key: KAFKA-7088
> URL: https://issues.apache.org/jira/browse/KAFKA-7088
> Project: Kafka
>  Issue Type: Bug
>  Components: clients
>Affects Versions: 1.0.1
> Environment: Linux 4.14.33-51.37.amzn1.x86_64 #1 SMP Thu May 3 
> 20:07:43 UTC 2018 
> kafka-streams (client) 1.0.1
> kafka broker 1.1.0
> Java version:
> OpenJDK Runtime Environment (build 1.8.0_171-b10)
> OpenJDK 64-Bit Server VM (build 25.171-b10, mixed mode)
> kakfa config overrides:
> num.stream.threads: 6
> session.timeout.ms: 1
> request.timeout.ms: 11000
> fetch.max.wait.ms: 500
> max.poll.records: 1000
> topic has 24 partitions
>Reporter: Lukasz Gluchowski
>Priority: Major
>  Labels: eos
>
> A kafka stream application thread stops processing without any feedback. The 
> topic has 24 partitions and I noticed that processing stopped only for some 
> partitions. I will describe what happened to partition:10. The application is 
> still running (now for about 8 hours) and that thread is hanging there and no 
> rebalancing that took place.
> There is no error (we have a custom `Thread.UncaughtExceptionHandler` which 
> was not called). I noticed that after couple of minutes stream stopped 
> processing (at offset 32606948 where log-end-offset is 33472402). 
> Broker itself is not reporting any active consumer in that consumer group and 
> the only info I was able to gather was from thread dump:
> {code:java}
> "mp_ads_publisher_pro_madstorage-web-corotos-prod-9db804ae-2a7a-431f-be09-392ab38cd8a2-StreamThread-33"
>  #113 prio=5 os_prio=0 tid=0x7fe07c6349b0 nid=0xf7a waiting on condition 
> [0x7fe0215d4000]
> java.lang.Thread.State: WAITING (parking)
> at sun.misc.Unsafe.park(Native Method)
> - parking to wait for <0xfec6a2f8> (a 
> java.util.concurrent.CountDownLatch$Sync)
> at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
> at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
> at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:997)
> at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1304)
> at java.util.concurrent.CountDownLatch.await(CountDownLatch.java:231)
> at 
> org.apache.kafka.clients.producer.internals.TransactionalRequestResult.await(TransactionalRequestResult.java:50)
> at 
> org.apache.kafka.clients.producer.KafkaProducer.initTransactions(KafkaProducer.java:554)
> at 
> org.apache.kafka.streams.processor.internals.StreamTask.(StreamTask.java:151)
> at 
> org.apache.kafka.streams.processor.internals.StreamThread$TaskCreator.createTask(StreamThread.java:404)
> at 
> org.apache.kafka.streams.processor.internals.StreamThread$TaskCreator.createTask(StreamThread.java:365)
> at 
> org.apache.kafka.streams.processor.internals.StreamThread$AbstractTaskCreator.createTasks(StreamThread.java:350)
> at 
> org.apache.kafka.streams.processor.internals.TaskManager.addStreamTasks(TaskManager.java:137)
> at 
> org.apache.kafka.streams.processor.internals.TaskManager.createTasks(TaskManager.java:88)
> at 
> org.apache.kafka.streams.processor.internals.StreamThread$RebalanceListener.onPartitionsAssigned(StreamThread.java:259)
> at 
> org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.onJoinComplete(ConsumerCoordinator.java:264)
> at 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.joinGroupIfNeeded(AbstractCoordinator.java:367)
> at 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.ensureActiveGroup(AbstractCoordinator.java:316)
> at 
> org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.poll(ConsumerCoordinator.java:295)
> at 
> org.apache.kafka.clients.consumer.KafkaConsumer.pollOnce(KafkaConsumer.java:1146)
> at 
> org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:)
> at 
> org.apache.kafka.streams.processor.internals.StreamThread.pollRequests(StreamThread.java:851)
> at 
> org.apache.kafka.streams.processor.internals.StreamThread.runOnce(StreamThread.java:808)
> at 
> org.apache.kafka.streams.processor.internals.StreamThread.runLoop(StreamThread.java:774)
> at 
> 

[jira] [Commented] (KAFKA-7088) Kafka streams thread waits infinitely on transaction init

2018-06-22 Thread Lukasz Gluchowski (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-7088?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16520396#comment-16520396
 ] 

Lukasz Gluchowski commented on KAFKA-7088:
--

I switched from "exactly once" to "at least once" and the problem went away.

> Kafka streams thread waits infinitely on transaction init
> -
>
> Key: KAFKA-7088
> URL: https://issues.apache.org/jira/browse/KAFKA-7088
> Project: Kafka
>  Issue Type: Bug
>  Components: clients
>Affects Versions: 1.0.1
> Environment: Linux 4.14.33-51.37.amzn1.x86_64 #1 SMP Thu May 3 
> 20:07:43 UTC 2018 
> kafka-streams (client) 1.0.1
> kafka broker 1.1.0
> Java version:
> OpenJDK Runtime Environment (build 1.8.0_171-b10)
> OpenJDK 64-Bit Server VM (build 25.171-b10, mixed mode)
> kakfa config overrides:
> num.stream.threads: 6
> session.timeout.ms: 1
> request.timeout.ms: 11000
> fetch.max.wait.ms: 500
> max.poll.records: 1000
> topic has 24 partitions
>Reporter: Lukasz Gluchowski
>Priority: Major
>  Labels: eos
>
> A kafka stream application thread stops processing without any feedback. The 
> topic has 24 partitions and I noticed that processing stopped only for some 
> partitions. I will describe what happened to partition:10. The application is 
> still running (now for about 8 hours) and that thread is hanging there and no 
> rebalancing that took place.
> There is no error (we have a custom `Thread.UncaughtExceptionHandler` which 
> was not called). I noticed that after couple of minutes stream stopped 
> processing (at offset 32606948 where log-end-offset is 33472402). 
> Broker itself is not reporting any active consumer in that consumer group and 
> the only info I was able to gather was from thread dump:
> {code:java}
> "mp_ads_publisher_pro_madstorage-web-corotos-prod-9db804ae-2a7a-431f-be09-392ab38cd8a2-StreamThread-33"
>  #113 prio=5 os_prio=0 tid=0x7fe07c6349b0 nid=0xf7a waiting on condition 
> [0x7fe0215d4000]
> java.lang.Thread.State: WAITING (parking)
> at sun.misc.Unsafe.park(Native Method)
> - parking to wait for <0xfec6a2f8> (a 
> java.util.concurrent.CountDownLatch$Sync)
> at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
> at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
> at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:997)
> at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1304)
> at java.util.concurrent.CountDownLatch.await(CountDownLatch.java:231)
> at 
> org.apache.kafka.clients.producer.internals.TransactionalRequestResult.await(TransactionalRequestResult.java:50)
> at 
> org.apache.kafka.clients.producer.KafkaProducer.initTransactions(KafkaProducer.java:554)
> at 
> org.apache.kafka.streams.processor.internals.StreamTask.(StreamTask.java:151)
> at 
> org.apache.kafka.streams.processor.internals.StreamThread$TaskCreator.createTask(StreamThread.java:404)
> at 
> org.apache.kafka.streams.processor.internals.StreamThread$TaskCreator.createTask(StreamThread.java:365)
> at 
> org.apache.kafka.streams.processor.internals.StreamThread$AbstractTaskCreator.createTasks(StreamThread.java:350)
> at 
> org.apache.kafka.streams.processor.internals.TaskManager.addStreamTasks(TaskManager.java:137)
> at 
> org.apache.kafka.streams.processor.internals.TaskManager.createTasks(TaskManager.java:88)
> at 
> org.apache.kafka.streams.processor.internals.StreamThread$RebalanceListener.onPartitionsAssigned(StreamThread.java:259)
> at 
> org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.onJoinComplete(ConsumerCoordinator.java:264)
> at 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.joinGroupIfNeeded(AbstractCoordinator.java:367)
> at 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.ensureActiveGroup(AbstractCoordinator.java:316)
> at 
> org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.poll(ConsumerCoordinator.java:295)
> at 
> org.apache.kafka.clients.consumer.KafkaConsumer.pollOnce(KafkaConsumer.java:1146)
> at 
> org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:)
> at 
> org.apache.kafka.streams.processor.internals.StreamThread.pollRequests(StreamThread.java:851)
> at 
> org.apache.kafka.streams.processor.internals.StreamThread.runOnce(StreamThread.java:808)
> at 
> org.apache.kafka.streams.processor.internals.StreamThread.runLoop(StreamThread.java:774)
> at 
> org.apache.kafka.streams.processor.internals.StreamThread.run(StreamThread.java:744){code}
>  
> I tried restarting application once but the situation repeated. Thread read 
> some data, committed offset and stopped 

[jira] [Commented] (KAFKA-7088) Kafka streams thread waits infinitely on transaction init

2018-06-22 Thread Lukasz Gluchowski (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-7088?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16520069#comment-16520069
 ] 

Lukasz Gluchowski commented on KAFKA-7088:
--

[~guozhang] thank you for your response. The producer is created automatically 
by kafka streams (that's why I categorized it as stream issue).

I didn't override {{ProducerConfig.MAX_BLOCK_MS_CONFIG}}, property so we rely 
on default value (FYI I put all config properties that we explicitly set in the 
"Environment" section of this ticket).

Broker itself is healthy and we didn't experienced any network issues when this 
error happened, we are running multiple kafka streams applications and vast 
majority of the clients have zero lag. If this helps logs from the broker said 
that the consumer group is empty.

Note that this situation is repeatable. I restarted the application and the 
same situation occurred after ~15 minutes.
{code:java}
kgroups --describe --group mp_ads_publisher_pro_madstorage-web-corotos-prod
Note: This will not show information about old Zookeeper-based consumers.
Consumer group 'mp_ads_publisher_pro_madstorage-web-corotos-prod' has no active 
members.
TOPIC PARTITION CURRENT-OFFSET LOG-END-OFFSET LAG CONSUMER-ID HOST CLIENT-ID
madstorage-web-corotos-prod 10 32606948 33472402 865454 - - -
. (other 23 partitions){code}
 

Let me know if you need more info,

Thanks

> Kafka streams thread waits infinitely on transaction init
> -
>
> Key: KAFKA-7088
> URL: https://issues.apache.org/jira/browse/KAFKA-7088
> Project: Kafka
>  Issue Type: Bug
>  Components: clients
>Affects Versions: 1.0.1
> Environment: Linux 4.14.33-51.37.amzn1.x86_64 #1 SMP Thu May 3 
> 20:07:43 UTC 2018 
> kafka-streams (client) 1.0.1
> kafka broker 1.1.0
> Java version:
> OpenJDK Runtime Environment (build 1.8.0_171-b10)
> OpenJDK 64-Bit Server VM (build 25.171-b10, mixed mode)
> kakfa config overrides:
> num.stream.threads: 6
> session.timeout.ms: 1
> request.timeout.ms: 11000
> fetch.max.wait.ms: 500
> max.poll.records: 1000
> topic has 24 partitions
>Reporter: Lukasz Gluchowski
>Priority: Major
>  Labels: eos
>
> A kafka stream application thread stops processing without any feedback. The 
> topic has 24 partitions and I noticed that processing stopped only for some 
> partitions. I will describe what happened to partition:10. The application is 
> still running (now for about 8 hours) and that thread is hanging there and no 
> rebalancing that took place.
> There is no error (we have a custom `Thread.UncaughtExceptionHandler` which 
> was not called). I noticed that after couple of minutes stream stopped 
> processing (at offset 32606948 where log-end-offset is 33472402). 
> Broker itself is not reporting any active consumer in that consumer group and 
> the only info I was able to gather was from thread dump:
> {code:java}
> "mp_ads_publisher_pro_madstorage-web-corotos-prod-9db804ae-2a7a-431f-be09-392ab38cd8a2-StreamThread-33"
>  #113 prio=5 os_prio=0 tid=0x7fe07c6349b0 nid=0xf7a waiting on condition 
> [0x7fe0215d4000]
> java.lang.Thread.State: WAITING (parking)
> at sun.misc.Unsafe.park(Native Method)
> - parking to wait for <0xfec6a2f8> (a 
> java.util.concurrent.CountDownLatch$Sync)
> at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
> at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
> at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:997)
> at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1304)
> at java.util.concurrent.CountDownLatch.await(CountDownLatch.java:231)
> at 
> org.apache.kafka.clients.producer.internals.TransactionalRequestResult.await(TransactionalRequestResult.java:50)
> at 
> org.apache.kafka.clients.producer.KafkaProducer.initTransactions(KafkaProducer.java:554)
> at 
> org.apache.kafka.streams.processor.internals.StreamTask.(StreamTask.java:151)
> at 
> org.apache.kafka.streams.processor.internals.StreamThread$TaskCreator.createTask(StreamThread.java:404)
> at 
> org.apache.kafka.streams.processor.internals.StreamThread$TaskCreator.createTask(StreamThread.java:365)
> at 
> org.apache.kafka.streams.processor.internals.StreamThread$AbstractTaskCreator.createTasks(StreamThread.java:350)
> at 
> org.apache.kafka.streams.processor.internals.TaskManager.addStreamTasks(TaskManager.java:137)
> at 
> org.apache.kafka.streams.processor.internals.TaskManager.createTasks(TaskManager.java:88)
> at 
> org.apache.kafka.streams.processor.internals.StreamThread$RebalanceListener.onPartitionsAssigned(StreamThread.java:259)
> at 
> 

[jira] [Updated] (KAFKA-7088) Kafka streams thread waits infinitely on transaction init

2018-06-21 Thread Lukasz Gluchowski (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-7088?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lukasz Gluchowski updated KAFKA-7088:
-
Description: 
A kafka stream application thread stops processing without any feedback. The 
topic has 24 partitions and I noticed that processing stopped only for some 
partitions. I will describe what happened to partition:10. The application is 
still running (now for about 8 hours) and that thread is hanging there and no 
rebalancing that took place.

There is no error (we have a custom `Thread.UncaughtExceptionHandler` which was 
not called). I noticed that after couple of minutes stream stopped processing 
(at offset 32606948 where log-end-offset is 33472402). 

Broker itself is not reporting any active consumer in that consumer group and 
the only info I was able to gather was from thread dump:
{code:java}
"mp_ads_publisher_pro_madstorage-web-corotos-prod-9db804ae-2a7a-431f-be09-392ab38cd8a2-StreamThread-33"
 #113 prio=5 os_prio=0 tid=0x7fe07c6349b0 nid=0xf7a waiting on condition 
[0x7fe0215d4000]
java.lang.Thread.State: WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <0xfec6a2f8> (a 
java.util.concurrent.CountDownLatch$Sync)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
at 
java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
at 
java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:997)
at 
java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1304)
at java.util.concurrent.CountDownLatch.await(CountDownLatch.java:231)
at 
org.apache.kafka.clients.producer.internals.TransactionalRequestResult.await(TransactionalRequestResult.java:50)
at 
org.apache.kafka.clients.producer.KafkaProducer.initTransactions(KafkaProducer.java:554)
at 
org.apache.kafka.streams.processor.internals.StreamTask.(StreamTask.java:151)
at 
org.apache.kafka.streams.processor.internals.StreamThread$TaskCreator.createTask(StreamThread.java:404)
at 
org.apache.kafka.streams.processor.internals.StreamThread$TaskCreator.createTask(StreamThread.java:365)
at 
org.apache.kafka.streams.processor.internals.StreamThread$AbstractTaskCreator.createTasks(StreamThread.java:350)
at 
org.apache.kafka.streams.processor.internals.TaskManager.addStreamTasks(TaskManager.java:137)
at 
org.apache.kafka.streams.processor.internals.TaskManager.createTasks(TaskManager.java:88)
at 
org.apache.kafka.streams.processor.internals.StreamThread$RebalanceListener.onPartitionsAssigned(StreamThread.java:259)
at 
org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.onJoinComplete(ConsumerCoordinator.java:264)
at 
org.apache.kafka.clients.consumer.internals.AbstractCoordinator.joinGroupIfNeeded(AbstractCoordinator.java:367)
at 
org.apache.kafka.clients.consumer.internals.AbstractCoordinator.ensureActiveGroup(AbstractCoordinator.java:316)
at 
org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.poll(ConsumerCoordinator.java:295)
at 
org.apache.kafka.clients.consumer.KafkaConsumer.pollOnce(KafkaConsumer.java:1146)
at org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:)
at 
org.apache.kafka.streams.processor.internals.StreamThread.pollRequests(StreamThread.java:851)
at 
org.apache.kafka.streams.processor.internals.StreamThread.runOnce(StreamThread.java:808)
at 
org.apache.kafka.streams.processor.internals.StreamThread.runLoop(StreamThread.java:774)
at 
org.apache.kafka.streams.processor.internals.StreamThread.run(StreamThread.java:744){code}
 

I tried restarting application once but the situation repeated. Thread read 
some data, committed offset and stopped processing, leaving that thread in wait 
state.

FYI: we have EOS enabled

  was:
A kafka stream application thread stops processing without any feedback. The 
topic has 24 partitions and I noticed that processing stopped only for some 
partitions. I will describe what happened to partition:10. The application is 
still running (now for about 8 hours) and that thread is hanging there and no 
rebalancing that took place.

There is no error (we have a custom `Thread.UncaughtExceptionHandler` which was 
not called). I noticed that after couple of minutes stream stopped processing 
(at offset 32606948 where log-end-offset is 33472402). 

Broker itself is not reporting any active consumer in that consumer group and 
the only info I was able to gather was from thread dump:
{code:java}
"mp_ads_publisher_pro_madstorage-web-corotos-prod-9db804ae-2a7a-431f-be09-392ab38cd8a2-StreamThread-33"
 #113 prio=5 os_prio=0 tid=0x7fe07c6349b0 nid=0xf7a waiting on condition 
[0x7fe0215d4000]
java.lang.Thread.State: WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <0xfec6a2f8> (a 

[jira] [Created] (KAFKA-7088) Kafka streams thread waits infinitely on transaction init

2018-06-21 Thread Lukasz Gluchowski (JIRA)
Lukasz Gluchowski created KAFKA-7088:


 Summary: Kafka streams thread waits infinitely on transaction init
 Key: KAFKA-7088
 URL: https://issues.apache.org/jira/browse/KAFKA-7088
 Project: Kafka
  Issue Type: Bug
  Components: streams
Affects Versions: 1.0.1
 Environment: Linux 4.14.33-51.37.amzn1.x86_64 #1 SMP Thu May 3 
20:07:43 UTC 2018 
kafka-streams (client) 1.0.1
kafka broker 1.1.0
Java version:
OpenJDK Runtime Environment (build 1.8.0_171-b10)
OpenJDK 64-Bit Server VM (build 25.171-b10, mixed mode)

kakfa config overrides:
num.stream.threads: 6
session.timeout.ms: 1
request.timeout.ms: 11000
fetch.max.wait.ms: 500
max.poll.records: 1000

topic has 24 partitions
Reporter: Lukasz Gluchowski


A kafka stream application thread stops processing without any feedback. The 
topic has 24 partitions and I noticed that processing stopped only for some 
partitions. I will describe what happened to partition:10. The application is 
still running (now for about 8 hours) and that thread is hanging there and no 
rebalancing that took place.

There is no error (we have a custom `Thread.UncaughtExceptionHandler` which was 
not called). I noticed that after couple of minutes stream stopped processing 
(at offset 32606948 where log-end-offset is 33472402). 

Broker itself is not reporting any active consumer in that consumer group and 
the only info I was able to gather was from thread dump:
{code:java}
"mp_ads_publisher_pro_madstorage-web-corotos-prod-9db804ae-2a7a-431f-be09-392ab38cd8a2-StreamThread-33"
 #113 prio=5 os_prio=0 tid=0x7fe07c6349b0 nid=0xf7a waiting on condition 
[0x7fe0215d4000]
java.lang.Thread.State: WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <0xfec6a2f8> (a 
java.util.concurrent.CountDownLatch$Sync)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
at 
java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
at 
java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:997)
at 
java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1304)
at java.util.concurrent.CountDownLatch.await(CountDownLatch.java:231)
at 
org.apache.kafka.clients.producer.internals.TransactionalRequestResult.await(TransactionalRequestResult.java:50)
at 
org.apache.kafka.clients.producer.KafkaProducer.initTransactions(KafkaProducer.java:554)
at 
org.apache.kafka.streams.processor.internals.StreamTask.(StreamTask.java:151)
at 
org.apache.kafka.streams.processor.internals.StreamThread$TaskCreator.createTask(StreamThread.java:404)
at 
org.apache.kafka.streams.processor.internals.StreamThread$TaskCreator.createTask(StreamThread.java:365)
at 
org.apache.kafka.streams.processor.internals.StreamThread$AbstractTaskCreator.createTasks(StreamThread.java:350)
at 
org.apache.kafka.streams.processor.internals.TaskManager.addStreamTasks(TaskManager.java:137)
at 
org.apache.kafka.streams.processor.internals.TaskManager.createTasks(TaskManager.java:88)
at 
org.apache.kafka.streams.processor.internals.StreamThread$RebalanceListener.onPartitionsAssigned(StreamThread.java:259)
at 
org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.onJoinComplete(ConsumerCoordinator.java:264)
at 
org.apache.kafka.clients.consumer.internals.AbstractCoordinator.joinGroupIfNeeded(AbstractCoordinator.java:367)
at 
org.apache.kafka.clients.consumer.internals.AbstractCoordinator.ensureActiveGroup(AbstractCoordinator.java:316)
at 
org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.poll(ConsumerCoordinator.java:295)
at 
org.apache.kafka.clients.consumer.KafkaConsumer.pollOnce(KafkaConsumer.java:1146)
at org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:)
at 
org.apache.kafka.streams.processor.internals.StreamThread.pollRequests(StreamThread.java:851)
at 
org.apache.kafka.streams.processor.internals.StreamThread.runOnce(StreamThread.java:808)
at 
org.apache.kafka.streams.processor.internals.StreamThread.runLoop(StreamThread.java:774)
at 
org.apache.kafka.streams.processor.internals.StreamThread.run(StreamThread.java:744){code}
 

I tried restarting application once but the situation repeated. Thread read 
some data, committed offset and stopped processing, leaving that thread in wait 
state.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)