Scott Kidder created FLINK-5946:
-----------------------------------
Summary: Kinesis Producer uses KPL that orphans threads that
consume 100% CPU
Key: FLINK-5946
URL: https://issues.apache.org/jira/browse/FLINK-5946
Project: Flink
Issue Type: Bug
Components: Kinesis Connector
Affects Versions: 1.2.0
Reporter: Scott Kidder
The Amazon Kinesis Producer Library (KPL) can leave orphaned threads running
after the producer has been instructed to shut down via the `destroy()` method.
These threads run in a tight infinite loop that can push CPU usage to 100%.
I've seen this happen on several occasions, though it does not happen every
time. Once these threads are orphaned, the only way to bring CPU utilization
back down is to restart the Flink Task Manager.
When a KPL producer is instantiated, it creates several threads: one to execute
and monitor the native sender process, and two to monitor that process's stdout
and stderr output. The process-monitor thread can stop in a way that leaves the
two output-monitor threads orphaned.
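
To illustrate how an output-monitor thread can end up spinning, here is a minimal sketch. It is not taken from the KPL source: the class `OutputMonitorSketch`, the nested `OutputMonitor`, and the helper `monitorTerminates` are hypothetical names, and the loop structure is an assumption about one way such a bug can arise. The point it shows is that a reader loop guarded only by a "running" flag must also exit at end of stream, otherwise it depends entirely on another thread clearing the flag.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.atomic.AtomicBoolean;

final class OutputMonitorSketch {

    /** Hypothetical stand-in for a thread that watches a child process's stdout or stderr. */
    static final class OutputMonitor implements Runnable {
        private final BufferedReader reader;
        private final AtomicBoolean running = new AtomicBoolean(true);

        OutputMonitor(InputStream in) {
            this.reader = new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8));
        }

        @Override
        public void run() {
            try {
                while (running.get()) {
                    String line = reader.readLine();
                    if (line == null) {
                        // End of stream: the child process exited or closed the pipe.
                        // Without this break, readLine() keeps returning null at once
                        // and the loop spins until some other thread calls shutdown().
                        // If that other thread has already died, nothing ever does.
                        break;
                    }
                    // A real monitor would log or forward the line here.
                }
            } catch (IOException e) {
                // The stream was closed underneath us; treat it as a shutdown.
            }
        }

        /** Asks the loop to stop before its next read. */
        void shutdown() {
            running.set(false);
        }
    }

    /** Runs a monitor over the given bytes and reports whether its thread exited in time. */
    static boolean monitorTerminates(byte[] output, long timeoutMillis) {
        OutputMonitor monitor = new OutputMonitor(new ByteArrayInputStream(output));
        Thread thread = new Thread(monitor, "output-monitor-sketch");
        thread.setDaemon(true); // never keep the JVM alive because of the sketch
        thread.start();
        try {
            thread.join(timeoutMillis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        return !thread.isAlive();
    }

    public static void main(String[] args) {
        byte[] fakeOutput = "line one\nline two\n".getBytes(StandardCharsets.UTF_8);
        System.out.println("terminated: " + monitorTerminates(fakeOutput, 2000));
    }
}
```

In this sketch the monitor exits on its own once the stream ends, so it does not rely on the process-monitor thread surviving long enough to call `shutdown()`.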
I've submitted a GitHub issue and pull request against the KPL project:
https://github.com/awslabs/amazon-kinesis-producer/issues/93
https://github.com/awslabs/amazon-kinesis-producer/pull/94
This issue is rooted in the Amazon Kinesis Producer Library (KPL) that the
Flink Kinesis streaming connector depends upon. It ought to be fixed in the
KPL, but I want to document it on the Flink project. The Flink KPL dependency
should be updated once the KPL has been fixed.