Github user lvdongr commented on the issue:
https://github.com/apache/spark/pull/17203
In our case, we deployed a streaming application whose data sources are 20
topics with 30 partitions each in a Kafka cluster (3 brokers). The number of
connections to Kafka was very large, up to a thousand, and the consumers would
sometimes fail to fetch messages from Kafka, which could cause some jobs to
fail. When we replaced the cached consumers with uncached ones, the number of
connections decreased and no jobs failed. We are still not sure whether the
large number of connections to Kafka caused the job failures, but based on our
testing we want to use uncached consumers so that we can keep our streaming
jobs running successfully. So we think there are occasions where the cached
consumer should not be used, and the developer should be able to choose.
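For reference, a minimal sketch of how such an application might opt out of
consumer caching with the spark-streaming-kafka-0-10 integration, assuming a
flag along the lines of `spark.streaming.kafka.consumer.cache.enabled` (as
discussed in this PR) is available; the topic names, broker addresses, and
group id below are placeholders:

```scala
import org.apache.kafka.common.serialization.StringDeserializer
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kafka010._

object UncachedConsumerExample {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setAppName("UncachedKafkaConsumers")
      // Opt out of cached Kafka consumers on the executors
      // (flag name assumed from this PR's discussion).
      .set("spark.streaming.kafka.consumer.cache.enabled", "false")

    val ssc = new StreamingContext(conf, Seconds(10))

    // Hypothetical topic list and broker addresses, mirroring the
    // 20-topic / 3-broker setup described above.
    val topics = (1 to 20).map(i => s"topic-$i")
    val kafkaParams = Map[String, Object](
      "bootstrap.servers" -> "broker1:9092,broker2:9092,broker3:9092",
      "key.deserializer" -> classOf[StringDeserializer],
      "value.deserializer" -> classOf[StringDeserializer],
      "group.id" -> "streaming-app",
      "auto.offset.reset" -> "latest",
      "enable.auto.commit" -> (false: java.lang.Boolean)
    )

    val stream = KafkaUtils.createDirectStream[String, String](
      ssc,
      LocationStrategies.PreferConsistent,
      ConsumerStrategies.Subscribe[String, String](topics, kafkaParams)
    )

    // Simple sanity workload: count records per batch.
    stream.map(record => (record.key, record.value)).count().print()

    ssc.start()
    ssc.awaitTermination()
  }
}
```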