Kiran Shivappa Japannavar created SPARK-22991:
-------------------------------------------------

             Summary: High read latency with spark streaming 2.2.1 and kafka 
0.10.0.1
                 Key: SPARK-22991
                 URL: https://issues.apache.org/jira/browse/SPARK-22991
             Project: Spark
          Issue Type: Bug
          Components: Spark Core, Structured Streaming
    Affects Versions: 2.2.1
            Reporter: Kiran Shivappa Japannavar
            Priority: Critical


Spark 2.2.1 + Kafka 0.10 + Spark streaming.

Batch duration is 1 s, max rate per partition is 500, poll interval is 120 
seconds, max poll records is 500, the number of Kafka partitions is 500, and 
the cached consumer is enabled.
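For reference, the settings above roughly correspond to the following configuration (a sketch assuming the spark-streaming-kafka-0-10 integration; the property keys are inferred from the description, not copied from the actual job):

```properties
# Spark side (inferred): 500 records/s/partition cap, 120 s poll timeout, cached consumers
spark.streaming.kafka.maxRatePerPartition=500
spark.streaming.kafka.consumer.poll.ms=120000
spark.streaming.kafka.consumer.cache.enabled=true

# Kafka consumer side (inferred): at most 500 records per poll()
max.poll.records=500
```

The 1 s batch duration itself would be set when constructing the StreamingContext rather than through a property.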

While trying to read data from Kafka we intermittently observe very high read 
latencies. The high latencies cause the Kafka consumer session to expire, so 
the Kafka brokers remove the consumer from the group. The consumer keeps 
retrying and finally fails with:

[org.apache.kafka.clients.NetworkClient] - Disconnecting from node 12 due to 
request timeout
[org.apache.kafka.clients.NetworkClient] - Cancelled request ClientRequest
[org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient] - Cancelled 
FETCH request ClientRequest.
As a result, many batches remain in the queued state.

The high read latencies occur whenever multiple clients try to read data from 
the same Kafka cluster in parallel. The Kafka cluster has a large number of 
brokers and can support high network bandwidth.

When running Spark 1.5 with the Kafka 0.8 consumer client against the same 
Kafka cluster, we do not see any read latencies.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
