reste85 commented on issue #1598: URL: https://github.com/apache/incubator-hudi/issues/1598#issuecomment-628462752
It seems like it gets stuck while reading from Kafka. We have 113 million records in our compacted topic, and on every run we try to read 50 million messages. The first two runs worked like a charm, but the third one gets stuck (i.e., when it is trying to read fewer than 50 million messages). If I remove the option `spark.network.timeout=500000` from the spark-submit conf, I get:

```
java.lang.IllegalArgumentException: requirement failed: Failed to get records for compacted spark-executor-topic_consumer topic-changelog-11 after polling for 310000
```

I'm trying to follow this post: https://stackoverflow.com/questions/42264669/spark-streaming-assertion-failed-failed-to-get-records-for-spark-executor-a-gro

Using these properties in the Kafka consumer:

- `spark.streaming.kafka.consumer.poll.ms=310000`
- `request.timeout.ms=30000`
- `max.poll.interval.ms=25000`

I'm still getting the same error.
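For reference, here is a rough sketch of how the settings above might be wired together in a spark-submit invocation. The master, jar path, main class, and properties-file name are placeholders and assumptions, not taken from the report; DeltaStreamer typically picks up Kafka consumer properties (such as `request.timeout.ms`) from a `--props` file rather than Spark conf:

```shell
# Sketch only: master, jar, class, and props file are hypothetical placeholders.
spark-submit \
  --master yarn \
  --class org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer \
  --conf spark.network.timeout=500000 \
  --conf spark.streaming.kafka.consumer.poll.ms=310000 \
  hudi-utilities-bundle.jar \
  --props kafka-source.properties  # would hold request.timeout.ms, max.poll.interval.ms
```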
