I am running a simple Spark Structured Streaming application that pulls data
from a Kafka topic with nearly 1000 partitions. I am running this app on a
6-node EMR cluster with 4 cores and 16 GB RAM. I observed that Spark tries to
pull data from all 1024 Kafka partitions, and after running successfully for a
few iterations it gets stuck after the following log output:
20/04/18 00:51:41 INFO ContextCleaner: Cleaned accumulator 101
20/04/18 00:51:41 INFO ContextCleaner: Cleaned accumulator 66
20/04/18 00:51:41 INFO ContextCleaner: Cleaned accumulator 77
20/04/18 00:51:41 INFO ContextCleaner: Cleaned accumulator 78
20/04/18 00:51:41 INFO BlockManagerInfo: Removed broadcast_2_piece0 on  in memory (size: 4.5 KB, free: 2.7 GB)
20/04/18 00:51:41 INFO BlockManagerInfo: Removed broadcast_2_piece0 on ip- in memory (size: 4.5 KB, free: 2.7 GB)
20/04/18 00:51:41 INFO BlockManagerInfo: Removed broadcast_2_piece0 on ip- in memory (size: 4.5 KB, free: 2.7 GB)

After that, Spark shows the application as RUNNING, but it is not processing
any data.
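
For reference, the setup described above roughly corresponds to a reader like
the sketch below. It is only an illustration: the broker address, the topic
name, the helper function names, and the offset cap value are hypothetical and
not taken from the actual job. The option keys (kafka.bootstrap.servers,
subscribe, maxOffsetsPerTrigger) are the ones documented for Spark's Kafka
source; maxOffsetsPerTrigger limits how many offsets a single micro-batch reads
across all partitions, which may be worth checking with a topic this wide.

```python
def kafka_source_options(bootstrap_servers, topic, max_offsets_per_trigger=None):
    """Build the option map for Spark's Kafka structured streaming source.

    All argument values are placeholders chosen by the caller.
    """
    options = {
        "kafka.bootstrap.servers": bootstrap_servers,
        "subscribe": topic,
    }
    # Optional cap on the total number of offsets read per micro-batch.
    if max_offsets_per_trigger is not None:
        options["maxOffsetsPerTrigger"] = str(max_offsets_per_trigger)
    return options


def build_stream(spark, options):
    """Return a streaming DataFrame; `spark` is an existing SparkSession."""
    return spark.readStream.format("kafka").options(**options).load()


# Hypothetical values, for illustration only.
opts = kafka_source_options("broker-1:9092", "events", max_offsets_per_trigger=100000)
```

build_stream(spark, opts) would then be called from inside the application,
where a SparkSession with the Kafka connector on its classpath already exists.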
