Hi

I am new to Spark. Following is the problem I am facing:

Test 1) I ran a VM on a CDH distribution with only 1 core allocated to it,
and in spark-shell I ran a simple streaming example that sends data on port
7777 and tries to read it back. With 1 core allocated, nothing happens in my
streaming program and it does not receive any data. I then restarted the VM
with 2 cores allocated, started spark-shell again, ran the streaming example
again, and this time it works.
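For reference, the streaming example I ran is essentially the standard
socket word count (a sketch of the spark-shell session, assuming data is
fed to localhost:7777, e.g. with `nc -lk 7777`; not my exact session):

```scala
// Sketch of the streaming example typed into spark-shell.
// In spark-shell, `sc` (the SparkContext) already exists.
import org.apache.spark.streaming.{Seconds, StreamingContext}

val ssc = new StreamingContext(sc, Seconds(1))

// The socket receiver runs as a long-lived task and holds one task slot.
val lines = ssc.socketTextStream("localhost", 7777)

val counts = lines
  .flatMap(_.split(" "))
  .map(word => (word, 1))
  .reduceByKey(_ + _)

counts.print()

ssc.start()
ssc.awaitTermination()
```

(This sketch needs a running Spark environment and cannot be executed
standalone.)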

Query a) From this test I concluded that the Receiver in Streaming occupies
its core completely, even though I am sending very little data and it does
not need the whole core for that; and it does not yield this core to the
Executor for computing transformations. Comparing partition processing with
Receiver processing: the same physical cores can process multiple partitions
in parallel, but a Receiver will not allow its core to process anything
else. Is this understanding correct?

Test 2) I then restarted the VM with 1 core again and started spark-shell
--master local[2]. I have allocated only 1 core to the VM, but I tell
spark-shell to use 2 cores, and when I test the streaming program again it
somehow works.
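Concretely, the two invocations were (assuming a standard CDH install;
per the Spark docs, `local[N]` requests N worker threads, which the OS is
free to schedule onto however many physical cores the VM has):

```shell
# Test 1: default local master, effectively a single worker thread.
spark-shell

# Test 2: explicitly request two local worker threads,
# even though the VM itself has only 1 physical core.
spark-shell --master local[2]
```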

Query b) Now I am more confused, since the VM still has only 1 core. I
previously thought Test 1 failed because there was only 1 core and the
Receiver was blocking it completely, not sharing it with the Executor. But
when I start with local[2], still with only 1 core allocated to the VM, it
works. So it seems the Receiver and the Executor are both getting the same
physical CPU. Could you please explain how this case is different, and what
conclusions I should draw about physical CPU usage?

Thanks and Regards
Aniruddh
