Spark has two levels of parallelism (sketched below):
a) across different workers, and
b) within the same executor, where multiple cores can work on different partitions concurrently.
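For concreteness, here is a minimal PySpark sketch of the two levels I mean; the executor and core counts are illustrative, not a recommendation:

    # Level a: executors spread across workers; level b: cores within one executor.
    from pyspark.sql import SparkSession

    spark = (SparkSession.builder
             .appName("parallelism-levels")
             .config("spark.executor.instances", "4")  # level a
             .config("spark.executor.cores", "8")      # level b
             .getOrCreate())

    # Up to 4 x 8 = 32 partitions can be processed at the same time.
    rdd = spark.sparkContext.parallelize(range(1000), numSlices=32)
    print(rdd.map(lambda x: x * x).sum())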

I know that in Apache Beam with Dataflow as the runner, partitioning is abstracted away. But does Dataflow use multiple cores on a worker to process different partitions at the same time?

My objective is to understand which machine types should be used to run pipelines. Should one give thought to the number of cores on a machine, or does it not matter?
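To make the question concrete, here is a minimal sketch of the kind of launch I have in mind (Beam Python SDK; the project, bucket, region, and machine type are placeholders, and the option names are the standard Dataflow worker options as I understand them):

    import apache_beam as beam
    from apache_beam.options.pipeline_options import PipelineOptions

    options = PipelineOptions(
        runner="DataflowRunner",
        project="my-project",                # placeholder
        region="us-central1",                # placeholder
        temp_location="gs://my-bucket/tmp",  # placeholder
        machine_type="n1-standard-4",        # is the core count here worth tuning?
        max_num_workers=10,
    )

    # A trivial pipeline, just to show where the options plug in.
    with beam.Pipeline(options=options) as p:
        (p
         | beam.Create(range(1000))
         | beam.Map(lambda x: x * x))

In other words: would switching machine_type to a many-core VM actually let Dataflow process more partitions in parallel per worker, or should I just scale out with more workers?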

Thanks
Aniruddh
