In Spark there are two levels of parallelism: a) across different workers, and b) within the same executor, where multiple cores can work on different partitions at the same time.

I know that in Apache Beam with Dataflow as the runner, partitioning is abstracted away. But does Dataflow use multiple cores on a worker to process different partitions simultaneously? My objective is to understand which machines I should use to run pipelines. Should I give any thought to the number of cores on a machine, or does it not matter?
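For reference, here is a minimal sketch of how I submit the pipeline with the Python SDK (the project, bucket, and region values are placeholders, and the machine type is just an example). I can pick the number of vCPUs per worker via the machine type, but it is unclear to me whether Dataflow actually uses those extra cores in parallel:

```python
# Minimal sketch of a Dataflow submission; project, bucket, and
# region below are placeholders, not real values.
import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions

options = PipelineOptions(
    runner="DataflowRunner",
    project="my-project",                # placeholder
    region="us-central1",                # placeholder
    temp_location="gs://my-bucket/tmp",  # placeholder
    machine_type="n1-standard-4",        # 4 vCPUs per worker -- does this matter?
    num_workers=2,
    max_num_workers=10,
)

with beam.Pipeline(options=options) as pipeline:
    (pipeline
     | "Create" >> beam.Create(range(1000))
     | "Square" >> beam.Map(lambda x: x * x)
     | "Print" >> beam.Map(print))
```

In other words: given that I can scale out with num_workers and max_num_workers, does scaling up the machine type (more cores per worker) buy me anything?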
Thanks,
Aniruddh