I use Spark in a cluster shared with other applications. The number of
nodes (and cores) assigned to my job varies depending on how many unrelated
jobs are running in the same cluster.

Is there any way for me to determine at runtime how many cores have been
allocated to my job, so I can select an appropriate partitioning strategy?

I've tried calling SparkContext.getExecutorMemoryStatus.size, but if I call
this early in the job (which is when I want this information), the executors
haven't registered with the driver yet, and I get 0.
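
For reference, this is roughly the probe I'm running, boiled down to a
minimal standalone example (the object name and app name are just
placeholders):

    import org.apache.spark.{SparkConf, SparkContext}

    object CoreCountProbe {
      def main(args: Array[String]): Unit = {
        val sc = new SparkContext(new SparkConf().setAppName("core-count-probe"))

        // Map of block manager "host:port" -> memory status; I was hoping its
        // size would tell me how many executors I have, but this early in the
        // job the executors haven't registered with the driver yet.
        val executorCount = sc.getExecutorMemoryStatus.size
        println(s"block managers seen at startup: $executorCount")

        sc.stop()
      }
    }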

Has anyone else found a way to dynamically adjust their partitions to match
unpredictable node allocation?
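
Concretely, what I'm hoping to end up with is something along these lines:
wait until at least some executors have registered, then size the
partitioning off that count. The helper name, the 30-second timeout, the
spark.executor.cores fallback, and the partitions-per-core factor below are
all placeholders of mine, not tested code:

    import org.apache.spark.SparkContext

    object PartitionSizing {
      // Poll until at least minExecutors block managers (besides the driver)
      // have registered, or give up after timeoutMs.
      def waitForExecutors(sc: SparkContext, minExecutors: Int, timeoutMs: Long): Int = {
        val deadline = System.currentTimeMillis() + timeoutMs
        def executorCount: Int = sc.getExecutorMemoryStatus.size - 1  // minus the driver
        while (executorCount < minExecutors && System.currentTimeMillis() < deadline) {
          Thread.sleep(500)
        }
        math.max(executorCount, 1)
      }

      def choosePartitions(sc: SparkContext): Int = {
        val executors = waitForExecutors(sc, minExecutors = 1, timeoutMs = 30000)
        val coresPerExecutor = sc.getConf.getInt("spark.executor.cores", 1)
        // rule of thumb: 2-3 partitions per core actually available to the job
        executors * coresPerExecutor * 2
      }
    }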
