Hi All,
After switching to 0.8 and reducing the number of partitions/tasks for a
large-scale computation, I have been unable to force Spark to use only
executors on nodes where the HBase data is local. I have not found any
setting for spark.locality.wait that makes a difference. It is not an
option for us to let Spark choose non-data-local nodes. Is there some
example code showing how to get this to work the way we want? We have our
own input RDD that mimics NewHadoopRDD, and it seems to be doing the
correct thing in all regards with respect to preferred locations.
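
For context, this is the kind of thing we have been trying (a sketch, not a confirmed fix; as far as I can tell, Spark 0.8 reads these settings from Java system properties, e.g. via SPARK_JAVA_OPTS, and the wait value here is an arbitrary large guess):

```properties
# Set before the SparkContext is created, e.g. in SPARK_JAVA_OPTS.
# A very large wait should, in theory, stop the scheduler from ever
# falling back to a non-data-local executor -- but it makes no
# difference for us.
-Dspark.locality.wait=100000000
```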

Do I have to write my own compute Tasks and schedule them myself?

Anyone have any suggestions? I am stumped.

cheers,
Erik
