When it works:
- 1 computer as Master and another computer running 3 workers, which also holds the files to process.
When it fails:
- When running in a cluster with multiple workers and files spread across multiple computers, jobs are not assigned to the nodes where the files are local (see the configuration sketch after this list).
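One scheduler knob that interacts with this (an assumption on my part, not something confirmed above): Spark's delay scheduling stops waiting for a node-local slot after spark.locality.wait (3s by default) and runs the task elsewhere. A minimal sketch of raising that wait; the master URL and the 30s values are illustrative:

import org.apache.spark.{SparkConf, SparkContext}

// Sketch only: "spark://master:7077" and the 30s waits are placeholders.
// A longer wait makes the scheduler hold a task for a NODE_LOCAL slot
// instead of quickly falling back to ANY, which is one common reason
// tasks land on non-local workers.
val conf = new SparkConf()
  .setAppName("locality-test")
  .setMaster("spark://master:7077")
  .set("spark.locality.wait", "30s")       // default is 3s
  .set("spark.locality.wait.node", "30s")  // node-level wait specifically
val sc = new SparkContext(conf)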
Thanks,
Raajen
I would like to use Spark on a non-distributed file system, but I am having
trouble getting the driver to assign tasks to the workers that are local to
the files. I have extended InputSplit to create my own version of
FileSplit, so that each worker gets a bit more information than the default
FileSplit provides.
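For illustration, here is a minimal sketch of what such an extended split might look like; the class name AnnotatedFileSplit and its extraInfo field are placeholders, not the actual code from the post:

import java.io.{DataInput, DataOutput}

import org.apache.hadoop.fs.Path
import org.apache.hadoop.io.Text
import org.apache.hadoop.mapreduce.lib.input.FileSplit

// A FileSplit subclass carrying one extra field to the worker, plus
// explicit host names. Note that the stock FileSplit does not serialize
// its hosts in write()/readFields(); Spark reads getLocations() on the
// driver, from the split objects the InputFormat's getSplits() returns.
class AnnotatedFileSplit(
    file: Path,
    start: Long,
    length: Long,
    hosts: Array[String],
    var extraInfo: String
) extends FileSplit(file, start, length, hosts) {

  // No-arg constructor required for Writable deserialization.
  def this() = this(null, 0L, 0L, Array.empty[String], "")

  override def write(out: DataOutput): Unit = {
    super.write(out)  // path, start, length
    Text.writeString(out, extraInfo)
  }

  override def readFields(in: DataInput): Unit = {
    super.readFields(in)
    extraInfo = Text.readString(in)
  }
}

For the locality part specifically: Spark matches the strings returned by getLocations() against its executors' host names, so the two must agree exactly (e.g. both fully qualified or both not); a hostname/IP mismatch silently loses locality.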