If you have a Hadoop cluster, why not put the file on HDFS?

On Thu, Mar 9, 2023 at 3:02 PM sam smith <qustacksm2123...@gmail.com> wrote:
> Hello,
>
> I use YARN client mode to submit my driver program to Hadoop. The dataset
> I load is on the local file system, and when I invoke load("file://path")
> Spark complains that the CSV file cannot be found. I understand why: the
> dataset is not on any of the workers or the ApplicationMaster, only on the
> machine where the driver program resides.
>
> I tried to share the file using the configurations:
>
>> *spark.yarn.dist.files* OR *spark.files*
>
> but neither works.
>
> My question is: how can I share the CSV dataset across the nodes at the
> specified path?
>
> Thanks.
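
For the HDFS suggestion, here is a rough PySpark sketch of what that could look like: copy the CSV into HDFS once from the driver host, then read it through an hdfs:// URI so every executor can reach it. The local path, the HDFS directory, the app name, and the helper names (hdfs_put_command, hdfs_uri, main) are all made up for illustration, and the sketch assumes the `hdfs` CLI is on the driver's PATH and that the CSV has a header row. As far as I know, files shipped with spark.files or spark.yarn.dist.files land in each container's working directory rather than at the original absolute path, which would explain why load("file://path") still fails.

```python
import subprocess
from typing import List


def hdfs_put_command(local_path: str, hdfs_dir: str) -> List[str]:
    """Build the `hdfs dfs -put` command that copies a local file into HDFS."""
    # -f overwrites an existing copy, so re-running the job does not fail.
    return ["hdfs", "dfs", "-put", "-f", local_path, hdfs_dir]


def hdfs_uri(local_path: str, hdfs_dir: str) -> str:
    """Return the hdfs:// URI the file will have after the upload."""
    filename = local_path.rstrip("/").split("/")[-1]
    # "hdfs://" followed by an absolute path uses the cluster's default namenode.
    return "hdfs://" + hdfs_dir.rstrip("/") + "/" + filename


def main() -> None:
    # Imported here so the helpers above can be used without PySpark installed.
    from pyspark.sql import SparkSession

    local_csv = "/tmp/dataset.csv"   # hypothetical path on the driver host
    target_dir = "/user/sam/input"   # hypothetical HDFS directory

    # Upload from the driver host; raises CalledProcessError if the copy fails.
    subprocess.run(hdfs_put_command(local_csv, target_dir), check=True)

    spark = SparkSession.builder.appName("csv-from-hdfs").getOrCreate()
    # Assumes the CSV has a header row; drop the option if it does not.
    df = spark.read.option("header", "true").csv(hdfs_uri(local_csv, target_dir))
    df.show(5)
    spark.stop()
```

Call main() from the driver script you submit in YARN client mode. The upload only needs to happen once, so in practice the `hdfs dfs -put` step can also be run by hand before submitting the job.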