Usually this isn't done, as the data is meant to be on shared/distributed
storage, e.g. HDFS, S3, etc.
Spark then reads this data into a dataframe, and your code logic is
applied to the dataframe in a distributed manner.
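As a rough sketch of that pattern (the bucket path and column name are
made up):

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("example").getOrCreate()

    # Each executor reads its own slice of the files in parallel.
    df = spark.read.parquet("s3a://my-bucket/events/")

    # Transformations like this run distributed across the cluster.
    df.groupBy("status").count().show()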
On Wed, 29 Jan 2020 at 09:37, Tharindu Mathew wrote:
That was really helpful. Thanks! I actually solved my problem by
creating a venv and using the venv flags. Wondering now how to submit
the data as an archive? Any idea?
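For anyone searching later, the venv flags I mean look roughly like this
(paths are illustrative, spark.yarn.dist.archives assumes a YARN cluster,
and these would normally be passed as --conf flags to spark-submit):

    from pyspark.sql import SparkSession

    # Ship the packed venv to the cluster and point the executors'
    # Python interpreter at it.
    spark = (
        SparkSession.builder
        .config("spark.yarn.dist.archives", "pyspark_venv.tar.gz#environment")
        .config("spark.pyspark.python", "./environment/bin/python")
        .getOrCreate()
    )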
On Mon, Jan 27, 2020, 9:25 PM Chris Teoh wrote:
> Use --py-files
>
> See
>
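If I understand correctly, the programmatic counterpart of --py-files is
SparkContext.addPyFile; a rough sketch, with deps.zip and my_module as
placeholder names:

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.getOrCreate()

    # Same effect as spark-submit --py-files deps.zip: the zip is
    # shipped to every executor and put on the Python path there.
    spark.sparkContext.addPyFile("deps.zip")

    import my_module  # now importable inside tasks as well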
Hi,
I'd like to have a single standalone server, running as root on my machine,
on which jobs can be run from multiple user accounts on the same machine.
However, when I do this, writing files gives me an error similar to
the one in this Stack Overflow question