So I think I have a better idea of the problem now.

The environment is YARN client mode; IIRC PySpark doesn't run in YARN
cluster mode.

My client machine is heavily loaded, which causes it to lose a lot of
executors, and that might be part of the problem.

Btw, are there any plans to support PySpark in YARN cluster mode?
On Aug 7, 2014 3:04 PM, "Davies Liu" <dav...@databricks.com> wrote:

> What is the environment? YARN, Mesos, or Standalone?
>
> It would be more helpful if you could show more of the logs.
>
> On Wed, Aug 6, 2014 at 7:25 PM, Avishek Saha <avishek.s...@gmail.com>
> wrote:
> > Hi,
> >
> > I get a lot of "executor lost" errors for "saveAsTextFile" with PySpark
> > and Hadoop 2.4.
> >
> > For small datasets this error also occurs, but since the dataset is small
> > it eventually gets written to the file.
> > For large datasets, it takes forever to write the final output.
> >
> > Any help is appreciated.
> > Avishek
> >
>
