Re: DataFrameReader read from S3 org.apache.spark.sql.AnalysisException: Path does not exist

2017-07-13 Thread Sumona Routh
Yes, which is what I eventually did. I wanted to check whether there was
some "mode" option, similar to SaveMode for writers. It appears that there
genuinely is no such option, so the client has to handle this through the
exception flow.

Thanks,
Sumona
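
For anyone finding this thread later, the catch-and-fall-back approach
discussed here could be sketched roughly as follows. This is only a minimal
sketch under stated assumptions, not the code actually used: the object and
method names (SafeParquetRead, readParquetOrEmpty) are hypothetical, the
caller is assumed to know the expected schema, and matching on the exception
message text is an assumption that may be brittle across Spark versions.

```scala
import org.apache.spark.sql.{AnalysisException, DataFrame, Row, SparkSession}
import org.apache.spark.sql.types.StructType

// Hypothetical helper object; none of these names come from Spark itself.
object SafeParquetRead {

  // Tries to read all paths as Parquet. If Spark reports a missing path,
  // returns an empty DataFrame with the caller-supplied schema instead.
  def readParquetOrEmpty(
      spark: SparkSession,
      paths: Seq[String],
      schema: StructType): DataFrame =
    try {
      spark.read.parquet(paths: _*)
    } catch {
      // Assumption: the message contains "Path does not exist", as in the
      // error quoted in this thread. Other analysis errors are rethrown.
      case e: AnalysisException
          if e.getMessage != null && e.getMessage.contains("Path does not exist") =>
        spark.createDataFrame(spark.sparkContext.emptyRDD[Row], schema)
    }
}
```

Note that this is all-or-nothing: as far as I can tell, a single missing
path makes the whole read fail, so the data under the paths that do exist is
dropped as well. An explicit schema is passed in because
spark.emptyDataFrame has no columns and would not line up with the real data
in a later union.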

On Wed, Jul 12, 2017 at 4:59 PM Yong Zhang <java8...@hotmail.com> wrote:

> Can't you just catch that exception and return an empty dataframe?
>
>
> Yong
>
>
> --
> *From:* Sumona Routh <sumos...@gmail.com>
> *Sent:* Wednesday, July 12, 2017 4:36 PM
> *To:* user
> *Subject:* DataFrameReader read from S3
> org.apache.spark.sql.AnalysisException: Path does not exist
>
> Hi there,
> I'm trying to read a list of paths from S3 into a dataframe for a window
> of time using the following:
>
> sparkSession.read.parquet(listOfPaths:_*)
>
> In some cases, the path may not be there because there is no data, which
> is an acceptable scenario.
> However, Spark throws an AnalysisException: Path does not exist. Is there
> an option I can set to tell it to gracefully return an empty dataframe if a
> particular path is missing? Looking at the spark code, there is an option
> checkFilesExist, but I don't believe that is set in the particular flow of
> code that I'm accessing.
>
> Thanks!
> Sumona
>
>


Re: DataFrameReader read from S3 org.apache.spark.sql.AnalysisException: Path does not exist

2017-07-12 Thread Yong Zhang
Can't you just catch that exception and return an empty dataframe?


Yong



From: Sumona Routh <sumos...@gmail.com>
Sent: Wednesday, July 12, 2017 4:36 PM
To: user
Subject: DataFrameReader read from S3 org.apache.spark.sql.AnalysisException: 
Path does not exist

Hi there,
I'm trying to read a list of paths from S3 into a dataframe for a window of 
time using the following:

sparkSession.read.parquet(listOfPaths:_*)

In some cases, the path may not be there because there is no data, which is an 
acceptable scenario.
However, Spark throws an AnalysisException: Path does not exist. Is there an 
option I can set to tell it to gracefully return an empty dataframe if a 
particular path is missing? Looking at the spark code, there is an option 
checkFilesExist, but I don't believe that is set in the particular flow of code 
that I'm accessing.

Thanks!
Sumona



DataFrameReader read from S3 org.apache.spark.sql.AnalysisException: Path does not exist

2017-07-12 Thread Sumona Routh
Hi there,
I'm trying to read a list of paths from S3 into a dataframe for a window of
time using the following:

sparkSession.read.parquet(listOfPaths:_*)

In some cases, the path may not be there because there is no data, which is
an acceptable scenario.
However, Spark throws an AnalysisException: Path does not exist. Is there
an option I can set to tell it to gracefully return an empty dataframe if a
particular path is missing? Looking at the spark code, there is an option
checkFilesExist, but I don't believe that is set in the particular flow of
code that I'm accessing.

Thanks!
Sumona
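
One alternative to relying on the exception, sketched here only as an
illustration, is to drop the missing paths before calling the reader by
checking each one through the Hadoop FileSystem API. This is a different
technique from the one settled on in the replies. The object and method
names (ExistingPathsRead, existingPaths, readExisting) are hypothetical, and
the sketch assumes plain paths without glob patterns.

```scala
import org.apache.hadoop.fs.Path
import org.apache.spark.sql.{DataFrame, SparkSession}

// Hypothetical helper object; none of these names come from Spark itself.
object ExistingPathsRead {

  // Keeps only the paths that exist at the time of the check.
  // Assumption: paths are literal (no globs such as "s3://bucket/2017/*").
  def existingPaths(spark: SparkSession, paths: Seq[String]): Seq[String] = {
    val hadoopConf = spark.sparkContext.hadoopConfiguration
    paths.filter { p =>
      val path = new Path(p)
      // Resolves the file system from the path's scheme (s3a, hdfs, file, ...)
      path.getFileSystem(hadoopConf).exists(path)
    }
  }

  // Reads the surviving paths; returns None when none of them exist, since
  // there is then nothing for Spark to infer a Parquet schema from.
  def readExisting(spark: SparkSession, paths: Seq[String]): Option[DataFrame] = {
    val present = existingPaths(spark, paths)
    if (present.isEmpty) None else Some(spark.read.parquet(present: _*))
  }
}
```

Usage would look like ExistingPathsRead.readExisting(sparkSession,
listOfPaths), with the caller deciding what an empty result means. The
trade-off is one extra existence check per path before the read, and a path
could still appear or disappear between the check and the read.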