> On 14 Dec 2016, at 18:10, bhayat <baki...@gmail.com> wrote:
> 
> Hello,
> 
> I am writing my RDD in Parquet format, but as I understand it the write()
> method is still experimental, and I do not know how to deal with possible
> exceptions.
> 
> For example:
> 
> schemaXXX.write().mode(saveMode).parquet(parquetPathInHdfs);
> 
> In this example I do not know how to handle the exception if the Parquet
> path does not exist or the host is not reachable.


The parent path will be created if it does not exist. You are more likely to
see a problem if the final path already exists.

> 
> Do you have any way to do it?

Generally, catch the IOExceptions raised and report them. The HDFS IPC layer
has a fair amount of retry logic built in to handle transient outages of the
namenode/datanodes (and long GC pauses, which look similar); when that retry
logic gives up you'll see an IOException of some kind or other. All the other
filesystem API calls tend to raise IOExceptions too, so try/catch is your
friend.
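
For example, a minimal sketch along those lines, reusing the names from your
snippet (writeParquet is a made-up wrapper, not part of the Spark API). Note
that write() doesn't declare checked exceptions, so the IOException usually
arrives wrapped in a runtime exception and you have to dig for it:

    import java.io.IOException;

    import org.apache.spark.sql.Dataset;
    import org.apache.spark.sql.Row;
    import org.apache.spark.sql.SaveMode;

    static void writeParquet(Dataset<Row> schemaXXX, SaveMode saveMode,
        String parquetPathInHdfs) {
      try {
        schemaXXX.write().mode(saveMode).parquet(parquetPathInHdfs);
      } catch (Exception e) {
        // The failure usually surfaces as a runtime exception wrapping
        // the underlying IOException, so walk the cause chain to report
        // the root cause.
        Throwable cause = e;
        while (cause != null && !(cause instanceof IOException)) {
          cause = cause.getCause();
        }
        System.err.println("Failed to write " + parquetPathInHdfs + ": "
            + (cause != null ? cause : e));
        throw e;  // rethrow once reported, or handle it here
      }
    }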

What is hard is deciding what to do next. Retry? Give up? I don't think
there's a clear consensus there.
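
If you do decide to retry, one possible sketch is a bounded retry loop with
backoff. Whether retrying is even safe depends on your SaveMode: Overwrite
can be re-run, but ErrorIfExists cannot once a partial output directory
exists. writeWithRetries below is a hypothetical helper, not anything Spark
provides:

    static void writeWithRetries(Dataset<Row> df, String path,
        int maxAttempts) throws InterruptedException {
      for (int attempt = 1; ; attempt++) {
        try {
          // Overwrite makes the retry idempotent: a half-written
          // directory from a failed attempt is simply replaced.
          df.write().mode(SaveMode.Overwrite).parquet(path);
          return;
        } catch (Exception e) {
          if (attempt >= maxAttempts) {
            throw e;  // give up and let the caller decide
          }
          Thread.sleep(1000L * attempt);  // crude linear backoff
        }
      }
    }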

> 
> Thank you,
> 

