I would still look at your executor logs. A count() is rewritten by the
optimizer to be much more efficient, because it doesn't actually need any of
the columns. Also, writing Parquet allocates quite a few large buffers.
On Wed, Jul 1, 2015 at 5:42 AM, Pooja Jain wrote:
By any chance, are you using a time field in your df? Time fields are known
to be notorious in RDD conversion.
On Jul 1, 2015 6:13 PM, "Pooja Jain" wrote:
Join is happening successfully, as I am able to do count() after the join.
The error is coming only while trying to write in Parquet format on HDFS.
Thanks,
Pooja.
On Wed, Jul 1, 2015 at 1:06 PM, Akhil Das wrote:
It says:
Caused by: java.net.ConnectException: Connection refused: slave2/...:54845
Could you look in the executor logs (stderr on slave2) and see what made it
shut down? Since you are doing a join, there's a high possibility of an OOM, etc.
Thanks
Best Regards
On Wed, Jul 1, 2015 at 10:20 AM, Pooja Jain wrote:
Hi,
We are using Spark 1.4.0 on Hadoop in yarn-cluster mode via spark-submit.
We are facing a Parquet write issue after doing dataframe joins.
We have a full data set and then incremental data. We are reading them
as dataframes, joining them, and then writing the data to the HDFS system
in Parquet format.
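The pipeline described above can be sketched as follows, assuming Spark 1.4's SQLContext API; the HDFS paths and the join key "id" are illustrative assumptions:

```scala
// Sketch only: read full and incremental data sets as DataFrames,
// join them, and write the result back to HDFS as Parquet.
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext

val sc = new SparkContext(new SparkConf().setAppName("full-plus-incremental"))
val sqlContext = new SQLContext(sc)

val full = sqlContext.read.parquet("hdfs:///data/full")        // full data set
val incr = sqlContext.read.parquet("hdfs:///data/incremental") // incremental data

// Join on an assumed key column "id", then write the result to HDFS.
val joined = full.join(incr, full("id") === incr("id"))
joined.write.parquet("hdfs:///data/joined")
```

Submitted with spark-submit in yarn-cluster mode, the write step is where the executors allocate Parquet buffers, so executor stderr on the failing node is the place to look.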