OUTER JOIN y ON field1=field2")
joined_df.persist(StorageLevel.MEMORY_AND_DISK)  # there is no StorageLevel.MEMORY_AND_DISK_ONLY
joined_df.write.save("/user/data/output")
--
View this message in context:
http://apache-spark-user-list.1001560.n3.nabble.com/Spark-SQL-Join-operation-failure-tp28414p28422.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
From: jatinpreet
Sent: Wednesday, February 22, 2017 1:11 AM
To: user@spark.apache.org
Subject: Spark SQL : Join operation failure
Hi,
I am having a hard time running an outer join operation on two Parquet
datasets. The dataset size is large, ~500 GB, with a lot of columns, to the tune of
(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
)
I would appreciate it if someone could help me out with this.