I have about 15-20 joins to perform. Each of these tables is on the order of 6 million to 66 million rows, and the number of columns ranges from 20 to 400.
I read the Parquet files to obtain SchemaRDDs, then use the join functionality on two SchemaRDDs at a time, joining each previous result with the next SchemaRDD. Any ideas on how to deal with such a join-intensive Spark SQL process? Any advice on better ways to handle the joins? I will appreciate all input. Thanks!
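For reference, here is a minimal sketch of the chained-join pattern described above, assuming Spark 1.x (the SchemaRDD era) with an existing `SparkContext` named `sc`; the file paths, table names, and the join key `id` are all placeholders, not real schema details:

```scala
import org.apache.spark.sql.SQLContext

// Sketch only: paths, table names, and the join key are hypothetical.
val sqlContext = new SQLContext(sc)

val customers = sqlContext.parquetFile("customers.parquet") // yields a SchemaRDD
val orders    = sqlContext.parquetFile("orders.parquet")

customers.registerTempTable("customers")
orders.registerTempTable("orders")

// Project only the columns you need BEFORE joining -- with 20-400
// columns per table, pruning early sharply reduces shuffled data.
val joined = sqlContext.sql(
  """SELECT c.id, c.name, o.amount
     FROM customers c
     JOIN orders o ON c.id = o.id""")

// If the next join reuses this intermediate result, cache it so the
// whole chain of earlier joins is not recomputed.
joined.cache()
joined.registerTempTable("joined")
// ...then join "joined" against the next table the same way.
```

The same idea extends to the full chain: register each intermediate result as a temp table and join it against the next one, caching whenever a result feeds more than one downstream join.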