Hi,

This is a general question about moving a Spark SQL query to PySpark; if
needed I can add more details from the error logs and the query syntax.
I'm trying to move a Spark SQL query to run through PySpark.
The query syntax and Spark configuration are the same.
For some reason the query fails when run through PySpark, with a Java heap
space error.
In the Spark SQL query I'm using INSERT OVERWRITE on a partition, while in
PySpark I'm using a DataFrame write to a specific location in S3.
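Roughly, the two write paths look like this (a minimal sketch; the table,
column, and bucket names are placeholders, not my actual ones):

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.getOrCreate()

    # Spark SQL path: overwrite a single partition of a hypothetical table
    spark.sql("""
        INSERT OVERWRITE TABLE events PARTITION (dt = '2019-01-01')
        SELECT user_id, event_type, ts
        FROM raw_events
        WHERE dt = '2019-01-01'
    """)

    # PySpark path: same select, but written as a DataFrame to an S3 prefix
    df = spark.sql("""
        SELECT user_id, event_type, ts
        FROM raw_events
        WHERE dt = '2019-01-01'
    """)

    (df.write
       .mode("overwrite")
       .parquet("s3://my-bucket/events/dt=2019-01-01/"))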

Are there any configuration differences you think I might need to change?


Thanks,

-- 
Tzahi
Data Engineer
