gzip'd json file. gzip files are NOT splittable, so it wasn't properly
parallelized, which means the join was causing a lot of memory
pressure. I recompressed it with bzip2 and my job has been running with no
errors.
Thanks again!
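For reference, the recompression step described above can be sketched in Python with the standard-library gzip and bz2 modules (file names here are hypothetical; the command-line gunzip/bzip2 tools work just as well). bzip2 output is block-structured, which is why Hadoop/Spark input formats can split it across tasks, while a single gzip stream cannot be split:

```python
import bz2
import gzip
import os
import shutil
import tempfile

def recompress_gzip_to_bzip2(src_path, dst_path, chunk_size=1 << 20):
    """Stream-decompress a .gz file and rewrite it as .bz2.

    Streams in chunks so even large files never have to fit in memory.
    """
    with gzip.open(src_path, "rb") as src, bz2.open(dst_path, "wb") as dst:
        shutil.copyfileobj(src, dst, chunk_size)

# Round-trip a small JSON-lines payload as a sanity check.
tmp = tempfile.mkdtemp()
gz_path = os.path.join(tmp, "data.json.gz")
bz_path = os.path.join(tmp, "data.json.bz2")
payload = b'{"id": 1}\n{"id": 2}\n'

with gzip.open(gz_path, "wb") as f:
    f.write(payload)

recompress_gzip_to_bzip2(gz_path, bz_path)

with bz2.open(bz_path, "rb") as f:
    assert f.read() == payload
```

After recompressing, Spark will read the .bz2 file into multiple partitions instead of forcing the whole file through one task.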
g Row was an even larger 2.6
MB?
Any words of wisdom would be really appreciated!
--
View this message in context:
http://apache-spark-user-list.1001560.n3.nabble.com/OutOfMemory-with-wide-289-column-dataframe-tp26651.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.