Hello,
The DF is checkpointed here, so it is written to HDFS. It is written in Parquet
format, using the default parallelism.
Thanks.
From: wilson
Date: Thursday, April 14, 2022 at 2:54 PM
To: user@spark.apache.org
Subject: {EXT} Re: Spark sql slowness in Spark 3.0.1
just curious, where to write?
Anil Dasari wrote:
We are upgrading Spark from 2.4.7 to 3.0.1. We use Spark SQL (Hive) to
checkpoint data frames (intermediate data). The DF write is very slow in
3.0.1 compared to 2.4.7.
The suggestion is to check:
1. The format used for the write
2. The parallelism used for the write
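
One possible way to gather those two pieces of information on both versions is sketched below in PySpark. This is only an illustration and not something taken from the job in question: the helper names describe_write_settings and timed_parquet_write are hypothetical, and it assumes an existing SparkSession (spark) and the DataFrame being checkpointed (df). Running the same helpers on 2.4.7 and 3.0.1 should show whether the partition count or the related settings differ between the two.

```python
import time


def describe_write_settings(spark, df):
    """Collect settings that typically influence how many tasks a write uses.

    Hypothetical helper: `spark` is a SparkSession, `df` a DataFrame.
    """
    return {
        # Each partition of the DataFrame becomes one write task.
        "df_partitions": df.rdd.getNumPartitions(),
        # Default parallelism reported by the SparkContext.
        "default_parallelism": spark.sparkContext.defaultParallelism,
        # Partition count used after shuffles in Spark SQL.
        "shuffle_partitions": spark.conf.get("spark.sql.shuffle.partitions"),
        # Adaptive query execution; falls back to "false" if the key is unset.
        "aqe_enabled": spark.conf.get("spark.sql.adaptive.enabled", "false"),
    }


def timed_parquet_write(df, path):
    """Write `df` as Parquet to `path` (overwriting) and return elapsed seconds."""
    start = time.perf_counter()
    df.write.mode("overwrite").parquet(path)
    return time.perf_counter() - start
```

For example, print(describe_write_settings(spark, df)) followed by print(timed_parquet_write(df, path)), with path pointing at a scratch HDFS directory of your choosing, gives numbers that can be compared side by side across the two Spark versions.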
On Thu, Apr 14, 2022 at 7:13 PM Anil Dasari wrote:
> Hello,
>
>
>
> We are upgrading spark from 2.4.7 to 3.0.1. we use spark sql (hive) to
> checkpoint data frames (intermediate data). DF write is very slow in 3.0.1
> compared to 2.4.7.