Re: {EXT} Re: Spark sql slowness in Spark 3.0.1

2022-04-15 Thread Anil Dasari
Hello, the DF is checkpointed here, so it is written to HDFS. The DF is written in Parquet format with the default parallelism. Thanks.
From: wilson
Date: Thursday, April 14, 2022 at 2:54 PM
To: user@spark.apache.org
Subject: {EXT} Re: Spark sql slowness in Spark 3.0.1
just curious, where to write

Re: Spark sql slowness in Spark 3.0.1

2022-04-14 Thread wilson
Just curious, where to write?
Anil Dasari wrote: We are upgrading Spark from 2.4.7 to 3.0.1. We use Spark SQL (Hive) to checkpoint data frames (intermediate data). The DF write is very slow in 3.0.1 compared to 2.4.7. - To un

Re: Spark sql slowness in Spark 3.0.1

2022-04-14 Thread Sergey B.
The suggestion is to check:
1. The format used for the write
2. The parallelism used
On Thu, Apr 14, 2022 at 7:13 PM Anil Dasari wrote:
> Hello,
>
> We are upgrading Spark from 2.4.7 to 3.0.1. We use Spark SQL (Hive) to
> checkpoint data frames (intermediate data). The DF write is very slow in 3.0.1
> comp