I think it is not happening because partition creation is a DDL-time operation, and an upsert does not recreate the partition; it is just a DML statement.
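To make that concrete: if new partition data was written outside of Spark/Hive, a DDL step is usually needed before the table sees it. A minimal sketch (the table name `sales` and partition column `dt` are hypothetical):

```scala
// After writing files for a new partition directly under the external
// table's location, Hive/Spark will not see them until the partition is
// registered. Table name `sales` and partition column `dt` are made up.

// Option 1: register the single new partition explicitly (DDL).
spark.sql("ALTER TABLE sales ADD IF NOT EXISTS PARTITION (dt='2025-05-02')")

// Option 2: scan the table location and register all missing partitions.
spark.sql("MSCK REPAIR TABLE sales")

// A plain DML upsert/INSERT only modifies data in partitions the
// metastore already knows about; it does not run partition discovery.
```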
On Friday, May 2, 2025, 7:53 AM, Pradeep wrote:
I have a partitioned Hive external table as below:

scala> spark.sql("describe
Can you please explain how you realized it's wrong? Did you check CloudWatch for the same metrics and compare them? Also, are you using df.cache() and expecting the shuffle read/write to go away?
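On the cache point: `cache()` only avoids recomputation on actions after the first one, so shuffle metrics will still appear for the first job. A hedged sketch (dataframe and column names are hypothetical):

```scala
// Hypothetical wide transformation: groupBy forces a shuffle.
val agg = df.groupBy("key").count()

// cache() is lazy: nothing is materialized yet.
val cached = agg.cache()

// First action: the shuffle still runs here, so the Spark UI (and any
// metrics forwarded to CloudWatch) will report shuffle read/write.
cached.count()

// Subsequent actions read the cached post-shuffle partitions, so shuffle
// metrics should disappear for these jobs, not for the first one.
cached.show()
```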
On Sunday, May 26, 2024, 7:53 AM, Prem Sahoo wrote:
Can anyone p
What I can immediately think of is: since you are using IN in the WHERE clause for a series of timestamps, consider breaking them up and, for each epoch timestamp, loading your results into an intermediate staging table, then doing a final aggregate from that table while keeping the GROUP BY the same.
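The staging-table approach above could be sketched like this (table names, column names, and timestamps are made up for illustration):

```scala
// Hypothetical: break a large IN (...) list of epoch timestamps into
// per-timestamp loads into a staging table, then aggregate once.
val timestamps = Seq(1716700000L, 1716703600L, 1716707200L) // example epochs

timestamps.foreach { ts =>
  spark.sql(
    s"""INSERT INTO staging_events
       |SELECT * FROM events WHERE event_ts = $ts""".stripMargin)
}

// Final aggregate keeps the same GROUP BY as the original query.
val result = spark.sql(
  """SELECT user_id, count(*) AS cnt
    |FROM staging_events
    |GROUP BY user_id""".stripMargin)
```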
Hi, my question is about the ability to integrate Spark streaming with multiple clusters. Is it a supported use case? An example: two topics owned by different groups, each with their own Kafka infra. Can I have two dataframes as a result of spark.readStream listening to different Kafka clusters?
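Reading from two independent Kafka clusters in one Spark application is supported: each `readStream` carries its own `kafka.bootstrap.servers` option. A sketch (broker addresses and topic names are placeholders):

```scala
// Two independent streaming sources, one per Kafka cluster.
val dfA = spark.readStream
  .format("kafka")
  .option("kafka.bootstrap.servers", "cluster-a-broker:9092") // placeholder
  .option("subscribe", "topicA")                              // placeholder
  .load()

val dfB = spark.readStream
  .format("kafka")
  .option("kafka.bootstrap.servers", "cluster-b-broker:9092") // placeholder
  .option("subscribe", "topicB")                              // placeholder
  .load()

// Each stream can be processed independently, or unioned/joined
// downstream; Spark tracks offsets per source in the checkpoint.
```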