cannot write spark log to s3a

2022-11-09 Thread second_co...@yahoo.com.INVALID
When running a Spark job, I used "spark.eventLog.dir": "s3a://_some_bucket_on_prem/spark-history", "spark.eventLog.enabled": true. The job log shows: 22/11/10 06:42:30 INFO SingleEventLogFileWriter: Logging events to s3a://_some_bucket_on_prem/spark-history/spark-a2befd8cb91341
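
For reference, the two event-log settings quoted in this message can be assembled as in the minimal sketch below. It is not a diagnosis of the poster's problem, since the preview is cut off before the error is described. The helper names `event_log_conf` and `to_submit_args` and the endpoint URL `http://s3.example.internal:9000` are hypothetical. The sketch also assumes, without confirmation from the thread, that an on-prem S3-compatible store needs the Hadoop S3A properties `fs.s3a.endpoint` and `fs.s3a.path.style.access`, passed through Spark's `spark.hadoop.` prefix.

```python
def event_log_conf(log_dir, endpoint=None, path_style=True):
    """Return Spark properties that enable event logging to log_dir.

    endpoint and path_style are assumptions for an on-prem S3-compatible
    store; the original message does not say which S3A settings were used.
    """
    conf = {
        "spark.eventLog.enabled": "true",
        "spark.eventLog.dir": log_dir,
    }
    if endpoint is not None:
        # Hadoop S3A properties, forwarded via the "spark.hadoop." prefix.
        conf["spark.hadoop.fs.s3a.endpoint"] = endpoint
        conf["spark.hadoop.fs.s3a.path.style.access"] = str(path_style).lower()
    return conf


def to_submit_args(conf):
    """Render the properties as a list of spark-submit --conf arguments."""
    args = []
    for key in sorted(conf):
        args += ["--conf", f"{key}={conf[key]}"]
    return args


if __name__ == "__main__":
    conf = event_log_conf(
        "s3a://_some_bucket_on_prem/spark-history",
        endpoint="http://s3.example.internal:9000",  # hypothetical endpoint
    )
    print(" ".join(to_submit_args(conf)))
```

The same dictionary could be passed key by key to a session builder instead of the command line; which route fits depends on how the job is launched.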

[Spark Core] Adaptive dynamic partition pruning

2022-11-09 Thread hajyoussef amine
Hello everyone, let me take the following Spark SQL example to demonstrate the issue we're having: ``` SELECT * FROM small_table INNER JOIN big_table ON small_table.foreign_key = big_table.partition_key INNER JOIN bigger_table ON big_table.foreign_key = bigger_table.partition_key WHERE

Offline elastic index creation

2022-11-09 Thread Vibhor Gupta
Hi Spark Community, is there a way to create Elasticsearch indexes offline and then import them into an Elasticsearch cluster? We are trying to load an index with around 10B documents (~1.5 to 2 TB of data) using Spark daily. I know Elasticsearch provides snapshot and restore functionality through GCS/S3/Azure