When running a Spark job, I used:

```
"spark.eventLog.dir": "s3a://_some_bucket_on_prem/spark-history",
"spark.eventLog.enabled": true
```

and the job's log shows:

```
22/11/10 06:42:30 INFO SingleEventLogFileWriter: Logging events to s3a://_some_bucket_on_prem/spark-history/spark-a2befd8cb91341
```
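For reference, here is a minimal PySpark sketch of the same setup. The event-log keys are the ones from the post; the commented s3a endpoint line is an assumption, since an on-prem S3-compatible store usually needs its own Hadoop wiring:

```python
from pyspark.sql import SparkSession

# Event-log settings must be in place before the SparkContext starts,
# so they go on the builder (or in spark-defaults.conf / spark-submit --conf).
spark = (
    SparkSession.builder
    .appName("event-log-demo")
    .config("spark.eventLog.enabled", "true")
    .config("spark.eventLog.dir", "s3a://_some_bucket_on_prem/spark-history")
    # Assumption: an on-prem object store typically also needs s3a settings, e.g.:
    # .config("spark.hadoop.fs.s3a.endpoint", "https://object-store.internal")
    .getOrCreate()
)
```

Note that the History Server reads its own setting, spark.history.fs.logDirectory, which has to point at the same location for these logs to show up in its UI.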
Hello everyone,
Let me use the following Spark SQL example to demonstrate the issue we're having:
```
SELECT *
FROM small_table
INNER JOIN big_table
  ON small_table.foreign_key = big_table.partition_key
INNER JOIN bigger_table
  ON big_table.foreign_key = bigger_table.partition_key
WHERE
```
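Since the question concerns how this join behaves, a quick way to see what Spark actually plans for it is to print the query plan. A minimal sketch, assuming the three tables are already registered and a SparkSession is bound to `spark` (the WHERE clause is omitted here because it was cut off above):

```python
# The truncated WHERE clause from the post is omitted.
query = """
SELECT *
FROM small_table
INNER JOIN big_table
  ON small_table.foreign_key = big_table.partition_key
INNER JOIN bigger_table
  ON big_table.foreign_key = bigger_table.partition_key
"""

# extended=True prints the parsed, analyzed, optimized, and physical plans,
# which shows whether Spark picked a broadcast join for small_table and
# how it treats the partition keys of the two large tables.
spark.sql(query).explain(extended=True)
```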
Hi Spark Community,
Is there a way to create Elasticsearch indexes offline and then import them into an Elasticsearch cluster?
We are trying to load an Elasticsearch index with around 10B documents (~1.5 to 2 TB of data) using Spark daily.
I know Elasticsearch provides snapshot/restore functionality through GCS/S3/Azure.
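For context on the loading side, this is roughly what the daily bulk-index path looks like through the elasticsearch-hadoop connector. It is a sketch only; the source path, node address, batch size, and index name are hypothetical placeholders:

```python
# Assumes the elasticsearch-hadoop connector is on the classpath, e.g.
#   spark-submit --packages org.elasticsearch:elasticsearch-spark-30_2.12:<version>
df = spark.read.parquet("s3a://_some_bucket/daily_documents")  # hypothetical source

(
    df.write
    .format("org.elasticsearch.spark.sql")
    .option("es.nodes", "es-host.example.com")   # hypothetical cluster address
    .option("es.batch.size.entries", "10000")    # bulk request size, tune for throughput
    .mode("append")
    .save("daily_index")                         # hypothetical index name
)
```

At ~10B documents per day, this direct bulk-indexing path tends to be the bottleneck, which is what makes an offline build plus a snapshot/restore-style import attractive.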