Re: Apache Spark 3.4.3 (?)

2024-04-06 Thread Mridul Muralidharan
Hi Dongjoon, Thanks for volunteering! I would suggest waiting for SPARK-47318 to be merged as well for 3.4. Regards, Mridul On Sat, Apr 6, 2024 at 6:49 PM Dongjoon Hyun wrote: > Hi, All. > > Apache Spark 3.4.2 tag was created on Nov 24th and `branch-3.4` has 85 > commits including important

Re: External Spark shuffle service for k8s

2024-04-06 Thread Mich Talebzadeh
Thanks for your suggestion, which I take as a workaround. Whilst this workaround can potentially address storage allocation issues, I was more interested in exploring solutions that offer a more seamless integration with large distributed file systems like HDFS, GCS, or S3. This would ensure

Re: Apache Spark 3.4.3 (?)

2024-04-06 Thread Holden Karau
Sounds good to me :) Twitter: https://twitter.com/holdenkarau Books (Learning Spark, High Performance Spark, etc.): https://amzn.to/2MaRAG9 YouTube Live Streams: https://www.youtube.com/user/holdenkarau On Sat, Apr 6, 2024 at 2:51 PM Dongjoon Hyun wrote: > Hi, All.

Re: Vote on Dynamic resource allocation for structured streaming [SPARK-24815]

2024-04-06 Thread Pavan Kotikalapudi
Hi Jungtaek, Status on current SPARK-24815: Thomas Graves is reviewing the draft PR. I need to add documentation about the configs and usage details; I am planning to do that this week. He did mention

External Spark shuffle service for k8s

2024-04-06 Thread Mich Talebzadeh
I have seen some older references to a shuffle service for k8s, although it is not clear whether they are talking about a generic shuffle service for k8s. Anyhow, with the advent of GenAI and the need to allow for a larger volume of data, I was wondering if there has been any more work on this matter.

Re: External Spark shuffle service for k8s

2024-04-06 Thread Bjørn Jørgensen
You can make a PVC on K8S (call it 300GB) and make a folder in your Dockerfile:
    WORKDIR /opt/spark/work-dir
    RUN chmod g+w /opt/spark/work-dir
Then start Spark, adding this:
    .config("spark.kubernetes.driver.volumes.persistentVolumeClaim.300gb.options.claimName", "300gb") \
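
A minimal sketch of the kind of configuration the message above describes, offered as an illustration rather than as the author's full setup (the preview is cut off after the first `.config(...)` line). It builds the `spark.kubernetes.*.volumes.persistentVolumeClaim.*` keys for a claim named `300gb`. The helper name `pvc_volume_conf`, the choice to mount the claim on executors as well as the driver, the mount path `/opt/spark/work-dir`, and the `readOnly` value are all assumptions, not taken from the thread.

```python
def pvc_volume_conf(volume_name, claim_name, mount_path,
                    roles=("driver", "executor")):
    """Build Spark-on-Kubernetes settings that mount an existing PVC.

    Hypothetical helper: the name, the default roles, and the read-write
    mount are illustrative assumptions, not part of the original message.
    """
    conf = {}
    for role in roles:
        prefix = (
            f"spark.kubernetes.{role}.volumes."
            f"persistentVolumeClaim.{volume_name}"
        )
        # Name of the PersistentVolumeClaim that already exists in the cluster.
        conf[f"{prefix}.options.claimName"] = claim_name
        # Where the volume appears inside the pod (assumed path).
        conf[f"{prefix}.mount.path"] = mount_path
        # Mounted read-write so Spark can write scratch data (assumption).
        conf[f"{prefix}.mount.readOnly"] = "false"
    return conf


if __name__ == "__main__":
    settings = pvc_volume_conf("300gb", "300gb", "/opt/spark/work-dir")
    for key in sorted(settings):
        print(f"{key}={settings[key]}")
```

Each key/value pair could then be passed to `SparkSession.builder.config(key, value)` or to `spark-submit --conf key=value`. Whether executors should share the same claim depends on the volume's access mode, so treat the executor entries as optional.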

Apache Spark 3.4.3 (?)

2024-04-06 Thread Dongjoon Hyun
Hi, All. Apache Spark 3.4.2 tag was created on Nov 24th and `branch-3.4` has 85 commits including important security and correctness patches like SPARK-45580, SPARK-46092, SPARK-46466, SPARK-46794, and SPARK-46862. https://github.com/apache/spark/releases/tag/v3.4.2 $ git log --oneline