Re: When and how does Spark use metastore statistics?

2023-12-11 Thread Nicholas Chammas
> On Dec 11, 2023, at 6:40 AM, Mich Talebzadeh > wrote: > spark.sql.cbo.strategy: Set to AUTO to use the CBO as the default optimizer, > or NONE to disable it completely. > Hmm, I’ve also never heard of this setting before and can’t seem to find it in the Spark docs or source code.

Re: When and how does Spark use metastore statistics?

2023-12-11 Thread Nicholas Chammas
> On Dec 11, 2023, at 6:40 AM, Mich Talebzadeh > wrote: > > By default, the CBO is enabled in Spark. Note that this is not correct. AQE is enabled

Re: When and how does Spark use metastore statistics?

2023-12-11 Thread Mich Talebzadeh
Some of these, like CBO and RBO, have been around outside of Spark for years, but I concur that they have a place in Spark's docs. Simply put, statistics provide insights into the characteristics of data, such as distribution, skewness, and cardinalities, which help the optimizer make informed

Re: [VOTE] Release Spark 3.3.4 (RC1)

2023-12-10 Thread L. C. Hsieh
+1 On Sun, Dec 10, 2023 at 6:15 PM Kent Yao wrote: > > +1(non-binding) > > Kent Yao > > On Mon, Dec 11, 2023 at 09:33, Yuming Wang wrote: > > > > +1 > > > > On Mon, Dec 11, 2023 at 5:55 AM Dongjoon Hyun wrote: > >> > >> +1 > >> > >> Dongjoon > >> > >> On 2023/12/08 21:41:00 Dongjoon Hyun wrote: > >> > Please

Re: Algolia search on website is broken

2023-12-10 Thread Gengliang Wang
Hi Nick, Thank you for reporting the issue with our web crawler. I've found that the issue was due to a change(specifically, pull request #40269 ) in the website's HTML structure, where the JavaScript selector ".container-wrapper" is now ".container".

Re: When and how does Spark use metastore statistics?

2023-12-10 Thread Nicholas Chammas
I’ve done some reading and have a slightly better understanding of statistics now. Every implementation of LeafNode.computeStats

Disabling distributing local conf file during spark-submit

2023-12-10 Thread Eugene Miretsky
Hello, It looks like local conf archives always get copied to the target (HDFS) every time a job is submitted 1. Other

Re: [VOTE] Release Spark 3.3.4 (RC1)

2023-12-10 Thread Kent Yao
+1(non-binding) Kent Yao On Mon, Dec 11, 2023 at 09:33, Yuming Wang wrote: > > +1 > > On Mon, Dec 11, 2023 at 5:55 AM Dongjoon Hyun wrote: >> >> +1 >> >> Dongjoon >> >> On 2023/12/08 21:41:00 Dongjoon Hyun wrote: >> > Please vote on releasing the following candidate as Apache Spark version >> > 3.3.4. >> >

Re: [VOTE] Release Spark 3.3.4 (RC1)

2023-12-10 Thread Yuming Wang
+1 On Mon, Dec 11, 2023 at 5:55 AM Dongjoon Hyun wrote: > +1 > > Dongjoon > > On 2023/12/08 21:41:00 Dongjoon Hyun wrote: > > Please vote on releasing the following candidate as Apache Spark version > > 3.3.4. > > > > The vote is open until December 15th 1AM (PST) and passes if a majority > +1

Re: [VOTE] Release Spark 3.3.4 (RC1)

2023-12-10 Thread Dongjoon Hyun
+1 Dongjoon On 2023/12/08 21:41:00 Dongjoon Hyun wrote: > Please vote on releasing the following candidate as Apache Spark version > 3.3.4. > > The vote is open until December 15th 1AM (PST) and passes if a majority +1 > PMC votes are cast, with a minimum of 3 +1 votes. > > [ ] +1 Release this

Re: Spark on Yarn with Java 17

2023-12-10 Thread Jason Xu
Dongjoon and Luca, it's great to learn that there is a way to run different JVM versions for Spark and Hadoop binaries. I had concerns about Java compatibility issues without this solution. Thank you! Luca, thank you for providing a how-to guide for this. It's really helpful! On Sat, Dec 9, 2023

Re: Algolia search on website is broken

2023-12-10 Thread Nicholas Chammas
Pinging Gengliang and Xiao about this, per these docs . It looks like to fix this problem you need access to the Algolia Crawler Admin Console.

unsubscribe

2023-12-10 Thread bruce COTTMAN
- To unsubscribe e-mail: dev-unsubscr...@spark.apache.org

unsubscribe

2023-12-10 Thread Stevens, Clay
Clay

unsubscribe

2023-12-10 Thread Rajanikant V

unsubscribe

2023-12-09 Thread Ravi Chinoy
-- Regards Ravi Chinoy Phone: (415) 230 9971

RE: Spark on Yarn with Java 17

2023-12-09 Thread Luca Canali
Jason, In case you need a pointer on how to run Spark with a version of Java different than the version used by the Hadoop processes, as indicated by Dongjoon, this is an example of what we do on our Hadoop clusters:

Re: Spark on Yarn with Java 17

2023-12-09 Thread Dongjoon Hyun
Please try Apache Spark 3.3+ (SPARK-33772) with Java 17 on your cluster simply, Jason. I believe you can set up for your Spark 3.3+ jobs to run with Java 17 while your cluster(DataNode/NameNode/ResourceManager/NodeManager) is still sitting on Java 8. Dongjoon. On Fri, Dec 8, 2023 at 11:12 PM

Re: Spark on Yarn with Java 17

2023-12-08 Thread Jason Xu
Dongjoon, thank you for the fast response! Apache Spark 4.0.0 depends on only Apache Hadoop client library. To better understand your answer, does that mean a Spark application built with Java 17 can successfully run on a Hadoop cluster on version 3.3 and Java 8 runtime? On Fri, Dec 8, 2023 at

Re: Spark on Yarn with Java 17

2023-12-08 Thread Dongjoon Hyun
Hi, Jason. Apache Spark 4.0.0 depends on only Apache Hadoop client library. You can track all `Apache Spark 4` activities including Hadoop dependency here. https://issues.apache.org/jira/browse/SPARK-44111 (Prepare Apache Spark 4.0.0) According to the release history, the original suggested

Spark on Yarn with Java 17

2023-12-08 Thread Jason Xu
Hi Spark devs, According to the Spark 3.5 release notes, Spark 4 will no longer support Java 8 and 11 (link ). My company is using Spark on Yarn with Java 8 now. When considering a future upgrade to Spark 4, one issue

[VOTE] Release Spark 3.3.4 (RC1)

2023-12-08 Thread Dongjoon Hyun
Please vote on releasing the following candidate as Apache Spark version 3.3.4. The vote is open until December 15th 1AM (PST) and passes if a majority +1 PMC votes are cast, with a minimum of 3 +1 votes. [ ] +1 Release this package as Apache Spark 3.3.4 [ ] -1 Do not release this package

Re: Apache Spark 3.3.4 EOL Release?

2023-12-08 Thread Dongjoon Hyun
Thank you, Mridul, and Kent, too. Additionally, thank you for volunteering as a release manager, Jungtaek, For the 3.3.4 EOL release, I've already been testing and preparing for one week since my first email. So, why don't you proceed with the Apache Spark 3.5.1 release? It has 142 patches

Re: Apache Spark 3.3.4 EOL Release?

2023-12-07 Thread Jungtaek Lim
+1 to release 3.3.4 and consider 3.3 as EOL. Btw, it'd be probably ideal if we could encourage taking an opportunity of experiencing the release process to people who hadn't had a time to go through (when there are people who are happy to take it). If you don't mind and we are not very strict on

Re: SSH Tunneling issue with Apache Spark

2023-12-06 Thread Nicholas Chammas
This is not a question for the dev list. Moving dev to bcc. One thing I would try is to connect to this database using JDBC + SSH tunnel, but without Spark. That way you can focus on getting the JDBC connection to work without Spark complicating the picture for you. > On Dec 5, 2023, at 8:12 

SSH Tunneling issue with Apache Spark

2023-12-05 Thread Venkatesan Muniappan
Hi Team, I am facing an issue with SSH Tunneling in Apache Spark. The behavior is same as the one in this Stackoverflow question but there are no answers there. This is what I am trying:

When and how does Spark use metastore statistics?

2023-12-05 Thread Nicholas Chammas
I’m interested in improving some of the documentation relating to the table and column statistics that get stored in the metastore, and how Spark uses them. But I’m not clear on a few things, so I’m writing to you with some questions. 1. The documentation for 

Algolia search on website is broken

2023-12-05 Thread Nicholas Chammas
Should I report this instead on Jira? Apologies if the dev list is not the right place. Search on the website appears to be broken. For example, here is a search for “analyze”:  And here is the same search using DDG

unsubscribe

2023-12-05 Thread Kalpana Jalawadi

Re: Apache Spark 3.3.4 EOL Release?

2023-12-04 Thread Kent Yao
+1 Thank you for driving this EOL release, Dongjoon! Kent Yao On 2023/12/04 19:40:10 Mridul Muralidharan wrote: > +1 > > Regards, > Mridul > > On Mon, Dec 4, 2023 at 11:40 AM L. C. Hsieh wrote: > > > +1 > > > > Thanks Dongjoon! > > > > On Mon, Dec 4, 2023 at 9:26 AM Yang Jie wrote: > > > >

Re: Should Spark 4.x use Java modules (those you define with module-info.java sources)?

2023-12-04 Thread Sean Owen
It already does. I think that's not the same idea? On Mon, Dec 4, 2023, 8:12 PM Almog Tavor wrote: > I think Spark should start shading it’s problematic deps similar to how > it’s done in Flink > > On Mon, 4 Dec 2023 at 2:57 Sean Owen wrote: > >> I am not sure we can control that - the Scala

Re: Should Spark 4.x use Java modules (those you define with module-info.java sources)?

2023-12-04 Thread Almog Tavor
I think Spark should start shading its problematic deps similar to how it’s done in Flink On Mon, 4 Dec 2023 at 2:57 Sean Owen wrote: > I am not sure we can control that - the Scala _x.y suffix has particular > meaning in the Scala ecosystem for artifacts and thus the naming of .jar > files.

Re: Apache Spark 3.3.4 EOL Release?

2023-12-04 Thread Mridul Muralidharan
+1 Regards, Mridul On Mon, Dec 4, 2023 at 11:40 AM L. C. Hsieh wrote: > +1 > > Thanks Dongjoon! > > On Mon, Dec 4, 2023 at 9:26 AM Yang Jie wrote: > > > > +1 for a 3.3.4 EOL Release. Thanks Dongjoon. > > > > Jie Yang > > > > On 2023/12/04 15:08:25 Tom Graves wrote: > > > +1 for a 3.3.4 EOL

Re: Apache Spark 3.3.4 EOL Release?

2023-12-04 Thread Dongjoon Hyun
Thank you all. Dongjoon. On Mon, Dec 4, 2023 at 9:40 AM L. C. Hsieh wrote: > +1 > > Thanks Dongjoon! > > On Mon, Dec 4, 2023 at 9:26 AM Yang Jie wrote: > > > > +1 for a 3.3.4 EOL Release. Thanks Dongjoon. > > > > Jie Yang > > > > On 2023/12/04 15:08:25 Tom Graves wrote: > > > +1 for a 3.3.4

Re: Apache Spark 3.3.4 EOL Release?

2023-12-04 Thread L. C. Hsieh
+1 Thanks Dongjoon! On Mon, Dec 4, 2023 at 9:26 AM Yang Jie wrote: > > +1 for a 3.3.4 EOL Release. Thanks Dongjoon. > > Jie Yang > > On 2023/12/04 15:08:25 Tom Graves wrote: > > +1 for a 3.3.4 EOL Release. Thanks Dongjoon. > > Tom > > On Friday, December 1, 2023 at 02:48:22 PM CST,

Re: Apache Spark 3.3.4 EOL Release?

2023-12-04 Thread Yang Jie
+1 for a 3.3.4 EOL Release. Thanks Dongjoon. Jie Yang On 2023/12/04 15:08:25 Tom Graves wrote: > +1 for a 3.3.4 EOL Release. Thanks Dongjoon. > Tom > On Friday, December 1, 2023 at 02:48:22 PM CST, Dongjoon Hyun > wrote: > > Hi, All. > > Since the Apache Spark 3.3.0 RC6 vote passed

Re: Apache Spark 3.3.4 EOL Release?

2023-12-04 Thread Tom Graves
+1 for a 3.3.4 EOL Release. Thanks Dongjoon. Tom On Friday, December 1, 2023 at 02:48:22 PM CST, Dongjoon Hyun wrote: Hi, All. Since the Apache Spark 3.3.0 RC6 vote passed on Jun 14, 2022, branch-3.3 has been maintained and served well until now. -

Re: [DISCUSS] SPIP: ShuffleManager short name registration via SparkPlugin

2023-12-04 Thread Alessandro Bellina
Hello devs, We are going to be tabling the SPIP proposal given that we don't see responses in the discussion thread. We still believe that making custom ShuffleManagers easier to configure is worthwhile, given interactions with our users, but we can revisit this later. If anyone in the list has

unsubscribe

2023-12-04 Thread Duy Pham

`orc-format` 1.0 (ORC-1531) for Apache ORC 2.0

2023-12-03 Thread Dongjoon Hyun
Hi, All. As one of the key parts of Apache ORC 2.0, we've been discussing a new repository and module, `orc-format`, in the following. https://github.com/apache/orc/issues/1543 Now, we are ready to create a new repo. Please take a look at the POC repo and code and let us know your thoughts.

Re: Should Spark 4.x use Java modules (those you define with module-info.java sources)?

2023-12-03 Thread Sean Owen
I am not sure we can control that - the Scala _x.y suffix has particular meaning in the Scala ecosystem for artifacts and thus the naming of .jar files. And we need to work with the Scala ecosystem. What can't handle these files, Spring Boot? does it somehow assume the .jar file name relates to

Should Spark 4.x use Java modules (those you define with module-info.java sources)?

2023-12-03 Thread Marc Le Bihan
Hello, Last month I attempted to upgrade my Spring-Boot 2 Java project, which relies heavily on Spark 3.4.2, to Spring-Boot 3. It hasn't succeeded yet, but it was informative. Spring-Boot 2 → 3 mainly means javax.* becoming jakarta.*: javax.activation,

unsubscribe

2023-12-03 Thread Kalpana Jalawadi

Unsubscribe

2023-12-03 Thread Kalpana Jalawadi

Re: [FYI] SPARK-45981: Improve Python language test coverage

2023-12-02 Thread Hyukjin Kwon
Awesome! On Sat, Dec 2, 2023 at 2:33 PM Dongjoon Hyun wrote: > Hi, All. > > As a part of Apache Spark 4.0.0 (SPARK-44111), the Apache Spark community > starts to have test coverage for all supported Python versions from Today. > > - https://github.com/apache/spark/actions/runs/7061665420 > >

Apache Spark 3.3.4 EOL Release?

2023-12-01 Thread Dongjoon Hyun
Hi, All. Since the Apache Spark 3.3.0 RC6 vote passed on Jun 14, 2022, branch-3.3 has been maintained and served well until now. - https://github.com/apache/spark/releases/tag/v3.3.0 (tagged on Jun 9th, 2022) - https://lists.apache.org/thread/zg6k1spw6k1c7brgo6t7qldvsqbmfytm (vote result on June

[FYI] SPARK-45981: Improve Python language test coverage

2023-12-01 Thread Dongjoon Hyun
Hi, All. As a part of Apache Spark 4.0.0 (SPARK-44111), the Apache Spark community starts to have test coverage for all supported Python versions from Today. - https://github.com/apache/spark/actions/runs/7061665420 Here is a summary. 1. Main CI: All PRs and commits on `master` branch are

10x to 100x faster df.groupby().applyInPandas()

2023-12-01 Thread Enrico Minack
Hi devs, I am looking for some PySpark dev that is interested in some 10x to 100x speed up of df.groupby().applyInPandas() for small groups. A PoC and benchmark can be found at https://github.com/apache/spark/pull/37360#issuecomment-1228293766. I suppose, the same approach could be taken

Re:[ANNOUNCE] Apache Spark 3.4.2 released

2023-11-30 Thread beliefer
Congratulations! At 2023-12-01 01:23:55, "Dongjoon Hyun" wrote: We are happy to announce the availability of Apache Spark 3.4.2! Spark 3.4.2 is a maintenance release containing many fixes including security and correctness domains. This release is based on the branch-3.4 maintenance

Unsubscribe

2023-11-30 Thread Devarshi Vyas

unsubscribe

2023-11-30 Thread Sandeep Vinayak
- To unsubscribe e-mail: dev-unsubscr...@spark.apache.org

Re: [DISCUSSION] SPIP: An Official Kubernetes Operator for Apache Spark

2023-11-30 Thread Kumar K
+1 On Fri, Nov 10, 2023 at 8:51 PM Khalid Mammadov wrote: > +1 > > On Fri, 10 Nov 2023, 15:23 Peter Toth, wrote: > >> +1 >> >> On Fri, Nov 10, 2023, 14:09 Bjørn Jørgensen >> wrote: >> >>> +1 >>> >>> On Fri, Nov 10, 2023 at 08:39, Nan Zhu wrote: >>> just curious what happened on google’s

[ANNOUNCE] Apache Spark 3.4.2 released

2023-11-30 Thread Dongjoon Hyun
We are happy to announce the availability of Apache Spark 3.4.2! Spark 3.4.2 is a maintenance release containing many fixes including security and correctness domains. This release is based on the branch-3.4 maintenance branch of Spark. We strongly recommend all 3.4 users to upgrade to this

[VOTE][RESULT] Release Spark 3.4.2 (RC1)

2023-11-30 Thread Dongjoon Hyun
The vote passes with 6 +1s (3 binding +1s) and one non-binding -1. Thanks to all who helped with the release! (* = binding) +1: - Dongjoon Hyun * - Kent Yao - Yang Jie - Mridul Muralidharan * - Liang-Chi Hsieh * - Jia Fan +0: None -1: - Marc Le Bihan

Re: [VOTE] Release Spark 3.4.2 (RC1)

2023-11-30 Thread Dongjoon Hyun
Thank you all. This vote passed. I will conclude this vote. Dongjoon. On 2023/11/30 09:53:17 Jia Fan wrote: > +1 > > On Thu, Nov 30, 2023 at 12:33, L. C. Hsieh wrote: > > > +1 > > > > Thanks Dongjoon! > > > > On Wed, Nov 29, 2023 at 7:53 PM Mridul Muralidharan > > wrote: > > > > > > +1 > > > > > >

Re: [VOTE] Release Spark 3.4.2 (RC1)

2023-11-30 Thread Jia Fan
+1 On Thu, Nov 30, 2023 at 12:33, L. C. Hsieh wrote: > +1 > > Thanks Dongjoon! > > On Wed, Nov 29, 2023 at 7:53 PM Mridul Muralidharan > wrote: > > > > +1 > > > > Signatures, digests, etc check out fine. > > Checked out tag and build/tested with -Phive -Pyarn -Pmesos -Pkubernetes > > > > Regards, > > Mridul

Re: [VOTE] Release Spark 3.4.2 (RC1)

2023-11-29 Thread L. C. Hsieh
+1 Thanks Dongjoon! On Wed, Nov 29, 2023 at 7:53 PM Mridul Muralidharan wrote: > > +1 > > Signatures, digests, etc check out fine. > Checked out tag and build/tested with -Phive -Pyarn -Pmesos -Pkubernetes > > Regards, > Mridul > > On Wed, Nov 29, 2023 at 5:08 AM Yang Jie wrote: >> >>

Re: [DISCUSS] SPIP: Structured Streaming - Arbitrary State API v2

2023-11-29 Thread Anish Shrigondekar
Hi dev, Addressed the comments that Jungtaek had on the doc. Bumping the thread once again to see if other folks have any feedback on the proposal. Thanks, Anish On Mon, Nov 27, 2023 at 8:15 PM Jungtaek Lim wrote: > Kindly bump for better reach after the long holiday. Please kindly review >

Re: [DISCUSSION] SPIP: An Official Kubernetes Operator for Apache Spark

2023-11-29 Thread Shiqi Sun
Hi Zhou, Thanks for the reply. For the language choice, since I don't think I've used many k8s components written in Java on k8s, I can't really tell, but at least for the components written in Golang, they are well-organized, easy to read/maintain and run well in general. In addition, goroutines

Re: [VOTE] Release Spark 3.4.2 (RC1)

2023-11-29 Thread Mridul Muralidharan
+1 Signatures, digests, etc check out fine. Checked out tag and build/tested with -Phive -Pyarn -Pmesos -Pkubernetes Regards, Mridul On Wed, Nov 29, 2023 at 5:08 AM Yang Jie wrote: > +1(non-binding) > > Jie Yang > > On 2023/11/29 02:08:04 Kent Yao wrote: > > +1(non-binding) > > > > Kent Yao >

[sql] how to connect query stage to Spark job/stages?

2023-11-29 Thread Chenghao Lyu
Hi, I am seeking advice on measuring the performance of each QueryStage (QS) when AQE is enabled in Spark SQL. Specifically, I need help to automatically map a QS to its corresponding jobs (or stages) to get the QS runtime metrics. I recorded the QS structure via a customized injected Query

Re: Remove HiveContext from Apache Spark 4.0

2023-11-29 Thread Yang Jie
Thank you very much for the feedback from Dongjoon and Xiao Li. After carefully reading https://lists.apache.org/thread/mrx0y078cf3ozs7czykvv864y6dr55xq, I have decided to abandon the deletion of HiveContext. As Xiao Li said, its maintenance cost is not high, but it will increase the cost of

Re: Remove HiveContext from Apache Spark 4.0

2023-11-29 Thread Xiao Li
Thank you for raising it in the dev list. I do not think we should remove HiveContext based on the cost of break and maintenance. FYI, when releasing Spark 3.0, we had a lot of discussions about the related topics https://lists.apache.org/thread/mrx0y078cf3ozs7czykvv864y6dr55xq Dongjoon Hyun

Re: Remove HiveContext from Apache Spark 4.0

2023-11-29 Thread Dongjoon Hyun
Thank you for the heads-up. I agree with your intention and the fact that it's not useful in Apache Spark 4.0.0. However, as you know, historically, it was removed once and explicitly added back to the Apache Spark 3.0 via the vote. SPARK-31088 Add back HiveContext and createExternalTable (As a

Re: [VOTE] Release Spark 3.4.2 (RC1)

2023-11-29 Thread Yang Jie
+1(non-binding) Jie Yang On 2023/11/29 02:08:04 Kent Yao wrote: > +1(non-binding) > > Kent Yao > > On 2023/11/27 01:12:53 Dongjoon Hyun wrote: > > Hi, Marc. > > > > Given that it exists in 3.4.0 and 3.4.1, I don't think it's a release > > blocker for Apache Spark 3.4.2. > > > > When the

Remove HiveContext from Apache Spark 4.0

2023-11-29 Thread 杨杰
Hi all, In SPARK-46171 (apache/spark#44077 [1]), I’m trying to remove the deprecated HiveContext from Apache Spark 4.0 since HiveContext has been marked as deprecated after Spark 2.0. This is a long-deprecated API, it should be replaced with SparkSession with enableHiveSupport now, so I think

Re: [VOTE] Release Spark 3.4.2 (RC1)

2023-11-28 Thread Kent Yao
+1(non-binding) Kent Yao On 2023/11/27 01:12:53 Dongjoon Hyun wrote: > Hi, Marc. > > Given that it exists in 3.4.0 and 3.4.1, I don't think it's a release > blocker for Apache Spark 3.4.2. > > When the patch is ready, we can consider it for 3.4.3. > > In addition, note that we categorized

[RESULT][VOTE] SPIP: Testing Framework for Spark UI Javascript files

2023-11-28 Thread Kent Yao
Hi Spark dev, The vote[1] has now closed. The results are: +1 Votes(*=binding): - Mridul Muralidharan* - Ye Zhou - Dongjoon Hyun* - Reynold Xin* - Yang Jie - Gengliang Wang* - Ruifeng Zheng* - Binjie Yang - Kent Yao 0 Votes: None -1 Votes: None The vote is successful with 5 binding +1 votes.

Re: [VOTE] SPIP: Testing Framework for Spark UI Javascript files

2023-11-28 Thread Kent Yao
+1(non-binding) I will raise a new thread for the result. Thank you all for the vote. Thanks Kent On 2023/11/28 02:48:33 Binjie Yang wrote: > + 1 > > Thanks, > Binjie Yang > > On 2023/11/27 02:27:22 Ruifeng Zheng wrote: > > +1 > > > > On Sun, Nov 26, 2023 at 6:58 AM Gengliang Wang wrote: >

Re: [DISCUSSION] SPIP: An Official Kubernetes Operator for Apache Spark

2023-11-28 Thread Zhou Jiang
Hi Shiqi, Thanks for the cross-posting here - sorry for the response delay during the holiday break :) We prefer Java for the operator project as it's JVM-based and widely familiar within the Spark community. This choice aims to facilitate better adoption and ease of onboarding for future

Re: [DISCUSS] SPIP: Structured Streaming - Arbitrary State API v2

2023-11-27 Thread Jungtaek Lim
Kindly bump for better reach after the long holiday. Please kindly review the proposal which opens the chance to address complex use cases of streaming. Thanks! On Thu, Nov 23, 2023 at 8:19 AM Jungtaek Lim wrote: > Thanks Anish for proposing SPIP and initiating this thread! I believe this >

Re: [VOTE] SPIP: Testing Framework for Spark UI Javascript files

2023-11-27 Thread Binjie Yang
+1 Thanks, Binjie Yang On 2023/11/27 02:27:22 Ruifeng Zheng wrote: > +1 > > On Sun, Nov 26, 2023 at 6:58 AM Gengliang Wang wrote: > > > +1 > > > > On Sat, Nov 25, 2023 at 2:50 AM yangjie01 > > wrote: > > > >> +1 > >> > >> > >> > >> *From**: *Reynold Xin > >> *Date**: *Saturday, November 25, 2023 14:35 >

Join push down in DSv2

2023-11-27 Thread Stefan Hagedorn
Hi, At the Spark Summit 2017 Ioana Delaney presented an approach for join pushdown in Apache Spark [1]. Is there any intent to actually bring this into Spark, especially in the DSv2 interface? Does anyone know if there's ongoing work or a document about this? [1]

Re: [VOTE] SPIP: Testing Framework for Spark UI Javascript files

2023-11-26 Thread Ruifeng Zheng
+1 On Sun, Nov 26, 2023 at 6:58 AM Gengliang Wang wrote: > +1 > > On Sat, Nov 25, 2023 at 2:50 AM yangjie01 > wrote: > >> +1 >> >> >> >> *From**: *Reynold Xin >> *Date**: *Saturday, November 25, 2023 14:35 >> *To**: *Dongjoon Hyun >> *Cc**: *Ye Zhou , Mridul Muralidharan < >> mri...@gmail.com>, Kent Yao ,

Re: [VOTE] Release Spark 3.4.2 (RC1)

2023-11-26 Thread Dongjoon Hyun
Hi, Marc. Given that it exists in 3.4.0 and 3.4.1, I don't think it's a release blocker for Apache Spark 3.4.2. When the patch is ready, we can consider it for 3.4.3. In addition, note that we categorized release-blocker-level issues by marking 'Blocker' priority with `Target Version` before

Re: [VOTE] SPIP: Testing Framework for Spark UI Javascript files

2023-11-25 Thread Gengliang Wang
+1 On Sat, Nov 25, 2023 at 2:50 AM yangjie01 wrote: > +1 > > > > *From**: *Reynold Xin > *Date**: *Saturday, November 25, 2023 14:35 > *To**: *Dongjoon Hyun > *Cc**: *Ye Zhou , Mridul Muralidharan < > mri...@gmail.com>, Kent Yao , dev > *Subject**: *Re: [VOTE] SPIP: Testing Framework for Spark UI Javascript

Re: [VOTE] Release Spark 3.4.2 (RC1)

2023-11-25 Thread Marc Le Bihan
-1 If you can wait until the last remaining problem with Generics (?) is entirely solved, the one that causes this exception to be thrown: java.lang.ClassCastException: class [Ljava.lang.Object; cannot be cast to class [Ljava.lang.reflect.TypeVariable; ([Ljava.lang.Object; and

Re: [VOTE] Release Spark 3.4.2 (RC1)

2023-11-25 Thread Dongjoon Hyun
+1 Dongjoon. On 2023/11/25 10:48:41 Dongjoon Hyun wrote: > Please vote on releasing the following candidate as Apache Spark version > 3.4.2. > > The vote is open until November 30th 1AM (PST) and passes if a majority +1 > PMC votes are cast, with a minimum of 3 +1 votes. > > [ ] +1 Release

Re: [VOTE] SPIP: Testing Framework for Spark UI Javascript files

2023-11-25 Thread yangjie01
+1 From: Reynold Xin Date: Saturday, November 25, 2023 14:35 To: Dongjoon Hyun Cc: Ye Zhou , Mridul Muralidharan , Kent Yao , dev Subject: Re: [VOTE] SPIP: Testing Framework for Spark UI Javascript files +1 On Fri, Nov 24, 2023 at 10:19 PM, Dongjoon Hyun mailto:dongjoon.h...@gmail.com>> wrote: +1 Thanks,

[VOTE] Release Spark 3.4.2 (RC1)

2023-11-25 Thread Dongjoon Hyun
Please vote on releasing the following candidate as Apache Spark version 3.4.2. The vote is open until November 30th 1AM (PST) and passes if a majority +1 PMC votes are cast, with a minimum of 3 +1 votes. [ ] +1 Release this package as Apache Spark 3.4.2 [ ] -1 Do not release this package

Re: [VOTE] SPIP: Testing Framework for Spark UI Javascript files

2023-11-24 Thread Reynold Xin
+1 On Fri, Nov 24, 2023 at 10:19 PM, Dongjoon Hyun < dongjoon.h...@gmail.com > wrote: > > +1 > > > Thanks, > Dongjoon. > > On Fri, Nov 24, 2023 at 7:14 PM Ye Zhou < zhouyejoe@ gmail. com ( > zhouye...@gmail.com ) > wrote: > > >> +1(non-binding) >> >> On Fri, Nov 24, 2023 at 11:16 Mridul

Re: [VOTE] SPIP: Testing Framework for Spark UI Javascript files

2023-11-24 Thread Dongjoon Hyun
+1 Thanks, Dongjoon. On Fri, Nov 24, 2023 at 7:14 PM Ye Zhou wrote: > +1(non-binding) > > On Fri, Nov 24, 2023 at 11:16 Mridul Muralidharan > wrote: > >> >> +1 >> >> Regards, >> Mridul >> >> On Fri, Nov 24, 2023 at 8:21 AM Kent Yao wrote: >> >>> Hi Spark Dev, >>> >>> Following the discussion

Re: [VOTE] SPIP: Testing Framework for Spark UI Javascript files

2023-11-24 Thread Ye Zhou
+1(non-binding) On Fri, Nov 24, 2023 at 11:16 Mridul Muralidharan wrote: > > +1 > > Regards, > Mridul > > On Fri, Nov 24, 2023 at 8:21 AM Kent Yao wrote: > >> Hi Spark Dev, >> >> Following the discussion [1], I'd like to start the vote for the SPIP [2]. >> >> The SPIP aims to improve the test

Re: [VOTE] SPIP: Testing Framework for Spark UI Javascript files

2023-11-24 Thread Mridul Muralidharan
+1 Regards, Mridul On Fri, Nov 24, 2023 at 8:21 AM Kent Yao wrote: > Hi Spark Dev, > > Following the discussion [1], I'd like to start the vote for the SPIP [2]. > > The SPIP aims to improve the test coverage and develop experience for > Spark UI-related javascript codes. > > This thread will

[VOTE] SPIP: Testing Framework for Spark UI Javascript files

2023-11-24 Thread Kent Yao
Hi Spark Dev, Following the discussion [1], I'd like to start the vote for the SPIP [2]. The SPIP aims to improve the test coverage and develop experience for Spark UI-related javascript codes. This thread will be open for at least the next 72 hours. Please vote accordingly, [ ] +1: Accept

Re: [DISCUSS] SPIP: Testing Framework for Spark UI Javascript files

2023-11-24 Thread Kent Yao
Thank you all. I will start an official vote for this SPIP. Kent On 2023/11/22 03:05:42 Mridul Muralidharan wrote: > This should be a very good addition ! > > Regards, > Mridul > > On Tue, Nov 21, 2023 at 7:46 PM Dongjoon Hyun > wrote: > > > Thank you for proposing a new UI test framework

Re: [DISCUSSION] SPIP: An Official Kubernetes Operator for Apache Spark

2023-11-22 Thread Shiqi Sun
Hi all, Sorry for being late to the party. I went through the SPIP doc and I think this is a great proposal! I left a comment in the SPIP doc a couple days ago, but I don't see much activity there and no one replied, so I wanted to cross-post it here to get some feedback. I'm Shiqi Sun, and I

Re: [DISCUSS] SPIP: Structured Streaming - Arbitrary State API v2

2023-11-22 Thread Jungtaek Lim
Thanks Anish for proposing SPIP and initiating this thread! I believe this SPIP will help a bunch of complex use cases on streaming. dev@: We are coincidentally initiating this discussion in thanksgiving holidays. We understand people in the US may not have time to review the SPIP, and we plan to

[DISCUSS] SPIP: Structured Streaming - Arbitrary State API v2

2023-11-22 Thread Anish Shrigondekar
Hi dev, I would like to start a discussion on "Structured Streaming - Arbitrary State API v2". This proposal aims to address a bunch of limitations we see today using mapGroupsWithState/flatMapGroupsWithState operator. The detailed set of limitations is described in the SPIP doc. We propose to

Re: [DISCUSS] SPIP: Testing Framework for Spark UI Javascript files

2023-11-21 Thread Mridul Muralidharan
This should be a very good addition ! Regards, Mridul On Tue, Nov 21, 2023 at 7:46 PM Dongjoon Hyun wrote: > Thank you for proposing a new UI test framework for Apache Spark 4.0. > > It looks very useful. > > Thanks, > Dongjoon. > > > On Tue, Nov 21, 2023 at 1:51 AM Kent Yao wrote: > >> Hi

Re: [DISCUSS] SPIP: Testing Framework for Spark UI Javascript files

2023-11-21 Thread Wenchen Fan
+1, very useful! On Wed, Nov 22, 2023 at 10:29 AM Dongjoon Hyun wrote: > Thank you for proposing a new UI test framework for Apache Spark 4.0. > > It looks very useful. > > Thanks, > Dongjoon. > > > On Tue, Nov 21, 2023 at 1:51 AM Kent Yao wrote: > >> Hi Spark Dev, >> >> This is a call to

Help for testing Windows specific fix (SPARK-23015)

2023-11-21 Thread Hyukjin Kwon
Hi all, I used to have my Windows environment in another laptop but that laptop is broken now so I don't have Windows env to test Windows PRs out (e.g., https://github.com/apache/spark/pull/43706). If anyone has a Windows env, would appreciate it if you take a look at this. Thanks.

Re: [DISCUSS] SPIP: Testing Framework for Spark UI Javascript files

2023-11-21 Thread Dongjoon Hyun
Thank you for proposing a new UI test framework for Apache Spark 4.0. It looks very useful. Thanks, Dongjoon. On Tue, Nov 21, 2023 at 1:51 AM Kent Yao wrote: > Hi Spark Dev, > > This is a call to discuss a new SPIP: Testing Framework for > Spark UI Javascript files [1]. The SPIP aims to

[DISCUSS] SPIP: Testing Framework for Spark UI Javascript files

2023-11-21 Thread Kent Yao
Hi Spark Dev, This is a call to discuss a new SPIP: Testing Framework for Spark UI Javascript files [1]. The SPIP aims to improve the test coverage and develop experience for Spark UI-related javascript codes. The Jest [2], a JavaScript Testing Framework licensed under MIT, will be used to build

[VOTE][RESULT] SPIP: An Official Kubernetes Operator for Apache Spark

2023-11-17 Thread L. C. Hsieh
Hi all, The vote passes with 19 +1s (11 binding +1s). Thanks to all who reviewed the SPIP doc and voted! (* = binding) +1: - Ye Zhou - L. C. Hsieh (*) - Chao Sun (*) - Vakaris Baškirov - DB Tsai (*) - Holden Karau (*) - Lucian Neghina - Mridul Muralidharan (*) - Huaxin Gao (*) - Cheng Pan -

Re: [VOTE] SPIP: An Official Kubernetes Operator for Apache Spark

2023-11-16 Thread Gabor Somogyi
+1 (non-binding) I think it's good from directional perspective. Apache Flink is already using this approach for quite some time in production. The overall conclusion is that it's a big gain :) G On Tue, Nov 14, 2023 at 6:42 PM L. C. Hsieh wrote: > Hi all, > > I’d like to start a vote for

Re: [VOTE] SPIP: An Official Kubernetes Operator for Apache Spark

2023-11-15 Thread Jungtaek Lim
+1 (non-binding) On Thu, Nov 16, 2023 at 4:23 AM Ilan Filonenko wrote: > +1 (non-binding) > > On Wed, Nov 15, 2023 at 12:57 PM Xiao Li wrote: > >> +1 >> >> On Wed, Nov 15, 2023 at 05:55, bo yang wrote: >> >>> +1 >>> >>> On Tue, Nov 14, 2023 at 7:18 PM huaxin gao >>> wrote: >>> +1 On Tue,

Re: [VOTE] SPIP: An Official Kubernetes Operator for Apache Spark

2023-11-15 Thread Ruifeng Zheng
+1 On Thu, Nov 16, 2023 at 8:34 AM Ilan Filonenko wrote: > +1 (non-binding) > > On Wed, Nov 15, 2023 at 12:57 PM Xiao Li wrote: > >> +1 >> >> On Wed, Nov 15, 2023 at 05:55, bo yang wrote: >> >>> +1 >>> >>> On Tue, Nov 14, 2023 at 7:18 PM huaxin gao >>> wrote: >>> +1 On Tue, Nov 14, 2023 at

Re: [VOTE] SPIP: An Official Kubernetes Operator for Apache Spark

2023-11-15 Thread Ilan Filonenko
+1 (non-binding) On Wed, Nov 15, 2023 at 12:57 PM Xiao Li wrote: > +1 > > On Wed, Nov 15, 2023 at 05:55, bo yang wrote: > >> +1 >> >> On Tue, Nov 14, 2023 at 7:18 PM huaxin gao >> wrote: >> >>> +1 >>> >>> On Tue, Nov 14, 2023 at 10:45 AM Holden Karau >>> wrote: >>> +1 On Tue, Nov 14, 2023

Re: [VOTE] SPIP: An Official Kubernetes Operator for Apache Spark

2023-11-15 Thread Xiao Li
+1 On Wed, Nov 15, 2023 at 05:55, bo yang wrote: > +1 > > On Tue, Nov 14, 2023 at 7:18 PM huaxin gao wrote: > >> +1 >> >> On Tue, Nov 14, 2023 at 10:45 AM Holden Karau >> wrote: >> >>> +1 >>> >>> On Tue, Nov 14, 2023 at 10:21 AM DB Tsai wrote: >>> +1 DB Tsai | https://www.dbtsai.com/ |
