Re: [VOTE] Release Spark 3.4.2 (RC1)

2023-11-28 Thread Kent Yao
+1(non-binding)

Kent Yao

On 2023/11/27 01:12:53 Dongjoon Hyun wrote:
> Hi, Marc.
> 
> Given that it exists in 3.4.0 and 3.4.1, I don't think it's a release
> blocker for Apache Spark 3.4.2.
> 
> When the patch is ready, we can consider it for 3.4.3.
> 
> In addition, note that we categorize release-blocker-level issues by
> marking them with 'Blocker' priority and a `Target Version` before the vote.
> 
> Best,
> Dongjoon.
> 
> 
> On Sat, Nov 25, 2023 at 12:01 PM Marc Le Bihan  wrote:
> 
> > -1, if you can wait until the last remaining problem with generics (?)
> > is entirely solved - the one that causes this exception to be thrown:
> >
> > java.lang.ClassCastException: class [Ljava.lang.Object; cannot be cast to
> > class [Ljava.lang.reflect.TypeVariable; ([Ljava.lang.Object; and
> > [Ljava.lang.reflect.TypeVariable; are in module java.base of loader
> > 'bootstrap')
> >   at org.apache.spark.sql.catalyst.JavaTypeInference$.encoderFor(JavaTypeInference.scala:116)
> >   at org.apache.spark.sql.catalyst.JavaTypeInference$.$anonfun$encoderFor$1(JavaTypeInference.scala:140)
> >   at scala.collection.ArrayOps$.map$extension(ArrayOps.scala:929)
> >   at org.apache.spark.sql.catalyst.JavaTypeInference$.encoderFor(JavaTypeInference.scala:138)
> >   at org.apache.spark.sql.catalyst.JavaTypeInference$.encoderFor(JavaTypeInference.scala:60)
> >   at org.apache.spark.sql.catalyst.JavaTypeInference$.encoderFor(JavaTypeInference.scala:53)
> >   at org.apache.spark.sql.catalyst.encoders.ExpressionEncoder$.javaBean(ExpressionEncoder.scala:62)
> >   at org.apache.spark.sql.Encoders$.bean(Encoders.scala:179)
> >   at org.apache.spark.sql.Encoders.bean(Encoders.scala)
> >
> >
> > https://issues.apache.org/jira/browse/SPARK-45311
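> >
> > For reference, a minimal sketch (hypothetical class names, not the exact
> > reproducer from the ticket) of the kind of generic bean hierarchy that
> > exercises this JavaTypeInference path:
> >
> >     import scala.beans.BeanProperty
> >     import org.apache.spark.sql.{Encoder, Encoders}
> >
> >     // Hypothetical generic base class: the field type is a type variable.
> >     class GenericBase[T] {
> >       @BeanProperty var value: T = _
> >     }
> >
> >     // Concrete bean that binds the type variable.
> >     class ConcreteBean extends GenericBase[String]
> >
> >     object BeanEncoderSketch {
> >       def main(args: Array[String]): Unit = {
> >         // Deriving a bean encoder walks the class hierarchy through
> >         // JavaTypeInference.encoderFor; on affected builds this kind of
> >         // generic hierarchy can hit the TypeVariable cast shown above.
> >         val enc: Encoder[ConcreteBean] = Encoders.bean(classOf[ConcreteBean])
> >         println(enc)
> >       }
> >     }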
> >
> > Thanks !
> >
> > Marc Le Bihan
> >
> >
> > On 25/11/2023 11:48, Dongjoon Hyun wrote:
> >
> > Please vote on releasing the following candidate as Apache Spark version
> > 3.4.2.
> >
> > The vote is open until November 30th, 1 AM (PST), and passes if a
> > majority of +1 PMC votes are cast, with a minimum of 3 +1 votes.
> >
> > [ ] +1 Release this package as Apache Spark 3.4.2
> > [ ] -1 Do not release this package because ...
> >
> > To learn more about Apache Spark, please see https://spark.apache.org/
> >
> > The tag to be voted on is v3.4.2-rc1 (commit
> > 0c0e7d4087c64efca259b4fb656b8be643be5686)
> > https://github.com/apache/spark/tree/v3.4.2-rc1
> >
> > The release files, including signatures, digests, etc. can be found at:
> > https://dist.apache.org/repos/dist/dev/spark/v3.4.2-rc1-bin/
> >
> > Signatures used for Spark RCs can be found in this file:
> > https://dist.apache.org/repos/dist/dev/spark/KEYS
> >
> > The staging repository for this release can be found at:
> > https://repository.apache.org/content/repositories/orgapachespark-1450/
> >
> > The documentation corresponding to this release can be found at:
> > https://dist.apache.org/repos/dist/dev/spark/v3.4.2-rc1-docs/
> >
> > The list of bug fixes going into 3.4.2 can be found at the following URL:
> > https://issues.apache.org/jira/projects/SPARK/versions/12353368
> >
> > This release uses the release script from the tag v3.4.2-rc1.
> >
> > FAQ
> >
> > =
> > How can I help test this release?
> > =
> >
> > If you are a Spark user, you can help us test this release by taking
> > an existing Spark workload, running it on this release candidate, and
> > reporting any regressions.
> >
> > If you're working in PySpark, you can set up a virtual env, install
> > the current RC, and see if anything important breaks. In Java/Scala,
> > you can add the staging repository to your project's resolvers and test
> > with the RC (make sure to clean up the artifact cache before/after so
> > you don't end up building with an out-of-date RC going forward).
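> >
> > For example, a minimal build.sbt sketch (the resolver name is arbitrary;
> > the URL is the staging repository listed above):
> >
> >     // Point the build at the RC staging repository.
> >     resolvers += "Apache Spark 3.4.2 RC1 staging" at
> >       "https://repository.apache.org/content/repositories/orgapachespark-1450/"
> >
> >     // RC artifacts are staged under the final version number.
> >     libraryDependencies += "org.apache.spark" %% "spark-sql" % "3.4.2"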
> >
> > ===
> > What should happen to JIRA tickets still targeting 3.4.2?
> > ===
> >
> > The current list of open tickets targeted at 3.4.2 can be found at
> > https://issues.apache.org/jira/projects/SPARK by searching for "Target
> > Version/s" = 3.4.2.
> >
> > Committers should look at those and triage. Extremely important bug
> > fixes, documentation, and API tweaks that impact compatibility should
> > be worked on immediately. Please retarget everything else to an
> > appropriate release.
> >
> > ==
> > But my bug isn't fixed?
> > ==
> >
> > In order to make timely releases, we will typically not hold the
> > release unless the bug in question is a regression from the previous
> > release. That being said, if there is something that is a regression
> > and has not been correctly targeted, please ping me or a committer to
> > help target the issue.
> >
> >
> >
> 

-
To unsubscribe e-mail: dev-unsubscr...@spark.apache.org



[RESULT][VOTE] SPIP: Testing Framework for Spark UI Javascript files

2023-11-28 Thread Kent Yao
Hi Spark dev,

The vote[1] has now closed. The results are:

+1 Votes(*=binding):

- Mridul Muralidharan*
- Ye Zhou
- Dongjoon Hyun*
- Reynold Xin*
- Yang Jie
- Gengliang Wang*
- Ruifeng Zheng*
- Binjie Yang
- Kent Yao

0 Votes: None

-1 Votes: None

The vote is successful with 5 binding +1 votes.

Best Regards,
Kent Yao

[1] https://lists.apache.org/thread/0361btx8gf94cr67bnstbrz6ngbzqj8j

-
To unsubscribe e-mail: dev-unsubscr...@spark.apache.org



Re: [VOTE] SPIP: Testing Framework for Spark UI Javascript files

2023-11-28 Thread Kent Yao
+1(non-binding)

I will start a new thread for the result. Thank you all for voting.

Thanks
Kent

On 2023/11/28 02:48:33 Binjie Yang wrote:
> +1
> 
> Thanks,
> Binjie Yang
> 
> On 2023/11/27 02:27:22 Ruifeng Zheng wrote:
> > +1
> > 
> > On Sun, Nov 26, 2023 at 6:58 AM Gengliang Wang  wrote:
> > 
> > > +1
> > >
> > > On Sat, Nov 25, 2023 at 2:50 AM yangjie01 
> > > wrote:
> > >
> > >> +1
> > >>
> > >>
> > >>
> > >> *From:* Reynold Xin 
> > >> *Date:* Saturday, November 25, 2023, 14:35
> > >> *To:* Dongjoon Hyun 
> > >> *Cc:* Ye Zhou , Mridul Muralidharan <
> > >> mri...@gmail.com>, Kent Yao , dev 
> > >> *Subject:* Re: [VOTE] SPIP: Testing Framework for Spark UI Javascript files
> > >>
> > >>
> > >>
> > >> +1
> > >>
> > >>
> > >>
> > >> On Fri, Nov 24, 2023 at 10:19 PM, Dongjoon Hyun 
> > >> wrote:
> > >>
> > >> +1
> > >>
> > >>
> > >>
> > >> Thanks,
> > >>
> > >> Dongjoon.
> > >>
> > >>
> > >>
> > >> On Fri, Nov 24, 2023 at 7:14 PM Ye Zhou  wrote:
> > >>
> > >> +1(non-binding)
> > >>
> > >>
> > >>
> > >> On Fri, Nov 24, 2023 at 11:16 Mridul Muralidharan 
> > >> wrote:
> > >>
> > >>
> > >>
> > >> +1
> > >>
> > >>
> > >>
> > >> Regards,
> > >>
> > >> Mridul
> > >>
> > >>
> > >>
> > >> On Fri, Nov 24, 2023 at 8:21 AM Kent Yao  wrote:
> > >>
> > >> Hi Spark Dev,
> > >>
> > >> Following the discussion [1], I'd like to start the vote for the SPIP 
> > >> [2].
> > >>
> > >> The SPIP aims to improve the test coverage and the developer
> > >> experience for Spark UI-related JavaScript code.
> > >>
> > >> This thread will be open for at least the next 72 hours. Please vote
> > >> accordingly:
> > >>
> > >> [ ] +1: Accept the proposal as an official SPIP
> > >> [ ] +0
> > >> [ ] -1: I don’t think this is a good idea because …
> > >>
> > >>
> > >> Thank you!
> > >> Kent Yao
> > >>
> > >> [1] https://lists.apache.org/thread/5rqrho4ldgmqlc173y2229pfll5sgkff
> > >> 
> > >> [2]
> > >> https://docs.google.com/document/d/1hWl5Q2CNNOjN5Ubyoa28XmpJtDyD9BtGtiEG2TT94rg/edit?usp=sharing
> > >> 
> > >>
> > >> -
> > >> To unsubscribe e-mail: dev-unsubscr...@spark.apache.org
> > >>
> > >>
> > >>
> > >
> > 
> > -- 
> > Ruifeng Zheng
> > E-mail: zrfli...@gmail.com
> > 
> 
> -
> To unsubscribe e-mail: dev-unsubscr...@spark.apache.org
> 
> 

-
To unsubscribe e-mail: dev-unsubscr...@spark.apache.org



Re: [DISCUSSION] SPIP: An Official Kubernetes Operator for Apache Spark

2023-11-28 Thread Zhou Jiang
Hi Shiqi,

Thanks for cross-posting here - sorry for the delayed response during the
holiday break :)
We prefer Java for the operator project as it's JVM-based and widely
familiar within the Spark community. This choice aims to facilitate better
adoption and easier onboarding for future maintainers. In addition, the
Java Kubernetes API client is a mature, widely used option, adopted by
Spark itself and by other operator implementations like Flink's.
For easier onboarding and potential migration, we'll consider
compatibility with existing CRD designs - the goal is to maintain
compatibility as far as possible while minimizing duplicated effort.
I'm enthusiastic about the idea of a lean, version-agnostic submission
worker. It aligns with one of the primary goals of the operator design.
Let's continue exploring this idea further in the design doc.

Thanks,
Zhou


On Wed, Nov 22, 2023 at 3:35 PM Shiqi Sun  wrote:

> Hi all,
>
> Sorry for being late to the party. I went through the SPIP doc and I think
> this is a great proposal! I left a comment in the SPIP doc a couple of
> days ago, but I didn't see much activity there and no one replied, so I
> wanted to cross-post it here to get some feedback.
>
> I'm Shiqi Sun, and I work on the Big Data Platform at Salesforce. My team
> has been running the Spark on k8s operator (OSS from Google) in my company
> to serve Spark users in production for 4+ years, and we've been actively
> contributing to the Spark on k8s operator OSS and also, occasionally, to
> the Spark OSS. In our experience, Google's Spark Operator has its own
> problems, like its tight coupling with the Spark version and the JVM
> overhead during job submission. On the other hand, it's been a great
> component of our team's service in the company: being written in Golang,
> it's really easy to have it interact with k8s, and its CRD covers a lot of
> different use cases, as it has been built up over time thanks to many
> users' contributions over the years. There were also a handful of Spark
> Summit sessions about Google's Spark Operator that helped it become widely
> adopted.
>
> For this SPIP, I really love the idea of an official k8s operator for the
> Spark project, as well as the separate submission-worker layer and its
> being Spark-version agnostic. I think we can get the best of both:
> 1. I would advocate that the new project still use Golang for the
> implementation, as Golang is the go-to cloud-native language and works
> best with k8s.
> 2. We should make sure the functionality of Google's current Spark
> Operator CRD is preserved in the new official Spark Operator; if we can
> make it compatible, or even merge the two projects into the new official
> operator in the Spark project, that would be best.
> 3. The new Spark Operator should remain Spark-version agnostic and keep
> this lightweight, separate submission-worker layer. We've seen scalability
> issues caused by the heavy JVM during spark-submit in Google's Spark
> Operator, and we implemented an internal fix for it within our company.
>
> We can continue the discussion in more detail, but generally I love this
> move toward an official Spark operator, and I really appreciate the
> effort! In the SPIP doc, I see my comment has gained several upvotes from
> people I don't know, so I believe there are other Spark / Spark Operator
> users who agree with some of my points. Let me know what you all think,
> and let's continue the discussion so that we can make this operator a
> great new component of the open-source Spark project!
>
> Thanks!
>
> Shiqi
>
> On Mon, Nov 13, 2023 at 11:50 PM L. C. Hsieh  wrote:
>
>> Thanks for all the support from the community for the SPIP proposal.
>>
>> Since all questions/discussions have settled down (if I didn't miss any
>> major ones), and if there are no more questions or concerns, I'll be the
>> shepherd for this SPIP proposal and call for a vote tomorrow.
>>
>> Thank you all!
>>
>> On Mon, Nov 13, 2023 at 6:43 PM Zhou Jiang 
>> wrote:
>> >
>> > Hi Holden,
>> >
>> > Thanks a lot for your feedback!
>> > Yes, this proposal attempts to integrate existing solutions, especially
>> from the CRD perspective. The proposed schema retains similarity with
>> current designs, while reducing duplication and maintaining a single
>> source of truth from conf properties. It also stays close to native k8s
>> integration to minimize schema changes for new features.
>> > For dependencies, packing everything is the easiest way to get started.
>> It would be straightforward to add --packages and --repositories support
>> for Maven dependencies. It's technically possible to pull dependencies
>> from cloud storage in init containers (if defined by the user). It could
>> be tricky to design a general solution that supports different cloud
>> providers from the operator layer. An enhancement that I can