Re: [VOTE] Release Spark 3.4.3 (RC2)

2024-04-14 Thread Dongjoon Hyun
I'll start with my +1.

- Checked checksum and signature
- Checked the Spark version in the Scala/Java/R/Python/SQL documentation
- Checked published Maven artifacts
- All CIs passed.
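
The checksum part of the checks above can be sketched in Python as follows. This is only an illustration, not the script used for this vote: it assumes the artifact has already been downloaded and that the hex SHA-512 digest has been copied out of the published digest file by hand (the function names are hypothetical). Signature verification against the KEYS file is a separate step and is not covered here.

```python
import hashlib
from pathlib import Path


def sha512_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-512 digest of a file, reading it in chunks."""
    digest = hashlib.sha512()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_digest(artifact: Path, published_hex: str) -> bool:
    """Compare a file's digest with a published hex digest.

    Case and whitespace in the published value are ignored, since digest
    files are sometimes printed in upper case or in space-separated groups.
    """
    expected = "".join(published_hex.split()).lower()
    return sha512_of(artifact) == expected
```

A mismatch means the download is corrupt or differs from what was staged, which would be worth reporting on the vote thread.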

Thanks,
Dongjoon.

On 2024/04/15 04:22:26 Dongjoon Hyun wrote:
> Please vote on releasing the following candidate as Apache Spark version
> 3.4.3.
> 
> The vote is open until April 18th 1AM (PDT) and passes if a majority +1 PMC
> votes are cast, with a minimum of 3 +1 votes.
> 
> [ ] +1 Release this package as Apache Spark 3.4.3
> [ ] -1 Do not release this package because ...
> 
> To learn more about Apache Spark, please see https://spark.apache.org/
> 
> The tag to be voted on is v3.4.3-rc2 (commit
> 1eb558c3a6fbdd59e5a305bc3ab12ce748f6511f)
> https://github.com/apache/spark/tree/v3.4.3-rc2
> 
> The release files, including signatures, digests, etc. can be found at:
> https://dist.apache.org/repos/dist/dev/spark/v3.4.3-rc2-bin/
> 
> Signatures used for Spark RCs can be found in this file:
> https://dist.apache.org/repos/dist/dev/spark/KEYS
> 
> The staging repository for this release can be found at:
> https://repository.apache.org/content/repositories/orgapachespark-1453/
> 
> The documentation corresponding to this release can be found at:
> https://dist.apache.org/repos/dist/dev/spark/v3.4.3-rc2-docs/
> 
> The list of bug fixes going into 3.4.3 can be found at the following URL:
> https://issues.apache.org/jira/projects/SPARK/versions/12353987
> 
> This release is using the release script of the tag v3.4.3-rc2.
> 
> FAQ
> 
> =
> How can I help test this release?
> =
> 
> If you are a Spark user, you can help us test this release by taking
> an existing Spark workload and running on this release candidate, then
> reporting any regressions.
> 
> If you're working in PySpark, you can set up a virtual env, install
> the current RC, and see if anything important breaks. In Java/Scala,
> you can add the staging repository to your project's resolvers and test
> with the RC (make sure to clean up the artifact cache before/after so
> you don't end up building with an out-of-date RC going forward).
> 
> ===
> What should happen to JIRA tickets still targeting 3.4.3?
> ===
> 
> The current list of open tickets targeted at 3.4.3 can be found at:
> https://issues.apache.org/jira/projects/SPARK and search for "Target
> Version/s" = 3.4.3
> 
> Committers should look at those and triage. Extremely important bug
> fixes, documentation, and API tweaks that impact compatibility should
> be worked on immediately. Everything else please retarget to an
> appropriate release.
> 
> ==
> But my bug isn't fixed?
> ==
> 
> In order to make timely releases, we will typically not hold the
> release unless the bug in question is a regression from the previous
> release. That being said, if there is something which is a regression
> that has not been correctly targeted please ping me or a committer to
> help target the issue.
> 

-
To unsubscribe e-mail: dev-unsubscr...@spark.apache.org



[VOTE] Release Spark 3.4.3 (RC2)

2024-04-14 Thread Dongjoon Hyun
Please vote on releasing the following candidate as Apache Spark version
3.4.3.

The vote is open until April 18th 1AM (PDT) and passes if a majority +1 PMC
votes are cast, with a minimum of 3 +1 votes.

[ ] +1 Release this package as Apache Spark 3.4.3
[ ] -1 Do not release this package because ...
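
One reading of the passing rule above can be written down as a small Python sketch. It assumes that only binding (PMC) votes count toward the result, that "majority" means more binding +1 votes than binding -1 votes, and that the minimum of three applies to binding +1 votes. The `Vote` type and the function name are hypothetical, introduced only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Vote:
    voter: str
    value: int     # +1, 0, or -1
    binding: bool  # True when cast by a PMC member


def vote_passes(votes: list[Vote], minimum_plus_ones: int = 3) -> bool:
    """Apply the assumed rule: at least `minimum_plus_ones` binding +1
    votes, and more binding +1 votes than binding -1 votes."""
    binding = [v for v in votes if v.binding]
    plus = sum(1 for v in binding if v.value == 1)
    minus = sum(1 for v in binding if v.value == -1)
    return plus >= minimum_plus_ones and plus > minus
```

Non-binding votes are ignored by the tally in this sketch, although they are still useful feedback on the thread.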

To learn more about Apache Spark, please see https://spark.apache.org/

The tag to be voted on is v3.4.3-rc2 (commit
1eb558c3a6fbdd59e5a305bc3ab12ce748f6511f)
https://github.com/apache/spark/tree/v3.4.3-rc2

The release files, including signatures, digests, etc. can be found at:
https://dist.apache.org/repos/dist/dev/spark/v3.4.3-rc2-bin/

Signatures used for Spark RCs can be found in this file:
https://dist.apache.org/repos/dist/dev/spark/KEYS

The staging repository for this release can be found at:
https://repository.apache.org/content/repositories/orgapachespark-1453/

The documentation corresponding to this release can be found at:
https://dist.apache.org/repos/dist/dev/spark/v3.4.3-rc2-docs/

The list of bug fixes going into 3.4.3 can be found at the following URL:
https://issues.apache.org/jira/projects/SPARK/versions/12353987

This release is using the release script of the tag v3.4.3-rc2.

FAQ

=
How can I help test this release?
=

If you are a Spark user, you can help us test this release by taking
an existing Spark workload and running on this release candidate, then
reporting any regressions.

If you're working in PySpark, you can set up a virtual env, install
the current RC, and see if anything important breaks. In Java/Scala,
you can add the staging repository to your project's resolvers and test
with the RC (make sure to clean up the artifact cache before/after so
you don't end up building with an out-of-date RC going forward).
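
For the PySpark route, a minimal Python sketch of building the install command is shown below. It assumes the RC directory listed above contains a source tarball named `pyspark-3.4.3.tar.gz`; that file name is an assumption and should be checked against the directory listing. The helper name is hypothetical, and the command is meant to be run inside a fresh virtual env.

```python
import sys

# RC directory taken from the vote e-mail above.
RC_BIN_URL = "https://dist.apache.org/repos/dist/dev/spark/v3.4.3-rc2-bin/"


def pyspark_install_command(version: str = "3.4.3",
                            base_url: str = RC_BIN_URL) -> list[str]:
    """Build a pip command that installs PySpark from the RC directory.

    Assumption: the tarball is named pyspark-<version>.tar.gz.
    """
    tarball = f"{base_url.rstrip('/')}/pyspark-{version}.tar.gz"
    return [sys.executable, "-m", "pip", "install", tarball]


if __name__ == "__main__":
    # Print the command instead of running it, so it can be reviewed first.
    print(" ".join(pyspark_install_command()))
```

Printing the command rather than executing it keeps the sketch free of side effects; the returned list can be passed to `subprocess.run` once the URL has been confirmed.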

===
What should happen to JIRA tickets still targeting 3.4.3?
===

The current list of open tickets targeted at 3.4.3 can be found at:
https://issues.apache.org/jira/projects/SPARK and search for "Target
Version/s" = 3.4.3

Committers should look at those and triage. Extremely important bug
fixes, documentation, and API tweaks that impact compatibility should
be worked on immediately. Everything else please retarget to an
appropriate release.

==
But my bug isn't fixed?
==

In order to make timely releases, we will typically not hold the
release unless the bug in question is a regression from the previous
release. That being said, if there is something which is a regression
that has not been correctly targeted please ping me or a committer to
help target the issue.


Re: [VOTE] SPARK-44444: Use ANSI SQL mode by default

2024-04-14 Thread Jungtaek Lim
+1 (non-binding), thanks Dongjoon.

On Sun, Apr 14, 2024 at 7:22 AM Dongjoon Hyun wrote:

> Please vote on SPARK-44444 to use ANSI SQL mode by default.
> The technical scope is defined in the following PR which is
> one line of code change and one line of migration guide.
>
> - DISCUSSION:
> https://lists.apache.org/thread/ztlwoz1v1sn81ssks12tb19x37zozxlz
> - JIRA: https://issues.apache.org/jira/browse/SPARK-44444
> - PR: https://github.com/apache/spark/pull/46013
>
> The vote is open until April 17th 1AM (PST) and passes
> if a majority +1 PMC votes are cast, with a minimum of 3 +1 votes.
>
> [ ] +1 Use ANSI SQL mode by default
> [ ] -1 Do not use ANSI SQL mode by default because ...
>
> Thank you in advance.
>
> Dongjoon
>
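
As background for what the quoted proposal changes, here is a toy Python model of one well-known difference between the two modes: with ANSI mode off, casting a malformed string to INT yields NULL, while with ANSI mode on (the `spark.sql.ansi.enabled` setting) the same cast raises an error. This is not Spark code and it does not reproduce Spark's actual parsing rules; the function name is hypothetical and only the "invalid input" case is modeled.

```python
import re
from typing import Optional

_INTEGER = re.compile(r"\s*[+-]?[0-9]+\s*")


def cast_string_to_int(text: str, ansi: bool) -> Optional[int]:
    """Toy model of CAST(text AS INT) for well-formed vs. malformed input.

    ansi=False: malformed input returns None (standing in for SQL NULL).
    ansi=True:  malformed input raises, so bad data fails loudly.
    """
    if _INTEGER.fullmatch(text):
        return int(text)
    if ansi:
        raise ValueError(f"cannot cast {text!r} to INT in ANSI mode")
    return None
```

The practical consequence is that workloads which silently relied on NULLs from bad casts may start failing once the default flips, which is why the PR also adds a line to the migration guide.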


Re: [VOTE] SPARK-44444: Use ANSI SQL mode by default

2024-04-14 Thread Wenchen Fan
+1

On Sun, Apr 14, 2024 at 6:28 AM Dongjoon Hyun  wrote:

> I'll start from my +1.
>
> Dongjoon.
>
> On 2024/04/13 22:22:05 Dongjoon Hyun wrote:
> > Please vote on SPARK-44444 to use ANSI SQL mode by default.
> > The technical scope is defined in the following PR which is
> > one line of code change and one line of migration guide.
> >
> > - DISCUSSION:
> > https://lists.apache.org/thread/ztlwoz1v1sn81ssks12tb19x37zozxlz
> > - JIRA: https://issues.apache.org/jira/browse/SPARK-44444
> > - PR: https://github.com/apache/spark/pull/46013
> >
> > The vote is open until April 17th 1AM (PST) and passes
> > if a majority +1 PMC votes are cast, with a minimum of 3 +1 votes.
> >
> > [ ] +1 Use ANSI SQL mode by default
> > [ ] -1 Do not use ANSI SQL mode by default because ...
> >
> > Thank you in advance.
> >
> > Dongjoon
> >
>
>
>


Re: [DISCUSS] Spark 4.0.0 release

2024-04-14 Thread Jungtaek Lim
W.r.t. state data source - reader (SPARK-45511), there are several
follow-up tickets, but we don't plan to address them soon. The current
implementation is the final shape for Spark 4.0.0, unless there are demands
on the follow-up tickets.

We may want to check the plan for transformWithState - my understanding is
that we want to release the feature in 4.0.0, but several work items
remain. While the tentative release date is June 2024, what would be the
tentative timeline for the RC cut?
(cc. Anish to add more context on the plan for transformWithState)

On Sat, Apr 13, 2024 at 3:15 AM Wenchen Fan  wrote:

> Hi all,
>
> It's close to the previously proposed 4.0.0 release date (June 2024), and
> I think it's time to prepare for it and discuss the ongoing projects:
>
> - ANSI by default
> - Spark Connect GA
> - Structured Logging
> - Streaming state store data source
> - new data type VARIANT
> - STRING collation support
> - Spark k8s operator versioning
>
> Please help add any items that are missing from this list. I would
> like to volunteer as the release manager for Apache Spark 4.0.0 if there is
> no objection. Thank you all for the great work that fills Spark 4.0!
>
> Wenchen Fan
>


Re: [VOTE] SPARK-44444: Use ANSI SQL mode by default

2024-04-14 Thread yangjie01
+1 for me

Jie Yang

From: Mich Talebzadeh
Date: Sunday, April 14, 2024 at 15:41
To: Dongjoon Hyun, Spark dev list
Subject: Re: [VOTE] SPARK-44444: Use ANSI SQL mode by default

+ 1 for me

It makes it more compatible with the other ANSI SQL compliant products.

Mich Talebzadeh,
Technologist | Solutions Architect | Data Engineer  | Generative AI
London
United Kingdom



 
https://en.everybodywiki.com/Mich_Talebzadeh



Disclaimer: The information provided is correct to the best of my knowledge but
of course cannot be guaranteed. It is essential to note that, as with any
advice, quote "one test result is worth one-thousand expert opinions (Werner
Von Braun)".


On Sun, 14 Apr 2024 at 00:39, Dongjoon Hyun wrote:
Please vote on SPARK-44444 to use ANSI SQL mode by default.
The technical scope is defined in the following PR which is
one line of code change and one line of migration guide.

- DISCUSSION: https://lists.apache.org/thread/ztlwoz1v1sn81ssks12tb19x37zozxlz
- JIRA: https://issues.apache.org/jira/browse/SPARK-44444
- PR: https://github.com/apache/spark/pull/46013

The vote is open until April 17th 1AM (PST) and passes
if a majority +1 PMC votes are cast, with a minimum of 3 +1 votes.

[ ] +1 Use ANSI SQL mode by default
[ ] -1 Do not use ANSI SQL mode by default because ...

Thank you in advance.

Dongjoon


Re: [VOTE] Add new `Versions` in Apache Spark JIRA for Versioning of Spark Operator

2024-04-14 Thread Hussein Awala
+1 (non-binding) to using an independent version for the Spark Kubernetes
Operator with a compatibility matrix with Spark versions.

On Fri, Apr 12, 2024 at 5:31 AM L. C. Hsieh  wrote:

> Hi all,
>
> Thanks for all discussions in the thread of "Versioning of Spark
> Operator":
> https://lists.apache.org/thread/zhc7nb2sxm8jjxdppq8qjcmlf4rcsthh
>
> I would like to create this vote to get the consensus for versioning
> of the Spark Kubernetes Operator.
>
> The proposal is to use independent versioning for the Spark
> Kubernetes Operator.
>
> Please vote on adding new `Versions` in Apache Spark JIRA which can be
> used for places like "Fix Version/s" in the JIRA tickets of the
> operator.
>
> The new `Versions` will use the `kubernetes-operator-` prefix, for example
> `kubernetes-operator-0.1.0`.
>
> The vote is open until April 15th 1AM (PST) and passes if a majority
> +1 PMC votes are cast, with a minimum of 3 +1 votes.
>
> [ ] +1 Adding the new `Versions` for Spark Kubernetes Operator in
> Apache Spark JIRA
> [ ] -1 Do not add the new `Versions` because ...
>
> Thank you.
>
>
> Note that this is neither a SPIP vote nor a release vote. I could not
> find similar votes in previous threads, so I modeled this one on a
> SPIP or release vote. I think it should be okay; please correct
> me if this vote format does not work for you.
>
>
>
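
The naming scheme proposed in the quoted message can be illustrated with a short Python sketch. It assumes operator versions have three numeric parts, as in the `kubernetes-operator-0.1.0` example; the proposal itself only fixes the prefix. The class and function names are hypothetical.

```python
import re
from typing import NamedTuple

PREFIX = "kubernetes-operator-"
# Assumption: <major>.<minor>.<patch>, following the 0.1.0 example.
_PATTERN = re.compile(re.escape(PREFIX) + r"(\d+)\.(\d+)\.(\d+)")


class OperatorVersion(NamedTuple):
    major: int
    minor: int
    patch: int

    def jira_name(self) -> str:
        """Render the version as it would appear in a JIRA version field."""
        return f"{PREFIX}{self.major}.{self.minor}.{self.patch}"


def parse_jira_version(name: str) -> OperatorVersion:
    """Parse a prefixed JIRA version name into its numeric parts."""
    match = _PATTERN.fullmatch(name)
    if match is None:
        raise ValueError(f"not an operator version: {name!r}")
    return OperatorVersion(*(int(part) for part in match.groups()))
```

Because the prefix keeps operator versions apart from Spark versions such as 3.4.3, a plain Spark version name is rejected by the parser, and the tuple form makes operator versions directly comparable.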


Re: [VOTE] SPARK-44444: Use ANSI SQL mode by default

2024-04-14 Thread Christiano Anderson

+1

On 14/04/2024 00:22, Dongjoon Hyun wrote:

Please vote on SPARK-44444 to use ANSI SQL mode by default.
The technical scope is defined in the following PR which is
one line of code change and one line of migration guide.

- DISCUSSION: https://lists.apache.org/thread/ztlwoz1v1sn81ssks12tb19x37zozxlz
- JIRA: https://issues.apache.org/jira/browse/SPARK-44444
- PR: https://github.com/apache/spark/pull/46013

The vote is open until April 17th 1AM (PST) and passes
if a majority +1 PMC votes are cast, with a minimum of 3 +1 votes.

[ ] +1 Use ANSI SQL mode by default
[ ] -1 Do not use ANSI SQL mode by default because ...

Thank you in advance.

Dongjoon





Re: [VOTE] SPARK-44444: Use ANSI SQL mode by default

2024-04-14 Thread Mich Talebzadeh
+ 1 for me

It makes it more compatible with the other ANSI SQL compliant products.

Mich Talebzadeh,
Technologist | Solutions Architect | Data Engineer  | Generative AI
London
United Kingdom



 https://en.everybodywiki.com/Mich_Talebzadeh



*Disclaimer:* The information provided is correct to the best of my
knowledge but of course cannot be guaranteed. It is essential to note
that, as with any advice, quote "one test result is worth one-thousand
expert opinions (Werner Von Braun)".


On Sun, 14 Apr 2024 at 00:39, Dongjoon Hyun  wrote:

> Please vote on SPARK-44444 to use ANSI SQL mode by default.
> The technical scope is defined in the following PR which is
> one line of code change and one line of migration guide.
>
> - DISCUSSION:
> https://lists.apache.org/thread/ztlwoz1v1sn81ssks12tb19x37zozxlz
> - JIRA: https://issues.apache.org/jira/browse/SPARK-44444
> - PR: https://github.com/apache/spark/pull/46013
>
> The vote is open until April 17th 1AM (PST) and passes
> if a majority +1 PMC votes are cast, with a minimum of 3 +1 votes.
>
> [ ] +1 Use ANSI SQL mode by default
> [ ] -1 Do not use ANSI SQL mode by default because ...
>
> Thank you in advance.
>
> Dongjoon
>