[ANNOUNCE] Apache Spark 3.4.3 released

2024-04-18 Thread Dongjoon Hyun
We are happy to announce the availability of Apache Spark 3.4.3!

Spark 3.4.3 is a maintenance release containing many fixes, including
security and correctness fixes. This release is based on the
branch-3.4 maintenance branch of Spark. We strongly
recommend that all 3.4 users upgrade to this stable release.

To download Spark 3.4.3, head over to the download page:
https://spark.apache.org/downloads.html

To view the release notes:
https://spark.apache.org/releases/spark-release-3-4-3.html

We would like to acknowledge all community members for contributing to this
release. This release would not have been possible without you.

Dongjoon Hyun


Re: Apache Spark 3.4.3 (?)

2024-04-08 Thread Dongjoon Hyun
Thank you, Holden, Mridul, Kent, Liang-Chi, Mich, Jungtaek.

I added `Target Version: 3.4.3` to SPARK-47318 and am going to continue to 
prepare for RC1 (April 15th).

Dongjoon.

-
To unsubscribe e-mail: dev-unsubscr...@spark.apache.org



Re: Apache Spark 3.4.3 (?)

2024-04-07 Thread Jungtaek Lim
Sounds like a plan. +1 (non-binding) Thanks for volunteering!

On Sun, Apr 7, 2024 at 5:45 AM Dongjoon Hyun wrote:

> Hi, All.
>
> The Apache Spark 3.4.2 tag was created on Nov 24th, and `branch-3.4` has
> gained 85 commits since then, including important security and correctness
> patches like SPARK-45580, SPARK-46092, SPARK-46466, SPARK-46794, and
> SPARK-46862.
>
> https://github.com/apache/spark/releases/tag/v3.4.2
>
> $ git log --oneline v3.4.2..HEAD | wc -l
>   85
>
> SPARK-45580 Subquery changes the output schema of the outer query
> SPARK-46092 Overflow in Parquet row group filter creation causes incorrect results
> SPARK-46466 Vectorized parquet reader should never do rebase for timestamp ntz
> SPARK-46794 Incorrect results due to inferred predicate from checkpoint with subquery
> SPARK-46862 Incorrect count() of a dataframe loaded from CSV datasource
> SPARK-45445 Upgrade snappy to 1.1.10.5
> SPARK-47428 Upgrade Jetty to 9.4.54.v20240208
> SPARK-46239 Hide `Jetty` info
>
>
> Currently, I'm checking for more applicable patches for branch-3.4. I'd like
> to propose releasing Apache Spark 3.4.3 and volunteer as its release
> manager. If there are no additional blockers, the tentative date for the
> first RC1 vote is April 15th (Monday).
>
> WDYT?
>
> Dongjoon.
>


Fwd: Apache Spark 3.4.3 (?)

2024-04-07 Thread Mich Talebzadeh
Mich Talebzadeh,
Technologist | Solutions Architect | Data Engineer  | Generative AI
London
United Kingdom


LinkedIn profile: <https://www.linkedin.com/in/mich-talebzadeh-ph-d-5205b2/>


 https://en.everybodywiki.com/Mich_Talebzadeh



*Disclaimer:* The information provided is correct to the best of my
knowledge but of course cannot be guaranteed. It is essential to note
that, as with any advice, "one test result is worth one-thousand
expert opinions" (Wernher von Braun
<https://en.wikipedia.org/wiki/Wernher_von_Braun>).


-- Forwarded message -
From: Mich Talebzadeh 
Date: Sun, 7 Apr 2024 at 11:56
Subject: Re: Apache Spark 3.4.3 (?)
To: Dongjoon Hyun 


Yes, given that a good number of people are using some flavour of 3.4.n,
this will be a good fit.

+1 for me






Re: Apache Spark 3.4.3 (?)

2024-04-07 Thread L. C. Hsieh
+1

Thanks Dongjoon!





Re: Apache Spark 3.4.3 (?)

2024-04-07 Thread Kent Yao
+1, thank you, Dongjoon


Kent





Re: Apache Spark 3.4.3 (?)

2024-04-06 Thread Mridul Muralidharan
Hi Dongjoon,

Thanks for volunteering!
I would suggest waiting for SPARK-47318 to be merged into 3.4 as well.

Regards,
Mridul



Re: Apache Spark 3.4.3 (?)

2024-04-06 Thread Holden Karau
Sounds good to me :)

Twitter: https://twitter.com/holdenkarau
Books (Learning Spark, High Performance Spark, etc.): https://amzn.to/2MaRAG9
YouTube Live Streams: https://www.youtube.com/user/holdenkarau




Apache Spark 3.4.3 (?)

2024-04-06 Thread Dongjoon Hyun
Hi, All.

The Apache Spark 3.4.2 tag was created on Nov 24th, and `branch-3.4` has
gained 85 commits since then, including important security and correctness
patches like SPARK-45580, SPARK-46092, SPARK-46466, SPARK-46794, and
SPARK-46862.

https://github.com/apache/spark/releases/tag/v3.4.2

$ git log --oneline v3.4.2..HEAD | wc -l
  85

SPARK-45580 Subquery changes the output schema of the outer query
SPARK-46092 Overflow in Parquet row group filter creation causes incorrect results
SPARK-46466 Vectorized parquet reader should never do rebase for timestamp ntz
SPARK-46794 Incorrect results due to inferred predicate from checkpoint with subquery
SPARK-46862 Incorrect count() of a dataframe loaded from CSV datasource
SPARK-45445 Upgrade snappy to 1.1.10.5
SPARK-47428 Upgrade Jetty to 9.4.54.v20240208
SPARK-46239 Hide `Jetty` info


Currently, I'm checking for more applicable patches for branch-3.4. I'd like to
propose releasing Apache Spark 3.4.3 and volunteer as its release manager.
If there are no additional blockers, the tentative date for the first RC1
vote is April 15th (Monday).

WDYT?

Dongjoon.