
Re: Apache Spark 3.2.2 Release?

2022-07-08 Thread Dongjoon Hyun

Thank you so much! :)

Dongjoon.

On Thu, Jul 7, 2022 at 6:51 PM Joshua Rosen  wrote:
>
> +1; thanks for coordinating this!
>
> I have a few more correctness bugs to add to the list in your original email 
> (these were originally missing the 'correctness' JIRA label):
>
> - https://issues.apache.org/jira/browse/SPARK-37643 : when 
> charVarcharAsString is true, char datatype partition table query incorrect
> - https://issues.apache.org/jira/browse/SPARK-37865 : Spark should not dedup 
> the groupingExpressions when the first child of Union has duplicate columns
> - https://issues.apache.org/jira/browse/SPARK-38787 : Possible correctness 
> issue on stream-stream join when handling edge case
>
>
> On Thu, Jul 7, 2022 at 6:12 PM Dongjoon Hyun  wrote:
>>
>> Thank you all.
>>
>> I'll check and prepare RC1 for next week.
>>
>> Dongjoon.

-
To unsubscribe e-mail: dev-unsubscr...@spark.apache.org

Re: Apache Spark 3.2.2 Release?

2022-07-07 Thread Joshua Rosen

+1; thanks for coordinating this!

I have a few more correctness bugs to add to the list in your original
email (these were originally missing the 'correctness' JIRA label):

- https://issues.apache.org/jira/browse/SPARK-37643 : when
charVarcharAsString is true, char datatype partition table query incorrect
- https://issues.apache.org/jira/browse/SPARK-37865 : Spark should not
dedup the groupingExpressions when the first child of Union has duplicate
columns
- https://issues.apache.org/jira/browse/SPARK-38787 : Possible correctness
issue on stream-stream join when handling edge case
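
One way to check whether fixes for the JIRA IDs listed above are present on branch-3.2 is to scan commit subjects, which in Spark conventionally begin with a `[SPARK-NNNNN]` tag. The following is only a sketch under that assumption: `jira_ids`, `missing_fixes`, and `read_log` are hypothetical helper names, and the sample log lines are made up for illustration.

```python
import re
import subprocess

# Assumption: commit subjects carry their JIRA ID as "[SPARK-12345]".
JIRA_ID = re.compile(r"\[(SPARK-\d+)\]")


def jira_ids(oneline_log):
    """Collect every SPARK-NNNNN ID mentioned in `git log --oneline` output."""
    return set(JIRA_ID.findall(oneline_log))


def missing_fixes(oneline_log, wanted):
    """Return the wanted IDs that no commit subject mentions, in input order."""
    present = jira_ids(oneline_log)
    return [jid for jid in wanted if jid not in present]


def read_log(rev_range="v3.2.1..HEAD"):
    """Run git in the current directory (assumed to be a Spark checkout)."""
    result = subprocess.run(
        ["git", "log", "--oneline", rev_range],
        check=True, capture_output=True, text=True,
    )
    return result.stdout


# Illustrative input only; these are not real commit hashes or subjects.
sample = """\
abc1234 [SPARK-37643][SQL] example subject
def5678 [SPARK-38787][SS] example subject
0123abc [MINOR] example subject
"""
print(missing_fixes(sample, ["SPARK-37643", "SPARK-37865", "SPARK-38787"]))
# prints ['SPARK-37865']
```

In a real checkout, `missing_fixes(read_log(), [...])` would run the same check against `v3.2.1..HEAD`. An ID reported as missing only means that no commit subject in the range mentions it.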


On Thu, Jul 7, 2022 at 6:12 PM Dongjoon Hyun 
wrote:

> Thank you all.
>
> I'll check and prepare RC1 for next week.
>
> Dongjoon.
>

Re: Apache Spark 3.2.2 Release?

2022-07-07 Thread Dongjoon Hyun

Thank you all.

I'll check and prepare RC1 for next week.

Dongjoon.

Re: Apache Spark 3.2.2 Release?

2022-07-07 Thread Andrew Ray

+1 (non-binding) Thanks!

On Thu, Jul 7, 2022 at 7:00 AM Yang,Jie(INF)  wrote:

> +1 (non-binding) Thank you Dongjoon ~
>
>
>
> *From:* Ruifeng Zheng
> *Date:* Thursday, July 7, 2022, 16:28
> *To:* dev
> *Subject:* Re: Apache Spark 3.2.2 Release?
>
>
>
> +1 thank you Dongjoon!
>
>
> --
>
> Ruifeng Zheng
>
> ruife...@foxmail.com
>
>
>
>
>
>
>
> -- Original --
>
> *From:* "Yikun Jiang" ;
>
> *Date:* Thu, Jul 7, 2022 04:16 PM
>
> *To:* "Mridul Muralidharan";
>
> *Cc:* "Gengliang Wang"; "Cheng Su"; "Maxim Gekk"; "Wenchen Fan"; "Xiao Li";
> "Xinrong Meng"; "Yuming Wang"; "dev";
>
> *Subject:* Re: Apache Spark 3.2.2 Release?
>
>
>
> +1  (non-binding)
>
>
>
> Thanks!
>
>
> Regards,
>
> Yikun
>
>
>
>
>
> On Thu, Jul 7, 2022 at 1:57 PM Mridul Muralidharan 
> wrote:
>
> +1
>
>
>
> Thanks for driving this Dongjoon !
>
>
>
> Regards,
>
> Mridul
>
>
>
> On Thu, Jul 7, 2022 at 12:36 AM Gengliang Wang  wrote:
>
> +1.
>
> Thank you, Dongjoon.
>
>
>
> On Wed, Jul 6, 2022 at 10:21 PM Wenchen Fan  wrote:
>
> +1
>
>
>
> On Thu, Jul 7, 2022 at 10:41 AM Xinrong Meng
>  wrote:
>
> +1
>
>
> Thanks!
>
>
>
> Xinrong Meng
>
> Software Engineer
>
> Databricks
>
>
>
>
>
> On Wed, Jul 6, 2022 at 7:25 PM Xiao Li  wrote:
>
> +1
>
>
>
> Xiao
>
>
>
> On Wed, Jul 6, 2022 at 7:16 PM Cheng Su wrote:
>
> +1 (non-binding)
>
>
>
> Thanks,
>
> Cheng Su
>
>
>
> On Wed, Jul 6, 2022 at 6:01 PM Yuming Wang  wrote:
>
> +1
>
>
>
> On Thu, Jul 7, 2022 at 5:53 AM Maxim Gekk
>  wrote:
>
> +1
>
>
>
> On Thu, Jul 7, 2022 at 12:26 AM John Zhuge  wrote:
>
> +1  Thanks for the effort!
>
>
>
> On Wed, Jul 6, 2022 at 2:23 PM Bjørn Jørgensen 
> wrote:
>
> +1
>
>
>
> On Wed, Jul 6, 2022 at 11:05 PM Hyukjin Kwon wrote:
>
> Yeah +1
>
>
>
> On Thu, Jul 7, 2022 at 5:40 AM Dongjoon Hyun 
> wrote:
>
> Hi, All.
>
> Since the Apache Spark 3.2.1 tag was created (Jan 19), 197 new patches,
> including 11 correctness patches, have arrived at branch-3.2.
>
> Shall we make a new release, Apache Spark 3.2.2, as the third release
> in the 3.2 line? I'd like to volunteer as the release manager for Apache
> Spark 3.2.2. I'm thinking about starting the first RC next week.
>
> $ git log --oneline v3.2.1..HEAD | wc -l
>  197
>
> # Correctness issues
>
> SPARK-38075 Hive script transform with order by and limit will
> return fake rows
> SPARK-38204 All state operators are at a risk of inconsistency
> between state partitioning and operator partitioning
> SPARK-38309 SHS has incorrect percentiles for shuffle read bytes
> and shuffle total blocks metrics
> SPARK-38320 (flat)MapGroupsWithState can timeout groups which just
> received inputs in the same microbatch
> SPARK-38614 After Spark update, df.show() shows incorrect
> F.percent_rank results
> SPARK-38655 OffsetWindowFunctionFrameBase cannot find the offset
> row whose input is not null
> SPARK-38684 Stream-stream outer join has a possible correctness
> issue due to weakly read consistent on outer iterators
> SPARK-39061 Incorrect results or NPE when using Inline function
> against an array of dynamically created structs
> SPARK-39107 Silent change in regexp_replace's handling of empty strings
> SPARK-39259 Timestamps returned by now() and equivalent functions
> are not consistent in subqueries
> SPARK-39293 The accumulator of ArrayAggregate should copy the
> intermediate result if string, struct, array, or map
>
> Best,
> Dongjoon.
>
> --
>
> John Zhuge
>
>

Re: Apache Spark 3.2.2 Release?

2022-07-07 Thread Yang,Jie(INF)

+1 (non-binding) Thank you Dongjoon ~

From: Ruifeng Zheng
Date: Thursday, July 7, 2022, 16:28
To: dev
Subject: Re: Apache Spark 3.2.2 Release?

+1 thank you Dongjoon!



Ruifeng Zheng
ruife...@foxmail.com




-- Original --
From: "Yikun Jiang" ;
Date: Thu, Jul 7, 2022 04:16 PM
To: "Mridul Muralidharan";
Cc: "Gengliang Wang"; "Cheng Su"; "Maxim Gekk"; "Wenchen Fan"; "Xiao Li";
"Xinrong Meng"; "Yuming Wang"; "dev";
Subject: Re: Apache Spark 3.2.2 Release?

+1  (non-binding)

Thanks!

Regards,
Yikun


On Thu, Jul 7, 2022 at 1:57 PM Mridul Muralidharan wrote:
+1

Thanks for driving this Dongjoon !

Regards,
Mridul

On Thu, Jul 7, 2022 at 12:36 AM Gengliang Wang wrote:
+1.
Thank you, Dongjoon.

On Wed, Jul 6, 2022 at 10:21 PM Wenchen Fan wrote:
+1

On Thu, Jul 7, 2022 at 10:41 AM Xinrong Meng wrote:
+1

Thanks!

Xinrong Meng
Software Engineer
Databricks

On Wed, Jul 6, 2022 at 7:25 PM Xiao Li wrote:
+1

Xiao

On Wed, Jul 6, 2022 at 7:16 PM Cheng Su wrote:
+1 (non-binding)

Thanks,
Cheng Su

On Wed, Jul 6, 2022 at 6:01 PM Yuming Wang wrote:
+1

On Thu, Jul 7, 2022 at 5:53 AM Maxim Gekk wrote:
+1

On Thu, Jul 7, 2022 at 12:26 AM John Zhuge wrote:
+1  Thanks for the effort!

On Wed, Jul 6, 2022 at 2:23 PM Bjørn Jørgensen wrote:
+1

On Wed, Jul 6, 2022 at 11:05 PM Hyukjin Kwon wrote:
Yeah +1

On Thu, Jul 7, 2022 at 5:40 AM Dongjoon Hyun wrote:
Hi, All.

Since the Apache Spark 3.2.1 tag was created (Jan 19), 197 new patches,
including 11 correctness patches, have arrived at branch-3.2.

Shall we make a new release, Apache Spark 3.2.2, as the third release
in the 3.2 line? I'd like to volunteer as the release manager for Apache
Spark 3.2.2. I'm thinking about starting the first RC next week.

$ git log --oneline v3.2.1..HEAD | wc -l
 197

# Correctness issues

SPARK-38075 Hive script transform with order by and limit will
return fake rows
SPARK-38204 All state operators are at a risk of inconsistency
between state partitioning and operator partitioning
SPARK-38309 SHS has incorrect percentiles for shuffle read bytes
and shuffle total blocks metrics
SPARK-38320 (flat)MapGroupsWithState can timeout groups which just
received inputs in the same microbatch
SPARK-38614 After Spark update, df.show() shows incorrect
F.percent_rank results
SPARK-38655 OffsetWindowFunctionFrameBase cannot find the offset
row whose input is not null
SPARK-38684 Stream-stream outer join has a possible correctness
issue due to weakly read consistent on outer iterators
SPARK-39061 Incorrect results or NPE when using Inline function
against an array of dynamically created structs
SPARK-39107 Silent change in regexp_replace's handling of empty strings
SPARK-39259 Timestamps returned by now() and equivalent functions
are not consistent in subqueries
SPARK-39293 The accumulator of ArrayAggregate should copy the
intermediate result if string, struct, array, or map

Best,
Dongjoon.

--
John Zhuge

Re: Apache Spark 3.2.2 Release?

2022-07-07 Thread Ruifeng Zheng

+1 thank you Dongjoon!




Ruifeng Zheng
ruife...@foxmail.com

-- Original --
From: "Yikun Jiang"

Re: Apache Spark 3.2.2 Release?

2022-07-07 Thread Yikun Jiang

+1  (non-binding)

Thanks!

Regards,
Yikun



Re: Apache Spark 3.2.2 Release?

2022-07-06 Thread Mridul Muralidharan

+1

Thanks for driving this Dongjoon !

Regards,
Mridul


Re: Apache Spark 3.2.2 Release?

2022-07-06 Thread Gengliang Wang

+1.
Thank you, Dongjoon.


Re: Apache Spark 3.2.2 Release?

2022-07-06 Thread Wenchen Fan

+1


Re: Apache Spark 3.2.2 Release?

2022-07-06 Thread Xinrong Meng

+1

Thanks!


Xinrong Meng

Software Engineer

Databricks



Re: Apache Spark 3.2.2 Release?

2022-07-06 Thread Xiao Li

+1

Xiao


Re: Apache Spark 3.2.2 Release?

2022-07-06 Thread Cheng Su

+1 (non-binding)

Thanks,
Cheng Su


Re: Apache Spark 3.2.2 Release?

2022-07-06 Thread Yuming Wang

+1


Re: Apache Spark 3.2.2 Release?

2022-07-06 Thread Maxim Gekk

+1


Re: Apache Spark 3.2.2 Release?

2022-07-06 Thread John Zhuge

+1  Thanks for the effort!

On Wed, Jul 6, 2022 at 2:23 PM Bjørn Jørgensen 
wrote:

> +1
>
> On Wed, Jul 6, 2022 at 11:05 PM Hyukjin Kwon wrote:
>
>> Yeah +1
>>
>> On Thu, Jul 7, 2022 at 5:40 AM Dongjoon Hyun 
>> wrote:
>>
>>> Hi, All.
>>>
>>> Since the Apache Spark 3.2.1 tag was created (Jan 19), 197 new patches,
>>> including 11 correctness patches, have arrived at branch-3.2.
>>>
>>> Shall we make a new release, Apache Spark 3.2.2, as the third release
>>> in the 3.2 line? I'd like to volunteer as the release manager for Apache
>>> Spark 3.2.2. I'm thinking about starting the first RC next week.
>>>
>>> $ git log --oneline v3.2.1..HEAD | wc -l
>>>  197
>>>
>>> # Correctness issues
>>>
>>> SPARK-38075 Hive script transform with order by and limit will
>>> return fake rows
>>> SPARK-38204 All state operators are at a risk of inconsistency
>>> between state partitioning and operator partitioning
>>> SPARK-38309 SHS has incorrect percentiles for shuffle read bytes
>>> and shuffle total blocks metrics
>>> SPARK-38320 (flat)MapGroupsWithState can timeout groups which just
>>> received inputs in the same microbatch
>>> SPARK-38614 After Spark update, df.show() shows incorrect
>>> F.percent_rank results
>>> SPARK-38655 OffsetWindowFunctionFrameBase cannot find the offset
>>> row whose input is not null
>>> SPARK-38684 Stream-stream outer join has a possible correctness
>>> issue due to weakly read consistent on outer iterators
>>> SPARK-39061 Incorrect results or NPE when using Inline function
>>> against an array of dynamically created structs
>>> SPARK-39107 Silent change in regexp_replace's handling of empty
>>> strings
>>> SPARK-39259 Timestamps returned by now() and equivalent functions
>>> are not consistent in subqueries
>>> SPARK-39293 The accumulator of ArrayAggregate should copy the
>>> intermediate result if string, struct, array, or map
>>>
>>> Best,
>>> Dongjoon.
>>>
>>> --
John Zhuge

Re: Apache Spark 3.2.2 Release?

2022-07-06 Thread Bjørn Jørgensen

+1

On Wed, Jul 6, 2022 at 11:05 PM Hyukjin Kwon wrote:

> Yeah +1
>
> On Thu, Jul 7, 2022 at 5:40 AM Dongjoon Hyun 
> wrote:
>
>> Hi, All.
>>
>> Since Apache Spark 3.2.1 tag creation (Jan 19), new 197 patches
>> including 11 correctness patches arrived at branch-3.2.
>>
>> Shall we make a new release, Apache Spark 3.2.2, as the third release
>> at 3.2 line? I'd like to volunteer as the release manager for Apache
>> Spark 3.2.2. I'm thinking about starting the first RC next week.
>>
>> $ git log --oneline v3.2.1..HEAD | wc -l
>>  197
>>
>> # Correctness issues
>>
>> SPARK-38075 Hive script transform with order by and limit will
>> return fake rows
>> SPARK-38204 All state operators are at a risk of inconsistency
>> between state partitioning and operator partitioning
>> SPARK-38309 SHS has incorrect percentiles for shuffle read bytes
>> and shuffle total blocks metrics
>> SPARK-38320 (flat)MapGroupsWithState can timeout groups which just
>> received inputs in the same microbatch
>> SPARK-38614 After Spark update, df.show() shows incorrect
>> F.percent_rank results
>> SPARK-38655 OffsetWindowFunctionFrameBase cannot find the offset
>> row whose input is not null
>> SPARK-38684 Stream-stream outer join has a possible correctness
>> issue due to weakly read consistent on outer iterators
>> SPARK-39061 Incorrect results or NPE when using Inline function
>> against an array of dynamically created structs
>> SPARK-39107 Silent change in regexp_replace's handling of empty
>> strings
>> SPARK-39259 Timestamps returned by now() and equivalent functions
>> are not consistent in subqueries
>> SPARK-39293 The accumulator of ArrayAggregate should copy the
>> intermediate result if string, struct, array, or map
>>
>> Best,
>> Dongjoon.
>>

Re: Apache Spark 3.2.2 Release?

2022-07-06 Thread Hyukjin Kwon

Yeah +1

On Thu, Jul 7, 2022 at 5:40 AM Dongjoon Hyun 
wrote:

> Hi, All.
>
> Since Apache Spark 3.2.1 tag creation (Jan 19), new 197 patches
> including 11 correctness patches arrived at branch-3.2.
>
> Shall we make a new release, Apache Spark 3.2.2, as the third release
> at 3.2 line? I'd like to volunteer as the release manager for Apache
> Spark 3.2.2. I'm thinking about starting the first RC next week.
>
> $ git log --oneline v3.2.1..HEAD | wc -l
>  197
>
> # Correctness issues
>
> SPARK-38075 Hive script transform with order by and limit will
> return fake rows
> SPARK-38204 All state operators are at a risk of inconsistency
> between state partitioning and operator partitioning
> SPARK-38309 SHS has incorrect percentiles for shuffle read bytes
> and shuffle total blocks metrics
> SPARK-38320 (flat)MapGroupsWithState can timeout groups which just
> received inputs in the same microbatch
> SPARK-38614 After Spark update, df.show() shows incorrect
> F.percent_rank results
> SPARK-38655 OffsetWindowFunctionFrameBase cannot find the offset
> row whose input is not null
> SPARK-38684 Stream-stream outer join has a possible correctness
> issue due to weakly read consistent on outer iterators
> SPARK-39061 Incorrect results or NPE when using Inline function
> against an array of dynamically created structs
> SPARK-39107 Silent change in regexp_replace's handling of empty strings
> SPARK-39259 Timestamps returned by now() and equivalent functions
> are not consistent in subqueries
> SPARK-39293 The accumulator of ArrayAggregate should copy the
> intermediate result if string, struct, array, or map
>
> Best,
> Dongjoon.
>

Apache Spark 3.2.2 Release?

2022-07-06 Thread Dongjoon Hyun

Hi, All.

Since the Apache Spark 3.2.1 tag was created (Jan 19), 197 new patches,
including 11 correctness patches, have arrived at branch-3.2.

Shall we make a new release, Apache Spark 3.2.2, as the third release
in the 3.2 line? I'd like to volunteer as the release manager for Apache
Spark 3.2.2. I'm thinking about starting the first RC next week.

$ git log --oneline v3.2.1..HEAD | wc -l
 197

# Correctness issues

SPARK-38075 Hive script transform with order by and limit will
return fake rows
SPARK-38204 All state operators are at a risk of inconsistency
between state partitioning and operator partitioning
SPARK-38309 SHS has incorrect percentiles for shuffle read bytes
and shuffle total blocks metrics
SPARK-38320 (flat)MapGroupsWithState can timeout groups which just
received inputs in the same microbatch
SPARK-38614 After Spark update, df.show() shows incorrect
F.percent_rank results
SPARK-38655 OffsetWindowFunctionFrameBase cannot find the offset
row whose input is not null
SPARK-38684 Stream-stream outer join has a possible correctness
issue due to weakly read consistent on outer iterators
SPARK-39061 Incorrect results or NPE when using Inline function
against an array of dynamically created structs
SPARK-39107 Silent change in regexp_replace's handling of empty strings
SPARK-39259 Timestamps returned by now() and equivalent functions
are not consistent in subqueries
SPARK-39293 The accumulator of ArrayAggregate should copy the
intermediate result if string, struct, array, or map

Best,
Dongjoon.
