Re: Apache Spark 3.2.2 Release?
Thank you so much! :)

Dongjoon.

On Thu, Jul 7, 2022 at 6:51 PM Joshua Rosen wrote:
> +1; thanks for coordinating this!
>
> I have a few more correctness bugs to add to the list in your original email
> (these were originally missing the 'correctness' JIRA label):
>
> - https://issues.apache.org/jira/browse/SPARK-37643 : when
>   charVarcharAsString is true, char datatype partition table query incorrect
> - https://issues.apache.org/jira/browse/SPARK-37865 : Spark should not dedup
>   the groupingExpressions when the first child of Union has duplicate columns
> - https://issues.apache.org/jira/browse/SPARK-38787 : Possible correctness
>   issue on stream-stream join when handling edge case

To unsubscribe e-mail: dev-unsubscr...@spark.apache.org
Re: Apache Spark 3.2.2 Release?
+1; thanks for coordinating this!

I have a few more correctness bugs to add to the list in your original email
(these were originally missing the 'correctness' JIRA label):

- https://issues.apache.org/jira/browse/SPARK-37643 : when
  charVarcharAsString is true, char datatype partition table query incorrect
- https://issues.apache.org/jira/browse/SPARK-37865 : Spark should not dedup
  the groupingExpressions when the first child of Union has duplicate columns
- https://issues.apache.org/jira/browse/SPARK-38787 : Possible correctness
  issue on stream-stream join when handling edge case

On Thu, Jul 7, 2022 at 6:12 PM Dongjoon Hyun wrote:
> Thank you all.
>
> I'll check and prepare RC1 for next week.
>
> Dongjoon.
Re: Apache Spark 3.2.2 Release?
Thank you all. I'll check and prepare RC1 for next week. Dongjoon.
Re: Apache Spark 3.2.2 Release?
+1 (non-binding)

Thanks!

On Thu, Jul 7, 2022 at 7:00 AM Yang,Jie(INF) wrote:
> +1 (non-binding) Thank you Dongjoon ~
Re: Apache Spark 3.2.2 Release?
+1 (non-binding) Thank you Dongjoon ~

From: Ruifeng Zheng
Date: Thursday, July 7, 2022, 16:28
To: dev
Subject: Re: Apache Spark 3.2.2 Release?

+1 thank you Dongjoon!
Re: Apache Spark 3.2.2 Release?
+1 thank you Dongjoon!

Ruifeng Zheng
ruife...@foxmail.com

-- Original --
From: "Yikun Jiang"
Date: Thu, Jul 7, 2022 04:16 PM
Re: Apache Spark 3.2.2 Release?
+1 (non-binding)

Thanks!

Regards,
Yikun

On Thu, Jul 7, 2022 at 1:57 PM Mridul Muralidharan wrote:
> +1
>
> Thanks for driving this Dongjoon!
>
> Regards,
> Mridul
Re: Apache Spark 3.2.2 Release?
+1

Thanks for driving this Dongjoon!

Regards,
Mridul

On Thu, Jul 7, 2022 at 12:36 AM Gengliang Wang wrote:
> +1.
> Thank you, Dongjoon.
Re: Apache Spark 3.2.2 Release?
+1.
Thank you, Dongjoon.

On Wed, Jul 6, 2022 at 10:21 PM Wenchen Fan wrote:
> +1
Re: Apache Spark 3.2.2 Release?
+1

On Thu, Jul 7, 2022 at 10:41 AM Xinrong Meng wrote:
> +1
>
> Thanks!
>
> Xinrong Meng
> Software Engineer
> Databricks
Re: Apache Spark 3.2.2 Release?
+1

Thanks!

Xinrong Meng
Software Engineer
Databricks

On Wed, Jul 6, 2022 at 7:25 PM Xiao Li wrote:
> +1
>
> Xiao
Re: Apache Spark 3.2.2 Release?
+1

Xiao

On Wed, Jul 6, 2022 at 19:16, Cheng Su wrote:
> +1 (non-binding)
>
> Thanks,
> Cheng Su
Re: Apache Spark 3.2.2 Release?
+1 (non-binding)

Thanks,
Cheng Su

On Wed, Jul 6, 2022 at 6:01 PM Yuming Wang wrote:
> +1
Re: Apache Spark 3.2.2 Release?
+1

On Thu, Jul 7, 2022 at 5:53 AM Maxim Gekk wrote:
> +1
Re: Apache Spark 3.2.2 Release?
+1

On Thu, Jul 7, 2022 at 12:26 AM John Zhuge wrote:
> +1 Thanks for the effort!
Re: Apache Spark 3.2.2 Release?
+1 Thanks for the effort!

On Wed, Jul 6, 2022 at 2:23 PM Bjørn Jørgensen wrote:
> +1

--
John Zhuge
Re: Apache Spark 3.2.2 Release?
+1

On Wed, Jul 6, 2022 at 23:05, Hyukjin Kwon wrote:
> Yeah +1
Re: Apache Spark 3.2.2 Release?
Yeah +1

On Thu, Jul 7, 2022 at 5:40 AM Dongjoon Hyun wrote:
> Hi, All.
> [...]
Apache Spark 3.2.2 Release?
Hi, All.

Since the Apache Spark 3.2.1 tag creation (Jan 19), 197 new patches,
including 11 correctness patches, have arrived at branch-3.2.

Shall we make a new release, Apache Spark 3.2.2, as the third release
in the 3.2 line? I'd like to volunteer as the release manager for Apache
Spark 3.2.2. I'm thinking about starting the first RC next week.

$ git log --oneline v3.2.1..HEAD | wc -l
     197

# Correctness issues

SPARK-38075     Hive script transform with order by and limit will return fake rows
SPARK-38204     All state operators are at a risk of inconsistency between state partitioning and operator partitioning
SPARK-38309     SHS has incorrect percentiles for shuffle read bytes and shuffle total blocks metrics
SPARK-38320     (flat)MapGroupsWithState can timeout groups which just received inputs in the same microbatch
SPARK-38614     After Spark update, df.show() shows incorrect F.percent_rank results
SPARK-38655     OffsetWindowFunctionFrameBase cannot find the offset row whose input is not null
SPARK-38684     Stream-stream outer join has a possible correctness issue due to weakly read consistent on outer iterators
SPARK-39061     Incorrect results or NPE when using Inline function against an array of dynamically created structs
SPARK-39107     Silent change in regexp_replace's handling of empty strings
SPARK-39259     Timestamps returned by now() and equivalent functions are not consistent in subqueries
SPARK-39293     The accumulator of ArrayAggregate should copy the intermediate result if string, struct, array, or map

Best,
Dongjoon.

To unsubscribe e-mail: dev-unsubscr...@spark.apache.org
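For anyone who wants to cross-check the correctness list against the branch, here is a minimal Python sketch. It assumes the output of `git log --oneline v3.2.1..HEAD` has been captured as text and that commit subjects carry bracketed JIRA tags such as `[SPARK-38075]`. The function name `summarize_oneline_log`, the sample hashes, and the sample subject lines are hypothetical and only there for illustration; the JIRA IDs are the eleven listed in the mail.

```python
import re
from typing import Iterable, List, Set, Tuple

# Matches bracketed JIRA tags such as "[SPARK-38075]" in a commit subject.
JIRA_TAG = re.compile(r"\[(SPARK-\d+)\]")

# The eleven correctness issues listed in the mail.
CORRECTNESS_IDS: Set[str] = {
    "SPARK-38075", "SPARK-38204", "SPARK-38309", "SPARK-38320",
    "SPARK-38614", "SPARK-38655", "SPARK-38684", "SPARK-39061",
    "SPARK-39107", "SPARK-39259", "SPARK-39293",
}


def summarize_oneline_log(
    lines: Iterable[str], wanted_ids: Set[str]
) -> Tuple[int, List[str]]:
    """Return (commit count, wanted JIRA IDs found) for `git log --oneline` text.

    Each non-empty line counts as one commit, mirroring `wc -l`.
    IDs are reported once each, in the order they first appear.
    """
    total = 0
    found: List[str] = []
    for line in lines:
        if not line.strip():
            continue  # ignore blank lines, e.g. a trailing newline
        total += 1
        for jira_id in JIRA_TAG.findall(line):
            if jira_id in wanted_ids and jira_id not in found:
                found.append(jira_id)
    return total, found


# Hypothetical sample input: hashes and subject text are made up.
SAMPLE_LOG = """\
0000001 [SPARK-39107][SQL] Hypothetical subject for the regexp_replace fix
0000002 [SPARK-12345][DOCS] Hypothetical unrelated change
0000003 [SPARK-38075][SQL] Hypothetical subject for the script transform fix
"""

if __name__ == "__main__":
    total, found = summarize_oneline_log(SAMPLE_LOG.splitlines(), CORRECTNESS_IDS)
    print(total, found)
```

To run it against a real checkout, one could save the `git log --oneline v3.2.1..HEAD` output to a file and pass its lines in place of `SAMPLE_LOG.splitlines()`. The tag-matching only works to the extent that commit subjects actually follow the bracketed convention assumed here.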