Re: Time for 2.3.2?
FYI, we currently have one blocker issue (https://issues.apache.org/jira/browse/SPARK-24535); I will start the release after it is fixed. Please also let me know if there are any other blockers or fixes that should land in the 2.3.2 release.

Thanks,
Saisai

On Wed, Jun 27, 2018 at 7:59 PM, Wenchen Fan wrote:
> Hi all,
>
> Spark 2.3.1 was released just a while ago, but unfortunately we discovered
> and fixed some critical issues afterward.
>
> SPARK-24495: SortMergeJoin may produce a wrong result.
> This is a serious correctness bug, and it is easy to hit: have a duplicated
> join key from the left table, e.g. `WHERE t1.a = t2.b AND t1.a = t2.c`, and
> a sort-merge join. This bug is only present in Spark 2.3.
>
> SPARK-24588: stream-stream join may produce a wrong result.
> This is a correctness bug in a new feature of Spark 2.3, the stream-stream
> join. Users can hit this bug if one of the join sides is partitioned by a
> subset of the join keys.
>
> SPARK-24552: task attempt numbers are reused when stages are retried.
> This is a long-standing bug in the output committer that may introduce data
> corruption.
>
> SPARK-24542: UDFXPath allows users to pass carefully crafted XML to
> access arbitrary files.
> This is a potential security issue if users build an access control module
> on top of Spark.
>
> I think we need a Spark 2.3.2 to address these issues (especially the
> correctness bugs) ASAP. Any thoughts?
>
> Thanks,
> Wenchen
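The SPARK-24588 failure mode quoted above — one join side partitioned by only a subset of the join keys — comes down to a co-partitioning violation, which can be sketched without Spark. This is a toy model, not Spark's actual partitioner; the key values and partition count are made up for illustration:

```python
# SPARK-24588 in miniature: a partitioned join only finds a match when
# both copies of a key land in the same partition. If one join side is
# hash-partitioned by the full join key (k1, k2) while the other side is
# partitioned by only the subset (k1,), rows that should join can be
# routed to different partitions, and the match is silently dropped.

NUM_PARTITIONS = 4

def partition(keys):
    # Toy hash partitioner: not Spark's, but enough to show the mismatch.
    return sum(keys) % NUM_PARTITIONS

left_keys = (1, 2)    # left side partitioned by the full key (k1, k2)
right_keys = (1, 2)   # same logical key, so the rows should join

p_left = partition(left_keys)         # routed using (k1, k2)
p_right = partition(right_keys[:1])   # routed using (k1,) only

# The matching rows sit in different partitions, so a per-partition
# join would never pair them up.
mismatch = p_left != p_right
```

The fix in Spark was to require compatible partitioning on both sides of the stream-stream join rather than accepting any subset of the keys.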
Re: Time for 2.3.2?
I will start preparing the release.

Thanks
Re: Time for 2.3.2?
+1  Looking forward to the critical fixes in 2.3.2.

--
John Zhuge
Re: Time for 2.3.2?
+1. We are evaluating 2.3.1; please release Spark 2.3.2 ASAP.

Thanks,
Yucai
Re: Time for 2.3.2?
+1. We need to release Spark 2.3.2 ASAP.

Thanks,
Venkata Ramana Gollamudi
Re: Time for 2.3.2?
+1

--
Ryan Blue
Software Engineer
Netflix
Re: Time for 2.3.2?
+1. Thanks, Saisai!

The impact of SPARK-24495 is large. We should release Spark 2.3.2 ASAP.

Thanks,

Xiao
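Xiao's point about SPARK-24495 concerns exactly the query shape in Wenchen's summary: a sort-merge join whose condition reuses one left-side column against two right-side columns (`WHERE t1.a = t2.b AND t1.a = t2.c`) with duplicated join keys. A minimal, self-contained sketch of what the correct semantics require — plain Python, not Spark's implementation, and the tables are made up:

```python
# Correct sort-merge inner join on t1.a = t2.b, with the extra predicate
# t1.a = t2.c applied afterwards -- the shape of query that triggered
# SPARK-24495. Duplicate-key runs are where a buggy sort-merge join can
# silently drop or duplicate rows.

def sort_merge_join(left, right, lkey, rkey):
    """Join two lists of dicts on left[lkey] == right[rkey]."""
    left = sorted(left, key=lambda r: r[lkey])
    right = sorted(right, key=lambda r: r[rkey])
    out, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        lv, rv = left[i][lkey], right[j][rkey]
        if lv < rv:
            i += 1
        elif lv > rv:
            j += 1
        else:
            # Gather the full run of equal keys on each side; every pair
            # in the cross product of the two runs must be emitted.
            i2 = i
            while i2 < len(left) and left[i2][lkey] == lv:
                i2 += 1
            j2 = j
            while j2 < len(right) and right[j2][rkey] == rv:
                j2 += 1
            for l in left[i:i2]:
                for r in right[j:j2]:
                    out.append({**l, **r})
            i, j = i2, j2
    return out

t1 = [{"a": 1}, {"a": 1}, {"a": 2}]  # duplicated join key on the left
t2 = [{"b": 1, "c": 1}, {"b": 1, "c": 9}, {"b": 2, "c": 2}]

joined = sort_merge_join(t1, t2, "a", "b")
result = [r for r in joined if r["a"] == r["c"]]  # ... AND t1.a = t2.c
```

The duplicated `a = 1` rows on the left must each match both `b = 1` rows on the right before the second predicate filters the cross product; losing any of those pairs is the wrong-result symptom the bug report describes.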
Re: Time for 2.3.2?
Yap will do

From: Marcelo Vanzin
Sent: Thursday, June 28, 2018 9:04:41 AM
To: Felix Cheung
Cc: Spark dev list
Subject: Re: Time for 2.3.2?

Could you mark that bug as blocker and set the target version, in that case?
Re: Time for 2.3.2?
Could you mark that bug as blocker and set the target version, in that case?

On Thu, Jun 28, 2018 at 8:46 AM, Felix Cheung wrote:
> +1
>
> I’d like to fix SPARK-24535 first though

--
Marcelo
Re: Time for 2.3.2?
+1

I’d like to fix SPARK-24535 first though
Re: Time for 2.3.2?
+1 makes sense.

--
Stavros Kontopoulos
Senior Software Engineer
Lightbend, Inc.
p: +30 6977967274
e: stavros.kontopou...@lightbend.com
Re: Time for 2.3.2?
+1 too. I'd also consider including SPARK-24208 if we can solve it timely...
Re: Time for 2.3.2?
+1, I heard some Spark users have skipped v2.3.1 because of these bugs.

--
Takeshi Yamamuro
Re: Time for 2.3.2?
+1
Re: Time for 2.3.2?
Hi Saisai, that's great! Please go ahead!
Re: Time for 2.3.2?
+1. As Marcelo mentioned, these issues seem quite severe.

I can work on the release if we're short of hands :).

Thanks
Jerry
Re: Time for 2.3.2?
+1. SPARK-24589 / SPARK-24552 are kinda nasty and we should get fixes for those out.

(Those are what delayed 2.2.2 and 2.1.3, for those watching...)

--
Marcelo
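The SPARK-24552 hazard Marcelo flags — task attempt numbers being reused across stage retries — matters because an output committer may key its temporary output on (task, attempt). A toy model of the collision; the naming scheme below is hypothetical, not Spark's actual committer:

```python
# Why reusing task attempt numbers across stage retries is dangerous for
# an output committer: if attempt numbers restart at 0 on each stage
# retry, two physically distinct attempts of the same task can claim the
# same output file, and a zombie attempt from the old stage can clobber
# the new stage's committed data.

def attempt_file(stage_attempt, task, task_attempt, per_stage_numbering):
    """Temp-file name a naive committer might use for a task attempt."""
    if per_stage_numbering:
        # Buggy scheme: the name ignores which stage attempt ran the task.
        return f"task_{task}_attempt_{task_attempt}"
    # Safe scheme: fold the stage attempt into the name so every
    # physical attempt writes to a unique file.
    return f"stage_{stage_attempt}_task_{task}_attempt_{task_attempt}"

# Task 0 runs in the original stage (stage attempt 0) and again after a
# stage retry (stage attempt 1); both task attempts are numbered 0.
buggy = {attempt_file(s, 0, 0, True) for s in (0, 1)}
safe = {attempt_file(s, 0, 0, False) for s in (0, 1)}
```

With per-stage numbering the set collapses to a single file name — the two attempts collide — while the safe scheme keeps one file per physical attempt, which is the general direction of the fix.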