Re: Time for 2.3.2?

2018-07-03 Thread Saisai Shao
FYI, we currently have one blocker issue (
https://issues.apache.org/jira/browse/SPARK-24535); I will start the release
after it is fixed.

Also, please let me know if there are other blockers or fixes that need to land
in the 2.3.2 release.

Thanks
Saisai


Re: Time for 2.3.2?

2018-07-01 Thread Saisai Shao
I will start preparing the release.

Thanks


Re: Time for 2.3.2?

2018-06-29 Thread John Zhuge
+1  Looking forward to the critical fixes in 2.3.2.


Re: Time for 2.3.2?

2018-06-29 Thread Yu, Yucai
+1. We are evaluating 2.3.1; please release Spark 2.3.2 ASAP.

Thanks,
Yucai


Re: Time for 2.3.2?

2018-06-29 Thread gvramana
+1. We need to release Spark 2.3.2 ASAP.

Thanks,
Venkata Ramana Gollamudi






Re: Time for 2.3.2?

2018-06-28 Thread Ryan Blue
+1


-- 
Ryan Blue
Software Engineer
Netflix


Re: Time for 2.3.2?

2018-06-28 Thread Xiao Li
+1. Thanks, Saisai!

The impact of SPARK-24495 is large. We should release Spark 2.3.2 ASAP.

Thanks,

Xiao



Re: Time for 2.3.2?

2018-06-28 Thread Felix Cheung
Yep, will do.



Re: Time for 2.3.2?

2018-06-28 Thread Marcelo Vanzin
Could you mark that bug as blocker and set the target version, in that case?


-- 
Marcelo


Re: Time for 2.3.2?

2018-06-28 Thread Felix Cheung
+1

I’d like to fix SPARK-24535 first though




Re: Time for 2.3.2?

2018-06-28 Thread Stavros Kontopoulos
+1 makes sense.



-- 
Stavros Kontopoulos
Senior Software Engineer
Lightbend, Inc.
p: +30 6977967274
e: stavros.kontopou...@lightbend.com


Re: Time for 2.3.2?

2018-06-28 Thread Marco Gaido
+1 too, I'd also consider including SPARK-24208 if we can solve it in
time...



Re: Time for 2.3.2?

2018-06-28 Thread Takeshi Yamamuro
+1, I heard some Spark users have skipped v2.3.1 because of these bugs.


-- 
---
Takeshi Yamamuro


Re: Time for 2.3.2?

2018-06-28 Thread Xingbo Jiang
+1



Re: Time for 2.3.2?

2018-06-28 Thread Wenchen Fan
Hi Saisai, that's great! Please go ahead!



Re: Time for 2.3.2?

2018-06-27 Thread Saisai Shao
+1, as Marcelo mentioned, these issues seem quite severe.

I can work on the release if we are short of hands :).

Thanks
Jerry




Re: Time for 2.3.2?

2018-06-27 Thread Marcelo Vanzin
+1. SPARK-24589 / SPARK-24552 are kinda nasty and we should get fixes
for those out.

(Those are what delayed 2.2.2 and 2.1.3 for those watching...)

On Wed, Jun 27, 2018 at 7:59 PM, Wenchen Fan  wrote:
> Hi all,
>
> Spark 2.3.1 was released just a while ago, but unfortunately we discovered
> and fixed some critical issues afterward.
>
> SPARK-24495: SortMergeJoin may produce wrong results.
> This is a serious correctness bug and is easy to hit: the left table has a
> duplicated join key, e.g. `WHERE t1.a = t2.b AND t1.a = t2.c`, and the
> join is a sort-merge join. This bug is only present in Spark 2.3.
>
> SPARK-24588: stream-stream join may produce wrong results
> This is a correctness bug in a new feature of Spark 2.3: the stream-stream
> join. Users can hit this bug if one of the join sides is partitioned by a
> subset of the join keys.
>
> SPARK-24552: Task attempt numbers are reused when stages are retried
> This is a long-standing bug in the output committer that may introduce data
> corruption.
>
> SPARK-24542: UDFXPath allows users to pass carefully crafted XML to
> access arbitrary files.
> This is a potential security issue if users build an access control module
> on top of Spark.
>
> I think we need a Spark 2.3.2 to address these issues (especially the
> correctness bugs) ASAP. Any thoughts?
>
> Thanks,
> Wenchen
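
[For illustration: a minimal, hypothetical sketch of the SPARK-24495 query shape described in the mail above. The predicate `t1.a = t2.b AND t1.a = t2.c` comes from the mail; the data, column types, and session settings are invented. It only forces a sort-merge join with a duplicated left-side key and is not claimed to reproduce the wrong results by itself.]

```scala
import org.apache.spark.sql.SparkSession

object SortMergeJoinShape {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("SPARK-24495-query-shape")
      .master("local[2]")
      // Disable broadcast joins so the planner falls back to SortMergeJoin.
      .config("spark.sql.autoBroadcastJoinThreshold", "-1")
      .getOrCreate()
    import spark.implicits._

    // Hypothetical tables: t1(a) on the left, t2(b, c) on the right.
    Seq(1, 2, 3).toDF("a").createOrReplaceTempView("t1")
    Seq((1, 1), (2, 2), (3, 4)).toDF("b", "c").createOrReplaceTempView("t2")

    // The left-side key t1.a appears in both equi-conditions, the
    // "duplicated join key from the left table" pattern from the mail.
    val joined = spark.sql(
      "SELECT * FROM t1 JOIN t2 ON t1.a = t2.b AND t1.a = t2.c")

    joined.explain() // the physical plan should contain SortMergeJoin
    joined.show()

    spark.stop()
  }
}
```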

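[Similarly, a rough sketch of the SPARK-24588 shape: a stream-stream join where one side has been repartitioned by only a subset of the join keys. The rate source, column names, and key expressions are all invented; this shows the shape, not a guaranteed reproduction of the bug.]

```scala
import org.apache.spark.sql.SparkSession

object StreamStreamJoinShape {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("SPARK-24588-query-shape")
      .master("local[2]")
      .getOrCreate()
    import spark.implicits._

    // Two rate sources stand in for real input streams.
    val left = spark.readStream.format("rate").option("rowsPerSecond", "5").load()
      .select(($"value" % 10).as("k1"), ($"value" % 3).as("k2"), $"value".as("leftVal"))

    val right = spark.readStream.format("rate").option("rowsPerSecond", "5").load()
      .select(($"value" % 10).as("k1"), ($"value" % 3).as("k2"), $"value".as("rightVal"))
      // Repartitioning by k1 alone leaves this side partitioned by a
      // subset of the join keys (k1, k2), the condition called out above.
      .repartition($"k1")

    // Inner stream-stream join on both keys (a Spark 2.3 feature).
    val joined = left.join(right, Seq("k1", "k2"))

    val query = joined.writeStream
      .format("console")
      .outputMode("append")
      .start()

    query.awaitTermination(30000) // run for ~30 seconds
    query.stop()
    spark.stop()
  }
}
```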

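[Finally, for SPARK-24542, a small spark-shell style example of the built-in XPath UDFs the issue concerns (xpath, xpath_string, xpath_int, ...). The XML literal is a harmless placeholder; the point is only that these functions evaluate arbitrary user-supplied XML and XPath strings, which is why access-control layers built on Spark SQL need to treat them carefully.]

```scala
// Assumes an existing SparkSession named `spark`, e.g. inside spark-shell.
val df = spark.sql(
  "SELECT xpath_string('<a><b>hello</b></a>', 'a/b') AS extracted")
df.show()
// Expected: one row with extracted = "hello". SPARK-24542 concerns what can
// happen when the XML argument is attacker-controlled and carefully crafted.
```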

-- 
Marcelo

-
To unsubscribe e-mail: dev-unsubscr...@spark.apache.org