Re: [VOTE] SPARK 2.3.2 (RC3)

2018-08-06 Thread Saisai Shao
Yes, there will be an RC4; I'm still waiting on the fix for one issue.

Yuval Itzchakov wrote on Mon, Aug 6, 2018 at 6:10 PM:

> Are there any plans to create an RC4? There's an important Kafka Source leak
> fix I've merged back to the 2.3 branch.
>
>
>
> --
> Sent from: http://apache-spark-developers-list.1001551.n3.nabble.com/
>
> -
> To unsubscribe e-mail: dev-unsubscr...@spark.apache.org
>
>


Re: [VOTE] SPARK 2.3.2 (RC3)

2018-08-06 Thread Yuval Itzchakov
Are there any plans to create an RC4? There's an important Kafka Source leak
fix I've merged back to the 2.3 branch.






Re: [VOTE] SPARK 2.3.2 (RC3)

2018-07-30 Thread Wenchen Fan
Two more correctness bug fixes were merged to 2.3 today:
https://issues.apache.org/jira/browse/SPARK-24934
https://issues.apache.org/jira/browse/SPARK-24957

On Mon, Jul 30, 2018 at 1:19 PM Xiao Li  wrote:

> Sounds good to me. Thanks! Today, we merged another correctness fix
> https://github.com/apache/spark/pull/21772.
>
> Xiao
>
> 2018-07-29 18:31 GMT-07:00 Saisai Shao :
>
>> Sure, I will do the next RC. I'm still waiting for a CVE fix; if it can
>> be done in the next two days, I will include that one as well.
>>
>> Xiao Li wrote on Sat, Jul 28, 2018 at 12:05 AM:
>>
>>> The following blocker/important fixes have been merged to Spark 2.3
>>> branch:
>>>
>>> https://issues.apache.org/jira/browse/SPARK-24927
>>> https://issues.apache.org/jira/browse/SPARK-24867
>>> https://issues.apache.org/jira/browse/SPARK-24891
>>>
>>> *Saisai*, could you start the next RC?
>>>
>>> Thanks,
>>>
>>> Xiao
>>>
>>>
>>> 2018-07-20 14:21 GMT-07:00 Tom Graves :
>>>
 fyi, I merged in a couple of JIRAs that were critical (and that I thought
 would be good to include in the next release); they will be included if we
 spin another RC. We should update the JIRAs SPARK-24755 and SPARK-24677.
 If anyone disagrees we could back those out, but I think they would be
 good to include.

 Tom

 On Thursday, July 19, 2018, 8:13:23 PM CDT, Saisai Shao <
 sai.sai.s...@gmail.com> wrote:


 Sure, I can wait for this and create another RC then.

 Thanks,
 Saisai

 Xiao Li wrote on Fri, Jul 20, 2018 at 9:11 AM:

 Yes. https://issues.apache.org/jira/browse/SPARK-24867 is the one I
 created. The PR has been created. Since this is not rare, let us merge it
 to 2.3.2?

 Reynold's PR is to get rid of AnalysisBarrier. That is better than the
 multiple patches we added for AnalysisBarrier after the 2.3.0 release. We
 can target it to 2.4.

 Thanks,

 Xiao

 2018-07-19 17:48 GMT-07:00 Saisai Shao :

 I see, thanks Reynold.

 Reynold Xin wrote on Fri, Jul 20, 2018 at 8:46 AM:

 Looking at the list of pull requests it looks like this is the ticket:
 https://issues.apache.org/jira/browse/SPARK-24867



 On Thu, Jul 19, 2018 at 5:25 PM Reynold Xin 
 wrote:

 I don't think my ticket should block this release. It's a big general
 refactoring.

 Xiao do you have a ticket for the bug you found?


 On Thu, Jul 19, 2018 at 5:24 PM Saisai Shao 
 wrote:

 Hi Xiao,

 Are you referring to this JIRA (
 https://issues.apache.org/jira/browse/SPARK-24865)?

 Xiao Li wrote on Fri, Jul 20, 2018 at 2:41 AM:

 dfWithUDF.cache()
 dfWithUDF.write.saveAsTable("t")
 dfWithUDF.write.saveAsTable("t1")


 Cached data is not being used. It causes a big performance regression.




 2018-07-19 11:32 GMT-07:00 Sean Owen :

 What regression are you referring to here? A -1 vote really needs a
 rationale.

 On Thu, Jul 19, 2018 at 1:27 PM Xiao Li  wrote:

 I would first vote -1.

 I might find another regression caused by the analysis barrier. Will
 keep you posted.




>>>
>


Re: [VOTE] SPARK 2.3.2 (RC3)

2018-07-29 Thread Xiao Li
Sounds good to me. Thanks! Today, we merged another correctness fix
https://github.com/apache/spark/pull/21772.

Xiao

2018-07-29 18:31 GMT-07:00 Saisai Shao :

> Sure, I will do the next RC. I'm still waiting for a CVE fix; if it can be
> done in the next two days, I will include that one as well.
>
> Xiao Li wrote on Sat, Jul 28, 2018 at 12:05 AM:
>
>> The following blocker/important fixes have been merged to Spark 2.3
>> branch:
>>
>> https://issues.apache.org/jira/browse/SPARK-24927
>> https://issues.apache.org/jira/browse/SPARK-24867
>> https://issues.apache.org/jira/browse/SPARK-24891
>>
>> *Saisai*, could you start the next RC?
>>
>> Thanks,
>>
>> Xiao
>>
>>
>> 2018-07-20 14:21 GMT-07:00 Tom Graves :
>>
>>> fyi, I merged in a couple of JIRAs that were critical (and that I thought
>>> would be good to include in the next release); they will be included if we
>>> spin another RC. We should update the JIRAs SPARK-24755 and SPARK-24677.
>>> If anyone disagrees we could back those out, but I think they would be
>>> good to include.
>>>
>>> Tom
>>>
>>> On Thursday, July 19, 2018, 8:13:23 PM CDT, Saisai Shao <
>>> sai.sai.s...@gmail.com> wrote:
>>>
>>>
>>> Sure, I can wait for this and create another RC then.
>>>
>>> Thanks,
>>> Saisai
>>>
>>> Xiao Li wrote on Fri, Jul 20, 2018 at 9:11 AM:
>>>
>>> Yes. https://issues.apache.org/jira/browse/SPARK-24867 is the one I
>>> created. The PR has been created. Since this is not rare, let us merge it
>>> to 2.3.2?
>>>
>>> Reynold's PR is to get rid of AnalysisBarrier. That is better than the
>>> multiple patches we added for AnalysisBarrier after the 2.3.0 release. We
>>> can target it to 2.4.
>>>
>>> Thanks,
>>>
>>> Xiao
>>>
>>> 2018-07-19 17:48 GMT-07:00 Saisai Shao :
>>>
>>> I see, thanks Reynold.
>>>
>>> Reynold Xin wrote on Fri, Jul 20, 2018 at 8:46 AM:
>>>
>>> Looking at the list of pull requests it looks like this is the ticket:
>>> https://issues.apache.org/jira/browse/SPARK-24867
>>>
>>>
>>>
>>> On Thu, Jul 19, 2018 at 5:25 PM Reynold Xin  wrote:
>>>
>>> I don't think my ticket should block this release. It's a big general
>>> refactoring.
>>>
>>> Xiao do you have a ticket for the bug you found?
>>>
>>>
>>> On Thu, Jul 19, 2018 at 5:24 PM Saisai Shao 
>>> wrote:
>>>
>>> Hi Xiao,
>>>
>>> Are you referring to this JIRA (
>>> https://issues.apache.org/jira/browse/SPARK-24865)?
>>>
>>> Xiao Li wrote on Fri, Jul 20, 2018 at 2:41 AM:
>>>
>>> dfWithUDF.cache()
>>> dfWithUDF.write.saveAsTable("t")
>>> dfWithUDF.write.saveAsTable("t1")
>>>
>>>
>>> Cached data is not being used. It causes a big performance regression.
>>>
>>>
>>>
>>>
>>> 2018-07-19 11:32 GMT-07:00 Sean Owen :
>>>
>>> What regression are you referring to here? A -1 vote really needs a
>>> rationale.
>>>
>>> On Thu, Jul 19, 2018 at 1:27 PM Xiao Li  wrote:
>>>
>>> I would first vote -1.
>>>
>>> I might find another regression caused by the analysis barrier. Will
>>> keep you posted.
>>>
>>>
>>>
>>>
>>


Re: [VOTE] SPARK 2.3.2 (RC3)

2018-07-29 Thread Saisai Shao
Sure, I will do the next RC. I'm still waiting for a CVE fix; if it can be
done in the next two days, I will include that one as well.

Xiao Li wrote on Sat, Jul 28, 2018 at 12:05 AM:

> The following blocker/important fixes have been merged to Spark 2.3 branch:
>
> https://issues.apache.org/jira/browse/SPARK-24927
> https://issues.apache.org/jira/browse/SPARK-24867
> https://issues.apache.org/jira/browse/SPARK-24891
>
> *Saisai*, could you start the next RC?
>
> Thanks,
>
> Xiao
>
>
> 2018-07-20 14:21 GMT-07:00 Tom Graves :
>
>> fyi, I merged in a couple of JIRAs that were critical (and that I thought
>> would be good to include in the next release); they will be included if we
>> spin another RC. We should update the JIRAs SPARK-24755 and SPARK-24677.
>> If anyone disagrees we could back those out, but I think they would be
>> good to include.
>>
>> Tom
>>
>> On Thursday, July 19, 2018, 8:13:23 PM CDT, Saisai Shao <
>> sai.sai.s...@gmail.com> wrote:
>>
>>
>> Sure, I can wait for this and create another RC then.
>>
>> Thanks,
>> Saisai
>>
>> Xiao Li wrote on Fri, Jul 20, 2018 at 9:11 AM:
>>
>> Yes. https://issues.apache.org/jira/browse/SPARK-24867 is the one I
>> created. The PR has been created. Since this is not rare, let us merge it
>> to 2.3.2?
>>
>> Reynold's PR is to get rid of AnalysisBarrier. That is better than the
>> multiple patches we added for AnalysisBarrier after the 2.3.0 release. We
>> can target it to 2.4.
>>
>> Thanks,
>>
>> Xiao
>>
>> 2018-07-19 17:48 GMT-07:00 Saisai Shao :
>>
>> I see, thanks Reynold.
>>
>> Reynold Xin wrote on Fri, Jul 20, 2018 at 8:46 AM:
>>
>> Looking at the list of pull requests it looks like this is the ticket:
>> https://issues.apache.org/jira/browse/SPARK-24867
>>
>>
>>
>> On Thu, Jul 19, 2018 at 5:25 PM Reynold Xin  wrote:
>>
>> I don't think my ticket should block this release. It's a big general
>> refactoring.
>>
>> Xiao do you have a ticket for the bug you found?
>>
>>
>> On Thu, Jul 19, 2018 at 5:24 PM Saisai Shao 
>> wrote:
>>
>> Hi Xiao,
>>
>> Are you referring to this JIRA (
>> https://issues.apache.org/jira/browse/SPARK-24865)?
>>
>> Xiao Li wrote on Fri, Jul 20, 2018 at 2:41 AM:
>>
>> dfWithUDF.cache()
>> dfWithUDF.write.saveAsTable("t")
>> dfWithUDF.write.saveAsTable("t1")
>>
>>
>> Cached data is not being used. It causes a big performance regression.
>>
>>
>>
>>
>> 2018-07-19 11:32 GMT-07:00 Sean Owen :
>>
>> What regression are you referring to here? A -1 vote really needs a
>> rationale.
>>
>> On Thu, Jul 19, 2018 at 1:27 PM Xiao Li  wrote:
>>
>> I would first vote -1.
>>
>> I might find another regression caused by the analysis barrier. Will keep
>> you posted.
>>
>>
>>
>>
>


Re: [VOTE] SPARK 2.3.2 (RC3)

2018-07-27 Thread Xiao Li
The following blocker/important fixes have been merged to Spark 2.3 branch:

https://issues.apache.org/jira/browse/SPARK-24927
https://issues.apache.org/jira/browse/SPARK-24867
https://issues.apache.org/jira/browse/SPARK-24891

*Saisai*, could you start the next RC?

Thanks,

Xiao


2018-07-20 14:21 GMT-07:00 Tom Graves :

> fyi, I merged in a couple of JIRAs that were critical (and that I thought
> would be good to include in the next release); they will be included if we
> spin another RC. We should update the JIRAs SPARK-24755 and SPARK-24677.
> If anyone disagrees we could back those out, but I think they would be
> good to include.
>
> Tom
>
> On Thursday, July 19, 2018, 8:13:23 PM CDT, Saisai Shao <
> sai.sai.s...@gmail.com> wrote:
>
>
> Sure, I can wait for this and create another RC then.
>
> Thanks,
> Saisai
>
> Xiao Li wrote on Fri, Jul 20, 2018 at 9:11 AM:
>
> Yes. https://issues.apache.org/jira/browse/SPARK-24867 is the one I
> created. The PR has been created. Since this is not rare, let us merge it
> to 2.3.2?
>
> Reynold's PR is to get rid of AnalysisBarrier. That is better than the
> multiple patches we added for AnalysisBarrier after the 2.3.0 release. We
> can target it to 2.4.
>
> Thanks,
>
> Xiao
>
> 2018-07-19 17:48 GMT-07:00 Saisai Shao :
>
> I see, thanks Reynold.
>
> Reynold Xin wrote on Fri, Jul 20, 2018 at 8:46 AM:
>
> Looking at the list of pull requests it looks like this is the ticket:
> https://issues.apache.org/jira/browse/SPARK-24867
>
>
>
> On Thu, Jul 19, 2018 at 5:25 PM Reynold Xin  wrote:
>
> I don't think my ticket should block this release. It's a big general
> refactoring.
>
> Xiao do you have a ticket for the bug you found?
>
>
> On Thu, Jul 19, 2018 at 5:24 PM Saisai Shao 
> wrote:
>
> Hi Xiao,
>
> Are you referring to this JIRA (
> https://issues.apache.org/jira/browse/SPARK-24865)?
>
> Xiao Li wrote on Fri, Jul 20, 2018 at 2:41 AM:
>
> dfWithUDF.cache()
> dfWithUDF.write.saveAsTable("t")
> dfWithUDF.write.saveAsTable("t1")
>
>
> Cached data is not being used. It causes a big performance regression.
>
>
>
>
> 2018-07-19 11:32 GMT-07:00 Sean Owen :
>
> What regression are you referring to here? A -1 vote really needs a
> rationale.
>
> On Thu, Jul 19, 2018 at 1:27 PM Xiao Li  wrote:
>
> I would first vote -1.
>
> I might find another regression caused by the analysis barrier. Will keep
> you posted.
>
>
>
>


Re: [VOTE] SPARK 2.3.2 (RC3)

2018-07-20 Thread Tom Graves
fyi, I merged in a couple of JIRAs that were critical (and that I thought would
be good to include in the next release); they will be included if we spin
another RC. We should update the JIRAs SPARK-24755 and SPARK-24677. If anyone
disagrees we could back those out, but I think they would be good to include.
Tom
On Thursday, July 19, 2018, 8:13:23 PM CDT, Saisai Shao 
 wrote:  
 
 Sure, I can wait for this and create another RC then.
Thanks,
Saisai
Xiao Li wrote on Fri, Jul 20, 2018 at 9:11 AM:

Yes. https://issues.apache.org/jira/browse/SPARK-24867 is the one I created. 
The PR has been created. Since this is not rare, let us merge it to 2.3.2? 
Reynold's PR is to get rid of AnalysisBarrier. That is better than the multiple
patches we added for AnalysisBarrier after the 2.3.0 release. We can target it
to 2.4.
Thanks, 
Xiao
2018-07-19 17:48 GMT-07:00 Saisai Shao :

I see, thanks Reynold.
Reynold Xin wrote on Fri, Jul 20, 2018 at 8:46 AM:

Looking at the list of pull requests it looks like this is the ticket: 
https://issues.apache.org/jira/browse/SPARK-24867


On Thu, Jul 19, 2018 at 5:25 PM Reynold Xin  wrote:

I don't think my ticket should block this release. It's a big general 
refactoring.
Xiao do you have a ticket for the bug you found?

On Thu, Jul 19, 2018 at 5:24 PM Saisai Shao  wrote:

Hi Xiao,
Are you referring to this JIRA 
(https://issues.apache.org/jira/browse/SPARK-24865)?
Xiao Li wrote on Fri, Jul 20, 2018 at 2:41 AM:

dfWithUDF.cache()
dfWithUDF.write.saveAsTable("t")
dfWithUDF.write.saveAsTable("t1")
Cached data is not being used. It causes a big performance regression. 



2018-07-19 11:32 GMT-07:00 Sean Owen :

What regression are you referring to here? A -1 vote really needs a rationale.

On Thu, Jul 19, 2018 at 1:27 PM Xiao Li  wrote:

I would first vote -1. 
I might find another regression caused by the analysis barrier. Will keep you 
posted. 










  

Re: [VOTE] SPARK 2.3.2 (RC3)

2018-07-19 Thread Saisai Shao
Sure, I can wait for this and create another RC then.

Thanks,
Saisai

Xiao Li wrote on Fri, Jul 20, 2018 at 9:11 AM:

> Yes. https://issues.apache.org/jira/browse/SPARK-24867 is the one I
> created. The PR has been created. Since this is not rare, let us merge it
> to 2.3.2?
>
> Reynold's PR is to get rid of AnalysisBarrier. That is better than the
> multiple patches we added for AnalysisBarrier after the 2.3.0 release. We
> can target it to 2.4.
>
> Thanks,
>
> Xiao
>
> 2018-07-19 17:48 GMT-07:00 Saisai Shao :
>
>> I see, thanks Reynold.
>>
>> Reynold Xin wrote on Fri, Jul 20, 2018 at 8:46 AM:
>>
>>> Looking at the list of pull requests it looks like this is the ticket:
>>> https://issues.apache.org/jira/browse/SPARK-24867
>>>
>>>
>>>
>>> On Thu, Jul 19, 2018 at 5:25 PM Reynold Xin  wrote:
>>>
 I don't think my ticket should block this release. It's a big general
 refactoring.

 Xiao do you have a ticket for the bug you found?


 On Thu, Jul 19, 2018 at 5:24 PM Saisai Shao 
 wrote:

> Hi Xiao,
>
> Are you referring to this JIRA (
> https://issues.apache.org/jira/browse/SPARK-24865)?
>
> Xiao Li wrote on Fri, Jul 20, 2018 at 2:41 AM:
>
>> dfWithUDF.cache()
>> dfWithUDF.write.saveAsTable("t")
>> dfWithUDF.write.saveAsTable("t1")
>>
>>
>> Cached data is not being used. It causes a big performance
>> regression.
>>
>>
>>
>>
>> 2018-07-19 11:32 GMT-07:00 Sean Owen :
>>
>>> What regression are you referring to here? A -1 vote really needs a
>>> rationale.
>>>
>>> On Thu, Jul 19, 2018 at 1:27 PM Xiao Li 
>>> wrote:
>>>
 I would first vote -1.

 I might find another regression caused by the analysis barrier.
 Will keep you posted.


>>
>


Re: [VOTE] SPARK 2.3.2 (RC3)

2018-07-19 Thread Xiao Li
Yes. https://issues.apache.org/jira/browse/SPARK-24867 is the one I
created. The PR has been created. Since this is not rare, let us merge it
to 2.3.2?

Reynold's PR is to get rid of AnalysisBarrier. That is better than the
multiple patches we added for AnalysisBarrier after the 2.3.0 release. We
can target it to 2.4.

Thanks,

Xiao

2018-07-19 17:48 GMT-07:00 Saisai Shao :

> I see, thanks Reynold.
>
> Reynold Xin wrote on Fri, Jul 20, 2018 at 8:46 AM:
>
>> Looking at the list of pull requests it looks like this is the ticket:
>> https://issues.apache.org/jira/browse/SPARK-24867
>>
>>
>>
>> On Thu, Jul 19, 2018 at 5:25 PM Reynold Xin  wrote:
>>
>>> I don't think my ticket should block this release. It's a big general
>>> refactoring.
>>>
>>> Xiao do you have a ticket for the bug you found?
>>>
>>>
>>> On Thu, Jul 19, 2018 at 5:24 PM Saisai Shao 
>>> wrote:
>>>
 Hi Xiao,

 Are you referring to this JIRA (
 https://issues.apache.org/jira/browse/SPARK-24865)?

 Xiao Li wrote on Fri, Jul 20, 2018 at 2:41 AM:

> dfWithUDF.cache()
> dfWithUDF.write.saveAsTable("t")
> dfWithUDF.write.saveAsTable("t1")
>
>
> Cached data is not being used. It causes a big performance regression.
>
>
>
>
> 2018-07-19 11:32 GMT-07:00 Sean Owen :
>
>> What regression are you referring to here? A -1 vote really needs a
>> rationale.
>>
>> On Thu, Jul 19, 2018 at 1:27 PM Xiao Li  wrote:
>>
>>> I would first vote -1.
>>>
>>> I might find another regression caused by the analysis barrier. Will
>>> keep you posted.
>>>
>>>
>


Re: [VOTE] SPARK 2.3.2 (RC3)

2018-07-19 Thread Saisai Shao
I see, thanks Reynold.

Reynold Xin wrote on Fri, Jul 20, 2018 at 8:46 AM:

> Looking at the list of pull requests it looks like this is the ticket:
> https://issues.apache.org/jira/browse/SPARK-24867
>
>
>
> On Thu, Jul 19, 2018 at 5:25 PM Reynold Xin  wrote:
>
>> I don't think my ticket should block this release. It's a big general
>> refactoring.
>>
>> Xiao do you have a ticket for the bug you found?
>>
>>
>> On Thu, Jul 19, 2018 at 5:24 PM Saisai Shao 
>> wrote:
>>
>>> Hi Xiao,
>>>
>>> Are you referring to this JIRA (
>>> https://issues.apache.org/jira/browse/SPARK-24865)?
>>>
>>> Xiao Li wrote on Fri, Jul 20, 2018 at 2:41 AM:
>>>
 dfWithUDF.cache()
 dfWithUDF.write.saveAsTable("t")
 dfWithUDF.write.saveAsTable("t1")


 Cached data is not being used. It causes a big performance regression.




 2018-07-19 11:32 GMT-07:00 Sean Owen :

> What regression are you referring to here? A -1 vote really needs a
> rationale.
>
> On Thu, Jul 19, 2018 at 1:27 PM Xiao Li  wrote:
>
>> I would first vote -1.
>>
>> I might find another regression caused by the analysis barrier. Will
>> keep you posted.
>>
>>



Re: [VOTE] SPARK 2.3.2 (RC3)

2018-07-19 Thread Reynold Xin
Looking at the list of pull requests it looks like this is the ticket:
https://issues.apache.org/jira/browse/SPARK-24867



On Thu, Jul 19, 2018 at 5:25 PM Reynold Xin  wrote:

> I don't think my ticket should block this release. It's a big general
> refactoring.
>
> Xiao do you have a ticket for the bug you found?
>
>
> On Thu, Jul 19, 2018 at 5:24 PM Saisai Shao 
> wrote:
>
>> Hi Xiao,
>>
>> Are you referring to this JIRA (
>> https://issues.apache.org/jira/browse/SPARK-24865)?
>>
>> Xiao Li wrote on Fri, Jul 20, 2018 at 2:41 AM:
>>
>>> dfWithUDF.cache()
>>> dfWithUDF.write.saveAsTable("t")
>>> dfWithUDF.write.saveAsTable("t1")
>>>
>>>
>>> Cached data is not being used. It causes a big performance regression.
>>>
>>>
>>>
>>>
>>> 2018-07-19 11:32 GMT-07:00 Sean Owen :
>>>
 What regression are you referring to here? A -1 vote really needs a
 rationale.

 On Thu, Jul 19, 2018 at 1:27 PM Xiao Li  wrote:

> I would first vote -1.
>
> I might find another regression caused by the analysis barrier. Will
> keep you posted.
>
>
>>>


Re: [VOTE] SPARK 2.3.2 (RC3)

2018-07-19 Thread Reynold Xin
I don't think my ticket should block this release. It's a big general
refactoring.

Xiao do you have a ticket for the bug you found?


On Thu, Jul 19, 2018 at 5:24 PM Saisai Shao  wrote:

> Hi Xiao,
>
> Are you referring to this JIRA (
> https://issues.apache.org/jira/browse/SPARK-24865)?
>
> Xiao Li wrote on Fri, Jul 20, 2018 at 2:41 AM:
>
>> dfWithUDF.cache()
>> dfWithUDF.write.saveAsTable("t")
>> dfWithUDF.write.saveAsTable("t1")
>>
>>
>> Cached data is not being used. It causes a big performance regression.
>>
>>
>>
>>
>> 2018-07-19 11:32 GMT-07:00 Sean Owen :
>>
>>> What regression are you referring to here? A -1 vote really needs a
>>> rationale.
>>>
>>> On Thu, Jul 19, 2018 at 1:27 PM Xiao Li  wrote:
>>>
 I would first vote -1.

 I might find another regression caused by the analysis barrier. Will
 keep you posted.


>>


Re: [VOTE] SPARK 2.3.2 (RC3)

2018-07-19 Thread Saisai Shao
Hi Xiao,

Are you referring to this JIRA (
https://issues.apache.org/jira/browse/SPARK-24865)?

Xiao Li wrote on Fri, Jul 20, 2018 at 2:41 AM:

> dfWithUDF.cache()
> dfWithUDF.write.saveAsTable("t")
> dfWithUDF.write.saveAsTable("t1")
>
>
> Cached data is not being used. It causes a big performance regression.
>
>
>
>
> 2018-07-19 11:32 GMT-07:00 Sean Owen :
>
>> What regression are you referring to here? A -1 vote really needs a
>> rationale.
>>
>> On Thu, Jul 19, 2018 at 1:27 PM Xiao Li  wrote:
>>
>>> I would first vote -1.
>>>
>>> I might find another regression caused by the analysis barrier. Will
>>> keep you posted.
>>>
>>>
>


Re: [VOTE] SPARK 2.3.2 (RC3)

2018-07-19 Thread Xiao Li
dfWithUDF.cache()
dfWithUDF.write.saveAsTable("t")
dfWithUDF.write.saveAsTable("t1")


Cached data is not being used. It causes a big performance regression.




2018-07-19 11:32 GMT-07:00 Sean Owen :

> What regression are you referring to here? A -1 vote really needs a
> rationale.
>
> On Thu, Jul 19, 2018 at 1:27 PM Xiao Li  wrote:
>
>> I would first vote -1.
>>
>> I might find another regression caused by the analysis barrier. Will keep
>> you posted.
>>
>>


Re: [VOTE] SPARK 2.3.2 (RC3)

2018-07-19 Thread Sean Owen
What regression are you referring to here? A -1 vote really needs a
rationale.

On Thu, Jul 19, 2018 at 1:27 PM Xiao Li  wrote:

> I would first vote -1.
>
> I might find another regression caused by the analysis barrier. Will keep
> you posted.
>
>


Re: [VOTE] SPARK 2.3.2 (RC3)

2018-07-19 Thread Xiao Li
I would first vote -1.

I might find another regression caused by the analysis barrier. Will keep
you posted.

Xiao

2018-07-18 18:05 GMT-07:00 Takeshi Yamamuro :

> +1 (non-binding)
>
> I ran the tests on an EC2 m4.2xlarge instance:
> [ec2-user]$ java -version
> openjdk version "1.8.0_171"
> OpenJDK Runtime Environment (build 1.8.0_171-b10)
> OpenJDK 64-Bit Server VM (build 25.171-b10, mixed mode)
>
>
> On Thu, Jul 19, 2018 at 5:29 AM Ryan Blue 
> wrote:
>
>> +1 (non-binding)
>>
>> On Wed, Jul 18, 2018 at 10:38 AM Denny Lee  wrote:
>>
>>> +1 (non-binding)
>>> On Tue, Jul 17, 2018 at 23:04 John Zhuge  wrote:
>>>
 +1 (non-binding)

 On Mon, Jul 16, 2018 at 8:04 PM Saisai Shao 
 wrote:

> I will put my +1 on this RC.
>
> For the test failure fix, I will include it if there's another RC.
>
> Sean Owen wrote on Mon, Jul 16, 2018 at 10:47 PM:
>
>> OK, hm, will try to get to the bottom of it. But if others can build
>> this module successfully, I give a +1. The test failure is inevitable here
>> and should not block release.
>>
>> On Sun, Jul 15, 2018 at 9:39 PM Saisai Shao 
>> wrote:
>>
> Hi Sean,
>>>
>>> I just did a clean build with mvn/sbt on 2.3.2 and didn't hit the
>>> errors you pasted here. I'm not sure how they happen.
>>>
>>> Sean Owen wrote on Mon, Jul 16, 2018 at 6:30 AM:
>>>
>> Looks good to me, with the following caveats.

 First see the discussion on
 https://issues.apache.org/jira/browse/SPARK-24813 ; the flaky
 HiveExternalCatalogVersionsSuite will probably fail all the time right now.
 That's not a regression and is a test-only issue, so I don't think it must
 block the release. However, if this fix holds up and we need another RC,
 it's worth pulling in for sure.

 Also, is anyone seeing this while building and testing the Spark SQL + Kafka
 module? I see this error even after a clean rebuild. I sort of get what the
 error is saying but can't figure out why it would only happen at
 test/runtime. Haven't seen it before.

 [error] missing or invalid dependency detected while loading class file 'MetricsSystem.class'.
 [error] Could not access term eclipse in package org,
 [error] because it (or its dependencies) are missing. Check your build definition for
 [error] missing or conflicting dependencies. (Re-run with `-Ylog-classpath` to see the problematic classpath.)
 [error] A full rebuild may help if 'MetricsSystem.class' was compiled against an incompatible version of org.
 [error] missing or invalid dependency detected while loading class file 'MetricsSystem.class'.
 [error] Could not access term jetty in value org.eclipse,
 [error] because it (or its dependencies) are missing. Check your build definition for
 [error] missing or conflicting dependencies. (Re-run with `-Ylog-classpath` to see the problematic classpath.)
 [error] A full rebuild may help if 'MetricsSystem.class' was compiled against an incompatible version of org.eclipse

 On Sun, Jul 15, 2018 at 3:09 AM Saisai Shao 
 wrote:

>>> Please vote on releasing the following candidate as Apache Spark
> version 2.3.2.
>
> The vote is open until July 20 PST and passes if a majority +1 PMC
> votes are cast, with a minimum of 3 +1 votes.
>
> [ ] +1 Release this package as Apache Spark 2.3.2
> [ ] -1 Do not release this package because ...
>
> To learn more about Apache Spark, please see
> http://spark.apache.org/
>
> The tag to be voted on is v2.3.2-rc3 (commit
> b3726dadcf2997f20231873ec6e057dba433ae64):
> https://github.com/apache/spark/tree/v2.3.2-rc3
>
> The release files, including signatures, digests, etc. can be
> found at:
> https://dist.apache.org/repos/dist/dev/spark/v2.3.2-rc3-bin/
>
> Signatures used for Spark RCs can be found in this file:
> https://dist.apache.org/repos/dist/dev/spark/KEYS
>
> The staging repository for this release can be found at:
> https://repository.apache.org/content/repositories/orgapachespark-1278/
>
> The documentation corresponding to this release can be found at:
> https://dist.apache.org/repos/dist/dev/spark/v2.3.2-rc3-docs/
>
> The list of bug fixes going into 2.3.2 can be found at the
> following URL:
> https://issues.apache.org/jira/projects/SPARK/versions/12343289
>
> Note. RC2 was cancelled because of one blocking issue SPARK-24781
> during release 
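
The vote call quoted above says the vote "passes if a majority +1 PMC votes are cast, with a minimum of 3 +1 votes." A minimal sketch of that rule follows. It assumes that only binding (PMC) votes count toward both conditions and that "majority" means more binding +1 votes than binding -1 votes; neither reading is spelled out in the thread. The `Vote` type, the `vote_passes` helper, and the voter labels are hypothetical and do not describe how any list member actually voted.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Vote:
    voter: str     # hypothetical label, not a real list member
    value: int     # +1 or -1
    binding: bool  # True only for PMC members (assumption)


def vote_passes(votes: Iterable[Vote], min_plus_ones: int = 3) -> bool:
    """Apply the quoted rule: at least `min_plus_ones` binding +1 votes,
    and more binding +1 votes than binding -1 votes (assumed reading)."""
    votes = list(votes)
    plus = sum(1 for v in votes if v.binding and v.value > 0)
    minus = sum(1 for v in votes if v.binding and v.value < 0)
    return plus >= min_plus_ones and plus > minus


# Made-up tally: three binding +1, one binding -1, one non-binding +1.
example = [
    Vote("pmc-a", +1, True),
    Vote("pmc-b", +1, True),
    Vote("pmc-c", +1, True),
    Vote("pmc-d", -1, True),
    Vote("contributor-x", +1, False),
]
print(vote_passes(example))  # prints True
```

Under this reading, non-binding +1 votes are recorded but do not change the outcome. In this thread the RC was also held back for reasons outside the tally, since further fixes were merged to the 2.3 branch and a new RC was planned.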

Re: [VOTE] SPARK 2.3.2 (RC3)

2018-07-18 Thread Takeshi Yamamuro
+1 (non-binding)

I ran the tests on an EC2 m4.2xlarge instance:
[ec2-user]$ java -version
openjdk version "1.8.0_171"
OpenJDK Runtime Environment (build 1.8.0_171-b10)
OpenJDK 64-Bit Server VM (build 25.171-b10, mixed mode)


On Thu, Jul 19, 2018 at 5:29 AM Ryan Blue  wrote:

> +1 (non-binding)
>
> On Wed, Jul 18, 2018 at 10:38 AM Denny Lee  wrote:
>
>> +1 (non-binding)
>> On Tue, Jul 17, 2018 at 23:04 John Zhuge  wrote:
>>
>>> +1 (non-binding)
>>>
>>> On Mon, Jul 16, 2018 at 8:04 PM Saisai Shao 
>>> wrote:
>>>
 I will put my +1 on this RC.

 For the test failure fix, I will include it if there's another RC.

 Sean Owen wrote on Mon, Jul 16, 2018 at 10:47 PM:

> OK, hm, will try to get to the bottom of it. But if others can build
> this module successfully, I give a +1. The test failure is inevitable here
> and should not block release.
>
> On Sun, Jul 15, 2018 at 9:39 PM Saisai Shao 
> wrote:
>
 Hi Sean,
>>
>> I just did a clean build with mvn/sbt on 2.3.2 and didn't hit the
>> errors you pasted here. I'm not sure how they happen.
>>
>> Sean Owen wrote on Mon, Jul 16, 2018 at 6:30 AM:
>>
> Looks good to me, with the following caveats.
>>>
>>> First see the discussion on
>>> https://issues.apache.org/jira/browse/SPARK-24813 ; the flaky
>>> HiveExternalCatalogVersionsSuite will probably fail all the time right now.
>>> That's not a regression and is a test-only issue, so I don't think it must
>>> block the release. However, if this fix holds up and we need another RC,
>>> it's worth pulling in for sure.
>>>
>>> Also, is anyone seeing this while building and testing the Spark SQL + Kafka
>>> module? I see this error even after a clean rebuild. I sort of get what the
>>> error is saying but can't figure out why it would only happen at
>>> test/runtime. Haven't seen it before.
>>>
>>> [error] missing or invalid dependency detected while loading class file 'MetricsSystem.class'.
>>> [error] Could not access term eclipse in package org,
>>> [error] because it (or its dependencies) are missing. Check your build definition for
>>> [error] missing or conflicting dependencies. (Re-run with `-Ylog-classpath` to see the problematic classpath.)
>>> [error] A full rebuild may help if 'MetricsSystem.class' was compiled against an incompatible version of org.
>>> [error] missing or invalid dependency detected while loading class file 'MetricsSystem.class'.
>>> [error] Could not access term jetty in value org.eclipse,
>>> [error] because it (or its dependencies) are missing. Check your build definition for
>>> [error] missing or conflicting dependencies. (Re-run with `-Ylog-classpath` to see the problematic classpath.)
>>> [error] A full rebuild may help if 'MetricsSystem.class' was compiled against an incompatible version of org.eclipse
>>>
>>> On Sun, Jul 15, 2018 at 3:09 AM Saisai Shao 
>>> wrote:
>>>
>> Please vote on releasing the following candidate as Apache Spark
 version 2.3.2.

 The vote is open until July 20 PST and passes if a majority +1 PMC
 votes are cast, with a minimum of 3 +1 votes.

 [ ] +1 Release this package as Apache Spark 2.3.2
 [ ] -1 Do not release this package because ...

 To learn more about Apache Spark, please see
 http://spark.apache.org/

 The tag to be voted on is v2.3.2-rc3
 (commit b3726dadcf2997f20231873ec6e057dba433ae64):
 https://github.com/apache/spark/tree/v2.3.2-rc3

 The release files, including signatures, digests, etc. can be found
 at:
 https://dist.apache.org/repos/dist/dev/spark/v2.3.2-rc3-bin/

 Signatures used for Spark RCs can be found in this file:
 https://dist.apache.org/repos/dist/dev/spark/KEYS

 The staging repository for this release can be found at:

 https://repository.apache.org/content/repositories/orgapachespark-1278/

 The documentation corresponding to this release can be found at:
 https://dist.apache.org/repos/dist/dev/spark/v2.3.2-rc3-docs/

 The list of bug fixes going into 2.3.2 can be found at the
 following URL:
 https://issues.apache.org/jira/projects/SPARK/versions/12343289

 Note. RC2 was cancelled because of one blocking issue SPARK-24781
 during release preparation.

 FAQ

 =
 How can I help test this release?
 =

 If you are a Spark user, you can help us test this release by taking
 an existing Spark workload and running on this 

Re: [VOTE] SPARK 2.3.2 (RC3)

2018-07-18 Thread Ryan Blue
+1 (non-binding)

On Wed, Jul 18, 2018 at 10:38 AM Denny Lee  wrote:

> +1 (non-binding)
> On Tue, Jul 17, 2018 at 23:04 John Zhuge  wrote:
>
>> +1 (non-binding)
>>
>> On Mon, Jul 16, 2018 at 8:04 PM Saisai Shao 
>> wrote:
>>
>>> I will put my +1 on this RC.
>>>
>>> For the test failure fix, I will include it if there's another RC.
>>>
>>> Sean Owen wrote on Mon, Jul 16, 2018 at 10:47 PM:
>>>
 OK, hm, will try to get to the bottom of it. But if others can build this
 module successfully, I give a +1. The test failure is inevitable here and
 should not block release.

 On Sun, Jul 15, 2018 at 9:39 PM Saisai Shao 
 wrote:

>>> Hi Sean,
>
> I just did a clean build with mvn/sbt on 2.3.2 and didn't hit the
> errors you pasted here. I'm not sure how they happen.
>
> Sean Owen wrote on Mon, Jul 16, 2018 at 6:30 AM:
>
 Looks good to me, with the following caveats.
>>
>> First see the discussion on
>> https://issues.apache.org/jira/browse/SPARK-24813 ; the flaky
>> HiveExternalCatalogVersionsSuite will probably fail all the time right now.
>> That's not a regression and is a test-only issue, so I don't think it must
>> block the release. However, if this fix holds up and we need another RC,
>> it's worth pulling in for sure.
>>
>> Also, is anyone seeing this while building and testing the Spark SQL + Kafka
>> module? I see this error even after a clean rebuild. I sort of get what the
>> error is saying but can't figure out why it would only happen at
>> test/runtime. Haven't seen it before.
>>
>> [error] missing or invalid dependency detected while loading class file 'MetricsSystem.class'.
>> [error] Could not access term eclipse in package org,
>> [error] because it (or its dependencies) are missing. Check your build definition for
>> [error] missing or conflicting dependencies. (Re-run with `-Ylog-classpath` to see the problematic classpath.)
>> [error] A full rebuild may help if 'MetricsSystem.class' was compiled against an incompatible version of org.
>> [error] missing or invalid dependency detected while loading class file 'MetricsSystem.class'.
>> [error] Could not access term jetty in value org.eclipse,
>> [error] because it (or its dependencies) are missing. Check your build definition for
>> [error] missing or conflicting dependencies. (Re-run with `-Ylog-classpath` to see the problematic classpath.)
>> [error] A full rebuild may help if 'MetricsSystem.class' was compiled against an incompatible version of org.eclipse
>>
>> On Sun, Jul 15, 2018 at 3:09 AM Saisai Shao 
>> wrote:
>>
> Please vote on releasing the following candidate as Apache Spark
>>> version 2.3.2.
>>>
>>> The vote is open until July 20 PST and passes if a majority +1 PMC
>>> votes are cast, with a minimum of 3 +1 votes.
>>>
>>> [ ] +1 Release this package as Apache Spark 2.3.2
>>> [ ] -1 Do not release this package because ...
>>>
>>> To learn more about Apache Spark, please see
>>> http://spark.apache.org/
>>>
>>> The tag to be voted on is v2.3.2-rc3
>>> (commit b3726dadcf2997f20231873ec6e057dba433ae64):
>>> https://github.com/apache/spark/tree/v2.3.2-rc3
>>>
>>> The release files, including signatures, digests, etc. can be found
>>> at:
>>> https://dist.apache.org/repos/dist/dev/spark/v2.3.2-rc3-bin/
>>>
>>> Signatures used for Spark RCs can be found in this file:
>>> https://dist.apache.org/repos/dist/dev/spark/KEYS
>>>
>>> The staging repository for this release can be found at:
>>>
>>> https://repository.apache.org/content/repositories/orgapachespark-1278/
>>>
>>> The documentation corresponding to this release can be found at:
>>> https://dist.apache.org/repos/dist/dev/spark/v2.3.2-rc3-docs/
>>>
>>> The list of bug fixes going into 2.3.2 can be found at the following
>>> URL:
>>> https://issues.apache.org/jira/projects/SPARK/versions/12343289
>>>
>>> Note. RC2 was cancelled because of one blocking issue SPARK-24781
>>> during release preparation.
>>>
>>> FAQ
>>>
>>> =
>>> How can I help test this release?
>>> =
>>>
>>> If you are a Spark user, you can help us test this release by taking
>>> an existing Spark workload and running on this release candidate,
>>> then
>>> reporting any regressions.
>>>
>>> If you're working in PySpark you can set up a virtual env and install
>>> the current RC and see if anything important breaks, in the
>>> Java/Scala
>>> you can add the staging repository to your projects resolvers and
>>> test
>>> with the RC (make sure to clean up the artifact cache before/after so
>>> you don't end up 

Re: [VOTE] SPARK 2.3.2 (RC3)

2018-07-18 Thread Denny Lee
+1 (non-binding)
On Tue, Jul 17, 2018 at 23:04 John Zhuge  wrote:

> +1 (non-binding)
>
> On Mon, Jul 16, 2018 at 8:04 PM Saisai Shao 
> wrote:
>
>> I will put my +1 on this RC.
>>
>> For the test failure fix, I will include it if there's another RC.

Re: [VOTE] SPARK 2.3.2 (RC3)

2018-07-18 Thread John Zhuge
+1 (non-binding)

On Mon, Jul 16, 2018 at 8:04 PM Saisai Shao  wrote:

> I will put my +1 on this RC.
>
> For the test failure fix, I will include it if there's another RC.
>
> On Mon, Jul 16, 2018 at 10:47 PM Sean Owen wrote:
>
>> OK, hm, will try to get to the bottom of it. But if others can build this
>> module successfully, I give a +1. The test failure is inevitable here and
>> should not block release.
>>
>> On Sun, Jul 15, 2018 at 9:39 PM Saisai Shao 
>> wrote:
>>
>>> Hi Sean,
>>>
>>> I just did a clean build with mvn/sbt on 2.3.2, I didn't meet the errors
>>> you pasted here. I'm not sure how it happens.

Re: [VOTE] SPARK 2.3.2 (RC3)

2018-07-16 Thread Saisai Shao
I will put my +1 on this RC.

For the test failure fix, I will include it if there's another RC.

On Mon, Jul 16, 2018 at 10:47 PM Sean Owen wrote:

> OK, hm, will try to get to the bottom of it. But if others can build this
> module successfully, I give a +1. The test failure is inevitable here and
> should not block release.
>
> On Sun, Jul 15, 2018 at 9:39 PM Saisai Shao 
> wrote:
>
>> Hi Sean,
>>
>> I just did a clean build with mvn/sbt on 2.3.2, I didn't meet the errors
>> you pasted here. I'm not sure how it happens.
>>
>> On Mon, Jul 16, 2018 at 6:30 AM Sean Owen wrote:
>>
>>> Looks good to me, with the following caveats.
>>>
>>> First see the discussion on
>>> https://issues.apache.org/jira/browse/SPARK-24813 ; the
>>> flaky HiveExternalCatalogVersionsSuite will probably fail all the time
>>> right now. That's not a regression and is a test-only issue, so don't think
>>> it must block the release. However if this fix holds up, and we need
>>> another RC, worth pulling in for sure.
>>>
>>> Also is anyone seeing this while building and testing the Spark SQL +
>>> Kafka module? I see this error even after a clean rebuild. I sort of get
>>> what the error is saying but can't figure out why it would only happen at
>>> test/runtime. Haven't seen it before.
>>>
>>> [error] missing or invalid dependency detected while loading class file
>>> 'MetricsSystem.class'.
>>>
>>> [error] Could not access term eclipse in package org,
>>>
>>> [error] because it (or its dependencies) are missing. Check your build
>>> definition for
>>>
>>> [error] missing or conflicting dependencies. (Re-run with
>>> `-Ylog-classpath` to see the problematic classpath.)
>>>
>>> [error] A full rebuild may help if 'MetricsSystem.class' was compiled
>>> against an incompatible version of org.
>>>
>>> [error] missing or invalid dependency detected while loading class file
>>> 'MetricsSystem.class'.
>>>
>>> [error] Could not access term jetty in value org.eclipse,
>>>
>>> [error] because it (or its dependencies) are missing. Check your build
>>> definition for
>>>
>>> [error] missing or conflicting dependencies. (Re-run with
>>> `-Ylog-classpath` to see the problematic classpath.)
>>>
>>> [error] A full rebuild may help if 'MetricsSystem.class' was compiled
>>> against an incompatible version of org.eclipse

Re: [VOTE] SPARK 2.3.2 (RC3)

2018-07-16 Thread Sean Owen
OK, hm, will try to get to the bottom of it. But if others can build this
module successfully, I give a +1. The test failure is inevitable here and
should not block release.

On Sun, Jul 15, 2018 at 9:39 PM Saisai Shao  wrote:

> Hi Sean,
>
> I just did a clean build with mvn/sbt on 2.3.2, I didn't meet the errors
> you pasted here. I'm not sure how it happens.
>
> On Mon, Jul 16, 2018 at 6:30 AM Sean Owen wrote:
>
>> Looks good to me, with the following caveats.
>>
>> First see the discussion on
>> https://issues.apache.org/jira/browse/SPARK-24813 ; the
>> flaky HiveExternalCatalogVersionsSuite will probably fail all the time
>> right now. That's not a regression and is a test-only issue, so don't think
>> it must block the release. However if this fix holds up, and we need
>> another RC, worth pulling in for sure.
>>
>> Also is anyone seeing this while building and testing the Spark SQL +
>> Kafka module? I see this error even after a clean rebuild. I sort of get
>> what the error is saying but can't figure out why it would only happen at
>> test/runtime. Haven't seen it before.
>>
>> [error] missing or invalid dependency detected while loading class file
>> 'MetricsSystem.class'.
>>
>> [error] Could not access term eclipse in package org,
>>
>> [error] because it (or its dependencies) are missing. Check your build
>> definition for
>>
>> [error] missing or conflicting dependencies. (Re-run with
>> `-Ylog-classpath` to see the problematic classpath.)
>>
>> [error] A full rebuild may help if 'MetricsSystem.class' was compiled
>> against an incompatible version of org.
>>
>> [error] missing or invalid dependency detected while loading class file
>> 'MetricsSystem.class'.
>>
>> [error] Could not access term jetty in value org.eclipse,
>>
>> [error] because it (or its dependencies) are missing. Check your build
>> definition for
>>
>> [error] missing or conflicting dependencies. (Re-run with
>> `-Ylog-classpath` to see the problematic classpath.)
>>
>> [error] A full rebuild may help if 'MetricsSystem.class' was compiled
>> against an incompatible version of org.eclipse
>>
>> On Sun, Jul 15, 2018 at 3:09 AM Saisai Shao 
>> wrote:
>>
>>> Please vote on releasing the following candidate as Apache Spark version
>>> 2.3.2.
>>>
>>> The vote is open until July 20 PST and passes if a majority +1 PMC votes
>>> are cast, with a minimum of 3 +1 votes.
>>>
>>> [ ] +1 Release this package as Apache Spark 2.3.2
>>> [ ] -1 Do not release this package because ...
>>>
>>> To learn more about Apache Spark, please see http://spark.apache.org/
>>>
>>> The tag to be voted on is v2.3.2-rc3
>>> (commit b3726dadcf2997f20231873ec6e057dba433ae64):
>>> https://github.com/apache/spark/tree/v2.3.2-rc3
>>>
>>> The release files, including signatures, digests, etc. can be found at:
>>> https://dist.apache.org/repos/dist/dev/spark/v2.3.2-rc3-bin/
>>>
>>> Signatures used for Spark RCs can be found in this file:
>>> https://dist.apache.org/repos/dist/dev/spark/KEYS
>>>
>>> The staging repository for this release can be found at:
>>> https://repository.apache.org/content/repositories/orgapachespark-1278/
>>>
>>> The documentation corresponding to this release can be found at:
>>> https://dist.apache.org/repos/dist/dev/spark/v2.3.2-rc3-docs/
>>>
>>> The list of bug fixes going into 2.3.2 can be found at the following URL:
>>> https://issues.apache.org/jira/projects/SPARK/versions/12343289
>>>
>>> Note. RC2 was cancelled because of one blocking issue SPARK-24781 during
>>> release preparation.
>>>
>>> FAQ
>>>
>>> =
>>> How can I help test this release?
>>> =
>>>
>>> If you are a Spark user, you can help us test this release by taking
>>> an existing Spark workload and running on this release candidate, then
>>> reporting any regressions.
>>>
>>> If you're working in PySpark, you can set up a virtual env, install
>>> the current RC, and see if anything important breaks. In Java/Scala,
>>> you can add the staging repository to your project's resolvers and test
>>> with the RC (make sure to clean up the artifact cache before/after so
>>> you don't end up building with an out-of-date RC going forward).
>>>
>>> ===
>>> What should happen to JIRA tickets still targeting 2.3.2?
>>> ===
>>>
>>> The current list of open tickets targeted at 2.3.2 can be found at:
>>> https://issues.apache.org/jira/projects/SPARK and search for "Target
>>> Version/s" = 2.3.2
>>>
>>> Committers should look at those and triage. Extremely important bug
>>> fixes, documentation, and API tweaks that impact compatibility should
>>> be worked on immediately. Everything else please retarget to an
>>> appropriate release.
>>>
>>> ==
>>> But my bug isn't fixed?
>>> ==
>>>
>>> In order to make timely releases, we will typically not hold the
>>> release unless the bug in question is a regression from the previous
>>> release. That 

Re: [VOTE] SPARK 2.3.2 (RC3)

2018-07-15 Thread Saisai Shao
Hi Sean,

I just did a clean build with mvn/sbt on 2.3.2 and didn't hit the errors
you pasted here. I'm not sure how they happen.

On Mon, Jul 16, 2018 at 6:30 AM Sean Owen wrote:

> Looks good to me, with the following caveats.
>
> First see the discussion on
> https://issues.apache.org/jira/browse/SPARK-24813 ; the
> flaky HiveExternalCatalogVersionsSuite will probably fail all the time
> right now. That's not a regression and is a test-only issue, so don't think
> it must block the release. However if this fix holds up, and we need
> another RC, worth pulling in for sure.
>
> Also is anyone seeing this while building and testing the Spark SQL +
> Kafka module? I see this error even after a clean rebuild. I sort of get
> what the error is saying but can't figure out why it would only happen at
> test/runtime. Haven't seen it before.
>
> [error] missing or invalid dependency detected while loading class file
> 'MetricsSystem.class'.
>
> [error] Could not access term eclipse in package org,
>
> [error] because it (or its dependencies) are missing. Check your build
> definition for
>
> [error] missing or conflicting dependencies. (Re-run with
> `-Ylog-classpath` to see the problematic classpath.)
>
> [error] A full rebuild may help if 'MetricsSystem.class' was compiled
> against an incompatible version of org.
>
> [error] missing or invalid dependency detected while loading class file
> 'MetricsSystem.class'.
>
> [error] Could not access term jetty in value org.eclipse,
>
> [error] because it (or its dependencies) are missing. Check your build
> definition for
>
> [error] missing or conflicting dependencies. (Re-run with
> `-Ylog-classpath` to see the problematic classpath.)
>
> [error] A full rebuild may help if 'MetricsSystem.class' was compiled
> against an incompatible version of org.eclipse
>
> On Sun, Jul 15, 2018 at 3:09 AM Saisai Shao 
> wrote:
>
>> Please vote on releasing the following candidate as Apache Spark version
>> 2.3.2.
>>
>> The vote is open until July 20 PST and passes if a majority +1 PMC votes
>> are cast, with a minimum of 3 +1 votes.
>>
>> [ ] +1 Release this package as Apache Spark 2.3.2
>> [ ] -1 Do not release this package because ...
>>
>> To learn more about Apache Spark, please see http://spark.apache.org/
>>
>> The tag to be voted on is v2.3.2-rc3
>> (commit b3726dadcf2997f20231873ec6e057dba433ae64):
>> https://github.com/apache/spark/tree/v2.3.2-rc3
>>
>> The release files, including signatures, digests, etc. can be found at:
>> https://dist.apache.org/repos/dist/dev/spark/v2.3.2-rc3-bin/
>>
>> Signatures used for Spark RCs can be found in this file:
>> https://dist.apache.org/repos/dist/dev/spark/KEYS
>>
>> The staging repository for this release can be found at:
>> https://repository.apache.org/content/repositories/orgapachespark-1278/
>>
>> The documentation corresponding to this release can be found at:
>> https://dist.apache.org/repos/dist/dev/spark/v2.3.2-rc3-docs/
>>
>> The list of bug fixes going into 2.3.2 can be found at the following URL:
>> https://issues.apache.org/jira/projects/SPARK/versions/12343289
>>
>> Note. RC2 was cancelled because of one blocking issue SPARK-24781 during
>> release preparation.
>>
>> FAQ
>>
>> =
>> How can I help test this release?
>> =
>>
>> If you are a Spark user, you can help us test this release by taking
>> an existing Spark workload and running on this release candidate, then
>> reporting any regressions.
>>
>> If you're working in PySpark, you can set up a virtual env, install
>> the current RC, and see if anything important breaks. In Java/Scala,
>> you can add the staging repository to your project's resolvers and test
>> with the RC (make sure to clean up the artifact cache before/after so
>> you don't end up building with an out-of-date RC going forward).
>>
>> ===
>> What should happen to JIRA tickets still targeting 2.3.2?
>> ===
>>
>> The current list of open tickets targeted at 2.3.2 can be found at:
>> https://issues.apache.org/jira/projects/SPARK and search for "Target
>> Version/s" = 2.3.2
>>
>> Committers should look at those and triage. Extremely important bug
>> fixes, documentation, and API tweaks that impact compatibility should
>> be worked on immediately. Everything else please retarget to an
>> appropriate release.
>>
>> ==
>> But my bug isn't fixed?
>> ==
>>
>> In order to make timely releases, we will typically not hold the
>> release unless the bug in question is a regression from the previous
>> release. That being said, if there is something which is a regression
>> that has not been correctly targeted please ping me or a committer to
>> help target the issue.
>>
>>


Re: [VOTE] SPARK 2.3.2 (RC3)

2018-07-15 Thread Sean Owen
Looks good to me, with the following caveats.

First see the discussion on
https://issues.apache.org/jira/browse/SPARK-24813 ; the
flaky HiveExternalCatalogVersionsSuite will probably fail all the time
right now. That's not a regression and is a test-only issue, so don't think
it must block the release. However if this fix holds up, and we need
another RC, worth pulling in for sure.

Also is anyone seeing this while building and testing the Spark SQL + Kafka
module? I see this error even after a clean rebuild. I sort of get what the
error is saying but can't figure out why it would only happen at
test/runtime. Haven't seen it before.

[error] missing or invalid dependency detected while loading class file
'MetricsSystem.class'.

[error] Could not access term eclipse in package org,

[error] because it (or its dependencies) are missing. Check your build
definition for

[error] missing or conflicting dependencies. (Re-run with `-Ylog-classpath`
to see the problematic classpath.)

[error] A full rebuild may help if 'MetricsSystem.class' was compiled
against an incompatible version of org.

[error] missing or invalid dependency detected while loading class file
'MetricsSystem.class'.

[error] Could not access term jetty in value org.eclipse,

[error] because it (or its dependencies) are missing. Check your build
definition for

[error] missing or conflicting dependencies. (Re-run with `-Ylog-classpath`
to see the problematic classpath.)

[error] A full rebuild may help if 'MetricsSystem.class' was compiled
against an incompatible version of org.eclipse

On Sun, Jul 15, 2018 at 3:09 AM Saisai Shao  wrote:

> Please vote on releasing the following candidate as Apache Spark version
> 2.3.2.
>
> The vote is open until July 20 PST and passes if a majority +1 PMC votes
> are cast, with a minimum of 3 +1 votes.
>
> [ ] +1 Release this package as Apache Spark 2.3.2
> [ ] -1 Do not release this package because ...
>
> To learn more about Apache Spark, please see http://spark.apache.org/
>
> The tag to be voted on is v2.3.2-rc3
> (commit b3726dadcf2997f20231873ec6e057dba433ae64):
> https://github.com/apache/spark/tree/v2.3.2-rc3
>
> The release files, including signatures, digests, etc. can be found at:
> https://dist.apache.org/repos/dist/dev/spark/v2.3.2-rc3-bin/
>
> Signatures used for Spark RCs can be found in this file:
> https://dist.apache.org/repos/dist/dev/spark/KEYS
>
> The staging repository for this release can be found at:
> https://repository.apache.org/content/repositories/orgapachespark-1278/
>
> The documentation corresponding to this release can be found at:
> https://dist.apache.org/repos/dist/dev/spark/v2.3.2-rc3-docs/
>
> The list of bug fixes going into 2.3.2 can be found at the following URL:
> https://issues.apache.org/jira/projects/SPARK/versions/12343289
>
> Note. RC2 was cancelled because of one blocking issue SPARK-24781 during
> release preparation.
>
> FAQ
>
> =
> How can I help test this release?
> =
>
> If you are a Spark user, you can help us test this release by taking
> an existing Spark workload and running on this release candidate, then
> reporting any regressions.
>
> If you're working in PySpark, you can set up a virtual env, install
> the current RC, and see if anything important breaks. In Java/Scala,
> you can add the staging repository to your project's resolvers and test
> with the RC (make sure to clean up the artifact cache before/after so
> you don't end up building with an out-of-date RC going forward).
>
> ===
> What should happen to JIRA tickets still targeting 2.3.2?
> ===
>
> The current list of open tickets targeted at 2.3.2 can be found at:
> https://issues.apache.org/jira/projects/SPARK and search for "Target
> Version/s" = 2.3.2
>
> Committers should look at those and triage. Extremely important bug
> fixes, documentation, and API tweaks that impact compatibility should
> be worked on immediately. Everything else please retarget to an
> appropriate release.
>
> ==
> But my bug isn't fixed?
> ==
>
> In order to make timely releases, we will typically not hold the
> release unless the bug in question is a regression from the previous
> release. That being said, if there is something which is a regression
> that has not been correctly targeted please ping me or a committer to
> help target the issue.
>
>


Re: [VOTE] SPARK 2.3.2 (RC3)

2018-07-15 Thread Wenchen Fan
+1. The Spark 2.3 regressions I'm aware of are all fixed.

On Sun, Jul 15, 2018 at 4:09 PM Saisai Shao  wrote:

> Please vote on releasing the following candidate as Apache Spark version
> 2.3.2.
>
> The vote is open until July 20 PST and passes if a majority +1 PMC votes
> are cast, with a minimum of 3 +1 votes.
>
> [ ] +1 Release this package as Apache Spark 2.3.2
> [ ] -1 Do not release this package because ...
>
> To learn more about Apache Spark, please see http://spark.apache.org/
>
> The tag to be voted on is v2.3.2-rc3
> (commit b3726dadcf2997f20231873ec6e057dba433ae64):
> https://github.com/apache/spark/tree/v2.3.2-rc3
>
> The release files, including signatures, digests, etc. can be found at:
> https://dist.apache.org/repos/dist/dev/spark/v2.3.2-rc3-bin/
>
> Signatures used for Spark RCs can be found in this file:
> https://dist.apache.org/repos/dist/dev/spark/KEYS
>
> The staging repository for this release can be found at:
> https://repository.apache.org/content/repositories/orgapachespark-1278/
>
> The documentation corresponding to this release can be found at:
> https://dist.apache.org/repos/dist/dev/spark/v2.3.2-rc3-docs/
>
> The list of bug fixes going into 2.3.2 can be found at the following URL:
> https://issues.apache.org/jira/projects/SPARK/versions/12343289
>
> Note. RC2 was cancelled because of one blocking issue SPARK-24781 during
> release preparation.
>
> FAQ
>
> =
> How can I help test this release?
> =
>
> If you are a Spark user, you can help us test this release by taking
> an existing Spark workload, running it on this release candidate, and
> reporting any regressions.
>
> If you're working in PySpark, you can set up a virtual env, install
> the current RC, and see if anything important breaks. In Java/Scala,
> you can add the staging repository to your project's resolvers and test
> with the RC (make sure to clean up the artifact cache before/after so
> you don't end up building with an out-of-date RC going forward).
>
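> =========================================================
> What should happen to JIRA tickets still targeting 2.3.2?
> =========================================================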
>
> To see the current list of open tickets targeted at 2.3.2, go to
> https://issues.apache.org/jira/projects/SPARK and search for "Target
> Version/s" = 2.3.2
>
> Committers should look at those and triage. Extremely important bug
> fixes, documentation, and API tweaks that impact compatibility should
> be worked on immediately. Please retarget everything else to an
> appropriate release.
>
> =======================
> But my bug isn't fixed?
> =======================
>
> In order to make timely releases, we will typically not hold the
> release unless the bug in question is a regression from the previous
> release. That being said, if there is a regression that has not been
> correctly targeted, please ping me or a committer to help target the
> issue.
>
>


[VOTE] SPARK 2.3.2 (RC3)

2018-07-15 Thread Saisai Shao
Please vote on releasing the following candidate as Apache Spark version
2.3.2.

The vote is open until July 20 PST and passes if a majority of the PMC
votes cast are +1, with a minimum of 3 +1 votes.

[ ] +1 Release this package as Apache Spark 2.3.2
[ ] -1 Do not release this package because ...
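
For readers tallying the thread, the passing rule above can be sketched in a few lines of Python. This is only an illustration, not an official tool: the `Vote` record and the `vote_passes` function are hypothetical names, and "majority" is assumed here to mean more binding (PMC) +1 votes than binding -1 votes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Vote:
    voter: str
    value: int      # +1, 0, or -1
    binding: bool   # True when cast by a PMC member (assumption: only these count)


def vote_passes(votes, min_binding_plus_ones=3):
    """Return True if the binding votes satisfy the rule stated in the email.

    Assumed reading of the rule: at least `min_binding_plus_ones` binding +1
    votes, and more binding +1 votes than binding -1 votes.
    """
    binding = [v for v in votes if v.binding]
    plus_ones = sum(1 for v in binding if v.value == 1)
    minus_ones = sum(1 for v in binding if v.value == -1)
    return plus_ones >= min_binding_plus_ones and plus_ones > minus_ones
```

Non-binding votes are ignored by the tally in this sketch, although they are still useful as test reports.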

To learn more about Apache Spark, please see http://spark.apache.org/

The tag to be voted on is v2.3.2-rc3
(commit b3726dadcf2997f20231873ec6e057dba433ae64):
https://github.com/apache/spark/tree/v2.3.2-rc3

The release files, including signatures, digests, etc. can be found at:
https://dist.apache.org/repos/dist/dev/spark/v2.3.2-rc3-bin/
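
As a rough sketch of how a tester might check a downloaded artifact against its published digest, the following Python computes a file's SHA-512 and compares it with an expected hex string. The choice of SHA-512, the function names, and the tolerance for whitespace and upper case in the expected value are assumptions made for illustration. Check the digest files in the directory above for the actual algorithm and format.

```python
import hashlib


def sha512_hex(path, chunk_size=1 << 20):
    """Compute the SHA-512 of a file, reading it in chunks to bound memory use."""
    digest = hashlib.sha512()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def digest_matches(path, expected_hex):
    """Compare a file's SHA-512 with an expected hex digest.

    Whitespace and letter case in `expected_hex` are ignored (an assumption
    about how the published value might be formatted).
    """
    normalized = "".join(expected_hex.split()).lower()
    return sha512_hex(path) == normalized
```

A digest check only shows that the download is intact. Verifying the signatures against the KEYS file is a separate step that this sketch does not cover.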

Signatures used for Spark RCs can be found in this file:
https://dist.apache.org/repos/dist/dev/spark/KEYS

The staging repository for this release can be found at:
https://repository.apache.org/content/repositories/orgapachespark-1278/

The documentation corresponding to this release can be found at:
https://dist.apache.org/repos/dist/dev/spark/v2.3.2-rc3-docs/

The list of bug fixes going into 2.3.2 can be found at the following URL:
https://issues.apache.org/jira/projects/SPARK/versions/12343289

Note: RC2 was cancelled because of a blocking issue (SPARK-24781) found
during release preparation.

FAQ

=================================
How can I help test this release?
=================================

If you are a Spark user, you can help us test this release by taking
an existing Spark workload, running it on this release candidate, and
reporting any regressions.

If you're working in PySpark, you can set up a virtual env, install
the current RC, and see if anything important breaks. In Java/Scala,
you can add the staging repository to your project's resolvers and test
with the RC (make sure to clean up the artifact cache before/after so
you don't end up building with an out-of-date RC going forward).
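
To make the cache clean-up step concrete, here is a small Python sketch that lists cached Spark artifacts for a given version so they can be reviewed and removed by hand. It assumes the usual Ivy layout (a directory named `org.apache.spark`) or Maven layout (`org/apache/spark`) under a cache root such as `~/.ivy2` or `~/.m2`. The function name and the file-name matching rule are hypothetical, and the sketch deliberately deletes nothing.

```python
from pathlib import Path


def find_cached_artifacts(cache_root, version, group="org.apache.spark"):
    """List cached files for `group` whose names carry `version`.

    Matches Ivy-style paths (a directory literally named like the group) and
    Maven-style paths (the group split into nested directories). Both layouts
    are assumptions about the local cache.
    """
    root = Path(cache_root)
    if not root.is_dir():
        return []
    group_parts = tuple(group.split("."))
    marker = f"-{version}."  # e.g. "-2.3.2." in "spark-core_2.11-2.3.2.jar"
    found = []
    for path in root.rglob("*"):
        if not path.is_file() or marker not in path.name:
            continue
        parts = path.relative_to(root).parts
        ivy_style = group in parts
        maven_style = any(
            parts[i:i + len(group_parts)] == group_parts
            for i in range(len(parts))
        )
        if ivy_style or maven_style:
            found.append(path)
    return sorted(found)
```

For example, `find_cached_artifacts(Path.home() / ".ivy2", "2.3.2")` would return the candidate files to delete before and after testing the RC, if the cache lives there.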

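=========================================================
What should happen to JIRA tickets still targeting 2.3.2?
=========================================================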

To see the current list of open tickets targeted at 2.3.2, go to
https://issues.apache.org/jira/projects/SPARK and search for "Target
Version/s" = 2.3.2

Committers should look at those and triage. Extremely important bug
fixes, documentation, and API tweaks that impact compatibility should
be worked on immediately. Please retarget everything else to an
appropriate release.

=======================
But my bug isn't fixed?
=======================

In order to make timely releases, we will typically not hold the
release unless the bug in question is a regression from the previous
release. That being said, if there is a regression that has not been
correctly targeted, please ping me or a committer to help target the
issue.