Re: [DISCUSS] Apache Spark 3.0.1 Release

2020-07-14 Thread Prashant Sharma
Hi Folks,

So, I am back, and searched the JIRAs with target version "2.4.7" and status
Resolved, and found only 2. So, are we good to go with just a couple of
JIRAs fixed? Shall I proceed with making an RC?

Thanks,
Prashant

On Thu, Jul 2, 2020 at 5:23 PM Prashant Sharma  wrote:

> Thank you, Holden.
>
> Folks, My health has gone down a bit. So, I will start working on this in
> a few days. If this needs to be published sooner, then maybe someone else
> has to help out.
>
>
>
>
>
> On Thu, Jul 2, 2020 at 10:11 AM Holden Karau  wrote:
>
>> I’m happy to have Prashant do 2.4.7 :)
>>
>> On Wed, Jul 1, 2020 at 9:40 PM Xiao Li  wrote:
>>
>>> +1 on releasing both 3.0.1 and 2.4.7
>>>
>>> Great! Three committers volunteered to be release managers: Ruifeng,
>>> Prashant, and Holden. Holden just helped release Spark 2.4.6. This time,
>>> maybe Ruifeng and Prashant can be the release managers of 3.0.1 and 2.4.7,
>>> respectively.
>>>
>>> Xiao
>>>
>>> On Wed, Jul 1, 2020 at 2:24 PM Jungtaek Lim <
>>> kabhwan.opensou...@gmail.com> wrote:
>>>
 https://issues.apache.org/jira/browse/SPARK-32148 was reported
 yesterday, and if the report is valid it looks to be a blocker. I'll try to
 take a look soon.

 On Thu, Jul 2, 2020 at 12:48 AM Shivaram Venkataraman <
 shiva...@eecs.berkeley.edu> wrote:

> Thanks Holden -- it would be great to also get 2.4.7 started
>
> Thanks
> Shivaram
>
> On Tue, Jun 30, 2020 at 10:31 PM Holden Karau 
> wrote:
> >
> > I can take care of 2.4.7 unless someone else wants to do it.
> >
> > On Tue, Jun 30, 2020 at 8:29 PM Jason Moore <
> jason.mo...@quantium.com.au> wrote:
> >>
> >> Hi all,
> >>
> >>
> >>
> >> Could I get some input on the severity of this one that I found
> yesterday?  If that’s a correctness issue, should it block this patch?  
> Let
> me know under the ticket if there’s more info that I can provide to help.
> >>
> >>
> >>
> >> https://issues.apache.org/jira/browse/SPARK-32136
> >>
> >>
> >>
> >> Thanks,
> >>
> >> Jason.
> >>
> >>
> >>
> >> From: Jungtaek Lim 
> >> Date: Wednesday, 1 July 2020 at 10:20 am
> >> To: Shivaram Venkataraman 
> >> Cc: Prashant Sharma , 郑瑞峰 <
> ruife...@foxmail.com>, Gengliang Wang ,
> gurwls223 , Dongjoon Hyun <
> dongjoon.h...@gmail.com>, Jules Damji , Holden
> Karau , Reynold Xin ,
> Yuanjian Li , "dev@spark.apache.org" <
> dev@spark.apache.org>, Takeshi Yamamuro 
> >> Subject: Re: [DISCUSS] Apache Spark 3.0.1 Release
> >>
> >>
> >>
> >> SPARK-32130 [1] looks to be a performance regression introduced in
> Spark 3.0.0, which would be ideal to look into before releasing another bugfix
> version.
> >>
> >>
> >>
> >> 1. https://issues.apache.org/jira/browse/SPARK-32130
> >>
> >>
> >>
> >> On Wed, Jul 1, 2020 at 7:05 AM Shivaram Venkataraman <
> shiva...@eecs.berkeley.edu> wrote:
> >>
> >> Hi all
> >>
> >>
> >>
> >> I just wanted to ping this thread to see if all the outstanding
> blockers for 3.0.1 have been fixed. If so, it would be great if we can get
> the release going. The CRAN team sent us a note that the version of SparkR
> available on CRAN for the current R version (4.0.2) is broken, and hence we
> need to update the package soon -- it would be great to do it with 3.0.1.
> >>
> >>
> >>
> >> Thanks
> >>
> >> Shivaram
> >>
> >>
> >>
> >> On Wed, Jun 24, 2020 at 8:31 PM Prashant Sharma <
> scrapco...@gmail.com> wrote:
> >>
> >> +1 for 3.0.1 release.
> >>
> >> I too can help out as release manager.
> >>
> >>
> >>
> >> On Thu, Jun 25, 2020 at 4:58 AM 郑瑞峰  wrote:
> >>
> >> I volunteer to be a release manager of 3.0.1, if nobody is working
> on this.
> >>
> >>
> >>
> >>
> >>
> >> -- Original Message --
> >>
> >> From: "Gengliang Wang";
> >>
> >> Sent: Wednesday, June 24, 2020, 4:15 PM
> >>
> >> To: "Hyukjin Kwon";
> >>
> >> Cc: "Dongjoon Hyun";"Jungtaek Lim"<
> kabhwan.opensou...@gmail.com>;"Jules Damji";"Holden
> Karau";"Reynold Xin";"Shivaram
> Venkataraman";"Yuanjian Li"<
> xyliyuanj...@gmail.com>;"Spark dev list";"Takeshi
> Yamamuro";
> >>
> >> Subject: Re: [DISCUSS] Apache Spark 3.0.1 Release
> >>
> >>
> >>
> >> +1, the issues mentioned are really serious.
> >>
> >>
> >>
> >> On Tue, Jun 23, 2020 at 7:56 PM Hyukjin Kwon 
> wrote:
> >>
> >> +1.
> >>
> >> Just as a note,
> >> - SPARK-31918 is fixed now, and there's no blocker.
> >> - When we build SparkR, we should use the latest R version, at least 4.0.0+.
> >>
> >>
> >>
> >> On Wed, Jun 24, 2020 at 11:20 AM, Dongjoon Hyun wrote:
> >>
> >> +1
> >>
> 

Re: Welcoming some new Apache Spark committers

2020-07-14 Thread Nick Pentreath
Congratulations and welcome as Apache Spark committers!

On Wed, 15 Jul 2020 at 06:59, Prashant Sharma  wrote:

> Congratulations, all! It's great to have such committed folks as
> committers. :)
>
> On Wed, Jul 15, 2020 at 9:24 AM Yi Wu  wrote:
>
>> Congrats!!
>>
>> On Wed, Jul 15, 2020 at 8:02 AM Hyukjin Kwon  wrote:
>>
>>> Congrats!
>>>
>>> On Wed, Jul 15, 2020 at 7:56 AM, Takeshi Yamamuro wrote:
>>>
 Congrats, all!

 On Wed, Jul 15, 2020 at 5:15 AM Takuya UESHIN 
 wrote:

> Congrats and welcome!
>
> On Tue, Jul 14, 2020 at 1:07 PM Bryan Cutler 
> wrote:
>
>> Congratulations and welcome!
>>
>> On Tue, Jul 14, 2020 at 12:36 PM Xingbo Jiang 
>> wrote:
>>
>>> Welcome, Huaxin, Jungtaek, and Dilip!
>>>
>>> Congratulations!
>>>
>>> On Tue, Jul 14, 2020 at 10:37 AM Matei Zaharia <
>>> matei.zaha...@gmail.com> wrote:
>>>
 Hi all,

 The Spark PMC recently voted to add several new committers. Please
 join me in welcoming them to their new roles! The new committers are:

 - Huaxin Gao
 - Jungtaek Lim
 - Dilip Biswal

 All three of them contributed to Spark 3.0 and we’re excited to
 have them join the project.

 Matei and the Spark PMC

 -
 To unsubscribe e-mail: dev-unsubscr...@spark.apache.org


>
> --
> Takuya UESHIN
>
>

 --
 ---
 Takeshi Yamamuro

>>>


Re: Welcoming some new Apache Spark committers

2020-07-14 Thread Prashant Sharma
Congratulations, all! It's great to have such committed folks as
committers. :)

On Wed, Jul 15, 2020 at 9:24 AM Yi Wu  wrote:

> Congrats!!
>
> On Wed, Jul 15, 2020 at 8:02 AM Hyukjin Kwon  wrote:
>
>> Congrats!
>>
>> On Wed, Jul 15, 2020 at 7:56 AM, Takeshi Yamamuro wrote:
>>
>>> Congrats, all!
>>>
>>> On Wed, Jul 15, 2020 at 5:15 AM Takuya UESHIN 
>>> wrote:
>>>
 Congrats and welcome!

 On Tue, Jul 14, 2020 at 1:07 PM Bryan Cutler  wrote:

> Congratulations and welcome!
>
> On Tue, Jul 14, 2020 at 12:36 PM Xingbo Jiang 
> wrote:
>
>> Welcome, Huaxin, Jungtaek, and Dilip!
>>
>> Congratulations!
>>
>> On Tue, Jul 14, 2020 at 10:37 AM Matei Zaharia <
>> matei.zaha...@gmail.com> wrote:
>>
>>> Hi all,
>>>
>>> The Spark PMC recently voted to add several new committers. Please
>>> join me in welcoming them to their new roles! The new committers are:
>>>
>>> - Huaxin Gao
>>> - Jungtaek Lim
>>> - Dilip Biswal
>>>
>>> All three of them contributed to Spark 3.0 and we’re excited to have
>>> them join the project.
>>>
>>> Matei and the Spark PMC
>>> -
>>> To unsubscribe e-mail: dev-unsubscr...@spark.apache.org
>>>
>>>

 --
 Takuya UESHIN


>>>
>>> --
>>> ---
>>> Takeshi Yamamuro
>>>
>>


Re: Welcoming some new Apache Spark committers

2020-07-14 Thread Yi Wu
Congrats!!

On Wed, Jul 15, 2020 at 8:02 AM Hyukjin Kwon  wrote:

> Congrats!
>
> On Wed, Jul 15, 2020 at 7:56 AM, Takeshi Yamamuro wrote:
>
>> Congrats, all!
>>
>> On Wed, Jul 15, 2020 at 5:15 AM Takuya UESHIN 
>> wrote:
>>
>>> Congrats and welcome!
>>>
>>> On Tue, Jul 14, 2020 at 1:07 PM Bryan Cutler  wrote:
>>>
 Congratulations and welcome!

 On Tue, Jul 14, 2020 at 12:36 PM Xingbo Jiang 
 wrote:

> Welcome, Huaxin, Jungtaek, and Dilip!
>
> Congratulations!
>
> On Tue, Jul 14, 2020 at 10:37 AM Matei Zaharia <
> matei.zaha...@gmail.com> wrote:
>
>> Hi all,
>>
>> The Spark PMC recently voted to add several new committers. Please
>> join me in welcoming them to their new roles! The new committers are:
>>
>> - Huaxin Gao
>> - Jungtaek Lim
>> - Dilip Biswal
>>
>> All three of them contributed to Spark 3.0 and we’re excited to have
>> them join the project.
>>
>> Matei and the Spark PMC
>> -
>> To unsubscribe e-mail: dev-unsubscr...@spark.apache.org
>>
>>
>>>
>>> --
>>> Takuya UESHIN
>>>
>>>
>>
>> --
>> ---
>> Takeshi Yamamuro
>>
>


Re: [PSA] Apache Spark uses GitHub Actions to run the tests

2020-07-14 Thread Holden Karau
This has already helped me catch a potential flaky test I might have
otherwise missed. So excited for this additional signal :)

On Tue, Jul 14, 2020 at 6:19 PM Takeshi Yamamuro 
wrote:

> Thanks, Hyukjin!
>
> > Therefore, I do believe PRs can be merged in most general cases once the
> Jenkins PR
> builder or Github Actions build passes
>
> greatly helpful!
>
> Bests,
> Takeshi
>
> On Tue, Jul 14, 2020 at 4:14 PM Hyukjin Kwon  wrote:
>
>> Perfect. Plus, Github Actions is only for master branch at this moment.
>>
>> BTW, I think we can enable Java(Scala) doc build and dependency test back
>> in Jenkins for simplicity.
>> Seems like the Jenkins machine came back to normal.
>>
>> On Tue, Jul 14, 2020 at 4:08 PM, Wenchen Fan wrote:
>>
>>> To clarify, we need to wait for:
>>> 1. Java documentation build test in github actions
>>> 2. dependency test in github actions
>>> 3. either github action all green or jenkin pass
>>>
>>> If the PR touches Kinesis, or it uses other profiles, we must wait for
>>> jenkins to pass.
>>>
>>> Do I miss something?
>>>
>>> On Tue, Jul 14, 2020 at 2:18 PM Hyukjin Kwon 
>>> wrote:
>>>
 Hi dev,

 Github Actions build was introduced to run the regular Spark test cases
 at https://github.com/apache/spark/pull/29057 and
 https://github.com/apache/spark/pull/29086.
 This is virtually a duplicate of the default Jenkins PR builder at this
 moment.

 The only differences are:
 - Github Actions does not run the tests for Kinesis, see SPARK-32246
 - Github Actions does not support other profiles such as JDK 11 or Hive
 1.2, see SPARK-32255
 - Jenkins build does not run Java documentation build, see SPARK-32233
 - Jenkins build does not run the dependency test, see SPARK-32178

 Therefore, I believe PRs can be merged in most general cases once either
 the Jenkins PR builder or the GitHub Actions build passes, as long as we
 expect successful test results from the default Jenkins PR builder.

 Thanks.

>>>
>
> --
> ---
> Takeshi Yamamuro
>


-- 
Twitter: https://twitter.com/holdenkarau
Books (Learning Spark, High Performance Spark, etc.):
https://amzn.to/2MaRAG9  
YouTube Live Streams: https://www.youtube.com/user/holdenkarau
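The merge criteria discussed in this thread can be condensed into a small predicate. This is a hypothetical sketch for illustration only -- the boolean parameter names are invented, and the actual policy lives in the thread above, not in any Spark code:

```python
def can_merge(actions_all_green: bool, jenkins_green: bool,
              actions_javadoc_green: bool, actions_deps_green: bool,
              touches_kinesis: bool, needs_other_profiles: bool) -> bool:
    """Hypothetical condensation of the merge criteria in this thread."""
    # Jenkins skips the Java documentation build (SPARK-32233) and the
    # dependency test (SPARK-32178), so those two GitHub Actions checks
    # are required in every case.
    if not (actions_javadoc_green and actions_deps_green):
        return False
    # GitHub Actions skips the Kinesis tests (SPARK-32246) and other
    # profiles such as JDK 11 / Hive 1.2 (SPARK-32255), so such PRs
    # must wait for Jenkins regardless of the Actions result.
    if touches_kinesis or needs_other_profiles:
        return jenkins_green
    # Otherwise, either pipeline passing is considered sufficient.
    return actions_all_green or jenkins_green
```

Under this reading, for example, a Kinesis-touching PR with a fully green GitHub Actions run but no Jenkins pass would not yet be mergeable.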


Re: [DISCUSS] Apache Spark 3.0.1 Release

2020-07-14 Thread Yi Wu
Thanks Dongjoon, you're right. I referenced a wrong caused-by JIRA link
in https://issues.apache.org/jira/browse/SPARK-32307.

Bests,
Yi

On Wed, Jul 15, 2020 at 3:03 AM Dongjoon Hyun 
wrote:

> Hi, Yi.
>
> Could you explain why you think that is a blocker? For the given example
> from the JIRA description,
>
> spark.udf.register("key", udf((m: Map[String, String]) => m.keys.head.toInt))
>
> Seq(Map("1" -> "one", "2" -> "two")).toDF("a").createOrReplaceTempView("t")
>
> checkAnswer(sql("SELECT key(a) AS k FROM t GROUP BY key(a)"), Row(1) :: Nil)
>
>
> Apache Spark 3.0.0 seems to work like the following.
>
> scala> spark.version
> res0: String = 3.0.0
>
> scala> spark.udf.register("key", udf((m: Map[String, String]) =>
> m.keys.head.toInt))
> res1: org.apache.spark.sql.expressions.UserDefinedFunction =
> SparkUserDefinedFunction($Lambda$1958/948653928@5d6bed7b,IntegerType,List(Some(class[value[0]:
> map])),None,false,true)
>
> scala> Seq(Map("1" -> "one", "2" ->
> "two")).toDF("a").createOrReplaceTempView("t")
>
> scala> sql("SELECT key(a) AS k FROM t GROUP BY key(a)").collect
> res3: Array[org.apache.spark.sql.Row] = Array([1])
>
>
> Could you provide a reproducible example?
>
> Bests,
> Dongjoon.
>
>
> On Tue, Jul 14, 2020 at 10:04 AM Yi Wu  wrote:
>
>> This could be a blocker:
>> https://issues.apache.org/jira/browse/SPARK-32307
>>
>> On Tue, Jul 14, 2020 at 11:13 PM Sean Owen  wrote:
>>
>>> https://issues.apache.org/jira/browse/SPARK-32234 ?
>>>
>>> On Tue, Jul 14, 2020 at 9:57 AM Shivaram Venkataraman
>>>  wrote:
>>> >
>>> > Hi all
>>> >
>>> > Just wanted to check if there are any blockers that we are still
>>> waiting for to start the new release process.
>>> >
>>> > Thanks
>>> > Shivaram
>>> >
>>>
>>


Re: [PSA] Apache Spark uses GitHub Actions to run the tests

2020-07-14 Thread Takeshi Yamamuro
Thanks, Hyukjin!

> Therefore, I do believe PRs can be merged in most general cases once the
Jenkins PR
builder or Github Actions build passes

greatly helpful!

Bests,
Takeshi

On Tue, Jul 14, 2020 at 4:14 PM Hyukjin Kwon  wrote:

> Perfect. Plus, Github Actions is only for master branch at this moment.
>
> BTW, I think we can enable Java(Scala) doc build and dependency test back
> in Jenkins for simplicity.
> Seems like the Jenkins machine came back to normal.
>
> On Tue, Jul 14, 2020 at 4:08 PM, Wenchen Fan wrote:
>
>> To clarify, we need to wait for:
>> 1. Java documentation build test in github actions
>> 2. dependency test in github actions
>> 3. either github action all green or jenkin pass
>>
>> If the PR touches Kinesis, or it uses other profiles, we must wait for
>> jenkins to pass.
>>
>> Do I miss something?
>>
>> On Tue, Jul 14, 2020 at 2:18 PM Hyukjin Kwon  wrote:
>>
>>> Hi dev,
>>>
>>> Github Actions build was introduced to run the regular Spark test cases
>>> at https://github.com/apache/spark/pull/29057 and
>>> https://github.com/apache/spark/pull/29086.
>>> This is virtually a duplicate of the default Jenkins PR builder at this
>>> moment.
>>>
>>> The only differences are:
>>> - Github Actions does not run the tests for Kinesis, see SPARK-32246
>>> - Github Actions does not support other profiles such as JDK 11 or Hive
>>> 1.2, see SPARK-32255
>>> - Jenkins build does not run Java documentation build, see SPARK-32233
>>> - Jenkins build does not run the dependency test, see SPARK-32178
>>>
>>> Therefore, I believe PRs can be merged in most general cases once either
>>> the Jenkins PR builder or the GitHub Actions build passes, as long as we
>>> expect successful test results from the default Jenkins PR builder.
>>>
>>> Thanks.
>>>
>>

-- 
---
Takeshi Yamamuro


Re: Welcoming some new Apache Spark committers

2020-07-14 Thread Hyukjin Kwon
Congrats!

On Wed, Jul 15, 2020 at 7:56 AM, Takeshi Yamamuro wrote:

> Congrats, all!
>
> On Wed, Jul 15, 2020 at 5:15 AM Takuya UESHIN 
> wrote:
>
>> Congrats and welcome!
>>
>> On Tue, Jul 14, 2020 at 1:07 PM Bryan Cutler  wrote:
>>
>>> Congratulations and welcome!
>>>
>>> On Tue, Jul 14, 2020 at 12:36 PM Xingbo Jiang 
>>> wrote:
>>>
 Welcome, Huaxin, Jungtaek, and Dilip!

 Congratulations!

 On Tue, Jul 14, 2020 at 10:37 AM Matei Zaharia 
 wrote:

> Hi all,
>
> The Spark PMC recently voted to add several new committers. Please
> join me in welcoming them to their new roles! The new committers are:
>
> - Huaxin Gao
> - Jungtaek Lim
> - Dilip Biswal
>
> All three of them contributed to Spark 3.0 and we’re excited to have
> them join the project.
>
> Matei and the Spark PMC
> -
> To unsubscribe e-mail: dev-unsubscr...@spark.apache.org
>
>
>>
>> --
>> Takuya UESHIN
>>
>>
>
> --
> ---
> Takeshi Yamamuro
>


Re: [DISCUSS] Apache Spark 3.0.1 Release

2020-07-14 Thread Takeshi Yamamuro
> Just wanted to check if there are any blockers that we are still waiting
for to start the new release process.
I don't see any on-going blocker in my area.
Thanks for the notification.

Bests,
Takeshi

On Wed, Jul 15, 2020 at 4:03 AM Dongjoon Hyun 
wrote:

> Hi, Yi.
>
> Could you explain why you think that is a blocker? For the given example
> from the JIRA description,
>
> spark.udf.register("key", udf((m: Map[String, String]) => m.keys.head.toInt))
>
> Seq(Map("1" -> "one", "2" -> "two")).toDF("a").createOrReplaceTempView("t")
>
> checkAnswer(sql("SELECT key(a) AS k FROM t GROUP BY key(a)"), Row(1) :: Nil)
>
>
> Apache Spark 3.0.0 seems to work like the following.
>
> scala> spark.version
> res0: String = 3.0.0
>
> scala> spark.udf.register("key", udf((m: Map[String, String]) =>
> m.keys.head.toInt))
> res1: org.apache.spark.sql.expressions.UserDefinedFunction =
> SparkUserDefinedFunction($Lambda$1958/948653928@5d6bed7b,IntegerType,List(Some(class[value[0]:
> map])),None,false,true)
>
> scala> Seq(Map("1" -> "one", "2" ->
> "two")).toDF("a").createOrReplaceTempView("t")
>
> scala> sql("SELECT key(a) AS k FROM t GROUP BY key(a)").collect
> res3: Array[org.apache.spark.sql.Row] = Array([1])
>
>
> Could you provide a reproducible example?
>
> Bests,
> Dongjoon.
>
>
> On Tue, Jul 14, 2020 at 10:04 AM Yi Wu  wrote:
>
>> This could be a blocker:
>> https://issues.apache.org/jira/browse/SPARK-32307
>>
>> On Tue, Jul 14, 2020 at 11:13 PM Sean Owen  wrote:
>>
>>> https://issues.apache.org/jira/browse/SPARK-32234 ?
>>>
>>> On Tue, Jul 14, 2020 at 9:57 AM Shivaram Venkataraman
>>>  wrote:
>>> >
>>> > Hi all
>>> >
>>> > Just wanted to check if there are any blockers that we are still
>>> waiting for to start the new release process.
>>> >
>>> > Thanks
>>> > Shivaram
>>> >
>>>
>>

-- 
---
Takeshi Yamamuro


Re: Welcoming some new Apache Spark committers

2020-07-14 Thread Takeshi Yamamuro
Congrats, all!

On Wed, Jul 15, 2020 at 5:15 AM Takuya UESHIN 
wrote:

> Congrats and welcome!
>
> On Tue, Jul 14, 2020 at 1:07 PM Bryan Cutler  wrote:
>
>> Congratulations and welcome!
>>
>> On Tue, Jul 14, 2020 at 12:36 PM Xingbo Jiang 
>> wrote:
>>
>>> Welcome, Huaxin, Jungtaek, and Dilip!
>>>
>>> Congratulations!
>>>
>>> On Tue, Jul 14, 2020 at 10:37 AM Matei Zaharia 
>>> wrote:
>>>
 Hi all,

 The Spark PMC recently voted to add several new committers. Please join
 me in welcoming them to their new roles! The new committers are:

 - Huaxin Gao
 - Jungtaek Lim
 - Dilip Biswal

 All three of them contributed to Spark 3.0 and we’re excited to have
 them join the project.

 Matei and the Spark PMC
 -
 To unsubscribe e-mail: dev-unsubscr...@spark.apache.org


>
> --
> Takuya UESHIN
>
>

-- 
---
Takeshi Yamamuro


Re: [DISCUSS] Drop Python 2, 3.4 and 3.5

2020-07-14 Thread Holden Karau
I’m going to drink a celebratory afternoon coffee :)

On Tue, Jul 14, 2020 at 12:26 PM shane knapp ☠  wrote:

> this is seriously great news!  let's all take a moment and welcome apache
> spark's python support to the present.  ;)
>
> On Mon, Jul 13, 2020 at 7:26 PM Holden Karau  wrote:
>
>> Awesome, thank you for driving this forward :)
>>
>> On Mon, Jul 13, 2020 at 7:25 PM Hyukjin Kwon  wrote:
>>
>>> Thank you all. Python 2, 3.4 and 3.5 are dropped now in the master
>>> branch at https://github.com/apache/spark/pull/28957
>>>
>>> On Fri, Jul 3, 2020 at 10:01 AM, Hyukjin Kwon wrote:
>>>
 Thanks Dongjoon. That makes much more sense now!

 On Fri, Jul 3, 2020 at 12:11 AM, Dongjoon Hyun wrote:

> Thank you, Hyukjin.
>
> According to the Python community, Python 3.5 also reaches EOL (end of
> life) on 2020-09-13 (only two months left).
>
> - https://www.python.org/downloads/
>
> So, targeting live Python versions at Apache Spark 3.1.0 (December
> 2020) looks reasonable to me.
>
> For old Python versions, we still have Apache Spark 2.4 LTS and also
> Apache Spark 3.0.x will work.
>
> Bests,
> Dongjoon.
>
>
> On Wed, Jul 1, 2020 at 10:50 PM Yuanjian Li 
> wrote:
>
>> +1, especially Python 2
>>
>> On Thu, Jul 2, 2020 at 10:20 AM, Holden Karau wrote:
>>
>>> I’m ok with us dropping Python 2, 3.4, and 3.5 in Spark 3.1 forward.
>>> It will be exciting to get to use more recent Python features. The most
>>> recent Ubuntu LTS ships with 3.7, and while the previous LTS ships with
>>> 3.5, if folks really can’t upgrade there’s conda.
>>>
>>> Is there anyone with a large Python 3.5 fleet who can’t use conda?
>>>
>>> On Wed, Jul 1, 2020 at 7:15 PM Hyukjin Kwon 
>>> wrote:
>>>
 Yeah, sure. It will be dropped at Spark 3.1 onwards. I don't think
 we should make such changes in maintenance releases

 On Thu, Jul 2, 2020 at 11:13 AM, Holden Karau wrote:

> To be clear the plan is to drop them in Spark 3.1 onwards, yes?
>
> On Wed, Jul 1, 2020 at 7:11 PM Hyukjin Kwon 
> wrote:
>
>> Hi all,
>>
>> I would like to discuss dropping deprecated Python versions 2,
>> 3.4 and 3.5 at https://github.com/apache/spark/pull/28957. I
>> assume people support it in general
>> but I am writing this to make sure everybody is happy.
>>
>> Fokko made a very good investigation on it, see
>> https://github.com/apache/spark/pull/28957#issuecomment-652022449
>> .
> >> Judging from the statistics, I think we're pretty safe to drop
>> them.
>> Also note that dropping Python 2 was actually declared at
>> https://python3statement.org/
>>
>> Roughly speaking, there are many main advantages by dropping them:
>>   1. It removes a bunch of hacks we added (around 700 lines) in
>> PySpark.
>>   2. PyPy2 has a critical bug that causes a flaky test,
>> https://issues.apache.org/jira/browse/SPARK-28358 given my
>> testing and investigation.
>>   3. Users can use Python type hints with Pandas UDFs without
>> thinking about Python version
>>   4. Users can leverage one latest cloudpickle,
>> https://github.com/apache/spark/pull/28950. With Python 3.8+ it
>> can also leverage C pickle.
>>   5. ...
>>
>> So it benefits both users and dev. WDYT guys?
>>
>>
>> --
> Twitter: https://twitter.com/holdenkarau
> Books (Learning Spark, High Performance Spark, etc.):
> https://amzn.to/2MaRAG9  
> YouTube Live Streams: https://www.youtube.com/user/holdenkarau
>
 --
>>> Twitter: https://twitter.com/holdenkarau
>>> Books (Learning Spark, High Performance Spark, etc.):
>>> https://amzn.to/2MaRAG9  
>>> YouTube Live Streams: https://www.youtube.com/user/holdenkarau
>>>
>>
>>
>> --
>> Twitter: https://twitter.com/holdenkarau
>> Books (Learning Spark, High Performance Spark, etc.):
>> https://amzn.to/2MaRAG9  
>> YouTube Live Streams: https://www.youtube.com/user/holdenkarau
>>
>
>
> --
> Shane Knapp
> Computer Guy / Voice of Reason
> UC Berkeley EECS Research / RISELab Staff Technical Lead
> https://rise.cs.berkeley.edu
>
-- 
Twitter: https://twitter.com/holdenkarau
Books (Learning Spark, High Performance Spark, etc.):
https://amzn.to/2MaRAG9  
YouTube Live Streams: https://www.youtube.com/user/holdenkarau
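As a rough illustration of point 3 in the quoted list (type-hint-driven Pandas UDFs, which only becomes practical once pre-3.6 Pythons are dropped), here is a hypothetical, stdlib-only sketch of dispatching on a function's annotations. The real inference logic in PySpark is more involved, and the string annotations below merely stand in for pandas types so the sketch carries no dependencies:

```python
def infer_udf_kind(func) -> str:
    """Hypothetical sketch: pick a UDF variant from a function's raw
    annotations, loosely mirroring how Spark 3.x reads type hints."""
    hints = func.__annotations__  # raw, unresolved annotation strings
    ret = hints.get("return")
    if ret == "pd.Series":
        return "SCALAR"
    if ret == "pd.DataFrame":
        return "GROUPED_MAP"
    return "UNKNOWN"

# String annotations keep the sketch free of a pandas dependency.
def plus_one(s: "pd.Series") -> "pd.Series":
    return s + 1

print(infer_udf_kind(plus_one))  # SCALAR
```

The point of the feature is exactly this: the UDF's kind is read from the signature itself instead of a separate functionType argument, which requires annotation support not present in Python 2.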


Re: Welcoming some new Apache Spark committers

2020-07-14 Thread Takuya UESHIN
Congrats and welcome!

On Tue, Jul 14, 2020 at 1:07 PM Bryan Cutler  wrote:

> Congratulations and welcome!
>
> On Tue, Jul 14, 2020 at 12:36 PM Xingbo Jiang 
> wrote:
>
>> Welcome, Huaxin, Jungtaek, and Dilip!
>>
>> Congratulations!
>>
>> On Tue, Jul 14, 2020 at 10:37 AM Matei Zaharia 
>> wrote:
>>
>>> Hi all,
>>>
>>> The Spark PMC recently voted to add several new committers. Please join
>>> me in welcoming them to their new roles! The new committers are:
>>>
>>> - Huaxin Gao
>>> - Jungtaek Lim
>>> - Dilip Biswal
>>>
>>> All three of them contributed to Spark 3.0 and we’re excited to have
>>> them join the project.
>>>
>>> Matei and the Spark PMC
>>> -
>>> To unsubscribe e-mail: dev-unsubscr...@spark.apache.org
>>>
>>>

-- 
Takuya UESHIN


Re: Welcoming some new Apache Spark committers

2020-07-14 Thread Bryan Cutler
Congratulations and welcome!

On Tue, Jul 14, 2020 at 12:36 PM Xingbo Jiang  wrote:

> Welcome, Huaxin, Jungtaek, and Dilip!
>
> Congratulations!
>
> On Tue, Jul 14, 2020 at 10:37 AM Matei Zaharia 
> wrote:
>
>> Hi all,
>>
>> The Spark PMC recently voted to add several new committers. Please join
>> me in welcoming them to their new roles! The new committers are:
>>
>> - Huaxin Gao
>> - Jungtaek Lim
>> - Dilip Biswal
>>
>> All three of them contributed to Spark 3.0 and we’re excited to have them
>> join the project.
>>
>> Matei and the Spark PMC
>> -
>> To unsubscribe e-mail: dev-unsubscr...@spark.apache.org
>>
>>


Re: Welcoming some new Apache Spark committers

2020-07-14 Thread Xingbo Jiang
Welcome, Huaxin, Jungtaek, and Dilip!

Congratulations!

On Tue, Jul 14, 2020 at 10:37 AM Matei Zaharia 
wrote:

> Hi all,
>
> The Spark PMC recently voted to add several new committers. Please join me
> in welcoming them to their new roles! The new committers are:
>
> - Huaxin Gao
> - Jungtaek Lim
> - Dilip Biswal
>
> All three of them contributed to Spark 3.0 and we’re excited to have them
> join the project.
>
> Matei and the Spark PMC
> -
> To unsubscribe e-mail: dev-unsubscr...@spark.apache.org
>
>


Re: [DISCUSS] Drop Python 2, 3.4 and 3.5

2020-07-14 Thread shane knapp ☠
this is seriously great news!  let's all take a moment and welcome apache
spark's python support to the present.  ;)

On Mon, Jul 13, 2020 at 7:26 PM Holden Karau  wrote:

> Awesome, thank you for driving this forward :)
>
> On Mon, Jul 13, 2020 at 7:25 PM Hyukjin Kwon  wrote:
>
>> Thank you all. Python 2, 3.4 and 3.5 are dropped now in the master branch
>> at https://github.com/apache/spark/pull/28957
>>
>> On Fri, Jul 3, 2020 at 10:01 AM, Hyukjin Kwon wrote:
>>
>>> Thanks Dongjoon. That makes much more sense now!
>>>
>>> On Fri, Jul 3, 2020 at 12:11 AM, Dongjoon Hyun wrote:
>>>
 Thank you, Hyukjin.

 According to the Python community, Python 3.5 also reaches EOL (end of life)
 on 2020-09-13 (only two months left).

 - https://www.python.org/downloads/

 So, targeting live Python versions at Apache Spark 3.1.0 (December
 2020) looks reasonable to me.

 For old Python versions, we still have Apache Spark 2.4 LTS and also
 Apache Spark 3.0.x will work.

 Bests,
 Dongjoon.


 On Wed, Jul 1, 2020 at 10:50 PM Yuanjian Li 
 wrote:

> +1, especially Python 2
>
> On Thu, Jul 2, 2020 at 10:20 AM, Holden Karau wrote:
>
>> I’m ok with us dropping Python 2, 3.4, and 3.5 in Spark 3.1 forward.
>> It will be exciting to get to use more recent Python features. The most
>> recent Ubuntu LTS ships with 3.7, and while the previous LTS ships with
>> 3.5, if folks really can’t upgrade there’s conda.
>>
>> Is there anyone with a large Python 3.5 fleet who can’t use conda?
>>
>> On Wed, Jul 1, 2020 at 7:15 PM Hyukjin Kwon 
>> wrote:
>>
>>> Yeah, sure. It will be dropped at Spark 3.1 onwards. I don't think
>>> we should make such changes in maintenance releases
>>>
>>> On Thu, Jul 2, 2020 at 11:13 AM, Holden Karau wrote:
>>>
 To be clear the plan is to drop them in Spark 3.1 onwards, yes?

 On Wed, Jul 1, 2020 at 7:11 PM Hyukjin Kwon 
 wrote:

> Hi all,
>
> I would like to discuss dropping deprecated Python versions 2, 3.4
> and 3.5 at https://github.com/apache/spark/pull/28957. I assume
> people support it in general
> but I am writing this to make sure everybody is happy.
>
> Fokko made a very good investigation on it, see
> https://github.com/apache/spark/pull/28957#issuecomment-652022449.
> Judging from the statistics, I think we're pretty safe to drop
> them.
> Also note that dropping Python 2 was actually declared at
> https://python3statement.org/
>
> Roughly speaking, there are many main advantages by dropping them:
>   1. It removes a bunch of hacks we added (around 700 lines) in
> PySpark.
>   2. PyPy2 has a critical bug that causes a flaky test,
> https://issues.apache.org/jira/browse/SPARK-28358 given my
> testing and investigation.
>   3. Users can use Python type hints with Pandas UDFs without
> thinking about Python version
>   4. Users can leverage one latest cloudpickle,
> https://github.com/apache/spark/pull/28950. With Python 3.8+ it
> can also leverage C pickle.
>   5. ...
>
> So it benefits both users and dev. WDYT guys?
>
>
> --
 Twitter: https://twitter.com/holdenkarau
 Books (Learning Spark, High Performance Spark, etc.):
 https://amzn.to/2MaRAG9  
 YouTube Live Streams: https://www.youtube.com/user/holdenkarau

>>> --
>> Twitter: https://twitter.com/holdenkarau
>> Books (Learning Spark, High Performance Spark, etc.):
>> https://amzn.to/2MaRAG9  
>> YouTube Live Streams: https://www.youtube.com/user/holdenkarau
>>
>
>
> --
> Twitter: https://twitter.com/holdenkarau
> Books (Learning Spark, High Performance Spark, etc.):
> https://amzn.to/2MaRAG9  
> YouTube Live Streams: https://www.youtube.com/user/holdenkarau
>


-- 
Shane Knapp
Computer Guy / Voice of Reason
UC Berkeley EECS Research / RISELab Staff Technical Lead
https://rise.cs.berkeley.edu
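Point 4 in the quoted list refers to newer cloudpickle being able to build on the faster pickle machinery that arrived in Python 3.8 (protocol 5, PEP 574). A minimal stdlib-only illustration of what 3.8+ unlocks, with no Spark or cloudpickle involved:

```python
import pickle
import sys

# A stand-in for data a worker might serialize.
data = {"partition": list(range(1000))}

# On CPython 3.8+, HIGHEST_PROTOCOL is 5, which adds out-of-band buffer
# support (PEP 574) that serializers such as cloudpickle can build on.
blob = pickle.dumps(data, protocol=pickle.HIGHEST_PROTOCOL)
assert pickle.loads(blob) == data

if sys.version_info >= (3, 8):
    assert pickle.HIGHEST_PROTOCOL >= 5
```

Pinning the minimum Python at 3.6+ (and benefiting further on 3.8+) is what lets PySpark ship a single, current cloudpickle instead of maintaining compatibility shims.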


Re: Welcoming some new Apache Spark committers

2020-07-14 Thread Liang-Chi Hsieh
Congrats! Welcome all!


Dongjoon Hyun-2 wrote
> Welcome everyone! :D
> 
> Bests,
> Dongjoon.
> 
> On Tue, Jul 14, 2020 at 11:21 AM Xiao Li  wrote:
> 
>> Welcome, Dilip, Huaxin and Jungtaek!
>>
>> Xiao
>>
>> On Tue, Jul 14, 2020 at 11:02 AM Holden Karau  wrote:
>>
>>> So excited to have our committer pool growing with these awesome folks,
>>> welcome y'all!
>>>
>>> On Tue, Jul 14, 2020 at 10:59 AM Driesprong, Fokko  wrote:
>>>
 Welcome!

 On Tue, Jul 14, 2020 at 7:53 PM, shane knapp ☠ wrote:

> welcome, all!
>
> On Tue, Jul 14, 2020 at 10:37 AM Matei Zaharia  wrote:
>
>> Hi all,
>>
>> The Spark PMC recently voted to add several new committers. Please
>> join me in welcoming them to their new roles! The new committers are:
>>
>> - Huaxin Gao
>> - Jungtaek Lim
>> - Dilip Biswal
>>
>> All three of them contributed to Spark 3.0 and we’re excited to have
>> them join the project.
>>
>> Matei and the Spark PMC
>> -
>> To unsubscribe e-mail: dev-unsubscr...@spark.apache.org

>>
>>
>
> --
> Shane Knapp
> Computer Guy / Voice of Reason
> UC Berkeley EECS Research / RISELab Staff Technical Lead
> https://rise.cs.berkeley.edu
>

>>>
>>> --
>>> Twitter: https://twitter.com/holdenkarau
>>> Books (Learning Spark, High Performance Spark, etc.):
>>> https://amzn.to/2MaRAG9
>>> YouTube Live Streams: https://www.youtube.com/user/holdenkarau
>>>
>>
>>
>> --
>> https://databricks.com/sparkaisummit/north-america
>>





--
Sent from: http://apache-spark-developers-list.1001551.n3.nabble.com/

-
To unsubscribe e-mail: dev-unsubscr...@spark.apache.org



Re: Welcoming some new Apache Spark committers

2020-07-14 Thread Dongjoon Hyun
Welcome everyone! :D

Bests,
Dongjoon.

On Tue, Jul 14, 2020 at 11:21 AM Xiao Li  wrote:

> Welcome, Dilip, Huaxin and Jungtaek!
>
> Xiao
>
> On Tue, Jul 14, 2020 at 11:02 AM Holden Karau 
> wrote:
>
>> So excited to have our committer pool growing with these awesome folks,
>> welcome y'all!
>>
>> On Tue, Jul 14, 2020 at 10:59 AM Driesprong, Fokko 
>> wrote:
>>
>>> Welcome!
>>>
>>> On Tue, Jul 14, 2020 at 7:53 PM, shane knapp ☠ wrote:
>>>
 welcome, all!

 On Tue, Jul 14, 2020 at 10:37 AM Matei Zaharia 
 wrote:

> Hi all,
>
> The Spark PMC recently voted to add several new committers. Please
> join me in welcoming them to their new roles! The new committers are:
>
> - Huaxin Gao
> - Jungtaek Lim
> - Dilip Biswal
>
> All three of them contributed to Spark 3.0 and we’re excited to have
> them join the project.
>
> Matei and the Spark PMC
> -
> To unsubscribe e-mail: dev-unsubscr...@spark.apache.org
>
>

 --
 Shane Knapp
 Computer Guy / Voice of Reason
 UC Berkeley EECS Research / RISELab Staff Technical Lead
 https://rise.cs.berkeley.edu

>>>
>>
>> --
>> Twitter: https://twitter.com/holdenkarau
>> Books (Learning Spark, High Performance Spark, etc.):
>> https://amzn.to/2MaRAG9  
>> YouTube Live Streams: https://www.youtube.com/user/holdenkarau
>>
>
>
> --
> 
>


Re: [DISCUSS] Apache Spark 3.0.1 Release

2020-07-14 Thread Dongjoon Hyun
Hi, Yi.

Could you explain why you think that is a blocker? For the given example
from the JIRA description,

spark.udf.register("key", udf((m: Map[String, String]) => m.keys.head.toInt))

Seq(Map("1" -> "one", "2" -> "two")).toDF("a").createOrReplaceTempView("t")

checkAnswer(sql("SELECT key(a) AS k FROM t GROUP BY key(a)"), Row(1) :: Nil)


Apache Spark 3.0.0 seems to work like the following.

scala> spark.version
res0: String = 3.0.0

scala> spark.udf.register("key", udf((m: Map[String, String]) =>
m.keys.head.toInt))
res1: org.apache.spark.sql.expressions.UserDefinedFunction =
SparkUserDefinedFunction($Lambda$1958/948653928@5d6bed7b,IntegerType,List(Some(class[value[0]:
map])),None,false,true)

scala> Seq(Map("1" -> "one", "2" ->
"two")).toDF("a").createOrReplaceTempView("t")

scala> sql("SELECT key(a) AS k FROM t GROUP BY key(a)").collect
res3: Array[org.apache.spark.sql.Row] = Array([1])


Could you provide a reproducible example?

Bests,
Dongjoon.


On Tue, Jul 14, 2020 at 10:04 AM Yi Wu  wrote:

> This could be a blocker:
> https://issues.apache.org/jira/browse/SPARK-32307
>
> On Tue, Jul 14, 2020 at 11:13 PM Sean Owen  wrote:
>
>> https://issues.apache.org/jira/browse/SPARK-32234 ?
>>
>> On Tue, Jul 14, 2020 at 9:57 AM Shivaram Venkataraman
>>  wrote:
>> >
>> > Hi all
>> >
>> > Just wanted to check if there are any blockers that we are still
>> waiting for to start the new release process.
>> >
>> > Thanks
>> > Shivaram
>> >
>>
>


Re: Welcoming some new Apache Spark committers

2020-07-14 Thread Xiao Li
Welcome, Dilip, Huaxin and Jungtaek!

Xiao

On Tue, Jul 14, 2020 at 11:02 AM Holden Karau  wrote:

> So excited to have our committer pool growing with these awesome folks,
> welcome y'all!
>
> On Tue, Jul 14, 2020 at 10:59 AM Driesprong, Fokko 
> wrote:
>
>> Welcome!
>>
>> On Tue, Jul 14, 2020 at 7:53 PM shane knapp ☠ wrote:
>>
>>> welcome, all!
>>>
>>> On Tue, Jul 14, 2020 at 10:37 AM Matei Zaharia 
>>> wrote:
>>>
 Hi all,

 The Spark PMC recently voted to add several new committers. Please join
 me in welcoming them to their new roles! The new committers are:

 - Huaxin Gao
 - Jungtaek Lim
 - Dilip Biswal

 All three of them contributed to Spark 3.0 and we’re excited to have
 them join the project.

 Matei and the Spark PMC
 -
 To unsubscribe e-mail: dev-unsubscr...@spark.apache.org


>>>
>>> --
>>> Shane Knapp
>>> Computer Guy / Voice of Reason
>>> UC Berkeley EECS Research / RISELab Staff Technical Lead
>>> https://rise.cs.berkeley.edu
>>>
>>
>
> --
> Twitter: https://twitter.com/holdenkarau
> Books (Learning Spark, High Performance Spark, etc.):
> https://amzn.to/2MaRAG9  
> YouTube Live Streams: https://www.youtube.com/user/holdenkarau
>


-- 



Re: Welcoming some new Apache Spark committers

2020-07-14 Thread Holden Karau
So excited to have our committer pool growing with these awesome folks,
welcome y'all!

On Tue, Jul 14, 2020 at 10:59 AM Driesprong, Fokko 
wrote:

> Welcome!
>
> On Tue, Jul 14, 2020 at 7:53 PM shane knapp ☠ wrote:
>
>> welcome, all!
>>
>> On Tue, Jul 14, 2020 at 10:37 AM Matei Zaharia 
>> wrote:
>>
>>> Hi all,
>>>
>>> The Spark PMC recently voted to add several new committers. Please join
>>> me in welcoming them to their new roles! The new committers are:
>>>
>>> - Huaxin Gao
>>> - Jungtaek Lim
>>> - Dilip Biswal
>>>
>>> All three of them contributed to Spark 3.0 and we’re excited to have
>>> them join the project.
>>>
>>> Matei and the Spark PMC
>>> -
>>> To unsubscribe e-mail: dev-unsubscr...@spark.apache.org
>>>
>>>
>>
>> --
>> Shane Knapp
>> Computer Guy / Voice of Reason
>> UC Berkeley EECS Research / RISELab Staff Technical Lead
>> https://rise.cs.berkeley.edu
>>
>

-- 
Twitter: https://twitter.com/holdenkarau
Books (Learning Spark, High Performance Spark, etc.):
https://amzn.to/2MaRAG9  
YouTube Live Streams: https://www.youtube.com/user/holdenkarau


Re: Welcoming some new Apache Spark committers

2020-07-14 Thread Driesprong, Fokko
Welcome!

On Tue, Jul 14, 2020 at 7:53 PM shane knapp ☠ wrote:

> welcome, all!
>
> On Tue, Jul 14, 2020 at 10:37 AM Matei Zaharia 
> wrote:
>
>> Hi all,
>>
>> The Spark PMC recently voted to add several new committers. Please join
>> me in welcoming them to their new roles! The new committers are:
>>
>> - Huaxin Gao
>> - Jungtaek Lim
>> - Dilip Biswal
>>
>> All three of them contributed to Spark 3.0 and we’re excited to have them
>> join the project.
>>
>> Matei and the Spark PMC
>> -
>> To unsubscribe e-mail: dev-unsubscr...@spark.apache.org
>>
>>
>
> --
> Shane Knapp
> Computer Guy / Voice of Reason
> UC Berkeley EECS Research / RISELab Staff Technical Lead
> https://rise.cs.berkeley.edu
>


Re: Welcoming some new Apache Spark committers

2020-07-14 Thread shane knapp ☠
welcome, all!

On Tue, Jul 14, 2020 at 10:37 AM Matei Zaharia 
wrote:

> Hi all,
>
> The Spark PMC recently voted to add several new committers. Please join me
> in welcoming them to their new roles! The new committers are:
>
> - Huaxin Gao
> - Jungtaek Lim
> - Dilip Biswal
>
> All three of them contributed to Spark 3.0 and we’re excited to have them
> join the project.
>
> Matei and the Spark PMC
> -
> To unsubscribe e-mail: dev-unsubscr...@spark.apache.org
>
>

-- 
Shane Knapp
Computer Guy / Voice of Reason
UC Berkeley EECS Research / RISELab Staff Technical Lead
https://rise.cs.berkeley.edu


Re: Welcoming some new Apache Spark committers

2020-07-14 Thread Reynold Xin
Welcome all!

On Tue, Jul 14, 2020 at 10:36 AM Matei Zaharia <matei.zaha...@gmail.com>
wrote:

> 
> 
> 
> Hi all,
> 
> 
> 
> The Spark PMC recently voted to add several new committers. Please join me
> in welcoming them to their new roles! The new committers are:
> 
> 
> 
> - Huaxin Gao
> - Jungtaek Lim
> - Dilip Biswal
> 
> 
> 
> All three of them contributed to Spark 3.0 and we’re excited to have them
> join the project.
> 
> 
> 
> Matei and the Spark PMC
> -
> To unsubscribe e-mail: dev-unsubscr...@spark.apache.org
> 
> 
>



Welcoming some new Apache Spark committers

2020-07-14 Thread Matei Zaharia
Hi all,

The Spark PMC recently voted to add several new committers. Please join me in 
welcoming them to their new roles! The new committers are:

- Huaxin Gao
- Jungtaek Lim
- Dilip Biswal

All three of them contributed to Spark 3.0 and we’re excited to have them join 
the project.

Matei and the Spark PMC
-
To unsubscribe e-mail: dev-unsubscr...@spark.apache.org



Re: [DISCUSS] Apache Spark 3.0.1 Release

2020-07-14 Thread Yi Wu
This could be a blocker:
https://issues.apache.org/jira/browse/SPARK-32307

On Tue, Jul 14, 2020 at 11:13 PM Sean Owen  wrote:

> https://issues.apache.org/jira/browse/SPARK-32234 ?
>
> On Tue, Jul 14, 2020 at 9:57 AM Shivaram Venkataraman
>  wrote:
> >
> > Hi all
> >
> > Just wanted to check if there are any blockers that we are still waiting
> for to start the new release process.
> >
> > Thanks
> > Shivaram
> >
>


Re: [DISCUSS] Apache Spark 3.0.1 Release

2020-07-14 Thread Sean Owen
https://issues.apache.org/jira/browse/SPARK-32234 ?

On Tue, Jul 14, 2020 at 9:57 AM Shivaram Venkataraman
 wrote:
>
> Hi all
>
> Just wanted to check if there are any blockers that we are still waiting for 
> to start the new release process.
>
> Thanks
> Shivaram
>

-
To unsubscribe e-mail: dev-unsubscr...@spark.apache.org



Re: [DISCUSS] Apache Spark 3.0.1 Release

2020-07-14 Thread Shivaram Venkataraman
Hi all

Just wanted to check if there are any blockers that we are still waiting
for to start the new release process.

Thanks
Shivaram

On Sun, Jul 5, 2020, 06:51 wuyi  wrote:

> Ok, after having another look, I think it only affects local cluster deploy
> mode, which is for testing only.
>
>
> wuyi wrote:
> > Please also include https://issues.apache.org/jira/browse/SPARK-32120 in
> > Spark 3.0.1. It's a regression compared to Spark 3.0.0-preview2.
> >
> > Thanks,
> > Yi Wu
> >
> >
> > Yuanjian Li wrote
> >> Hi dev-list,
> >>
> >> I’m writing this to raise the discussion about Spark 3.0.1 feasibility
> >> since 4 blocker issues were found after Spark 3.0.0:
> >>
> >>
> >> 1. [SPARK-31990] https://issues.apache.org/jira/browse/SPARK-31990
> >>    The broken state store compatibility will cause a correctness issue
> >>    when a streaming query with `dropDuplicates` uses a checkpoint written
> >>    by an old Spark version.
> >> 2. [SPARK-32038] https://issues.apache.org/jira/browse/SPARK-32038
> >>    A regression bug in handling NaN values in COUNT(DISTINCT).
> >> 3. [SPARK-31918] https://issues.apache.org/jira/browse/SPARK-31918 [WIP]
> >>    CRAN requires it to work with the latest R 4.0. This makes the 3.0
> >>    release unavailable on CRAN, and it only supports R [3.5, 4.0).
> >> 4. [SPARK-31967] https://issues.apache.org/jira/browse/SPARK-31967
> >>    Downgrade vis.js to fix the Jobs UI loading time regression.
> >>
> >> I also noticed branch-3.0 already has 39 commits after Spark 3.0.0
> >> (https://issues.apache.org/jira/browse/SPARK-32038?jql=project%20%3D%20SPARK%20AND%20fixVersion%20%3D%203.0.1).
> >> I think it would be great if we have Spark 3.0.1 to deliver the critical
> >> fixes.
> >>
> >> Any comments are appreciated.
> >>
> >> Best,
> >>
> >> Yuanjian
> >
> >
> >
> >
> >
> > --
> > Sent from: http://apache-spark-developers-list.1001551.n3.nabble.com/
> >
> > -
> > To unsubscribe e-mail: dev-unsubscr...@spark.apache.org
> >
>
>
>
>
> --
> Sent from: http://apache-spark-developers-list.1001551.n3.nabble.com/
>
> -
> To unsubscribe e-mail: dev-unsubscr...@spark.apache.org
>
>


Re: [PSA] Apache Spark uses GitHub Actions to run the tests

2020-07-14 Thread Hyukjin Kwon
Perfect. Also, GitHub Actions runs only for the master branch at the moment.

BTW, I think we can re-enable the Java/Scala doc build and the dependency
test in Jenkins for simplicity.
The Jenkins machines seem to be back to normal.

On Tue, Jul 14, 2020 at 4:08 PM Wenchen Fan wrote:

> To clarify, we need to wait for:
> 1. the Java documentation build test in GitHub Actions
> 2. the dependency test in GitHub Actions
> 3. either GitHub Actions all green or a Jenkins pass
>
> If the PR touches Kinesis, or it uses other profiles, we must wait for
> jenkins to pass.
>
> Do I miss something?
>
> On Tue, Jul 14, 2020 at 2:18 PM Hyukjin Kwon  wrote:
>
>> Hi dev,
>>
>> A GitHub Actions build was introduced to run the regular Spark test cases
>> at https://github.com/apache/spark/pull/29057 and
>> https://github.com/apache/spark/pull/29086.
>> At the moment this is virtually a duplicate of the default Jenkins PR
>> builder.
>>
>> The only differences are:
>> - Github Actions does not run the tests for Kinesis, see SPARK-32246
>> - Github Actions does not support other profiles such as JDK 11 or Hive
>> 1.2, see SPARK-32255
>> - Jenkins build does not run Java documentation build, see SPARK-32233
>> - Jenkins build does not run the dependency test, see SPARK-32178
>>
>> Therefore, I believe PRs can generally be merged once either the Jenkins
>> PR builder or the GitHub Actions build passes, in the cases where we would
>> expect successful results from the default Jenkins PR builder.
>>
>> Thanks.
>>
>


Re: [PSA] Apache Spark uses GitHub Actions to run the tests

2020-07-14 Thread Wenchen Fan
To clarify, we need to wait for:
1. the Java documentation build test in GitHub Actions
2. the dependency test in GitHub Actions
3. either GitHub Actions all green or a Jenkins pass

If the PR touches Kinesis, or it uses other profiles, we must wait for
jenkins to pass.

Do I miss something?

On Tue, Jul 14, 2020 at 2:18 PM Hyukjin Kwon  wrote:

> Hi dev,
>
> A GitHub Actions build was introduced to run the regular Spark test cases
> at https://github.com/apache/spark/pull/29057 and
> https://github.com/apache/spark/pull/29086.
> At the moment this is virtually a duplicate of the default Jenkins PR
> builder.
>
> The only differences are:
> - Github Actions does not run the tests for Kinesis, see SPARK-32246
> - Github Actions does not support other profiles such as JDK 11 or Hive
> 1.2, see SPARK-32255
> - Jenkins build does not run Java documentation build, see SPARK-32233
> - Jenkins build does not run the dependency test, see SPARK-32178
>
> Therefore, I believe PRs can generally be merged once either the Jenkins
> PR builder or the GitHub Actions build passes, in the cases where we would
> expect successful results from the default Jenkins PR builder.
>
> Thanks.
>


[PSA] Apache Spark uses GitHub Actions to run the tests

2020-07-14 Thread Hyukjin Kwon
Hi dev,

A GitHub Actions build was introduced to run the regular Spark test cases at
https://github.com/apache/spark/pull/29057 and
https://github.com/apache/spark/pull/29086.
At the moment this is virtually a duplicate of the default Jenkins PR
builder.

The only differences are:
- Github Actions does not run the tests for Kinesis, see SPARK-32246
- Github Actions does not support other profiles such as JDK 11 or Hive
1.2, see SPARK-32255
- Jenkins build does not run Java documentation build, see SPARK-32233
- Jenkins build does not run the dependency test, see SPARK-32178
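(For context, the kind of workflow being described might look like the following minimal sketch. This is a hypothetical illustration only, not the actual workflow file in apache/spark; job names and commands are assumptions.)

```yaml
# Hypothetical minimal sketch of a PR-triggered test workflow.
# The real .github/workflows definition in apache/spark differs.
name: build-and-test
on: [push, pull_request]
jobs:
  tests:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      # Default profile only: no Kinesis tests (SPARK-32246) and no
      # alternative JDK 11 / Hive 1.2 profiles (SPARK-32255).
      - name: Run tests
        run: ./build/sbt test
```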

Therefore, I believe PRs can generally be merged once either the Jenkins PR
builder or the GitHub Actions build passes, in the cases where we would
expect successful results from the default Jenkins PR builder.

Thanks.