Re: [DISCUSS] Apache Spark 3.0.1 Release

2020-07-01 Thread Shivaram Venkataraman
Thanks Holden -- it would be great to also get 2.4.7 started Thanks Shivaram On Tue, Jun 30, 2020 at 10:31 PM Holden Karau wrote: > > I can take care of 2.4.7 unless someone else wants to do it. > > On Tue, Jun 30, 2020 at 8:29 PM Jason Moore > wrote: >> >> Hi all, >> >> >> >> Could I get

Re: Setting spark.hadoop.mapreduce.fileoutputcommitter.algorithm.version=1 and Doc issue

2020-07-01 Thread Steve Loughran
https://issues.apache.org/jira/browse/MAPREDUCE-7282 "MR v2 commit algorithm is dangerous, should be deprecated and not the default" Could someone do a PR to change the default? If it doesn't break too much I'll merge it. On Mon, 29 Jun 2020 at 13:20, Steve Loughran wrote: > v2 does a file-by-file
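For readers following this thread, here is a minimal sketch of pinning the commit algorithm to v1 on the command line, using the property named in the subject line. The helper name `committer_conf_args` and the application file `app.py` are hypothetical and only illustrate the shape of the flag; this is not taken from the thread itself.

```python
# Property name taken from the thread subject line.
COMMITTER_KEY = "spark.hadoop.mapreduce.fileoutputcommitter.algorithm.version"


def committer_conf_args(version=1):
    """Return spark-submit arguments that select a commit algorithm version.

    Hypothetical helper for illustration; only versions 1 and 2 are accepted.
    """
    if version not in (1, 2):
        raise ValueError("commit algorithm version must be 1 or 2")
    return ["--conf", f"{COMMITTER_KEY}={version}"]


if __name__ == "__main__":
    # "app.py" is a placeholder application name, not from the thread.
    print(" ".join(["spark-submit", *committer_conf_args(1), "app.py"]))
```

The same key/value pair could presumably be placed in a Spark defaults file instead of being passed per job; which is preferable depends on whether all jobs on the cluster should share the setting.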

Re: [DISCUSS] Apache Spark 3.0.1 Release

2020-07-01 Thread Jungtaek Lim
https://issues.apache.org/jira/browse/SPARK-32148 was reported yesterday, and if the report is valid it looks to be a blocker. I'll try to take a look sooner. On Thu, Jul 2, 2020 at 12:48 AM Shivaram Venkataraman < shiva...@eecs.berkeley.edu> wrote: > Thanks Holden -- it would be great to also

Re: Spark 3 pod template for the driver

2020-07-01 Thread Edward Mitchell
Okay, I see what's going on here. Looks like the way that Spark is coded, the driver container image (specified by --conf spark.kubernetes.driver.container.image) and executor container image (specified by --conf spark.kubernetes.executor.container.image) are required. If they're not specified
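As an illustration of the point above, a small pre-submit check could flag a configuration that lacks either image setting. This is a sketch under the assumption that the configuration is available as a plain dictionary; the function name `missing_image_keys` and the image name in the usage example are hypothetical, and the check says nothing about what Spark does when the settings are absent.

```python
# The two settings named in the message above.
REQUIRED_IMAGE_KEYS = (
    "spark.kubernetes.driver.container.image",
    "spark.kubernetes.executor.container.image",
)


def missing_image_keys(conf):
    """Return the image settings that are absent or empty in conf.

    conf is assumed to be a dict of Spark property names to string values.
    """
    return [key for key in REQUIRED_IMAGE_KEYS if not conf.get(key)]


if __name__ == "__main__":
    # Placeholder image name for illustration only.
    conf = {"spark.kubernetes.driver.container.image": "example.invalid/spark:3.0.0"}
    for key in missing_image_keys(conf):
        print(f"missing setting: {key}")
```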

[DISCUSS] Drop Python 2, 3.4 and 3.5

2020-07-01 Thread Hyukjin Kwon
Hi all, I would like to discuss dropping deprecated Python versions 2, 3.4 and 3.5 at https://github.com/apache/spark/pull/28957. I assume people support it in general but I am writing this to make sure everybody is happy. Fokko made a very good investigation on it, see

Re: [VOTE] Decommissioning SPIP

2020-07-01 Thread Imran Rashid
+1 I think this is going to be a really important feature for Spark and I'm glad to see Holden focusing on it. On Wed, Jul 1, 2020 at 8:38 PM Mridul Muralidharan wrote: > +1 > > Thanks, > Mridul > > On Wed, Jul 1, 2020 at 6:36 PM Hyukjin Kwon wrote: > >> +1 >> >> On Thu, Jul 2, 2020 at 10:08 AM,

Re: [DISCUSS] Drop Python 2, 3.4 and 3.5

2020-07-01 Thread Hyukjin Kwon
Yeah, sure. They will be dropped from Spark 3.1 onwards. I don't think we should make such changes in maintenance releases. On Thu, Jul 2, 2020 at 11:13 AM, Holden Karau wrote: > To be clear the plan is to drop them in Spark 3.1 onwards, yes? > > On Wed, Jul 1, 2020 at 7:11 PM Hyukjin Kwon wrote: > >> Hi

Re: [VOTE] Decommissioning SPIP

2020-07-01 Thread Marcelo Vanzin
I reviewed the docs and PRs from way before an SPIP was explicitly asked, so I'm comfortable with giving a +1 even if I haven't really fully read the new document, On Wed, Jul 1, 2020 at 6:05 PM Holden Karau wrote: > > Hi Spark Devs, > > I think discussion has settled on the SPIP doc at >

Re: [DISCUSS] Drop Python 2, 3.4 and 3.5

2020-07-01 Thread Holden Karau
I’m ok with us dropping Python 2, 3.4, and 3.5 in Spark 3.1 forward. It will be exciting to get to use more recent Python features. The most recent Ubuntu LTS ships with 3.7, and while the previous LTS ships with 3.5, if folks really can’t upgrade there’s conda. Is there anyone with a large

[VOTE] Decommissioning SPIP

2020-07-01 Thread Holden Karau
Hi Spark Devs, I think discussion has settled on the SPIP doc at https://docs.google.com/document/d/1EOei24ZpVvR7_w0BwBjOnrWRy4k-qTdIlx60FsHZSHA/edit?usp=sharing , design doc at https://docs.google.com/document/d/1xVO1b6KAwdUhjEJBolVPl9C6sLj7oOveErwDSYdT-pE/edit, or JIRA

Re: [DISCUSS] Drop Python 2, 3.4 and 3.5

2020-07-01 Thread Holden Karau
To be clear the plan is to drop them in Spark 3.1 onwards, yes? On Wed, Jul 1, 2020 at 7:11 PM Hyukjin Kwon wrote: > Hi all, > > I would like to discuss dropping deprecated Python versions 2, 3.4 and 3.5 > at https://github.com/apache/spark/pull/28957. I assume people support it > in general >

Fwd: Announcing ApacheCon @Home 2020

2020-07-01 Thread Felix Cheung
-- Forwarded message - We are pleased to announce that ApacheCon @Home will be held online, September 29 through October 1. More event details are available at https://apachecon.com/acah2020 but there’s a few things that I want to highlight for you, the members. Yes, the CFP

Re: [DISCUSS] Apache Spark 3.0.1 Release

2020-07-01 Thread Xiao Li
+1 on releasing both 3.0.1 and 2.4.7 Great! Three committers volunteer to be a release manager. Ruifeng, Prashant and Holden. Holden just helped release Spark 2.4.6. This time, maybe, Ruifeng and Prashant can be the release manager of 3.0.1 and 2.4.7 respectively. Xiao On Wed, Jul 1, 2020 at

Re: m2 cache issues in Jenkins?

2020-07-01 Thread Hyukjin Kwon
Nope, do we have an existing ticket? I think we can reopen it if there is one. On Thu, Jul 2, 2020 at 1:43 PM, Holden Karau wrote: > Huh interesting that it’s the same worker. Have you filed a ticket to > Shane? > > On Wed, Jul 1, 2020 at 8:50 PM Hyukjin Kwon wrote: > >> Hm .. seems this is happening again

Re: m2 cache issues in Jenkins?

2020-07-01 Thread Holden Karau
We don't; I didn't file one originally, but Shane reminded me to in the future. On Wed, Jul 1, 2020 at 9:44 PM Hyukjin Kwon wrote: > Nope, do we have an existing ticket? I think we can reopen if there is. > > On Thu, Jul 2, 2020 at 1:43 PM, Holden Karau wrote: > >> Huh interesting that it’s the same

Re: [VOTE] Decommissioning SPIP

2020-07-01 Thread Hyukjin Kwon
+1 On Thu, Jul 2, 2020 at 10:08 AM, Marcelo Vanzin wrote: > I reviewed the docs and PRs from way before an SPIP was explicitly > asked, so I'm comfortable with giving a +1 even if I haven't really > fully read the new document, > > On Wed, Jul 1, 2020 at 6:05 PM Holden Karau wrote: > > > > Hi Spark

Re: [VOTE] Decommissioning SPIP

2020-07-01 Thread Mridul Muralidharan
+1 Thanks, Mridul On Wed, Jul 1, 2020 at 6:36 PM Hyukjin Kwon wrote: > +1 > > On Thu, Jul 2, 2020 at 10:08 AM, Marcelo Vanzin wrote: > >> I reviewed the docs and PRs from way before an SPIP was explicitly >> asked, so I'm comfortable with giving a +1 even if I haven't really >> fully read the new

Re: m2 cache issues in Jenkins?

2020-07-01 Thread Holden Karau
Huh interesting that it’s the same worker. Have you filed a ticket to Shane? On Wed, Jul 1, 2020 at 8:50 PM Hyukjin Kwon wrote: > Hm .. seems this is happening again in amp-jenkins-worker-04 ;(. > > On Thu, Jun 25, 2020 at 3:15 AM, shane knapp ☠ wrote: > >> done: >> -bash-4.1$ cd .m2 >> -bash-4.1$ ls

Re: [DISCUSS] Apache Spark 3.0.1 Release

2020-07-01 Thread Holden Karau
I’m happy to have Prashant do 2.4.7 :) On Wed, Jul 1, 2020 at 9:40 PM Xiao Li wrote: > +1 on releasing both 3.0.1 and 2.4.7 > > Great! Three committers volunteer to be a release manager. Ruifeng, > Prashant and Holden. Holden just helped release Spark 2.4.6. This time, > maybe, Ruifeng and

Re: [VOTE] Decommissioning SPIP

2020-07-01 Thread Stephen Boesch
+1 Thx for seeing this through On Wed, 1 Jul 2020 at 20:03, Imran Rashid wrote: > +1 > > I think this is going to be a really important feature for Spark and I'm > glad to see Holden focusing on it. > > On Wed, Jul 1, 2020 at 8:38 PM Mridul Muralidharan > wrote: > >> +1 >> >> Thanks, >> Mridul

Re: m2 cache issues in Jenkins?

2020-07-01 Thread Hyukjin Kwon
Hm .. seems this is happening again in amp-jenkins-worker-04 ;(. On Thu, Jun 25, 2020 at 3:15 AM, shane knapp ☠ wrote: > done: > -bash-4.1$ cd .m2 > -bash-4.1$ ls > repository > -bash-4.1$ time rm -rf * > > real 17m4.607s > user 0m0.950s > sys 0m18.816s > -bash-4.1$ > > On Wed, Jun 24, 2020

Re: m2 cache issues in Jenkins?

2020-07-01 Thread Hyukjin Kwon
Ah, okay. Actually there already is one: https://issues.apache.org/jira/browse/SPARK-31693. I am reopening it. On Thu, Jul 2, 2020 at 2:06 PM, Holden Karau wrote: > We don't I didn't file one originally, but Shane reminded me to in the > future. > > On Wed, Jul 1, 2020 at 9:44 PM Hyukjin Kwon wrote: > >>

Re: [DISCUSS] Drop Python 2, 3.4 and 3.5

2020-07-01 Thread Yuanjian Li
+1, especially Python 2. On Thu, Jul 2, 2020 at 10:20 AM, Holden Karau wrote: > I’m ok with us dropping Python 2, 3.4, and 3.5 in Spark 3.1 forward. It > will be exciting to get to use more recent Python features. The most recent > Ubuntu LTS ships with 3.7, and while the previous LTS ships with 3.5, if > folks

Re: Apache Spark 3.1 Feature Expectation (Dec. 2020)

2020-07-01 Thread Gabor Somogyi
Hi Dongjoon, I would add JDBC Kerberos support w/ keytab: https://issues.apache.org/jira/browse/SPARK-12312 BR, G On Mon, Jun 29, 2020 at 6:07 PM Dongjoon Hyun wrote: > Hi, All. > > After a short celebration of Apache Spark 3.0, I'd like to ask you the > community opinion on Apache Spark 3.1