+1
From: Denny Lee
Sent: Monday, April 1, 2024 10:06:14 AM
To: Hussein Awala
Cc: Chao Sun ; Hyukjin Kwon ; Mridul
Muralidharan ; dev
Subject: Re: [VOTE] SPIP: Pure Python Package in PyPI (Spark Connect)
+1 (non-binding)
On Mon, Apr 1, 2024 at 9:24 AM Hussein
+1 to doc, seed argument would be great if possible
From: Sean Owen
Sent: Monday, September 26, 2022 5:26:26 PM
To: Nicholas Gustafson
Cc: dev
Subject: Re: Why are hash functions seeded with 42?
Oh yeah I get why we love to pick 42 for random things. I'm
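The point of the fixed seed above is reproducibility: the same input hashes to the same value on every run. A toy sketch of that property, using Python's hashlib rather than Spark's actual Murmur3 implementation (the function name here is made up for illustration):

```python
# Illustrative only: Spark seeds its Murmur3-based hash functions with 42
# so results are deterministic across runs. This toy analogue is NOT
# Spark's algorithm; seeded_hash is a hypothetical helper.
import hashlib

def seeded_hash(value: str, seed: int = 42) -> int:
    digest = hashlib.sha256(f"{seed}:{value}".encode()).digest()
    return int.from_bytes(digest[:8], "big")

# Same seed, same input: identical hash on every run.
assert seeded_hash("spark") == seeded_hash("spark")
# A different seed assigns values to different buckets.
assert seeded_hash("spark", seed=0) != seeded_hash("spark", seed=42)
```

A user-supplied seed argument, as requested in the thread above, would simply expose `seed` as a parameter instead of hard-coding 42.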
---------- Forwarded message ---------
From: Gregor Seyer
Date: Wed, Oct 20, 2021 at 4:42 AM
Subject: Re: CRAN submission SparkR 3.2.0
To: Felix Cheung, CRAN <cran-submissi...@r-project.org>
Thanks,
Please add \value to .Rd files regarding exported methods and explain
the functions r
ser' confirmation when we
> install.spark?
> IIRC, the auto installation is only triggered by interactive shell so
> getting user's confirmation should be fine.
>
> On Fri, Jun 18, 2021 at 2:54 AM, Felix Cheung wrote:
>
>> Any suggestion or comment on this? They are going to re
Any suggestion or comment on this? They are going to remove the package by
6-28
Seems to me if we have a switch to opt in to install (and not by default
on), or prompt the user in interactive session, should be good as user
confirmation.
On Sun, Jun 13, 2021 at 11:25 PM Felix Cheung
wrote
, 2021 at 10:19 PM
Subject: CRAN package SparkR
To: Felix Cheung
CC:
Dear maintainer,
Checking this apparently creates the default directory as per
#' @param localDir a local directory where Spark is installed. The directory contains
#' version-specific folders of Spark
Welcome!
From: Driesprong, Fokko
Sent: Friday, March 26, 2021 1:25:33 PM
To: Matei Zaharia
Cc: Spark Dev List
Subject: Re: Welcoming six new Apache Spark committers
Well deserved all! Welcome!
On Fri, Mar 26, 2021 at 9:21 PM, Matei Zaharia wrote:
Congrats and thanks!
From: Hyukjin Kwon
Sent: Wednesday, March 3, 2021 4:09:23 PM
To: Dongjoon Hyun
Cc: Gabor Somogyi ; Jungtaek Lim
; angers zhu ; Wenchen Fan
; Kent Yao ; Takeshi Yamamuro
; dev ; user @spark
Subject: Re: [ANNOUNCE] Announcing Apache Spark
consider
> dropping it as Dongjoon initially pointed out.
>
> On Wed, Dec 30, 2020 at 1:59 PM, Felix Cheung wrote:
>
>> Ah, I don’t recall actually - maybe it was just missed?
>>
>> The last message I had, was in June when it was broken by R 4.0.1, which
>> was fixed.
>
-31918 and
> https://issues.apache.org/jira/browse/SPARK-32073.
> I wonder why other releases were not uploaded yet. Do you guys know any
> context or if there is a standing issue on this, @Felix Cheung
> or @Shivaram Venkataraman
> ?
>
> On Wed, Dec 23, 2020 at 11:21 AM, Mridul Mu
Ok - it took many years to get it first published, so it was hard to get
there.
On Tue, Dec 22, 2020 at 5:45 PM Hyukjin Kwon wrote:
> Adding @Shivaram Venkataraman and @Felix
> Cheung FYI
>
> On Wed, Dec 23, 2020 at 9:22 AM, Michael Heuer wrote:
>
>> Anecdotally, as a projec
So IMO maintaining outside in a separate repo is going to be harder. That was
why I asked.
From: Maciej Szymkiewicz
Sent: Tuesday, August 4, 2020 12:59 PM
To: Sean Owen
Cc: Felix Cheung; Hyukjin Kwon; Driesprong, Fokko; Holden Karau; Spark Dev List
Subject: Re
What would be the reason for separate git repo?
From: Hyukjin Kwon
Sent: Monday, August 3, 2020 1:58:55 AM
To: Maciej Szymkiewicz
Cc: Driesprong, Fokko ; Holden Karau
; Spark Dev List
Subject: Re: [PySpark] Revisiting PySpark type annotations
Okay, seems like
+1
From: Holden Karau
Sent: Wednesday, July 22, 2020 10:49:49 AM
To: Steve Loughran
Cc: dev
Subject: Re: Exposing Spark parallelized directory listing & non-locality
listing in core
Wonderful. To be clear the patch is more to start the discussion about how we
Welcome!
From: Nick Pentreath
Sent: Tuesday, July 14, 2020 10:21:17 PM
To: dev
Cc: Dilip Biswal ; Jungtaek Lim
; huaxin gao
Subject: Re: Welcoming some new Apache Spark committers
Congratulations and welcome as Apache Spark committers!
On Wed, 15 Jul 2020 at
I think pluggable storage in shuffle is essential for k8s GA
From: Holden Karau
Sent: Monday, June 29, 2020 9:33 AM
To: Maxim Gekk
Cc: Dongjoon Hyun; dev
Subject: Re: Apache Spark 3.1 Feature Expectation (Dec. 2020)
Should we also consider the shuffle service
---------- Forwarded message ---------
We are pleased to announce that ApacheCon @Home will be held online,
September 29 through October 1.
More event details are available at https://apachecon.com/acah2020 but
there’s a few things that I want to highlight for you, the members.
Yes, the CFP
Congrats
From: Jungtaek Lim
Sent: Thursday, June 18, 2020 8:18:54 PM
To: Hyukjin Kwon
Cc: Mridul Muralidharan ; Reynold Xin ;
dev ; user
Subject: Re: [ANNOUNCE] Apache Spark 3.0.0
Great, thanks all for your efforts on the huge step forward!
On Fri, Jun 19,
I think it’s a good idea
From: Hyukjin Kwon
Sent: Wednesday, January 15, 2020 5:49:12 AM
To: dev
Cc: Sean Owen ; Nicholas Chammas
Subject: Re: More publicly documenting the options under spark.sql.*
Resending to the dev list for archive purpose:
I think
; Christopher Crosbie ; Griselda
Cuevas ; Holden Karau ; Mayank Ahuja
; Kalyan Sivakumar ; alfo...@fb.com
; Felix Cheung ; Matt Cheah
; Yifei Huang (PD)
Subject: Re: Enabling fully disaggregated shuffle on Spark
That sounds great!
On Wed, Nov 20, 2019 at 9:02 AM John Zhuge
Just to add - hive 1.2 fork is definitely not more stable. We know of a few
critical bug fixes that we cherry picked into a fork of that fork to maintain
ourselves.
From: Dongjoon Hyun
Sent: Wednesday, November 20, 2019 11:07:47 AM
To: Sean Owen
Cc: dev
1000% with Steve, the org.spark-project hive 1.2 will need a solution. It is
old and rather buggy, and it's been *years*
I think we should decouple hive change from everything else if people are
concerned?
From: Steve Loughran
Sent: Sunday, November 17, 2019
this is about test description and not test file name right?
if yes I don’t see a problem.
From: Hyukjin Kwon
Sent: Thursday, November 14, 2019 6:03:02 PM
To: Shixiong(Ryan) Zhu
Cc: dev ; Felix Cheung ;
Shivaram Venkataraman
Subject: Re: Adding JIRA ID
+1
From: Thomas graves
Sent: Wednesday, September 4, 2019 7:24:26 AM
To: dev
Subject: [VOTE] [SPARK-27495] SPIP: Support Stage level resource configuration
and scheduling
Hey everyone,
I'd like to call for a vote on SPARK-27495 SPIP: Support Stage level
I’d prefer strict mode and fail fast (analysis check)
Also I like what Alastair suggested about standard clarification.
I think we can re-visit this proposal and restart the vote
From: Ryan Blue
Sent: Friday, September 6, 2019 5:28 PM
To: Alastair Green
Cc:
(Hmm, what is spark-...@apache.org?)
From: Sean Owen
Sent: Tuesday, September 3, 2019 11:58:30 AM
To: Xiao Li
Cc: Tom Graves ; spark-...@apache.org
Subject: Re: maven 3.6.1 removed from apache maven repo
It's because build/mvn only queries ASF mirrors, and
I did review it and solving this problem makes sense. I will comment in the
JIRA.
From: Jungtaek Lim
Sent: Sunday, August 25, 2019 3:34:22 PM
To: dev
Subject: Design review of SPARK-28594
Hi devs,
I have been working on designing SPARK-28594 [1] (though I've
+1
Run tests, R tests, r-hub Debian, Ubuntu, mac, Windows
From: Hyukjin Kwon
Sent: Wednesday, August 28, 2019 9:14 PM
To: Takeshi Yamamuro
Cc: dev; Dongjoon Hyun
Subject: Re: [VOTE] Release Apache Spark 2.4.4 (RC3)
+1 (from the last blocker PR)
On Aug 29, 2019
That’s great!
From: ☼ R Nair
Sent: Saturday, August 24, 2019 10:57:31 AM
To: Dongjoon Hyun
Cc: dev@spark.apache.org; user@spark
Subject: Re: JDK11 Support in Apache Spark
Finally!!! Congrats
On Sat, Aug 24, 2019, 11:11
+1
Glad to see the progress in this space - it’s been more than a year since the
original discussion and effort started.
From: Yinan Li
Sent: Monday, June 17, 2019 7:14:42 PM
To: rb...@netflix.com
Cc: Dongjoon Hyun; Saisai Shao; Imran Rashid; Ilan Filonenko; bo
How about pyArrow?
From: Holden Karau
Sent: Friday, June 14, 2019 11:06:15 AM
To: Felix Cheung
Cc: Bryan Cutler; Dongjoon Hyun; Hyukjin Kwon; dev; shane knapp
Subject: Re: [DISCUSS] Increasing minimum supported version of Pandas
Are there other Python
So to be clear, min version check is 0.23
Jenkins test is 0.24
I’m ok with this. I hope someone will test 0.23 on releases though before we
sign off?
From: shane knapp
Sent: Friday, June 14, 2019 10:23:56 AM
To: Bryan Cutler
Cc: Dongjoon Hyun; Holden Karau;
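The split described above (documented floor of 0.23, Jenkins testing against 0.24) implies a runtime gate like the following. This is an illustrative sketch; the helper names and exact tuple are hypothetical, not PySpark's actual version-check internals:

```python
# Hypothetical minimum-version gate: Jenkins runs a newer pandas, but the
# documented floor (0.23 in the thread above) is what gets enforced.
MIN_PANDAS_VERSION = (0, 23, 0)

def parse_version(version: str) -> tuple:
    """Turn '0.24.1' into (0, 24, 1), ignoring non-numeric suffixes."""
    parts = []
    for piece in version.split("."):
        digits = "".join(ch for ch in piece if ch.isdigit())
        parts.append(int(digits) if digits else 0)
    return tuple(parts)

def meets_minimum(installed: str, minimum=MIN_PANDAS_VERSION) -> bool:
    inst = parse_version(installed)
    width = max(len(inst), len(minimum))
    inst = inst + (0,) * (width - len(inst))
    floor = tuple(minimum) + (0,) * (width - len(minimum))
    return inst >= floor

assert meets_minimum("0.24.0")      # the Jenkins version passes
assert not meets_minimum("0.22.0")  # below the documented floor fails
```

Testing on Jenkins with 0.24 while enforcing 0.23 leaves the gap the thread worries about: 0.23 would need a manual check before sign-off, since CI never exercises it.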
.
From: shane knapp
Sent: Friday, May 31, 2019 7:38:10 PM
To: Denny Lee
Cc: Holden Karau; Bryan Cutler; Erik Erlandson; Felix Cheung; Mark Hamstra;
Matei Zaharia; Reynold Xin; Sean Owen; Wenchen Fen; Xiangrui Meng; dev; user
Subject: Re: Should python-2 be supported in Spark 3.0?
+1000
We don’t usually reference a future release on the website
> Spark website and state that Python 2 is deprecated in Spark 3.0
I suspect people will then ask when Spark 3.0 is coming out. Might need to
provide some clarity on that.
From: Reynold Xin
Sent:
+1
I’d prefer to see more of the end goal and how that could be achieved (such as
ETL or SPARK-24579). However given the rounds and months of discussions we have
come down to just the public API.
If the community thinks a new set of public API is maintainable, I don’t see
any problem with
You could
df.filter(col("c") === "c1").write.partitionBy("c").save
It could hit some data skew problems but might work for you
From: Burak Yavuz
Sent: Tuesday, May 7, 2019 9:35:10 AM
To: Shubham Chaurasia
Cc: dev; u...@spark.apache.org
Subject: Re: Static
I ran basic tests on R, r-hub etc. LGTM.
+1 (limited - I didn’t get to run other usual tests)
From: Sean Owen
Sent: Wednesday, May 1, 2019 2:21 PM
To: Xiao Li
Cc: dev@spark.apache.org
Subject: Re: [VOTE] Release Apache Spark 2.4.3
+1 from me. There is little
Just my 2c
If there is a known security issue, we should fix it rather than waiting for a
black hat, or worse, to discover whether it actually affects Spark.
I don’t think any of us want to see Spark in the news for this reason.
From: Sean Owen
+1
R tests, package tests on r-hub. Manually check commits under R, doc etc
From: Sean Owen
Sent: Saturday, April 20, 2019 11:27 AM
To: Wenchen Fan
Cc: Spark dev list
Subject: Re: [VOTE] Release Apache Spark 2.4.2
+1 from me too.
It seems like there is
Re shading - same argument I’ve made earlier today in a PR...
(Context- in many cases Spark has light or indirect dependencies but bringing
them into the process breaks users code easily)
From: Michael Heuer
Sent: Thursday, April 18, 2019 6:41 AM
To: Reynold
I kinda agree it is confusing when a parameter is not used...
From: Ryan Blue
Sent: Thursday, April 11, 2019 11:07:25 AM
To: Bruce Robbins
Cc: Dávid Szakállas; Spark Dev List
Subject: Re: Dataset schema incompatibility bug when reading column partitioned
data
Hi Spark community!
As you know ApacheCon NA 2019 is coming this Sept and it’s CFP is now open!
This is an important milestone as we celebrate 20 years of ASF. We have tracks
like Big Data and Machine Learning among many others. Please submit your
talks/thoughts/challenges/learnings here:
To: Bryan Cutler
Cc: Felix Cheung; Hyukjin Kwon; dev
Subject: Re: Upgrading minimal PyArrow version to 0.12.x [SPARK-27276]
i'm not opposed to 3.6 at all.
On Fri, Mar 29, 2019 at 4:16 PM Bryan Cutler
<cutl...@gmail.com> wrote:
PyArrow dropping Python 3.4 was mainly due to support goin
Definitely the part on the PR. Thanks!
From: shane knapp
Sent: Thursday, March 28, 2019 11:19 AM
To: dev; Stavros Kontopoulos
Subject: [k8s][jenkins] spark dev tool docs now have k8s+minikube instructions!
https://spark.apache.org/developer-tools.html
search
+1
build source
R tests
R package CRAN check locally, r-hub
From: d_t...@apple.com on behalf of DB Tsai
Sent: Wednesday, March 27, 2019 11:31 AM
To: dev
Subject: [VOTE] Release Apache Spark 2.4.1 (RC9)
Please vote on releasing the following candidate as Apache
(I think the .invalid is added by the list server)
Personally I’d rather everyone just +1 or -1, without adding binding or not.
It’s really the responsibility of the RM to confirm whether a vote is binding.
Mistakes have been made otherwise.
From: Marcelo Vanzin
3.4 is end of life but 3.5 is not. From your link
we expect to release Python 3.5.8 around September 2019.
From: shane knapp
Sent: Thursday, March 28, 2019 7:54 PM
To: Hyukjin Kwon
Cc: Bryan Cutler; dev; Felix Cheung
Subject: Re: Upgrading minimal PyArrow
Shane is also
correct in that newer versions of pyarrow have stopped support for Python 3.4,
so we should probably have Jenkins test against 2.7 and 3.5.
On Mon, Mar 25, 2019 at 9:44 PM Reynold Xin
<r...@databricks.com> wrote:
+1 on doing this in 3.0.
On Mon, Mar 25, 2019 at 9:31 PM,
I’m +1 if 3.0
From: Sean Owen
Sent: Monday, March 25, 2019 6:48 PM
To: Hyukjin Kwon
Cc: dev; Bryan Cutler; Takuya UESHIN; shane knapp
Subject: Re: Upgrading minimal PyArrow version to 0.12.x [SPARK-27276]
I don't know a lot about Arrow here, but seems
Reposting for shane here
[SPARK-27178]
https://github.com/apache/spark/commit/342e91fdfa4e6ce5cc3a0da085d1fe723184021b
Is problematic too and it’s not in the rc8 cut
https://github.com/apache/spark/commits/branch-2.4
(Personally I don’t want to delay 2.4.1 either..)
There is SPARK-26604 we are looking into
From: Saisai Shao
Sent: Wednesday, March 6, 2019 6:05 PM
To: shane knapp
Cc: Stavros Kontopoulos; Sean Owen; DB Tsai; Spark dev list; d_t...@apple.com
Subject: Re: [VOTE] Release Apache Spark 2.4.1 (RC2)
Do we have other
To: Xiangrui Meng
Cc: Felix Cheung; Xingbo Jiang; Yinan Li; dev; Weichen Xu; Marco Gaido
Subject: Re: [VOTE] [SPARK-24615] SPIP: Accelerator-aware Scheduling
I think treating SPIPs as this high-level takes away much of the point
of VOTEing on them. I'm not sure that's even what Reynold is
suggesting elsewhere
this.
From: Sean Owen
Sent: Sunday, March 3, 2019 8:15 AM
To: Felix Cheung
Cc: Xingbo Jiang; Yinan Li; dev; Weichen Xu; Marco Gaido
Subject: Re: [VOTE] [SPARK-24615] SPIP: Accelerator-aware Scheduling
I'm for this in general, at least a +0. I do think this has
I’m very hesitant with this.
I don’t want to vote -1, because I personally think it’s important to do, but
I’d like to see more discussion points addressed and not voting completely on
the spirit of it.
First, the SPIP doesn’t match the SPIP format that was proposed and agreed on. (Maybe
this is a
+1 on mesos - what Sean says
From: Andrew Melo
Sent: Friday, March 1, 2019 9:19 AM
To: Xingbo Jiang
Cc: Sean Owen; Xiangrui Meng; dev
Subject: Re: SPIP: Accelerator-aware Scheduling
Hi,
On Fri, Mar 1, 2019 at 9:48 AM Xingbo Jiang wrote:
>
> Hi Sean,
>
> To
I hear three topics in this thread
1. I don’t think we should remove string. Column and string can both be “type
safe”. And I would agree we don’t *need* to break API compatibility here.
2. Gaps in python API. Extending on #1, definitely we should be consistent and
add string as param where it
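Point #1 above, that Column and string can both be "type safe", is commonly handled by promoting a string to a Column at the call site. A minimal sketch of that pattern; the `Column` class and `to_column` helper here are stand-ins for illustration, not PySpark's actual internals:

```python
# Hypothetical sketch: accept either a string column name or a Column
# object, so the API stays consistent without breaking compatibility.
class Column:
    def __init__(self, name: str):
        self.name = name

def to_column(c) -> "Column":
    # A plain string is promoted to a Column; a Column passes through.
    return c if isinstance(c, Column) else Column(c)

assert to_column("age").name == "age"
c = Column("name")
assert to_column(c) is c
```

With a helper like this, every parameter can accept both forms, which is how the string variants could be added to the Python API without removing the Column ones.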
I merged the fix to 2.4.
From: Felix Cheung
Sent: Wednesday, February 20, 2019 9:34 PM
To: DB Tsai; Spark dev list
Cc: Cesar Delgado
Subject: Re: [VOTE] Release Apache Spark 2.4.1 (RC2)
Could you hold for a bit - I have one more fix to get
Could you hold for a bit - I have one more fix to get in
From: d_t...@apple.com on behalf of DB Tsai
Sent: Wednesday, February 20, 2019 12:25 PM
To: Spark dev list
Cc: Cesar Delgado
Subject: Re: [VOTE] Release Apache Spark 2.4.1 (RC2)
Okay. Let's fail rc2, and
+1
From: Ryan Blue
Sent: Tuesday, February 19, 2019 9:34 AM
To: Jamison Bennett
Cc: dev
Subject: Re: [VOTE] SPIP: Identifiers for multi-catalog Spark
+1
On Tue, Feb 19, 2019 at 8:41 AM Jamison Bennett
wrote:
+1 (non-binding)
Jamison Bennett
Cloudera
, Jan 25, 2019 at 1:41 AM Felix Cheung
<felixche...@apache.org> wrote:
Yes it was discussed on dev@. We are waiting for 2.3.3 to release to resubmit.
On Thu, Jan 24, 2019 at 5:33 AM Hyukjin Kwon
<gurwls...@gmail.com> wrote:
Hi all,
I happened to find SparkR is missing
This is super awesome!
From: Shivaram Venkataraman
Sent: Saturday, February 9, 2019 8:33 AM
To: Hyukjin Kwon
Cc: dev; Felix Cheung; Bryan Cutler; Liang-Chi Hsieh; Shivaram Venkataraman
Subject: Re: Vectorized R gapply[Collect]() implementation
Those speedups
Zhuge
Sent: Saturday, February 9, 2019 6:25 PM
To: Felix Cheung
Cc: Takeshi Yamamuro; Spark dev list
Subject: Re: [VOTE] Release Apache Spark 2.3.3 (RC2)
Not me. I am running zulu8, maven, and hadoop-2.7.
On Sat, Feb 9, 2019 at 5:42 PM Felix Cheung
<felixcheun...@hotmail.com> wrote:
On
-integration-tests` for the JDBC
integration tests.
I run these tests, and then I checked if they are passed.
On Sat, Feb 9, 2019 at 5:26 PM Herman van Hovell
<her...@databricks.com> wrote:
I count 2 binding votes :)...
Op vr 8 feb. 2019 om 22:36 schreef Felix Cheung
<feli
For this case I’d agree with Ryan. I haven’t followed this thread and the
details of the change since it’s way too much for me to consume “in my free
time” (which is 0 nowadays) but I’m pretty sure the existing behavior works for
us and very likely we don’t want it to change because of some
Nope, still only 1 binding vote ;)
From: Mark Hamstra
Sent: Friday, February 8, 2019 7:30 PM
To: Marcelo Vanzin
Cc: Takeshi Yamamuro; Spark dev list
Subject: Re: [VOTE] Release Apache Spark 2.3.3 (RC2)
There are 2. C'mon Marcelo, you can make it 3!
On Fri, Feb
Likely need a shim (which we should have anyway) because of namespace/import
changes.
I’m huge +1 on this.
From: Hyukjin Kwon
Sent: Monday, February 4, 2019 12:27 PM
To: Xiao Li
Cc: Sean Owen; Felix Cheung; Ryan Blue; Marcelo Vanzin; Yuming Wang; dev
Subject
What’s the update and next step on this?
We have real users getting blocked by this issue.
From: Xiao Li
Sent: Wednesday, January 16, 2019 9:37 AM
To: Ryan Blue
Cc: Marcelo Vanzin; Hyukjin Kwon; Sean Owen; Felix Cheung; Yuming Wang; dev
Subject: Re: [DISCUSS
Yes it was discussed on dev@. We are waiting for 2.3.3 to release to
resubmit.
On Thu, Jan 24, 2019 at 5:33 AM Hyukjin Kwon wrote:
> Hi all,
>
> I happened to find SparkR is missing in CRAN. See
> https://cran.r-project.org/web/packages/SparkR/index.html
>
> I remember I saw some threads about
Agreed on the pros / cons, esp driver could be the data science notebook.
Is it worthwhile making it configurable?
From: Sean Owen
Sent: Monday, January 21, 2019 10:42 AM
To: Reynold Xin
Cc: dev
Subject: Re: Make proactive check for closure serializability
+1
My focus is on R (sorry, couldn’t cross-validate what Sean is seeing)
tested:
reviewed doc
R package test
win-builder, r-hub
Tarball/package signature
From: Takeshi Yamamuro
Sent: Thursday, January 17, 2019 6:49 PM
To: Spark dev list
Subject: [VOTE]
+1 I like Ryan’s last mail. Thank you for putting it clearly (it should be a
spec/SPIP!)
I agree and understand the need for a 3-part id. However, I don’t think we should
assume that it must be, or can only be, as long as 3 parts. Once the
catalog is identified (i.e. the first part), the catalog
of it) from the spark core project..
From: Xiao Li
Sent: Tuesday, January 15, 2019 10:03 AM
To: Felix Cheung
Cc: rb...@netflix.com; Yuming Wang; dev
Subject: Re: [DISCUSS] Upgrade built-in Hive to 2.3.4
Let me take my words back. To read/write a table, Spark users do
And we are super 100% dependent on Hive...
From: Ryan Blue
Sent: Tuesday, January 15, 2019 9:53 AM
To: Xiao Li
Cc: Yuming Wang; dev
Subject: Re: [DISCUSS] Upgrade built-in Hive to 2.3.4
How do we know that most Spark users are not using Hive? I wouldn't be
Resolving https://issues.apache.org/jira/browse/HIVE-16391 means keeping Spark
on Hive 1.2?
I’m not sure that is reducing dependency on Hive - Hive is still there and it’s
a very old Hive. IMO it is increasing the risk the longer we keep on this. (And
it’s been years)
Looking at the two PR.
13, 2019 5:45 AM
To: Felix Cheung
Cc: Dongjoon Hyun; dev
Subject: Re: Clean out https://dist.apache.org/repos/dist/dev/spark/ ?
Will do. Er, maybe add Shane here too -- should we disable this docs
job? are these docs used, and is there much value in nightly snapshots
of the whole site?
On Sat, Jan
These get “published” by the nightly doc build from the RISELab Jenkins...
From: Dongjoon Hyun
Sent: Saturday, January 12, 2019 4:32 PM
To: Sean Owen
Cc: dev
Subject: Re: Clean out https://dist.apache.org/repos/dist/dev/spark/ ?
+1 for removing old docs there.
It seems
Awesome Shane!
From: shane knapp
Sent: Sunday, January 6, 2019 11:38 AM
To: Felix Cheung
Cc: Dongjoon Hyun; Wenchen Fan; dev
Subject: Re: Spark Packaging Jenkins
noted. i like the idea of building (but not signing) the release and will
update the job(s
https://spark.apache.org/release-process.html
Look for do-release-docker.sh script
From: Felix Cheung
Sent: Sunday, January 6, 2019 11:17 AM
To: Dongjoon Hyun; Wenchen Fan
Cc: dev; shane knapp
Subject: Re: Spark Packaging Jenkins
The release process doc should
The release process doc should have been updated on this - as mentioned we do
not use Jenkins for release signing (take this offline if further discussion is
needed)
The release build on Jenkins can still be useful for pre-validating the release
build process (without actually signing it)
+1 on 2.2.3 of course
From: Dongjoon Hyun
Sent: Wednesday, January 2, 2019 12:21 PM
To: Saisai Shao
Cc: Xiao Li; Felix Cheung; Sean Owen; dev
Subject: Re: Apache Spark 2.2.3 ?
Thank you for swift feedbacks and Happy New Year. :)
For 2.2.3 release on next week
Speaking of, it’s been 3 months since 2.3.2... (Sept 2018)
And 2 months since 2.4.0 (Nov 2018) - does the community feel 2.4 branch is
stabilizing?
From: Sean Owen
Sent: Tuesday, January 1, 2019 8:30 PM
To: Dongjoon Hyun
Cc: dev
Subject: Re: Apache Spark 2.2.3
I opened a PR on the vignettes fix to skip eval.
From: Shivaram Venkataraman
Sent: Wednesday, November 7, 2018 7:26 AM
To: Felix Cheung
Cc: Sean Owen; Shivaram Venkataraman; Wenchen Fan; Matei Zaharia; dev
Subject: Re: [CRAN-pretest-archived] CRAN submission
Considering the timing for Spark 3.0,
> deprecating lower versions, bumping up R to 3.4 might be reasonable
> option.
>
> Adding Shane as well.
>
> If we ended up with not upgrading it, I will forward this email to CRAN
> sysadmin to discuss further anyway.
>
>
>
&
One question is where will the list of capability strings be defined?
From: Ryan Blue
Sent: Thursday, November 8, 2018 2:09 PM
To: Reynold Xin
Cc: Spark Dev List
Subject: Re: DataSourceV2 capability API
Yes, we currently use traits that have methods. Something
Very cool!
From: Hyukjin Kwon
Sent: Thursday, November 8, 2018 10:29 AM
To: dev
Subject: Arrow optimization in conversion from R DataFrame to Spark DataFrame
Hi all,
I am trying to introduce R Arrow optimization by reusing PySpark Arrow
optimization.
It
_20181105_165757/Windows/00check.log
> and
> https://win-builder.r-project.org/incoming_pretest/SparkR_2.4.0_20181105_165757/Debian/00check.log,
> the tests run in 1s.
> On Tue, Nov 6, 2018 at 1:29 PM Felix Cheung wrote:
> >
> > I’d rather not mess with 2.4.0 at this point. On CRA
Is there a list of LTS release that I can reference?
From: Ryan Blue
Sent: Tuesday, November 6, 2018 1:28 PM
To: sn...@snazy.de
Cc: Spark Dev List; cdelg...@apple.com
Subject: Re: Test and support only LTS JDK release?
+1 for supporting LTS releases.
On Tue,
So to clarify, only scala 2.12 is supported in Spark 3?
From: Ryan Blue
Sent: Tuesday, November 6, 2018 1:24 PM
To: d_t...@apple.com
Cc: Sean Owen; Spark Dev List; cdelg...@apple.com
Subject: Re: Make Scala 2.12 as default Scala version in Spark 3.0
+1 to Scala
Need to investigate, but
worst case test_package can run with 0 tests.
From: Sean Owen
Sent: Tuesday, November 6, 2018 10:51 AM
To: Shivaram Venkataraman
Cc: Felix Cheung; Wenchen Fan; Matei Zaharia; dev
Subject: Re: [CRAN-pretest-archived] CRAN submission SparkR
+1 for Spark 3, definitely
Thanks for the updates
From: Sean Owen
Sent: Tuesday, November 6, 2018 9:11 AM
To: Felix Cheung
Cc: dev
Subject: Re: Java 11 support
I think that Java 9 support basically gets Java 10, 11 support. But
the jump from 8 to 9
Speaking of, can we work to support Java 11?
That will fix all the problems below.
From: Felix Cheung
Sent: Tuesday, November 6, 2018 8:57 AM
To: Wenchen Fan
Cc: Matei Zaharia; Sean Owen; Spark dev list; Shivaram Venkataraman
Subject: Re: [CRAN-pretest-archived
We have not been able to publish to CRAN for quite some time (since 2.3.0 was
archived - the cause is Java 11)
I think it’s ok to announce the release of 2.4.0
From: Wenchen Fan
Sent: Tuesday, November 6, 2018 8:51 AM
To: Felix Cheung
Cc: Matei Zaharia; Sean
some ideas.
Matei
> On Nov 5, 2018, at 9:09 PM, Felix Cheung wrote:
>
> I don’t know what the cause is yet.
>
> The test should be skipped because of this check
> https://github.com/apache/spark/blob/branch-2.4/R/pkg/inst/tests/testthat/test_basic.R#L21
>
> And this
>
:
callJStatic("org.apache.spark.ml.r.GeneralizedLinearRegressionWrapper", "fit",
formula,
The earlier release was archived because of Java 11+ too, so this unfortunately
isn’t new.
From: Sean Owen
Sent: Monday, November 5, 2018 7:22 PM
To: Felix Cheung
FYI. SparkR submission failed. It seems to detect Java 11 correctly with
vignettes but not skipping tests as would be expected.
Error: processing vignette ‘sparkr-vignettes.Rmd’ failed with diagnostics:
Java version 8 is required for this package; found version: 11.0.1
Execution halted
*
Thanks for bringing this up, and much appreciated keeping on top of this at
all times.
Can upgrading R fix the issue? Is this perhaps not necessarily
malformed but some new format in newer versions? Anyway, we should
consider upgrading the R version if that fixes the problem.
As an
+1
Checked R doc and all R API changes
From: Denny Lee
Sent: Wednesday, October 31, 2018 9:13 PM
To: Chitral Verma
Cc: Wenchen Fan; dev@spark.apache.org
Subject: Re: [VOTE] SPARK 2.4.0 (RC5)
+1
On Wed, Oct 31, 2018 at 12:54 PM Chitral Verma
Yes please!
From: Ryan Blue
Sent: Thursday, October 25, 2018 1:10 PM
To: Spark Dev List
Subject: DataSourceV2 hangouts sync
Hi everyone,
There's been some great discussion for DataSourceV2 in the last few months, but
it has been difficult to resolve some of
I’m in favor of it. If you check the PR it’s a few isolated script changes and
all test-only changes. Should have low impact on release but much better
integration test coverage.
From: Erik Erlandson
Sent: Tuesday, October 16, 2018 8:20 AM
To: dev
Subject:
Jars and libraries only accessible locally at the driver seems fairly limited?
Don’t you want the same on all executors?
From: Yinan Li
Sent: Friday, October 5, 2018 11:25 AM
To: Stavros Kontopoulos
Cc: rve...@dotnetrdf.org; dev
Subject: Re: [DISCUSS][K8S] Local