Re: Spark branch-3.1

2020-12-04 Thread Hyukjin Kwon
Shane, do you mind setting up Jenkins jobs for branch-3.1 please? On Sat, 5 Dec 2020, 08:14 Hyukjin Kwon, wrote: > Great, thank you for doing this. > > On Sat, 5 Dec 2020, 02:02 Dongjoon Hyun, wrote: > >> Thank you so much, Hyukjin Kwon. >> >> I made a PR for u

Re: Spark branch-3.1

2020-12-04 Thread Hyukjin Kwon
Great, thank you for doing this. On Sat, 5 Dec 2020, 02:02 Dongjoon Hyun, wrote: > Thank you so much, Hyukjin Kwon. > > I made a PR for updating the `master` branch to 3.2.0-SNAPSHOT. > > https://github.com/apache/spark/pull/30606 > [SPARK-33662][BUILD] Setting version

Spark branch-3.1

2020-12-04 Thread Hyukjin Kwon
Hi all, It’s 4th PDT and branch-3.1 is cut out now as planned. Mid Dec 2020 QA period. Focus on bug fixes, tests, stability and docs. Generally, no new features merged Now we’re in the QA period. Please focus on testing, polishing, stability and docs for Spark 3.1.0, and hope we can have a nice

Re: Apache ORC 1.6.6 Release

2020-12-03 Thread Hyukjin Kwon
It's still good to know since Spark uses ORC :-) On Fri, 4 Dec 2020 at 3:34 AM, Dongjoon Hyun wrote: > Oh, my bad. The previous email was written for `d...@orc.apache.org`. > > Apache ORC 1.6.6 is not for Apache Spark 3.1. > > It's prepared for Apache Spark 3.2 (2021 Summer) to provide mainly > column

Re: Spark 3.1 branch cut 4th Dec?

2020-11-26 Thread Hyukjin Kwon
2021 as planned: Early Jan 2021 Release candidates (RC), voting, etc. until final release passes I know this is Thanksgiving day now in the US. Hope you guys enjoy the rest of the holidays. Thanks! On Sat, 21 Nov 2020 at 8:15 AM, Hyukjin Kwon wrote: > Just for the record, I'll stick to the

Re: [build system] IMPORTANT UPDATE

2020-11-25 Thread Hyukjin Kwon
Thanks Shane. On Thu, 26 Nov 2020, 10:19 shane knapp ☠, wrote: > alright, builds are looking solid except for SBT... if someone here could > take a look at those failures i'd be most appreciative. > > the important ones: PRB, PRB-K8s, k8s, snapshot and maven builds all > green! > > i'm literal

Re: Spark 3.1 branch cut 4th Dec?

2020-11-20 Thread Hyukjin Kwon
ao Li >>>>>>> wrote: >>>>>>> >>>>>>>> Correction: >>>>>>>> >>>>>>>> Merging the feature work after the branch cut should not be >>>>>>>> encouraged in general, although some committers did

Spark 3.1 branch cut 4th Dec?

2020-11-19 Thread Hyukjin Kwon
Hi all, I think we haven’t decided yet the exact branch-cut, code freeze and release manager. As we planned in https://spark.apache.org/versioning-policy.html Early Dec 2020 Code freeze. Release branch cut Code freeze and branch cutting is coming. Therefore, we should finish if there are any r

Re: [DISCUSS] Review/merge phase, and post-review

2020-11-14 Thread Hyukjin Kwon
In practice, I usually wait some more when the changes look complicated, when there are many reviews/discussions, when the change can potentially be controversial, etc. When I think it's pretty clear to go, for example, multiple approvals from committers, when the changes look pretty clear and stra

Re: I'm going to be out starting Nov 5th

2020-10-31 Thread Hyukjin Kwon
Oh, take care Holden! On Sun, 1 Nov 2020, 03:04 Denny Lee, wrote: > Best wishes Holden! :) > > On Sat, Oct 31, 2020 at 11:00 Dongjoon Hyun > wrote: > >> Take care, Holden! I believe everything goes well. >> >> Bests, >> Dongjoon. >> >> On Sat, Oct 31, 2020 at 10:24 AM Reynold Xin wrote: >> >>>

Re: [DISCUSS][SPIP] Standardize Spark Exception Messages

2020-10-26 Thread Hyukjin Kwon
Thanks for pointing this out, Nicholas. This SPIP seems focused on the Scala side, grouping the exception handling and providing some guidance about error messages. Yes, I think we can refer to it on the PySpark side. Probably I will follow up and file some JIRAs based on how this SPIP goes, and ru

Re: Mu-L/spark Github actions emails

2020-10-09 Thread Hyukjin Kwon
Yeah, looks like I received emails from the fork as well. I label and filter the emails from GitHub so I didn't notice. I'll take a closer look tomorrow or next Monday and take action. Thanks for the heads up. On Thu, 8 Oct 2020, 23:41 Sean Owen, wrote: > I'm getting emails from a repo called

Re: Broken rlang installation on AppVeyor

2020-10-09 Thread Hyukjin Kwon
ket, right? > > On 10/9/20 1:48 PM, Hyukjin Kwon wrote: > > Thanks for reporting this. I think we should change to "x64". Can you open > a PR to change? > > On Fri, 9 Oct 2020 at 4:36 AM, Maciej wrote: > >> Hi Everyone, >> >> I've been digging in

Re: Broken rlang installation on AppVeyor

2020-10-09 Thread Hyukjin Kwon
Thanks for reporting this. I think we should change to "x64". Can you open a PR to change? On Fri, 9 Oct 2020 at 4:36 AM, Maciej wrote: > Hi Everyone, > > I've been digging into AppVeyor test failures for > https://github.com/apache/spark/pull/29978 > > > I see the following error > > [00:01:48] tryin

Re: Apache Spark 3.1 Preparation Status (Oct. 2020)

2020-10-03 Thread Hyukjin Kwon
Nice summary. Thanks Dongjoon. One minor correction -> I believe we dropped R 3.5 and below at branch 2.4 as well. On Sun, 4 Oct 2020, 09:17 Dongjoon Hyun, wrote: > Hi, All. > > As of today, master branch (Apache Spark 3.1.0) resolved > 852+ JIRA issues and 606+ issues are 3.1.0-only patches. >

Re: Running K8s integration tests for changes in core?

2020-09-24 Thread Hyukjin Kwon
+1 On Fri, 25 Sep 2020, 02:21 Holden Karau, wrote: > Thanks Shane! > > On Thu, Sep 24, 2020 at 10:17 AM shane knapp ☠ > wrote: > >> just revisiting this thread... >> >> re presubmit strategy: i don't think this would be easy to set up... >> and i'm not sure what benefit it will give us. >> >>

Re: [VOTE] Release Spark 3.0.1 (RC3)

2020-09-02 Thread Hyukjin Kwon
For a quick correction: > - For Apache Spark 3.1, we are testing R 4.0 on `master` branch, > but we don't have test coverage on `branch-3.0`. > So, I'm wondering if Spark 3.0.1 supports R 4.0 without any issue. I believe we now test SparkR at branch-3.0 with R 4.0 after https://githu

Re: pip/conda distribution headless mode

2020-08-30 Thread Hyukjin Kwon
I am going to take a look if nobody is interested in it. On Mon, 31 Aug 2020 at 1:48 PM, Georg Heiler wrote: > Many thanks. > > Best, > Georg > > On Mon, 31 Aug 2020 at 01:12, Xiao Li wrote: > >> Hi, Georg, >> >> This is being tracked by >> https://issues.apache.org/jira/browse/SPARK-32017 Yo

Re: [PySpark] Revisiting PySpark type annotations

2020-08-27 Thread Hyukjin Kwon
Thanks Maciej and Fokko. On Fri, 28 Aug 2020 at 6:09 AM, Maciej wrote: > On my side, I'll try to identify any possible problems by the end of the > week or so (at somewhat crude inspection there is nothing unexpected or > particularly hard to resolve, but sometimes problems occur when you try to > refin

Re: [PySpark] Revisiting PySpark type annotations

2020-08-27 Thread Hyukjin Kwon
Thanks! On Thu, 20 Aug 2020 at 8:39 PM, Driesprong, Fokko wrote: > No worries, thanks for the update! > > On Thu, 20 Aug 2020 at 12:50, Hyukjin Kwon wrote: > >> Yeah, we had a short meeting. I had to check a few other things so some >> delays happened. I will share soon. >&g

Re: [PySpark] Revisiting PySpark type annotations

2020-08-20 Thread Hyukjin Kwon
ete and >> does not annotate types in some other APIs (by using Any). Correct me if I >> am wrong, Maciej. >> >> For me, it is a bit like code coverage. You want this to be high to make >> sure that you cover most of the APIs, but it will take some time to make it >&

Re: Contributing to JIRA Maintenance

2020-08-06 Thread Hyukjin Kwon
ofile.jspa?name=rohitmishr1484> I will keep monitoring it too. Thanks. On Sat, 1 Aug 2020 at 8:05 PM, Hyukjin Kwon wrote: > Thank you! > > On Sat, 1 Aug 2020, 19:31 Takeshi Yamamuro, wrote: > >> Great work and thanks for your JIRA maintenance and this heads-up (sorry >> for my late

Re: Need some help and contributions in PySpark API documentation

2020-08-05 Thread Hyukjin Kwon
> Rohit Mishra > > On Wed, Aug 5, 2020 at 12:12 PM Hyukjin Kwon wrote: > >> Hi all, >> >> I am trying to redesign the PySpark documentation at SPARK-31851 >> <https://issues.apache.org/jira/browse/SPARK-31851>. >> Basically from: >> >&g

Need some help and contributions in PySpark API documentation

2020-08-04 Thread Hyukjin Kwon
Hi all, I am trying to redesign the PySpark documentation at SPARK-31851 . Basically from: - https://spark.apache.org/docs/latest/api/python/index.html to: - https://hyukjin-spark.readthedocs.io/en/latest/index.html (draft) The base wor

Re: [PySpark] Revisiting PySpark type annotations

2020-08-04 Thread Hyukjin Kwon
ably something that should be discussed here. > On 8/4/20 11:06 PM, Felix Cheung wrote: > > So IMO maintaining outside in a separate repo is going to be harder. That > was why I asked. > > > > -- > *From:* Maciej Szymkiewicz > > *Sent

Re: [PySpark] Revisiting PySpark type annotations

2020-08-03 Thread Hyukjin Kwon
out typing only >> especially considering typing is arguably premature yet. >> >> >> This feels a bit weird to me, since you want to keep this in sync right? >> Do you provide different stubs for different versions of Python? I had to >> look up the literals: https:

PySpark documentation main page

2020-08-01 Thread Hyukjin Kwon
Hi all, I am trying to write up the main page of PySpark documentation at https://github.com/apache/spark/pull/29320. While I think the current proposal might be good enough, I would like to collect more feedback about the contents, structure and image since this is the entrance page of PySpark d

Re: Contributing to JIRA Maintenance

2020-08-01 Thread Hyukjin Kwon
om now on for the community's help. > > On Wed, Jul 29, 2020 at 10:52 AM Hyukjin Kwon wrote: > >> Yeah, to contribute to JIRA maintenance, it does not need a lot of codes >> given my experience. >> >> Just to share my own story: >> 4 years ago when I was on

[OSS DIGEST] The major changes of Apache Spark from June 17 to June 30

2020-07-30 Thread Hyukjin Kwon
Hi all, This is the bi-weekly Apache Spark digest from the Databricks OSS team. For each API/configuration/behavior change, an *[API] *tag is added in the title. CORE

Re: Contributing to JIRA Maintenance

2020-07-28 Thread Hyukjin Kwon
ks for doing this - and I will say this is a great way for anyone >> >> out there to contribute directly to the project. Issue trackers need >> >> maintenance too. It's not that hard to spot basic problems with JIRAs >> >> and request fixes, as a way to eng

Contributing to JIRA Maintenance

2020-07-27 Thread Hyukjin Kwon
Hi all, I would like to ask for some help about JIRA maintenance contributions in Apache Spark. I tend to see fewer and fewer people active in JIRA maintenance contributions. I have regularly checked all JIRAs and monitored them continuously for the last 4 years. For the last week, I didn't have ti

Re: Re: request the contributor permission

2020-07-27 Thread Hyukjin Kwon
signIssue.jspa?atl_token=A5KQ-2QAV-T4JA-FDED_d58408eb41144d9970c56fbb41300f40176aadfc_lin&id=13275018&assignee=linshan>to > me . I'm logged in. As shown in figure > > > > > At 2020-07-27 17:04:54, "Hyukjin Kwon" wrote: > > Once you contribute (e.g., your PR is m

Re: request the contributor permission

2020-07-27 Thread Hyukjin Kwon
Once you contribute (e.g., your PR is merged to the codebase), you will be able to get the permission. BTW, you are already able to do most of the work as a contributor regardless of the permission. Would you mind if I ask what you specifically want to do? On Mon, 27 Jul 2020 at 5:11 PM, linshan wrote

Re: [DISCUSS] Amend the commiter guidelines on the subject of -1s & how we expect PR discussion to be treated.

2020-07-25 Thread Hyukjin Kwon
+1 thanks Holden. On Fri, 24 Jul 2020, 22:34 Tom Graves, wrote: > +1 > > Tom > > On Tuesday, July 21, 2020, 03:35:18 PM CDT, Holden Karau < > hol...@pigscanfly.ca> wrote: > > > Hi Spark Developers, > > There has been a rather active discussion regarding the specific vetoes > that occurred during

Re: Python xmlrunner being used?

2020-07-24 Thread Hyukjin Kwon
It's used in Jenkins IIRC On Fri, 24 Jul 2020 at 11:43 PM, Driesprong, Fokko wrote: > I found this ticket: https://issues.apache.org/jira/browse/SPARK-7021 > > Is anybody actually using this? > > Cheers, Fokko > > On Fri, 24 Jul 2020 at 16:27, Driesprong, Fokko wrote: > >> Hi all, >> >> Does anyone

Re: [PSA] Apache Spark uses GitHub Actions to run the tests

2020-07-23 Thread Hyukjin Kwon
og/changelog/2020-07-06-github-actions-manual-triggers-with-workflow_dispatch/ > > thanks, > Imran > > On Tue, Jul 14, 2020 at 1:18 AM Hyukjin Kwon wrote: > >> Hi dev, >> >> Github Actions build was introduced to run the regular Spark test cases >> at https://github.c

Re: [PySpark] Revisiting PySpark type annotations

2020-07-21 Thread Hyukjin Kwon
Yeah, I tend to be positive about leveraging the Python type hints in general. However, just to clarify, I don’t think we should just port the type hints into the main codes yet but maybe think about having/porting Maciej's work, pyi files as stubs. For now, I tend to think adding type hints to th

Re: Welcoming some new Apache Spark committers

2020-07-17 Thread Hyukjin Kwon
, *Takeshi Yamamuro,* *Sean Owen*, *Dongjoon > hyun*, *Hyukjin Kwon, *and *Liang-Chi Hsieh,* who all helped review the > majority of my PRs allowing me to grow technically. > > Thanks again and looking forward to working with you all. > > Regards, > Dilip > > On Thu, Jul 16, 2

Re: Welcoming some new Apache Spark committers

2020-07-14 Thread Hyukjin Kwon
Congrats! 2020년 7월 15일 (수) 오전 7:56, Takeshi Yamamuro 님이 작성: > Congrats, all! > > On Wed, Jul 15, 2020 at 5:15 AM Takuya UESHIN > wrote: > >> Congrats and welcome! >> >> On Tue, Jul 14, 2020 at 1:07 PM Bryan Cutler wrote: >> >>> Congratulations and welcome! >>> >>> On Tue, Jul 14, 2020 at 12:36

Re: [PSA] Apache Spark uses GitHub Actions to run the tests

2020-07-14 Thread Hyukjin Kwon
> > On Tue, Jul 14, 2020 at 2:18 PM Hyukjin Kwon wrote: > >> Hi dev, >> >> Github Actions build was introduced to run the regular Spark test cases >> at https://github.com/apache/spark/pull/29057and >> https://github.com/apache/spark/pull/29086. >>

[PSA] Apache Spark uses GitHub Actions to run the tests

2020-07-13 Thread Hyukjin Kwon
Hi dev, Github Actions build was introduced to run the regular Spark test cases at https://github.com/apache/spark/pull/29057 and https://github.com/apache/spark/pull/29086. This is virtually the duplication of default Jenkins PR builder at this moment. The only differences are: - Github Actions d

Re: [PSA] Python 2, 3.4 and 3.5 are now dropped

2020-07-13 Thread Hyukjin Kwon
cc user mailing list too. On Tue, 14 Jul 2020 at 11:27 AM, Hyukjin Kwon wrote: > I am sending another email to make sure dev people know. Python 2, 3.4 and > 3.5 are now dropped at https://github.com/apache/spark/pull/28957. > > >

[PSA] Python 2, 3.4 and 3.5 are now dropped

2020-07-13 Thread Hyukjin Kwon
I am sending another email to make sure dev people know. Python 2, 3.4 and 3.5 are now dropped at https://github.com/apache/spark/pull/28957.

Re: [DISCUSS] Drop Python 2, 3.4 and 3.5

2020-07-13 Thread Hyukjin Kwon
Thank you all. Python 2, 3.4 and 3.5 are dropped now in the master branch at https://github.com/apache/spark/pull/28957 On Fri, 3 Jul 2020 at 10:01 AM, Hyukjin Kwon wrote: > Thanks Dongjoon. That makes much more sense now! > > On Fri, 3 Jul 2020 at 12:11 AM, Dongjoon Hyun wrote: > >>

Re: restarting jenkins build system tomorrow (7/8) ~930am PDT

2020-07-10 Thread Hyukjin Kwon
A couple of flaky tests can happen; that's usual. Seems it got better now at least. I will keep monitoring the builds. On Fri, 10 Jul 2020 at 4:33 PM, ukby1234 wrote: > Looks like Jenkins isn't stable still. My PR fails two times in a row: > > https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuil

Re: restarting jenkins build system tomorrow (7/8) ~930am PDT

2020-07-09 Thread Hyukjin Kwon
Thank you Shane. On Fri, 10 Jul 2020 at 2:35 AM, shane knapp ☠ wrote: > and -06 is back! i'll keep an eye on things today, but suffice to say > on each worker i: > > 1) rebooted > 2) cleaned ~/.ivy2, ~/.m2, and other associated caches > > we should be g2g! please reply here if you continue to see

Re: restarting jenkins build system tomorrow (7/8) ~930am PDT

2020-07-08 Thread Hyukjin Kwon
Thanks Shane! BTW, it's getting serious .. e.g) https://github.com/apache/spark/pull/28969 . The tests could not pass in 7 days .. Hopefully restarting the machines will make the current situation better :-) Separately, I am working on a PR to run the Spark tests in Github Actions. We could hopef

Re: m2 cache issues in Jenkins?

2020-07-05 Thread Hyukjin Kwon
t this, please) in Github comment. > > On Thu, Jul 2, 2020 at 2:12 PM Hyukjin Kwon wrote: > >> Ah, okay. Actually there already is - >> https://issues.apache.org/jira/browse/SPARK-31693. I am reopening. >> >> On Thu, 2 Jul 2020 at 2:06 PM, Holden Karau wrote: >>

Re: Jenkins is down

2020-07-05 Thread Hyukjin Kwon
arted 10 minutes ago. >> >> Bests, >> Dongjoon. >> >> >> On Fri, Jul 3, 2020 at 4:43 AM Hyukjin Kwon wrote: >> >>> Hi all and Shane, >>> >>> Is there something wrong with the Jenkins machines? Seems they are down. >>> >> > > -- > Shane Knapp > Computer Guy / Voice of Reason > UC Berkeley EECS Research / RISELab Staff Technical Lead > https://rise.cs.berkeley.edu >

Jenkins is down

2020-07-03 Thread Hyukjin Kwon
Hi all and Shane, Is there something wrong with the Jenkins machines? Seems they are down.

Re: [DISCUSS] Drop Python 2, 3.4 and 3.5

2020-07-02 Thread Hyukjin Kwon
>> Ubuntu LTS ships with 3.7, and while the previous LTS ships with 3.5, if >>> folks really can’t upgrade there’s conda. >>> >>> Is there anyone with a large Python 3.5 fleet who can’t use conda? >>> >>> On Wed, Jul 1, 2020 at 7:15 PM Hyukjin Kwon wrote

Re: m2 cache issues in Jenkins?

2020-07-01 Thread Hyukjin Kwon
Ah, okay. Actually there already is - https://issues.apache.org/jira/browse/SPARK-31693. I am reopening. On Thu, 2 Jul 2020 at 2:06 PM, Holden Karau wrote: > We don't I didn't file one originally, but Shane reminded me to in the > future. > > On Wed, Jul 1, 2020 at 9:44

Re: m2 cache issues in Jenkins?

2020-07-01 Thread Hyukjin Kwon
Nope, do we have an existing ticket? I think we can reopen if there is. On Thu, 2 Jul 2020 at 1:43 PM, Holden Karau wrote: > Huh interesting that it’s the same worker. Have you filed a ticket to > Shane? > > On Wed, Jul 1, 2020 at 8:50 PM Hyukjin Kwon wrote: > >> Hm .. seems th

Re: m2 cache issues in Jenkins?

2020-07-01 Thread Hyukjin Kwon
Hm .. seems this is happening again in amp-jenkins-worker-04 ;(. On Thu, 25 Jun 2020 at 3:15 AM, shane knapp ☠ wrote: > done: > -bash-4.1$ cd .m2 > -bash-4.1$ ls > repository > -bash-4.1$ time rm -rf * > > real 17m4.607s > user 0m0.950s > sys 0m18.816s > -bash-4.1$ > > On Wed, Jun 24, 2020 at

Re: [DISCUSS] Drop Python 2, 3.4 and 3.5

2020-07-01 Thread Hyukjin Kwon
Yeah, sure. It will be dropped at Spark 3.1 onwards. I don't think we should make such changes in maintenance releases. On Thu, 2 Jul 2020 at 11:13 AM, Holden Karau wrote: > To be clear the plan is to drop them in Spark 3.1 onwards, yes? > > On Wed, Jul 1, 2020 at 7:11 PM Hyukjin Kwon

[DISCUSS] Drop Python 2, 3.4 and 3.5

2020-07-01 Thread Hyukjin Kwon
Hi all, I would like to discuss dropping deprecated Python versions 2, 3.4 and 3.5 at https://github.com/apache/spark/pull/28957. I assume people support it in general but I am writing this to make sure everybody is happy. Fokko made a very good investigation on it, see https://github.com/apache/

Re: [VOTE] Decommissioning SPIP

2020-07-01 Thread Hyukjin Kwon
+1 On Thu, 2 Jul 2020 at 10:08 AM, Marcelo Vanzin wrote: > I reviewed the docs and PRs from way before an SPIP was explicitly > asked, so I'm comfortable with giving a +1 even if I haven't really > fully read the new document, > > On Wed, Jul 1, 2020 at 6:05 PM Holden Karau wrote: > > > > Hi Spark Dev

Re: [DISCUSS][SPIP] Graceful Decommissioning

2020-06-25 Thread Hyukjin Kwon
Thank you so much, Holden. PS: I cc'ed some people who might be interested in this too FYI. On Fri, 26 Jun 2020 at 11:26 AM, Holden Karau wrote: > At the recommendation of Hyukjin, I'm converting the graceful > decommissioning work to an SPIP. The SPIP document is at > https://docs.google.com/document

Re: Use Hadoop-3.2 as a default Hadoop profile in 3.0.0?

2020-06-25 Thread Hyukjin Kwon
I don't have a strong opinion on changing the default either, but I slightly prefer to have the option to switch Hadoop version first, just to stay safer. To be clear, we're now discussing the timing of when to set Hadoop 3.0.0 by default, and which change has to be first, right?

Re: [DISCUSS] Apache Spark 3.0.1 Release

2020-06-23 Thread Hyukjin Kwon
+1. Just as a note, - SPARK-31918 is fixed now, and there's no blocker. - When we build SparkR, we should use the latest R version at least 4.0.0+. On Wed, 24 Jun 2020 at 11:20 AM, Dongjoon Hyun wrote: > +1 > > Bests, > Dongjoon. > > On Tue, Jun 23,

Re: Initial Decom PR for Spark 3?

2020-06-22 Thread Hyukjin Kwon
On Sun, 21 Jun 2020 at 19:05, Hyukjin Kwon wrote: > >> Yeah, I believe the community decided to do a SPIP for such significant >> changes. It would be best if we stick to the standard approaches. >> >> On Sun, 21 Jun 2020 at 8:52 AM, Holden Karau wrote: >> >>> I

Re: Initial Decom PR for Spark 3?

2020-06-21 Thread Hyukjin Kwon
2020 at 4:23 PM Stephen Boesch wrote: > >> Hi given there is a design doc (contrary to that common) - is this going >> to move forward? >> >> On Thu, 18 Jun 2020 at 18:05, Hyukjin Kwon wrote: >> >>> Looks it had to be with SPIP and a proper design doc to dis

Re: [ANNOUNCE] Apache Spark 3.0.0

2020-06-18 Thread Hyukjin Kwon
Yay! On Fri, 19 Jun 2020 at 4:46 AM, Mridul Muralidharan wrote: > Great job everyone ! Congratulations :-) > > Regards, > Mridul > > On Thu, Jun 18, 2020 at 10:21 AM Reynold Xin wrote: > >> Hi all, >> >> Apache Spark 3.0.0 is the first release of the 3.x line. It builds on >> many of the innovations f

Re: Initial Decom PR for Spark 3?

2020-06-18 Thread Hyukjin Kwon
Looks it had to be with SPIP and a proper design doc to discuss. On Sun, 9 Feb 2020 at 1:23 AM, Erik Erlandson wrote: > I'd be willing to pull this in, unless others have concerns post > branch-cut. > > On Tue, Feb 4, 2020 at 2:51 PM Holden Karau wrote: > >> Hi Y’all, >> >> I’ve got a K8s graceful dec

Re: [ANNOUNCE] Apache Spark 2.4.6 released

2020-06-10 Thread Hyukjin Kwon
Yay! On Thu, 11 Jun 2020 at 10:38 AM, Holden Karau wrote: > We are happy to announce the availability of Spark 2.4.6! > > Spark 2.4.6 is a maintenance release containing stability, correctness, > and security fixes. > This release is based on the branch-2.4 maintenance branch of Spark. We > strongly re

Re: Quick sync: what goes in migration guide vs release notes?

2020-06-10 Thread Hyukjin Kwon
I think the proposal doesn't mean to don't add the JIRAs with release-notes into the release notes (?). People will still label the JIRAs when the change is significant or breaking whether it's a bug or not, and they will be in the release notes. I guess the proposal TL;DR is: - If that's a legit

Re: [vote] Apache Spark 3.0 RC3

2020-06-08 Thread Hyukjin Kwon
+1 On Tue, 9 Jun 2020 at 3:16 PM, Xiao Li wrote: > +1 (binding) > > Xiao > > On Mon, Jun 8, 2020 at 10:13 PM Xingbo Jiang > wrote: > >> +1(non-binding) >> >> On Mon, 8 Jun 2020 at 9:50 PM, Jiaxin Shan wrote: >> >>> +1 >>> I build binary using the following command, test spark workloads on >>> Kubernetes (AWS EKS) an

Re: Build time limit in PR builder

2020-05-28 Thread Hyukjin Kwon
I remember we were able to cut down pretty considerably in the past. For example, I investigated ( https://github.com/apache/spark/pull/21822#issuecomment-407295739) and fixed some before at, like https://github.com/apache/spark/pull/23111. Maybe we could skim again to reduce the build/testing time

Build time limit in PR builder

2020-05-28 Thread Hyukjin Kwon
Hi all, Seems we're hitting the time limit in PR builders (see https://github.com/apache/spark/pull/28627), in particular when it's Maven build which takes more time compared to SBT in general. Should we maybe increase the PR builder a bit more (10 ~ 20 mins?) to unblock these PRs and focus on cutt

Re: [VOTE] Apache Spark 3.0 RC2

2020-05-22 Thread Hyukjin Kwon
Ryan, > I'm fine with the commit, other than the fact that it violated ASF norms to commit without waiting for a review. Looks it became the different proposal as you and other people discussed and suggested there, which you didn't technically vote

Re: [build system] jenkins rebooting now

2020-05-14 Thread Hyukjin Kwon
Thanks Shane. On Fri, 15 May 2020, 02:29 Dongjoon Hyun, wrote: > Thank you so much, Shane! > > > On Thu, May 14, 2020 at 9:51 AM Xiao Li wrote: > >> Thank you, Shane! >> >> On Thu, May 14, 2020 at 9:50 AM shane knapp ☠ >> wrote: >> >>> we're back. doesn't seem to have fixed the issue of the w

Re: [DISCUSS] Java specific APIs design concern and choice

2020-05-11 Thread Hyukjin Kwon
as a style guide, I do not believe we should be putting API >> policy on that page, it should live on an Apache Spark page. >> >> I think if you want to implement an API policy like this it should go >> through an official vote thread, not just a discuss thread where we

Re: [DISCUSS] Java specific APIs design concern and choice

2020-05-11 Thread Hyukjin Kwon
I think if you want to implement an API policy like this it should go > through an official vote thread, not just a discuss thread where we have > not had a lot of feedback on it. > > Tom > > > > On Monday, May 11, 2020, 06:44:31 AM CDT, Hyukjin Kwon < > gurwls...@gm

Re: [DISCUSS] Java specific APIs design concern and choice

2020-05-11 Thread Hyukjin Kwon
I will wait a couple more days and if I hear no objection, I will document this at https://github.com/databricks/scala-style-guide#java-interoperability. On Thu, 7 May 2020 at 9:18 PM, Hyukjin Kwon wrote: > Hi all, I would like to proceed this. Are there more thoughts on this? If

Re: [DISCUSS] Java specific APIs design concern and choice

2020-05-07 Thread Hyukjin Kwon
Hi all, I would like to proceed with this. Are there more thoughts on this? If not, I would like to go ahead with the proposal here. On Thu, 30 Apr 2020 at 10:54 PM, Hyukjin Kwon wrote: > Nothing is urgent. I just don't want to leave it undecided and just keep > adding Java APIs inconsisten

Re: [DISCUSS] Java specific APIs design concern and choice

2020-04-30 Thread Hyukjin Kwon
it's a common practice to handle > Scala types conversions by self when Java programmers prepare to > invoke Scala libraries. I'm not sure which one is the Java programmers' > root complaint, Scala type instance or Scala Jar file. > > My 2 cents. > > -

Re: In Apache Spark JIRA, spark/dev/github_jira_sync.py not running properly

2020-04-29 Thread Hyukjin Kwon
Let me actually just take a look myself and bring some updates soon. On Thu, 30 Apr 2020 at 9:13 AM, Hyukjin Kwon wrote: > WDYT @Josh Rosen ? > Seems > https://github.com/databricks/spark-pr-dashboard/blob/1e799c9e510fa8cdc9a6c084a777436bebeabe10/sparkprs/controllers/tasks.py#L131-L14

Re: [DISCUSS] Java specific APIs design concern and choice

2020-04-29 Thread Hyukjin Kwon
't completely buy the argument about Scala/Java friendly because using Java instance is already documented in the official Scala documentation. Users still need to search if we have Java specific methods for *some* APIs. On Thu, 30 Apr 2020 at 8:58 AM, Hyukjin Kwon wrote: > Hm, I thought you meant

Re: In Apache Spark JIRA, spark/dev/github_jira_sync.py not running properly

2020-04-29 Thread Hyukjin Kwon
ll auto-link from a Jira > ticket to the PRs that mention that ticket. I don't think it will update > the ticket's status, though. > > Would you like me to file a ticket with Infra and see what they say? > > On Tue, Apr 28, 2020 at 12:21 AM Hyukjin Kwon wrote: > &

Re: [DISCUSS] Java specific APIs design concern and choice

2020-04-29 Thread Hyukjin Kwon
utting it up for a vote or just waiting to get more feedback? I disagree > with saying option 4 is the rule but agree having a general rule makes > sense. I think we need a lot more input to make the rule as it affects the > api's. > > Tom > > On Wednesday, Apri

Re: [DISCUSS] Java specific APIs design concern and choice

2020-04-29 Thread Hyukjin Kwon
0-04-28 (Tue) 5:03 PM, Hyukjin Kwon wrote: > Spark has targeted to have a unified API set rather than having separate > Java classes to reduce the maintenance cost, > e.g.) JavaRDD <> RDD vs DataFrame. These JavaXXX are more about the legacy. > > I think it's best to sti

Re: [DISCUSS] Java specific APIs design concern and choice

2020-04-28 Thread Hyukjin Kwon
API with `.asScala` or `.asJava`'s help if Java API > is not ready. Then switch to Java API when it's well cooked. > > The cons is more efforts to maintain. > > My 2 cents. > > -- > Cheers, > -z > > On Tue, 28 Apr 2020 12:07:36 +0900 > Hyukjin Kwon wrote:

Re: In Apache Spark JIRA, spark/dev/github_jira_sync.py not running properly

2020-04-27 Thread Hyukjin Kwon
gration instead. We use it > at my day job, for example. > > On Fri, Apr 24, 2020 at 12:39 AM Hyukjin Kwon wrote: > >> Hi all, >> >> Seems like this github_jira_sync.py >> <https://github.com/apache/spark/blob/master/dev/github_jira_sync.py> script >

Re: [DISCUSS] Java specific APIs design concern and choice

2020-04-27 Thread Hyukjin Kwon
-collections.html > [2] > https://www.scala-lang.org/api/2.13.0/scala/jdk/javaapi/CollectionConverters$.html > [3] > https://www.scala-lang.org/api/2.13.0/scala/jdk/CollectionConverters$.html > [4] > https://www.scala-lang.org/api/2.12.11/scala/collection/convert/ImplicitConversionsT

Re: [DISCUSS] Java specific APIs design concern and choice

2020-04-27 Thread Hyukjin Kwon
t looks to me 4. approach is closer to what Spark has targeted so far. On Tue, 28 Apr 2020 at 8:34 AM, Hyukjin Kwon wrote: > > One thing we could do here is use Java collections internally and make > the Scala API a thin wrapper around Java -- like how Python works. > > Then adding a

Re: [DISCUSS] Java specific APIs design concern and choice

2020-04-27 Thread Hyukjin Kwon
lso help avoid Scala collections leaking > into internals. > > On Mon, Apr 27, 2020 at 8:49 AM Hyukjin Kwon wrote: > >> Let's stick to the less maintenance efforts then rather than we leave it >> undecided and delay with leaving this inconsistency. >> >> I do

Re: [DISCUSS] Java specific APIs design concern and choice

2020-04-27 Thread Hyukjin Kwon
ng preference over option 3 or 4. We may need to > collect more data points from actual users. > > On Mon, Apr 27, 2020 at 9:50 PM Hyukjin Kwon wrote: > >> Scala users are arguably more prevailing compared to Java users, yes. >> Using the Java instances in Scala side is le

Re: [DISCUSS] Java specific APIs design concern and choice

2020-04-27 Thread Hyukjin Kwon
ndly to that. > > Tom > > On Monday, April 27, 2020, 04:04:28 AM CDT, Hyukjin Kwon < > gurwls...@gmail.com> wrote: > > > Hi all, > > I would like to discuss Java specific APIs and which design we will choose. > This has been discussed in multiple places so far, for

[DISCUSS] Java specific APIs design concern and choice

2020-04-27 Thread Hyukjin Kwon
Hi all, I would like to discuss Java specific APIs and which design we will choose. This has been discussed in multiple places so far, for example, at https://github.com/apache/spark/pull/28085#discussion_r407334754 *The problem:* In short, I would like us to have clear guidance on how we suppo

Re: In Apache Spark JIRA, spark/dev/github_jira_sync.py not running properly

2020-04-23 Thread Hyukjin Kwon
hich JIRA is in progress with a PR or not. On Fri, 26 Jul 2019 at 1:20 PM, Hyukjin Kwon wrote: > Just FYI, I had to come up with a better JQL to filter out the JIRAs that > already have linked PRs. > In case it helps someone, I use this JQL now to look through the open > JIRAs: > &g

Re: Automatic PR labeling

2020-04-13 Thread Hyukjin Kwon
Thanks! On Tue, 14 Apr 2020 at 7:42 AM, Jungtaek Lim wrote: > Nice addition, looks pretty good! > > On Tue, Apr 14, 2020 at 1:17 AM Xiao Li wrote: > >> Looks great! >> >> Thanks for making this happen. This is pretty helpful. >> >> Xiao >> >> O

Re: Automatic PR labeling

2020-04-12 Thread Hyukjin Kwon
Okay, now it started to work. Let's see if it works well! On Fri, 3 Apr 2020 at 11:41 AM, Hyukjin Kwon wrote: > Seems like this email missed to cc the mailing list, forwarding it for > trackability. > > -- Forwarded message - > From: Ismaël Mejía > Date:

Fwd: Automatic PR labeling

2020-04-02 Thread Hyukjin Kwon
Seems like this email missed to cc the mailing list, forwarding it for trackability. -- Forwarded message - From: Ismaël Mejía Date: Thu, 2 Apr 2020, 4:46 PM Subject: Re: Automatic PR labeling To: Hyukjin Kwon +1 Just for ref there is a really simple Github App for this: https

Re: Automatic PR labeling

2020-04-02 Thread Hyukjin Kwon
Awesome! On Fri, 3 Apr 2020 at 7:13 AM, Nicholas Chammas wrote: > SPARK-31330 <https://issues.apache.org/jira/browse/SPARK-31330>: > Automatically label PRs based on the paths they touch > > On Wed, Apr 1, 2020 at 11:34 PM Hyukjin Kwon wrote: > >> @Nicholas Chammas Wo

Re: Automatic PR labeling

2020-04-01 Thread Hyukjin Kwon
@Nicholas Chammas Would you be interested in taking a look? I would love this to be done. On Wed, 25 Mar 2020 at 10:30 AM, Hyukjin Kwon wrote: > That should be cool. There were a bit of discussions about which account > should label. If we can replace it, I think it sounds great! > > 202

Re: [DISCUSS] filling affected versions on JIRA issue

2020-04-01 Thread Hyukjin Kwon
> 2) check with older versions to fill up affects version for bug I don't agree with this in general. To me usually it's "For the type of bug, assign one valid version" instead. > The only place where I can see some amount of investigation being required would be for security issues or correctness

Re: Automatic PR labeling

2020-03-24 Thread Hyukjin Kwon
That should be cool. There were a bit of discussions about which account should label. If we can replace it, I think it sounds great! On Wed, 25 Mar 2020 at 5:08 AM, Nicholas Chammas wrote: > Public Service Announcement: There is a GitHub action that lets you > automatically label PRs based on what pat

Re: [DISCUSS] Null-handling of primitive-type of untyped Scala UDF in Scala 2.12

2020-03-17 Thread Hyukjin Kwon
Option 2 seems fine to me. On Tue, 17 Mar 2020 at 3:41 PM, Wenchen Fan wrote: > I don't think option 1 is possible. > > For option 2: I think we need to do it anyway. It's kind of a bug that the > typed Scala UDF doesn't support case class that thus can't support > struct-type input columns. > > For op

Re: Auto-linking from PRs to Jira tickets

2020-03-11 Thread Hyukjin Kwon
Cool, nice! On Thu, 12 Mar 2020 at 8:54 AM, Takeshi Yamamuro wrote: > Cool! Thanks, Dongjoon! > > Bests, > Takeshi > > On Thu, Mar 12, 2020 at 8:27 AM Dongjoon Hyun > wrote: > >> Hi, All. >> >> Autolinking from PR to JIRA started. >> >> *Inside PR* >> https://github.com/apache/spark/pull/27881 >> >> *

Re: [VOTE] Amend Spark's Semantic Versioning Policy

2020-03-09 Thread Hyukjin Kwon
The proposal itself seems good as a set of factors to consider. Thanks, Michael. Several of the concerns mentioned look like good points, in particular: > ... assuming that this is for public stable APIs, not APIs that are marked as unstable, evolving, etc. ... I would like to confirm this. We already have API ann
