Re: [DISCUSS] SPIP: FunctionCatalog

2021-02-10 Thread Dongjoon Hyun
Thank you all for making a giant move forward for Apache Spark 3.2.0. I'm really looking forward to seeing Wenchen's implementation. That would be greatly helpful to make a decision! > I'll implement my idea after the holiday and then we can have more effective discussions. We can also do benchmar

Re: [DISCUSS] SPIP: FunctionCatalog

2021-02-10 Thread Dongjoon Hyun
n calls. > > This also highlights that the approach used in DSv2 and this proposal is > working: start small and use extensions to layer on more complex support. > > On Wed, Feb 10, 2021 at 9:04 AM Dongjoon Hyun > wrote: > > Thank you all for making a giant move forward for

Re: [DISCUSS] SPIP: FunctionCatalog

2021-02-10 Thread Dongjoon Hyun
make DSv2 more complete. This will unblock other DSv2 features, too. Bests, Dongjoon. On Wed, Feb 10, 2021 at 10:58 AM Dongjoon Hyun wrote: > Hi, Ryan. > > We didn't move past anything (both yours and Wenchen's). What Wenchen > suggested is double-checking the alternatives w

Apache Spark 3.0.2 Release ?

2021-02-12 Thread Dongjoon Hyun
Hi, All. As of today, `branch-3.0` has 307 patches (including 25 correctness patches) since v3.0.1 tag (released on September 8th, 2020). Since we stabilized branch-3.0 during 3.1.x preparation so far, it would be great if we start to release Apache Spark 3.0.2 next week. And, I'd like to volunte

Re: Apache Spark 3.0.2 Release ?

2021-02-12 Thread Dongjoon Hyun
Thank you, Sean! On Fri, Feb 12, 2021 at 11:41 AM Sean Owen wrote: > Sounds like a fine time to me, sure. > > On Fri, Feb 12, 2021 at 1:39 PM Dongjoon Hyun > wrote: > >> Hi, All. >> >> As of today, `branch-3.0` has 307 patches (including 25 correctness >>

Re: Apache Spark 3.0.2 Release ?

2021-02-13 Thread Dongjoon Hyun
>>>> >>>> Happy Lunar New Year! >>>> >>>> Xiao >>>> >>>> On Fri, Feb 12, 2021 at 5:33 PM Hyukjin Kwon >>>> wrote: >>>> >>>>> Yeah, +1 too >>>>> >>>>> 202

[VOTE] Release Spark 3.0.2 (RC1)

2021-02-15 Thread Dongjoon Hyun
Please vote on releasing the following candidate as Apache Spark version 3.0.2. The vote is open until February 19th 9AM (PST) and passes if a majority +1 PMC votes are cast, with a minimum of 3 +1 votes. [ ] +1 Release this package as Apache Spark 3.0.2 [ ] -1 Do not release this package because
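The passing rule quoted above can be sketched as a small check. This is only an illustration, assuming "majority" means that binding (PMC) +1 votes outnumber binding -1 votes; the function name `vote_passes` and the example counts are hypothetical and not part of any Spark tooling.

```shell
# vote_passes PLUS MINUS
# Succeeds when there are at least three binding +1 votes and the binding +1
# votes outnumber the binding -1 votes (an assumed reading of the rule).
vote_passes() {
  plus=$1
  minus=$2
  [ "$plus" -ge 3 ] && [ "$plus" -gt "$minus" ]
}

# Example counts (hypothetical): six binding +1 votes, no binding -1 votes.
if vote_passes 6 0; then result="passes"; else result="fails"; fi
echo "$result"
```

Non-binding votes are deliberately left out of the check, since the rule only counts PMC votes.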

Re: [DISCUSS] SPIP: FunctionCatalog

2021-02-16 Thread Dongjoon Hyun
arquet readers although there are a lot of >> >>>> >> details >> >>>> >> that would need to be considered there. >> >>>> >> >> >>>> >> .. Owen >> >>>> >> >> >>>> >

Re: [VOTE] Release Spark 3.0.2 (RC1)

2021-02-16 Thread Dongjoon Hyun
+1 Bests, Dongjoon. On Tue, Feb 16, 2021 at 2:27 AM Herman van Hovell wrote: > +1 > > On Tue, Feb 16, 2021 at 11:08 AM Hyukjin Kwon wrote: > >> +1 >> >> On Tue, Feb 16, 2021 at 5:10 PM, Prashant Sharma wrote: >> >>> +1 >>> >>

Re: [DISCUSS] SPIP: FunctionCatalog

2021-02-17 Thread Dongjoon Hyun
ngjoon pointed out, it would be good to know rough ETA to make sure >> making progress in this, and people can compare more easily. >> >> >> FWIW, there’s the saying I like in the zen of Python >> <https://www.python.org/dev/peps/pep-0020/>: >> >> T

Re: [VOTE] Release Spark 3.0.2 (RC1)

2021-02-17 Thread Dongjoon Hyun
e5448a3/testJar-1613607380546.jar' > 'SparkSubmitClassA' 'SparkSubmitClassB' > > > > external shuffle service *** FAILED *** > FAILED did not equal FINISHED (stdout/stderr was not captured) > (BaseYarnClusterSuite.scala:199) > > > On Tue, Feb 1

Re: Bug?

2021-02-18 Thread Dongjoon Hyun
Thank you for sharing, Tyson. Spark 2.4.4 looks too old to me. Do you think it will occur at 3.x? Bests, Dongjoon. On Thu, Feb 18, 2021 at 11:07 AM Tyson wrote: > We observed an interesting stack trace that I'd like to share with you. > The logging level is WARN, but it appears to be causing

Re: [VOTE] Release Spark 3.0.2 (RC1)

2021-02-19 Thread Dongjoon Hyun
t or specific > to my env; just checking if anyone else sees this. Obviously the main test > builds do not fail on Jenkins. > > On Wed, Feb 17, 2021 at 10:47 PM Dongjoon Hyun > wrote: > >> I didn't see them. Could you describe your environment: OS, Java, >> Maven/S

[VOTE][RESULT] Release Spark 3.0.2 (RC1)

2021-02-19 Thread Dongjoon Hyun
The vote passes. Thanks to all who helped with the release! (* = binding) +1 - Prashant Sharma * - Hyukjin Kwon * - Herman van Hovell * - Dongjoon Hyun * - Wenchen Fan * - Maxim Gekk - John Zhuge - Takeshi Yamamuro - Sean Owen * +0: None -1: None

[ANNOUNCE] Announcing Apache Spark 3.0.2

2021-02-19 Thread Dongjoon Hyun
. Dongjoon Hyun

Re: [DISCUSS] SPIP: FunctionCatalog

2021-02-23 Thread Dongjoon Hyun
> where I think it's better to directly take the input columns as the UDF >>>>>>>> parameter, instead of wrapping the input columns with InternalRow >>>>>>>> and taking the InternalRow as the UDF parameter. It's not only for >>>>

Re: [VOTE] Release Spark 3.1.1 (RC3)

2021-02-24 Thread Dongjoon Hyun
+1 Bests, Dongjoon On Wed, Feb 24, 2021 at 5:46 AM Gabor Somogyi wrote: > +1 (non-binding) > > Tested my added security-related features, found an issue but not a blocker. > > On Wed, 24 Feb 2021, 09:47 Hyukjin Kwon, wrote: > >> I remember HiveExternalCatalogVersionsSuite was flaky for a while

Apache Spark 3.2 Expectation

2021-02-25 Thread Dongjoon Hyun
Hi, All. Since we have been preparing Apache Spark 3.2.0 in master branch since December 2020, March seems to be a good time to share our thoughts and aspirations on Apache Spark 3.2. According to the progress on Apache Spark 3.1 release, Apache Spark 3.2 seems to be the last minor release of thi

Re: Apache Spark 3.2 Expectation

2021-02-26 Thread Dongjoon Hyun
ready some good stuff in 3.2 and will be a good minor release in 5-6 > months. > > On Thu, Feb 25, 2021 at 10:57 AM Dongjoon Hyun > wrote: > >> Hi, All. >> >> Since we have been preparing Apache Spark 3.2.0 in master branch since >> December 2020, March see

Re: Apache Spark 3.2 Expectation

2021-02-26 Thread Dongjoon Hyun
On Fri, Feb 26, 2021 at 11:13 AM Xiao Li wrote: > Do we have enough features in the current master branch? > Hi, Xiao. Is this a question to Sean's previous comment, `There is already some good stuff in 3.2 and will be a good minor release in 5-6 months.`? On Thu, Feb 25, 2021 at 9:33 AM Sean

Re: Apache Spark 3.2 Expectation

2021-02-26 Thread Dongjoon Hyun
urrent master branch? If not, are we able to >> finish major features we collected here? Do they have a timeline or project >> plan? >> >> Xiao >> >> On Fri, Feb 26, 2021 at 10:07 AM, Dongjoon Hyun wrote: >> >>> Thank you, Mridul and Sean. >>> >>

Re: Apache Spark 3.2 Expectation

2021-02-26 Thread Dongjoon Hyun
ate if we have some more time for 3.2. > > In addition, It would also be great if we follow the schedule and catch > potential blockers quickly during QA instead of when we cut RCs. That will > considerably speed up the process and make it on time. > > Thanks. > > > On Sat,

Re: Please take a look at the draft of the Spark 3.1.1 release notes

2021-02-27 Thread Dongjoon Hyun
Thank you for sharing, Hyukjin! Dongjoon. On Sat, Feb 27, 2021 at 12:36 AM Hyukjin Kwon wrote: > Hi all, > > I am preparing to publish and announce Spark 3.1.1. > This is the draft of the release note, and I plan to edit a bit more and > use it as the final release note. > Please take a look an

Re: minikube and kubernetes cluster versions for integration testing

2021-03-02 Thread Dongjoon Hyun
Thank you for sharing and for the suggestion, Attila. Additionally, given the following information, - The latest Minikube is v1.18.0 with K8s v1.20.2 - AWS EKS will add K8s v1.20 in April 2021 - The end-of-support dates in AWS EKS are K8s v1.15 (May 3, 2021) K8s v1.16 (July, 2021) K8s v1.17 (Sept

Re: [ANNOUNCE] Announcing Apache Spark 3.1.1

2021-03-03 Thread Dongjoon Hyun
It took a long time. Thank you, Hyukjin and all! Bests, Dongjoon. On Wed, Mar 3, 2021 at 3:23 AM Gabor Somogyi wrote: > Good to hear and great work Hyukjin! 👏 > > On Wed, 3 Mar 2021, 11:15 Jungtaek Lim, > wrote: > >> Thanks Hyukjin for driving the huge release, and thanks everyone for >> contr

Apache Spark 2.4.8 (and EOL of 2.4)

2021-03-03 Thread Dongjoon Hyun
Hi, All. We have successfully completed the Apache Spark 3.1.1 and 3.0.2 releases and have already started the 3.2.0 discussion. Let's talk about branch-2.4 because there are some discussions on JIRA and GitHub about skipping backporting to 2.4. Since `branch-2.4` has been maintained well as LTS, I'd like to s

Re: Apache Spark 2.4.8 (and EOL of 2.4)

2021-03-03 Thread Dongjoon Hyun
Thank you for volunteering as Apache Spark 2.4.8 release manager, Liang-chi! On Wed, Mar 3, 2021 at 10:13 AM Liang-Chi Hsieh wrote: > > Thanks Dongjoon! > > +1 and I volunteer to do the release of 2.4.8 if it passes. > > > Liang-Chi > > > > > -- > Sent from: http://apache-spark-developers-list.1

Re: Apache Spark 2.4.8 (and EOL of 2.4)

2021-03-03 Thread Dongjoon Hyun
> about right timing-wise. > > We should in any event release 2.4.8, yes. We can of course choose to > release a 2.4.9 if some critical issue is found, later. > > But yeah based on the velocity of back-ports to 2.4.x, it seems about time > to call it EOL. > > Sean > >

Re: Apache Spark 3.2 Expectation

2021-03-03 Thread Dongjoon Hyun
community already had fairly > detailed discussions. > > Thanks, > John > > On Thu, Feb 25, 2021 at 8:57 AM Dongjoon Hyun > wrote: > >> Hi, All. >> >> Since we have been preparing Apache Spark 3.2.0 in master branch since >> December 2020, March seems to b

Re: [DISCUSS] SPIP: FunctionCatalog

2021-03-03 Thread Dongjoon Hyun
Hi, All. We shared many opinions in different perspectives. However, we didn't reach a consensus even on a partial merge by excluding something (on the PR by me, on this mailing thread by Wenchen). For the following claims, we have another alternative to mitigate it. > I don't like it becaus

Re: Apache Spark 2.4.8 (and EOL of 2.4)

2021-03-04 Thread Dongjoon Hyun
Thank you, Liang-Chi! Next Monday sounds good. To All. Please ping Liang-Chi if you have a missed backport. Bests, Dongjoon. On Thu, Mar 4, 2021 at 7:00 PM Xiao Li wrote: > Thank you, Liang-Chi! > > Xiao > > On Thu, Mar 4, 2021 at 6:25 PM Hyukjin Kwon wrote: > >> Thanks @Liang-Chi Hsieh fo

Max Gekk

2021-03-05 Thread Dongjoon Hyun
Hi, Xiao. After your last email about him, I observed his strong and persistent activity. Thank you for your mentoring. As of today, he is the Top-9th contributor. Do you think you need more time to mentor him for additional dimensions? From my side, it looks enough and I'd like to nominate Max Ge

Re: [VOTE] SPIP: Add FunctionCatalog

2021-03-08 Thread Dongjoon Hyun
+1 (binding) Thank you, Ryan. Bests, Dongjoon. On Mon, Mar 8, 2021 at 5:20 PM Chao Sun wrote: > +1 (non-binding) > > On Mon, Mar 8, 2021 at 5:13 PM John Zhuge wrote: > >> +1 (non-binding) >> >> On Mon, Mar 8, 2021 at 4:32 PM Holden Karau wrote: >> >>> +1 (binding) >>> >>> On Mon, Mar 8, 202

Re: Apache Spark 2.4.8 (and EOL of 2.4)

2021-03-09 Thread Dongjoon Hyun
Thank you for the update. +1 for your plan. Bests, Dongjoon. On Tue, Mar 9, 2021 at 12:46 PM Liang-Chi Hsieh wrote: > I just contacted Shane and seems there is ongoing github fetches timing out > issue on Jenkins. > > That being said, currently the QA test is unavailable. I guess it is unsafe

Re: Apache Spark 3.2 Expectation

2021-03-10 Thread Dongjoon Hyun
in the current master branch? If not, are we able to finish major features we collected here? Do they have a timeline or project plan? Bests, Dongjoon. On Wed, Mar 3, 2021 at 2:58 PM Dongjoon Hyun wrote: > Hi, John. > > This thread aims to share your expectations and goals (and maybe

Re: Apache Spark 3.2 Expectation

2021-03-11 Thread Dongjoon Hyun
n in Spark 3.2: Lateral >>> Join support <https://issues.apache.org/jira/browse/SPARK-28379>, >>> interval data type, timestamp without time zone, un-nesting arbitrary >>> queries, the returned metrics of DSV2, and error message standardization. >>> Spar

Re: [DISCUSS] Support pandas API layer on PySpark

2021-03-14 Thread Dongjoon Hyun
Thank you for the proposal. It looks like a good addition. BTW, what is the future plan for the existing APIs? Are we going to deprecate it eventually in favor of Koalas (because we don't remove the existing APIs in general)? > Fourthly, PySpark is still not Pythonic enough. For example, I hear co

Re: [VOTE] SPIP: Support pandas API layer on PySpark

2021-03-26 Thread Dongjoon Hyun
+1 Thank you, Hyukjin Bests, Dongjoon. On Fri, Mar 26, 2021 at 8:08 AM Hyukjin Kwon wrote: > I'll start with my +1 (binding) > > On Fri, 26 Mar 2021, 23:52 Hyukjin Kwon, wrote: > >> Hi all, >> >> I’d like to start a vote for SPIP: Support pandas API layer on PySpark. >> >> The proposal is to

Re: Welcoming six new Apache Spark committers

2021-03-26 Thread Dongjoon Hyun
Congratulations! :) Bests, Dongjoon. On Fri, Mar 26, 2021 at 5:55 PM angers zhu wrote: > Congratulations > > On Sat, Mar 27, 2021 at 8:35 AM, Prashant Sharma wrote: > >> Congratulations 🎊 all!! >> >> On Sat, Mar 27, 2021, 5:10 AM huaxin gao wrote: >> >>> Congratulations to you all!! >>> >>> On Fri, Mar 2

Re: Apache Spark 2.4.8 (and EOL of 2.4)

2021-04-04 Thread Dongjoon Hyun
Given that Maven passed already with that profile and you tested locally, I'm +1 for starting the RC. Thanks, Dongjoon. On Sun, Apr 4, 2021 at 2:24 AM Hyukjin Kwon wrote: > I would +1 for just going ahead. That looks flaky to me too. > > Thanks Liang-Chi for driving this! > > > On Sun, 4 Apr 2021, 18:1

Re: [PSA] Please read: PR builder now runs test and build in your forked repository

2021-04-14 Thread Dongjoon Hyun
Thank you again, Hyukjin. Bests, Dongjoon. On Wed, Apr 14, 2021 at 5:25 AM Kent Yao wrote: > Cool, thanks! > > On Wed, Apr 14, 2021 at 8:19 PM, Hyukjin Kwon wrote: > >> Good point! I had to clarify. >> Once is enough. The sync is needed for your branch to include the changes >> of https://github.com/apache/sp

Re: [VOTE] Release Spark 2.4.8 (RC2)

2021-04-14 Thread Dongjoon Hyun
+1 Bests, Dongjoon. On Tue, Apr 13, 2021 at 10:38 PM Kent Yao wrote: > +1 (non-binding) > > *Kent Yao * > @ Data Science Center, Hangzhou Research Institute, NetEase Corp. > *a spark enthusiast* > *kyuubi is a unified multi-tenant JDBC > interface for large-s

Re: [PSA] Please read: PR builder now runs test and build in your forked repository

2021-04-14 Thread Dongjoon Hyun
ps://github.com/apache/spark/pull/32168 (PR after sync) >4.2. https://github.com/apache/spark/pull/32172 (PR after re-forked) >4.3. > https://github.com/attilapiros/spark/runs/2344911058?check_suite_focus=true > (some other failures noticed) > > > Bests, > >

Re: [VOTE] Release Spark 2.4.8 (RC3)

2021-04-28 Thread Dongjoon Hyun
+1 Thank you! Bests, Dongjoon. On Wed, Apr 28, 2021 at 10:32 AM Maxim Gekk wrote: > +1 (non-binding) > > On Wed, Apr 28, 2021 at 8:12 PM Wenchen Fan wrote: > >> +1 (binding) >> >> On Thu, Apr 29, 2021 at 1:05 AM DB Tsai wrote: >> >>> +1 (binding) >>> >>> > On Apr 28, 2021, at 9:26 AM, Liang-

Re: Bintray replacement for spark-packages.org

2021-04-29 Thread Dongjoon Hyun
I agree with Wenchen. I can volunteer for Apache Spark 3.1.2 release manager at least. Bests, Dongjoon. On Wed, Apr 28, 2021 at 10:15 AM Wenchen Fan wrote: > Shall we make new releases for 3.0 and 3.1? So that people don't need to > change the sbt resolver/pom files to work around Bintray sun

Re: [VOTE] Release Spark 2.4.8 (RC4)

2021-05-09 Thread Dongjoon Hyun
+1 The additional one patch for the old bug fix looks safe. Thank you, Liang-Chi. Bests, Dongjoon. On Sun, May 9, 2021 at 2:22 PM Liang-Chi Hsieh wrote: > Please vote on releasing the following candidate as Apache Spark version > 2.4.8. > > The vote is open until May 14th at 9AM PST and passe

Re: Bintray replacement for spark-packages.org

2021-05-11 Thread Dongjoon Hyun
Thank you, Yi and all. Then, after the 2.4.8 release, shall we start to roll 3.1.2 and 3.0.3? Bests, Dongjoon. On Mon, May 10, 2021 at 10:50 PM Yi Wu wrote: > Hi wenchen, > > I'd like to volunteer for Apache Spark 3.0.3 release. > > Thanks, > Yi > > On Fri, Apr 30, 2

Apache Spark 3.1.2 Release?

2021-05-17 Thread Dongjoon Hyun
Hi, All. Since the Apache Spark 3.1.1 tag was created (Feb 21), 172 new patches, including 9 correctness patches and 4 K8s patches, have arrived at branch-3.1. Shall we make a new release, Apache Spark 3.1.2, as the second release in the 3.1 line? I'd like to volunteer as the release manager for Apache Spark 3.1.

Re: [ANNOUNCE] Apache Spark 2.4.8 released

2021-05-17 Thread Dongjoon Hyun
Finally! Thank you, Liang-Chi. Bests, Dongjoon. On Mon, May 17, 2021 at 10:14 PM Takeshi Yamamuro wrote: > Thank you for the release work, Liang-Chi~ > > On Tue, May 18, 2021 at 2:11 PM Hyukjin Kwon wrote: > >> Yay! >> >> On Tue, May 18, 2021 at 12:57 PM, Liang-Chi Hsieh wrote: >> >>> We are happy t

Re: Apache Spark 3.1.2 Release?

2021-05-18 Thread Dongjoon Hyun
RM :) >>>>>> >>>>>> On Mon, May 17, 2021 at 4:09 PM Takeshi Yamamuro < >>>>>> linguin@gmail.com> wrote: >>>>>> >>>>>>> Thank you, Dongjoon~ sgtm, too. >>>>>>>

Re: Resolves too old JIRAs as incomplete

2021-05-19 Thread Dongjoon Hyun
+1. Thank you, Takeshi. On Wed, May 19, 2021 at 7:49 PM Hyukjin Kwon wrote: > Yeah, I wanted to discuss this. I agree since 2.4.x became EOL > > On Thu, May 20, 2021 at 10:54 AM, Sean Owen wrote: > >> I agree. Such old JIRAs are 99% obsolete. If anyone objects to a >> particular issue being closed, th

[VOTE] Release Spark 3.1.2 (RC1)

2021-05-23 Thread Dongjoon Hyun
Please vote on releasing the following candidate as Apache Spark version 3.1.2. The vote is open until May 27th 1AM (PST) and passes if a majority +1 PMC votes are cast, with a minimum of 3 +1 votes. [ ] +1 Release this package as Apache Spark 3.1.2 [ ] -1 Do not release this package because ...

Re: [VOTE] Release Spark 3.1.2 (RC1)

2021-05-25 Thread Dongjoon Hyun
> Gengliang > > > > > On Tue, May 25, 2021 at 11:31 PM Sean Owen wrote: > >> +1 same result as in previous tests >> >> On Mon, May 24, 2021 at 1:14 AM Dongjoon Hyun >> wrote: >> >>> Please vote on releasing the following candidate as Apache Sp

Re: [VOTE] Release Spark 3.1.2 (RC1)

2021-05-26 Thread Dongjoon Hyun
tras>A** library t**hat > brings useful functions from various modern database management systems to > **Apache > Spark <http://spark.apache.org/>.* > > > > On 05/27/2021 10:44,Yuming Wang > wrote: > > +1 (non-binding) > > On Wed, May 26, 2021 at 11:27

Re: [VOTE] Release Spark 3.1.2 (RC1)

2021-05-27 Thread Dongjoon Hyun
This vote passed. I'll finalize this vote. Thank you all. Dongjoon. On 2021/05/27 16:26:55, Chao Sun wrote: > +1 (non-binding) - thanks Dongjoon for the work! > > On Wed, May 26, 2021 at 8:35 PM Dongjoon Hyun > wrote: > > > +1 > > > > Bests, > > D

[VOTE][RESULT] Release Spark 3.1.2 (RC1)

2021-05-27 Thread Dongjoon Hyun
The vote passes with 13 +1s (4 binding +1s). Thanks to all who helped with the release! (* = binding) +1: - Sean Owen * - Gengliang Wang - Takeshi Yamamuro - Cheng Su - Hyukjin Kwon * - Liang-Chi Hsieh - John Zhuge - Thomas Graves * - Maxim Gekk - Yuming Wang - Kent Yao - Dongjoon Hyun * - Chao

[ANNOUNCE] Apache Spark 3.1.2 released

2021-06-01 Thread Dongjoon Hyun
. Dongjoon Hyun

Re: Apache Spark 3.0.3 Release?

2021-06-08 Thread Dongjoon Hyun
+1, Thank you! :) Bests, Dongjoon. On Tue, Jun 8, 2021 at 9:05 PM Kent Yao wrote: > +1. Thanks, Yi ~ > > Bests, > *Kent Yao * > @ Data Science Center, Hangzhou Research Institute, NetEase Corp. > *a spark enthusiast* > *kyuubi is a unified multi-tenant JDBC >

Re: Apache Spark 3.2 Expectation

2021-06-16 Thread Dongjoon Hyun
Thank you for volunteering, Gengliang. Apache Spark 3.2.0 is the first version enabling AQE by default. I'm also watching some on-going improvements on that. https://issues.apache.org/jira/browse/SPARK-33828 (SQL Adaptive Query Execution QA) To Liang-Chi, I'm -1 for postponing the branch cut

UPDATE: Apache Spark 3.2 Release

2021-06-16 Thread Dongjoon Hyun
This is a continuation of the previous thread, `Apache Spark 3.2 Expectation`, in order to give you updates. - https://lists.apache.org/thread.html/r61897da071729913bf586ddd769311ce8b5b068e7156c352b51f7a33%40%3Cdev.spark.apache.org%3E First of all, the AS-IS schedule is here - https://spark.ap

Re: Apache Spark 3.2 Expectation

2021-06-16 Thread Dongjoon Hyun
garding the soft cut date and the >> flexibility for including the remaining tickets under SPARK-30602 into >> branch-3.2. >> >> Best, >> Min >> >> On Wed, Jun 16, 2021 at 1:20 PM Liang-Chi Hsieh wrote: >> >>> >>> Thanks Dongjoo

Re: UPDATE: Apache Spark 3.2 Release

2021-06-17 Thread Dongjoon Hyun
he.org/release/3.3.1.html > > Regards, > Yikun > > > On Thu, Jun 17, 2021 at 5:58 AM, Dongjoon Hyun wrote: > > > This is a continuation of the previous thread, `Apache Spark 3.2 > > Expectation`, in order to give you updates. > > > > - > > https://

Re: [VOTE] Release Spark 3.0.3 (RC1)

2021-06-20 Thread Dongjoon Hyun
+1 Thank you, Yi. Bests, Dongjoon. On Sat, Jun 19, 2021 at 6:57 PM Yuming Wang wrote: > +1 > > Tested a batch of production query with Thrift Server. > > On Sat, Jun 19, 2021 at 3:04 PM Mridul Muralidharan > wrote: > >> >> +1 >> >> Signatures, digests, etc check out fine. >> Checked out tag

Re: Missing module spark-hadoop-cloud in Maven central

2021-06-21 Thread Dongjoon Hyun
Hi, Steve. Here is the PR for publishing it as a part of Apache Spark 3.2.0+ https://github.com/apache/spark/pull/33003 Bests, Dongjoon. On 2021/06/01 17:09:53, Steve Loughran wrote: > (can't reply to user@, so pulling @dev instead. sorry)

Re: [DISCUSS] Rename hadoop-3.2/hadoop-2.7 profile to hadoop-3/hadoop-2?

2021-06-24 Thread Dongjoon Hyun
For renaming, I'd target it for Apache Spark 3.3 instead of Apache Spark 3.2 because this is the first release of using Apache Hadoop 3.3.1 and we may need to revert Apache Hadoop 3.3.1 during RC period. Dongjoon. On Thu, Jun 24, 2021 at 12:24 PM Sean Owen wrote: > The downside here is that it

Re: [ANNOUNCE] Apache Spark 3.0.3 released

2021-06-25 Thread Dongjoon Hyun
Thank you, Yi! On Thu, Jun 24, 2021 at 10:52 PM Yi Wu wrote: > We are happy to announce the availability of Spark 3.0.3! > > Spark 3.0.3 is a maintenance release containing stability fixes. This > release is based on the branch-3.0 maintenance branch of Spark. We strongly > recommend all 3.0 u

Re: Apache Spark 3.2 Expectation

2021-07-01 Thread Dongjoon Hyun
can speed up on these items in the list too. >>> >>> >>> On Thu, 17 Jun 2021, 15:08 Gengliang Wang, wrote: >>> >>>> Thanks for the suggestions from Dongjoon, Liangchi, Min, and Xiao! >>>> Now we make it clear that it's a soft cut and

Re: Flaky build in GitHub Actions

2021-07-21 Thread Dongjoon Hyun
Thank you, Hyukjin! Dongjoon. On Tue, Jul 20, 2021 at 8:53 PM Hyukjin Kwon wrote: > I filed a ticket at GitHub. I will share more details when I get a > response from them. > > On Tue, Jul 20, 2021 at 7:30 PM, Hyukjin Kwon wrote: > >> Hi all, >> >> Looks like there's something going on in the machines

Re: Adding Spark 4 to JIRA for targetted versions

2021-09-13 Thread Dongjoon Hyun
I'm fine to have the version number, but breaking API compatibility should be discussed separately in the community. We decided to strive to avoid breaking APIs even in major versions and made a policy for that. https://spark.apache.org/versioning-policy.html > The Spark project strives to avoid b

Re: [VOTE] Release Spark 3.2.0 (RC5)

2021-09-27 Thread Dongjoon Hyun
Unfortunately, it's the same for me recently. Not only that, but I also hit MetaspaceSize OOM, too. I ended up with MAVEN_OPTS like the following. -Xms12g -Xmx12g -Xss128M -XX:MaxMetaspaceSize=4g ... Dongjoon. On Mon, Sep 27, 2021 at 12:18 PM Sean Owen wrote: > Has anyone seen a StackOverflow
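For anyone wanting to try the same settings, a minimal sketch of exporting them before a build is below. The four JVM flags are copied from the message above; the part of the original `MAVEN_OPTS` that is elided as `...` is not reconstructed, and the commented-out build command is only an assumption about how the build would be started.

```shell
# JVM flags taken from the message above; the elided "..." portion of the
# original MAVEN_OPTS is intentionally left out rather than guessed.
export MAVEN_OPTS="-Xms12g -Xmx12g -Xss128M -XX:MaxMetaspaceSize=4g"
echo "MAVEN_OPTS=${MAVEN_OPTS}"

# Assumed build invocation; adjust profiles and goals to your environment.
# ./build/mvn -DskipTests clean package
```

Note that `-Xms12g` asks the JVM for a 12 GB heap up front, so these values only suit a machine with enough free memory.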

Re: [VOTE] Release Spark 3.2.0 (RC7)

2021-10-07 Thread Dongjoon Hyun
+1 for Apache Spark 3.2.0 RC7. It looks good to me. I tested with EKS 1.21 additionally. Cheers, Dongjoon. On Thu, Oct 7, 2021 at 7:46 PM 郑瑞峰 wrote: > +1 (non-binding) > > > -- Original Message -- > *From:* "Sean Owen" ; > *Sent:* Thursday, October 7, 2021, 10:23 PM > *To:* "Gengliang

Re: [ANNOUNCE] Apache Spark 3.2.0

2021-10-19 Thread Dongjoon Hyun
Thank you so much, Gengliang and all! Dongjoon. On Tue, Oct 19, 2021 at 8:48 AM Xiao Li wrote: > Thank you, Gengliang! > > Congrats to our community and all the contributors! > > Xiao > > On Tue, Oct 19, 2021 at 8:26 AM, Henrik Peng wrote: > >> Congrats and thanks! >> >> >> Gengliang Wang, on Tue, Oct 19, 2021 (PM) at 10

Re: [DISCUSS] SPIP: Storage Partitioned Join for Data Source V2

2021-10-26 Thread Dongjoon Hyun
+1 for this SPIP. On Sun, Oct 24, 2021 at 9:59 AM huaxin gao wrote: > +1. Thanks for lifting the current restrictions on bucket join and making > this more generalized. > > On Sun, Oct 24, 2021 at 9:33 AM Ryan Blue wrote: > >> +1 from me as well. Thanks Chao for doing so much to get it to this

Re: Update Spark 3.3 release window?

2021-10-28 Thread Dongjoon Hyun
+1 for mid March for Spark 3.3. For 2.4, our document already mentioned its EOL like " For example, 2.4.0 was released in November 2nd 2018 and had been maintained for 31 months until 2.4.8 was released on May 2021. 2.4.8 is the last release and no more 2.4.x releases should be expected even for

Re: [VOTE] SPIP: Storage Partitioned Join for Data Source V2

2021-10-29 Thread Dongjoon Hyun
+1 Dongjoon On 2021/10/29 17:48:59, Russell Spitzer wrote: > +1 This is a great idea, (I have no Apache Spark voting points) > > On Fri, Oct 29, 2021 at 12:41 PM L. C. Hsieh wrote: > > > > > I'll start with my +1. > > > > On 2021/10/29 17:30:03, L. C. Hsieh wrote: > > > Hi all, > > > > > >

[FYI] Build and run tests on Java 17 for Apache Spark 3.3

2021-11-12 Thread Dongjoon Hyun
Hi, All. Apache Spark community has been working on Java 17 support under the following JIRA. https://issues.apache.org/jira/browse/SPARK-33772 As of today, Apache Spark starts to have daily Java 17 test coverage via GitHub Action jobs for Apache Spark 3.3. https://github.com/apache/spark/

Re: [VOTE] SPIP: Row-level operations in Data Source V2

2021-11-12 Thread Dongjoon Hyun
+1 On Fri, Nov 12, 2021 at 6:58 PM huaxin gao wrote: > +1 > > On Fri, Nov 12, 2021 at 6:44 PM Yufei Gu > wrote: > >> +1 >> >> > On Nov 12, 2021, at 6:25 PM, L. C. Hsieh wrote: >> > >> > Hi all, >> > >> > I’d like to start a vote for SPIP: Row-level operations in Data Source >> V2. >> > >> > Th

Re: [Apache Spark Jenkins] build system shutting down Dec 23th, 2021

2021-12-06 Thread Dongjoon Hyun
I really want to thank you for all your help. You've done so many things for the Apache Spark community. Sincerely, Dongjoon On Mon, Dec 6, 2021 at 12:02 PM shane knapp ☠ wrote: > hey everyone! > > after a marathon run of nearly a decade, we're finally going to be > shutting down {amp|rise}lab

Re: Time for Spark 3.2.1?

2021-12-07 Thread Dongjoon Hyun
+1 for new releases. Dongjoon. On Mon, Dec 6, 2021 at 8:51 PM Wenchen Fan wrote: > +1 to make new maintenance releases for all 3.x branches. > > On Tue, Dec 7, 2021 at 8:57 AM Sean Owen wrote: > >> Always fine by me if someone wants to roll a release. >> >> It's been ~6 months since the last 3

Re: Tries on migrating Spark Linux arm64 Job from Jenkins to GitHub Actions

2022-01-09 Thread Dongjoon Hyun
>> policy". >>> >>> *## Pros of self-hosted GitHub Actions:* >>> - Satisfy the simple demands of Linux arm64 scheduled jobs. >>> - Reuse the main workflow of GitHub Actions. >>> - All changes are visible on GitHub and easy to review. >>> - Easy t

Re: Apache Spark Jenkins Infra 2022

2022-01-10 Thread Dongjoon Hyun
e spark jenkins lives on! > > @dongjoon, let me know if there's anything you need. nice work, as > always. :) > > shane > > On Sat, Jan 8, 2022 at 7:40 PM Yikun Jiang wrote: > >> @Dongjoon Hyun Thanks for your work on “Apache >> Spark Jenkins Infra 202

Re: [VOTE] Release Spark 3.2.1 (RC1)

2022-01-15 Thread Dongjoon Hyun
Hi, Bjorn. It seems that you are confused about my announcement. The test coverage announcement is about the `master` branch which is for the upcoming Apache Spark 3.3.0. Apache Spark 3.3 will start to support Java 17, not old release branches like Apache Spark 3.2.x/3.1.x/3.0.x. > 1. If I change

Re: [VOTE] Release Spark 3.2.1 (RC2)

2022-01-24 Thread Dongjoon Hyun
+1 Dongjoon. On Sat, Jan 22, 2022 at 7:19 AM Mridul Muralidharan wrote: > > +1 > > Signatures, digests, etc check out fine. > Checked out tag and build/tested with -Pyarn -Pmesos -Pkubernetes > > Regards, > Mridul > > On Fri, Jan 21, 2022 at 9:01 PM Sean Owen wrote: > >> +1 with same result as

Re: [VOTE][RESULT] Release Spark 3.2.1 (RC2)

2022-01-25 Thread Dongjoon Hyun
Thank you, Huaxin and all! Dongjoon On Tue, Jan 25, 2022 at 8:13 PM huaxin gao wrote: > The vote passes with 13 +1s (4 binding +1s). Thanks to all who helped with > the release! (* = binding) +1: > - Sean Owen * > - Mridul Muralidharan * > - Dongjoon Hyun * > - Genglian

Re: [ANNOUNCE] Apache Spark 3.2.1 released

2022-01-28 Thread Dongjoon Hyun
Thank you again, Huaxin! Dongjoon. On Fri, Jan 28, 2022 at 6:23 PM DB Tsai wrote: > Thank you, Huaxin for the 3.2.1 release! > > Sent from my iPhone > > On Jan 28, 2022, at 5:45 PM, Chao Sun wrote: > >  > Thanks Huaxin for driving the release! > > On Fri, Jan 28, 2022 at 5:37 PM Ruifeng Zheng

Re: MetadataFetchFailedException due to decommission block migrations

2022-02-02 Thread Dongjoon Hyun
Thank you for sharing, Emil. > I am willing to help develop a fix, but might need some guidance on > how this case could be handled better. Could you file an official Apache JIRA for your finding and propose a PR for that too with the test case? We can continue our discussion on your PR. Dong

Re: [VOTE] Spark 3.1.3 RC3

2022-02-03 Thread Dongjoon Hyun
Unfortunately, -1 for 3.1.3 RC3 due to the packaging issue. It seems that the master branch release script didn't work properly for Hadoop 2 binary distribution, Holden. $ curl -s https://dist.apache.org/repos/dist/dev/spark/v3.1.3-rc3-bin/spark-3.1.3-bin-hadoop2.tgz | tar tz | grep hadoop-common
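The check in the message above lists the entries of a release tarball and filters them by name. A minimal local sketch of the same idea is below; the original streams the archive with `curl`, whereas this version builds a throwaway archive so it runs offline. The helper name `list_matching_entries` and the file names are made up for illustration.

```shell
# list_matching_entries ARCHIVE PATTERN
# Prints the members of a gzipped tar archive whose paths contain PATTERN.
list_matching_entries() {
  tar tzf "$1" | grep -- "$2"
}

# Self-contained demo on a temporary archive (hypothetical file names).
workdir=$(mktemp -d)
mkdir -p "$workdir/dist/jars"
touch "$workdir/dist/jars/hadoop-common-x.y.z.jar" "$workdir/dist/jars/other.jar"
tar czf "$workdir/dist.tgz" -C "$workdir" dist

matches=$(list_matching_entries "$workdir/dist.tgz" hadoop-common)
echo "$matches"
```

Against a real release candidate, the archive would be the downloaded `.tgz`, and the listed jar names show which Hadoop version was actually packaged.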

Re: Apache Spark 3.3 Release

2022-03-03 Thread Dongjoon Hyun
Thank you, Max, for volunteering for Apache Spark 3.3 release manager. Ya, I'm also +1 for the original plan. Dongjoon On Thu, Mar 3, 2022 at 10:52 AM Mridul Muralidharan wrote: > > Agree with Sean, code freeze by mid March sounds good. > > Regards, > Mridul > > On Thu, Mar 3, 2022 at 12:47 PM

Re: Apache Spark 3.3 Release

2022-03-04 Thread Dongjoon Hyun
elease branch cut. >> >> Btw, would we be open for modification of critical/blocker issues after >> the release branch cut? I have a blocker JIRA ticket and the PR is open for >> reviewing, but need some time to gain traction as well as going through >> actual revi

Re: Apache Spark 3.3 Release

2022-03-15 Thread Dongjoon Hyun
The following was tested and merged a few minutes ago. So, we can remove it from the list. #35819 [SPARK-38524][SPARK-38553][K8S] Bump Volcano to v1.5.1 Thanks, Dongjoon. On Tue, Mar 15, 2022 at 9:48 AM Xiao Li wrote: > Let me clarify my above sugge

Re: Apache Spark 3.3 Release

2022-03-15 Thread Dongjoon Hyun
>> #34659 [SPARK-34863][SQL] Support complex types for Parquet vectorized > reader > >> #35848 [SPARK-38548][SQL] New SQL function: try_sum > >> > >> Do you mean we should include them, or exclude them from 3.3? > >> > >> Thanks, > >> Chao

Re: Apache Spark 3.3 Release

2022-03-15 Thread Dongjoon Hyun
very few 3.4-only feature work that will be affected. > > Xiao > > On Tue, Mar 15, 2022 at 11:49, Dongjoon Hyun wrote: > >> Hi, Max, Chao, Xiao, Holden and all. >> >> I have a different idea. >> >> Given the situation and small patch list, I don't think we need

Re: Apache Spark 3.3 Release

2022-03-15 Thread Dongjoon Hyun
eature work. In the next three days, > let us collect the actively developed PRs that we want to make an exception > (i.e., merged to 3.3 after the upcoming branch cut). Does that make sense? > > On Tue, Mar 15, 2022 at 14:54, Dongjoon Hyun wrote: > >> Xiao. You are working against what you are

Re: Apache Spark 3.3 Release

2022-03-15 Thread Dongjoon Hyun
ng them in an ad hoc way. In the past, we spent a > lot of time on the revert of the PRs that were merged after the branch cut. > I hope we can minimize unnecessary arguments in this release. Do you agree, > Dongjoon? > > > > On Tue, Mar 15, 2022 at 15:55, Dongjoon Hyun wrote: > >>

Re: Skip single integration test case in Spark on K8s

2022-03-16 Thread Dongjoon Hyun
-user@spark For cloud backend, you need to exclude minikube specific tests and local-only test (SparkRemoteFileTest). -Dtest.exclude.tags=minikube,local You can find more options including SBT commands here. https://github.com/apache/spark/tree/master/resource-managers/kubernetes/integrati

Re: bazel and external/

2022-03-17 Thread Dongjoon Hyun
Thank you for posting this, Alkis. Before the question (1) and (2), I'm curious if the Apache Spark community has other downstreams using Bazel. To All. If there are some Bazel users with Apache Spark code, could you share your practice? If you are using renaming, what is your renamed directory n

Re: Apache Spark 3.3 Release

2022-03-18 Thread Dongjoon Hyun
park/pull/35262) >> >> >> >> It's already reviewed and approved. >> >> >> >> On Wed, Mar 16, 2022 at 9:13 AM Tom Graves >> wrote: >> >> > >> >> > It looks like the version hasn't been updated on master and s

Re: Probable bug in async commit of Kafka offset in DirectKafkaInputDStream

2022-03-29 Thread Dongjoon Hyun
Hi, Souvik Could you file a JIRA issue for that? Thanks, Dongjoon On Thu, Mar 24, 2022 at 11:08 AM Paul, Souvik wrote: > Hi Dev, > > I added a few debug statements at the following lines and found few issues. > > 1. At line 254 of override def compute(validTime: Time): > Option[KafkaRDD[K, V]]

Re: Probable bug in async commit of Kafka offset in DirectKafkaInputDStream

2022-04-08 Thread Dongjoon Hyun
Thank you, Souvik. Dongjoon. On Thu, Apr 7, 2022 at 10:59 AM Paul, Souvik wrote: > Hi Dongjoon, > > > > Raised the JIRA at https://issues.apache.org/jira/browse/SPARK-38824 > > > > Thanks, > > Souvik > > > > *From:* Dongjoon Hyun > *Sent:

Re: Spark client for Hadoop 2.x

2022-04-10 Thread Dongjoon Hyun
Hi, Amin. In general, the Apache Spark community has received a lot of feedback and has been moving toward the following: - Use the latest Hadoop versions for more bug fixes including CVEs. - Use Hadoop's shaded clients to minimize dependency issues. Since the above is not achievable with Hadoop 2 clients, I be
