Re: PR builder not working now

2022-04-12 Thread Dongjoon Hyun
Thank you for sharing that information! Bests Dongjoon. On Mon, Apr 11, 2022 at 10:29 PM Hyukjin Kwon wrote: > Hi all, > > There is a bug in GitHub Actions' RESTful API (see > https://github.com/HyukjinKwon/spark/actions?query=branch%3Adebug-ga-detection > as an example). > So, currently OSS P

Re: PR builder not working now

2022-04-19 Thread Dongjoon Hyun
It's great! Thank you. :) On Tue, Apr 19, 2022 at 4:42 PM Hyukjin Kwon wrote: > It's fixed now. > > On Tue, 19 Apr 2022 at 08:33, Hyukjin Kwon wrote: > >> It's still persistent. I will send an email to GitHub support today >> >> On Wed,

Re: [VOTE] Release Spark 3.3.0 (RC1)

2022-05-06 Thread Dongjoon Hyun
Hi, Sean. It's interesting. I didn't see those failures from my side. Hi, Maxim. In the following link, there are 17 in-progress and 6 to-do JIRA issues which look irrelevant to this RC1 vote. https://issues.apache.org/jira/projects/SPARK/versions/12350369 Since RC1 is started, could you move th

Re: [VOTE] Release Spark 3.3.0 (RC1)

2022-05-11 Thread Dongjoon Hyun
>>>>> compilation error >>>>>> <https://github.com/apache/spark/commit/fd998c8a6783c0c8aceed8dcde4017cd479e42c8> >>>>>> >>>>>> So -1 from me. We should have RC2 to include the fix. >>>>>> >>>>>>

Re: SIGMOD System Award for Apache Spark

2022-05-13 Thread Dongjoon Hyun
Ya, it's really great! Congratulations to the whole community! Dongjoon. On Fri, May 13, 2022 at 8:12 AM Chao Sun wrote: > Huge congrats to the whole community! > > On Fri, May 13, 2022 at 1:56 AM Wenchen Fan wrote: > >> Great! Congratulations to everyone! >> >> On Fri, May 13, 2022 at 10:38

Re: Introducing "Pandas API on Spark" component in JIRA, and use "PS" PR title component

2022-05-18 Thread Dongjoon Hyun
+1 Thank you for the suggestion, Hyukjin. Dongjoon. On Wed, May 18, 2022 at 11:08 AM Bjørn Jørgensen wrote: > +1 > But can will have PR Title and PR label the same, PS > > On Wed, May 18, 2022 at 18:57, Xinrong Meng > wrote: > >> Great! >> >> It saves us from always specifying "Pandas API on Sp

Re: [VOTE] Release Spark 3.3.0 (RC2)

2022-05-20 Thread Dongjoon Hyun
Thank you, Maxim! Dongjoon. On Thu, May 19, 2022 at 11:49 PM Maxim Gekk wrote: > Hi All, > > The voting for Spark 3.3.0 RC2 has failed since there aren't enough +1 and > due to reported bugs. I will prepare RC3 at the beginning of next week. > > All known issues have been resolved in 3.3 alread

Re: The draft of the Spark 3.3.0 release notes

2022-06-03 Thread Dongjoon Hyun
You are right. After SPARK-36837, we tried to ship Apache Spark 3.3.0 with Apache Kafka 3.1.1 via the following PR. https://github.com/apache/spark/pull/36135 [WIP][SPARK-38850][BUILD] Upgrade Kafka to 3.1.1 However, the final decision was to revert it from `branch-3.3` and move directly to Apac

Re: [VOTE] Release Spark 3.3.0 (RC5)

2022-06-06 Thread Dongjoon Hyun
+1. I double-checked the following additionally. - Run unit tests on Apple Silicon with Java 17/Python 3.9.11/R 4.1.2 - Run unit tests on Linux with Java11/Scala 2.12/2.13 - K8s integration test (including Volcano batch scheduler) on K8s v1.24 - Check S3 read/write with spark-shell with Scala 2.1

Re: [VOTE] Release Spark 3.3.0 (RC6)

2022-06-13 Thread Dongjoon Hyun
+1 Thanks, Dongjoon. On Mon, Jun 13, 2022 at 3:54 PM Chris Nauroth wrote: > +1 (non-binding) > > I repeated all checks I described for RC5: > > https://lists.apache.org/thread/ksoxmozgz7q728mnxl6c2z7ncmo87vls > > Maxim, thank you for your dedication on these release candidates. > > Chris Naurot

Re: [VOTE][RESULT] Release Spark 3.3.0 (RC6)

2022-06-14 Thread Dongjoon Hyun
ng > Sean Owen (*) > Tom Graves (*) > Mridul Muralidharan (*) > Chris Nauroth > Dongjoon Hyun (*) > Yuming Wang > Holden Karau (*) > L. C. Hsieh (*) > Cheng Su > Chao Sun > Martin Grigorov > Peter Toth > Max Gekk > > 0: None > > -1: None > > Maxim Gekk > > Software Engineer > > Databricks, Inc. >

Re: Re: [VOTE][SPIP] Spark Connect

2022-06-15 Thread Dongjoon Hyun
+1 On Wed, Jun 15, 2022 at 9:22 AM Xiao Li wrote: > +1 > > Xiao > > On Tue, Jun 14, 2022 at 03:35, beliefer wrote: > >> +1 >> Yeah, I tried to use Apache Livy so that we can run interactive queries. >> But the Spark Driver in Livy looks heavy. >> >> The SPIP may resolve the issue. >> >> >> >> At 2022-06-14 18

Re: [SPARK-39515] Improve scheduled jobs in GitHub Actions

2022-06-22 Thread Dongjoon Hyun
h-3.3, JDK 11, >>JDK 17 and Scala 2.13), see https://github.com/apache/spark/actions >>cc @Tom Graves @Dongjoon Hyun >> FYI >>- >> >>except one test that is being failed due to OOM. That’s being fixed >>at https://github.com/apache/s

Re: [SPARK-39515] Improve scheduled jobs in GitHub Actions

2022-06-23 Thread Dongjoon Hyun
another time slot that works for more ppl. >> >> On Thu, 23 Jun 2022 at 00:19, Dongjoon Hyun >> wrote: >> >>> Thank you, Hyukjin! :) >>> >>> BTW, unfortunately, it seems that I cannot join that quick meeting. >>> I have another schedule at

Apache Spark 3.2.2 Release?

2022-07-06 Thread Dongjoon Hyun
Hi, All. Since Apache Spark 3.2.1 tag creation (Jan 19), 197 new patches, including 11 correctness patches, have arrived at branch-3.2. Shall we make a new release, Apache Spark 3.2.2, as the third release in the 3.2 line? I'd like to volunteer as the release manager for Apache Spark 3.2.2. I'm thinking abo

Re: Apache Spark 3.2.2 Release?

2022-07-07 Thread Dongjoon Hyun
Thank you all. I'll check and prepare RC1 for next week. Dongjoon.

Re: [DISCUSS][Catalog API] Deprecate 4 Catalog API that takes two parameters which are (dbName, tableName/functionName)

2022-07-07 Thread Dongjoon Hyun
Thank you for starting the official discussion, Rui. 'Unneeded API' doesn't sound like a good frame for this discussion because it ignores the existing users and codes completely. Technically, the above mentioned reasons look irrelevant to any specific existing bugs or future maintenance cost savi

Re: Apache Spark 3.2.2 Release?

2022-07-08 Thread Dongjoon Hyun
te columns > - https://issues.apache.org/jira/browse/SPARK-38787 : Possible correctness > issue on stream-stream join when handling edge case > > > On Thu, Jul 7, 2022 at 6:12 PM Dongjoon Hyun wrote: >> >> Thank you

[VOTE] Release Spark 3.2.2 (RC1)

2022-07-11 Thread Dongjoon Hyun
Please vote on releasing the following candidate as Apache Spark version 3.2.2. The vote is open until July 15th 1AM (PST) and passes if a majority +1 PMC votes are cast, with a minimum of 3 +1 votes. [ ] +1 Release this package as Apache Spark 3.2.2 [ ] -1 Do not release this package because ...
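The passing rule quoted in this vote mail can be sketched in a few lines of Python. This is one reading of the rule, not an official tally script: the function name `vote_passes` and its parameters are hypothetical, only binding (PMC) votes are counted, and "majority" is assumed to mean more binding +1s than binding -1s.

```python
def vote_passes(binding_plus_ones, binding_minus_ones):
    """Return True if a release vote passes under the rule quoted above.

    Assumption: only binding (PMC) votes count, "majority" means more
    binding +1s than binding -1s, and at least 3 binding +1s are required.
    """
    has_minimum = binding_plus_ones >= 3
    has_majority = binding_plus_ones > binding_minus_ones
    return has_minimum and has_majority


# Example tally: 8 binding +1s, with no binding -1s assumed.
print(vote_passes(8, 0))
# Two binding +1s are not enough, even with no -1s.
print(vote_passes(2, 0))
```

Non-binding +1s are welcome on the list but do not enter this check, which is why result mails report the binding count separately.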

Re: [VOTE] Release Spark 3.2.2 (RC1)

2022-07-12 Thread Dongjoon Hyun
Dongjoon. On Mon, Jul 11, 2022 at 10:30 PM Yang,Jie(INF) wrote: > Does this happen when running all UTs? I ran this suite several times > alone using OpenJDK(zulu) 8u322-b06 on my Mac, but no similar error > occurred > > > > *From**: *Sean Owen > *Date**: *July 12, 2022

Re: [VOTE] Release Spark 3.2.2 (RC1)

2022-07-12 Thread Dongjoon Hyun
> Does this happen when running all UTs? I ran this suite several times >> alone using OpenJDK(zulu) 8u322-b06 on my Mac, but no similar error >> occurred >> >> >> >> *From**: *Sean Owen >> *Date**: *Tuesday, July 12, 2022 10:45 >> *To**: *Dongjoon Hyun >

Re: [VOTE] Release Spark 3.2.2 (RC1)

2022-07-14 Thread Dongjoon Hyun
The update to the API doc was made since v3.2.1. > > As far as I can tell, 3.2.2 doesn't support TimestampNTZType. > > > On Mon, Jul 11, 2022 at 2:58 PM Dongjoon Hyun > wrote: > >> Please vote on releasing the following candidate as Apache Spark version >> 3.2

Re: [VOTE] Release Spark 3.2.2 (RC1)

2022-07-15 Thread Dongjoon Hyun
believe I can remove those four places after uploading the docs > >> to > >>> our website. > >>> > >>> Dongjoon. > >>> > >>> On Thu, Jul 14, 2022 at 2:16 PM Bruce Robbins > >> > >>> wrote: > >>> >

[VOTE][RESULT] Release Spark 3.2.2 (RC1)

2022-07-15 Thread Dongjoon Hyun
The vote passes with 12 +1s (8 binding +1s). Thanks to all who helped with the release! (* = binding) +1: - Hyukjin Kwon * - Liang-Chi Hsieh * - Cheng Su - Dongjoon Hyun * - Yang Jie - Maxim Gekk * - Mridul Muralidharan * - Yikun Jiang - Wenchen Fan * - Chao Sun - Gengliang Wang * - Kousuke

[ANNOUNCE] Apache Spark 3.2.2 released

2022-07-17 Thread Dongjoon Hyun
. Dongjoon Hyun

Re: Update Spark 3.4 Release Window?

2022-07-20 Thread Dongjoon Hyun
Thank you for initiating this discussion, Xinrong. I also agree with Sean. +1 for February 2023 (Release Candidate) and January 2021 (Code freeze). Dongjoon. On Wed, Jul 20, 2022 at 1:42 PM Sean Owen wrote: > > I don't know any better than others when it will actually happen, though > historic

Re: Update Spark 3.4 Release Window?

2022-07-20 Thread Dongjoon Hyun
I fixed typos :) +1 for February 2023 (Release Candidate) and January 2023 (Code freeze). On 2022/07/20 20:59:30 Dongjoon Hyun wrote: > Thank you for initiating this discussion, Xinrong. I also agree with Sean. > > +1 for February 2023 (Release Candidate) and January 2021 (Co

Re: Setting spark.kubernetes.driver.connectionTimeout, spark.kubernetes.submission.connectionTimeout to default spark.network.timeout

2022-08-01 Thread Dongjoon Hyun
Hi, Pralabh. Could you elaborate on your situation more? I'm interested in your needs. Currently, the default value of spark.network.timeout, 120s, is much larger than the default value of spark.kubernetes.driver.connectionTimeout, 10s. It would be a breaking change if we increase `spark.kuberne
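To make the gap between the two defaults concrete, here is a minimal Python sketch of the fallback that the thread title asks about. It uses only the two default values quoted in the message above. The helper names (`to_seconds`, `effective_timeout`) and the fallback logic are hypothetical illustrations, not how Spark itself resolves configuration.

```python
import re

# Defaults as quoted in the message above (illustrative, not read from Spark).
DEFAULTS = {
    "spark.network.timeout": "120s",
    "spark.kubernetes.driver.connectionTimeout": "10s",
}

# Seconds per supported unit suffix (a deliberately small, assumed subset).
_UNIT_SECONDS = {"ms": 0.001, "s": 1.0, "min": 60.0, "h": 3600.0}


def to_seconds(value):
    """Convert a duration string such as '120s' or '500ms' to seconds."""
    match = re.fullmatch(r"(\d+)(ms|s|min|h)", value.strip())
    if match is None:
        raise ValueError(f"unsupported duration: {value!r}")
    return int(match.group(1)) * _UNIT_SECONDS[match.group(2)]


def effective_timeout(user_conf, key, fall_back_to_network_timeout):
    """Resolve `key` to seconds.

    With fall_back_to_network_timeout=False, an unset key uses its own default.
    With True (the change asked about in the thread title), an unset key
    inherits spark.network.timeout instead.
    """
    if key in user_conf:
        return to_seconds(user_conf[key])
    if fall_back_to_network_timeout:
        network = user_conf.get(
            "spark.network.timeout", DEFAULTS["spark.network.timeout"]
        )
        return to_seconds(network)
    return to_seconds(DEFAULTS[key])


key = "spark.kubernetes.driver.connectionTimeout"
current = effective_timeout({}, key, fall_back_to_network_timeout=False)
proposed = effective_timeout({}, key, fall_back_to_network_timeout=True)
print(current, proposed)
```

Under these assumptions, a job that never sets the Kubernetes driver timeout would see it grow from 10 to 120 seconds, which is the kind of silent behavior change the reply describes as breaking.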

Re: Welcome Xinrong Meng as a Spark committer

2022-08-09 Thread Dongjoon Hyun
Congrat! :) Dongjoon. On Tue, Aug 9, 2022 at 10:40 AM Takuya UESHIN wrote: > > Congratulations, Xinrong! > > On Tue, Aug 9, 2022 at 10:07 AM Gengliang Wang wrote: >> >> Congratulations, Xinrong! Well deserved. >> >> >> On Tue, Aug 9, 2022 at 7:09 AM Yi Wu wrote: >>> >>> Congrats Xinrong!! >>>

Re: Welcoming three new PMC members

2022-08-09 Thread Dongjoon Hyun
Congrat to all! Dongjoon. On Tue, Aug 9, 2022 at 5:13 PM Takuya UESHIN wrote: > > Congratulations! > > On Tue, Aug 9, 2022 at 4:57 PM Hyukjin Kwon wrote: >> >> Congrats everybody! >> >> On Wed, 10 Aug 2022 at 05:50, Mridul Muralidharan wrote: >>> >>> >>> Congratulations ! >>> Great to have you

Re: Time for Spark 3.3.1 release?

2022-09-12 Thread Dongjoon Hyun
+1 Thanks, Dongjoon. On Mon, Sep 12, 2022 at 6:38 AM Yuming Wang wrote: > Hi, All. > > > > Since Apache Spark 3.3.0 tag creation (Jun 10), new 138 patches including > 7 correctness patches arrived at branch-3.3. > > > > Shall we make a new release, Apache Spark 3.3.1, as the second release at >

Re: Time for Spark 3.3.1 release?

2022-09-14 Thread Dongjoon Hyun
gt;>> >>>> >>>> >>>> Thanks Yuming ~ >>>> >>>> >>>> >>>> *发件人**: *Hyukjin Kwon >>>> *日期**: *2022年9月13日 星期二 08:19 >>>> *收件人**: *Gengliang Wang >>>> *抄送**: *"L. C. Hsieh

Re: Time for Spark 3.3.1 release?

2022-09-14 Thread Dongjoon Hyun
also off old Hadoop versions. > You can of course build the combo you like. > > On Wed, Sep 14, 2022 at 11:26 AM Denis Bolshakov < > bolshakov.de...@gmail.com> wrote: > >> Unfortunately it's for hadoop 3 only. >> >> On Wed, Sep 14, 2022 at 19:04, Dongjoon

Re: Time for Spark 3.3.1 release?

2022-09-14 Thread Dongjoon Hyun
Jørgensen wrote: > At least we should upgrade hadoop to the latest version > https://hadoop.apache.org/release/2.10.2.html > > Are there some special reasons why we have a hadoop version that is 7 > years old? > > On Wed, Sep 14, 2022 at 20:25, Dongjoon Hyun wrote: > >> Ya,

Re: Creating a new component "Connect" in JIRA

2022-09-16 Thread Dongjoon Hyun
Thank you for sharing that information. +1 for the proposed way. Dongjoon. On Fri, Sep 16, 2022 at 5:07 AM Hyukjin Kwon wrote: > Hi all, > > I created a new component called "Connect" temporarily for the Spark > Connect project, > see https://issues.apache.org/jira/browse/SPARK-39375 because a

Re: [VOTE] Release Spark 3.3.1 (RC1)

2022-09-18 Thread Dongjoon Hyun
I also agree with Chao on that issue. SPARK-39833 landed at 3.3.1 and 3.2.3 to avoid a correctness issue at the cost of perf regression. Luckily, SPARK-40169 provided a correct fix and removed the main workaround code of SPARK-39833 before the official release. -1 for Apache Spark 3.3.1 RC1. Don

Re: [VOTE] SPIP: Support Docker Official Image for Spark

2022-09-22 Thread Dongjoon Hyun
+1 On Wed, Sep 21, 2022 at 11:02 PM Denny Lee wrote: > +1 (non-binding) > > On Wed, Sep 21, 2022 at 10:33 PM Ankit Gupta > wrote: > >> +1 >> >> Regards, >> >> Ankit Prakash Gupta >> >> On Thu, Sep 22, 2022 at 10:38 AM Yang,Jie(INF) >> wrote: >> >>> +1 (non-binding) >>> >>> >>> >>> Regards, >>>

Re: [VOTE] Release Spark 3.3.1 (RC2)

2022-10-03 Thread Dongjoon Hyun
Sorry, but -1 due to the undocumented breaking query result change. Apache Spark 3.2.0, 3.2.1, 3.2.2, 3.3.0 has the following result for `grouping_id()` and `grouping__id`. scala> sql("SELECT count(*), grouping__id from (VALUES (1,1,1),(2,2,2)) AS t(k1,k2,v) GROUP BY k1 GROUPING SETS (k2) ").show

Dropping Apache Spark Hadoop2 Binary Distribution?

2022-10-03 Thread Dongjoon Hyun
Hi, All. I'm wondering if the following Apache Spark Hadoop2 Binary Distribution is still used by someone in the community or not. If it's not used or not useful, we may remove it from Apache Spark 3.4.0 release. https://downloads.apache.org/spark/spark-3.3.0/spark-3.3.0-bin-hadoop2.tgz Here is

Re: Dropping Apache Spark Hadoop2 Binary Distribution?

2022-10-04 Thread Dongjoon Hyun
er APIs to work > with the data. > > Finally note that while that scatter/gather read call will only be on > 3.3.5 we are doing a shim lib to offer the API to apps on older builds > -it'll use readFully() to do the reads, just as the default implementation > on all filesystems does

Re: Dropping Apache Spark Hadoop2 Binary Distribution?

2022-10-04 Thread Dongjoon Hyun
changes. I opened a PR to make this thread visible in Apache Spark 3.4.0. SPARK-40651 Drop Hadoop2 binary distribution from release process https://github.com/apache/spark/pull/38099 Dongjoon. On 2022/10/04 19:32:52 Dongjoon Hyun wrote: > Yes, it's yours. I added you (Steve Lough

Re: Dropping Apache Spark Hadoop2 Binary Distribution?

2022-10-05 Thread Dongjoon Hyun
>>>> Xiao > >>>> > >>>> On Wed, Oct 5, 2022 at 12:49 PM Sean Owen wrote: > >>>>> > >>>>> I'm OK with this. It simplifies maintenance a bit, and specifically > may allow us to finally move off of the ancient version of

Re: [VOTE] Release Spark 3.3.1 (RC2)

2022-10-11 Thread Dongjoon Hyun
Yes, that's the current status. FYI, 3.3.1-rc3 tag was created 6 days ago but the vote was not started because we are waiting for https://issues.apache.org/jira/browse/SPARK-40703 Chao Sun pinged the release manager 4 days ago and has been working on it. Now, his PR is ready for 3.3.1 release he

Re: [VOTE] Release Spark 3.3.1 (RC2)

2022-10-11 Thread Dongjoon Hyun
e waiting for something. Will the > v3.3.1-rc3 tag be moved once SPARK-40703 is out? (Is that even possible?) > Or will you just cut rc4 eventually and never vote on rc3? > > On Tue, Oct 11, 2022 at 1:14 PM Dongjoon Hyun > wrote: > >> Yes, that's the current status.

Re: [DISCUSS] Flip the default value of Kafka offset fetching config (spark.sql.streaming.kafka.useDeprecatedOffsetFetching)

2022-10-14 Thread Dongjoon Hyun
+1 I agree with Jungtaek and Gabor about switching the default value of configurations with the migration guide. Dongjoon On Thu, Oct 13, 2022 at 12:46 AM Gabor Somogyi wrote: > Hi Jungtaek, > > Good to hear that the new approach is working fine. +1 from my side. > > BR, > G > > > On Thu, Oct

Re: [VOTE] Release Spark 3.3.1 (RC4)

2022-10-18 Thread Dongjoon Hyun
+1 Thank you, Yuming and all! Dongjoon. On Tue, Oct 18, 2022 at 9:22 AM Yang,Jie(INF) wrote: > Use maven to test Java 17 + Scala 2.13 and test passed, +1 for me > > > > *From**: *Sean Owen > *Date**: *Monday, October 17, 2022 21:34 > *To**: *Yuming Wang > *Cc**: *dev > *Subject**: *Re: [VOTE] Release Spar

Re: Apache Spark 3.2.3 Release?

2022-10-18 Thread Dongjoon Hyun
+1 Thank you for volunteering, Chao! Dongjoon. On Tue, Oct 18, 2022 at 9:55 AM Sean Owen wrote: > OK by me, if someone is willing to drive it. > > On Tue, Oct 18, 2022 at 11:47 AM Chao Sun wrote: > >> Hi All, >> >> It's been more than 3 months since 3.2.2 (tagged at Jul 11) was >> released T

Re: [VOTE] Release Spark 3.3.1 (RC4)

2022-10-21 Thread Dongjoon Hyun
Could you provide your environment and test profile? Both community CIs look fine to me. GitHub Action: https://github.com/apache/spark/actions?query=branch%3Abranch-3.3 Apple Silicon Jenkins Farm: https://apache-spark.s3.fr-par.scw.cloud/BRANCH-3.3.html Dongjoon. On Fri, Oct 21, 2022 at 8:48 A

Re: 3.3.1 Release

2022-10-25 Thread Dongjoon Hyun
ial release? Thanks! > > > > *[VOTE][RESULT] Release Spark 3.3.1 (RC4)* > > The vote passes with 11 +1s (6 binding +1s). > > Thanks to all who helped with the release! > > > > (* = binding) > > +1: > > - Sean Owen (*) > > - Yang,Jie

Re: [ANNOUNCE] Apache Spark 3.3.1 released

2022-10-25 Thread Dongjoon Hyun
It's great. Thank you so much, Yuming! Dongjoon On Tue, Oct 25, 2022 at 11:23 PM Yuming Wang wrote: > We are happy to announce the availability of Apache Spark 3.3.1! > > Spark 3.3.1 is a maintenance release containing stability fixes. This > release is based on the branch-3.3 maintenance branc

Re: Spark Context Shutodown

2022-10-27 Thread Dongjoon Hyun
Hi, Shrikant. It seems that you are using non-GA features. FYI, since Apache Spark 3.1.1, Kubernetes Support became GA in the community. https://spark.apache.org/releases/spark-release-3-1-1.html In addition, Apache Spark 3.1 reached EOL last month. Could you try the latest distribution li

Re: Spark Context Shutodown

2022-10-29 Thread Dongjoon Hyun
Shrikant Prasad wrote: > Thanks Dongjoon for replying. I have tried with Spark 3.2 and still facing > the same issue. > > Looking for some pointers which can help in debugging to find the > root cause. > > Regards, > Shrikant > > On Thu, 27 Oct 2022 at 10:36 PM, D

Re: Upgrade guava to 31.1-jre and remove hadoop2

2022-11-06 Thread Dongjoon Hyun
For dropping the `hadoop-2` profile, we need to discuss it further after releasing Apache Spark 3.4 and monitoring the adoption of 3.4. For now, we have no plan of dropping the `hadoop-2` profile yet because it's known to be used in the community against the custom Hadoop 2 distributions (in

Re: ASF board report draft for November

2022-11-07 Thread Dongjoon Hyun
Shall we mention Spark 3.2.3 release preparation since Chao is currently actively working on it? Dongjoon. On Mon, Nov 7, 2022 at 11:53 AM Matei Zaharia wrote: > It’s time to send our quarterly report to the ASF board on Wednesday. Here > is a draft, let me know if you have suggestions: > > ===

Re: [VOTE] Release Spark 3.2.3 (RC1)

2022-11-14 Thread Dongjoon Hyun
+1 Thank you, Chao. On Mon, Nov 14, 2022 at 4:12 PM Chao Sun wrote: > Please vote on releasing the following candidate as Apache Spark version > 3.2.3. > > The vote is open until 11:59pm Pacific time Nov 17th and passes if a > majority +1 PMC votes are cast, with a minimum of 3 +1 votes. > > [

Re: [VOTE] Release Spark 3.2.3 (RC1)

2022-11-15 Thread Dongjoon Hyun
3$adapted(Runner.scala:1316) > >at > org.scalatest.tools.Runner$$$Lambda$7245/0x00080193e840.apply(Unknown > Source) > >at scala.collection.immutable.List.foreach(List.scala:333) > >at > org.scalatest.tools.Runner$.doRunRunRunDaDoRunRun(Runner.sc

Re: [ANNOUNCE] Apache Spark 3.2.3 released

2022-11-30 Thread Dongjoon Hyun
Thank you, Chao! On Wed, Nov 30, 2022 at 8:16 AM Yang,Jie(INF) wrote: > Thanks, Chao! > > > > *From**: *Maxim Gekk > *Date**: *Wednesday, November 30, 2022 19:40 > *To**: *Jungtaek Lim > *Cc**: *Wenchen Fan , Chao Sun , > dev , user > *Subject**: *Re: [ANNOUNCE] Apache Spark 3.2.3 released > > > > Thank you, Cha

Re: [VOTE][SPIP] Asynchronous Offset Management in Structured Streaming

2022-12-02 Thread Dongjoon Hyun
+1 Dongjoon. On 2022/12/01 13:17:00 Wenchen Fan wrote: > +1 > > On Thu, Dec 1, 2022 at 12:31 PM Shixiong Zhu wrote: > > > +1 > > > > > > On Wed, Nov 30, 2022 at 8:04 PM Hyukjin Kwon wrote: > > > >> +1 > >> > >> On Thu, 1 Dec 2022 at 12:39, Mridul Muralidharan > >> wrote: > >> > >>> > >>> +1

Re: Time for Spark 3.4.0 release?

2023-01-03 Thread Dongjoon Hyun
+1 Thank you! Dongjoon On Tue, Jan 3, 2023 at 9:44 PM Rui Wang wrote: > +1 to cut the branch starting from a workday! > > Great to see this is happening! > > Thanks Xinrong! > > -Rui > > On Tue, Jan 3, 2023 at 9:21 PM 416161...@qq.com > wrote: > >> +1, thank you Xinrong for driving this relea

Re: [DISCUSS] Deprecate DStream in 3.4

2023-01-12 Thread Dongjoon Hyun
+1 for the proposal (guiding only without any code change). Thanks, Dongjoon. On Thu, Jan 12, 2023 at 9:33 PM Shixiong Zhu wrote: > +1 > > > On Thu, Jan 12, 2023 at 5:08 PM Tathagata Das > wrote: > >> +1 >> >> On Thu, Jan 12, 2023 at 7:46 PM Hyukjin Kwon wrote: >> >>> +1 >>> >>> On Fri, 13 Ja

Re: [DISCUSS] Deprecate DStream in 3.4

2023-01-12 Thread Dongjoon Hyun
ating > this proposal. > > Sorry to make confusion. I just wanted to make sure the goal of the > proposal is not "removing" the API. The discussion on the removal of API > doesn't tend to go well, so I wanted to make sure I don't mean that. > > On Fri, Jan 13, 2023

Re: SparkR build with AppVeyor, broken by external reason

2023-01-16 Thread Dongjoon Hyun
Thank you for checking and sharing, Hyukjin. :) Dongjoon. On Mon, Jan 16, 2023 at 4:37 PM Hyukjin Kwon wrote: > Hi all, > > AppVeyor is currently broken assuming the flaky Github authorization issue > ( > https://help.appveyor.com/discussions/problems/11287-the-build-phase-is-set-to-msbuild-mod

Re: Time for Spark 3.4.0 release?

2023-01-24 Thread Dongjoon Hyun
wrote: >>>>> >>>>>> Thanks Xinrong. >>>>>> >>>>>> On Sat, Jan 7, 2023 at 9:18 AM Xinrong Meng >>>>>> wrote: >>>>>> >>>>>>> The release window for Apache Spark 3.4.0 is updated per

Re: Time for release v3.3.2

2023-01-30 Thread Dongjoon Hyun
+1 Thank you so much, Liang-Chi. 3.3.2 release will help 3.4.0 release too because they share many bug fixes. Dongjoon On Mon, Jan 30, 2023 at 5:56 PM Hyukjin Kwon wrote: > +100! > > On Tue, 31 Jan 2023 at 10:54, Chao Sun wrote: > >> +1, thanks Liang-Chi for volunteering! >> >> Chao >> >> On

Re: [DISCUSS] SPIP: Lazy Materialization for Parquet Read Performance Improvement

2023-02-01 Thread Dongjoon Hyun
+1 On Wed, Feb 1, 2023 at 12:52 AM Mich Talebzadeh wrote: > +1 > > > >view my Linkedin profile > > > > https://en.everybodywiki.com/Mich_Talebzadeh > > > > *Disclaimer:* Use it at your own risk. Any and all responsibility for any >

Re: ASF board report draft for Feb 2023

2023-02-06 Thread Dongjoon Hyun
Thank you, Matei. Could you include the following additionally? 1. Liang-Chi is preparing v3.3.2 (This month). https://lists.apache.org/thread/nwzr3o2cxyyf6sbb37b8yylgcvmbtp16 2. Since Spark 3.4.0, we attached SBOM to Apache Spark Maven artifacts [SPARK-41893] in line with other ASF projects.

Re: [VOTE] Release Spark 3.3.2 (RC1)

2023-02-11 Thread Dongjoon Hyun
+1 I also verified additional internal tests. Dongjoon. On Sat, Feb 11, 2023 at 11:17 AM Mridul Muralidharan wrote: > > Looks like it was an issue with wget not fetching all the artifacts, my > bad ! > > Looks good to me, +1 for release - thanks ! > > > Regards, > Mridul > > > On Sat, Feb 11,

Re: [VOTE] Release Spark 3.3.2 (RC1)

2023-02-13 Thread Dongjoon Hyun
Hi, All. As the author of that `Improvement` patch, I strongly disagree with giving the wrong idea that Python 3.11 is officially supported in Spark 3.3. I only developed and delivered it for Apache Spark 3.4.0 specifically as `Improvement`. We may want to backport it to branch-3.3 but it's also a

Re: [VOTE][SPIP] Lazy Materialization for Parquet Read Performance Improvement

2023-02-13 Thread Dongjoon Hyun
+1 Dongjoon On 2023/02/13 22:52:59 "L. C. Hsieh" wrote: > Hi all, > > I'd like to start the vote for SPIP: Lazy Materialization for Parquet > Read Performance Improvement. > > The high summary of the SPIP is that it proposes an improvement to the > Parquet reader with lazy materialization which

Re: [DISCUSS] Make release cadence predictable

2023-02-14 Thread Dongjoon Hyun
+1 for Hyukjin and Sean's opinion. Thank you for initiating this discussion. If we have a fixed-predefined regular 6-month, I believe we can persuade the incomplete features to wait for next releases more easily. In addition, I want to add the first RC1 date requirement because RC1 always did a

Re: [VOTE][RESULT] Release Spark 3.3.2 (RC1)

2023-02-15 Thread Dongjoon Hyun
Great! Thank you, Liang-Chi! Dongjoon. On Wed, Feb 15, 2023 at 9:22 AM L. C. Hsieh wrote: > The vote passes with 12 +1s (4 binding +1s). > Thanks to all who helped with the release! > > (* = binding) > +1: > - Mridul Muralidharan (*) > - Dongjoon Hyun (*) > - Sean Ow

Re: [ANNOUNCE] Apache Spark 3.3.2 released

2023-02-17 Thread Dongjoon Hyun
Thank you, Liang-Chi! Dongjoon. On Fri, Feb 17, 2023 at 8:45 AM Chao Sun wrote: > Thanks Liang-Chi! > > On Fri, Feb 17, 2023 at 1:28 AM kazuyuki tanimura > wrote: > >> Great, Thank you Liang-Chi >> >> Kazu >> >> On Feb 17, 2023, at 1:02 AM, Wanqiang Ji wrote: >> >> Congratulations! >> >> On F

Re: SPIP: Shutting down spark structured streaming when the streaming process completed current process

2023-02-18 Thread Dongjoon Hyun
Thank you for considering me, but may I ask what makes you think to put me there, Mich? I'm curious about your reason. > I have put dongjoon.hyun as a shepherd. BTW, unfortunately, I cannot help you with that due to my on-going personal stuff. I'll adjust the JIRA first. Thanks, Dongjoon. On S

Re: [DISCUSS] Show Python code examples first in Spark documentation

2023-02-22 Thread Dongjoon Hyun
I have two questions to clarify the scope and boundaries. 1. Does this suggestion imply Python API implementation will be the new blocker in the future in terms of feature parity among languages? Until now, Python API feature parity was one of the audit items because it's not enforced. In other wo

Re: [DISCUSS] Show Python code examples first in Spark documentation

2023-02-23 Thread Dongjoon Hyun
ty to solve in order to claim that. As we say >>> at SPARK-41454, Python language also introduces breaking changes to us >>> historically and we have many `Pinned` python libraries issues. >>> >>> Yes. In fact, regardless of this change, I do believe we

Re: [Question] LimitedInputStream license issue in Spark source.

2023-02-28 Thread Dongjoon Hyun
Since both license headers are Apache License 2.0, we don't see any issue there. They are compatible. The first line of the second license header means the file was copied from Google Guava project originally. Apache Spark community keeps the original header because it has `Authorship` part, `Cop

Re: [Question] LimitedInputStream license issue in Spark source.

2023-02-28 Thread Dongjoon Hyun
May I ask why you think that way? Could you elaborate a little more about your concerns if you mean it from a legal perspective? > The ASF header states "Licensed to the Apache Software Foundation (ASF) under one or more contributor license agreements." > I'm not sure this is true with thi

Re: [Question] LimitedInputStream license issue in Spark source.

2023-03-02 Thread Dongjoon Hyun
Thank you. Here is the PR to fix that. https://github.com/apache/spark/pull/40249 [SPARK-42649][CORE] Remove the standard Apache License header from the top of third-party source files Dongjoon. On Wed, Mar 1, 2023 at 11:53 PM wrote: > Hi, > > See https://www.apache.org/legal/src-headers.html

Re: [VOTE] Release Apache Spark 3.4.0 (RC3)

2023-03-09 Thread Dongjoon Hyun
Yes, I also confirmed that the v3.4.0-rc3 tag is invalid. I guess we need RC4. Dongjoon. On Thu, Mar 9, 2023 at 7:13 AM Emil Ejbyfeldt wrote: > It might be caused by the v3.4.0-rc3 tag not being part of the 3.4 > branch, branch-3.4: > > $ git log --pretty='format:%d %h' --graph origin/branch

Re: [VOTE] Release Apache Spark 3.4.0 (RC3)

2023-03-09 Thread Dongjoon Hyun
new release > > On Thu, Mar 9, 2023, 1:36 PM Dongjoon Hyun > wrote: > >> Yes, I also confirmed that the v3.4.0-rc3 tag is invalid. >> >> I guess we need RC4. >> >> Dongjoon. >> >> On Thu, Mar 9, 2023 at 7:13 AM Emil Ejbyfeldt >> wrote: >

Re: Ammonite as REPL for Spark Connect

2023-03-23 Thread Dongjoon Hyun
I also support Herman's `SPARK-42884 Add Ammonite REPL integration` PR. Thanks, Dongjoon. On Thu, Mar 23, 2023 at 7:51 AM Mridul Muralidharan wrote: > > Sounds good, thanks for clarifying ! > > Regards, > Mridul > > On Thu, Mar 23, 2023 at 9:09 AM Herman van Hovell > wrote: > >> The goal of a

Re: Slack for PySpark users

2023-03-30 Thread Dongjoon Hyun
Hi, Xiao and all. (cc Matei) Please hold on the vote. There is a concern expressed by ASF board because recent Slack activities created an isolated silo outside of ASF mailing list archive. We need to establish a way to embrace it back to ASF archive before starting anything official. Bests, D

Re: Slack for PySpark users

2023-03-30 Thread Dongjoon Hyun
> open source communities. TBH, we are kind of late. I think we can do the > same in our community? > > We can follow the guide when the ASF has an official process for ASF > archiving. Since our PMC are the owner of the slack workspace, we can make > a change based on the policy. W

Re: [VOTE] Release Apache Spark 3.4.0 (RC5)

2023-04-03 Thread Dongjoon Hyun
+1 I also verified that RC5 has SBOM artifacts. https://repository.apache.org/content/repositories/orgapachespark-1439/org/apache/spark/spark-core_2.12/3.4.0/spark-core_2.12-3.4.0-cyclonedx.json https://repository.apache.org/content/repositories/orgapachespark-1439/org/apache/spark/spark-core_2.1

Re: Slack for PySpark users

2023-04-03 Thread Dongjoon Hyun
don't think that matters. However, I stand >>>>>> corrected >>>>>> - To be clear, I intentionally didn't refer to any specific mailing >>>>>> list because we didn't set up any rule here yet. >>>>>> fair enough

Re: Slack for PySpark users

2023-04-03 Thread Dongjoon Hyun
rom relying on this email's technical content is explicitly disclaimed. > The author will in no case be liable for any monetary damages arising from > such loss, damage or destruction. > > > > > On Mon, 3 Apr 2023 at 20:59, Dongjoon Hyun > wrote: > >> As Mich T

Re: Slack for PySpark users

2023-04-03 Thread Dongjoon Hyun
ASF) >>> have preference. They are going with the way they are convenient. >>> >>> Same applies here - if ASF Slack requires a restricted invitation >>> mechanism then it won't work. Looks like there is a link for an invitation, >>> but we are also

Re: Slack for PySpark users

2023-04-03 Thread Dongjoon Hyun
to keep this > active. > > > > On Mon, Apr 3, 2023 at 16:46 Dongjoon Hyun > wrote: > >> Shall we summarize the discussion so far? >> >> To sum up, "ASF Slack" vs "3rd-party Slack" was the real background to >> initiate this thread instead

Apache Spark 3.2.4 EOL Release?

2023-04-04 Thread Dongjoon Hyun
Hi, All. Since Apache Spark 3.2.0 passed RC7 vote on October 12, 2021, branch-3.2 has been maintained and served well until now. - https://github.com/apache/spark/releases/tag/v3.2.0 (tagged on Oct 6, 2021) - https://lists.apache.org/thread/jslhkh9sb5czvdsn7nz4t40xoyvznlc7 As of today, branch-3.

Re: Apache Spark 3.2.4 EOL Release?

2023-04-05 Thread Dongjoon Hyun
Thank you all. Dongjoon. On 2023/04/05 18:32:07 Gengliang Wang wrote: > +1 > > On Wed, Apr 5, 2023 at 11:27 AM kazuyuki tanimura > wrote: > > > +1 > > > > On Apr 5, 2023, at 6:53 AM, Tom Graves > > wrote: > > > > +1 > > > > Tom

Re: Slack for Spark Community: Merging various threads

2023-04-05 Thread Dongjoon Hyun
Thank you so much, Denny. Yes, let me comment on a few things. > - While there is an ASF Slack , it >requires an @apache.org email address 1. This sounds a little misleading because we can see `guest` accounts in the same link. People can be invited by "I

Re: Apache Spark 3.2.4 EOL Release?

2023-04-06 Thread Dongjoon Hyun
Thank you for reporting. I'll check that, too. Dongjoon. On Thu, Apr 6, 2023 at 1:13 AM yangjie01 wrote: > Hi, Dongjoon Hyun > > Maybe we need include the fix of SPARK-39696 in Apache Spark 3.2.4 EOL > Release, this will fix a data race issue in access to > TaskMetrics

Re: sbt build is broken because repo is not available

2023-04-07 Thread Dongjoon Hyun
Thank you for the pointer, Yuming. Dongjoon. On Fri, Apr 7, 2023 at 12:18 AM Yuming Wang wrote: > Hi all, > > sbt build is broken because repo is not available. Please see: > https://github.com/sbt/sbt/issues/7202. > >

Re: [VOTE] Release Apache Spark 3.4.0 (RC6)

2023-04-07 Thread Dongjoon Hyun
Hi, Xinrong. I saw the RC7 tag. Maybe, RC7 vote is blocked due to the on-going build outage? Dongjoon. On Thu, Apr 6, 2023 at 6:17 PM Xinrong Meng wrote: > Thank you! Let me recut the RC then. > > On Thu, Apr 6, 2023 at 6:14 PM Hyukjin Kwon wrote: > >> Merged the fix. >> >> On Fri, 7 Apr 202

Re: Slack for Spark Community: Merging various threads

2023-04-07 Thread Dongjoon Hyun
Thank you, All. I'm very satisfied with the focused and right questions for the real issues by removing irrelevant claims. :) Let me collect your relevant comments simply. # Category 1: Invitation Hurdle > The key question here is that do PMC members have the bandwidth of inviting everyone in

Re: [VOTE] Release Apache Spark 3.4.0 (RC6)

2023-04-07 Thread Dongjoon Hyun
Got it. Thank you for sharing the current status. Dongjoon. On Fri, Apr 7, 2023 at 9:21 AM Xinrong Meng wrote: > Hi Dongjoon, > > Yes, it is. To be more specific, we failed to build documentation for RC7 > because of the sbt build outage. > > Xinrong > > On Fri, Apr 7,

Re: [VOTE] Release Apache Spark 3.4.0 (RC6)

2023-04-07 Thread Dongjoon Hyun
Thank you! Dongjoon On Fri, Apr 7, 2023 at 2:16 PM Xinrong Meng wrote: > I am able to proceed with the release now. I'll send an announcement when > the RC cut is completed. > > Xinrong > > On Fri, Apr 7, 2023 at 9:54 AM Dongjoon Hyun > wrote: > >> Got it

Re: [VOTE] Release Apache Spark 3.4.0 (RC7)

2023-04-09 Thread Dongjoon Hyun
+1 I verified the same steps as for previous RCs. Dongjoon. On Sat, Apr 8, 2023 at 7:47 PM Mridul Muralidharan wrote: > > +1 > > Signatures, digests, etc check out fine. > Checked out tag and build/tested with -Phive -Pyarn -Pmesos -Pkubernetes > > Regards, > Mridul > > > On Sat, Apr 8, 2023 at

[VOTE] Release Apache Spark 3.2.4 (RC1)

2023-04-09 Thread Dongjoon Hyun
Please vote on releasing the following candidate as Apache Spark version 3.2.4. The vote is open until August 13th 1AM (PST) and passes if a majority +1 PMC votes are cast, with a minimum of 3 +1 votes. [ ] +1 Release this package as Apache Spark 3.2.4 [ ] -1 Do not release this package because .

Re: [VOTE] Release Apache Spark 3.2.4 (RC1)

2023-04-09 Thread Dongjoon Hyun
Oh, there is a typo in the mail. The following should be `April` instead of `August`. > August 13th 1AM (PST) Dongjoon. On 2023/04/09 23:38:00 Dongjoon Hyun wrote: > Please vote on releasing the following candidate as Apache Spark version > 3.2.4. > > The vote is open until
