Hi all,

I just cut branch-3.2 on GitHub and created version 3.3.0 on Jira. When merging PRs on the master branch before the 3.2.0 RC, please help cherry-pick the bug fixes and the ongoing major features mentioned in this thread to branch-3.2. Thanks!
On Fri, Jul 2, 2021 at 2:31 AM Dongjoon Hyun <[email protected]> wrote:

> Thank you, Gengliang!
>
> On Wed, Jun 30, 2021 at 10:56 PM Gengliang Wang <[email protected]> wrote:
>
>> Hi all,
>>
>> Just as a gentle reminder, I will do the branch cut tomorrow. Please
>> focus on finalizing the work to land in Spark 3.2.0.
>> After the branch cut, we can still merge the ongoing major features
>> mentioned in this thread. There should be no other new features in
>> branch-3.2.
>> Thanks!
>>
>> On Thu, Jun 17, 2021 at 2:57 PM Hyukjin Kwon <[email protected]> wrote:
>>
>>> *GA -> QA
>>>
>>> On Thu, 17 Jun 2021, 15:16 Hyukjin Kwon, <[email protected]> wrote:
>>>
>>>> I think we should make sure to treat the items in the list as
>>>> exceptions to the code freeze, while discouraging pushes of new APIs
>>>> and features.
>>>>
>>>> During the GA period, ideally we should focus on bug fixes and polishing.
>>>>
>>>> It would be great if we can speed up on the items in the list too.
>>>>
>>>> On Thu, 17 Jun 2021, 15:08 Gengliang Wang, <[email protected]> wrote:
>>>>
>>>>> Thanks for the suggestions from Dongjoon, Liang-Chi, Min, and Xiao!
>>>>> Now we have made it clear that it's a soft cut and we can still merge
>>>>> important code changes to branch-3.2 before the RC. Let's keep the
>>>>> branch cut date as July 1st.
>>>>>
>>>>> On Thu, Jun 17, 2021 at 1:41 PM Dongjoon Hyun <[email protected]> wrote:
>>>>>
>>>>>> > First, I think you are saying "branch-3.2";
>>>>>>
>>>>>> To Xiao: yes, it was a typo for "branch-3.2".
>>>>>>
>>>>>> > We do strongly prefer to cut the release for Spark 3.2.0 including
>>>>>> > all the patches under SPARK-30602.
>>>>>> > This way, we can backport the other performance/operability
>>>>>> > enhancements tickets under SPARK-33235 into branch-3.2 to be
>>>>>> > released in future Spark 3.2.x patch releases.
>>>>>>
>>>>>> To Min: after releasing 3.2.0, only bug fixes are allowed for 3.2.1+,
>>>>>> as Xiao wrote.
>>>>>>
>>>>>> On Wed, Jun 16, 2021 at 9:42 PM Xiao Li <[email protected]> wrote:
>>>>>>
>>>>>>> > To Liang-Chi, I'm -1 for postponing the branch cut because this is a
>>>>>>> > soft cut and the committers still are able to commit to `branch-3.3`
>>>>>>> > according to their decisions.
>>>>>>>
>>>>>>> First, I think you are saying "branch-3.2";
>>>>>>>
>>>>>>> Second, the "soft cut" means there is no "code freeze", even though we
>>>>>>> cut the branch. To avoid releasing half-baked and unready features, the
>>>>>>> release manager needs to be very careful when cutting the RC. Based on
>>>>>>> what is proposed here, the RC date is the actual code freeze date.
>>>>>>>
>>>>>>> > This way, we can backport the other performance/operability
>>>>>>> > enhancements tickets under SPARK-33235 into branch-3.2 to be released
>>>>>>> > in future Spark 3.2.x patch releases.
>>>>>>>
>>>>>>> This is not allowed based on the policy. Only bug fixes can be merged
>>>>>>> to the patch releases. Thus, if we know it will introduce a major
>>>>>>> performance regression, we have to turn the feature off by default.
>>>>>>>
>>>>>>> Xiao
>>>>>>>
>>>>>>> On Wed, Jun 16, 2021 at 3:22 PM Min Shen <[email protected]> wrote:
>>>>>>>
>>>>>>>> Hi Gengliang,
>>>>>>>>
>>>>>>>> Thanks for volunteering as the release manager for Spark 3.2.0.
>>>>>>>> Regarding the ongoing work of push-based shuffle in SPARK-30602, we
>>>>>>>> are close to having all the patches merged to master to enable
>>>>>>>> push-based shuffle.
>>>>>>>> Currently, there are 2 PRs under SPARK-30602 that are under active
>>>>>>>> review (SPARK-32922 and SPARK-35671), and hopefully can be merged soon.
>>>>>>>> We should be able to post the PRs for the other 2 remaining tickets
>>>>>>>> (SPARK-32923 and SPARK-35546) early next week.
>>>>>>>>
>>>>>>>> The tickets under SPARK-30602 are the minimum set of patches to
>>>>>>>> enable push-based shuffle.
>>>>>>>> We do have other performance/operability enhancements tickets under
>>>>>>>> SPARK-33235 that are needed to fully contribute what we have
>>>>>>>> internally for push-based shuffle.
>>>>>>>> However, these are optional for enabling push-based shuffle.
>>>>>>>> We do strongly prefer to cut the release for Spark 3.2.0 including
>>>>>>>> all the patches under SPARK-30602.
>>>>>>>> This way, we can backport the other performance/operability
>>>>>>>> enhancements tickets under SPARK-33235 into branch-3.2 to be released
>>>>>>>> in future Spark 3.2.x patch releases.
>>>>>>>> I understand the preference of not postponing the branch cut date.
>>>>>>>> We will check with Dongjoon regarding the soft cut date and the
>>>>>>>> flexibility for including the remaining tickets under SPARK-30602 into
>>>>>>>> branch-3.2.
>>>>>>>>
>>>>>>>> Best,
>>>>>>>> Min
>>>>>>>>
>>>>>>>> On Wed, Jun 16, 2021 at 1:20 PM Liang-Chi Hsieh <[email protected]> wrote:
>>>>>>>>
>>>>>>>>> Thanks, Dongjoon. I've talked with Dongjoon offline to learn more
>>>>>>>>> about this.
>>>>>>>>> As it is a soft cut date, there is no reason to postpone it.
>>>>>>>>>
>>>>>>>>> It sounds good, then, to keep the original branch cut date.
>>>>>>>>>
>>>>>>>>> Thank you.
>>>>>>>>>
>>>>>>>>> Dongjoon Hyun-2 wrote
>>>>>>>>> > Thank you for volunteering, Gengliang.
>>>>>>>>> >
>>>>>>>>> > Apache Spark 3.2.0 is the first version enabling AQE by default.
>>>>>>>>> > I'm also watching some on-going improvements on that.
>>>>>>>>> >
>>>>>>>>> > https://issues.apache.org/jira/browse/SPARK-33828 (SQL Adaptive
>>>>>>>>> > Query Execution QA)
>>>>>>>>> >
>>>>>>>>> > To Liang-Chi, I'm -1 for postponing the branch cut because this is
>>>>>>>>> > a soft cut and the committers still are able to commit to
>>>>>>>>> > `branch-3.3` according to their decisions.
>>>>>>>>> >
>>>>>>>>> > Given that Apache Spark had 115 commits in a week in various areas
>>>>>>>>> > concurrently, we should start QA for Apache Spark 3.2 by creating
>>>>>>>>> > branch-3.3 and allowing only limited backporting.
>>>>>>>>> >
>>>>>>>>> > https://github.com/apache/spark/graphs/commit-activity
>>>>>>>>> >
>>>>>>>>> > Bests,
>>>>>>>>> > Dongjoon.
>>>>>>>>> >
>>>>>>>>> > On Wed, Jun 16, 2021 at 9:19 AM Liang-Chi Hsieh <viirya@> wrote:
>>>>>>>>> >
>>>>>>>>> >> First, thanks for volunteering as the release manager of Spark
>>>>>>>>> >> 3.2.0, Gengliang!
>>>>>>>>> >>
>>>>>>>>> >> And yes, for the two important Structured Streaming features,
>>>>>>>>> >> RocksDB StateStore and session window, we're working on them and
>>>>>>>>> >> expect to have them in the new release.
>>>>>>>>> >>
>>>>>>>>> >> So I propose to postpone the branch cut date.
>>>>>>>>> >>
>>>>>>>>> >> Thank you!
>>>>>>>>> >>
>>>>>>>>> >> Liang-Chi
>>>>>>>>> >>
>>>>>>>>> >> Gengliang Wang-2 wrote
>>>>>>>>> >> > Thanks, Hyukjin.
>>>>>>>>> >> >
>>>>>>>>> >> > The expected target branch cut date of Spark 3.2 is *July 1st* on
>>>>>>>>> >> > https://spark.apache.org/versioning-policy.html.
>>>>>>>>> >> > However, I notice that there are still multiple important
>>>>>>>>> >> > projects in progress now:
>>>>>>>>> >> >
>>>>>>>>> >> > [Core]
>>>>>>>>> >> >
>>>>>>>>> >> > - SPIP: Support push-based shuffle to improve shuffle efficiency
>>>>>>>>> >> >   <https://issues.apache.org/jira/browse/SPARK-30602>
>>>>>>>>> >> >
>>>>>>>>> >> > [SQL]
>>>>>>>>> >> >
>>>>>>>>> >> > - Support ANSI SQL INTERVAL types
>>>>>>>>> >> >   <https://issues.apache.org/jira/browse/SPARK-27790>
>>>>>>>>> >> > - Support Timestamp without time zone data type
>>>>>>>>> >> >   <https://issues.apache.org/jira/browse/SPARK-35662>
>>>>>>>>> >> > - Aggregate (Min/Max/Count) push down for Parquet
>>>>>>>>> >> >   <https://issues.apache.org/jira/browse/SPARK-34952>
>>>>>>>>> >> >
>>>>>>>>> >> > [Streaming]
>>>>>>>>> >> >
>>>>>>>>> >> > - EventTime based sessionization (session window)
>>>>>>>>> >> >   <https://issues.apache.org/jira/browse/SPARK-10816>
>>>>>>>>> >> > - Add RocksDB StateStore as external module
>>>>>>>>> >> >   <https://issues.apache.org/jira/browse/SPARK-34198>
>>>>>>>>> >> >
>>>>>>>>> >> > I wonder whether we should postpone the branch cut date.
>>>>>>>>> >> > cc Min Shen, Yi Wu, Max Gekk, Huaxin Gao, Jungtaek Lim, Yuanjian
>>>>>>>>> >> > Li, Liang-Chi Hsieh, who work on the projects above.
>>>>>>>>> >> >
>>>>>>>>> >> > On Tue, Jun 15, 2021 at 4:34 PM Hyukjin Kwon <gurwls223@> wrote:
>>>>>>>>> >> >
>>>>>>>>> >> >> +1, thanks.
>>>>>>>>> >> >>
>>>>>>>>> >> >> On Tue, 15 Jun 2021, 16:17 Gengliang Wang, <ltnwgl@> wrote:
>>>>>>>>> >> >>
>>>>>>>>> >> >>> Hi,
>>>>>>>>> >> >>>
>>>>>>>>> >> >>> As the expected release date is close, I would like to
>>>>>>>>> >> >>> volunteer as the release manager for Apache Spark 3.2.0.
>>>>>>>>> >> >>>
>>>>>>>>> >> >>> Thanks,
>>>>>>>>> >> >>> Gengliang
>>>>>>>>> >>
>>>>>>>>> >> --
>>>>>>>>> >> Sent from: http://apache-spark-developers-list.1001551.n3.nabble.com/
>>>>>>>>> >>
>>>>>>>>> >> ---------------------------------------------------------------------
>>>>>>>>> >> To unsubscribe e-mail: [email protected]
