+1 to define an allowlist of features that we want to backport to branch-3.3. I also have a few in mind:

- complex type support in the vectorized Parquet reader: https://github.com/apache/spark/pull/34659
- refine the DS v2 filter API for JDBC v2: https://github.com/apache/spark/pull/35768
- a few new SQL functions that have been in development for a while: to_char, split_part, percentile_disc, try_sum, etc.

On Wed, Mar 16, 2022 at 2:41 PM Maxim Gekk <maxim.g...@databricks.com.invalid> wrote:

> Hi All,
>
> I have created the branch for Spark 3.3:
> https://github.com/apache/spark/commits/branch-3.3
>
> Please backport important fixes to it, and if you have any doubts, ping
> me in the PR. Regarding new features, we are still building the allow list
> for branch-3.3.
>
> Best regards,
> Max Gekk
>
> On Wed, Mar 16, 2022 at 5:51 AM Dongjoon Hyun <dongjoon.h...@gmail.com> wrote:
>
>> Yes, I agree with you for your whitelist approach for backporting. :)
>> Thank you for summarizing.
>>
>> Thanks,
>> Dongjoon.
>>
>> On Tue, Mar 15, 2022 at 4:20 PM Xiao Li <gatorsm...@gmail.com> wrote:
>>
>>> I think I finally got your point. What you want to keep unchanged is the
>>> branch cut date of Spark 3.3. Today? Or this Friday? This is not a big
>>> deal.
>>>
>>> My major concern is whether we should keep merging the feature work or
>>> the dependency upgrades after the branch cut. To make our release time more
>>> predictable, I am suggesting we should finalize the exception PR list
>>> first, instead of merging them in an ad hoc way. In the past, we spent a
>>> lot of time reverting PRs that were merged after the branch cut.
>>> I hope we can minimize unnecessary arguments in this release. Do you agree,
>>> Dongjoon?
>>>
>>> On Tue, Mar 15, 2022 at 3:55 PM Dongjoon Hyun <dongjoon.h...@gmail.com> wrote:
>>>
>>>> That is not totally fine, Xiao. It sounds like you are asking for a change
>>>> of plan without a proper reason.
>>>>
>>>> Although we cut the branch today according to our plan, you still can
>>>> collect the list and make a list of exceptions. I'm not blocking what you
>>>> want to do.
>>>>
>>>> Please let the community start to ramp down as we agreed before.
>>>>
>>>> Dongjoon
>>>>
>>>> On Tue, Mar 15, 2022 at 3:07 PM Xiao Li <gatorsm...@gmail.com> wrote:
>>>>
>>>>> Please do not get me wrong.
>>>>> If we don't cut a branch, we are allowing
>>>>> all patches to land in Apache Spark 3.3. That is totally fine. After we cut
>>>>> the branch, we should avoid merging the feature work. In the next three
>>>>> days, let us collect the actively developed PRs for which we want to make an
>>>>> exception (i.e., merged to 3.3 after the upcoming branch cut). Does that
>>>>> make sense?
>>>>>
>>>>> On Tue, Mar 15, 2022 at 2:54 PM Dongjoon Hyun <dongjoon.h...@gmail.com> wrote:
>>>>>
>>>>>> Xiao. You are working against what you are saying.
>>>>>> If you don't cut a branch, it means you are allowing all patches to
>>>>>> land in Apache Spark 3.3. No?
>>>>>>
>>>>>> > we need to avoid backporting feature work that has not been
>>>>>> > well discussed.
>>>>>>
>>>>>> On Tue, Mar 15, 2022 at 12:12 PM Xiao Li <gatorsm...@gmail.com> wrote:
>>>>>>
>>>>>>> Cutting the branch is simple, but we need to avoid backporting
>>>>>>> feature work that has not been well discussed. Not all the members are
>>>>>>> actively following the dev list. I think we should wait 3 more days to
>>>>>>> collect the PR list before cutting the branch.
>>>>>>>
>>>>>>> BTW, there is very little 3.4-only feature work that will be affected.
>>>>>>>
>>>>>>> Xiao
>>>>>>>
>>>>>>> On Tue, Mar 15, 2022 at 11:49 AM Dongjoon Hyun <dongjoon.h...@gmail.com> wrote:
>>>>>>>
>>>>>>>> Hi, Max, Chao, Xiao, Holden and all.
>>>>>>>>
>>>>>>>> I have a different idea.
>>>>>>>>
>>>>>>>> Given the situation and the small patch list, I don't think we need to
>>>>>>>> postpone the branch cut for those patches. It's easier to cut
>>>>>>>> branch-3.3 and allow backporting.
>>>>>>>>
>>>>>>>> As of today, we already have an obvious Apache Spark 3.4 patch in
>>>>>>>> the branch together. This situation only becomes worse and worse
>>>>>>>> because there is no way to block the other patches from landing
>>>>>>>> unintentionally if we don't cut a branch.
>>>>>>>>
>>>>>>>> [SPARK-38335][SQL] Implement parser support for DEFAULT column values
>>>>>>>>
>>>>>>>> Let's cut `branch-3.3` today for Apache Spark 3.3.0 preparation.
>>>>>>>>
>>>>>>>> Best,
>>>>>>>> Dongjoon.
>>>>>>>>
>>>>>>>> On Tue, Mar 15, 2022 at 10:17 AM Chao Sun <sunc...@apache.org> wrote:
>>>>>>>>
>>>>>>>>> Cool, thanks for clarifying!
>>>>>>>>>
>>>>>>>>> On Tue, Mar 15, 2022 at 10:11 AM Xiao Li <gatorsm...@gmail.com> wrote:
>>>>>>>>> >>
>>>>>>>>> >> For the following list:
>>>>>>>>> >> #35789 [SPARK-32268][SQL] Row-level Runtime Filtering
>>>>>>>>> >> #34659 [SPARK-34863][SQL] Support complex types for Parquet vectorized reader
>>>>>>>>> >> #35848 [SPARK-38548][SQL] New SQL function: try_sum
>>>>>>>>> >> Do you mean we should include them, or exclude them from 3.3?
>>>>>>>>> >
>>>>>>>>> > If possible, I hope these features can be shipped with Spark 3.3.
>>>>>>>>> >
>>>>>>>>> > On Tue, Mar 15, 2022 at 10:06 AM Chao Sun <sunc...@apache.org> wrote:
>>>>>>>>> >>
>>>>>>>>> >> Hi Xiao,
>>>>>>>>> >>
>>>>>>>>> >> For the following list:
>>>>>>>>> >>
>>>>>>>>> >> #35789 [SPARK-32268][SQL] Row-level Runtime Filtering
>>>>>>>>> >> #34659 [SPARK-34863][SQL] Support complex types for Parquet vectorized reader
>>>>>>>>> >> #35848 [SPARK-38548][SQL] New SQL function: try_sum
>>>>>>>>> >>
>>>>>>>>> >> Do you mean we should include them, or exclude them from 3.3?
>>>>>>>>> >>
>>>>>>>>> >> Thanks,
>>>>>>>>> >> Chao
>>>>>>>>> >>
>>>>>>>>> >> On Tue, Mar 15, 2022 at 9:56 AM Dongjoon Hyun <dongjoon.h...@gmail.com> wrote:
>>>>>>>>> >> >
>>>>>>>>> >> > The following was tested and merged a few minutes ago. So, we can remove it from the list.
>>>>>>>>> >> >
>>>>>>>>> >> > #35819 [SPARK-38524][SPARK-38553][K8S] Bump Volcano to v1.5.1
>>>>>>>>> >> >
>>>>>>>>> >> > Thanks,
>>>>>>>>> >> > Dongjoon.
>>>>>>>>> >> >
>>>>>>>>> >> > On Tue, Mar 15, 2022 at 9:48 AM Xiao Li <gatorsm...@gmail.com> wrote:
>>>>>>>>> >> >>
>>>>>>>>> >> >> Let me clarify my above suggestion. Maybe we can wait 3 more days to collect the list of actively developed PRs that we want to merge to 3.3 after the branch cut?
>>>>>>>>> >> >>
>>>>>>>>> >> >> Please do not rush to merge the PRs that are not fully reviewed. We can cut the branch this Friday and continue merging the PRs that have been discussed in this thread. Does that make sense?
>>>>>>>>> >> >>
>>>>>>>>> >> >> Xiao
>>>>>>>>> >> >>
>>>>>>>>> >> >> On Tue, Mar 15, 2022 at 9:10 AM Holden Karau <hol...@pigscanfly.ca> wrote:
>>>>>>>>> >> >>>
>>>>>>>>> >> >>> May I suggest we push out one week (22nd) just to give everyone a bit of breathing space? Rushed software development more often results in bugs.
>>>>>>>>> >> >>>
>>>>>>>>> >> >>> On Tue, Mar 15, 2022 at 6:23 AM Yikun Jiang <yikunk...@gmail.com> wrote:
>>>>>>>>> >> >>>>
>>>>>>>>> >> >>>> > To make our release time more predictable, let us collect the PRs and wait three more days before the branch cut?
>>>>>>>>> >> >>>>
>>>>>>>>> >> >>>> For SPIP: Support Customized Kubernetes Schedulers:
>>>>>>>>> >> >>>> #35819 [SPARK-38524][SPARK-38553][K8S] Bump Volcano to v1.5.1
>>>>>>>>> >> >>>>
>>>>>>>>> >> >>>> Three more days are OK for this from my view.
>>>>>>>>> >> >>>>
>>>>>>>>> >> >>>> Regards,
>>>>>>>>> >> >>>> Yikun
>>>>>>>>> >> >>>
>>>>>>>>> >> >>> --
>>>>>>>>> >> >>> Twitter: https://twitter.com/holdenkarau
>>>>>>>>> >> >>> Books (Learning Spark, High Performance Spark, etc.): https://amzn.to/2MaRAG9
>>>>>>>>> >> >>> YouTube Live Streams: https://www.youtube.com/user/holdenkarau