I think we finally agreed on the way forward for both items, so we should be able to wrap up quickly.
On Tue, Jun 2, 2026 at 09:27, huaxin gao <[email protected]> wrote:

> Hi all,
>
> Just to keep everyone updated, the Spark 4.2.0 RC1 cut is still pending on
> the two DSv2 transaction fixes:
>
> 1. https://issues.apache.org/jira/browse/SPARK-56695
> 2. https://issues.apache.org/jira/browse/SPARK-56995
>
> Once these fixes are merged and backported to branch-4.2, I will proceed
> with cutting RC1.
>
> Thanks,
> Huaxin
>
> On Thu, May 28, 2026 at 10:04 AM Anton Okolnychyi <[email protected]>
> wrote:
>
>> We have addressed the CDC items I mentioned earlier. I spent more time
>> looking into CDC support in general and didn't find any new issues.
>>
>> Andreas opened two PRs to address the transaction problems above. I am
>> going to review them today and the plan is to get the PRs in by the end of
>> this week (hopefully) / early next week (if iterations are needed).
>>
>> - Anton
>>
>> On Thu, May 28, 2026 at 09:07, huaxin gao <[email protected]> wrote:
>>
>>> Thanks Szehon for the update. These two Auto CDC PRs look good to me.
>>>
>>> Anton and Andreas, could you share the current status of the DSv2
>>> transaction fixes for SPARK-56695 and SPARK-56995, and when you expect
>>> them to be merged and backported to branch-4.2?
>>>
>>> Once these pending items are in, I can proceed with cutting RC1.
>>>
>>> Thanks,
>>> Huaxin
>>>
>>> On Wed, May 27, 2026 at 4:56 PM Szehon Ho <[email protected]>
>>> wrote:
>>>
>>>> Hi Huaxin
>>>>
>>>> Thanks for all the hard work doing the release!
>>>>
>>>> It'd be nice to get these two PRs by Anish in for the Spark 4.2 feature
>>>> Auto CDC (although it's not the end of the world if we cannot).
>>>>
>>>> - https://github.com/apache/spark/pull/53073
>>>> - https://github.com/apache/spark/pull/56160
>>>>
>>>> The first one is a day 0 bug for SDP and the second is a validation
>>>> that'd be awkward to add after the release.
>>>>
>>>> We will aim to get them in by EOD, but it depends on CI.
>>>>
>>>> Thanks!
>>>> Szehon
>>>>
>>>> On Tue, May 26, 2026 at 7:37 PM Cheng Pan <[email protected]> wrote:
>>>>
>>>>> I apologize for any inconvenience caused.
>>>>>
>>>>> My intention was to keep the PR open for at least 1-2 workdays (based
>>>>> on the size and complexity of the patch, and without keeping it open
>>>>> so long that it blocks the release process) so that developers from
>>>>> all time zones would have the opportunity to review it, but I was
>>>>> completely unaware that Monday was a holiday in the US. The merge
>>>>> happened on Tue 11:48 AM PDT, after a formal approval from a PMC
>>>>> member active in the SQL area; half of a workday is indeed too short
>>>>> for reviewers based in the US to review.
>>>>>
>>>>> I apologize again, and I'm happy to address any post-review comments.
>>>>>
>>>>> Thanks,
>>>>> Cheng Pan
>>>>>
>>>>> On May 27, 2026, at 09:15, huaxin gao <[email protected]> wrote:
>>>>>
>>>>> Hi Cheng,
>>>>>
>>>>> Thanks for working on this fix.
>>>>>
>>>>> Since this has already been merged into branch-4.2, I will trust your
>>>>> judgment on the fix itself, but I do have some concerns about the
>>>>> process.
>>>>>
>>>>> The PR was opened over the weekend, Monday was a US holiday, and the
>>>>> 12-hour notice was sent at 10:59 PM Monday night Pacific time. In
>>>>> practice, that did not leave enough review time before merging into
>>>>> the release branch. This is especially concerning for a last-minute
>>>>> change close to RC that includes an API change and behavior changes
>>>>> beyond the narrow correctness issue.
>>>>>
>>>>> For future 4.2.0 release-branch changes, could we please allow more
>>>>> practical review time?
>>>>>
>>>>> Thanks,
>>>>> Huaxin
>>>>>
>>>>> On Mon, May 25, 2026 at 10:59 PM Cheng Pan <[email protected]> wrote:
>>>>>
>>>>>> Huaxin, thank you for replying.
>>>>>>
>>>>>> I would not treat it as a hard blocker, given that it has existed for
>>>>>> a long time and the impact scope is fairly narrow, but it would still
>>>>>> be good to include the fix in 4.2.0 since it is a relatively small
>>>>>> change.
>>>>>>
>>>>>> > The PR also includes API changes and new TABLESAMPLE SYSTEM support
>>>>>> > ...
>>>>>> > … unless you think the correctness fix needs to be split out
>>>>>> > separately.
>>>>>>
>>>>>> The 3 parts mentioned in the PR description can be split into
>>>>>> dedicated PRs, but the correctness fix for (1) requires the API
>>>>>> change; the changes for (2) and (3) are small, and I put them
>>>>>> together mainly to demonstrate why the API change makes sense. I'm
>>>>>> fine with splitting the PR and deferring the "new TABLESAMPLE SYSTEM
>>>>>> support" to 4.3 if you think it's risky.
>>>>>>
>>>>>> The PR has been reviewed and approved by cloud-fan. I will leave it
>>>>>> open for another 12 hours and merge it as is if there are no further
>>>>>> comments.
>>>>>>
>>>>>> Thanks,
>>>>>> Cheng Pan
>>>>>>
>>>>>> On May 26, 2026, at 00:53, huaxin gao <[email protected]> wrote:
>>>>>>
>>>>>> Hi Cheng,
>>>>>>
>>>>>> Thanks for flagging this. The withReplacement = true pushdown issue
>>>>>> looks valid, but the impact seems fairly narrow. It mainly affects
>>>>>> users doing JDBC TABLESAMPLE pushdown with withReplacement = true on
>>>>>> PostgreSQL or Databricks. The PR also includes API changes and new
>>>>>> TABLESAMPLE SYSTEM support, which feels more like a 4.2.1 candidate
>>>>>> than a last-minute RC change.
>>>>>>
>>>>>> Could you evaluate the risk of merging at the last minute? Otherwise
>>>>>> I'd prefer 4.2.1, unless you think the correctness fix needs to be
>>>>>> split out separately.
>>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>> Huaxin
>>>>>>
>>>>>> On Mon, May 25, 2026 at 3:27 AM Cheng Pan <[email protected]> wrote:
>>>>>>
>>>>>>> Hi Huaxin,
>>>>>>>
>>>>>>> I found some issues in the implementation of the JDBC connector
>>>>>>> TABLESAMPLE pushdown and opened SPARK-57040 and
>>>>>>> https://github.com/apache/spark/pull/56092. It would be great if
>>>>>>> you could take a look and evaluate whether this is a blocker and
>>>>>>> should be included in 4.2.0, since you are the author of this
>>>>>>> feature.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Cheng Pan
>>>>>>>
>>>>>>> On May 18, 2026, at 11:40, huaxin gao <[email protected]>
>>>>>>> wrote:
>>>>>>>
>>>>>>> Hi all,
>>>>>>>
>>>>>>> I plan to cut Spark 4.2.0 RC1 on May 20, assuming there are no
>>>>>>> outstanding release blockers.
>>>>>>>
>>>>>>> If you have any fixes that must be included in 4.2.0, please make
>>>>>>> sure they are merged/backported to branch-4.2 before then. If you
>>>>>>> are aware of any release blockers, please reply with the JIRA/PR and
>>>>>>> current status.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Huaxin
