I think we finally agreed on the way forward for both items, so we should be
able to wrap up quickly.

On Tue, Jun 2, 2026 at 9:27 AM huaxin gao <[email protected]> wrote:

> Hi all,
>
> Just to keep everyone updated, the Spark 4.2.0 RC1 cut is still pending on
> the two DSv2 transaction fixes:
>
>    1. https://issues.apache.org/jira/browse/SPARK-56695
>    2. https://issues.apache.org/jira/browse/SPARK-56995
>
> Once these fixes are merged and backported to branch-4.2, I will proceed
> with cutting RC1.
>
> Thanks,
> Huaxin
>
> On Thu, May 28, 2026 at 10:04 AM Anton Okolnychyi <[email protected]>
> wrote:
>
>> We have addressed the CDC items I mentioned earlier. I spent more time
>> looking into CDC support in general and didn't find any new issues.
>>
>> Andreas opened two PRs to address the transaction problems above. I am
>> going to review them today and the plan is to get the PRs in by the end of
>> this week (hopefully) / early next week (if iterations are needed).
>>
>> - Anton
>>
>> On Thu, May 28, 2026 at 9:07 AM huaxin gao <[email protected]> wrote:
>>
>>> Thanks Szehon for the update. These two Auto CDC PRs look good to me.
>>>
>>> Anton and Andreas, could you share the current status of the DSv2
>>> transaction fixes for SPARK-56695 and SPARK-56995, and when you expect them
>>> to be merged and backported to branch-4.2?
>>>
>>> Once these pending items are in, I can proceed with cutting RC1.
>>>
>>> Thanks,
>>> Huaxin
>>>
>>> On Wed, May 27, 2026 at 4:56 PM Szehon Ho <[email protected]>
>>> wrote:
>>>
>>>> Hi Huaxin
>>>>
>>>> Thanks for all the hard work doing the release!
>>>>
>>>> It'd be nice to get these two PRs by Anish in for the Spark 4.2 feature
>>>> Auto CDC (although it's not the end of the world if we cannot).
>>>>
>>>>    - https://github.com/apache/spark/pull/53073
>>>>    - https://github.com/apache/spark/pull/56160
>>>>
>>>> The first one is a day 0 bug for SDP and the second is a validation
>>>> that'd be awkward to add after the release.
>>>>
>>>> We will aim to get them in by EOD, but that depends on CI.
>>>>
>>>> Thanks!
>>>> Szehon
>>>>
>>>> On Tue, May 26, 2026 at 7:37 PM Cheng Pan <[email protected]> wrote:
>>>>
>>>>> I apologize for any inconvenience caused.
>>>>>
>>>>> My intention was to keep the PR open for at least 1-2 workdays (based on
>>>>> the size and complexity of the patch, and without keeping it open so long
>>>>> that it blocks the release process) so that developers from all time zones
>>>>> would have the opportunity to review it, but I was completely unaware that
>>>>> Monday was a holiday in the US. The merge happened on Tue 11:48 AM PDT,
>>>>> after a formal approval from a PMC member active in the SQL area; half
>>>>> of a workday is indeed too short for reviewers based in the US to review.
>>>>>
>>>>> Apologies again, and I'm happy to address any post-merge review comments.
>>>>>
>>>>> Thanks,
>>>>> Cheng Pan
>>>>>
>>>>>
>>>>>
>>>>> On May 27, 2026, at 09:15, huaxin gao <[email protected]> wrote:
>>>>>
>>>>> Hi Cheng,
>>>>>
>>>>> Thanks for working on this fix.
>>>>>
>>>>> Since this has already been merged into branch-4.2, I will trust your
>>>>> judgment on the fix itself, but I do have some concerns about the process.
>>>>>
>>>>> The PR was opened over the weekend, Monday was a US holiday, and the
>>>>> 12-hour notice was sent at 10:59 PM Monday night Pacific time. In practice,
>>>>> that did not leave enough review time before merging into the release
>>>>> branch. This is especially concerning for a last-minute change close to RC
>>>>> that includes an API change and behavior changes beyond the narrow
>>>>> correctness issue.
>>>>>
>>>>> For future 4.2.0 release-branch changes, could we please allow more
>>>>> practical review time?
>>>>>
>>>>> Thanks,
>>>>> Huaxin
>>>>>
>>>>> On Mon, May 25, 2026 at 10:59 PM Cheng Pan <[email protected]> wrote:
>>>>>
>>>>>> Huaxin, thank you for replying.
>>>>>>
>>>>>> I would not treat it as a hard blocker, given that it has existed for
>>>>>> a long time and the impact scope is fairly narrow, but it would still be
>>>>>> good to include the fix in 4.2.0 given that it is a relatively small change.
>>>>>>
>>>>>> > The PR also includes API changes and new TABLESAMPLE SYSTEM support ...
>>>>>> > … unless you think the correctness fix needs to be split out separately.
>>>>>>
>>>>>> The 3 parts mentioned in the PR description can be split into dedicated
>>>>>> PRs, but the correctness fix for (1) requires the API change; the changes
>>>>>> for (2) and (3) are small, and I put them together mainly to demonstrate
>>>>>> why the API change makes sense. I'm fine with splitting the PR and
>>>>>> deferring the "new TABLESAMPLE SYSTEM support" to 4.3 if you think it's
>>>>>> risky.
>>>>>>
>>>>>> The PR has been reviewed and approved by cloud-fan. I will leave it
>>>>>> open for another 12 hours and merge it as is if there are no further
>>>>>> comments.
>>>>>>
>>>>>> Thanks,
>>>>>> Cheng Pan
>>>>>>
>>>>>>
>>>>>>
>>>>>> On May 26, 2026, at 00:53, huaxin gao <[email protected]> wrote:
>>>>>>
>>>>>> Hi Cheng,
>>>>>>
>>>>>> Thanks for flagging this. The withReplacement = true pushdown issue
>>>>>> looks valid, but the impact seems fairly narrow. It mainly affects users
>>>>>> doing JDBC TABLESAMPLE pushdown with withReplacement = true on PostgreSQL
>>>>>> or Databricks. The PR also includes API changes and new TABLESAMPLE SYSTEM
>>>>>> support, which feels more like a 4.2.1 candidate than a last-minute RC
>>>>>> change.
>>>>>>
>>>>>> Could you evaluate the risk of merging at the last minute? Otherwise
>>>>>> I'd prefer 4.2.1, unless you think the correctness fix needs to be split
>>>>>> out separately.
>>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>> Huaxin
>>>>>>
>>>>>> On Mon, May 25, 2026 at 3:27 AM Cheng Pan <[email protected]> wrote:
>>>>>>
>>>>>>> Hi Huaxin,
>>>>>>>
>>>>>>> I found some issues in the implementation of JDBC connector
>>>>>>> TABLESAMPLE pushdown and opened SPARK-57040 and
>>>>>>> https://github.com/apache/spark/pull/56092. Since you are the author of
>>>>>>> this feature, it would be great if you could take a look and evaluate
>>>>>>> whether this is a blocker and should be included in 4.2.0.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Cheng Pan
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On May 18, 2026, at 11:40, huaxin gao <[email protected]>
>>>>>>> wrote:
>>>>>>>
>>>>>>> Hi all,
>>>>>>>
>>>>>>> I plan to cut Spark 4.2.0 RC1 on May 20, assuming there are no
>>>>>>> outstanding release blockers.
>>>>>>>
>>>>>>> If you have any fixes that must be included in 4.2.0, please make
>>>>>>> sure they are merged/backported to branch-4.2 before then. If you
>>>>>>> are aware of any release blockers, please reply with the JIRA/PR and
>>>>>>> current status.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Huaxin
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>>
