+1 (non-binding)

Thanks for working on this Anton! Some links to other engines that also did something similar:
HIVE-13076 - https://issues.apache.org/jira/browse/HIVE-13076
IMPALA-3531 - https://issues.apache.org/jira/browse/IMPALA-3531

In fact, Spark had a very old Jira SPARK-19842 - https://issues.apache.org/jira/browse/SPARK-19842

Thanks
Anurag Mantripragada

> On Mar 22, 2025, at 5:36 AM, Yuming Wang <[email protected]> wrote:
>
> +1
>
> On Sat, Mar 22, 2025 at 7:01 PM Peter Toth <[email protected]> wrote:
>> +1
>>
>> On Fri, Mar 21, 2025 at 10:24 PM Szehon Ho <[email protected]> wrote:
>>> +1 (non binding)
>>>
>>> Agree with Anton, data sources like the open table formats define the
>>> requirement, and definitely need engines to write to it accordingly.
>>>
>>> Thanks,
>>> Szehon
>>>
>>> On Fri, Mar 21, 2025 at 1:31 PM Anton Okolnychyi <[email protected]> wrote:
>>>>> -1 (non-binding): Breaks the Chain of Responsibility. Constraints should
>>>>> be defined and enforced by the data sources themselves, not Spark. Spark
>>>>> is a processing engine, and enforcing constraints at this level blurs
>>>>> architectural boundaries, making Spark responsible for something it does
>>>>> not control.
>>>>
>>>> I disagree that this breaks the chain of responsibility. It may be quite
>>>> the opposite, in fact. Spark is already responsible for enforcing NOT NULL
>>>> constraints by adding AssertNotNull for required columns today. Connectors
>>>> like Iceberg and Delta store constraint definitions but rely on engines
>>>> like Spark to enforce them during INSERT, DELETE, UPDATE, and MERGE
>>>> operations. Without this API, each connector would need to reimplement the
>>>> same logic, creating duplication.
>>>>
>>>> The proposal is aligned with the SQL standard and other relational
>>>> databases. In my view, it simply makes Spark a better engine, facilitates
>>>> data accuracy and consistency, and enables performance optimizations.
>>>>
>>>> - Anton
>>>>
>>>> On Fri, Mar 21, 2025 at 12:59, Ángel Álvarez Pascua <[email protected]> wrote:
>>>>> -1 (non-binding): Breaks the Chain of Responsibility. Constraints should
>>>>> be defined and enforced by the data sources themselves, not Spark. Spark
>>>>> is a processing engine, and enforcing constraints at this level blurs
>>>>> architectural boundaries, making Spark responsible for something it does
>>>>> not control.
>>>>>
>>>>> On Fri, Mar 21, 2025 at 20:18, L. C. Hsieh <[email protected]> wrote:
>>>>>> +1
>>>>>>
>>>>>> On Fri, Mar 21, 2025 at 12:13 PM huaxin gao <[email protected]> wrote:
>>>>>> >
>>>>>> > +1
>>>>>> >
>>>>>> > On Fri, Mar 21, 2025 at 12:08 PM Denny Lee <[email protected]> wrote:
>>>>>> >>
>>>>>> >> +1 (non-binding)
>>>>>> >>
>>>>>> >> On Fri, Mar 21, 2025 at 11:52 Gengliang Wang <[email protected]> wrote:
>>>>>> >>>
>>>>>> >>> +1
>>>>>> >>>
>>>>>> >>> On Fri, Mar 21, 2025 at 11:46 AM Anton Okolnychyi <[email protected]> wrote:
>>>>>> >>>>
>>>>>> >>>> Hi all,
>>>>>> >>>>
>>>>>> >>>> I would like to start a vote on adding support for constraints to
>>>>>> >>>> DSv2.
>>>>>> >>>>
>>>>>> >>>> Discussion thread:
>>>>>> >>>> https://lists.apache.org/thread/njqjcryq0lot9rkbf10mtvf7d1t602bj
>>>>>> >>>> SPIP:
>>>>>> >>>> https://docs.google.com/document/d/1EHjB4W1LjiXxsK_G7067j9pPX0y15LUF1Z5DlUPoPIo
>>>>>> >>>> PR with the API changes: https://github.com/apache/spark/pull/50253
>>>>>> >>>> JIRA: https://issues.apache.org/jira/browse/SPARK-51207
>>>>>> >>>>
>>>>>> >>>> Please vote on the SPIP for the next 72 hours:
>>>>>> >>>>
>>>>>> >>>> [ ] +1: Accept the proposal as an official SPIP
>>>>>> >>>> [ ] +0
>>>>>> >>>> [ ] -1: I don’t think this is a good idea because …
>>>>>> >>>>
>>>>>> >>>> - Anton
>>>>>>
>>>>>> ---------------------------------------------------------------------
>>>>>> To unsubscribe e-mail: [email protected]
>>>>>>
