beliefer opened a new pull request #25313: [WIP][SPARK-28580][SQL] Support ANSI SQL Unique-Predicate syntax
URL: https://github.com/apache/spark/pull/25313

## What changes were proposed in this pull request?

This PR aims to support the ANSI SQL `Unique-Predicate` syntax. A `Unique-Predicate` specifies a test for the absence of duplicate rows. The definition in the ANSI docs is:

```
<unique predicate> ::= UNIQUE <table subquery>
```

I could not find any database that supports this syntax, so I have no implementation in another database to use as a reference. The usage might look like:

```
SELECT t.*
FROM course AS t
WHERE UNIQUE(
  SELECT r.course_id
  FROM section AS r
  WHERE t.course_id=r.course_id AND r.year = '2018'
);
```

I referred to the implementation of `Exists` in Spark SQL. The rule `RewritePredicateSubquery` replaces `Exists` with a semi join between the inner table and the outer table. My basic idea is to add a rule that replaces the `Unique-Predicate` with equivalent SQL. Taking the SQL above as an example, the rewritten SQL would look like:

```
SELECT T.*
FROM course AS T
WHERE 1 = (
  SELECT count(R.course_id)
  FROM section AS R
  WHERE T.course_id=R.course_id AND R.year = 2018
);
```

I am not sure whether this approach will be welcomed by everyone, and I would appreciate better ideas.

## How was this patch tested?

New unit tests.
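
The count-based rewrite described above can be tried outside Spark. The following is a minimal sketch, not part of the patch: it uses Python's built-in `sqlite3` module, and the table schemas and sample rows are hypothetical. SQLite does not implement `UNIQUE <table subquery>`, so only the rewritten form is executed. The sketch compares `1 =` with `1 >=`, because in the SQL standard's definition a subquery that returns no rows contains no duplicates and therefore satisfies `UNIQUE`.

```python
import sqlite3

# Hypothetical schema and data, chosen only to exercise the three cases:
# one matching section, two matching sections, and no matching section.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE course (course_id TEXT PRIMARY KEY);
    CREATE TABLE section (course_id TEXT, sec_id TEXT, year INTEGER);
    INSERT INTO course VALUES ('CS-101'), ('CS-190'), ('BIO-301');
    INSERT INTO section VALUES
        ('CS-101',  '1', 2018),
        ('CS-190',  '1', 2018),
        ('CS-190',  '2', 2018),
        ('BIO-301', '1', 2017);
""")

# The rewritten query from the description, with the comparison
# operator left as a placeholder so both variants can be run.
REWRITE = """
    SELECT t.course_id
    FROM course AS t
    WHERE 1 {op} (
        SELECT count(r.course_id)
        FROM section AS r
        WHERE t.course_id = r.course_id AND r.year = 2018
    )
    ORDER BY t.course_id
"""

def courses(op):
    """Run the rewritten query with the given comparison operator."""
    return [row[0] for row in conn.execute(REWRITE.format(op=op))]

exactly_one = courses("=")    # courses with exactly one 2018 section
at_most_one = courses(">=")   # courses with zero or one 2018 section

print(exactly_one)
print(at_most_one)
```

With this data, `1 =` drops the course that has no 2018 section, while `1 >=` keeps it. Either comparison works as a duplicate test here only because every row returned by the subquery carries the same correlated `course_id`; a subquery that can return several distinct rows would need a different rewrite.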
