beliefer opened a new pull request #25313: [WIP][SPARK-28580][SQL] Support ANSI 
SQL Unique-Predicate syntax
URL: https://github.com/apache/spark/pull/25313
 
 
   ## What changes were proposed in this pull request?
   
   The aim of this PR is to support the ANSI SQL `Unique-Predicate` syntax.
   A `Unique-Predicate` tests for the absence of duplicate rows in the result 
of a subquery.
   The definition in the ANSI standard is:
   ```
   <unique predicate> ::=
   UNIQUE <table subquery>
   ```
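
To make the intended semantics concrete, here is a minimal Python sketch of what the predicate evaluates to. The function name `unique_predicate` and the tuple-based row representation are hypothetical and are not part of Spark. The NULL handling follows my reading of the standard (a row containing a NULL is never treated as a duplicate), so please treat that detail as an assumption.

```python
from typing import Iterable, Optional, Tuple

# A row is modelled as a tuple of column values; None stands in for SQL NULL.
Row = Tuple[Optional[object], ...]


def unique_predicate(rows: Iterable[Row]) -> bool:
    """Return True when no two rows of the subquery result are duplicates.

    Assumption: a row with a NULL in any column never counts as a
    duplicate of another row, mirroring SQL's NULL comparison rules.
    """
    seen = set()
    for row in rows:
        if any(col is None for col in row):
            continue  # a row containing NULL cannot equal any other row
        if row in seen:
            return False  # found two identical, fully non-NULL rows
        seen.add(row)
    return True  # also reached for an empty input
```

Note that an empty subquery result yields `True`, which matters for any count-based rewrite of the predicate.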
   I could not find any database that supports this syntax, so I have no 
reference implementation to compare against.
   The usage would look like this:
   ```
   SELECT t.*
   FROM course AS t
   WHERE UNIQUE(
       SELECT r.course_id
       FROM section AS r
       WHERE t.course_id=r.course_id AND r.year = '2018'
   );
   ```
   I have referred to the implementation of `Exists` in Spark SQL. The rule 
`RewritePredicateSubquery` replaces `Exists` with a semi join between the 
outer table and the inner table.
   My basic idea is to add a rule that replaces `Unique-Predicate` with an 
equivalent SQL expression.
   Taking the above SQL as an example, the rewritten SQL would look like this:
   ```
   SELECT T.*
   FROM course AS T
   -- 1 >= rather than 1 =: UNIQUE is also true when the subquery returns no rows
   WHERE 1 >= (
     SELECT count(R.course_id)
     FROM section AS R
     WHERE T.course_id=R.course_id AND R.year = '2018'
   );
   ```
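
A count-based rewrite of this kind can be tried out on any engine that supports correlated scalar subqueries. The sketch below uses Python's built-in `sqlite3` module as a stand-in for Spark. The table contents are made up for illustration, and the comparison is written as `1 >=` so that a course with no matching section also passes, since the predicate is true for an empty subquery result.

```python
import sqlite3

# In-memory database with hypothetical sample data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE course (course_id TEXT);
    CREATE TABLE section (course_id TEXT, year TEXT);
    INSERT INTO course VALUES ('CS-101'), ('CS-190'), ('BIO-301');
    INSERT INTO section VALUES
        ('CS-101', '2018'),   -- offered once in 2018
        ('CS-190', '2018'),   -- offered twice in 2018
        ('CS-190', '2018'),
        ('BIO-301', '2017');  -- not offered in 2018
""")

# Count-based rewrite of WHERE UNIQUE(<correlated subquery>).
REWRITTEN = """
    SELECT t.course_id
    FROM course AS t
    WHERE 1 >= (
        SELECT count(r.course_id)
        FROM section AS r
        WHERE t.course_id = r.course_id AND r.year = '2018'
    )
    ORDER BY t.course_id
"""


def courses_passing_rewrite(connection: sqlite3.Connection) -> list:
    """Run the rewritten query and return the matching course ids."""
    return [row[0] for row in connection.execute(REWRITTEN)]


result = courses_passing_rewrite(conn)
print(result)  # ['BIO-301', 'CS-101']
```

`CS-190` is filtered out because its subquery returns two identical rows, while `BIO-301` is kept because its subquery is empty. The rewrite relies on every subquery row carrying the same non-NULL `course_id`; a general rule would need to handle multiple columns and NULLs differently.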
   I am not sure whether this approach would be accepted, so I would welcome 
suggestions for a better design.
   ## How was this patch tested?
   
   New unit tests.
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


With regards,
Apache Git Services
