Hi,
I'm new here and would like to contribute to Catalyst based on my
experience.
Spark SQL supports three kinds of subquery in a Filter: InSubquery, Exists,
and ScalarSubquery. Most databases also support the "ANY (SOME) subquery"
and the "ALL subquery".
I'm writing this email to discuss how to support the ANY subquery in
Spark SQL.

ANY Syntax:
SELECT column(s)
FROM table
WHERE column(s) operator ANY
(SELECT column(s) FROM table WHERE condition);
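
To make the intended semantics concrete, here is a minimal Scala sketch of
how "value operator ANY (subquery)" evaluates for a single integer column.
The object AnySemantics and its helpers are hypothetical names used only for
illustration; they are not part of Spark. The sketch ignores NULLs, so it
does not model SQL's three-valued logic.

```scala
object AnySemantics {
  // Hypothetical helper: maps a SQL comparison operator to a predicate on Ints.
  def compare(op: String): (Int, Int) => Boolean = op match {
    case "="   => (a, b) => a == b
    case "<>"  => (a, b) => a != b
    case "<"   => (a, b) => a < b
    case "<="  => (a, b) => a <= b
    case ">"   => (a, b) => a > b
    case ">="  => (a, b) => a >= b
    case other => throw new IllegalArgumentException(s"unsupported operator: $other")
  }

  // value op ANY (subquery): true if at least one subquery row satisfies
  // the comparison. An empty subquery therefore yields false.
  def any(value: Int, op: String, subquery: Seq[Int]): Boolean =
    subquery.exists(row => compare(op)(value, row))
}
```

With this model, AnySemantics.any(v, "=", rows) gives the same answer as
rows.contains(v), which is the "= ANY is IN" equivalence for non-NULL data.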

I have done some experiments, and here are the basic points:
1. InSubquery is the special case of AnySubquery whose operator is
"=" (EqualTo), so the two can be implemented similarly.
2. The syntax in SqlBase.g4 could be: comparisonOperator NOT? kind=(ANY | SOME)
'(' query ')'
3. Define a case class AnySubquery(values: Seq[Expression],
comparisonOperator: String, query: ListQuery)
4. Analyze the subquery in the ResolveSubquery rule.
5. Rewrite the subquery as a Semi/Anti Join in the
RewritePredicateSubquery rule.
6. The only difference between InSubquery and AnySubquery is the join
condition that the rewrite produces.
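
Point 6 can be sketched with a toy expression tree. This is only a model
under stated assumptions: Expr, Col, Cmp, And, joinCondition and inCondition
are hypothetical names, not Catalyst's classes, and the sketch assumes that a
multi-column comparison is the conjunction of per-column comparisons. It also
leaves out NOT handling and the NULL-aware condition that an Anti Join would
need.

```scala
object AnyRewriteSketch {
  // Toy expression tree; these are NOT Catalyst's classes.
  sealed trait Expr
  final case class Col(name: String) extends Expr
  final case class Cmp(op: String, left: Expr, right: Expr) extends Expr
  final case class And(left: Expr, right: Expr) extends Expr

  // Pair each outer value with the matching subquery output column and
  // combine the per-column comparisons with And.
  def joinCondition(values: Seq[Expr], op: String, subqueryOutput: Seq[Expr]): Expr = {
    require(values.nonEmpty && values.length == subqueryOutput.length,
      "values and subquery output must be non-empty and have the same length")
    values.zip(subqueryOutput)
      .map { case (l, r) => Cmp(op, l, r): Expr }
      .reduce[Expr](And(_, _))
  }

  // In this model the IN case is the same construction with "=" fixed.
  def inCondition(values: Seq[Expr], subqueryOutput: Seq[Expr]): Expr =
    joinCondition(values, "=", subqueryOutput)
}
```

For example, joinCondition(Seq(Col("a")), ">", Seq(Col("x"))) yields
Cmp(">", Col("a"), Col("x")), while inCondition on the same inputs yields
Cmp("=", Col("a"), Col("x")); everything else about the rewrite would stay
the same.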

I haven't opened a pull request or JIRA because I'm new here and I do not
want to bother the reviewers.
If you have any suggestions or see something I missed, I would be glad to
discuss it.

Thanks,

Mingcong Han



