[
https://issues.apache.org/jira/browse/PHOENIX-1168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14139904#comment-14139904
]
Maryann Xue commented on PHOENIX-1168:
--------------------------------------
bq. can you give me the 10,000ft view of how you execute the inner query? Is it
done client-side with an early exit based on ANY/SOME/ALL? Or do you turn it
into a join and push the inner or outer in a hash cache?
ANY/SOME/ALL is handled just similar to IN clause now. No early exit, but
rather execute the entire inner query and build it into an literal array.
If we should choose to do early exit with ANY/SOME/ALL, we would want to do
them client side, for server-to-server communication is something we'd like to
avoid. But since this ANY/SOME/ALL can appear anywhere in the where clause, it
is not always possible to isolate them and leave the others to the server-side.
A more aggressive and maybe better optimization is to rewrite ANY/SOME/ALL to
their equivalent, like >= ALL (...) means >= max(...) assuming that query does
not return empty results.
http://www.oracle-base.com/articles/misc/all-any-some-comparison-conditions-in-sql.php
bq. what is the relationship between the ANY operator and EXISTS? Would it make
sense to push a LIMIT to the inner query (or maybe you already do)?
EXISTS is not implemented in this patch. I was thinking to do it in PHOENIX-167.
bq. can a constant appear on the LHS when ANY/SOME/ALL is used? If so, is the
constant pushed into the execution of the inner query?
Yes, since ANY/SOME/ALL is already converted to a LiteralExpression of Phoenix
Array at the time of the outer query being executed, anything should work
exactly the way for normal array constructs. But no push-down to the inner
query.
bq. is there any opportunity to flatten the inner query in an IN with the outer
query?
Could you give me a more detailed idea on this? like an example?
bq. when there's no modifier, is that when it's only valid for the IN query to
return a single row? Would it make sense to push a LIMIT 2 to the IN query to
prevent a bunch of data from being pulled over?
Yes, otherwise a SQLException would be thrown (code:
SINGLE_ROW_SUBQUERY_RETURNS_MULTIPLE_ROWS). This is done with early exit, at
the time of iteration of the inner query. We have a flag "expectSingleRow", and
when moving iterators to a second row, the exception is thrown. But still,
think LIMIT 2 would make some sense, for we can avoid starting parallel
iterators for the inner query, right?
> Support non-correlated sub-queries in where clause having a comparison
> operator with no modifier or a comparison operator modified by ANY, SOME or
> ALL
> ------------------------------------------------------------------------------------------------------------------------------------------------------
>
> Key: PHOENIX-1168
> URL: https://issues.apache.org/jira/browse/PHOENIX-1168
> Project: Phoenix
> Issue Type: Sub-task
> Affects Versions: 3.0.0, 4.0.0, 5.0.0
> Reporter: Maryann Xue
> Assignee: Maryann Xue
> Fix For: 3.0.0, 4.0.0, 5.0.0
>
> Attachments: 1168.patch
>
>
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)