The LogicalAggregate will ALWAYS return one row (just as "SELECT COUNT(*) FROM t" always returns one row, even if t is empty, or in this case "SELECT MIN(TRUE) FROM t" always returns one row). If the result is non-empty the column value will be TRUE. If the result is empty the column value will be NULL (aka UNKNOWN).
So, "IS NOT NULL($9)" is a proxy for the "EXISTS ...". It always returns the same value. And thankfully EXISTS (unlike IN) never returns UNKNOWN so we don't have to worry about 3-valued logic. Julian On Wed, Mar 1, 2017 at 10:05 PM, baofeng.zbf <baofeng....@alibaba-inc.com> wrote: > > Hi, > > When I do non-correlated subquery using NOT EXISTS, the behavior of > TableScan on empty result confused me. > > For example, `select * from foo where not exists (select col1, col2 from bar > where id<-1) order by 1 limit 10;` will generate: > > xxxx > LogicalFilter(condition=[NOT(IS NOT NULL($9))]) > LogicalJoin(condition=[true], joinType=[left]) > TableScan(table=[foo], projects=[...]) > LogicalAggregate(group=[{}], agg#0=[MIN($0)]) > LogicalProject($f0=[true]) > LogicalProject(id=[$0]) > LogicalFilter(condition=[<($0, -1)]) > TableScan(table=[bar], projects=[[0, 1]]) > > When `select col1, col2 from bar where id<-1` returns empty result, > TableScan givens no rows. > > In this cases, the left join will not perform, unless empty result will > always return Row(null, null) if it is empty. > > Is Calcite`s subquery conversion based on this assumption? > > Thx. > Baofeng Zhang.