The issue here is not zero-field records. The issue is that for `SELECT COUNT(*)`, the sql-to-rel conversion doesn't produce a projection containing just a dummy constant (so that the row has one field). Instead, there is no projection at all, so all fields are read from the TableScan.
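To make the cost concrete, here is a toy sketch of the difference. It is not Calcite code: `CountStarScanSketch` and `fieldsRead` are made-up names, and it assumes a scan that fetches exactly the columns of a pushed-down projection and falls back to all columns when no projection is pushed down.

```java
import java.util.Collections;
import java.util.List;

/** Toy model, not Calcite code: counts how many field values a scan materializes. */
final class CountStarScanSketch {

    /**
     * Hypothetical table scan cost.
     * projectedColumns == null means that no projection was pushed down,
     * so the scan has to fetch every column of every row.
     */
    static long fieldsRead(int rowCount, int tableColumnCount, List<Integer> projectedColumns) {
        int columnsPerRow = projectedColumns == null
                ? tableColumnCount
                : projectedColumns.size();
        return (long) rowCount * columnsPerRow;
    }

    public static void main(String[] args) {
        int rows = 1_000;
        int columns = 20;

        // Aggregate -> TableScan: there is no Project to push down,
        // so the scan fetches full rows.
        long withoutProject = fieldsRead(rows, columns, null);

        // Aggregate -> Project(DUMMY=[0]) -> TableScan: the Project references
        // no input column, so the pushed-down column list is empty.
        long withDummyProject = fieldsRead(rows, columns, Collections.<Integer>emptyList());

        System.out.println("without Project:    " + withoutProject + " field values read");
        System.out.println("with dummy Project: " + withDummyProject + " field values read");
    }
}
```

The row count is the same in both cases; only the number of fields fetched per row differs, which is the part an explicit Project would let the scan avoid.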
Viliam

On Wed, 19 Jan 2022 at 03:08, Julian Hyde <[email protected]> wrote:

> As Stamatis said, we don’t have a consistent policy on zero-length
> records. But in that thread I logged
> https://issues.apache.org/jira/browse/CALCITE-4597 to clarify the
> situation. It would be great if someone worked on it.
>
> I see Viliam’s point that it makes physical optimization easier if there
> is an explicit Project telling you which columns (if any) need to be read
> from the TableScan. AggregateExtractProjectRule [1] may make it easier to
> accomplish this. But in the usual case, when this rule is not enabled, I
> don’t think we should create a Project.
>
> Julian
>
> [1]
> https://github.com/apache/calcite/blob/d70583c4a8013f878457f82df6dffddd71875900/core/src/main/java/org/apache/calcite/rel/rules/AggregateExtractProjectRule.java#L53
>
> On Jan 15, 2022, at 2:07 PM, Stamatis Zampetakis <[email protected]> wrote:
> >
> > Hi Viliam,
> >
> > I don't see a problem with the current plan. It seems correct and more
> > intuitive than the one with the DUMMY projection.
> >
> > LogicalAggregate(group=[{}], EXPR$0=[COUNT($0)])
> >   LogicalTableScan(table=foo)
> >
> > The code you cited in SqlToRelConverter seems an attempt to handle empty
> > records/tuples that we are not handling very well in general [1].
> > Doesn't seem related to performance as the use-case you mentioned.
> >
> > Best,
> > Stamatis
> >
> > [1] https://lists.apache.org/thread/dtsz159x4nk3l9b3topgykqpsml024tv
> >
> > On Fri, Jan 14, 2022 at 12:57 PM Viliam Durina <[email protected]> wrote:
> >
> >> I noticed these two pieces of code:
> >>
> >> 1. in SqlToRelConverter:
> >>
> >> if (preExprs.size() == 0) {
> >>   // Special case for COUNT(*), where we can end up with no inputs
> >>   // at all. The rest of the system doesn't like 0-tuples, so we
> >>   // select a dummy constant here.
> >>   final RexNode zero = rexBuilder.makeExactLiteral(BigDecimal.ZERO);
> >>   preExprs = ImmutableList.of(Pair.of(zero, null));
> >> }
> >>
> >> 2. in RelBuilder:
> >>
> >> // Some parts of the system can't handle rows with zero fields, so
> >> // pretend that one field is used.
> >> if (fieldsUsed.isEmpty()) {
> >>   r = ((Project) r).getInput();
> >> }
> >>
> >> They run in this order, and the second overrides the first. The end
> >> result is that for query `SELECT COUNT(*) FROM foo`, the result of
> >> sql-to-rel conversion is:
> >>
> >> LogicalAggregate(group=[{}], EXPR$0=[COUNT($0)])
> >>   LogicalTableScan(table=foo)
> >>
> >> instead of:
> >>
> >> LogicalAggregate(group=[{}], EXPR$0=[COUNT($0)])
> >>   LogicalProject(DUMMY=[0])
> >>     LogicalTableScan(table=foo)
> >>
> >> In our implementation we push the projection to table scan. Without the
> >> project, we fetch full rows, even though the aggregation uses no row.
> >>
> >> The code was introduced in
> >> https://issues.apache.org/jira/browse/CALCITE-3763, but maybe it was
> >> broken later.
> >>
> >> Do you think this is an issue?
> >>
> >> Viliam
> >>
> >> --
> >> This message contains confidential information and is intended only for
> >> the individuals named. If you are not the named addressee you should not
> >> disseminate, distribute or copy this e-mail. Please notify the sender
> >> immediately by e-mail if you have received this e-mail by mistake and
> >> delete this e-mail from your system. E-mail transmission cannot be
> >> guaranteed to be secure or error-free as information could be intercepted,
> >> corrupted, lost, destroyed, arrive late or incomplete, or contain viruses.
> >> The sender therefore does not accept liability for any errors or omissions
> >> in the contents of this message, which arise as a result of e-mail
> >> transmission. If verification is required, please request a hard-copy
> >> version. -Hazelcast
