paul-rogers commented on PR #13793:
URL: https://github.com/apache/druid/pull/13793#issuecomment-1435100849

   This discussion has expanded to cover two topics: how we handle aggregations 
in general on MSQ, and the original, specific topic for this PR. Issue #13816 
covers the broader topic.
   
   The issue here seems to be semantic. MSQ requires that every 
expression refer to an input column. However, `LATEST(foo)` has two references: 
one to an _input column_ (`foo`) and another implicit reference to the _output 
column_ `__time`. MSQ will have to special-case this code. Someone has to 
determine where the reference needs to be modified. At native query generation 
time in the planner? As part of the controller task?
   
   The workaround is for the planner to simply forbid the one-argument form of 
these functions in MSQ, forcing the user to provide an input column to use for 
the basis. However, if we do that, then, as noted above, that input column _is 
not_ available at compaction time, so we would only solve the "first pass" 
ingestion (MSQ) but fail the "second pass" (compaction).
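   
   To make the two passes concrete, here is a rough sketch of the reference check described above. It is not Druid code: `AggCall`, `unresolved_references`, the column `ts`, and the column sets are all hypothetical, and it assumes that compaction sees only the segment's columns (`foo` and `__time`).

```python
from dataclasses import dataclass

TIME_COLUMN = "__time"  # the time column named in the discussion above


@dataclass(frozen=True)
class AggCall:
    """Hypothetical model of an aggregate call such as LATEST(foo)."""
    name: str
    args: tuple

    def referenced_columns(self):
        # The one-argument form orders by the time column implicitly,
        # so it carries a second, hidden reference.
        refs = list(self.args)
        if self.name in ("LATEST", "EARLIEST") and len(self.args) == 1:
            refs.append(TIME_COLUMN)
        return refs


def unresolved_references(call, available_columns):
    """Return the references that do not resolve against the given columns."""
    return [c for c in call.referenced_columns() if c not in available_columns]


# Assumed column sets: ingestion reads raw input, compaction reads the segment.
ingest_columns = {"foo", "ts"}
compaction_columns = {"foo", TIME_COLUMN}

one_arg = AggCall("LATEST", ("foo",))       # implicit reference to __time
two_arg = AggCall("LATEST", ("foo", "ts"))  # explicit input column as the basis

print(unresolved_references(one_arg, ingest_columns))      # ['__time']
print(unresolved_references(two_arg, ingest_columns))      # []
print(unresolved_references(one_arg, compaction_columns))  # []
print(unresolved_references(two_arg, compaction_columns))  # ['ts']
```

   Under these assumptions, each form resolves in exactly one of the two passes, which is why forbidding the one-argument form alone is not enough.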
   
   It would be great for someone to do the analysis, then post a description of the 
problem, and propose a solution that works for both ingestion and compaction.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

