adarshsanjeev commented on issue #13824:
URL: https://github.com/apache/druid/issues/13824#issuecomment-1438347030
The cause of the problem appears to be in StringLastAggregatorFactory,
which is used for EARLIEST and LATEST. When the time column is not
known, it defaults to ColumnHolder.TIME_COLUMN_NAME ("__time"), on the
assumption that a time column is present. This works for non-MSQ
queries. For some MSQ queries that read from an external source, however,
the __time column is present in the output, but during aggregation it may
be referred to by a temporary name or a virtual column.
An ideal solution would be to handle reading from aliased columns directly.
This would help for queries like
```sql
TIME_PARSE("timestamp") AS "__time",
LATEST_BY("comment", "__time", 1024),
```
which do not work currently.
An alternate solution could be to handle EARLIEST and LATEST as a special
case for now, by changing the implicit reference to the __time column.
MSQTaskQueryMaker has the mappings needed to determine which column is
mapped to __time in the output: ColumnMappings contains the mapping of
__time to the intermediate column (MSQ sets CTX_TIME_COLUMN_NAME to this
in its query context), and the dimensions contain the mappings of virtual
columns. Changing the aggregator's reference from __time to the column
that is mapped to it in the final output produces the expected result for
the LATEST query above. This might need some additional changes to support
compaction, but compaction could handle this case if the reference is
changed back to the __time column during that process.
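To illustrate the alternate solution, here is a minimal, hypothetical sketch of the remapping step: before the aggregator is planned, an implicit "__time" reference is replaced with the intermediate column that ColumnMappings maps to __time in the output. The names `remapTimeColumn` and the `outputToQueryColumn` map are illustrative only, not actual Druid APIs; in this sketch `"v0"` stands in for the virtual column produced by `TIME_PARSE("timestamp")`.

```java
import java.util.HashMap;
import java.util.Map;

public class TimeColumnRemap {
    // Mirrors ColumnHolder.TIME_COLUMN_NAME in Druid.
    static final String TIME_COLUMN_NAME = "__time";

    // Hypothetical helper: given the output-name -> intermediate-column
    // mapping (analogous to what MSQTaskQueryMaker's ColumnMappings
    // provides), rewrite an implicit __time reference to the column that
    // actually carries the timestamp during aggregation.
    static String remapTimeColumn(String referencedColumn,
                                  Map<String, String> outputToQueryColumn) {
        if (TIME_COLUMN_NAME.equals(referencedColumn)) {
            // e.g. __time -> "v0", the virtual column behind
            // TIME_PARSE("timestamp") AS "__time"
            return outputToQueryColumn.getOrDefault(TIME_COLUMN_NAME, referencedColumn);
        }
        // Non-time references are left untouched.
        return referencedColumn;
    }

    public static void main(String[] args) {
        Map<String, String> mappings = new HashMap<>();
        mappings.put("__time", "v0");

        System.out.println(remapTimeColumn("__time", mappings));  // prints v0
        System.out.println(remapTimeColumn("comment", mappings)); // prints comment
    }
}
```

A real fix would apply this rewrite inside MSQTaskQueryMaker (where the mappings are already available) rather than in a standalone helper, and would need to undo the rename for compaction so that the aggregator once again sees __time.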
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]