gianm commented on PR #13793:
URL: https://github.com/apache/druid/pull/13793#issuecomment-1457996004
IMO, `LATEST` shouldn't work if the relation it's reading from doesn't have
a `__time` column. I'd expect a validation error if someone tries to write
`SELECT … LATEST(foo) … FROM <something>` and `<something>` doesn't have
`__time`.
> SQL semantics say that I can use SELECT columns in an aggregate. So, it
should be perfectly legal to say `LATEST_BY(foo, __time)` where `__time` has
been defined as `TIME_PARSE(timestamp) AS __time`.
Is this right? PostgreSQL doesn't seem to think so:
```
create table t (x bigint);
insert into t (x) values (1), (2), (3);
select
x + 1 as y,
sum(y) sum_y
from t
group by y;
```
yields:
```
Query Error: error: column "y" does not exist
```
Our SQL validator doesn't accept it either. Something similar to the first
query in Druid results in:
```
org.apache.calcite.runtime.CalciteContextException: From line 1, column 42
to line 1, column 44: Column 'y' not found in any table
```
And this query:
```
select
x + 1 as x,
sum(x)
from t
group by x;
```
yields this in both systems:
```
| x | sum |
| --- | --- |
| 4 | 3 |
| 3 | 2 |
| 2 | 1 |
```
i.e., the `x` in `sum(x)` refers to `t.x`, not `x + 1 as x`.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]