alamb commented on pull request #1448:
URL: https://github.com/apache/arrow-datafusion/pull/1448#issuecomment-993877514
> I have a perhaps naive question, shouldn't aliasing a query to an already
existing column just fail outright? Wouldn't that reasonably be something of a
"which column do you really want here?" confusion?
@hntd187 -- I agree that aliasing one column to another column is unlikely
to be actually useful 😆 it was just the minimal reproducer I could come up
with.
Another use of `Projection`, despite its slightly misleading name, is to
evaluate expressions as well as to control the names of the fields in the
output schema, which is what IOx was doing that triggered this bug.
Specifically, the plan in the IOx test was:
```
2021-12-13T14:38:49.774041Z DEBUG datafusion::execution::context: Logical plan:
Projection: #cpu.cpu, #cpu.host, CAST(#usage_system AS Int64) AS usage_system, CAST(#usage_user AS Int64) AS usage_user, #time
  Sort: #cpu.cpu ASC NULLS FIRST, #cpu.host ASC NULLS FIRST
    Projection: #cpu.cpu, #cpu.host, #usage_system, #usage_user, #time
      Aggregate: groupBy=[[#cpu.cpu, #cpu.host]], aggr=[[COUNT(#cpu.usage_system AS usage_system) AS usage_system, COUNT(#cpu.usage_user AS usage_user) AS usage_user, MAX(#cpu.time) AS time]]
        Filter: TimestampNanosecond(0) <= #cpu.time AND #cpu.time < TimestampNanosecond(2001)
          TableScan: cpu projection=None
```
This plan was built programmatically, but it is approximately what would come
out of this query:
```sql
SELECT
  cpu,
  host,
  count(usage_system) as usage_system,
  count(usage_user) as usage_user,
  max(time) as time
FROM
  cpu
WHERE 0 <= time AND time < 2001
GROUP BY cpu, host
```
In this case, the consumer of the output expects the columns to be named a
certain way. Without the alias, `count(usage_system)` results in a column named
something like `count(usage_system)` rather than the expected `usage_system`,
and thus the alias is added.
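
To make the naming point concrete, here is a minimal sketch that runs the same shape of query with and without the aliases and prints the output column names. It uses Python's built-in `sqlite3` purely as a stand-in engine (an assumption for illustration, not DataFusion or IOx), and the table contents and the `output_columns` helper are hypothetical. The exact unaliased names are engine-specific, so they are printed rather than asserted.

```python
import sqlite3

# Hypothetical stand-in for the `cpu` table; SQLite is used only for illustration,
# it is not the engine discussed in this PR.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE cpu (cpu TEXT, host TEXT, usage_system REAL, usage_user REAL, time INTEGER)"
)
conn.executemany(
    "INSERT INTO cpu VALUES (?, ?, ?, ?, ?)",
    [("cpu0", "a", 1.0, 2.0, 10), ("cpu0", "a", 3.0, 4.0, 20), ("cpu1", "b", 5.0, 6.0, 30)],
)


def output_columns(sql):
    """Return the output column names the engine reports for a query."""
    cursor = conn.execute(sql)
    return [column[0] for column in cursor.description]


# Aggregates aliased back to the names of the columns they were computed from.
aliased = output_columns(
    """
    SELECT cpu, host,
           count(usage_system) AS usage_system,
           count(usage_user) AS usage_user,
           max(time) AS time
    FROM cpu
    WHERE 0 <= time AND time < 2001
    GROUP BY cpu, host
    """
)

# Same query without aliases: the engine picks names derived from the expressions.
unaliased = output_columns(
    """
    SELECT cpu, host, count(usage_system), count(usage_user), max(time)
    FROM cpu
    WHERE 0 <= time AND time < 2001
    GROUP BY cpu, host
    """
)

print("aliased:  ", aliased)
print("unaliased:", unaliased)
```

A consumer that looks fields up by name (for example `usage_system`) only finds them in the aliased form, which is why the alias reuses an existing column name here.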