Vedin commented on PR #16933:
URL: https://github.com/apache/datafusion/pull/16933#issuecomment-3184394834
Hi @haohuaijin, I accidentally worked on the same feature. The approach is
the same. So, I don't want anyhow conflict with your contribution and will just
wait until your PR is merged. I just want to highlight 2 cases that probably
are not currently covered in your PR.
1. Aggregate functions with `QUALIFY`
```sql
CREATE TABLE qt (i INT, p VARCHAR, o INT) AS VALUES
(1, 'A', 1),
(2, 'A', 2),
(3, 'B', 1),
(4, 'B', 2);
SELECT p, SUM(o) AS s
FROM qt
GROUP BY p
QUALIFY RANK() OVER (ORDER BY s DESC) = 1
ORDER BY p;
```
2. Constant filter + QUALIFY
```sql
CREATE TABLE web_base_events_this_run (
domain_sessionid VARCHAR,
app_id VARCHAR,
page_view_id VARCHAR,
derived_tstamp TIMESTAMP,
dvce_created_tstamp TIMESTAMP,
event_id VARCHAR
) AS SELECT * FROM VALUES
('ds1', 'appA', NULL, '2025-01-01 10:00:00'::timestamp, '2025-01-01
10:05:00'::timestamp, 'e1'),
('ds1', 'appA', NULL, '2025-01-01 11:00:00'::timestamp, '2025-01-01
11:00:00'::timestamp, 'e2'),
('ds1', 'appA', 'pv', '2025-01-01 12:00:00'::timestamp, '2025-01-01
12:00:00'::timestamp, 'e3'),
('ds2', 'appB', NULL, '2025-01-01 09:00:00'::timestamp, '2025-01-01
09:10:00'::timestamp, 'e4'),
('ds2', 'appB', NULL, '2025-01-01 09:05:00'::timestamp, '2025-01-01
09:09:00'::timestamp, 'e5');
SELECT domain_sessionid, app_id
FROM web_base_events_this_run
WHERE page_view_id IS NULL
QUALIFY ROW_NUMBER() OVER (
PARTITION BY domain_sessionid
ORDER BY derived_tstamp, dvce_created_tstamp, event_id
) = 1
ORDER BY domain_sessionid;
```
I covered the first one by adding `qualify` expressions to
`aggr_expr_haystack`:
```rust
let aggr_expr_haystack = select_exprs
.iter()
.chain(having_expr_opt.iter())
.chain(qualify_expr_opt_pre_aggr.iter());
```
The second one required me to change the logic in
`common_subexpr_eliminate.rs`. You can check it out here
(https://github.com/Embucket/datafusion/pull/34/). Maybe you'll come up with a
better solution.
Hope this helps.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]