NGA-TRAN commented on issue #7664:
URL: https://github.com/apache/datafusion/issues/7664#issuecomment-2818401885
DataDog is interested in this feature. Our focus is on small sliding windows
of 3, 5, 7, and 9, which I believe are quite reasonable to support as they only
require a small buffer to compute and return the median. Below are the steps to
reproduce:
```SQL
-- create a table with 3 columns: timestamp, tags, value
CREATE TABLE test_table (
timestamp INT,
tags VARCHAR(255),
value DOUBLE
);
-- insert some data into the table
INSERT INTO test_table (timestamp, tags, value) VALUES
(1, 'tag1', 10),
(2, 'tag1', 20),
(3, 'tag1', 30),
(4, 'tag1', 40),
(5, 'tag1', 50),
(1, 'tag2', 60),
(2, 'tag2', 70),
(3, 'tag2', 80),
(4, 'tag2', 90),
(5, 'tag2', 100);
-- Not supported yet
SELECT
timestamp,
tags,
median(value) OVER (PARTITION BY tags ORDER BY timestamp ROWS BETWEEN 1
PRECEDING AND 1 FOLLOWING) AS value_median_3
FROM test_table
ORDER BY tags, timestamp;
-- This feature is not implemented: Aggregate can not be used as a sliding
accumulator because `retract_batch` is not implemented:
median(test_table.value) PARTITION BY [test_table.tags] ORDER BY
[test_table.timestamp ASC NULLS LAST] ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING
-- Other streamable aggregations such as max are supported
SELECT
timestamp,
tags,
value,
max(value) OVER (PARTITION BY tags ORDER BY timestamp ROWS BETWEEN 1
PRECEDING AND 1 FOLLOWING) AS value_max_3
FROM test_table
ORDER BY tags, timestamp;
+-----------+------+-------+-------------+
| timestamp | tags | value | value_max_3 |
+-----------+------+-------+-------------+
| 1 | tag1 | 10.0 | 20.0 |
| 2 | tag1 | 20.0 | 30.0 |
| 3 | tag1 | 30.0 | 40.0 |
| 4 | tag1 | 40.0 | 50.0 |
| 5 | tag1 | 50.0 | 50.0 |
| 1 | tag2 | 60.0 | 70.0 |
| 2 | tag2 | 70.0 | 80.0 |
| 3 | tag2 | 80.0 | 90.0 |
| 4 | tag2 | 90.0 | 100.0 |
| 5 | tag2 | 100.0 | 100.0 |
+-----------+------+-------+-------------+
10 row(s) fetched.
```
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]