This is an automated email from the ASF dual-hosted git repository.
github-bot pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/datafusion.git
The following commit(s) were added to refs/heads/main by this push:
new cb9ec127e6 slt: Add test for REE arrays in group by (#19763)
cb9ec127e6 is described below
commit cb9ec127e69b267e357eb56c3b712483d94fa5a4
Author: Frederic Branczyk <[email protected]>
AuthorDate: Mon Jan 12 18:57:56 2026 +0100
slt: Add test for REE arrays in group by (#19763)
## Which issue does this PR close?
Closes #16011. All of the functionality was already implemented, but in
https://github.com/apache/datafusion/pull/18981 @Jefffrey suggested
closing the issue only once an SLT covers it.
## Rationale for this change
Ensure that aggregating on REE arrays doesn't break end-to-end.
## What changes are included in this PR?
An SLT covering aggregation on REE arrays (GROUP BY and DISTINCT).
## Are these changes tested?
The whole change is a test.
## Are there any user-facing changes?
None, just ensuring it doesn't break in the future.
@alamb @Jefffrey
---
.../sqllogictest/test_files/run_end_encoded.slt | 57 ++++++++++++++++++++++
1 file changed, 57 insertions(+)
diff --git a/datafusion/sqllogictest/test_files/run_end_encoded.slt b/datafusion/sqllogictest/test_files/run_end_encoded.slt
new file mode 100644
index 0000000000..1f0a9b4eb3
--- /dev/null
+++ b/datafusion/sqllogictest/test_files/run_end_encoded.slt
@@ -0,0 +1,57 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements. See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership. The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied. See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# Tests for Run-End Encoded (REE) array support in aggregations
+# This tests that REE arrays can be used as GROUP BY keys (requires proper hashing support)
+
+# Create a table with REE-encoded sensor IDs using arrow_cast
+# First create primitive arrays, then cast to REE in a second step
+statement ok
+CREATE TABLE sensor_readings AS
+WITH raw_data AS (
+ SELECT * FROM (
+ VALUES
+ ('sensor_A', 22),
+ ('sensor_A', 23),
+ ('sensor_B', 20),
+ ('sensor_A', 24)
+ ) AS t(sensor_id, temperature)
+)
+SELECT
+ arrow_cast(sensor_id, 'RunEndEncoded("run_ends": non-null Int32, "values": Utf8)') AS sensor_id,
+ temperature
+FROM raw_data;
+
+# Test basic aggregation with REE column as GROUP BY key
+query ?RI rowsort
+SELECT
+ sensor_id,
+ AVG(temperature) AS avg_temp,
+ COUNT(*) AS reading_count
+FROM sensor_readings
+GROUP BY sensor_id;
+----
+sensor_A 23 3
+sensor_B 20 1
+
+# Test DISTINCT with REE column
+query ? rowsort
+SELECT DISTINCT sensor_id
+FROM sensor_readings;
+----
+sensor_A
+sensor_B
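
For readers unfamiliar with run-end encoding, here is a minimal Python sketch of the logical semantics the test relies on. It is not DataFusion's or arrow-rs's implementation. The helper names (`run_end_encode`, `run_end_decode`, `group_avg_count`) are hypothetical and exist only for illustration. The sketch assumes the Arrow convention that each run end is the exclusive logical end index of its run, and it groups by the decoded logical values, so two separate runs of `sensor_A` fall into the same group.

```python
from collections import defaultdict


def run_end_encode(items):
    """Encode a list as (run_ends, values).

    run_ends[i] is the exclusive logical end index of run i.
    """
    run_ends, values = [], []
    for i, item in enumerate(items):
        if values and values[-1] == item:
            # Same value as the current run: extend the run.
            run_ends[-1] = i + 1
        else:
            # New value: start a new run ending after this element.
            values.append(item)
            run_ends.append(i + 1)
    return run_ends, values


def run_end_decode(run_ends, values):
    """Expand (run_ends, values) back into the logical list."""
    out, start = [], 0
    for end, value in zip(run_ends, values):
        out.extend([value] * (end - start))
        start = end
    return out


def group_avg_count(run_ends, values, measurements):
    """Group measurements by the logical key; return {key: (avg, count)}."""
    acc = defaultdict(lambda: [0, 0])  # key -> [sum, count]
    for key, m in zip(run_end_decode(run_ends, values), measurements):
        acc[key][0] += m
        acc[key][1] += 1
    return {k: (total / n, n) for k, (total, n) in acc.items()}


# Same rows as the VALUES clause in the test above.
run_ends, values = run_end_encode(["sensor_A", "sensor_A", "sensor_B", "sensor_A"])
result = group_avg_count(run_ends, values, [22, 23, 20, 24])
```

With these inputs the two `sensor_A` runs collapse into one group, which is the same grouping the SLT expects from `GROUP BY sensor_id`.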