Re: [PR] fix q5 sql [sedona-spatialbench]

via GitHub Fri, 19 Sep 2025 21:36:21 -0700


james-willis commented on code in PR #24:
URL: 
https://github.com/apache/sedona-spatialbench/pull/24#discussion_r2365277246



##########
print_queries.py:
##########
@@ -404,23 +405,15 @@ def q5() -> str:
         return """
 -- Q5 (SedonaDB): In SedonaDB ST_Collect is an aggregate function so no need to use ARRAY_AGG first.
 -- ST_Collect does not accept an array as input so we cannot use the query with ARRAY_AGG.
-WITH per AS (
-   SELECT
-       c.c_custkey,
-       c.c_name AS customer_name,
-       DATE_TRUNC('month', t.t_pickuptime) AS pickup_month,
-       COUNT(t.t_tripkey) AS n_trips,
-       ST_Area(ST_ConvexHull(
-               ST_Collect(ST_GeomFromWKB(t.t_dropoffloc))
-               )) AS monthly_travel_hull_area
-   FROM trip t
-            JOIN customer c ON t.t_custkey = c.c_custkey
-   GROUP BY c.c_custkey, c.c_name, pickup_month
-)
-SELECT *
-FROM per
-WHERE n_trips > 5
-ORDER BY n_trips DESC, c_custkey ASC;
+SELECT
+    c.c_custkey, c.c_name AS customer_name,
+    DATE_TRUNC('month', t.t_pickuptime) AS pickup_month,
+    ST_Area(ST_ConvexHull(ST_Collect(ST_GeomFromWKB(t.t_dropoffloc)))) AS monthly_travel_hull_area,

Review Comment:
   I tried running the Sedona Spark benchmarks this afternoon. The ORDER BY clause of q5 was invalid Spark SQL.
   
   Once I had updated that query, I felt it was important to update all implementations of q5 to match as closely as possible.
   
   I believe it is essential for the queries to match as closely as possible across engines, so that the benchmark is both perceived as fair and actually is fair.
   
   Do you think we should make a different change in order to get a working Sedona Spark q5 implementation?



