pjain1 opened a new issue, #16728:
URL: https://github.com/apache/druid/issues/16728

   Consider a comparison query that compares a metric over one time range 
against the same metric over another time range, for example:
   ```
   SELECT 
     (COALESCE(base."page", comparison."page")) AS "page", 
     (ANY_VALUE(base."added")) AS "added",
     (ANY_VALUE(comparison."added")) AS "added_prev",
     (ANY_VALUE(base."added" - comparison."added")) AS "added_delta" 
   FROM 
     (SELECT "page", sum(added) AS "added" FROM "wikipedia" WHERE "__time" >= '2016-06-27T00:00:00.000Z' AND "__time" < '2016-06-27T01:00:00.000Z' GROUP BY 1 ORDER BY "added" DESC LIMIT 10) base
   LEFT OUTER JOIN 
     (SELECT "page", sum(added) AS "added" FROM "wikipedia" WHERE "__time" >= '2016-06-27T01:00:00.000Z' AND "__time" < '2016-06-27T02:00:00.000Z' GROUP BY 1) comparison
   ON 
     (base."page" IS NOT DISTINCT FROM comparison."page") 
   GROUP BY 1 
   ORDER BY "added" DESC 
   LIMIT 10
   ```
   Druid computes the base (left) and comparison (right) subqueries and 
joins them on the `page` dimension. However, if the dimension has high 
cardinality, the comparison subquery can fail with 
`ResourceLimitExceededException` because its result exceeds the 
`maxSubqueryBytes` limit. The limit cannot be pushed down to the comparison 
subquery, because its top N pages may differ from those of the base subquery. 
Filtering the comparison subquery with an `IN` subquery that selects the top N 
values from the base time range does not work either: if one of those values 
is `null`, the `<NULL> IN NULL` comparison does not evaluate to true, so the 
`null` row in the comparison subquery is dropped.
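
The `null` problem described above can be reproduced outside Druid. The following is a minimal sketch using Python's built-in `sqlite3` module, not Druid itself, so it only illustrates standard SQL three-valued logic. Druid's behaviour may differ depending on its null-handling configuration. The table names `comparison` and `base_top` and the sample rows are made up for illustration.

```python
import sqlite3

# In-memory SQLite database standing in for the two subquery results.
# This is NOT Druid; it only demonstrates standard SQL NULL semantics.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE comparison (page TEXT, added INTEGER)")
conn.executemany(
    "INSERT INTO comparison VALUES (?, ?)",
    [("A", 5), (None, 7), ("B", 3)],
)
# Hypothetical top-N pages from the base time range, including a NULL page.
conn.execute("CREATE TABLE base_top (page TEXT)")
conn.executemany("INSERT INTO base_top VALUES (?)", [("A",), (None,)])

# Plain IN filter: NULL IN (...) is unknown, so the NULL page is dropped.
in_rows = conn.execute(
    "SELECT page, added FROM comparison "
    "WHERE page IN (SELECT page FROM base_top) ORDER BY added"
).fetchall()

# Null-safe match: SQLite's IS operator treats NULL IS NULL as true,
# playing the role of IS NOT DISTINCT FROM in the original join condition.
null_safe_rows = conn.execute(
    "SELECT c.page, c.added FROM comparison c "
    "WHERE EXISTS (SELECT 1 FROM base_top b WHERE b.page IS c.page) "
    "ORDER BY c.added"
).fetchall()

print(in_rows)         # the NULL page is missing
print(null_safe_rows)  # the NULL page is kept
```

The `IN` filter keeps only page `A`, while the null-safe match keeps both `A` and the `null` page, which is the behaviour the join on `IS NOT DISTINCT FROM` needs.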
   
   Druid could, however, compute the base subquery first and push its join-key 
values into the comparison subquery to bound that subquery's result size. Could 
the planner perform this optimization, or are there other ideas?
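
To make the proposed two-phase evaluation concrete, here is a rough sketch in plain Python. It is not Druid planner code, and the function names `sum_by_page` and `compare_top_n` and the sample rows are hypothetical. It assumes each input row is a `(page, added)` pair already restricted to the relevant time range.

```python
from collections import defaultdict


def sum_by_page(rows, keep=None):
    """Sum `added` per page; if `keep` is given, aggregate only those pages."""
    totals = defaultdict(int)
    for page, added in rows:
        # `None in {None}` is True in Python, so this membership test is
        # null-safe, like IS NOT DISTINCT FROM in the original join.
        if keep is None or page in keep:
            totals[page] += added
    return dict(totals)


def compare_top_n(base_rows, comparison_rows, n):
    # Phase 1: compute the base subquery and take its top N pages.
    base = sum_by_page(base_rows)
    top = sorted(base.items(), key=lambda kv: kv[1], reverse=True)[:n]

    # Phase 2: push the join keys into the comparison subquery, so it
    # produces at most N groups regardless of the dimension's cardinality.
    keys = {page for page, _ in top}
    comparison = sum_by_page(comparison_rows, keep=keys)

    # Left outer join of base onto the bounded comparison result.
    result = []
    for page, added in top:
        prev = comparison.get(page)
        delta = None if prev is None else added - prev
        result.append((page, added, prev, delta))
    return result


base_rows = [("A", 10), (None, 8), ("B", 1), ("A", 2)]
comparison_rows = [("A", 4), (None, 3), ("C", 100), ("B", 9)]
result = compare_top_n(base_rows, comparison_rows, n=2)
print(result)
```

In this sketch the comparison side never materializes page `C` or `B`, even though `C` has the largest total, because only the base side's top N keys are aggregated.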



