pjain1 opened a new issue, #16896:
URL: https://github.com/apache/druid/issues/16896

   Consider a datasource having a multi valued dimension. If there is a full 
join query like below it works
   ```
   SELECT
     (COALESCE(base."channel_mvd", comparison."channel_mvd")) AS "channel",
     (ANY_VALUE(base."added" - comparison."added")) AS "added_delta"
   FROM (
     SELECT
       "channel_mvd",
       (SUM(added)) AS "added"
     FROM "wikipedia" CROSS JOIN UNNEST(MV_TO_ARRAY("channel")) as 
u("channel_mvd")
     WHERE ("__time" >= '2016-06-27T03:00:00.000Z' AND "__time" < 
'2016-06-27T06:00:00.000Z') 
     GROUP BY 1
   ) AS base
   FULL JOIN (
     SELECT
       "channel_mvd",
       (SUM(added)) AS "added"
     FROM "wikipedia" CROSS JOIN UNNEST(MV_TO_ARRAY("channel")) as 
u("channel_mvd")
     WHERE ("__time" >= '2016-06-27T00:00:00.000Z' AND "__time" < 
'2016-06-27T03:00:00.000Z') 
     GROUP BY 1
   ) AS comparison ON (base."channel_mvd" IS NOT DISTINCT FROM 
comparison."channel_mvd")
   GROUP BY 1
   ORDER BY "added_delta" DESC
   LIMIT 250
   ```
   added `CROSS JOIN UNNEST(MV_TO_ARRAY("channel")) as u("channel_mvd")` just 
to simulate column being mvd as channel is not. Anyways if any mvd is used 
directly in the join above then also it works.
   
   However, if we do the same thing with CTE then it fails (assume 
`channel_mvd` is mvd here) -
   ```
   WITH base AS (
     SELECT
       "channel_mvd",
       (SUM(added)) AS "added"
     FROM "wikipedia"
     WHERE ("__time" >= '2016-06-27T03:00:00.000Z' AND "__time" < 
'2016-06-27T06:00:00.000Z')
     GROUP BY 1
     ORDER BY "channel_mvd" DESC
     LIMIT 250
   )
   SELECT
     (COALESCE(base."channel_mvd", comparison."channel_mvd")) AS "channel",
     (ANY_VALUE(base."added" - comparison."added")) AS "added_delta"
   FROM base
   LEFT JOIN (
     SELECT
       "channel_mvd",
       (SUM(added)) AS "added"
     FROM "wikipedia"
     WHERE ("__time" >= '2016-06-27T00:00:00.000Z' AND "__time" < 
'2016-06-27T03:00:00.000Z') AND (
       ("channel_mvd") IN (SELECT "base"."channel_mvd" FROM "base")
     )
     GROUP BY 1
   ) AS comparison ON (base."channel_mvd" IS NOT DISTINCT FROM 
comparison."channel_mvd")
   GROUP BY 1
   ORDER BY "added_delta" DESC
   LIMIT 250
   ```
   It fails with ` QueryUnsupportedException{msg=Joining against a multi-value 
dimension is not supported.` and if I add `CROSS JOIN 
UNNEST(MV_TO_ARRAY("channel")) AS u ("channel_mvd")` then it fails with 
`Unhandled Query Planning Failure, see broker logs for details`
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to