Akshat-Jain commented on code in PR #16804:
URL: https://github.com/apache/druid/pull/16804#discussion_r1697097941
##########
extensions-core/multi-stage-query/src/main/java/org/apache/druid/msq/querykit/WindowOperatorQueryKit.java:
##########
@@ -91,7 +91,6 @@ public QueryDefinition makeQueryDefinition(
ShuffleSpec nextShuffleSpec =
findShuffleSpecForNextWindow(operatorList.get(0), maxWorkerCount);
Review Comment:
`findShuffleSpecForNextWindow()` returns null when we have empty over clause.
In my original commit on this PR, I changed the logic of
`findShuffleSpecForNextWindow()` to be:
```java
if (partition == null) {
return null;
}
if (partition.getPartitionColumns().isEmpty()) {
return MixShuffleSpec.instance();
}
```
But Adarsh pointed out that this would propagate MixShuffleSpec to the other
query kits which is inefficient.
For example, it would make `shuffleSpec=MixShuffleSpec` for multiple earlier
stages in queries where we have multiple Scan stages before the Window stage.
Sample query:
```sql
WITH "ext" AS (
select * from wikipedia left join drill_wikipedia on
wikipedia."countryName" = drill_wikipedia."countryName" where
wikipedia."countryName" in ('Guatemala') limit 5
)
SELECT
"__time",
"countryName",
"channel",
row_number() over (order by countryName, cityName, channel) as c1,
count(channel) over (partition by cityName order by countryName, cityName,
channel) as c2
FROM "ext"
group by __time, countryName, cityName, channel
order by countryName, cityName, channel
```
The above query gives:

Hence, it's better to not propagate MixShuffleSpec to the other query kits,
and just override it here only for the just-previous stage.
Hope that makes sense!
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]