Akshat-Jain commented on code in PR #16804:
URL: https://github.com/apache/druid/pull/16804#discussion_r1696608483


##########
extensions-core/multi-stage-query/src/main/java/org/apache/druid/msq/querykit/scan/ScanQueryKit.java:
##########
@@ -178,6 +183,11 @@ public QueryDefinition makeQueryDefinition(
       }
     }
 
+    // If window has an empty over, we want a single worker to process entire data for window function evaluation.
+    if (windowHasEmptyOver) {
+      scanShuffleSpec = MixShuffleSpec.instance();

Review Comment:
   > Also, I think we should fix the NPE in this PR if the fix is small and straightforward.
   
   I haven't dug into the ingestion issues yet, so I'm unsure of the scope there. Since it's unrelated to this PR, I'd prefer to take it up separately in the future.
   
   > You can also test this by running an async query, with rowsPerPage parameter, and see if the correct number of pages are created
   
   I tried this. Here is a summary of what I'm getting:
   
   Query:
   ```sql
   select
   "countryName",
   "cityName",
   row_number() over (partition by cityName order by countryName, cityName, channel) as c1,
   count(channel) over (partition by cityName order by countryName, cityName, channel) as c2,
   "channel"
   from "drill_wikipedia"
   where countryName in ('Guatemala')
   group by countryName, cityName, channel 
   order by countryName, cityName, channel
   ```
   
   ## Query context 1
   
   ```json
   {
     "maxNumTasks": 4,
     "enableWindowing": true,
     "rowsPerSegment": 1,
     "selectDestination": "durableStorage",
     "durableShuffleStorage": true
   }
   ```
   
   <img width="1637" alt="image" src="https://github.com/user-attachments/assets/c0a9a0e3-557f-48cd-9631-d48f4ff82cd3">
   
   
   ## Query context 2 (added `rowsPerPage: 1`)
   
   ```json
   {
     "maxNumTasks": 4,
     "enableWindowing": true,
     "rowsPerPage": 1,
     "rowsPerSegment": 1,
     "selectDestination": "durableStorage",
     "durableShuffleStorage": true
   }
   ```
   
   <img width="1632" alt="image" src="https://github.com/user-attachments/assets/4e6b0ea9-559f-4bf1-872f-7b4019a7b610">
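
For anyone who wants to repeat this check without the web console, here is a minimal sketch of the same comparison against the SQL statements API. It is not part of the PR. It assumes a Router at `http://localhost:8888` (a placeholder address), and the helper names `build_statement_payload`, `summarize_pages`, `submit`, and `get_status` are hypothetical. It also assumes the status response lists result pages under `result.pages`, each with a `numRows` field, so verify that against your Druid version.

```python
import json
import urllib.request

# Assumption: a Druid Router reachable at this address; adjust for your cluster.
DRUID_URL = "http://localhost:8888"


def build_statement_payload(sql, rows_per_page=None):
    """Build the JSON body for POST /druid/v2/sql/statements.

    The context mirrors the two query contexts used in this comment;
    rowsPerPage is only included when a value is passed.
    """
    context = {
        "executionMode": "ASYNC",
        "maxNumTasks": 4,
        "enableWindowing": True,
        "rowsPerSegment": 1,
        "selectDestination": "durableStorage",
        "durableShuffleStorage": True,
    }
    if rows_per_page is not None:
        context["rowsPerPage"] = rows_per_page
    return {"query": sql, "context": context}


def summarize_pages(status):
    """Return (page_count, total_rows) from a statement status response.

    Assumes pages are listed under status["result"]["pages"]; a response
    without that field (for example, a still-running query) yields (0, 0).
    """
    pages = (status.get("result") or {}).get("pages") or []
    return len(pages), sum(page.get("numRows", 0) for page in pages)


def submit(sql, rows_per_page=None):
    """Submit the async query and return its queryId."""
    body = json.dumps(build_statement_payload(sql, rows_per_page)).encode("utf-8")
    request = urllib.request.Request(
        DRUID_URL + "/druid/v2/sql/statements",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)["queryId"]


def get_status(query_id):
    """Fetch the current status document for a submitted statement."""
    url = DRUID_URL + "/druid/v2/sql/statements/" + query_id
    with urllib.request.urlopen(url) as response:
        return json.load(response)
```

To use it, call `submit(sql)` and `submit(sql, rows_per_page=1)` with the query above, poll `get_status(query_id)` until the reported state is terminal, then compare the output of `summarize_pages` for the two runs.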
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]