gianm commented on code in PR #13506:
URL: https://github.com/apache/druid/pull/13506#discussion_r1042611343


##########
docs/multi-stage-query/reference.md:
##########
@@ -198,13 +198,99 @@ The following table lists the context parameters for the 
MSQ task engine:
 | `maxNumTasks` | SELECT, INSERT, REPLACE<br /><br />The maximum total number 
of tasks to launch, including the controller task. The lowest possible value 
for this setting is 2: one controller and one worker. All tasks must be able to 
launch simultaneously. If they cannot, the query returns a `TaskStartTimeout` 
error code after approximately 10 minutes.<br /><br />May also be provided as 
`numTasks`. If both are present, `maxNumTasks` takes priority.| 2 |
 | `taskAssignment` | SELECT, INSERT, REPLACE<br /><br />Determines how many 
tasks to use. Possible values include: <ul><li>`max`: Uses as many tasks as 
possible, up to `maxNumTasks`.</li><li>`auto`: When file sizes can be 
determined through directory listing (for example: local files, S3, GCS, HDFS) 
uses as few tasks as possible without exceeding 10 GiB or 10,000 files per 
task, unless exceeding these limits is necessary to stay within `maxNumTasks`. 
When file sizes cannot be determined through directory listing (for example: 
http), behaves the same as `max`.</li></ul> | `max` |
 | `finalizeAggregations` | SELECT, INSERT, REPLACE<br /><br />Determines the 
type of aggregation to return. If true, Druid finalizes the results of complex 
aggregations that directly appear in query results. If false, Druid returns the 
aggregation's intermediate type rather than finalized type. This parameter is 
useful during ingestion, where it enables storing sketches directly in Druid 
tables. For more information about aggregations, see [SQL aggregation 
functions](../querying/sql-aggregations.md). | true |
+| `sqlJoinAlgorithm` | SELECT, INSERT, REPLACE<br /><br />Algorithm to use for 
JOIN. Use `broadcast` (the default) for broadcast hash join or `sortMerge` for 
sort-merge join. Affects all JOIN operations in the query. See [Joins](#joins) 
for more details. | `broadcast` |

Review Comment:
   Unfortunately, hints are not available in the version of Calcite we use. 
Newer versions have this: 
https://calcite.apache.org/docs/reference.html#sql-hints
   
   @abhishekagarwal87 found some reasons we can't upgrade Calcite right now, as 
referenced by this comment: 
https://github.com/apache/druid/pull/13153#issuecomment-1261204764. 
@abhishekagarwal87 would you mind writing up your notes as to what blocks an 
upgrade, and making an issue about that, titled something like `Upgrade Calcite 
past <whatever version introduced the blocking problem>`? That way, we have an 
issue we can refer to and use to discuss possible ways to fix the blockers.
   
   Once we sort that out, I'd like to deprecate the context parameter and move 
things to use hints (and eventually statistics as well) instead.
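
   For illustration only, here is a rough sketch of the two shapes discussed above: the context parameter as it exists in this PR, and what a hint-based query might look like later. The table and column names are made up, and the hint name `SORT_MERGE_JOIN` is purely hypothetical; only the `/*+ ... */` placement after `SELECT` follows the Calcite hint syntax linked above. Druid does not accept hints today.

```python
import json

# Values documented for `sqlJoinAlgorithm` in this PR.
VALID_JOIN_ALGORITHMS = ("broadcast", "sortMerge")


def build_msq_payload(sql, join_algorithm="broadcast"):
    """Build a request body that sets the join algorithm via query context."""
    if join_algorithm not in VALID_JOIN_ALGORITHMS:
        raise ValueError(f"unknown join algorithm: {join_algorithm!r}")
    return {"query": sql, "context": {"sqlJoinAlgorithm": join_algorithm}}


# Hypothetical tables, used only to have a JOIN to talk about.
sql = (
    "SELECT o.id, c.name "
    "FROM orders o JOIN customers c ON o.customer_id = c.id"
)

# Today: the algorithm is chosen per query, for every JOIN in it.
payload = build_msq_payload(sql, "sortMerge")
print(json.dumps(payload, indent=2))

# Possible future shape (NOT supported by Druid today): a Calcite-style
# hint placed right after SELECT. The hint name is an assumption.
hinted_sql = sql.replace("SELECT", "SELECT /*+ SORT_MERGE_JOIN */", 1)
print(hinted_sql)
```

   The point of the comparison is that the context parameter applies to the whole query, while a hint travels with the SQL text itself.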



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

