gianm commented on code in PR #19011:
URL: https://github.com/apache/druid/pull/19011#discussion_r2876782438
##########
server/src/main/java/org/apache/druid/server/QueryLifecycle.java:
##########
@@ -159,6 +164,8 @@ public <T> QueryResponse<T> runSimple(
{
initialize(query);
+ checkQueryBlocklist();
Review Comment:
`runSimple` is not always used, so the blocklist won't always be checked
if the call is put here. Some usages of `QueryLifecycle` call the lifecycle
methods individually.
IMO it's best to put the check in `doAuthorize`. That method is always
called; it's the only place the state can transition to `AUTHORIZED`.
It would also work to put it in `execute`.
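To illustrate the lifecycle argument, here is a hypothetical, much-simplified
sketch (not the real `QueryLifecycle`; names and state handling are invented
for illustration). Because `doAuthorize` is the only transition into the
`AUTHORIZED` state, a check placed there covers both `runSimple` callers and
callers that drive the lifecycle step by step:

```java
import java.util.List;

// Hypothetical, simplified sketch of the lifecycle. The point: a blocklist
// check inside doAuthorize() cannot be skipped, because every path that
// reaches execute() must first transition to AUTHORIZED via doAuthorize().
class LifecycleSketch
{
  private final List<String> blockedDataSources;
  private String state = "NEW";
  private String dataSource;

  LifecycleSketch(List<String> blockedDataSources)
  {
    this.blockedDataSources = blockedDataSources;
  }

  void initialize(String dataSource)
  {
    this.dataSource = dataSource;
    this.state = "INITIALIZED";
  }

  void doAuthorize()
  {
    // The check runs on every path, because every path authorizes here.
    if (blockedDataSources.contains(dataSource)) {
      throw new SecurityException("Query blocked for datasource: " + dataSource);
    }
    this.state = "AUTHORIZED";
  }

  void execute()
  {
    if (!"AUTHORIZED".equals(state)) {
      throw new IllegalStateException("execute() called before doAuthorize()");
    }
    this.state = "EXECUTING";
  }

  // Convenience path; other callers invoke the three methods individually.
  void runSimple(String dataSource)
  {
    initialize(dataSource);
    doAuthorize();
    execute();
  }
}
```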
##########
server/src/main/java/org/apache/druid/server/coordinator/CoordinatorDynamicConfig.java:
##########
@@ -351,6 +360,17 @@ public Set<String> getTurboLoadingNodes()
return turboLoadingNodes;
}
+ /**
+ * List of query blocklist rules for dynamically blocking queries on brokers.
+ *
+ * @return List of query blocklist rules
+ */
+ @JsonProperty
+ public List<QueryBlocklistRule> getQueryBlocklist()
Review Comment:
Please make this `@JsonInclude(NON_EMPTY)` so it is omitted from the
serialized form of the config when empty. Keeps things cleaner.
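Something like this, using the standard Jackson annotation:

```java
@JsonProperty
@JsonInclude(JsonInclude.Include.NON_EMPTY)
public List<QueryBlocklistRule> getQueryBlocklist()
{
  return queryBlocklist;
}
```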
##########
docs/configuration/index.md:
##########
@@ -838,6 +838,73 @@ The following table shows the dynamic configuration
properties for the Coordinat
|`replicateAfterLoadTimeout`|Boolean flag for whether or not additional
replication is needed for segments that have failed to load due to the expiry
of `druid.coordinator.load.timeout`. If this is set to true, the Coordinator
will attempt to replicate the failed segment on a different historical server.
This helps improve the segment availability if there are a few slow Historicals
in the cluster. However, the slow Historical may still load the segment later
and the Coordinator may issue drop requests if the segment is
over-replicated.|false|
|`turboLoadingNodes`| Experimental. List of Historical servers to place in
turbo loading mode. These servers use a larger thread-pool to load segments
faster but at the cost of query performance. For servers specified in
`turboLoadingNodes`, `druid.coordinator.loadqueuepeon.http.batchSize` is
ignored and the coordinator uses the value of the respective
`numLoadingThreads` instead.<br/>Please use this config with caution. All
servers should eventually be removed from this list once the segment loading on
the respective historicals is finished. |none|
|`cloneServers`| Experimental. Map from target Historical server to source
Historical server which should be cloned by the target. The target Historical
does not participate in regular segment assignment or balancing. Instead, the
Coordinator mirrors any segment assignment made to the source Historical onto
the target Historical, so that the target becomes an exact copy of the source.
Segments on the target Historical do not count towards replica counts either.
If the source disappears, the target remains in the last known state of the
source server until removed from the configuration. <br/>Use this config with
caution. All servers should eventually be removed from this list once the
desired state on the respective Historicals is achieved. |none|
+|`queryBlocklist`| List of rules to block queries based on datasource, query
type, and/or query context parameters. Each rule defines criteria that are
combined with AND logic. Blocked queries return an HTTP 403 error. See [Query
blocklist rules](#query-blocklist-rules) for details.|none|
+
+##### Query blocklist rules
+
+Query blocklist rules allow you to block specific queries based on datasource,
query type, and/or query context parameters. This feature is useful for
preventing expensive or problematic queries from impacting cluster performance.
Review Comment:
The documentation should mention that blocking is best-effort and may not
be applied in certain cases (for example, if the Broker has recently started
up or can't contact the Coordinator).
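It may also help the docs to show a concrete dynamic-config entry. Based on the
rule properties described in this PR, a blocklist could look like this
(datasource and rule names are hypothetical):

```json
{
  "queryBlocklist": [
    {
      "ruleName": "block-expensive-scans",
      "dataSources": ["wikipedia"],
      "queryTypes": ["scan"]
    }
  ]
}
```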
##########
docs/configuration/index.md:
##########
@@ -838,6 +838,73 @@ The following table shows the dynamic configuration
properties for the Coordinat
+
+Each rule in the `queryBlocklist` array is a JSON object with the following
properties:
+
+|Property|Description|Required|Default|
+|--------|-----------|--------|-------|
+|`ruleName`|Unique name identifying this blocklist rule. Used in error
messages when queries are blocked.|Yes|N/A|
+|`dataSources`|List of datasource names to match. A query matches if it
references any datasource in this list.|No|Matches all datasources|
+|`queryTypes`|List of query types to match (e.g., `scan`, `timeseries`,
`groupBy`, `topN`). A query matches if its type is in this list.|No|Matches all
query types|
Review Comment:
I don't see a test for this `queryTypes` feature. Please add one.
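A minimal sketch of the behavior such a test should pin down, using a stand-in
class (the real `QueryBlocklistRule` API in this PR may differ): a rule with
`queryTypes` set matches only those types, and a rule without it matches all
types.

```java
import java.util.Set;

// Hypothetical stand-in for QueryBlocklistRule's queryTypes matching;
// the real class in the PR may expose a different API.
class QueryTypeRuleSketch
{
  private final String ruleName;
  private final Set<String> queryTypes;  // empty set = match all query types

  QueryTypeRuleSketch(String ruleName, Set<String> queryTypes)
  {
    this.ruleName = ruleName;
    this.queryTypes = queryTypes;
  }

  String getRuleName()
  {
    return ruleName;
  }

  boolean matchesQueryType(String queryType)
  {
    return queryTypes.isEmpty() || queryTypes.contains(queryType);
  }
}
```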
##########
server/src/main/java/org/apache/druid/server/QueryLifecycle.java:
##########
@@ -121,6 +124,7 @@ public QueryLifecycle(
final DefaultQueryConfig defaultQueryConfig,
final AuthConfig authConfig,
final PolicyEnforcer policyEnforcer,
+ @Nullable final BrokerViewOfCoordinatorConfig
brokerViewOfCoordinatorConfig,
Review Comment:
Better to keep things targeted and only pass in the
`List<QueryBlocklistRule>`. It makes usage and testing easier.
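i.e., something along these lines (sketch; surrounding parameters abridged):

```java
public QueryLifecycle(
    final DefaultQueryConfig defaultQueryConfig,
    final AuthConfig authConfig,
    final PolicyEnforcer policyEnforcer,
    final List<QueryBlocklistRule> queryBlocklist  // narrow dependency, easy to stub in tests
    // ... remaining parameters unchanged
)
```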
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]