abhishekrb19 commented on code in PR #19486:
URL: https://github.com/apache/druid/pull/19486#discussion_r3275490130
##########
docs/querying/query-context-reference.md:
##########
@@ -71,7 +71,8 @@ Unless otherwise noted, the following parameters apply to all
query types, and t
|`setProcessingThreadNames`|`true`| Whether processing thread names will be
set to `queryType_dataSource_intervals` while processing a query. This aids in
interpreting thread dumps, and is on by default. Query overhead can be reduced
slightly by setting this to `false`. This has a tiny effect in most scenarios,
but can be meaningful in high-QPS, low-per-segment-processing-time scenarios. |
|`sqlPlannerBloat`|`1000`|Calcite parameter which controls whether to merge
two Project operators when inlining expressions causes complexity to increase.
Implemented as a workaround to exception `There are not enough rules to produce
a node with desired properties: convention=DRUID, sort=[]` thrown after
rejecting the merge of two projects.|
|`cloneQueryMode`|`excludeClones`| Indicates whether clone Historicals should
be queried by brokers. Clone servers are created by the `cloneServers`
Coordinator dynamic configuration. Possible values are `excludeClones`,
`includeClones` and `preferClones`. `excludeClones` means that clone
Historicals are not queried by the broker. `preferClones` indicates that when
given a choice between the clone Historical and the original Historical which
is being cloned, the broker chooses the clones. Historicals which are not
involved in the cloning process will still be queried. `includeClones` means
that broker queries any Historical without regarding clone status. This
parameter only affects native queries. MSQ does not query Historicals directly.|
-|`realtimeSegmentsOnly` |`false`| When set to true, only query realtime
segments. Historical segments are excluded. |
+|`realtimeSegmentsMode` |`include`| Controls which segments are queried,
classified by whether a historical replica exists. `include` queries all
segments. `exclusive` queries only segments served solely by realtime servers;
any segment with at least one historical replica (including segments
mid-handoff) is excluded. `exclude` is the inverse: segments served solely by
realtime servers are skipped, but segments mid-handoff that have both a
realtime and a historical replica are still queried. |
Review Comment:
>`exclude` is the inverse: segments served solely by realtime servers are
skipped, but segments mid-handoff that have both a realtime and a historical
replica are still queried.
I don't know what mid-handoff is. Wouldn't this be the case where the
segment is either fully loaded on the realtime server or the historical, it's
not and Broker would deterministically choose one or the other as it sees in
its server inventory?
##########
processing/src/main/java/org/apache/druid/query/QueryContexts.java:
##########
@@ -233,6 +244,39 @@ public String toString()
}
}
+ /**
+ * Classifies segments by whether a historical replica exists
+ * (see {@link
org.apache.druid.client.selector.ServerSelector#isRealtimeSegment()}: a segment
is
+ * "realtime" only when it has realtime servers and zero historical servers).
+ */
+ public enum RealtimeSegmentsMode
+ {
+ /** Include all segments (default). */
+ INCLUDE,
+ /** Include only segments served solely by realtime servers; any segment
with a historical replica
+ * (including segments mid-handoff) is excluded. */
+ EXCLUSIVE,
+ /** Exclude segments served solely by realtime servers; segments
mid-handoff with both realtime
+ * and historical replicas are still included. */
+ EXCLUDE;
+
+ @JsonCreator
+ public static RealtimeSegmentsMode fromString(String str)
+ {
+ if (str == null) {
+ return null;
+ }
+ return RealtimeSegmentsMode.valueOf(StringUtils.toUpperCase(str));
+ }
+
+ @Override
+ @JsonValue
+ public String toString()
+ {
+ return StringUtils.toLowerCase(name());
Review Comment:
For consistency with `fromString()`:
```suggestion
return StringUtils.toUpperCase(name());
```
##########
processing/src/main/java/org/apache/druid/query/QueryContext.java:
##########
@@ -781,8 +782,34 @@ public boolean isPrePlanned()
return getBoolean(QueryContexts.CTX_PREPLANNED,
QueryContexts.DEFAULT_PREPLANNED);
}
+ /**
+ * Returns the realtime segments mode for this query. If {@code
realtimeSegmentsMode} is absent
Review Comment:
Perhaps use `@link` for `realtimeSegmentsMode` variable instead of `@code`
in the javadocs so it's clear how these things are referenced & used by callers.
##########
processing/src/main/java/org/apache/druid/query/QueryContext.java:
##########
@@ -781,8 +782,34 @@ public boolean isPrePlanned()
return getBoolean(QueryContexts.CTX_PREPLANNED,
QueryContexts.DEFAULT_PREPLANNED);
}
+ /**
+ * Returns the realtime segments mode for this query. If {@code
realtimeSegmentsMode} is absent
+ * or null, falls back to the deprecated {@code realtimeSegmentsOnly}
boolean: {@code true} maps
+ * to {@link RealtimeSegmentsMode#EXCLUSIVE}; otherwise returns {@link
RealtimeSegmentsMode#INCLUDE}.
+ */
+ public RealtimeSegmentsMode getRealtimeSegmentsMode()
+ {
+ RealtimeSegmentsMode mode = getEnum(
+ QueryContexts.REALTIME_SEGMENTS_MODE,
+ RealtimeSegmentsMode.class,
+ null
+ );
Review Comment:
Should we just pass this default enum value here as-is?
```suggestion
RealtimeSegmentsMode mode = getEnum(
QueryContexts.REALTIME_SEGMENTS_MODE,
RealtimeSegmentsMode.class,
QueryContexts.DEFAULT_REALTIME_SEGMENTS_MODE
);
```
IMO the backward-compatible behavior here can be to error out if both the
new field and the old deprecated field are set, instead of introducing
precedence rules, since that can become confusing (and users would also have to
explicitly opt in to the new property anyway)
##########
processing/src/test/java/org/apache/druid/query/QueryContextTest.java:
##########
@@ -440,6 +441,66 @@ public void testIsRealtimeSegmentsOnly()
);
}
+ @Test
+ public void testGetRealtimeSegmentsMode()
+ {
+ assertEquals(
+ QueryContexts.RealtimeSegmentsMode.INCLUDE,
+ QueryContext.empty().getRealtimeSegmentsMode()
+ );
+ assertEquals(
+ QueryContexts.RealtimeSegmentsMode.EXCLUSIVE,
+ QueryContext.of(ImmutableMap.of(QueryContexts.REALTIME_SEGMENTS_MODE,
"exclusive"))
+ .getRealtimeSegmentsMode()
+ );
+ assertEquals(
+ QueryContexts.RealtimeSegmentsMode.EXCLUDE,
+ QueryContext.of(ImmutableMap.of(QueryContexts.REALTIME_SEGMENTS_MODE,
"exclude"))
+ .getRealtimeSegmentsMode()
+ );
+ assertEquals(
+ QueryContexts.RealtimeSegmentsMode.INCLUDE,
+ QueryContext.of(ImmutableMap.of(QueryContexts.REALTIME_SEGMENTS_MODE,
"include"))
+ .getRealtimeSegmentsMode()
+ );
+ }
+
+ @Test
+ public void testGetRealtimeSegmentsModeBackwardCompat()
+ {
+ // realtimeSegmentsOnly=true maps to EXCLUSIVE
+ assertEquals(
+ QueryContexts.RealtimeSegmentsMode.EXCLUSIVE,
+ QueryContext.of(ImmutableMap.of(QueryContexts.REALTIME_SEGMENTS_ONLY,
true))
+ .getRealtimeSegmentsMode()
+ );
+ // realtimeSegmentsOnly=false maps to INCLUDE (default)
+ assertEquals(
+ QueryContexts.RealtimeSegmentsMode.INCLUDE,
+ QueryContext.of(ImmutableMap.of(QueryContexts.REALTIME_SEGMENTS_ONLY,
false))
+ .getRealtimeSegmentsMode()
+ );
+ // realtimeSegmentsMode takes precedence over realtimeSegmentsOnly
+ assertEquals(
+ QueryContexts.RealtimeSegmentsMode.EXCLUDE,
+ QueryContext.of(ImmutableMap.of(
+ QueryContexts.REALTIME_SEGMENTS_ONLY, true,
+ QueryContexts.REALTIME_SEGMENTS_MODE, "exclude"
+ )).getRealtimeSegmentsMode()
+ );
+ }
+
+ @Test
+ public void testGetRealtimeSegmentsModeInvalidValue()
+ {
+ BadQueryContextException e = assertThrows(
+ BadQueryContextException.class,
+ () ->
QueryContext.of(ImmutableMap.of(QueryContexts.REALTIME_SEGMENTS_MODE,
"badvalue"))
+ .getRealtimeSegmentsMode()
+ );
+ assertTrue(e.getMessage().contains("realtimeSegmentsMode"));
Review Comment:
nit: can we assert the exception msg beyond `contains()` (or catch & throw a
`DruidException` if its appropriate)
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]