Copilot commented on code in PR #17532:
URL: https://github.com/apache/pinot/pull/17532#discussion_r2784428415
##########
pinot-broker/src/main/java/org/apache/pinot/broker/routing/manager/BaseBrokerRoutingManager.java:
##########
@@ -448,14 +464,22 @@ private void processInstanceConfigChangeInternal() {
// Update routing entry for all tables
for (RoutingEntry routingEntry : _routingEntryMap.values()) {
+ String tableNameWithType = routingEntry.getTableNameWithType();
try {
- Object tableLock =
getRoutingTableBuildLock(routingEntry.getTableNameWithType());
+ Object tableLock = getRoutingTableBuildLock(tableNameWithType);
synchronized (tableLock) {
routingEntry.onInstancesChange(_routableServers, changedServers);
+
+ // Handle sampler routing entries
+ if (_samplerRoutingEntryMap.containsKey(tableNameWithType)) {
+ for (RoutingEntry samplerRoutingEntry :
_samplerRoutingEntryMap.get(tableNameWithType).values()) {
Review Comment:
`containsKey()` followed by `get()` on a `ConcurrentHashMap` is not atomic;
if the entry is removed between calls,
`_samplerRoutingEntryMap.get(tableNameWithType)` can be null and `values()`
will throw NPE. Safer pattern: fetch once into a local variable (e.g.,
`Map<String, RoutingEntry> samplerEntries =
_samplerRoutingEntryMap.get(tableNameWithType);`) and null-check before
iterating. (This same pattern repeats in exclude/include paths as well.)
```suggestion
Map<String, RoutingEntry> samplerRoutingEntries =
_samplerRoutingEntryMap.get(tableNameWithType);
if (samplerRoutingEntries != null) {
for (RoutingEntry samplerRoutingEntry :
samplerRoutingEntries.values()) {
```
##########
pinot-spi/src/test/java/org/apache/pinot/spi/config/table/TableConfigTest.java:
##########
@@ -104,4 +106,15 @@ public void testCopyConstructor() {
assertEquals(config, copy);
assertEquals(config.toJsonString(), copy.toJsonString());
}
+
+ @Test
+ public void testDuplicateTableSamplerNamesRejected() {
+ TableConfig config = new
TableConfigBuilder(TableType.OFFLINE).setTableName(RAW_TABLE_NAME).build();
+ List<TableSamplerConfig> duplicateSamplers = List.of(
+ new TableSamplerConfig("sampler1", "firstN", Map.of("numSegments",
"10")),
+ new TableSamplerConfig("sampler1", "firstN", Map.of("numSegments",
"1")));
+ IllegalArgumentException e =
Assert.expectThrows(IllegalArgumentException.class,
+ () -> config.setTableSamplers(duplicateSamplers));
+ assertTrue(e.getMessage().contains("Duplicate table sampler name:
sampler1"));
+ }
Review Comment:
The duplicate-name test only covers exact string matches. If sampler names
are intended to be user-facing and stable (especially given the query option is
free-form text), add coverage for normalization behaviors you support (e.g.,
leading/trailing whitespace and case-insensitive duplicates like `sampler1` vs
` Sampler1 `). This helps prevent regressions when making sampler lookup
consistent between table config and query options.
##########
pinot-broker/src/main/java/org/apache/pinot/broker/routing/manager/BaseBrokerRoutingManager.java:
##########
@@ -1086,11 +1188,43 @@ private Map<ServerInstance, SegmentsToQuery>
getServerInstanceToSegmentsMap(Stri
@Nullable
@Override
public List<String> getSegments(BrokerRequest brokerRequest) {
+ return getSegments(brokerRequest, extractSamplerName(brokerRequest));
+ }
+
+ @Nullable
+ @Override
+ public List<String> getSegments(BrokerRequest brokerRequest, @Nullable
String samplerName) {
String tableNameWithType = brokerRequest.getQuerySource().getTableName();
- RoutingEntry routingEntry = _routingEntryMap.get(tableNameWithType);
+ RoutingEntry routingEntry = getRoutingEntry(tableNameWithType,
samplerName);
return routingEntry != null ? routingEntry.getSegments(brokerRequest) :
null;
}
+ @Nullable
+ private RoutingEntry getRoutingEntry(String tableNameWithType, String
samplerName) {
+ if (StringUtils.isNotBlank(samplerName)) {
+ Map<String, RoutingEntry> samplerEntries =
_samplerRoutingEntryMap.get(tableNameWithType);
+ RoutingEntry samplerEntry = samplerEntries != null ?
samplerEntries.get(samplerName) : null;
+ if (samplerEntry != null) {
+ return samplerEntry;
+ }
+ LOGGER.warn("Requested sampler '{}' not found for table '{}'; falling
back to default routing entry",
Review Comment:
This `WARN` is in the per-query routing path and can become very noisy (and
expensive) if clients send an unknown `tableSampler` option or have mismatched
casing/whitespace. Consider downgrading to `DEBUG`, adding rate-limiting, or
logging once per (table, samplerName) with a cache to avoid log amplification
on hot paths.
```suggestion
LOGGER.debug("Requested sampler '{}' not found for table '{}'; falling
back to default routing entry",
```
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]