Jackie-Jiang commented on code in PR #12276:
URL: https://github.com/apache/pinot/pull/12276#discussion_r1458364517
##########
pinot-broker/src/main/java/org/apache/pinot/broker/routing/instanceselector/BalancedInstanceSelector.java:
##########
@@ -50,8 +51,9 @@
public class BalancedInstanceSelector extends BaseInstanceSelector {
public BalancedInstanceSelector(String tableNameWithType,
ZkHelixPropertyStore<ZNRecord> propertyStore,
- BrokerMetrics brokerMetrics, @Nullable AdaptiveServerSelector
adaptiveServerSelector, Clock clock) {
- super(tableNameWithType, propertyStore, brokerMetrics,
adaptiveServerSelector, clock);
+ BrokerMetrics brokerMetrics, @Nullable AdaptiveServerSelector
adaptiveServerSelector, Clock clock,
+ boolean useStickyRouting) {
Review Comment:
IMO sticky routing is not very clear, `useSingleReplica` or
`useFixedReplica` might be more explicit. This is personal preference, so open
to other names
##########
pinot-broker/src/main/java/org/apache/pinot/broker/routing/instanceselector/InstanceSelectorFactory.java:
##########
@@ -40,44 +43,48 @@ private InstanceSelectorFactory() {
public static final String LEGACY_REPLICA_GROUP_OFFLINE_ROUTING =
"PartitionAwareOffline";
public static final String LEGACY_REPLICA_GROUP_REALTIME_ROUTING =
"PartitionAwareRealtime";
+ @VisibleForTesting
public static InstanceSelector getInstanceSelector(TableConfig tableConfig,
- ZkHelixPropertyStore<ZNRecord> propertyStore, BrokerMetrics
brokerMetrics) {
- return getInstanceSelector(tableConfig, propertyStore, brokerMetrics,
null, Clock.systemUTC());
+ ZkHelixPropertyStore<ZNRecord> propertyStore, BrokerMetrics
brokerMetrics, PinotConfiguration config) {
Review Comment:
(minor) Suggest renaming to `brokerConf` to make it clear that it is top
level broker config
##########
pinot-broker/src/main/java/org/apache/pinot/broker/routing/instanceselector/SegmentInstanceCandidate.java:
##########
@@ -41,4 +41,9 @@ public String getInstance() {
public boolean isOnline() {
return _online;
}
+
+ @Override
+ public int compareTo(SegmentInstanceCandidate o) {
Review Comment:
Is this a bugfix?
##########
pinot-common/src/main/java/org/apache/pinot/common/utils/config/QueryOptionsUtils.java:
##########
@@ -145,6 +145,11 @@ public static boolean isSkipScanFilterReorder(Map<String,
String> queryOptions)
return
"false".equalsIgnoreCase(queryOptions.get(QueryOptionKey.USE_SCAN_REORDER_OPTIMIZATION));
}
+ @Nullable
+ public static boolean isUseStickyRouting(Map<String, String> queryOptions) {
Review Comment:
We should return `Boolean` here, and differentiate not set and explicit false
##########
pinot-broker/src/main/java/org/apache/pinot/broker/routing/instanceselector/BalancedInstanceSelector.java:
##########
@@ -89,7 +91,13 @@ Pair<Map<String, String>, Map<String, String>>
select(List<String> segments, int
if (candidates == null) {
continue;
}
- int selectedIdx = requestId++ % candidates.size();
+ int selectedIdx;
+ if (_useStickyRouting ||
QueryOptionsUtils.isUseStickyRouting(queryOptions)) {
+ // candidates array is always sorted
+ selectedIdx = Math.abs(_tableNameWithType.hashCode() %
candidates.size());
Review Comment:
Consider using raw table name so that we can pick the same replica for
hybrid use cases for future proof. Also we can cache the hash (positive value)
to avoid calculating hash per query
##########
pinot-broker/src/main/java/org/apache/pinot/broker/routing/instanceselector/MultiStageReplicaGroupSelector.java:
##########
@@ -86,7 +87,16 @@ Pair<Map<String, String>, Map<String, String>>
select(List<String> segments, int
SegmentStates segmentStates, Map<String, String> queryOptions) {
// Create a copy of InstancePartitions to avoid race-condition with
event-listeners above.
InstancePartitions instancePartitions = _instancePartitions;
- int replicaGroupSelected = requestId %
instancePartitions.getNumReplicaGroups();
+ int replicaGroupSelected;
+ if (_useStickyRouting) {
+ // When using sticky routing, we want to iterate over the
instancePartitions in order to ensure deterministic
+ // selection of replica group across queries i.e. same instance replica
group id is picked each time.
+ // Since the instances within a selected replica group are iterated in
order, the assignment within a selected
+ // replica group is guaranteed to be deterministic.
+ replicaGroupSelected = 0;
Review Comment:
We should still start with the table hash to avoid always selecting the
servers from the same replica. If we always pick replica 0, it will likely
cause hotspot servers because servers with smaller id will always be picked
##########
pinot-broker/src/main/java/org/apache/pinot/broker/routing/instanceselector/BalancedInstanceSelector.java:
##########
@@ -89,7 +91,13 @@ Pair<Map<String, String>, Map<String, String>>
select(List<String> segments, int
if (candidates == null) {
continue;
}
- int selectedIdx = requestId++ % candidates.size();
+ int selectedIdx;
+ if (_useStickyRouting ||
QueryOptionsUtils.isUseStickyRouting(queryOptions)) {
Review Comment:
We should allow query option to override this. Basically we can have broker
config -> table config -> query options. Each level can override previous level
if explicitly configured.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]