suneet-s commented on a change in pull request #10359:
URL: https://github.com/apache/druid/pull/10359#discussion_r502021990



##########
File path: indexing-service/src/main/java/org/apache/druid/indexing/worker/shuffle/ShuffleMetrics.java
##########
@@ -0,0 +1,120 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.druid.indexing.worker.shuffle;
+
+import com.google.common.annotations.VisibleForTesting;
+import com.google.errorprone.annotations.concurrent.GuardedBy;
+
+import java.util.Collections;
+import java.util.HashMap;
+import java.util.Map;
+
+/**
+ * Shuffle metrics for middleManagers and indexers. This class is thread-safe because shuffle can be performed by
+ * multiple HTTP threads while a monitoring thread periodically emits the snapshot of metrics.
+ *
+ * @see ShuffleResource
+ * @see org.apache.druid.java.util.metrics.MonitorScheduler
+ */
+public class ShuffleMetrics
+{
+  /**
+   * This lock is used to synchronize accesses to the reference to {@link #datasourceMetrics} and the
+   * {@link PerDatasourceShuffleMetrics} values of the map. This means,
+   *
+   * - Any updates on PerDatasourceShuffleMetrics in the map (and thus its key as well) should be synchronized
+   * under this lock.
+   * - Any updates on the reference to datasourceMetrics should be synchronized under this lock.
+   */
+  private final Object lock = new Object();
+
+  /**
+   * A map of (datasource name) -> {@link PerDatasourceShuffleMetrics}. This map is replaced with an empty map
+   * whenever a snapshot is taken since the map can keep growing over time otherwise. For concurrent access pattern,
+   * see {@link #shuffleRequested} and {@link #snapshotAndReset()}.
+   */
+  @GuardedBy("lock")
+  private Map<String, PerDatasourceShuffleMetrics> datasourceMetrics = new HashMap<>();
+
+  /**
+   * This method is called whenever a new shuffle is requested. Multiple tasks can request shuffle at the same time,
+   * while the monitoring thread takes a snapshot of the metrics. There is a happens-before relationship between
+   * shuffleRequested and {@link #snapshotAndReset()}.
+   */
+  public void shuffleRequested(String supervisorTaskId, long fileLength)
+  {
+    synchronized (lock) {

Review comment:
       I like this approach a lot 🤘 
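
The quoted hunk stops at the opening of the synchronized block, so the approach being praised is only described in the Javadoc. A minimal sketch of that pattern follows, assuming a single lock that guards both the map reference and its values, and a snapshot that swaps in a fresh map. `ShuffleMetricsSketch`, `Counts`, and the plain datasource-name key are hypothetical names chosen for illustration. The real `shuffleRequested` takes a `supervisorTaskId`, and how it derives the datasource is not shown in this hunk.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

class ShuffleMetricsSketch
{
  /** Hypothetical stand-in for PerDatasourceShuffleMetrics, whose body is not in the quoted hunk. */
  static class Counts
  {
    private long bytes;
    private int requests;

    // Only called while the owning lock is held, so plain fields are enough.
    void accumulate(long fileLength)
    {
      bytes += fileLength;
      requests++;
    }

    long getBytes()
    {
      return bytes;
    }

    int getRequests()
    {
      return requests;
    }
  }

  private final Object lock = new Object();

  // Guarded by lock: both the reference and the values stored in the map.
  private Map<String, Counts> datasourceMetrics = new HashMap<>();

  /** Assumption: keyed directly by datasource name rather than by supervisor task ID. */
  void shuffleRequested(String datasource, long fileLength)
  {
    synchronized (lock) {
      datasourceMetrics.computeIfAbsent(datasource, k -> new Counts()).accumulate(fileLength);
    }
  }

  /** Hands the current map to the caller and starts a new, empty one. */
  Map<String, Counts> snapshotAndReset()
  {
    synchronized (lock) {
      Map<String, Counts> snapshot = Collections.unmodifiableMap(datasourceMetrics);
      datasourceMetrics = new HashMap<>();
      return snapshot;
    }
  }
}
```

Because the old map is no longer reachable from the field once the lock is released, the monitoring thread can read the returned snapshot without further synchronization, and the live map cannot grow without bound between emissions.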




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]


