isaacreath commented on code in PR #2058:
URL: https://github.com/apache/cassandra/pull/2058#discussion_r1061709122


##########
src/java/org/apache/cassandra/streaming/StreamSession.java:
##########
@@ -1027,9 +1022,30 @@ public void receive(IncomingStreamMessage message)
     public void progress(String filename, ProgressInfo.Direction direction, long bytes, long total)
     {
         ProgressInfo progress = new ProgressInfo(peer, index, filename, direction, bytes, total);
+        updateMetricsOnProgress(progress);
         streamResult.handleProgress(progress);
     }
 
+    private void updateMetricsOnProgress(ProgressInfo progress)
+    {
+        ProgressInfo.Direction direction = progress.direction;
+        long lastSeenBytesStreamedForProgress = lastSeenBytesStreamed.getOrDefault(progress, 0L);
+        long newBytesStreamed = progress.currentBytes - lastSeenBytesStreamedForProgress;
+        if (direction == ProgressInfo.Direction.OUT)
+        {
+            StreamingMetrics.totalOutgoingBytes.inc(newBytesStreamed);
+            metrics.outgoingBytes.inc(newBytesStreamed);
+        }
+
+        else if (direction == ProgressInfo.Direction.IN)
+        {
+            StreamingMetrics.totalIncomingBytes.inc(newBytesStreamed);
+            metrics.incomingBytes.inc(newBytesStreamed);
+        }
+
+        lastSeenBytesStreamed.put(progress, lastSeenBytesStreamedForProgress + newBytesStreamed);

Review Comment:
   In practice we will be adding a new progress object to the map for every file streamed in each direction by this `StreamSession` (see: [ProgressInfo::hashCode](https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/streaming/ProgressInfo.java#L98)). In the worst case, the total number of entries is equal to the number of files on the local node * the number of files on the remote node for each `StreamSession`.
   
   A simple optimization I can add here would be to remove the progress object from the map once that file has finished streaming, that is, when `progress.currentBytes == progress.totalBytes`. This would clean up entries as each file completes and would probably improve memory utilization in the average case. It wouldn't help in the worst case, where all files complete streaming at the same time.
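
   A minimal sketch of the cleanup described above, not the actual patch. `ProgressDeltaTracker`, its string key, and the single `totalBytes` counter are hypothetical stand-ins for `StreamSession`, the `ProgressInfo` map key, and the metrics counters; it also assumes single-threaded access, which is why a plain `HashMap` is used.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the per-session bookkeeping; not Cassandra's actual classes.
class ProgressDeltaTracker
{
    // Last byte count seen per in-flight file, keyed by an illustrative "direction:filename" string.
    private final Map<String, Long> lastSeenBytesStreamed = new HashMap<>();

    // Stand-in for the incoming/outgoing metrics counters.
    private long totalBytes = 0;

    // Records one progress event and returns the number of newly streamed bytes.
    long onProgress(String key, long currentBytes, long fileTotalBytes)
    {
        long lastSeen = lastSeenBytesStreamed.getOrDefault(key, 0L);
        long newBytes = currentBytes - lastSeen;
        totalBytes += newBytes;

        if (currentBytes == fileTotalBytes)
            lastSeenBytesStreamed.remove(key);            // file finished: drop its entry
        else
            lastSeenBytesStreamed.put(key, currentBytes); // still in flight: remember progress

        return newBytes;
    }

    long totalBytes()
    {
        return totalBytes;
    }

    int trackedEntries()
    {
        return lastSeenBytesStreamed.size();
    }
}
```

   One caveat with this shape: if a progress event for a file were delivered again after its entry had been removed, the lookup would fall back to 0 and the bytes would be counted twice, so this relies on completion being reported only once per file.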



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

