[jira] [Commented] (CASSANDRA-16325) Update streaming metrics incrementally

2023-03-24 Thread Paulo Motta (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-16325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17704701#comment-17704701
 ] 

Paulo Motta commented on CASSANDRA-16325:
-

Now that CASSANDRA-18304 is committed I rebased this patch and resubmitted CI: 
[!https://ci-cassandra.apache.org/job/Cassandra-devbranch/2392/badge/icon!|https://ci-cassandra.apache.org/blue/organizations/jenkins/Cassandra-devbranch/detail/Cassandra-devbranch/2392/pipeline]

Prepared patch is on [this 
branch|https://github.com/pauloricardomg/cassandra/tree/feature/CASSANDRA-16325].

Also includes a 
[commit|https://github.com/pauloricardomg/cassandra/commit/43e9f7c7932dbdb6b654be59fecd8300ccdbf2b]
 to log node configuration on dtests, which allowed to verify that the 
configuration is being properly set.

> Update streaming metrics incrementally
> --
>
> Key: CASSANDRA-16325
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16325
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Observability/Metrics
>Reporter: Paulo Motta
>Assignee: Isaac Reath
>Priority: Normal
>  Labels: lhf
> Fix For: 5.0
>
>  Time Spent: 10h 50m
>  Remaining Estimate: 0h
>
> Currently the inbound and outbound streamed bytes metrics are incremented 
> after each file is streamed, what doesn't represent the current number of 
> bytes streamed since it can take a long time for a large file to be streamed. 
> We should update the metric incrementally as data is streamed.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-16325) Update streaming metrics incrementally

2023-01-26 Thread Paulo Motta (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-16325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17681030#comment-17681030
 ] 

Paulo Motta commented on CASSANDRA-16325:
-

build failed due to CASSANDRA-18197

Resubmitted CI: 
!https://ci-cassandra.apache.org/job/Cassandra-devbranch/2234/badge/icon!

> Update streaming metrics incrementally
> --
>
> Key: CASSANDRA-16325
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16325
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Observability/Metrics
>Reporter: Paulo Motta
>Assignee: Isaac Reath
>Priority: Normal
>  Labels: lhf
> Fix For: 4.2
>
>  Time Spent: 10h 50m
>  Remaining Estimate: 0h
>
> Currently the inbound and outbound streamed bytes metrics are incremented 
> after each file is streamed, what doesn't represent the current number of 
> bytes streamed since it can take a long time for a large file to be streamed. 
> We should update the metric incrementally as data is streamed.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-16325) Update streaming metrics incrementally

2023-01-25 Thread Paulo Motta (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-16325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17680808#comment-17680808
 ] 

Paulo Motta commented on CASSANDRA-16325:
-

LGTM, nice work Isaac!

Prepared patch for commit 
[here|https://github.com/pauloricardomg/cassandra/commit/49946329d05b736943678345c2e110ce315940bc]
 and submitted a final round of CI:
 * [https://ci-cassandra.apache.org/view/patches/job/Cassandra-devbranch/2228/] 
(queued)

ok to merge if CI looks good [~smiklosovic]?

> Update streaming metrics incrementally
> --
>
> Key: CASSANDRA-16325
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16325
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Observability/Metrics
>Reporter: Paulo Motta
>Assignee: Isaac Reath
>Priority: Normal
>  Labels: lhf
> Fix For: 4.2
>
>  Time Spent: 10h 50m
>  Remaining Estimate: 0h
>
> Currently the inbound and outbound streamed bytes metrics are incremented 
> after each file is streamed, what doesn't represent the current number of 
> bytes streamed since it can take a long time for a large file to be streamed. 
> We should update the metric incrementally as data is streamed.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-16325) Update streaming metrics incrementally

2023-01-24 Thread David Capwell (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-16325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17680378#comment-17680378
 ] 

David Capwell commented on CASSANDRA-16325:
---

Looked at the change and LGTM, looks really good now!

> Update streaming metrics incrementally
> --
>
> Key: CASSANDRA-16325
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16325
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Observability/Metrics
>Reporter: Paulo Motta
>Assignee: Isaac Reath
>Priority: Normal
>  Labels: lhf
> Fix For: 4.2
>
>  Time Spent: 10h 50m
>  Remaining Estimate: 0h
>
> Currently the inbound and outbound streamed bytes metrics are incremented 
> after each file is streamed, what doesn't represent the current number of 
> bytes streamed since it can take a long time for a large file to be streamed. 
> We should update the metric incrementally as data is streamed.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-16325) Update streaming metrics incrementally

2023-01-19 Thread Isaac Reath (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-16325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17678933#comment-17678933
 ] 

Isaac Reath commented on CASSANDRA-16325:
-

[~paulo] I've updated the patch to include testing for compressed tables and 
{{stream_entire_sstables=true}}. 

> Update streaming metrics incrementally
> --
>
> Key: CASSANDRA-16325
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16325
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Observability/Metrics
>Reporter: Paulo Motta
>Assignee: Isaac Reath
>Priority: Normal
>  Labels: lhf
> Fix For: 4.2
>
>  Time Spent: 10h 10m
>  Remaining Estimate: 0h
>
> Currently the inbound and outbound streamed bytes metrics are incremented 
> after each file is streamed, what doesn't represent the current number of 
> bytes streamed since it can take a long time for a large file to be streamed. 
> We should update the metric incrementally as data is streamed.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-16325) Update streaming metrics incrementally

2023-01-18 Thread Josh McKenzie (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-16325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17678348#comment-17678348
 ] 

Josh McKenzie commented on CASSANDRA-16325:
---

{quote}if processing of downstream streaming listeners turn out to be a 
bottleneck I think we should send the notifications on a separate thread versus 
adding special logic to batch metric updates to not make this overly complex.
{quote}
In the "how do we solve for this case" I 100% agree. From the "we have lots of 
metrics that are potentially contended and will be queried programmatically by 
users via vtables" angle / lens, I think there might be a {{NoSpamLogger}} 
analog worth pursuing here to socket and update metrics when something crosses 
a certain threshold; trade off memory for reduced contention in general.

Definitely wouldn't block here on it but might be a nice defensive paradigm for 
us to make idiomatic as we add more metrics.

> Update streaming metrics incrementally
> --
>
> Key: CASSANDRA-16325
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16325
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Observability/Metrics
>Reporter: Paulo Motta
>Assignee: Isaac Reath
>Priority: Normal
>  Labels: lhf
> Fix For: 4.2
>
>  Time Spent: 10h 10m
>  Remaining Estimate: 0h
>
> Currently the inbound and outbound streamed bytes metrics are incremented 
> after each file is streamed, what doesn't represent the current number of 
> bytes streamed since it can take a long time for a large file to be streamed. 
> We should update the metric incrementally as data is streamed.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-16325) Update streaming metrics incrementally

2023-01-17 Thread Paulo Motta (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-16325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17678015#comment-17678015
 ] 

Paulo Motta commented on CASSANDRA-16325:
-

The new approach looks good to me. Right now, 
{{testMetricsUpdateIncrementallyWithRepairAndStreamingBetweenNodes}} and 
{{testMetricsUpdateIncrementallyWithRebuildAndStreamingBetweenNodes}} are only 
testing the uncompressed/partial stream path. I think we should extend these 
tests to also test the metrics are working correctly for compressed and 
{{stream_entire_sstables=true}}

I don't expect these more frequent streamed bytes counter updates to have any 
noticeable impact on streaming performance, but if processing of downstream 
streaming listeners turn out to be a bottleneck I think we should send the 
notifications on a separate thread versus adding special logic to batch metric 
updates to not make this overly complex.

> Update streaming metrics incrementally
> --
>
> Key: CASSANDRA-16325
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16325
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Observability/Metrics
>Reporter: Paulo Motta
>Assignee: Isaac Reath
>Priority: Normal
>  Labels: lhf
> Fix For: 4.2
>
>  Time Spent: 10h 10m
>  Remaining Estimate: 0h
>
> Currently the inbound and outbound streamed bytes metrics are incremented 
> after each file is streamed, what doesn't represent the current number of 
> bytes streamed since it can take a long time for a large file to be streamed. 
> We should update the metric incrementally as data is streamed.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-16325) Update streaming metrics incrementally

2023-01-17 Thread Isaac Reath (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-16325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17677968#comment-17677968
 ] 

Isaac Reath commented on CASSANDRA-16325:
-

Thank you [~dcapwell]. I've now updated my PR to update metrics through a 
{{StreamEventListener}}, overall seems like a much better outcome. 

In testing I found that the {{CassandraCompressedStreamReader}} seemed to be 
reporting way more bytes than were written. I did some digging and found that 
{{cis.bytesRead()}} does not reset after each section is finished causing the 
delta calculation to be incorrect after we've completed writing a section of an 
SSTable. Moving that to outside of the {{for 
(SSTableReader.PartitionPositionBounds section : sections)}} loop made the 
metrics collection accurate.

[~jmckenzie], I'm not sure if moving this logic to a StreamEventListener 
addresses your concerns about hammering the counters. Moving to a 
{{StreamEventListener}} makes a threshold counter a bit more complicated (since 
you will have 1:many relationship between {{StreamEventListener}} and peer, 
forcing us to track the deltas per peer). Per [~dcapwell] we get a progress 
event for every 64kb which may still be too often.  I kept the patch simple for 
now but we could move to either a batch-based approach or having a thread to 
update the counters if you still feel this is too frequent. 

> Update streaming metrics incrementally
> --
>
> Key: CASSANDRA-16325
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16325
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Observability/Metrics
>Reporter: Paulo Motta
>Assignee: Isaac Reath
>Priority: Normal
>  Labels: lhf
> Fix For: 4.2
>
>  Time Spent: 10h 10m
>  Remaining Estimate: 0h
>
> Currently the inbound and outbound streamed bytes metrics are incremented 
> after each file is streamed, what doesn't represent the current number of 
> bytes streamed since it can take a long time for a large file to be streamed. 
> We should update the metric incrementally as data is streamed.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-16325) Update streaming metrics incrementally

2023-01-12 Thread David Capwell (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-16325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17676280#comment-17676280
 ] 

David Capwell commented on CASSANDRA-16325:
---

Merged to 4.1 and trunk, so events now include the delta

> Update streaming metrics incrementally
> --
>
> Key: CASSANDRA-16325
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16325
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Observability/Metrics
>Reporter: Paulo Motta
>Assignee: Isaac Reath
>Priority: Normal
>  Labels: lhf
> Fix For: 4.2
>
>  Time Spent: 10h 10m
>  Remaining Estimate: 0h
>
> Currently the inbound and outbound streamed bytes metrics are incremented 
> after each file is streamed, what doesn't represent the current number of 
> bytes streamed since it can take a long time for a large file to be streamed. 
> We should update the metric incrementally as data is streamed.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-16325) Update streaming metrics incrementally

2023-01-11 Thread Isaac Reath (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-16325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17675685#comment-17675685
 ] 

Isaac Reath commented on CASSANDRA-16325:
-

This looks good to me. I left a comment about usage of {{bytesWritten}} in 
{{CassandraEntireSSTableStreamWriter}}. I might be wrong about that usage so I 
can look into it on my ticket. 

 

 

 

> Update streaming metrics incrementally
> --
>
> Key: CASSANDRA-16325
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16325
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Observability/Metrics
>Reporter: Paulo Motta
>Assignee: Isaac Reath
>Priority: Normal
>  Labels: lhf
> Fix For: 4.2
>
>  Time Spent: 10h 10m
>  Remaining Estimate: 0h
>
> Currently the inbound and outbound streamed bytes metrics are incremented 
> after each file is streamed, what doesn't represent the current number of 
> bytes streamed since it can take a long time for a large file to be streamed. 
> We should update the metric incrementally as data is streamed.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-16325) Update streaming metrics incrementally

2023-01-10 Thread David Capwell (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-16325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17657008#comment-17657008
 ] 

David Capwell commented on CASSANDRA-16325:
---

[~isaacreath] mind reviewing the changes in 
https://github.com/apache/cassandra/pull/2066?  Here is the commit 
https://github.com/apache/cassandra/pull/2066/commits/5e6fd378da0297f09566cf3aafaa20c7600d2249

Most usage actually had the delta... subset needed extra memory to retain 
previous "progress"

> Update streaming metrics incrementally
> --
>
> Key: CASSANDRA-16325
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16325
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Observability/Metrics
>Reporter: Paulo Motta
>Assignee: Isaac Reath
>Priority: Normal
>  Labels: lhf
> Fix For: 4.2
>
>  Time Spent: 10h 10m
>  Remaining Estimate: 0h
>
> Currently the inbound and outbound streamed bytes metrics are incremented 
> after each file is streamed, what doesn't represent the current number of 
> bytes streamed since it can take a long time for a large file to be streamed. 
> We should update the metric incrementally as data is streamed.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-16325) Update streaming metrics incrementally

2023-01-10 Thread Isaac Reath (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-16325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17657003#comment-17657003
 ] 

Isaac Reath commented on CASSANDRA-16325:
-

[~dcapwell]  yeah that sounds good to me! 

> Update streaming metrics incrementally
> --
>
> Key: CASSANDRA-16325
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16325
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Observability/Metrics
>Reporter: Paulo Motta
>Assignee: Isaac Reath
>Priority: Normal
>  Labels: lhf
> Fix For: 4.2
>
>  Time Spent: 10h 10m
>  Remaining Estimate: 0h
>
> Currently the inbound and outbound streamed bytes metrics are incremented 
> after each file is streamed, what doesn't represent the current number of 
> bytes streamed since it can take a long time for a large file to be streamed. 
> We should update the metric incrementally as data is streamed.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-16325) Update streaming metrics incrementally

2023-01-10 Thread David Capwell (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-16325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17656999#comment-17656999
 ] 

David Capwell commented on CASSANDRA-16325:
---

CASSANDRA-18110 is approved and ready to merge.  If it would help this patch to 
move the deltas to the caller I am glad to do it as part of that work so I can 
remove the cache; let me know [~isaacreath]

> Update streaming metrics incrementally
> --
>
> Key: CASSANDRA-16325
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16325
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Observability/Metrics
>Reporter: Paulo Motta
>Assignee: Isaac Reath
>Priority: Normal
>  Labels: lhf
> Fix For: 4.2
>
>  Time Spent: 10h 10m
>  Remaining Estimate: 0h
>
> Currently the inbound and outbound streamed bytes metrics are incremented 
> after each file is streamed, what doesn't represent the current number of 
> bytes streamed since it can take a long time for a large file to be streamed. 
> We should update the metric incrementally as data is streamed.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-16325) Update streaming metrics incrementally

2023-01-10 Thread David Capwell (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-16325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17656998#comment-17656998
 ] 

David Capwell commented on CASSANDRA-16325:
---

bq. I think if we updated the ProgressInfo object to store the delta from the 
last time progress was called

I just looked into this and it would require light caching to the callers, such 
as

{code}
diff --git 
src/java/org/apache/cassandra/db/streaming/CassandraCompressedStreamReader.java 
src/java/org/apache/cassandra/db/streaming/CassandraCompressedStreamReader.java
index dda874ba40..b9f4566aac 100644
--- 
src/java/org/apache/cassandra/db/streaming/CassandraCompressedStreamReader.java
+++ 
src/java/org/apache/cassandra/db/streaming/CassandraCompressedStreamReader.java
@@ -79,6 +79,7 @@ public class CassandraCompressedStreamReader extends 
CassandraStreamReader
 writer = createWriter(cfs, totalSize, repairedAt, pendingRepair, 
format);
 String filename = writer.getFilename();
 int sectionIdx = 0;
+long read = 0;
 for (SSTableReader.PartitionPositionBounds section : sections)
 {
 assert cis.chunkBytesRead() <= totalSize;
@@ -93,7 +94,10 @@ public class CassandraCompressedStreamReader extends 
CassandraStreamReader
 {
 writePartition(deserializer, writer);
 // when compressed, report total bytes of compressed 
chunks read since remoteFile.size is the sum of chunks transferred
-session.progress(filename + '-' + fileSeqNum, 
ProgressInfo.Direction.IN, cis.chunkBytesRead(), totalSize);
+long bytes = cis.chunkBytesRead();
+long delta = bytes - read;
+read = bytes;
+session.progress(filename + '-' + fileSeqNum, 
ProgressInfo.Direction.IN, bytes, delta, totalSize);
 }
 assert in.getBytesRead() == sectionLength;
 }
{code}

So, if the reason to avoid using a listener was tracking progress info, we can 
compute the delta at the event creation side (non look hard), I personally find 
that better than embedding your custom metrics listener as it gives benefits to 
other listeners (such as removing the cache in StreamingState)

> Update streaming metrics incrementally
> --
>
> Key: CASSANDRA-16325
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16325
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Observability/Metrics
>Reporter: Paulo Motta
>Assignee: Isaac Reath
>Priority: Normal
>  Labels: lhf
> Fix For: 4.2
>
>  Time Spent: 10h 10m
>  Remaining Estimate: 0h
>
> Currently the inbound and outbound streamed bytes metrics are incremented 
> after each file is streamed, what doesn't represent the current number of 
> bytes streamed since it can take a long time for a large file to be streamed. 
> We should update the metric incrementally as data is streamed.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-16325) Update streaming metrics incrementally

2023-01-10 Thread David Capwell (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-16325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17656997#comment-17656997
 ] 

David Capwell commented on CASSANDRA-16325:
---

.bq  I think if we updated the ProgressInfo object to store the delta from the 
last time progress was called

That would be a good and welcome change... my patch stores a cache of String -> 
Long just so I can compute the delta... if streaming could do this for me then 
I can lower the memory of the listener.

.bq The only thought I have about moving the counters into the StreamingSession 
class is if there is a feature flag around per-file progress. As a user, I 
would still want metrics to report if that flag is enabled. 

There is a feature flag to disable the stream stats listener, so it makes 
complete sets not to couple it with that listener; I agree.


> Update streaming metrics incrementally
> --
>
> Key: CASSANDRA-16325
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16325
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Observability/Metrics
>Reporter: Paulo Motta
>Assignee: Isaac Reath
>Priority: Normal
>  Labels: lhf
> Fix For: 4.2
>
>  Time Spent: 10h 10m
>  Remaining Estimate: 0h
>
> Currently the inbound and outbound streamed bytes metrics are incremented 
> after each file is streamed, what doesn't represent the current number of 
> bytes streamed since it can take a long time for a large file to be streamed. 
> We should update the metric incrementally as data is streamed.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-16325) Update streaming metrics incrementally

2023-01-10 Thread Isaac Reath (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-16325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17656990#comment-17656990
 ] 

Isaac Reath commented on CASSANDRA-16325:
-

Yes [~dcapwell]  this seems to be solving something very similar to what's 
happening in StreamingState::streamProgress in your patch. Looking at these 
lines in your patch: 
[https://github.com/apache/cassandra/pull/2066/files#diff-2c11c59ea199afd7097d89eece180d6ce6f9ee7568ffb141994f5dff9d9c2d27R267-R269]
 it looks very similar to the approach I initially took in this patch (see: 
[https://github.com/apache/cassandra/pull/2058#discussion_r1061577063).] We 
changed our approach to use this different one to avoid needing to keep track 
of the incoming / outgoing ProgressInfo objects (per [~smiklosovic]'s 
concerns). I think if we updated the ProgressInfo object to store the delta 
from the last time progress was called, then we could avoid having to track 
ProgressInfo objects and still leverage the ProgressEvent. From what I can 
tell, we have that information most of the time we call StreamSession.progress.

The only thought I have about moving the counters into the StreamingSession 
class is if there is a feature flag around per-file progress. As a user, I 
would still want metrics to report if that flag is enabled. 

 

> Update streaming metrics incrementally
> --
>
> Key: CASSANDRA-16325
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16325
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Observability/Metrics
>Reporter: Paulo Motta
>Assignee: Isaac Reath
>Priority: Normal
>  Labels: lhf
> Fix For: 4.2
>
>  Time Spent: 10h 10m
>  Remaining Estimate: 0h
>
> Currently the inbound and outbound streamed bytes metrics are incremented 
> after each file is streamed, what doesn't represent the current number of 
> bytes streamed since it can take a long time for a large file to be streamed. 
> We should update the metric incrementally as data is streamed.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-16325) Update streaming metrics incrementally

2023-01-10 Thread David Capwell (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-16325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17656768#comment-17656768
 ] 

David Capwell commented on CASSANDRA-16325:
---

Saw this due to CASSANDRA-18110 and the vtable work.  Looking at the patch I 
have 2 main questions:

1) why embed into the calls when we get progress events providing the same 
details, just batched for every 64k bytes?  Using the stream listens we have 
would resolve [~jmckenzie] comments as they are every 64k; this also 
drastically shrinks the size and scope of this patch as its just adding a 
single listener and listening to the FILE_PROGRESS event.
2) it seems that most of this work is to boil down to calling 2 lines: 
https://github.com/apache/cassandra/pull/2058/files#diff-bc79ba735741b3d997c735ad38ea4793d079d3ca2f2162412a422f26b3efdb40R99-R100.
  This work has a lot of overlap with the stream vtable which does the same 
work, just using the events rather than embedding into the call sites; should 
we move these counters there?

> Update streaming metrics incrementally
> --
>
> Key: CASSANDRA-16325
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16325
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Observability/Metrics
>Reporter: Paulo Motta
>Assignee: Isaac Reath
>Priority: Normal
>  Labels: lhf
> Fix For: 4.2
>
>  Time Spent: 10h 10m
>  Remaining Estimate: 0h
>
> Currently the inbound and outbound streamed bytes metrics are incremented 
> after each file is streamed, what doesn't represent the current number of 
> bytes streamed since it can take a long time for a large file to be streamed. 
> We should update the metric incrementally as data is streamed.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-16325) Update streaming metrics incrementally

2023-01-10 Thread Stefan Miklosovic (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-16325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17656759#comment-17656759
 ] 

Stefan Miklosovic commented on CASSANDRA-16325:
---

If you want to introduce system property, please do not forget to put it into 
CassandraRelevantProperties.

> Update streaming metrics incrementally
> --
>
> Key: CASSANDRA-16325
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16325
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Observability/Metrics
>Reporter: Paulo Motta
>Assignee: Isaac Reath
>Priority: Normal
>  Labels: lhf
> Fix For: 4.2
>
>  Time Spent: 10h 10m
>  Remaining Estimate: 0h
>
> Currently the inbound and outbound streamed bytes metrics are incremented 
> after each file is streamed, what doesn't represent the current number of 
> bytes streamed since it can take a long time for a large file to be streamed. 
> We should update the metric incrementally as data is streamed.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-16325) Update streaming metrics incrementally

2023-01-10 Thread Paulo Motta (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-16325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17656757#comment-17656757
 ] 

Paulo Motta commented on CASSANDRA-16325:
-

bq. ISTM we might benefit from a simple threshold counter inside 
CassandraStreamWriter where it'll update the metrics every N multiples of 
CassandraStreamWriter.DEFAULT_CHUNK_SIZE so we don't spam the metrics updates.

Makes sense, but I think we should keep this metric buffering logic inside 
{{FileStreamMetricsListener}} so it's transparent to different readers/writers. 
We should also probably make the threshold configurable via a system property.

> Update streaming metrics incrementally
> --
>
> Key: CASSANDRA-16325
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16325
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Observability/Metrics
>Reporter: Paulo Motta
>Assignee: Isaac Reath
>Priority: Normal
>  Labels: lhf
> Fix For: 4.2
>
>  Time Spent: 10h 10m
>  Remaining Estimate: 0h
>
> Currently the inbound and outbound streamed bytes metrics are incremented 
> after each file is streamed, what doesn't represent the current number of 
> bytes streamed since it can take a long time for a large file to be streamed. 
> We should update the metric incrementally as data is streamed.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-16325) Update streaming metrics incrementally

2023-01-10 Thread Josh McKenzie (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-16325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17656750#comment-17656750
 ] 

Josh McKenzie commented on CASSANDRA-16325:
---

I'm a little concerned about us hammering those Counters every time we transfer 
any date:
{code:java}
// stream each of the required sections of the file
for (SSTableReader.PartitionPositionBounds section : sections)
{
long start = validator == null ? section.lowerPosition : 
validator.chunkStart(section.lowerPosition);
// if the transfer does not start on the valididator's chunk boundary, this 
is the number of bytes to offset by
int transferOffset = (int) (section.lowerPosition - start);
if (validator != null)
validator.seek(start);
// length of the section to read
long length = section.upperPosition - start;
// tracks write progress
long bytesRead = 0;
while (bytesRead < length)
{
int toTransfer = (int) Math.min(bufferSize, length - bytesRead);
long lastBytesRead = write(proxy, validator, out, start, 
transferOffset, toTransfer, bufferSize);
start += lastBytesRead;
bytesRead += lastBytesRead;
progress += (lastBytesRead - transferOffset);
session.progress(sstable.descriptor.filenameFor(Component.DATA), 
ProgressInfo.Direction.OUT, progress, totalSize);
fileStreamMetricsListener.onStreamingBytesTransferred(progress); // 
REVIEWER NOTE: This is where we're going to update the counters on each call
transferOffset = 0;
}{code}
ISTM we might benefit from a simple threshold counter inside 
{{CassandraStreamWriter}} where it'll update the metrics every N multiples of 
{{CassandraStreamWriter.DEFAULT_CHUNK_SIZE}} so we don't spam the metrics 
updates.

> Update streaming metrics incrementally
> --
>
> Key: CASSANDRA-16325
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16325
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Observability/Metrics
>Reporter: Paulo Motta
>Assignee: Isaac Reath
>Priority: Normal
>  Labels: lhf
> Fix For: 4.2
>
>  Time Spent: 10h 10m
>  Remaining Estimate: 0h
>
> Currently the inbound and outbound streamed bytes metrics are incremented 
> after each file is streamed, what doesn't represent the current number of 
> bytes streamed since it can take a long time for a large file to be streamed. 
> We should update the metric incrementally as data is streamed.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-16325) Update streaming metrics incrementally

2023-01-10 Thread Abe Ratnofsky (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-16325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17656738#comment-17656738
 ] 

Abe Ratnofsky commented on CASSANDRA-16325:
---

Wanted to flag potential overlap with CASSANDRA-18110, which also has a PR up 
right now.

> Update streaming metrics incrementally
> --
>
> Key: CASSANDRA-16325
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16325
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Observability/Metrics
>Reporter: Paulo Motta
>Assignee: Isaac Reath
>Priority: Normal
>  Labels: lhf
> Fix For: 4.2
>
>  Time Spent: 10h 10m
>  Remaining Estimate: 0h
>
> Currently the inbound and outbound streamed bytes metrics are incremented 
> after each file is streamed, what doesn't represent the current number of 
> bytes streamed since it can take a long time for a large file to be streamed. 
> We should update the metric incrementally as data is streamed.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-16325) Update streaming metrics incrementally

2023-01-10 Thread Stefan Miklosovic (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-16325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17656504#comment-17656504
 ] 

Stefan Miklosovic commented on CASSANDRA-16325:
---

[~isaacreath] thanks for the cooperation! I left couple last comments on the 
PR. When resolve I am +1. [~paulo] would you mind to take care of the eventual 
merging, please?

> Update streaming metrics incrementally
> --
>
> Key: CASSANDRA-16325
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16325
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Observability/Metrics
>Reporter: Paulo Motta
>Assignee: Isaac Reath
>Priority: Normal
>  Labels: lhf
>  Time Spent: 10h 10m
>  Remaining Estimate: 0h
>
> Currently the inbound and outbound streamed bytes metrics are incremented 
> after each file is streamed, what doesn't represent the current number of 
> bytes streamed since it can take a long time for a large file to be streamed. 
> We should update the metric incrementally as data is streamed.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-16325) Update streaming metrics incrementally

2023-01-04 Thread Stefan Miklosovic (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-16325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17654491#comment-17654491
 ] 

Stefan Miklosovic commented on CASSANDRA-16325:
---

Ive put some comments in.

> Update streaming metrics incrementally
> --
>
> Key: CASSANDRA-16325
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16325
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Observability/Metrics
>Reporter: Paulo Motta
>Assignee: Isaac Reath
>Priority: Normal
>  Labels: lhf
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Currently the inbound and outbound streamed bytes metrics are incremented 
> after each file is streamed, what doesn't represent the current number of 
> bytes streamed since it can take a long time for a large file to be streamed. 
> We should update the metric incrementally as data is streamed.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-16325) Update streaming metrics incrementally

2023-01-03 Thread Paulo Motta (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-16325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17654206#comment-17654206
 ] 

Paulo Motta commented on CASSANDRA-16325:
-

Submitted CI with updated test: 
[https://ci-cassandra.apache.org/view/patches/job/Cassandra-devbranch/2156/]

I think this should be good for a second round of review. [~blerer] [~yifanc] 
[~stefan.miklosovic] would you be interested in reviewing this? Thanks

> Update streaming metrics incrementally
> --
>
> Key: CASSANDRA-16325
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16325
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Observability/Metrics
>Reporter: Paulo Motta
>Assignee: Isaac Reath
>Priority: Normal
>  Labels: lhf
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Currently the inbound and outbound streamed bytes metrics are incremented 
> after each file is streamed, what doesn't represent the current number of 
> bytes streamed since it can take a long time for a large file to be streamed. 
> We should update the metric incrementally as data is streamed.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-16325) Update streaming metrics incrementally

2023-01-03 Thread Paulo Motta (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-16325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17654203#comment-17654203
 ] 

Paulo Motta commented on CASSANDRA-16325:
-

The approach looks good to me, but the test was not working in my computer 
because streaming was completed by the time we checked [if the streaming 
metrics were 
updated|https://github.com/apache/cassandra/pull/2058/commits/b217c700892ac1164ebcee0e4333035c1a97ec33#diff-cfc5f5318c05f7dafd83a877597fd56afdadd78e5a48f0d2f9f228289655d01aR183].

Furthermore, since we only [checked that the streaming metrics were updated 
once|https://github.com/apache/cassandra/pull/2058/commits/b217c700892ac1164ebcee0e4333035c1a97ec33#diff-cfc5f5318c05f7dafd83a877597fd56afdadd78e5a48f0d2f9f228289655d01aR225]
 this did not guarantee metrics are updated incrementally since this check 
could be performed *after* the streaming is fully completed.

In order to make this a bit more robust, I did the following changes to 
{{{}testMetricsUpdateIncrementally{}}}:
 * Update cluster size to 2 and RF=2, since this will make node1 transfer all 
of its data to node2 during streaming making it easier to reason about test 
state
 * Create single 10MB sstable on node1 with compression disabled, so we can 
ensure streaming of a single file will update streaming metrics incrementally
 * Check that 3 or more incoming/outgoing bytes are reported during streaming 
from node1 to node2
  * Before, only 2 values are reported: [0, total_streamed_size]
 * Check that final transferred bytes matches expected sstable size (10MB)
 * Test both peer incoming/outgoing metrics as well as total incoming/outgoing 
metrics

I added log statements on the updated test and these are the reported bytes 
before and after this patch:
BEFORE:
{noformat}
OUTGOING [0, 10508890]
INCOMING [0, 10508890]
{noformat}
AFTER
{noformat}
OUTGOING [0, 65536, 131072, 196608, 262144, 327680, 393216, 458752, 524288, 
589824, 655360, 720896, 786432, 851968, 917504, 983040, 1048576, 1114112, 
1179648, 1245184, 1310720, 1376256, 1441792, 1507328, 1572864, 1638400, 
1703936, 1769472, 1835008, 1900544, 1966080, 2031616, 2097152, 2162688, 
2228224, 2293760, 2359296, 2424832, 2490368, 2555904, 2621440, 2686976, 
2752512, 2818048, 2883584, 2949120, 3014656, 3080192, 3276800, 3342336, 
3538944, 3604480, 3670016, 3735552, 3801088, 3866624, 3932160, 3997696, 
4063232, 4128768, 4194304, 4259840, 4325376, 4390912, 4456448, 4521984, 
4587520, 4653056, 4718592, 4784128, 4849664, 4915200, 4980736, 5046272, 
5111808, 5177344, 5242880, 5308416, 5373952, 5439488, 5505024, 5570560, 
5636096, 5701632, 5767168, 5832704, 5898240, 5963776, 6029312, 6094848, 
6160384, 6225920, 6291456, 6356992, 6422528, 6488064, 6553600, 6619136, 
6684672, 6750208, 6815744, 6881280, 6946816, 7012352, 7077888, 7143424, 
7208960, 7274496, 7340032, 7405568, 7471104, 7536640, 7602176, 7667712, 
7733248, 7798784, 7864320, 7929856, 7995392, 8060928, 8126464, 8192000, 
8257536, 8323072, 8388608, 8454144, 8519680, 8585216, 8650752, 8716288, 
8781824, 8847360, 8912896, 8978432, 9043968, 9109504, 9175040, 9240576, 
9306112, 9371648, 9437184, 9502720, 9568256, 9633792, 9699328, 9764864, 
9830400, 9895936, 9961472, 10027008, 10092544, 10158080, 10223616, 10289152, 
10354688, 10420224, 10485760, 10508890]

INCOMING [0, 1051, 4204, 12611, 22070, 28372, 31525, 36777, 44134, 48338, 
55693, 60948, 65151, 67253, 73559, 78814, 89323, 97730, 106137, 112443, 119799, 
128207, 132408, 134510, 138714, 149224, 156581, 164986, 173394, 182852, 196512, 
207022, 217530, 226987, 238546, 248004, 255360, 261666, 266919, 267970, 280581, 
297395, 316313, 329973, 347837, 356243, 365702, 385669, 393026, 403534, 423497, 
443465, 458179, 480245, 499163, 520182, 527538, 545403, 570623, 589538, 597945, 
616862, 642081, 655743, 679915, 706187, 727205, 754527, 783954, 797617, 821786, 
851212, 868025, 909010, 921620, 949993, 982571, 1009893, 1042470, 1063489, 
1089762, 1090813, 1113932, 1141256, 1179088, 1210615, 1244242, 1273666, 
1310450, 1351438, 1395577, 1440761, 1494357, 1541647, 1572125, 1605756, 
1614161, 1615212, 1638331, 1658299, 1680367, 1703488, 1727652, 1762330, 
1775993, 1806466, 1821180, 1837994, 1873722, 183, 1924162, 1947283, 
1965147, 1988268, 2026099, 2048168, 2083899, 2097561, 2128036, 2152207, 
2162716, 2191090, 2214208, 2227869, 2250988, 2285667, 2293024, 2306687, 
2335064, 2359233, 2414932, 2424390, 2425441, 2476934, 2489546, 2528430, 
2555754, 2577823, 2613554, 2620909, 2652436, 2686065, 2687116, 2728101, 
2752266, 2756469, 2804811, 2817422, 2818473, 2859457, 2882576, 2905696, 
2942476, 2948781, 2959291, 3013940, 3046520, 3080149, 3096963, 3138998, 
3170524, 3211505, 3229370, 3230421, 3276657, 3330250, 3377539, 3445847, 
3501549, 3538331, 3579316, 3632911, 3699116, 3734848, 3801057, 3843094, 
3866214, 3867265, 3899844, 3931368, 3980761, 3997575, 4017543, 4062734, 
4098462, 4127887, 4166768, 

[jira] [Commented] (CASSANDRA-16325) Update streaming metrics incrementally

2022-12-19 Thread Isaac Reath (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-16325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17649444#comment-17649444
 ] 

Isaac Reath commented on CASSANDRA-16325:
-

Created: [https://github.com/apache/cassandra/pull/2058.]

The approach we've taken is to update the metrics every time 
StreamSession::progress by the number of bytes streamed for a file since the 
last time progress was called. For testing, we've updated the StreamingMetrics 
distributed test to kick off a repair on a separate thread and check that the 
streaming metrics are updated before the repair completes.

> Update streaming metrics incrementally
> --
>
> Key: CASSANDRA-16325
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16325
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Observability/Metrics
>Reporter: Paulo Motta
>Assignee: Isaac Reath
>Priority: Normal
>  Labels: lhf
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Currently the inbound and outbound streamed bytes metrics are incremented 
> after each file is streamed, what doesn't represent the current number of 
> bytes streamed since it can take a long time for a large file to be streamed. 
> We should update the metric incrementally as data is streamed.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-16325) Update streaming metrics incrementally

2022-12-16 Thread Isaac Reath (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-16325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17648819#comment-17648819
 ] 

Isaac Reath commented on CASSANDRA-16325:
-

Picking up where Dejan left off to add some tests.

> Update streaming metrics incrementally
> --
>
> Key: CASSANDRA-16325
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16325
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Observability/Metrics
>Reporter: Paulo Motta
>Assignee: Dejan Gvozdenac
>Priority: Normal
>  Labels: lhf
>
> Currently the inbound and outbound streamed bytes metrics are incremented 
> after each file is streamed, what doesn't represent the current number of 
> bytes streamed since it can take a long time for a large file to be streamed. 
> We should update the metric incrementally as data is streamed.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-16325) Update streaming metrics incrementally

2022-04-25 Thread Dejan Gvozdenac (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-16325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17527654#comment-17527654
 ] 

Dejan Gvozdenac commented on CASSANDRA-16325:
-

https://github.com/apache/cassandra/pull/1589

> Update streaming metrics incrementally
> --
>
> Key: CASSANDRA-16325
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16325
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Observability/Metrics
>Reporter: Paulo Motta
>Assignee: Dejan Gvozdenac
>Priority: Normal
>  Labels: lhf
>
> Currently the inbound and outbound streamed bytes metrics are incremented 
> after each file is streamed, what doesn't represent the current number of 
> bytes streamed since it can take a long time for a large file to be streamed. 
> We should update the metric incrementally as data is streamed.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-16325) Update streaming metrics incrementally

2022-03-08 Thread Paulo Motta (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-16325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17503042#comment-17503042
 ] 

Paulo Motta commented on CASSANDRA-16325:
-

Hi [~dgvozdenac]. Thanks for taking this up.

It seems like a similar issue happens with compaction metrics.

There was some code changes in these areas in recent versions. I think a good 
start would be to inspect the code in {{trunk}} to check where streaming and 
compaction size metrics are updated, and if they're being done incrementally or 
only at the end of the operation.

> Update streaming metrics incrementally
> --
>
> Key: CASSANDRA-16325
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16325
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Observability/Metrics
>Reporter: Paulo Motta
>Assignee: Dejan Gvozdenac
>Priority: Normal
>  Labels: lhf
>
> Currently the inbound and outbound streamed bytes metrics are incremented 
> after each file is streamed, what doesn't represent the current number of 
> bytes streamed since it can take a long time for a large file to be streamed. 
> We should update the metric incrementally as data is streamed.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org