[ 
https://issues.apache.org/jira/browse/CASSANDRA-15406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17181389#comment-17181389
 ] 

Stefan Miklosovic commented on CASSANDRA-15406:
-----------------------------------------------

Build: [https://ci-cassandra.apache.org/job/Cassandra-devbranch/255/]

PR: [https://github.com/apache/cassandra/pull/711]

I found a bug while implementing this, and I hope the explanation helps you 
during review. In a nutshell, the size returned by 
CassandraOutgoingFile#getSize() is taken from header.size(), and that is wrong. 
The header size is computed in CassandraStreamHeader#calculateSize, which is 
based either on the compression info or, otherwise, on the "sections". The two 
computations clash: netstats reports the total bytes to be sent as the sum of 
the "sections" sizes even though the data is sent compressed (it is compressed 
by default by CassandraCompressedStreamWriter), while the individual "total 
bytes" of each item to be streamed is taken from its totalSize method, where it 
is computed as compressed. As a result the numbers don't match.
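
To make the mismatch concrete, here is a minimal sketch. It does not use the 
real Cassandra classes: SizeMismatchSketch, Section and the chunk lengths are 
hypothetical names and numbers, chosen only to show how a total summed over 
uncompressed sections can disagree with a total summed over compressed bytes.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical model, not Cassandra's actual classes. It only illustrates
// how two size computations for the same file can disagree.
public class SizeMismatchSketch {

    // A byte range [start, end) of uncompressed sstable data to stream.
    public static final class Section {
        final long start;
        final long end;

        public Section(long start, long end) {
            this.start = start;
            this.end = end;
        }
    }

    // Total derived from the uncompressed section boundaries.
    public static long sectionsSize(List<Section> sections) {
        long total = 0;
        for (Section s : sections) {
            total += s.end - s.start;
        }
        return total;
    }

    // Total derived from the on-disk lengths of the compressed chunks
    // that cover those sections.
    public static long compressedSize(List<Long> chunkLengths) {
        long total = 0;
        for (long length : chunkLengths) {
            total += length;
        }
        return total;
    }

    public static void main(String[] args) {
        // Made-up numbers: two 64 KiB sections, each stored as one compressed chunk.
        List<Section> sections = Arrays.asList(
                new Section(0, 65536), new Section(131072, 196608));
        List<Long> chunkLengths = Arrays.asList(20000L, 18000L);

        System.out.println("total by sections:          " + sectionsSize(sections));
        System.out.println("total by compressed chunks: " + compressedSize(chunkLengths));
    }
}
```

If the overall total uses the first figure and the per-file progress uses the 
second, the reported progress can never add up to the total.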

What I did was compute the size returned by CassandraOutgoingFile#getSize() 
just once, in advance, in its constructor, with the help of 
header.calculateCompressionInfo.
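
The idea of fixing the size once at construction time can be sketched roughly 
as follows. This is not the code from the PR: OutgoingFileSketch and its 
constructor parameters are hypothetical, and the real 
header.calculateCompressionInfo is deliberately not called here because its 
signature is not shown in this comment.

```java
// Hypothetical stand-in for an outgoing stream file; not Cassandra's class.
public class OutgoingFileSketch {

    // Computed once in the constructor and reused by every caller.
    private final long size;

    // compressedSize is null when no compression info is available
    // (an assumption of this sketch, not the real API).
    public OutgoingFileSketch(long sectionsSize, Long compressedSize) {
        // Decide once which figure applies, so all reporting paths agree.
        this.size = (compressedSize != null) ? compressedSize : sectionsSize;
    }

    public long getSize() {
        return size;
    }

    public static void main(String[] args) {
        System.out.println(new OutgoingFileSketch(131072L, 38000L).getSize());
        System.out.println(new OutgoingFileSketch(131072L, null).getSize());
    }
}
```

With a single precomputed value, the overall total and the per-file totals are 
derived from the same number.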

The PR consists of two commits. The second one tries to simplify the logic 
related to the computation of the size and reduces that logic to one central 
place.

The tests cover bootstrapping and normal repair, with compression turned on and 
off and with entire sstable streaming on and off. By the way, I am not yet able 
to test the output of the bootstrapping node. My suspicion is that the output 
is there even for the bootstrapping node, but the dtest API and the things 
around it are a bit hairy, so nothing is shown. I am getting some errors when 
running nodetool netstats against a node that is bootstrapping.
 
As for the percentage progress of index builds, that is already there. I 
managed to push it through some time ago, so that part is fine: 
[https://github.com/apache/cassandra/commit/38f5f9caccabee2601ca5e95884d83857d22bf33]

> Show the progress of data streaming and index build 
> ----------------------------------------------------
>
>                 Key: CASSANDRA-15406
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-15406
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Consistency/Streaming, Legacy/Streaming and Messaging, 
> Tool/nodetool
>            Reporter: maxwellguo
>            Assignee: Stefan Miklosovic
>            Priority: Normal
>             Fix For: 4.0, 4.x
>
>          Time Spent: 2h 40m
>  Remaining Estimate: 0h
>
> I found that we should supply a command to show the progress of streaming 
> when we do a bootstrap/move/decommission/removenode operation. During data 
> streaming, nobody knows which step the program is in, so I think a command 
> to show the joining/leaving node's progress is needed.
>  
> PR [https://github.com/apache/cassandra/pull/558]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

