[jira] [Commented] (CASSANDRA-14685) Incremental repair 4.0 : SSTables remain locked forever if the coordinator dies during streaming

Alexander Dejanovski (JIRA) Fri, 31 Aug 2018 11:35:14 -0700


    [ 
https://issues.apache.org/jira/browse/CASSANDRA-14685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16599128#comment-16599128
 ]


Alexander Dejanovski commented on CASSANDRA-14685:
--------------------------------------------------

[~jasobrown],

Indeed, nodes 2 and 3 are still showing ongoing streams although node1 is down:

 
{noformat}
$ ccm node2 nodetool netstats
Mode: NORMAL
Repair e28883b0-ad4b-11e8-82ca-5fbf27df5fb6
 /127.0.0.1
 Sending 2 files, 49304220 bytes total. Already sent 0 files, 5373952 bytes total
 /Users/adejanovski/.ccm/inc-repair-issue/node2/data0/tlp_stress/sensor_data-67193da0ad4b11e88663cb45de9ab9e9/na-9-big-Data.db 5373952/34243878 bytes(15%) sent to idx:0/127.0.0.1
Read Repair Statistics:
Attempted: 0
Mismatch (Blocking): 0
Mismatch (Background): 0
Pool Name Active Pending Completed Dropped
Large messages n/a 0 2 0
Small messages n/a 0 244612 0
Gossip messages n/a 23 531 0
$ ccm node3 nodetool netstats
Mode: NORMAL
Repair e269d820-ad4b-11e8-82ca-5fbf27df5fb6
 /127.0.0.1
 Sending 2 files, 49166315 bytes total. Already sent 1 files, 11748602 bytes total
 /Users/adejanovski/.ccm/inc-repair-issue/node3/data0/tlp_stress/sensor_data-67193da0ad4b11e88663cb45de9ab9e9/na-11-big-Data.db 8865018/8865018 bytes(100%) sent to idx:0/127.0.0.1
 /Users/adejanovski/.ccm/inc-repair-issue/node3/data0/tlp_stress/sensor_data-67193da0ad4b11e88663cb45de9ab9e9/na-9-big-Data.db 2883584/34198115 bytes(8%) sent to idx:0/127.0.0.1
Read Repair Statistics:
Attempted: 0
Mismatch (Blocking): 0
Mismatch (Background): 0
Pool Name Active Pending Completed Dropped
Large messages n/a 0 2 0
Small messages n/a 0 244611 0
Gossip messages n/a 0 820 0
{noformat}
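To spot such lingering streams without reading the full output, the netstats text can be filtered for file transfers that have not reached 100%. This is only a minimal sketch, assuming the progress lines keep the "N/M bytes(P%) sent to ..." shape shown above; unfinished_streams is a hypothetical helper name, not part of nodetool:

```shell
# Hypothetical helper (not part of nodetool): read `nodetool netstats`
# output on stdin and print the progress lines of outgoing file
# transfers that are still below 100%.
unfinished_streams() {
  # Keep only per-file progress lines, then drop the completed ones.
  grep -E 'bytes\([0-9]+%\) sent to' | grep -v 'bytes(100%)'
}

# Usage, assuming the ccm cluster from this ticket:
#   ccm node2 nodetool netstats | unfinished_streams
```

If this still prints lines long after node1 went down, the stream session on that replica was most likely never torn down.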
 

> Incremental repair 4.0 : SSTables remain locked forever if the coordinator 
> dies during streaming 
> -------------------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-14685
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-14685
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Repair
>            Reporter: Alexander Dejanovski
>            Assignee: Jason Brown
>            Priority: Critical
>
> CASSANDRA-9143 changed how incremental repair works. It now runs the 
> following sequence of events: 
>  * Anticompaction is executed on all replicas for all SSTables overlapping 
> the repaired ranges
>  * Anticompacted SSTables are then marked as "Pending repair" and can no 
> longer be compacted or take part in another repair session
>  * Merkle trees are generated and compared
>  * Streaming takes place if needed
>  * Anticompaction is committed and "pending repair" SSTables are marked as 
> repaired if the session succeeded, or released if it failed.
> If the repair coordinator dies during the streaming phase, *the SSTables on 
> the replicas will remain in "pending repair" state and will never be eligible 
> for repair or compaction*, even after all the nodes in the cluster are 
> restarted. 
> Steps to reproduce (I've used Jason's 13938 branch that fixes streaming 
> errors) : 
> {noformat}
> ccm create inc-repair-issue -v github:jasobrown/13938 -n 3
> # Allow jmx access and remove all rpc_ settings in yaml
> for f in ~/.ccm/inc-repair-issue/node*/conf/cassandra-env.sh;
> do
>   sed -i'' -e 's/com.sun.management.jmxremote.authenticate=true/com.sun.management.jmxremote.authenticate=false/g' $f
> done
> for f in ~/.ccm/inc-repair-issue/node*/conf/cassandra.yaml;
> do
>   grep -v "rpc_" $f > ${f}.tmp
>   cat ${f}.tmp > $f
> done
> ccm start
> {noformat}
> I used [tlp-stress|https://github.com/thelastpickle/tlp-stress] to generate a 
> few tens of MB of data (I killed it after some time). cassandra-stress would 
> work as well:
> {noformat}
> bin/tlp-stress run BasicTimeSeries -i 1M -p 1M -t 2 --rate 5000 \
>   --replication "{'class':'SimpleStrategy', 'replication_factor':2}" \
>   --compaction "{'class': 'SizeTieredCompactionStrategy'}" \
>   --host 127.0.0.1
> {noformat}
> Flush and delete all SSTables in node1 :
> {noformat}
> ccm node1 nodetool flush
> ccm node1 stop
> rm -f ~/.ccm/inc-repair-issue/node1/data0/tlp_stress/sensor*/*.*
> ccm node1 start
> {noformat}
> Then throttle streaming throughput to 1MB/s so we have time to take node1 
> down during the streaming phase and run repair:
> {noformat}
> ccm node1 nodetool setstreamthroughput 1
> ccm node2 nodetool setstreamthroughput 1
> ccm node3 nodetool setstreamthroughput 1
> ccm node1 nodetool repair tlp_stress
> {noformat}
> Once streaming starts, shut down node1 and start it again :
> {noformat}
> ccm node1 stop
> ccm node1 start
> {noformat}
> Run repair again :
> {noformat}
> ccm node1 nodetool repair tlp_stress
> {noformat}
> The command will return very quickly, showing that it skipped all sstables :
> {noformat}
> [2018-08-31 19:05:16,292] Repair completed successfully
> [2018-08-31 19:05:16,292] Repair command #1 finished in 2 seconds
> $ ccm node1 nodetool status
> Datacenter: datacenter1
> =======================
> Status=Up/Down
> |/ State=Normal/Leaving/Joining/Moving
> --  Address    Load        Tokens  Owns  Host ID                               Rack
> UN  127.0.0.1  228,64 KiB  256     ?     437dc9cd-b1a1-41a5-961e-cfc99763e29f  rack1
> UN  127.0.0.2  60,09 MiB   256     ?     fbcbbdbb-e32a-4716-8230-8ca59aa93e62  rack1
> UN  127.0.0.3  57,59 MiB   256     ?     a0b1bcc6-0fad-405a-b0bf-180a0ca31dd0  rack1
> {noformat}
> sstablemetadata will then show that nodes 2 and 3 have SSTables still in 
> "pending repair" state :
> {noformat}
> ~/.ccm/repository/gitCOLONtrunk/tools/bin/sstablemetadata na-4-big-Data.db | grep repair
> SSTable: /Users/adejanovski/.ccm/inc-repair-4.0/node2/data0/tlp_stress/sensor_data-b7375660ad3111e8a0e59357ff9c9bda/na-4-big
> Pending repair: 3844a400-ad33-11e8-b5a7-6b8dd8f31b62
> {noformat}
> Restarting these nodes does not help either.
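
The sstablemetadata check above can be repeated over every SSTable of a table to list the repair sessions that still hold SSTables. This is only a rough sketch, assuming sstablemetadata prints one "Pending repair:" line per SSTable as in the output above, with a session UUID when the SSTable is locked; pending_sessions is a hypothetical helper name:

```shell
# Hypothetical helper (not part of Cassandra's tools): read sstablemetadata
# output on stdin and print the distinct session ids found on
# "Pending repair:" lines. Values that are not UUID-shaped are ignored
# (assumption: SSTables without a pending repair do not show a UUID there).
pending_sessions() {
  grep -E '^Pending repair: [0-9a-f]{8}-([0-9a-f]{4}-){3}[0-9a-f]{12}' \
    | awk '{print $3}' | sort -u
}

# Usage, with the paths from this report (adjust to the local layout):
#   for f in ~/.ccm/inc-repair-issue/node2/data0/tlp_stress/sensor_data-*/*-Data.db; do
#     ~/.ccm/repository/gitCOLONtrunk/tools/bin/sstablemetadata "$f"
#   done | pending_sessions
```

An empty result would mean no SSTable of that table is held by a repair session on the node.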



