[ 
https://issues.apache.org/jira/browse/CASSANDRA-11845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15292145#comment-15292145
 ] 

Paulo Motta commented on CASSANDRA-11845:
-----------------------------------------

Unfortunately, it's not possible to track down the cause from the logs you 
posted. You'll need to [enable DEBUG 
logging|https://docs.datastax.com/en/cassandra/2.1/cassandra/configuration/configLoggingLevels_r.html]
 on the {{org.apache.cassandra.streaming}} and {{org.apache.cassandra.repair}} 
packages and attach the full debug.log to this ticket (please use JIRA's file 
attachment feature instead of pasting logs into the comments).
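
As a rough sketch (assuming a stock {{conf/logback.xml}}; the file location 
and the appenders already defined there may differ in your install), the two 
loggers could be declared like this:

{code:xml}
<!-- Sketch only: place inside the existing <configuration> element. -->
<!-- Raises just the streaming and repair packages to DEBUG. -->
<logger name="org.apache.cassandra.streaming" level="DEBUG"/>
<logger name="org.apache.cassandra.repair" level="DEBUG"/>
{code}

Alternatively, {{nodetool setlogginglevel org.apache.cassandra.streaming DEBUG}} 
(and the same for {{org.apache.cassandra.repair}}) should change the level at 
runtime, though as far as I know that setting is not kept across a restart.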

Please note that to cancel a hung repair you'll probably need to restart the 
involved nodes before starting a new repair (the ability to stop a repair 
will be provided by CASSANDRA-3486).

> Hanging repair in cassandra 2.2.4
> ---------------------------------
>
>                 Key: CASSANDRA-11845
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-11845
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Streaming and Messaging
>         Environment: Centos 6
>            Reporter: vin01
>            Priority: Minor
>
> So after increasing the streaming_timeout_in_ms value to 3 hours, I was able 
> to avoid the socketTimeout errors I was getting earlier 
> (https://issues.apache.org/jira/browse/CASSANDRA-11826), but now the issue 
> is that repair just stays stuck.
> current status :-
> [2016-05-19 05:52:50,835] Repair session a0e590e1-1d99-11e6-9d63-b717b380ffdd 
> for range (-3309358208555432808,-3279958773585646585] finished (progress: 54%)
> [2016-05-19 05:53:09,446] Repair session a0e590e3-1d99-11e6-9d63-b717b380ffdd 
> for range (8149151263857514385,8181801084802729407] finished (progress: 55%)
> [2016-05-19 05:53:13,808] Repair session a0e5b7f1-1d99-11e6-9d63-b717b380ffdd 
> for range (3372779397996730299,3381236471688156773] finished (progress: 55%)
> [2016-05-19 05:53:27,543] Repair session a0e5b7f3-1d99-11e6-9d63-b717b380ffdd 
> for range (-4182952858113330342,-4157904914928848809] finished (progress: 55%)
> [2016-05-19 05:53:41,128] Repair session a0e5df00-1d99-11e6-9d63-b717b380ffdd 
> for range (6499366179019889198,6523760493740195344] finished (progress: 55%)
> And it's 10:46:25 now, almost 5 hours since it got stuck right there.
> Earlier I could see repair sessions progressing in system.log, but no repair 
> logs are coming in right now; all I get is regular index summary 
> redistribution logs.
> Last repair logs I saw :-
> INFO  [RepairJobTask:5] 2016-05-19 05:53:41,125 RepairJob.java:152 - [repair 
> #a0e5df00-1d99-11e6-9d63-b717b380ffdd] TABLE_NAME is fully synced
> INFO  [RepairJobTask:5] 2016-05-19 05:53:41,126 RepairSession.java:279 - 
> [repair #a0e5df00-1d99-11e6-9d63-b717b380ffdd] Session completed successfully
> INFO  [RepairJobTask:5] 2016-05-19 05:53:41,126 RepairRunnable.java:232 - 
> Repair session a0e5df00-1d99-11e6-9d63-b717b380ffdd for range 
> (6499366179019889198,6523760493740195344] finished
> It's an incremental repair, and in the "nodetool netstats" output I can see 
> entries like :-
> Repair e3055fb0-1d9d-11e6-9d63-b717b380ffdd
>     /Node-2
>         Receiving 8 files, 1093461 bytes total. Already received 8 files, 
> 1093461 bytes total
>             
> /data/cassandra/data/KEYSPACE_NAME/TABLE_NAME-01ad9750723e11e4bfe0d3887930a87c/tmp-la-80872-big-Data.db
>  399475/399475 bytes(100%) received from idx:0/Node-2
>             
> /data/cassandra/data/KEYSPACE_NAME/TABLE_NAME-01ad9750723e11e4bfe0d3887930a87c/tmp-la-80879-big-Data.db
>  53809/53809 bytes(100%) received from idx:0/Node-2
>             
> /data/cassandra/data/KEYSPACE_NAME/TABLE_NAME-01ad9750723e11e4bfe0d3887930a87c/tmp-la-80878-big-Data.db
>  89955/89955 bytes(100%) received from idx:0/Node-2
>             
> /data/cassandra/data/KEYSPACE_NAME/TABLE_NAME-01ad9750723e11e4bfe0d3887930a87c/tmp-la-80881-big-Data.db
>  168790/168790 bytes(100%) received from idx:0/Node-2
>             
> /data/cassandra/data/KEYSPACE_NAME/TABLE_NAME-01ad9750723e11e4bfe0d3887930a87c/tmp-la-80886-big-Data.db
>  107785/107785 bytes(100%) received from idx:0/Node-2
>             
> /data/cassandra/data/KEYSPACE_NAME/TABLE_NAME-01ad9750723e11e4bfe0d3887930a87c/tmp-la-80880-big-Data.db
>  52889/52889 bytes(100%) received from idx:0/Node-2
>             
> /data/cassandra/data/KEYSPACE_NAME/TABLE_NAME-01ad9750723e11e4bfe0d3887930a87c/tmp-la-80884-big-Data.db
>  148882/148882 bytes(100%) received from idx:0/Node-2
>             
> /data/cassandra/data/KEYSPACE_NAME/TABLE_NAME-01ad9750723e11e4bfe0d3887930a87c/tmp-la-80883-big-Data.db
>  71876/71876 bytes(100%) received from idx:0/Node-2
>         Sending 5 files, 863321 bytes total. Already sent 5 files, 863321 
> bytes total
>             
> /data/cassandra/data/KEYSPACE_NAME/TABLE_NAME-01ad9750723e11e4bfe0d3887930a87c/la-73168-big-Data.db
>  161895/161895 bytes(100%) sent to idx:0/Node-2
>             
> /data/cassandra/data/KEYSPACE_NAME/TABLE_NAME-01ad9750723e11e4bfe0d3887930a87c/la-72604-big-Data.db
>  399865/399865 bytes(100%) sent to idx:0/Node-2
>             
> /data/cassandra/data/KEYSPACE_NAME/TABLE_NAME-01ad9750723e11e4bfe0d3887930a87c/la-73147-big-Data.db
>  149066/149066 bytes(100%) sent to idx:0/Node-2
>             
> /data/cassandra/data/KEYSPACE_NAME/TABLE_NAME-01ad9750723e11e4bfe0d3887930a87c/la-72682-big-Data.db
>  126000/126000 bytes(100%) sent to idx:0/Node-2
>             
> /data/cassandra/data/KEYSPACE_NAME/TABLE_NAME-01ad9750723e11e4bfe0d3887930a87c/la-73173-big-Data.db
>  26495/26495 bytes(100%) sent to idx:0/Node-2
> Repair c0c8af20-1d9c-11e6-9d63-b717b380ffdd
>     /Node-3
>         Receiving 11 files, 13896288 bytes total. Already received 11 files, 
> 13896288 bytes total
>             
> /data/cassandra/data/KEYSPACE_NAME/TABLE_NAME-01ad9750723e11e4bfe0d3887930a87c/tmp-la-79186-big-Data.db
>  1598874/1598874 bytes(100%) received from idx:0/Node-3
>             
> /data/cassandra/data/KEYSPACE_NAME/TABLE_NAME-01ad9750723e11e4bfe0d3887930a87c/tmp-la-79196-big-Data.db
>  736365/736365 bytes(100%) received from idx:0/Node-3
>             
> /data/cassandra/data/KEYSPACE_NAME/TABLE_NAME-01ad9750723e11e4bfe0d3887930a87c/tmp-la-79197-big-Data.db
>  326558/326558 bytes(100%) received from idx:0/Node-3
>             
> /data/cassandra/data/KEYSPACE_NAME/TABLE_NAME-01ad9750723e11e4bfe0d3887930a87c/tmp-la-79187-big-Data.db
>  1484827/1484827 bytes(100%) received from idx:0/Node-3
>             
> /data/cassandra/data/KEYSPACE_NAME/TABLE_NAME-01ad9750723e11e4bfe0d3887930a87c/tmp-la-79180-big-Data.db
>  393636/393636 bytes(100%) received from idx:0/Node-3
>             
> /data/cassandra/data/KEYSPACE_NAME/TABLE_NAME-01ad9750723e11e4bfe0d3887930a87c/tmp-la-79184-big-Data.db
>  825459/825459 bytes(100%) received from idx:0/Node-3
>             
> /data/cassandra/data/KEYSPACE_NAME/TABLE_NAME-01ad9750723e11e4bfe0d3887930a87c/tmp-la-79188-big-Data.db
>  3568782/3568782 bytes(100%) received from idx:0/Node-3
>             
> /data/cassandra/data/KEYSPACE_NAME/TABLE_NAME-01ad9750723e11e4bfe0d3887930a87c/tmp-la-79182-big-Data.db
>  271222/271222 bytes(100%) received from idx:0/Node-3
>             
> /data/cassandra/data/KEYSPACE_NAME/TABLE_NAME-01ad9750723e11e4bfe0d3887930a87c/tmp-la-79193-big-Data.db
>  4315497/4315497 bytes(100%) received from idx:0/Node-3
>             
> /data/cassandra/data/KEYSPACE_NAME/TABLE_NAME-01ad9750723e11e4bfe0d3887930a87c/tmp-la-79183-big-Data.db
>  19775/19775 bytes(100%) received from idx:0/Node-3
>             
> /data/cassandra/data/KEYSPACE_NAME/TABLE_NAME-01ad9750723e11e4bfe0d3887930a87c/tmp-la-79192-big-Data.db
>  355293/355293 bytes(100%) received from idx:0/Node-3
>         Sending 5 files, 9444101 bytes total. Already sent 5 files, 9444101 
> bytes total
>             
> /data/cassandra/data/KEYSPACE_NAME/TABLE_NAME-01ad9750723e11e4bfe0d3887930a87c/la-73168-big-Data.db
>  1796825/1796825 bytes(100%) sent to idx:0/Node-3
>             
> /data/cassandra/data/KEYSPACE_NAME/TABLE_NAME-01ad9750723e11e4bfe0d3887930a87c/la-72604-big-Data.db
>  4549996/4549996 bytes(100%) sent to idx:0/Node-3
>             
> /data/cassandra/data/KEYSPACE_NAME/TABLE_NAME-01ad9750723e11e4bfe0d3887930a87c/la-73147-big-Data.db
>  1658881/1658881 bytes(100%) sent to idx:0/Node-3
>             
> /data/cassandra/data/KEYSPACE_NAME/TABLE_NAME-01ad9750723e11e4bfe0d3887930a87c/la-72682-big-Data.db
>  1418335/1418335 bytes(100%) sent to idx:0/Node-3
>             
> /data/cassandra/data/KEYSPACE_NAME/TABLE_NAME-01ad9750723e11e4bfe0d3887930a87c/la-73173-big-Data.db
>  20064/20064 bytes(100%) sent to idx:0/Node-3
> Read Repair Statistics:
> Attempted: 1142
> Mismatch (Blocking): 0
> Mismatch (Background): 0
> Pool Name                    Active   Pending      Completed
> Large messages                  n/a         0            779
> Small messages                  n/a         0       14756609
> Gossip messages                 n/a         0         119647
> The last three fields, "Large messages", "Small messages" and "Gossip 
> messages", keep changing; "Large messages" has incremented by 2 in the last 
> 5 hours, and the other two are changing more frequently.
> I am unable to figure out whether the repair is still running or stuck. If 
> it's stuck, what should my course of action be if I want to get that table 
> repaired?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
