[
https://issues.apache.org/jira/browse/CASSANDRA-11965?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Wei Deng updated CASSANDRA-11965:
---------------------------------
Description:
[~jbellis] mentioned this as a potential improvement in his 2013 committer
meeting notes
(http://grokbase.com/t/cassandra/dev/132s6sh415/notes-from-committers-meeting-streaming-and-repair):
"making the repair coordinator smarter to know when to avoid duplicate
streaming. E.g., if replicas A and B have row X, but C does not, currently both
A and B will stream to C."
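As a rough illustration of the idea in that quote (not Cassandra's actual repair code), the sketch below groups replicas whose Merkle trees agree and lets each node receive from only one member of every other group. It simplifies each tree to a single root hash per replica, whereas real repair compares trees range by range; the function names and the "stale"/"full" hash labels are made up for the example.

```python
from collections import defaultdict
from itertools import combinations


def pairwise_syncs(root_hash_by_node):
    """Pairwise model: one sync task per pair of replicas whose hashes differ."""
    return [
        (a, b)
        for a, b in combinations(sorted(root_hash_by_node), 2)
        if root_hash_by_node[a] != root_hash_by_node[b]
    ]


def deduplicated_streams(root_hash_by_node):
    """Hypothetical alternative: (source, target) streams, one source per group."""
    # Replicas with the same hash hold the same data, so they form one group.
    groups = defaultdict(list)
    for node in sorted(root_hash_by_node):
        groups[root_hash_by_node[node]].append(node)

    streams = []
    for source_hash, members in groups.items():
        source = members[0]  # any member of a consistent group would do
        for target_hash, targets in groups.items():
            if target_hash != source_hash:
                streams.extend((source, target) for target in targets)
    return streams


# Assumed scenario: two replicas agree, one replica has lost an SSTable.
hashes = {
    "172.31.40.215": "stale",
    "172.31.36.148": "full",
    "172.31.44.75": "full",
}

print(pairwise_syncs(hashes))
print(deduplicated_streams(hashes))
```

In the pairwise model the stale replica takes part in two syncs, one with each consistent replica. In the grouped model it has a single inbound source.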
I tested in C* 3.0.6 and it looks like this is still happening. On a
3-node cluster I inserted into a trivial table under a keyspace with RF=3 and
forced two flushes on all nodes so that each node had two SSTables. I then
shut down the 1st node, removed one SSTable from its data directory, and
restarted the node. I connected cqlsh to this node and verified that with
CL.ONE the data was indeed missing. I then moved on to the 2nd node and ran
"nodetool repair <keyspace> <table>". Here is what I observed in system.log
on the 2nd node (as repair coordinator):
{noformat}
INFO [Thread-47] 2016-06-06 23:19:54,173 RepairRunnable.java:125 - Starting
repair command #1, repairing keyspace weitest with repair options (parallelism:
parallel, primary range: false, incremental: true, job threads: 1,
ColumnFamilies: [songs], dataCenters: [], hosts: [], # of ranges: 3)
INFO [Thread-47] 2016-06-06 23:19:54,253 RepairSession.java:237 - [repair
#2d177cc0-2c3d-11e6-94d2-b35b6c93de57] new session: will sync /172.31.44.75,
/172.31.40.215, /172.31.36.148 on range
[(3074457345618258602,-9223372036854775808],
(-9223372036854775808,-3074457345618258603],
(-3074457345618258603,3074457345618258602]] for weitest.[songs]
INFO [Repair#1:1] 2016-06-06 23:19:54,268 RepairJob.java:172 - [repair
#2d177cc0-2c3d-11e6-94d2-b35b6c93de57] Requesting merkle trees for songs (to
[/172.31.40.215, /172.31.36.148, /172.31.44.75])
INFO [AntiEntropyStage:1] 2016-06-06 23:19:54,335 RepairSession.java:181 -
[repair #2d177cc0-2c3d-11e6-94d2-b35b6c93de57] Received merkle tree for songs
from /172.31.40.215
INFO [AntiEntropyStage:1] 2016-06-06 23:19:54,427 RepairSession.java:181 -
[repair #2d177cc0-2c3d-11e6-94d2-b35b6c93de57] Received merkle tree for songs
from /172.31.44.75
INFO [AntiEntropyStage:1] 2016-06-06 23:19:54,460 RepairSession.java:181 -
[repair #2d177cc0-2c3d-11e6-94d2-b35b6c93de57] Received merkle tree for songs
from /172.31.36.148
INFO [RepairJobTask:1] 2016-06-06 23:19:54,466 SyncTask.java:73 - [repair
#2d177cc0-2c3d-11e6-94d2-b35b6c93de57] Endpoints /172.31.40.215 and
/172.31.36.148 have 3 range(s) out of sync for songs
INFO [RepairJobTask:1] 2016-06-06 23:19:54,467 RemoteSyncTask.java:54 -
[repair #2d177cc0-2c3d-11e6-94d2-b35b6c93de57] Forwarding streaming repair of 3
ranges to /172.31.40.215 (to be streamed with /172.31.36.148)
INFO [RepairJobTask:1] 2016-06-06 23:19:54,472 SyncTask.java:66 - [repair
#2d177cc0-2c3d-11e6-94d2-b35b6c93de57] Endpoints /172.31.36.148 and
/172.31.44.75 are consistent for songs
INFO [RepairJobTask:3] 2016-06-06 23:19:54,474 SyncTask.java:73 - [repair
#2d177cc0-2c3d-11e6-94d2-b35b6c93de57] Endpoints /172.31.40.215 and
/172.31.44.75 have 3 range(s) out of sync for songs
INFO [RepairJobTask:3] 2016-06-06 23:19:54,529 LocalSyncTask.java:68 -
[repair #2d177cc0-2c3d-11e6-94d2-b35b6c93de57] Performing streaming repair of 3
ranges with /172.31.40.215
INFO [RepairJobTask:3] 2016-06-06 23:19:54,574 StreamResultFuture.java:86 -
[Stream #2d423640-2c3d-11e6-94d2-b35b6c93de57] Executing streaming plan for
Repair
INFO [StreamConnectionEstablisher:1] 2016-06-06 23:19:54,576
StreamSession.java:238 - [Stream #2d423640-2c3d-11e6-94d2-b35b6c93de57]
Starting streaming to /172.31.40.215
INFO [StreamConnectionEstablisher:1] 2016-06-06 23:19:54,580
StreamCoordinator.java:213 - [Stream #2d423640-2c3d-11e6-94d2-b35b6c93de57,
ID#0] Beginning stream session with /172.31.40.215
INFO [STREAM-IN-/172.31.40.215] 2016-06-06 23:19:54,588
StreamResultFuture.java:168 - [Stream #2d423640-2c3d-11e6-94d2-b35b6c93de57
ID#0] Prepare completed. Receiving 0 files(0 bytes), sending 1 files(174 bytes)
INFO [STREAM-IN-/172.31.40.215] 2016-06-06 23:19:55,117
StreamResultFuture.java:182 - [Stream #2d423640-2c3d-11e6-94d2-b35b6c93de57]
Session with /172.31.40.215 is complete
INFO [STREAM-IN-/172.31.40.215] 2016-06-06 23:19:55,120
StreamResultFuture.java:214 - [Stream #2d423640-2c3d-11e6-94d2-b35b6c93de57]
All sessions completed
INFO [STREAM-IN-/172.31.40.215] 2016-06-06 23:19:55,123
LocalSyncTask.java:114 - [repair #2d177cc0-2c3d-11e6-94d2-b35b6c93de57] Sync
complete using session 2d177cc0-2c3d-11e6-94d2-b35b6c93de57 between
/172.31.40.215 and /172.31.44.75 on songs
INFO [RepairJobTask:3] 2016-06-06 23:19:55,123 RepairJob.java:143 - [repair
#2d177cc0-2c3d-11e6-94d2-b35b6c93de57] songs is fully synced
INFO [RepairJobTask:3] 2016-06-06 23:19:55,125 RepairSession.java:279 -
[repair #2d177cc0-2c3d-11e6-94d2-b35b6c93de57] Session completed successfully
INFO [RepairJobTask:3] 2016-06-06 23:19:55,126 RepairRunnable.java:240 -
Repair session 2d177cc0-2c3d-11e6-94d2-b35b6c93de57 for range
[(3074457345618258602,-9223372036854775808],
(-9223372036854775808,-3074457345618258603],
(-3074457345618258603,3074457345618258602]] finished
INFO [CompactionExecutor:991] 2016-06-06 23:19:55,131
CompactionManager.java:511 - Starting anticompaction for weitest.songs on
2/[BigTableReader(path='/mnt/ephemeral/cassandra/data/weitest/songs-b254f711134611e692c45f08f496518a/ma-2-big-Data.db'),
BigTableReader(path='/mnt/ephemeral/cassandra/data/weitest/songs-b254f711134611e692c45f08f496518a/ma-1-big-Data.db')]
sstables
INFO [CompactionExecutor:991] 2016-06-06 23:19:55,131
CompactionManager.java:540 - SSTable
BigTableReader(path='/mnt/ephemeral/cassandra/data/weitest/songs-b254f711134611e692c45f08f496518a/ma-2-big-Data.db')
fully contained in range (-9223372036854775808,-9223372036854775808], mutating
repairedAt instead of anticompacting
INFO [CompactionExecutor:991] 2016-06-06 23:19:55,135
CompactionManager.java:540 - SSTable
BigTableReader(path='/mnt/ephemeral/cassandra/data/weitest/songs-b254f711134611e692c45f08f496518a/ma-1-big-Data.db')
fully contained in range (-9223372036854775808,-9223372036854775808], mutating
repairedAt instead of anticompacting
INFO [CompactionExecutor:991] 2016-06-06 23:19:55,137
CompactionManager.java:578 - Completed anticompaction successfully
INFO [InternalResponseStage:8] 2016-06-06 23:19:55,145
RepairRunnable.java:322 - Repair command #1 finished in 0 seconds
{noformat}
This is the log from the 1st node (/172.31.40.215), where one SSTable was
missing and needed to be repaired. It confirms that two equivalent streams
(1 file, 174 bytes each) arrived from the two other replicas, /172.31.36.148
and /172.31.44.75:
{noformat}
INFO [AntiEntropyStage:1] 2016-06-06 23:19:54,307 Validator.java:274 -
[repair #2d177cc0-2c3d-11e6-94d2-b35b6c93de57] Sending completed merkle tree to
/172.31.44.75 for weitest.songs
INFO [AntiEntropyStage:1] 2016-06-06 23:19:54,470 StreamingRepairTask.java:58
- [streaming task #2d177cc0-2c3d-11e6-94d2-b35b6c93de57] Performing streaming
repair of 3 ranges with /172.31.36.148
INFO [AntiEntropyStage:1] 2016-06-06 23:19:54,497 StreamResultFuture.java:86
- [Stream #2d38e770-2c3d-11e6-80ed-e382fc580483] Executing streaming plan for
Repair
INFO [StreamConnectionEstablisher:1] 2016-06-06 23:19:54,498
StreamSession.java:238 - [Stream #2d38e770-2c3d-11e6-80ed-e382fc580483]
Starting streaming to /172.31.36.148
INFO [StreamConnectionEstablisher:1] 2016-06-06 23:19:54,512
StreamCoordinator.java:213 - [Stream #2d38e770-2c3d-11e6-80ed-e382fc580483,
ID#0] Beginning stream session with /172.31.36.148
INFO [STREAM-IN-/172.31.36.148] 2016-06-06 23:19:54,562
StreamResultFuture.java:168 - [Stream #2d38e770-2c3d-11e6-80ed-e382fc580483
ID#0] Prepare completed. Receiving 1 files(174 bytes), sending 0 files(0 bytes)
INFO [STREAM-INIT-/172.31.44.75:57066] 2016-06-06 23:19:54,579
StreamResultFuture.java:111 - [Stream #2d423640-2c3d-11e6-94d2-b35b6c93de57
ID#0] Creating new streaming plan for Repair
INFO [STREAM-INIT-/172.31.44.75:57066] 2016-06-06 23:19:54,580
StreamResultFuture.java:118 - [Stream #2d423640-2c3d-11e6-94d2-b35b6c93de57,
ID#0] Received streaming plan for Repair
INFO [STREAM-INIT-/172.31.44.75:47984] 2016-06-06 23:19:54,581
StreamResultFuture.java:118 - [Stream #2d423640-2c3d-11e6-94d2-b35b6c93de57,
ID#0] Received streaming plan for Repair
INFO [STREAM-IN-/172.31.44.75] 2016-06-06 23:19:54,584
StreamResultFuture.java:168 - [Stream #2d423640-2c3d-11e6-94d2-b35b6c93de57
ID#0] Prepare completed. Receiving 1 files(174 bytes), sending 0 files(0 bytes)
INFO [StreamReceiveTask:1] 2016-06-06 23:19:55,034
StreamResultFuture.java:182 - [Stream #2d38e770-2c3d-11e6-80ed-e382fc580483]
Session with /172.31.36.148 is complete
INFO [StreamReceiveTask:1] 2016-06-06 23:19:55,037
StreamResultFuture.java:214 - [Stream #2d38e770-2c3d-11e6-80ed-e382fc580483]
All sessions completed
INFO [StreamReceiveTask:1] 2016-06-06 23:19:55,040
StreamingRepairTask.java:85 - [repair #2d177cc0-2c3d-11e6-94d2-b35b6c93de57]
streaming task succeed, returning response to /172.31.44.75
INFO [StreamReceiveTask:2] 2016-06-06 23:19:55,114
StreamResultFuture.java:182 - [Stream #2d423640-2c3d-11e6-94d2-b35b6c93de57]
Session with /172.31.44.75 is complete
INFO [StreamReceiveTask:2] 2016-06-06 23:19:55,115
StreamResultFuture.java:214 - [Stream #2d423640-2c3d-11e6-94d2-b35b6c93de57]
All sessions completed
INFO [CompactionExecutor:3] 2016-06-06 23:19:55,130
CompactionManager.java:511 - Starting anticompaction for weitest.songs on
1/[BigTableReader(path='/mnt/ephemeral/cassandra/data/weitest/songs-b254f711134611e692c45f08f496518a/ma-4-big-Data.db'),
BigTableReader(path='/mnt/ephemeral/cassandra/data/weitest/songs-b254f711134611e692c45f08f496518a/ma-3-big-Data.db'),
BigTableReader(path='/mnt/ephemeral/cassandra/data/weitest/songs-b254f711134611e692c45f08f496518a/ma-2-big-Data.db')]
sstables
INFO [CompactionExecutor:3] 2016-06-06 23:19:55,131
CompactionManager.java:540 - SSTable
BigTableReader(path='/mnt/ephemeral/cassandra/data/weitest/songs-b254f711134611e692c45f08f496518a/ma-2-big-Data.db')
fully contained in range (-9223372036854775808,-9223372036854775808], mutating
repairedAt instead of anticompacting
INFO [CompactionExecutor:3] 2016-06-06 23:19:55,135
CompactionManager.java:578 - Completed anticompaction successfully
{noformat}
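To spot this pattern in a larger log, a small script can count the inbound streams reported by the "Prepare completed" entries. This is only a sketch: it assumes each entry sits on a single line in system.log (the entries above are wrapped by the mail client) and that the message format matches the 3.0.6 output shown here. The regular expression and function name are my own.

```python
import re

# Matches the "Prepare completed" entries in the format shown above.
PREPARE = re.compile(
    r"\[STREAM-IN-/(?P<peer>[\d.]+)\].*?"
    r"\[Stream #(?P<stream>[0-9a-f-]+)[^\]]*\] Prepare completed\. "
    r"Receiving (?P<files>\d+) files\((?P<bytes>\d+) bytes\)"
)


def incoming_streams(lines):
    """Return (peer, files, bytes) for every session that receives data."""
    found = []
    for line in lines:
        match = PREPARE.search(line)
        if match and int(match.group("files")) > 0:
            found.append(
                (match.group("peer"),
                 int(match.group("files")),
                 int(match.group("bytes")))
            )
    return found


# The two relevant entries from the 1st node, joined back onto one line each.
sample = [
    "INFO [STREAM-IN-/172.31.36.148] 2016-06-06 23:19:54,562 "
    "StreamResultFuture.java:168 - [Stream #2d38e770-2c3d-11e6-80ed-e382fc580483 "
    "ID#0] Prepare completed. Receiving 1 files(174 bytes), sending 0 files(0 bytes)",
    "INFO [STREAM-IN-/172.31.44.75] 2016-06-06 23:19:54,584 "
    "StreamResultFuture.java:168 - [Stream #2d423640-2c3d-11e6-94d2-b35b6c93de57 "
    "ID#0] Prepare completed. Receiving 1 files(174 bytes), sending 0 files(0 bytes)",
]

for peer, files, size in incoming_streams(sample):
    print(f"received {files} file(s), {size} bytes from {peer}")
```

Two inbound sessions of identical size from different peers within the same repair are a hint, not proof, that the same data was streamed twice.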
This would be a welcome improvement for environments where compaction can
easily fall behind because of the many small SSTables that repair streaming
brings in (LCS and the now-obsolete DTCS both suffer from this a lot).
> Duplicated effort in repair streaming
> -------------------------------------
>
> Key: CASSANDRA-11965
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11965
> Project: Cassandra
> Issue Type: Improvement
> Components: Streaming and Messaging
> Reporter: Wei Deng
>
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)