P.S.: Here's a corresponding log from the second node:

INFO  [AntiEntropyStage:1] 2018-06-04 13:37:16,409 Validator.java:281
- [repair #afc2ef90-67c0-11e8-b07c-c365701888e8] Sending completed
merkle tree to /14.0.53.234 for asm_log.event
INFO  [StreamReceiveTask:30] 2018-06-04 14:14:28,989
StreamResultFuture.java:187 - [Stream
#6244fd50-67ff-11e8-b07c-c365701888e8] Session with /14.0.53.234 is
complete
INFO  [StreamReceiveTask:30] 2018-06-04 14:14:28,990
StreamResultFuture.java:219 - [Stream
#6244fd50-67ff-11e8-b07c-c365701888e8] All sessions completed
INFO  [AntiEntropyStage:1] 2018-06-04 14:14:29,000
ActiveRepairService.java:452 - [repair
#af1aefc0-67c0-11e8-b07c-c365701888e8] Not a global repair, will not
do anticompaction


Why is there no anticompaction if it's an incremental repair?
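For reference, this is roughly how I've been checking the repaired flag on disk (a sketch; the data directory is an assumption based on a default 3.11 layout, adjust for yours):

```shell
# Print the repairedAt status of each SSTable of asm_log.event.
# Path assumes the default Cassandra data directory layout.
for f in /var/lib/cassandra/data/asm_log/event-*/mc-*-big-Data.db; do
  echo "$f"
  sstablemetadata "$f" | grep -i "repaired"
done
```

All of them still report "Repaired at: 0".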

We currently have two datacenters; this concerns the second one, which
we recently brought up (with nodetool rebuild). We cannot do a repair
across datacenters, because nodes in the old DC would run out of disk
space.
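For completeness, the repair is started roughly like this (reconstructed from the repair options in the log below, so the exact invocation is an assumption):

```shell
# DC-local repair of the event table; incremental is the default
# on 3.11, and -dc restricts the repair to one datacenter.
nodetool repair -dc DC1 prod event
```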

Regards,

Martin



On Tue, Jun 5, 2018 at 6:06 PM, Martin Mačura <m.mac...@gmail.com> wrote:
> Hi,
> we're on Cassandra 3.11.2, and we're having some issues with repairs.
> They take ages to complete, and some time ago incremental repair
> stopped working: SSTables are no longer being marked as repaired,
> even though the repair reports success.
>
> Running a full or incremental repair does not make any difference.
>
> Here's a log of a typical repair (omitted a lot of 'Maximum memory
> usage' messages):
>
> INFO  [Repair-Task-12] 2018-06-04 06:29:50,396 RepairRunnable.java:139
> - Starting repair command #11 (af1aefc0-67c0-11e8-b07c-c365701888e8),
> repairing keyspace prod with repair options (parallelism: parallel,
> primary range: false, incremental: true, job threads: 1,
> ColumnFamilies: [event], dataCenters: [DC1], hosts: [], # of ranges:
> 1280, pull repair: false)
> INFO  [Repair-Task-12] 2018-06-04 06:29:51,497 RepairSession.java:228
> - [repair #afc2ef90-67c0-11e8-b07c-c365701888e8] new session: will
> sync /14.0.53.234, /14.0.52.115 on range [...] for asm_log.[event]
> INFO  [Repair#11:1] 2018-06-04 06:29:51,776 RepairJob.java:169 -
> [repair #afc2ef90-67c0-11e8-b07c-c365701888e8] Requesting merkle trees
> for event (to [/14.0.52.115, /14.0.53.234])
> INFO  [ValidationExecutor:10] 2018-06-04 06:31:13,859
> NoSpamLogger.java:91 - Maximum memory usage reached (512.000MiB),
> cannot allocate chunk of 1.000MiB
> WARN  [PERIODIC-COMMIT-LOG-SYNCER] 2018-06-04 06:32:01,385
> NoSpamLogger.java:94 - Out of 14 commit log syncs over the past
> 134.02s with average duration of 34.90ms, 2 have exceeded the
> configured commit interval by an average of 60.66ms
> ...
> INFO  [ValidationExecutor:10] 2018-06-04 13:31:19,011
> NoSpamLogger.java:91 - Maximum memory usage reached (512.000MiB),
> cannot allocate chunk of 1.000MiB
> INFO  [AntiEntropyStage:1] 2018-06-04 13:37:17,357
> RepairSession.java:180 - [repair
> #afc2ef90-67c0-11e8-b07c-c365701888e8] Received merkle tree for event
> from /14.0.52.115
> INFO  [ValidationExecutor:10] 2018-06-04 13:46:19,281
> NoSpamLogger.java:91 - Maximum memory usage reached (512.000MiB),
> cannot allocate chunk of 1.000MiB
> INFO  [IndexSummaryManager:1] 2018-06-04 13:57:18,772
> IndexSummaryRedistribution.java:76 - Redistributing index summaries
> INFO  [AntiEntropyStage:1] 2018-06-04 13:58:21,971
> RepairSession.java:180 - [repair
> #afc2ef90-67c0-11e8-b07c-c365701888e8] Received merkle tree for event
> from /14.0.53.234
> INFO  [RepairJobTask:4] 2018-06-04 13:58:39,780 SyncTask.java:73 -
> [repair #afc2ef90-67c0-11e8-b07c-c365701888e8] Endpoints /14.0.52.115
> and /14.0.53.234 have 15406 range(s) out of sync for event
> INFO  [RepairJobTask:4] 2018-06-04 13:58:39,781 LocalSyncTask.java:71
> - [repair #afc2ef90-67c0-11e8-b07c-c365701888e8] Performing streaming
> repair of 15406 ranges with /14.0.52.115
> INFO  [RepairJobTask:4] 2018-06-04 13:59:49,075
> StreamResultFuture.java:90 - [Stream
> #6244fd50-67ff-11e8-b07c-c365701888e8] Executing streaming plan for
> Repair
> INFO  [StreamConnectionEstablisher:3] 2018-06-04 13:59:49,076
> StreamSession.java:266 - [Stream
> #6244fd50-67ff-11e8-b07c-c365701888e8] Starting streaming to
> /14.0.52.115
> INFO  [StreamConnectionEstablisher:3] 2018-06-04 13:59:49,089
> StreamCoordinator.java:264 - [Stream
> #6244fd50-67ff-11e8-b07c-c365701888e8, ID#0] Beginning stream session
> with /14.0.52.115
> INFO  [STREAM-IN-/14.0.52.115:7000] 2018-06-04 14:01:14,423
> StreamResultFuture.java:173 - [Stream
> #6244fd50-67ff-11e8-b07c-c365701888e8 ID#0] Prepare completed.
> Receiving 321 files(6.238GiB), sending 318 files(6.209GiB)
> WARN  [Service Thread] 2018-06-04 14:12:15,578 GCInspector.java:282 -
> ConcurrentMarkSweep GC in 4095ms.  CMS Old Gen: 4086661264 ->
> 1107272664; Par Eden Space: 503316480 -> 0; Par Survivor Space:
> 21541464 -> 0
> ...
> WARN  [GossipTasks:1] 2018-06-04 14:12:15,677 FailureDetector.java:288
> - Not marking nodes down due to local pause of 5123793157 > 5000000000
> INFO  [ScheduledTasks:1] 2018-06-04 14:12:20,611 NoSpamLogger.java:91
> - Some operations were slow, details available at debug level
> (debug.log)
> INFO  [STREAM-IN-/14.0.52.115:7000] 2018-06-04 14:14:29,188
> StreamResultFuture.java:187 - [Stream
> #6244fd50-67ff-11e8-b07c-c365701888e8] Session with /14.0.52.115 is
> complete
> INFO  [STREAM-IN-/14.0.52.115:7000] 2018-06-04 14:14:29,190
> StreamResultFuture.java:219 - [Stream
> #6244fd50-67ff-11e8-b07c-c365701888e8] All sessions completed
> INFO  [STREAM-IN-/14.0.52.115:7000] 2018-06-04 14:14:29,190
> LocalSyncTask.java:121 - [repair
> #afc2ef90-67c0-11e8-b07c-c365701888e8] Sync complete using session
> afc2ef90-67c0-11e8-b07c-c365701888e8 between /14.0.52.115 and
> /14.0.53.234 on event
> INFO  [RepairJobTask:5] 2018-06-04 14:14:29,191 RepairJob.java:143 -
> [repair #afc2ef90-67c0-11e8-b07c-c365701888e8] event is fully synced
> INFO  [RepairJobTask:5] 2018-06-04 14:14:29,193 RepairSession.java:270
> - [repair #afc2ef90-67c0-11e8-b07c-c365701888e8] Session completed
> successfully
>
>
>
> Tablestats:
>                SSTable count: 714
>                Space used (live): 489416489322
>                Space used (total): 489416489322
>                Space used by snapshots (total): 0
>                Off heap memory used (total): 311166982
>                SSTable Compression Ratio: 0.28468835622230676
>                Number of partitions (estimate): 22511321
>                Memtable cell count: 6635
>                Memtable data size: 3236775
>                Memtable off heap memory used: 0
>                Memtable switch count: 1649
>                ...
>                Percent repaired: 85.62
>                Bloom filter false positives: 36849
>                Bloom filter false ratio: 0.00024
>                Bloom filter space used: 87092784
>                Bloom filter off heap memory used: 87061568
>                Index summary off heap memory used: 15474806
>                Compression metadata off heap memory used: 208630608
>                Compacted partition minimum bytes: 61
>                Compacted partition maximum bytes: 107964792
>                Compacted partition mean bytes: 35262
>                Average live cells per slice (last five minutes): 301.46281597264624
>                Maximum live cells per slice (last five minutes): 3311
>                Average tombstones per slice (last five minutes): 1.0
>                Maximum tombstones per slice (last five minutes): 1
>                Dropped Mutations: 0
>
> Histograms:
>
> Percentile  SSTables     Write Latency      Read Latency    Partition Size        Cell Count
>                               (micros)          (micros)           (bytes)
> 50%             2.00             24.60           3379.39              3311               310
> 75%             2.00             29.52           8409.01             51012              5722
> 95%             2.00             35.43          30130.99            105778              9887
> 98%             2.00             42.51          43388.63            263210             35425
> 99%             2.00             42.51          62479.63            263210             35425
> Min             0.00              9.89            182.79                61                 0
> Max             3.00            263.21         155469.30         107964792          10090808
>
>
>
> Thanks for any help!
>
> Martin

---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
For additional commands, e-mail: user-h...@cassandra.apache.org