[
https://issues.apache.org/jira/browse/CASSANDRA-16406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17271317#comment-17271317
]
Alexander Dejanovski commented on CASSANDRA-16406:
--------------------------------------------------
[~spod], seems like you added most of these debug loggings.
Are you ok with me moving them to TRACE level?
> Debug logging affects repair performance
> ----------------------------------------
>
> Key: CASSANDRA-16406
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16406
> Project: Cassandra
> Issue Type: Bug
> Components: Consistency/Repair
> Reporter: Alexander Dejanovski
> Assignee: Alexander Dejanovski
> Priority: Normal
> Attachments: with_debug_logging.png, without_debug_logging.png
>
>
> While working on the repair quality testing in CASSANDRA-16245, it appeared
> that the node coordinating repairs on a 20GB per node dataset was generating
> more than 2G of log with a total duration for the incremental repair
> scenarios of ~2h40m:
> https://app.circleci.com/pipelines/github/riptano/cassandra-rtest/37/workflows/6a7a41c8-0fca-4080-b37e-3b38998b3fab/jobs/49/steps
> ]
> !with_debug_logging.png!
> The logs showed a lot of messages from the MerkleTree class at high pace:
> {noformat}
> DEBUG [RepairJobTask:4] 2021-01-21 16:15:29,631 MerkleTree.java:308 - (10)
> Fully inconsistent range [#<TreeRange
> (-6738651767434294419,-6738622715859497972] depth=11>, #<TreeRange
> (-6738622715859497972,-6738593664284701525] depth=11>]
> DEBUG [RepairJobTask:4] 2021-01-21 16:15:29,631 MerkleTree.java:292 - (9)
> Inconsistent digest on right sub-range #<TreeRange
> (-6738593664284701525,-6738535561135108630] depth=10>: [#<OffHeapInner
> -6738564612709905078
> hash=[b8efd3d684474276f316b1bc9f23b463cda4f8d620a4b5cf4d2c7dbb101bbe85]
> children=[#<OffHeapLeaf
> [fff431a30da07558c5106897da9f3fc3bcc598fb59e67c6959f4e70b6d3a7722]>
> #<OffHeapLeaf
> [471be27589e7372e3606d92b45bc8ba07161602d7942c9a614d89ab07d21c9a7]>]>,
> #<OffHeapInner -6738564612709905078
> hash=[95334327a0b50b7d3c51d6588a5f3d57701e0b978f6ab28e85cda3cb5a094eb5]
> children=[#<OffHeapLeaf
> [0d2c5c21f6b04b098f71f43d0f3f3a5a823b372eb850261774847c456629bedd]>
> #<OffHeapLeaf
> [981f1f0656054074b32022658560070df2253cb9373a9499f149df8e3c20f068]>]>]
> DEBUG [RepairJobTask:4] 2021-01-21 16:15:29,631 MerkleTree.java:262 - (10)
> Hashing sub-ranges [#<TreeRange (-6738593664284701525,-6738564612709905078]
> depth=11>, #<TreeRange (-6738564612709905078,-6738535561135108630] depth=11>]
> for #<TreeRange (-6738593664284701525,-6738535561135108630] depth=10> divided
> by midpoint -6738564612709905078
> DEBUG [RepairJobTask:4] 2021-01-21 16:15:29,631 MerkleTree.java:272 - (10)
> Inconsistent digest on left sub-range #<TreeRange
> (-6738593664284701525,-6738564612709905078] depth=11>: [#<OffHeapLeaf
> [fff431a30da07558c5106897da9f3fc3bcc598fb59e67c6959f4e70b6d3a7722]>,
> #<OffHeapLeaf
> [0d2c5c21f6b04b098f71f43d0f3f3a5a823b372eb850261774847c456629bedd]>]
> DEBUG [RepairJobTask:4] 2021-01-21 16:15:29,631 MerkleTree.java:292 - (10)
> Inconsistent digest on right sub-range #<TreeRange
> (-6738564612709905078,-6738535561135108630] depth=11>: [#<OffHeapLeaf
> [471be27589e7372e3606d92b45bc8ba07161602d7942c9a614d89ab07d21c9a7]>,
> #<OffHeapLeaf
> [981f1f0656054074b32022658560070df2253cb9373a9499f149df8e3c20f068]>]
> DEBUG [RepairJobTask:4] 2021-01-21 16:15:29,631 MerkleTree.java:308 - (10)
> Fully inconsistent range [#<TreeRange
> (-6738593664284701525,-6738564612709905078] depth=11>, #<TreeRange
> (-6738564612709905078,-6738535561135108630] depth=11>]
> DEBUG [RepairJobTask:4] 2021-01-21 16:15:29,631 MerkleTree.java:308 - (9)
> Fully inconsistent range [#<TreeRange
> (-6738651767434294419,-6738593664284701525] depth=10>, #<TreeRange
> (-6738593664284701525,-6738535561135108630] depth=10>]
> DEBUG [RepairJobTask:4] 2021-01-21 16:15:29,631 MerkleTree.java:292 - (8)
> Inconsistent digest on right sub-range #<TreeRange
> (-6738535561135108630,-6738419354835922841] depth=9>: [#<OffHeapInner
> -6738477457985515736
> hash=[806ede1a35783bf5fafe8b8ccefe4d3ff48e8f0e1314f8a9ce4b23f13fed4bf9]
> children=[#<OffHeapInner -6738506509560312183
> hash=[e6d133afbd8041266f8a1cfe456ff07c9e7debe8ff54279b579b252f2af78b6b]
> children=[#]> #<OffHeapInner -6738448406410719289
> hash=[66bfedb588f87ad3957497728b91bd436af364e6ec40df3299d006de151ac092]
> children=[#]>]>, #<OffHeapInner -6738477457985515736
> hash=[bf128431ddf7ad72417ce853f90abd0bc7a2a38088d5fbec864c02c516ed6458]
> children=[#<OffHeapInner -6738506509560312183
> hash=[12a71b62ceb096369c4797f56edd91a16dd9803348b4f4cfe1454027d7dae3d2]
> children=[#]> #<OffHeapInner -6738448406410719289
> hash=[adb59f5313473b44dd3b7fa697d72caaaa7b23b3c0610f23670942e2c137878a]
> children=[#]>]>]{noformat}
> When disabling debug logging, the duration dropped to ~2h05m with decent log
> sizes:
> [https://app.circleci.com/pipelines/github/riptano/cassandra-rtest/38/workflows/c91e6ceb-438b-4f00-b38a-d670f9afb4c3/jobs/51]
> !without_debug_logging.png!
> There's apparently too much logging for each inconsistency found in the
> Merkle tree comparisons and we should move this to TRACE level if we still
> want to allow debug logging to be turned on by default.
> I'll prepare a patch for the MerkleTree class and run the repair testing
> scenarios again to verify their duration.
--
This message was sent by Atlassian Jira
(v8.3.4#803005)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]