[
https://issues.apache.org/jira/browse/CASSANDRA-16406?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Alexander Dejanovski updated CASSANDRA-16406:
---------------------------------------------
Description:
While working on the repair quality testing in CASSANDRA-16245, it appeared
that the node coordinating repairs on a 20GB per node dataset was generating
more than 2G of log with a total duration for the incremental repair scenarios
of ~2h40m:
https://app.circleci.com/pipelines/github/riptano/cassandra-rtest/37/workflows/6a7a41c8-0fca-4080-b37e-3b38998b3fab/jobs/49/steps
]
!with_debug_logging.png!
The logs showed a lot of messages from the MerkleTree class at high pace:
{noformat}
DEBUG [RepairJobTask:4] 2021-01-21 16:15:29,631 MerkleTree.java:308 - (10)
Fully inconsistent range [#<TreeRange
(-6738651767434294419,-6738622715859497972] depth=11>, #<TreeRange
(-6738622715859497972,-6738593664284701525] depth=11>]
DEBUG [RepairJobTask:4] 2021-01-21 16:15:29,631 MerkleTree.java:292 - (9)
Inconsistent digest on right sub-range #<TreeRange
(-6738593664284701525,-6738535561135108630] depth=10>: [#<OffHeapInner
-6738564612709905078
hash=[b8efd3d684474276f316b1bc9f23b463cda4f8d620a4b5cf4d2c7dbb101bbe85]
children=[#<OffHeapLeaf
[fff431a30da07558c5106897da9f3fc3bcc598fb59e67c6959f4e70b6d3a7722]>
#<OffHeapLeaf
[471be27589e7372e3606d92b45bc8ba07161602d7942c9a614d89ab07d21c9a7]>]>,
#<OffHeapInner -6738564612709905078
hash=[95334327a0b50b7d3c51d6588a5f3d57701e0b978f6ab28e85cda3cb5a094eb5]
children=[#<OffHeapLeaf
[0d2c5c21f6b04b098f71f43d0f3f3a5a823b372eb850261774847c456629bedd]>
#<OffHeapLeaf
[981f1f0656054074b32022658560070df2253cb9373a9499f149df8e3c20f068]>]>]
DEBUG [RepairJobTask:4] 2021-01-21 16:15:29,631 MerkleTree.java:262 - (10)
Hashing sub-ranges [#<TreeRange (-6738593664284701525,-6738564612709905078]
depth=11>, #<TreeRange (-6738564612709905078,-6738535561135108630] depth=11>]
for #<TreeRange (-6738593664284701525,-6738535561135108630] depth=10> divided
by midpoint -6738564612709905078
DEBUG [RepairJobTask:4] 2021-01-21 16:15:29,631 MerkleTree.java:272 - (10)
Inconsistent digest on left sub-range #<TreeRange
(-6738593664284701525,-6738564612709905078] depth=11>: [#<OffHeapLeaf
[fff431a30da07558c5106897da9f3fc3bcc598fb59e67c6959f4e70b6d3a7722]>,
#<OffHeapLeaf
[0d2c5c21f6b04b098f71f43d0f3f3a5a823b372eb850261774847c456629bedd]>]
DEBUG [RepairJobTask:4] 2021-01-21 16:15:29,631 MerkleTree.java:292 - (10)
Inconsistent digest on right sub-range #<TreeRange
(-6738564612709905078,-6738535561135108630] depth=11>: [#<OffHeapLeaf
[471be27589e7372e3606d92b45bc8ba07161602d7942c9a614d89ab07d21c9a7]>,
#<OffHeapLeaf
[981f1f0656054074b32022658560070df2253cb9373a9499f149df8e3c20f068]>]
DEBUG [RepairJobTask:4] 2021-01-21 16:15:29,631 MerkleTree.java:308 - (10)
Fully inconsistent range [#<TreeRange
(-6738593664284701525,-6738564612709905078] depth=11>, #<TreeRange
(-6738564612709905078,-6738535561135108630] depth=11>]
DEBUG [RepairJobTask:4] 2021-01-21 16:15:29,631 MerkleTree.java:308 - (9) Fully
inconsistent range [#<TreeRange (-6738651767434294419,-6738593664284701525]
depth=10>, #<TreeRange (-6738593664284701525,-6738535561135108630] depth=10>]
DEBUG [RepairJobTask:4] 2021-01-21 16:15:29,631 MerkleTree.java:292 - (8)
Inconsistent digest on right sub-range #<TreeRange
(-6738535561135108630,-6738419354835922841] depth=9>: [#<OffHeapInner
-6738477457985515736
hash=[806ede1a35783bf5fafe8b8ccefe4d3ff48e8f0e1314f8a9ce4b23f13fed4bf9]
children=[#<OffHeapInner -6738506509560312183
hash=[e6d133afbd8041266f8a1cfe456ff07c9e7debe8ff54279b579b252f2af78b6b]
children=[#]> #<OffHeapInner -6738448406410719289
hash=[66bfedb588f87ad3957497728b91bd436af364e6ec40df3299d006de151ac092]
children=[#]>]>, #<OffHeapInner -6738477457985515736
hash=[bf128431ddf7ad72417ce853f90abd0bc7a2a38088d5fbec864c02c516ed6458]
children=[#<OffHeapInner -6738506509560312183
hash=[12a71b62ceb096369c4797f56edd91a16dd9803348b4f4cfe1454027d7dae3d2]
children=[#]> #<OffHeapInner -6738448406410719289
hash=[adb59f5313473b44dd3b7fa697d72caaaa7b23b3c0610f23670942e2c137878a]
children=[#]>]>]{noformat}
When disabling debug logging, the duration dropped to ~2h05m with decent log
sizes:
[https://app.circleci.com/pipelines/github/riptano/cassandra-rtest/38/workflows/c91e6ceb-438b-4f00-b38a-d670f9afb4c3/jobs/51]
!without_debug_logging.png!
There's apparently too much logging for each inconsistency found in the Merkle
tree comparisons and we should move this to TRACE level if we still want to
allow debug logging to be turned on by default.
I'll prepare a patch for the MerkleTree class and run the repair testing
scenarios again to verify their duration.
was:
While working on the repair quality testing in CASSANDRA-16245, it appeared
that the node coordinating repairs on a 20GB per node dataset was generating
more than 2G of log with a total duration for the incremental repair scenarios
of ~2h40m:
[https://app.circleci.com/pipelines/github/riptano/cassandra-rtest/37/workflows/6a7a41c8-0fca-4080-b37e-3b38998b3fab/jobs/49/steps
]
!with_debug_logging.png!
The logs showed a lot of messages from the MerkleTree class at high pace:
{noformat}
DEBUG [RepairJobTask:4] 2021-01-21 16:15:29,631 MerkleTree.java:308 - (10)
Fully inconsistent range [#<TreeRange
(-6738651767434294419,-6738622715859497972] depth=11>, #<TreeRange
(-6738622715859497972,-6738593664284701525] depth=11>]
DEBUG [RepairJobTask:4] 2021-01-21 16:15:29,631 MerkleTree.java:292 - (9)
Inconsistent digest on right sub-range #<TreeRange
(-6738593664284701525,-6738535561135108630] depth=10>: [#<OffHeapInner
-6738564612709905078
hash=[b8efd3d684474276f316b1bc9f23b463cda4f8d620a4b5cf4d2c7dbb101bbe85]
children=[#<OffHeapLeaf
[fff431a30da07558c5106897da9f3fc3bcc598fb59e67c6959f4e70b6d3a7722]>
#<OffHeapLeaf
[471be27589e7372e3606d92b45bc8ba07161602d7942c9a614d89ab07d21c9a7]>]>,
#<OffHeapInner -6738564612709905078
hash=[95334327a0b50b7d3c51d6588a5f3d57701e0b978f6ab28e85cda3cb5a094eb5]
children=[#<OffHeapLeaf
[0d2c5c21f6b04b098f71f43d0f3f3a5a823b372eb850261774847c456629bedd]>
#<OffHeapLeaf
[981f1f0656054074b32022658560070df2253cb9373a9499f149df8e3c20f068]>]>]
DEBUG [RepairJobTask:4] 2021-01-21 16:15:29,631 MerkleTree.java:262 - (10)
Hashing sub-ranges [#<TreeRange (-6738593664284701525,-6738564612709905078]
depth=11>, #<TreeRange (-6738564612709905078,-6738535561135108630] depth=11>]
for #<TreeRange (-6738593664284701525,-6738535561135108630] depth=10> divided
by midpoint -6738564612709905078
DEBUG [RepairJobTask:4] 2021-01-21 16:15:29,631 MerkleTree.java:272 - (10)
Inconsistent digest on left sub-range #<TreeRange
(-6738593664284701525,-6738564612709905078] depth=11>: [#<OffHeapLeaf
[fff431a30da07558c5106897da9f3fc3bcc598fb59e67c6959f4e70b6d3a7722]>,
#<OffHeapLeaf
[0d2c5c21f6b04b098f71f43d0f3f3a5a823b372eb850261774847c456629bedd]>]
DEBUG [RepairJobTask:4] 2021-01-21 16:15:29,631 MerkleTree.java:292 - (10)
Inconsistent digest on right sub-range #<TreeRange
(-6738564612709905078,-6738535561135108630] depth=11>: [#<OffHeapLeaf
[471be27589e7372e3606d92b45bc8ba07161602d7942c9a614d89ab07d21c9a7]>,
#<OffHeapLeaf
[981f1f0656054074b32022658560070df2253cb9373a9499f149df8e3c20f068]>]
DEBUG [RepairJobTask:4] 2021-01-21 16:15:29,631 MerkleTree.java:308 - (10)
Fully inconsistent range [#<TreeRange
(-6738593664284701525,-6738564612709905078] depth=11>, #<TreeRange
(-6738564612709905078,-6738535561135108630] depth=11>]
DEBUG [RepairJobTask:4] 2021-01-21 16:15:29,631 MerkleTree.java:308 - (9) Fully
inconsistent range [#<TreeRange (-6738651767434294419,-6738593664284701525]
depth=10>, #<TreeRange (-6738593664284701525,-6738535561135108630] depth=10>]
DEBUG [RepairJobTask:4] 2021-01-21 16:15:29,631 MerkleTree.java:292 - (8)
Inconsistent digest on right sub-range #<TreeRange
(-6738535561135108630,-6738419354835922841] depth=9>: [#<OffHeapInner
-6738477457985515736
hash=[806ede1a35783bf5fafe8b8ccefe4d3ff48e8f0e1314f8a9ce4b23f13fed4bf9]
children=[#<OffHeapInner -6738506509560312183
hash=[e6d133afbd8041266f8a1cfe456ff07c9e7debe8ff54279b579b252f2af78b6b]
children=[#]> #<OffHeapInner -6738448406410719289
hash=[66bfedb588f87ad3957497728b91bd436af364e6ec40df3299d006de151ac092]
children=[#]>]>, #<OffHeapInner -6738477457985515736
hash=[bf128431ddf7ad72417ce853f90abd0bc7a2a38088d5fbec864c02c516ed6458]
children=[#<OffHeapInner -6738506509560312183
hash=[12a71b62ceb096369c4797f56edd91a16dd9803348b4f4cfe1454027d7dae3d2]
children=[#]> #<OffHeapInner -6738448406410719289
hash=[adb59f5313473b44dd3b7fa697d72caaaa7b23b3c0610f23670942e2c137878a]
children=[#]>]>]{noformat}
When disabling debug logging, the duration dropped to ~2h05m with decent log
sizes:
[https://app.circleci.com/pipelines/github/riptano/cassandra-rtest/38/workflows/c91e6ceb-438b-4f00-b38a-d670f9afb4c3/jobs/51]
!without_debug_logging.png!
There's apparently too much logging for each inconsistency found in the Merkle
tree comparisons and we should move this to TRACE level if we still want to
allow debug logging to be turned on by default.
I'll prepare a patch for the MerkleTree class and run the repair testing
scenarios again to verify their duration.
> Debug logging affects repair performance
> ----------------------------------------
>
> Key: CASSANDRA-16406
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16406
> Project: Cassandra
> Issue Type: Bug
> Components: Consistency/Repair
> Reporter: Alexander Dejanovski
> Assignee: Alexander Dejanovski
> Priority: Normal
> Attachments: with_debug_logging.png, without_debug_logging.png
>
>
> While working on the repair quality testing in CASSANDRA-16245, it appeared
> that the node coordinating repairs on a 20GB per node dataset was generating
> more than 2G of log with a total duration for the incremental repair
> scenarios of ~2h40m:
> https://app.circleci.com/pipelines/github/riptano/cassandra-rtest/37/workflows/6a7a41c8-0fca-4080-b37e-3b38998b3fab/jobs/49/steps
> ]
> !with_debug_logging.png!
> The logs showed a lot of messages from the MerkleTree class at high pace:
> {noformat}
> DEBUG [RepairJobTask:4] 2021-01-21 16:15:29,631 MerkleTree.java:308 - (10)
> Fully inconsistent range [#<TreeRange
> (-6738651767434294419,-6738622715859497972] depth=11>, #<TreeRange
> (-6738622715859497972,-6738593664284701525] depth=11>]
> DEBUG [RepairJobTask:4] 2021-01-21 16:15:29,631 MerkleTree.java:292 - (9)
> Inconsistent digest on right sub-range #<TreeRange
> (-6738593664284701525,-6738535561135108630] depth=10>: [#<OffHeapInner
> -6738564612709905078
> hash=[b8efd3d684474276f316b1bc9f23b463cda4f8d620a4b5cf4d2c7dbb101bbe85]
> children=[#<OffHeapLeaf
> [fff431a30da07558c5106897da9f3fc3bcc598fb59e67c6959f4e70b6d3a7722]>
> #<OffHeapLeaf
> [471be27589e7372e3606d92b45bc8ba07161602d7942c9a614d89ab07d21c9a7]>]>,
> #<OffHeapInner -6738564612709905078
> hash=[95334327a0b50b7d3c51d6588a5f3d57701e0b978f6ab28e85cda3cb5a094eb5]
> children=[#<OffHeapLeaf
> [0d2c5c21f6b04b098f71f43d0f3f3a5a823b372eb850261774847c456629bedd]>
> #<OffHeapLeaf
> [981f1f0656054074b32022658560070df2253cb9373a9499f149df8e3c20f068]>]>]
> DEBUG [RepairJobTask:4] 2021-01-21 16:15:29,631 MerkleTree.java:262 - (10)
> Hashing sub-ranges [#<TreeRange (-6738593664284701525,-6738564612709905078]
> depth=11>, #<TreeRange (-6738564612709905078,-6738535561135108630] depth=11>]
> for #<TreeRange (-6738593664284701525,-6738535561135108630] depth=10> divided
> by midpoint -6738564612709905078
> DEBUG [RepairJobTask:4] 2021-01-21 16:15:29,631 MerkleTree.java:272 - (10)
> Inconsistent digest on left sub-range #<TreeRange
> (-6738593664284701525,-6738564612709905078] depth=11>: [#<OffHeapLeaf
> [fff431a30da07558c5106897da9f3fc3bcc598fb59e67c6959f4e70b6d3a7722]>,
> #<OffHeapLeaf
> [0d2c5c21f6b04b098f71f43d0f3f3a5a823b372eb850261774847c456629bedd]>]
> DEBUG [RepairJobTask:4] 2021-01-21 16:15:29,631 MerkleTree.java:292 - (10)
> Inconsistent digest on right sub-range #<TreeRange
> (-6738564612709905078,-6738535561135108630] depth=11>: [#<OffHeapLeaf
> [471be27589e7372e3606d92b45bc8ba07161602d7942c9a614d89ab07d21c9a7]>,
> #<OffHeapLeaf
> [981f1f0656054074b32022658560070df2253cb9373a9499f149df8e3c20f068]>]
> DEBUG [RepairJobTask:4] 2021-01-21 16:15:29,631 MerkleTree.java:308 - (10)
> Fully inconsistent range [#<TreeRange
> (-6738593664284701525,-6738564612709905078] depth=11>, #<TreeRange
> (-6738564612709905078,-6738535561135108630] depth=11>]
> DEBUG [RepairJobTask:4] 2021-01-21 16:15:29,631 MerkleTree.java:308 - (9)
> Fully inconsistent range [#<TreeRange
> (-6738651767434294419,-6738593664284701525] depth=10>, #<TreeRange
> (-6738593664284701525,-6738535561135108630] depth=10>]
> DEBUG [RepairJobTask:4] 2021-01-21 16:15:29,631 MerkleTree.java:292 - (8)
> Inconsistent digest on right sub-range #<TreeRange
> (-6738535561135108630,-6738419354835922841] depth=9>: [#<OffHeapInner
> -6738477457985515736
> hash=[806ede1a35783bf5fafe8b8ccefe4d3ff48e8f0e1314f8a9ce4b23f13fed4bf9]
> children=[#<OffHeapInner -6738506509560312183
> hash=[e6d133afbd8041266f8a1cfe456ff07c9e7debe8ff54279b579b252f2af78b6b]
> children=[#]> #<OffHeapInner -6738448406410719289
> hash=[66bfedb588f87ad3957497728b91bd436af364e6ec40df3299d006de151ac092]
> children=[#]>]>, #<OffHeapInner -6738477457985515736
> hash=[bf128431ddf7ad72417ce853f90abd0bc7a2a38088d5fbec864c02c516ed6458]
> children=[#<OffHeapInner -6738506509560312183
> hash=[12a71b62ceb096369c4797f56edd91a16dd9803348b4f4cfe1454027d7dae3d2]
> children=[#]> #<OffHeapInner -6738448406410719289
> hash=[adb59f5313473b44dd3b7fa697d72caaaa7b23b3c0610f23670942e2c137878a]
> children=[#]>]>]{noformat}
> When disabling debug logging, the duration dropped to ~2h05m with decent log
> sizes:
> [https://app.circleci.com/pipelines/github/riptano/cassandra-rtest/38/workflows/c91e6ceb-438b-4f00-b38a-d670f9afb4c3/jobs/51]
> !without_debug_logging.png!
> There's apparently too much logging for each inconsistency found in the
> Merkle tree comparisons and we should move this to TRACE level if we still
> want to allow debug logging to be turned on by default.
> I'll prepare a patch for the MerkleTree class and run the repair testing
> scenarios again to verify their duration.
--
This message was sent by Atlassian Jira
(v8.3.4#803005)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]