[jira] [Commented] (CASSANDRA-16406) Debug logging affects repair performance

2021-01-28 Thread Marcus Eriksson (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-16406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17273412#comment-17273412
 ] 

Marcus Eriksson commented on CASSANDRA-16406:
-

running tests here: 
https://app.circleci.com/pipelines/github/krummas/cassandra?branch=CASSANDRA-16406

> Debug logging affects repair performance
> 
>
> Key: CASSANDRA-16406
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16406
> Project: Cassandra
>  Issue Type: Bug
>  Components: Consistency/Repair
>Reporter: Alexander Dejanovski
>Assignee: Alexander Dejanovski
>Priority: Normal
> Fix For: 4.0-rc
>
> Attachments: 16406-2-trunk.txt, CASSANDRA-16406.png, 
> with_debug_logging.png, without_debug_logging.png
>
>
> While working on the repair quality testing in CASSANDRA-16245, it appeared 
> that the node coordinating repairs on a 20GB per node dataset was generating 
> more than 2G of log with a total duration for the incremental repair 
> scenarios of ~2h40m: 
> https://app.circleci.com/pipelines/github/riptano/cassandra-rtest/37/workflows/6a7a41c8-0fca-4080-b37e-3b38998b3fab/jobs/49/steps
>  ]
>  !with_debug_logging.png!  
>  The logs showed a lot of messages from the MerkleTree class at high pace:
> {noformat}
> DEBUG [RepairJobTask:4] 2021-01-21 16:15:29,631 MerkleTree.java:308 - (10) 
> Fully inconsistent range [# (-6738651767434294419,-6738622715859497972] depth=11>, # (-6738622715859497972,-6738593664284701525] depth=11>]
> DEBUG [RepairJobTask:4] 2021-01-21 16:15:29,631 MerkleTree.java:292 - (9) 
> Inconsistent digest on right sub-range # (-6738593664284701525,-6738535561135108630] depth=10>: [# -6738564612709905078 
> hash=[b8efd3d684474276f316b1bc9f23b463cda4f8d620a4b5cf4d2c7dbb101bbe85] 
> children=[# [fff431a30da07558c5106897da9f3fc3bcc598fb59e67c6959f4e70b6d3a7722]> 
> # [471be27589e7372e3606d92b45bc8ba07161602d7942c9a614d89ab07d21c9a7]>]>, 
> # hash=[95334327a0b50b7d3c51d6588a5f3d57701e0b978f6ab28e85cda3cb5a094eb5] 
> children=[# [0d2c5c21f6b04b098f71f43d0f3f3a5a823b372eb850261774847c456629bedd]> 
> # [981f1f0656054074b32022658560070df2253cb9373a9499f149df8e3c20f068]>]>]
> DEBUG [RepairJobTask:4] 2021-01-21 16:15:29,631 MerkleTree.java:262 - (10) 
> Hashing sub-ranges [# depth=11>, #] 
> for # divided 
> by midpoint -6738564612709905078
> DEBUG [RepairJobTask:4] 2021-01-21 16:15:29,631 MerkleTree.java:272 - (10) 
> Inconsistent digest on left sub-range # (-6738593664284701525,-6738564612709905078] depth=11>: [# [fff431a30da07558c5106897da9f3fc3bcc598fb59e67c6959f4e70b6d3a7722]>, 
> # [0d2c5c21f6b04b098f71f43d0f3f3a5a823b372eb850261774847c456629bedd]>]
> DEBUG [RepairJobTask:4] 2021-01-21 16:15:29,631 MerkleTree.java:292 - (10) 
> Inconsistent digest on right sub-range # (-6738564612709905078,-6738535561135108630] depth=11>: [# [471be27589e7372e3606d92b45bc8ba07161602d7942c9a614d89ab07d21c9a7]>, 
> # [981f1f0656054074b32022658560070df2253cb9373a9499f149df8e3c20f068]>]
> DEBUG [RepairJobTask:4] 2021-01-21 16:15:29,631 MerkleTree.java:308 - (10) 
> Fully inconsistent range [# (-6738593664284701525,-6738564612709905078] depth=11>, # (-6738564612709905078,-6738535561135108630] depth=11>]
> DEBUG [RepairJobTask:4] 2021-01-21 16:15:29,631 MerkleTree.java:308 - (9) 
> Fully inconsistent range [# (-6738651767434294419,-6738593664284701525] depth=10>, # (-6738593664284701525,-6738535561135108630] depth=10>]
> DEBUG [RepairJobTask:4] 2021-01-21 16:15:29,631 MerkleTree.java:292 - (8) 
> Inconsistent digest on right sub-range # (-6738535561135108630,-6738419354835922841] depth=9>: [# -6738477457985515736 
> hash=[806ede1a35783bf5fafe8b8ccefe4d3ff48e8f0e1314f8a9ce4b23f13fed4bf9] 
> children=[# hash=[e6d133afbd8041266f8a1cfe456ff07c9e7debe8ff54279b579b252f2af78b6b] 
> children=[#]> # hash=[66bfedb588f87ad3957497728b91bd436af364e6ec40df3299d006de151ac092] 
> children=[#]>]>, # hash=[bf128431ddf7ad72417ce853f90abd0bc7a2a38088d5fbec864c02c516ed6458] 
> children=[# hash=[12a71b62ceb096369c4797f56edd91a16dd9803348b4f4cfe1454027d7dae3d2] 
> children=[#]> # hash=[adb59f5313473b44dd3b7fa697d72c7b23b3c0610f23670942e2c137878a] 
> children=[#]>]>]{noformat}
> When disabling debug logging, the duration dropped to ~2h05m with decent log 
> sizes:
> [https://app.circleci.com/pipelines/github/riptano/cassandra-rtest/38/workflows/c91e6ceb-438b-4f00-b38a-d670f9afb4c3/jobs/51]
> !without_debug_logging.png!
> There's apparently too much logging for each inconsistency found in the 
> Merkle tree comparisons and we should move this to TRACE level if we still 
> want to allow debug logging to be turned on by default.
> I'll prepare a patch for the MerkleTree class and run the repair testing 
> scenarios again to verify their duration.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (CASSANDRA-16406) Debug logging affects repair performance

2021-01-28 Thread Alexander Dejanovski (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-16406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17273398#comment-17273398
 ] 

Alexander Dejanovski commented on CASSANDRA-16406:
--

Done, and I attached the updated patch to the ticket.

> Debug logging affects repair performance
> 
>
> Key: CASSANDRA-16406
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16406
> Project: Cassandra
>  Issue Type: Bug
>  Components: Consistency/Repair
>Reporter: Alexander Dejanovski
>Assignee: Alexander Dejanovski
>Priority: Normal
> Fix For: 4.0-rc
>
> Attachments: 16406-2-trunk.txt, CASSANDRA-16406.png, 
> with_debug_logging.png, without_debug_logging.png
>
>
> While working on the repair quality testing in CASSANDRA-16245, it appeared 
> that the node coordinating repairs on a 20GB per node dataset was generating 
> more than 2G of log with a total duration for the incremental repair 
> scenarios of ~2h40m: 
> https://app.circleci.com/pipelines/github/riptano/cassandra-rtest/37/workflows/6a7a41c8-0fca-4080-b37e-3b38998b3fab/jobs/49/steps
>  ]
>  !with_debug_logging.png!  
>  The logs showed a lot of messages from the MerkleTree class at high pace:
> {noformat}
> DEBUG [RepairJobTask:4] 2021-01-21 16:15:29,631 MerkleTree.java:308 - (10) 
> Fully inconsistent range [# (-6738651767434294419,-6738622715859497972] depth=11>, # (-6738622715859497972,-6738593664284701525] depth=11>]
> DEBUG [RepairJobTask:4] 2021-01-21 16:15:29,631 MerkleTree.java:292 - (9) 
> Inconsistent digest on right sub-range # (-6738593664284701525,-6738535561135108630] depth=10>: [# -6738564612709905078 
> hash=[b8efd3d684474276f316b1bc9f23b463cda4f8d620a4b5cf4d2c7dbb101bbe85] 
> children=[# [fff431a30da07558c5106897da9f3fc3bcc598fb59e67c6959f4e70b6d3a7722]> 
> # [471be27589e7372e3606d92b45bc8ba07161602d7942c9a614d89ab07d21c9a7]>]>, 
> # hash=[95334327a0b50b7d3c51d6588a5f3d57701e0b978f6ab28e85cda3cb5a094eb5] 
> children=[# [0d2c5c21f6b04b098f71f43d0f3f3a5a823b372eb850261774847c456629bedd]> 
> # [981f1f0656054074b32022658560070df2253cb9373a9499f149df8e3c20f068]>]>]
> DEBUG [RepairJobTask:4] 2021-01-21 16:15:29,631 MerkleTree.java:262 - (10) 
> Hashing sub-ranges [# depth=11>, #] 
> for # divided 
> by midpoint -6738564612709905078
> DEBUG [RepairJobTask:4] 2021-01-21 16:15:29,631 MerkleTree.java:272 - (10) 
> Inconsistent digest on left sub-range # (-6738593664284701525,-6738564612709905078] depth=11>: [# [fff431a30da07558c5106897da9f3fc3bcc598fb59e67c6959f4e70b6d3a7722]>, 
> # [0d2c5c21f6b04b098f71f43d0f3f3a5a823b372eb850261774847c456629bedd]>]
> DEBUG [RepairJobTask:4] 2021-01-21 16:15:29,631 MerkleTree.java:292 - (10) 
> Inconsistent digest on right sub-range # (-6738564612709905078,-6738535561135108630] depth=11>: [# [471be27589e7372e3606d92b45bc8ba07161602d7942c9a614d89ab07d21c9a7]>, 
> # [981f1f0656054074b32022658560070df2253cb9373a9499f149df8e3c20f068]>]
> DEBUG [RepairJobTask:4] 2021-01-21 16:15:29,631 MerkleTree.java:308 - (10) 
> Fully inconsistent range [# (-6738593664284701525,-6738564612709905078] depth=11>, # (-6738564612709905078,-6738535561135108630] depth=11>]
> DEBUG [RepairJobTask:4] 2021-01-21 16:15:29,631 MerkleTree.java:308 - (9) 
> Fully inconsistent range [# (-6738651767434294419,-6738593664284701525] depth=10>, # (-6738593664284701525,-6738535561135108630] depth=10>]
> DEBUG [RepairJobTask:4] 2021-01-21 16:15:29,631 MerkleTree.java:292 - (8) 
> Inconsistent digest on right sub-range # (-6738535561135108630,-6738419354835922841] depth=9>: [# -6738477457985515736 
> hash=[806ede1a35783bf5fafe8b8ccefe4d3ff48e8f0e1314f8a9ce4b23f13fed4bf9] 
> children=[# hash=[e6d133afbd8041266f8a1cfe456ff07c9e7debe8ff54279b579b252f2af78b6b] 
> children=[#]> # hash=[66bfedb588f87ad3957497728b91bd436af364e6ec40df3299d006de151ac092] 
> children=[#]>]>, # hash=[bf128431ddf7ad72417ce853f90abd0bc7a2a38088d5fbec864c02c516ed6458] 
> children=[# hash=[12a71b62ceb096369c4797f56edd91a16dd9803348b4f4cfe1454027d7dae3d2] 
> children=[#]> # hash=[adb59f5313473b44dd3b7fa697d72c7b23b3c0610f23670942e2c137878a] 
> children=[#]>]>]{noformat}
> When disabling debug logging, the duration dropped to ~2h05m with decent log 
> sizes:
> [https://app.circleci.com/pipelines/github/riptano/cassandra-rtest/38/workflows/c91e6ceb-438b-4f00-b38a-d670f9afb4c3/jobs/51]
> !without_debug_logging.png!
> There's apparently too much logging for each inconsistency found in the 
> Merkle tree comparisons and we should move this to TRACE level if we still 
> want to allow debug logging to be turned on by default.
> I'll prepare a patch for the MerkleTree class and run the repair testing 
> scenarios again to verify their duration.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (CASSANDRA-16406) Debug logging affects repair performance

2021-01-28 Thread Marcus Eriksson (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-16406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17273393#comment-17273393
 ] 

Marcus Eriksson commented on CASSANDRA-16406:
-

+1 - there is some excessive debug logging in 
{{RepairJob#createOptimisedSyncingSyncTasks}} as well, could you move them to 
{{trace}} as well and I'll get this committed?

> Debug logging affects repair performance
> 
>
> Key: CASSANDRA-16406
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16406
> Project: Cassandra
>  Issue Type: Bug
>  Components: Consistency/Repair
>Reporter: Alexander Dejanovski
>Assignee: Alexander Dejanovski
>Priority: Normal
> Fix For: 4.0-rc
>
> Attachments: 16406-trunk.txt, CASSANDRA-16406.png, 
> with_debug_logging.png, without_debug_logging.png
>
>
> While working on the repair quality testing in CASSANDRA-16245, it appeared 
> that the node coordinating repairs on a 20GB per node dataset was generating 
> more than 2G of log with a total duration for the incremental repair 
> scenarios of ~2h40m: 
> https://app.circleci.com/pipelines/github/riptano/cassandra-rtest/37/workflows/6a7a41c8-0fca-4080-b37e-3b38998b3fab/jobs/49/steps
>  ]
>  !with_debug_logging.png!  
>  The logs showed a lot of messages from the MerkleTree class at high pace:
> {noformat}
> DEBUG [RepairJobTask:4] 2021-01-21 16:15:29,631 MerkleTree.java:308 - (10) 
> Fully inconsistent range [# (-6738651767434294419,-6738622715859497972] depth=11>, # (-6738622715859497972,-6738593664284701525] depth=11>]
> DEBUG [RepairJobTask:4] 2021-01-21 16:15:29,631 MerkleTree.java:292 - (9) 
> Inconsistent digest on right sub-range # (-6738593664284701525,-6738535561135108630] depth=10>: [# -6738564612709905078 
> hash=[b8efd3d684474276f316b1bc9f23b463cda4f8d620a4b5cf4d2c7dbb101bbe85] 
> children=[# [fff431a30da07558c5106897da9f3fc3bcc598fb59e67c6959f4e70b6d3a7722]> 
> # [471be27589e7372e3606d92b45bc8ba07161602d7942c9a614d89ab07d21c9a7]>]>, 
> # hash=[95334327a0b50b7d3c51d6588a5f3d57701e0b978f6ab28e85cda3cb5a094eb5] 
> children=[# [0d2c5c21f6b04b098f71f43d0f3f3a5a823b372eb850261774847c456629bedd]> 
> # [981f1f0656054074b32022658560070df2253cb9373a9499f149df8e3c20f068]>]>]
> DEBUG [RepairJobTask:4] 2021-01-21 16:15:29,631 MerkleTree.java:262 - (10) 
> Hashing sub-ranges [# depth=11>, #] 
> for # divided 
> by midpoint -6738564612709905078
> DEBUG [RepairJobTask:4] 2021-01-21 16:15:29,631 MerkleTree.java:272 - (10) 
> Inconsistent digest on left sub-range # (-6738593664284701525,-6738564612709905078] depth=11>: [# [fff431a30da07558c5106897da9f3fc3bcc598fb59e67c6959f4e70b6d3a7722]>, 
> # [0d2c5c21f6b04b098f71f43d0f3f3a5a823b372eb850261774847c456629bedd]>]
> DEBUG [RepairJobTask:4] 2021-01-21 16:15:29,631 MerkleTree.java:292 - (10) 
> Inconsistent digest on right sub-range # (-6738564612709905078,-6738535561135108630] depth=11>: [# [471be27589e7372e3606d92b45bc8ba07161602d7942c9a614d89ab07d21c9a7]>, 
> # [981f1f0656054074b32022658560070df2253cb9373a9499f149df8e3c20f068]>]
> DEBUG [RepairJobTask:4] 2021-01-21 16:15:29,631 MerkleTree.java:308 - (10) 
> Fully inconsistent range [# (-6738593664284701525,-6738564612709905078] depth=11>, # (-6738564612709905078,-6738535561135108630] depth=11>]
> DEBUG [RepairJobTask:4] 2021-01-21 16:15:29,631 MerkleTree.java:308 - (9) 
> Fully inconsistent range [# (-6738651767434294419,-6738593664284701525] depth=10>, # (-6738593664284701525,-6738535561135108630] depth=10>]
> DEBUG [RepairJobTask:4] 2021-01-21 16:15:29,631 MerkleTree.java:292 - (8) 
> Inconsistent digest on right sub-range # (-6738535561135108630,-6738419354835922841] depth=9>: [# -6738477457985515736 
> hash=[806ede1a35783bf5fafe8b8ccefe4d3ff48e8f0e1314f8a9ce4b23f13fed4bf9] 
> children=[# hash=[e6d133afbd8041266f8a1cfe456ff07c9e7debe8ff54279b579b252f2af78b6b] 
> children=[#]> # hash=[66bfedb588f87ad3957497728b91bd436af364e6ec40df3299d006de151ac092] 
> children=[#]>]>, # hash=[bf128431ddf7ad72417ce853f90abd0bc7a2a38088d5fbec864c02c516ed6458] 
> children=[# hash=[12a71b62ceb096369c4797f56edd91a16dd9803348b4f4cfe1454027d7dae3d2] 
> children=[#]> # hash=[adb59f5313473b44dd3b7fa697d72c7b23b3c0610f23670942e2c137878a] 
> children=[#]>]>]{noformat}
> When disabling debug logging, the duration dropped to ~2h05m with decent log 
> sizes:
> [https://app.circleci.com/pipelines/github/riptano/cassandra-rtest/38/workflows/c91e6ceb-438b-4f00-b38a-d670f9afb4c3/jobs/51]
> !without_debug_logging.png!
> There's apparently too much logging for each inconsistency found in the 
> Merkle tree comparisons and we should move this to TRACE level if we still 
> want to allow debug logging to be turned on by default.
> I'll prepare a patch for the MerkleTree class and run the repair testing 
> scenarios again to verify their duration.



--
This 

[jira] [Commented] (CASSANDRA-16406) Debug logging affects repair performance

2021-01-27 Thread Alexander Dejanovski (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-16406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17273386#comment-17273386
 ] 

Alexander Dejanovski commented on CASSANDRA-16406:
--

I ran the repair quality test using the [patched 
branch|https://github.com/apache/cassandra/compare/trunk...adejanovski:CASSANDRA-16406?expand=1]
 and got the expected 30 minutes reduction on the full and incremental test 
suites: 

!CASSANDRA-16406.png!

 

[PR|https://github.com/apache/cassandra/pull/881]
 [Branch|https://github.com/adejanovski/cassandra/tree/CASSANDRA-16406]

I've attached the patch.

> Debug logging affects repair performance
> 
>
> Key: CASSANDRA-16406
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16406
> Project: Cassandra
>  Issue Type: Bug
>  Components: Consistency/Repair
>Reporter: Alexander Dejanovski
>Assignee: Alexander Dejanovski
>Priority: Normal
> Fix For: 4.0-rc
>
> Attachments: 16406-trunk.txt, CASSANDRA-16406.png, 
> with_debug_logging.png, without_debug_logging.png
>
>
> While working on the repair quality testing in CASSANDRA-16245, it appeared 
> that the node coordinating repairs on a 20GB per node dataset was generating 
> more than 2G of log with a total duration for the incremental repair 
> scenarios of ~2h40m: 
> https://app.circleci.com/pipelines/github/riptano/cassandra-rtest/37/workflows/6a7a41c8-0fca-4080-b37e-3b38998b3fab/jobs/49/steps
>  ]
>  !with_debug_logging.png!  
>  The logs showed a lot of messages from the MerkleTree class at high pace:
> {noformat}
> DEBUG [RepairJobTask:4] 2021-01-21 16:15:29,631 MerkleTree.java:308 - (10) 
> Fully inconsistent range [# (-6738651767434294419,-6738622715859497972] depth=11>, # (-6738622715859497972,-6738593664284701525] depth=11>]
> DEBUG [RepairJobTask:4] 2021-01-21 16:15:29,631 MerkleTree.java:292 - (9) 
> Inconsistent digest on right sub-range # (-6738593664284701525,-6738535561135108630] depth=10>: [# -6738564612709905078 
> hash=[b8efd3d684474276f316b1bc9f23b463cda4f8d620a4b5cf4d2c7dbb101bbe85] 
> children=[# [fff431a30da07558c5106897da9f3fc3bcc598fb59e67c6959f4e70b6d3a7722]> 
> # [471be27589e7372e3606d92b45bc8ba07161602d7942c9a614d89ab07d21c9a7]>]>, 
> # hash=[95334327a0b50b7d3c51d6588a5f3d57701e0b978f6ab28e85cda3cb5a094eb5] 
> children=[# [0d2c5c21f6b04b098f71f43d0f3f3a5a823b372eb850261774847c456629bedd]> 
> # [981f1f0656054074b32022658560070df2253cb9373a9499f149df8e3c20f068]>]>]
> DEBUG [RepairJobTask:4] 2021-01-21 16:15:29,631 MerkleTree.java:262 - (10) 
> Hashing sub-ranges [# depth=11>, #] 
> for # divided 
> by midpoint -6738564612709905078
> DEBUG [RepairJobTask:4] 2021-01-21 16:15:29,631 MerkleTree.java:272 - (10) 
> Inconsistent digest on left sub-range # (-6738593664284701525,-6738564612709905078] depth=11>: [# [fff431a30da07558c5106897da9f3fc3bcc598fb59e67c6959f4e70b6d3a7722]>, 
> # [0d2c5c21f6b04b098f71f43d0f3f3a5a823b372eb850261774847c456629bedd]>]
> DEBUG [RepairJobTask:4] 2021-01-21 16:15:29,631 MerkleTree.java:292 - (10) 
> Inconsistent digest on right sub-range # (-6738564612709905078,-6738535561135108630] depth=11>: [# [471be27589e7372e3606d92b45bc8ba07161602d7942c9a614d89ab07d21c9a7]>, 
> # [981f1f0656054074b32022658560070df2253cb9373a9499f149df8e3c20f068]>]
> DEBUG [RepairJobTask:4] 2021-01-21 16:15:29,631 MerkleTree.java:308 - (10) 
> Fully inconsistent range [# (-6738593664284701525,-6738564612709905078] depth=11>, # (-6738564612709905078,-6738535561135108630] depth=11>]
> DEBUG [RepairJobTask:4] 2021-01-21 16:15:29,631 MerkleTree.java:308 - (9) 
> Fully inconsistent range [# (-6738651767434294419,-6738593664284701525] depth=10>, # (-6738593664284701525,-6738535561135108630] depth=10>]
> DEBUG [RepairJobTask:4] 2021-01-21 16:15:29,631 MerkleTree.java:292 - (8) 
> Inconsistent digest on right sub-range # (-6738535561135108630,-6738419354835922841] depth=9>: [# -6738477457985515736 
> hash=[806ede1a35783bf5fafe8b8ccefe4d3ff48e8f0e1314f8a9ce4b23f13fed4bf9] 
> children=[# hash=[e6d133afbd8041266f8a1cfe456ff07c9e7debe8ff54279b579b252f2af78b6b] 
> children=[#]> # hash=[66bfedb588f87ad3957497728b91bd436af364e6ec40df3299d006de151ac092] 
> children=[#]>]>, # hash=[bf128431ddf7ad72417ce853f90abd0bc7a2a38088d5fbec864c02c516ed6458] 
> children=[# hash=[12a71b62ceb096369c4797f56edd91a16dd9803348b4f4cfe1454027d7dae3d2] 
> children=[#]> # hash=[adb59f5313473b44dd3b7fa697d72c7b23b3c0610f23670942e2c137878a] 
> children=[#]>]>]{noformat}
> When disabling debug logging, the duration dropped to ~2h05m with decent log 
> sizes:
> [https://app.circleci.com/pipelines/github/riptano/cassandra-rtest/38/workflows/c91e6ceb-438b-4f00-b38a-d670f9afb4c3/jobs/51]
> !without_debug_logging.png!
> There's apparently too much logging for each inconsistency found in the 
> Merkle tree 

[jira] [Commented] (CASSANDRA-16406) Debug logging affects repair performance

2021-01-25 Thread Alexander Dejanovski (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-16406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17271317#comment-17271317
 ] 

Alexander Dejanovski commented on CASSANDRA-16406:
--

[~spod], seems like you added most of these debug loggings.

Are you ok with me moving them to TRACE level?

> Debug logging affects repair performance
> 
>
> Key: CASSANDRA-16406
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16406
> Project: Cassandra
>  Issue Type: Bug
>  Components: Consistency/Repair
>Reporter: Alexander Dejanovski
>Assignee: Alexander Dejanovski
>Priority: Normal
> Attachments: with_debug_logging.png, without_debug_logging.png
>
>
> While working on the repair quality testing in CASSANDRA-16245, it appeared 
> that the node coordinating repairs on a 20GB per node dataset was generating 
> more than 2G of log with a total duration for the incremental repair 
> scenarios of ~2h40m: 
> https://app.circleci.com/pipelines/github/riptano/cassandra-rtest/37/workflows/6a7a41c8-0fca-4080-b37e-3b38998b3fab/jobs/49/steps
>  ]
>  !with_debug_logging.png!  
>  The logs showed a lot of messages from the MerkleTree class at high pace:
> {noformat}
> DEBUG [RepairJobTask:4] 2021-01-21 16:15:29,631 MerkleTree.java:308 - (10) 
> Fully inconsistent range [# (-6738651767434294419,-6738622715859497972] depth=11>, # (-6738622715859497972,-6738593664284701525] depth=11>]
> DEBUG [RepairJobTask:4] 2021-01-21 16:15:29,631 MerkleTree.java:292 - (9) 
> Inconsistent digest on right sub-range # (-6738593664284701525,-6738535561135108630] depth=10>: [# -6738564612709905078 
> hash=[b8efd3d684474276f316b1bc9f23b463cda4f8d620a4b5cf4d2c7dbb101bbe85] 
> children=[# [fff431a30da07558c5106897da9f3fc3bcc598fb59e67c6959f4e70b6d3a7722]> 
> # [471be27589e7372e3606d92b45bc8ba07161602d7942c9a614d89ab07d21c9a7]>]>, 
> # hash=[95334327a0b50b7d3c51d6588a5f3d57701e0b978f6ab28e85cda3cb5a094eb5] 
> children=[# [0d2c5c21f6b04b098f71f43d0f3f3a5a823b372eb850261774847c456629bedd]> 
> # [981f1f0656054074b32022658560070df2253cb9373a9499f149df8e3c20f068]>]>]
> DEBUG [RepairJobTask:4] 2021-01-21 16:15:29,631 MerkleTree.java:262 - (10) 
> Hashing sub-ranges [# depth=11>, #] 
> for # divided 
> by midpoint -6738564612709905078
> DEBUG [RepairJobTask:4] 2021-01-21 16:15:29,631 MerkleTree.java:272 - (10) 
> Inconsistent digest on left sub-range # (-6738593664284701525,-6738564612709905078] depth=11>: [# [fff431a30da07558c5106897da9f3fc3bcc598fb59e67c6959f4e70b6d3a7722]>, 
> # [0d2c5c21f6b04b098f71f43d0f3f3a5a823b372eb850261774847c456629bedd]>]
> DEBUG [RepairJobTask:4] 2021-01-21 16:15:29,631 MerkleTree.java:292 - (10) 
> Inconsistent digest on right sub-range # (-6738564612709905078,-6738535561135108630] depth=11>: [# [471be27589e7372e3606d92b45bc8ba07161602d7942c9a614d89ab07d21c9a7]>, 
> # [981f1f0656054074b32022658560070df2253cb9373a9499f149df8e3c20f068]>]
> DEBUG [RepairJobTask:4] 2021-01-21 16:15:29,631 MerkleTree.java:308 - (10) 
> Fully inconsistent range [# (-6738593664284701525,-6738564612709905078] depth=11>, # (-6738564612709905078,-6738535561135108630] depth=11>]
> DEBUG [RepairJobTask:4] 2021-01-21 16:15:29,631 MerkleTree.java:308 - (9) 
> Fully inconsistent range [# (-6738651767434294419,-6738593664284701525] depth=10>, # (-6738593664284701525,-6738535561135108630] depth=10>]
> DEBUG [RepairJobTask:4] 2021-01-21 16:15:29,631 MerkleTree.java:292 - (8) 
> Inconsistent digest on right sub-range # (-6738535561135108630,-6738419354835922841] depth=9>: [# -6738477457985515736 
> hash=[806ede1a35783bf5fafe8b8ccefe4d3ff48e8f0e1314f8a9ce4b23f13fed4bf9] 
> children=[# hash=[e6d133afbd8041266f8a1cfe456ff07c9e7debe8ff54279b579b252f2af78b6b] 
> children=[#]> # hash=[66bfedb588f87ad3957497728b91bd436af364e6ec40df3299d006de151ac092] 
> children=[#]>]>, # hash=[bf128431ddf7ad72417ce853f90abd0bc7a2a38088d5fbec864c02c516ed6458] 
> children=[# hash=[12a71b62ceb096369c4797f56edd91a16dd9803348b4f4cfe1454027d7dae3d2] 
> children=[#]> # hash=[adb59f5313473b44dd3b7fa697d72c7b23b3c0610f23670942e2c137878a] 
> children=[#]>]>]{noformat}
> When disabling debug logging, the duration dropped to ~2h05m with decent log 
> sizes:
> [https://app.circleci.com/pipelines/github/riptano/cassandra-rtest/38/workflows/c91e6ceb-438b-4f00-b38a-d670f9afb4c3/jobs/51]
> !without_debug_logging.png!
> There's apparently too much logging for each inconsistency found in the 
> Merkle tree comparisons and we should move this to TRACE level if we still 
> want to allow debug logging to be turned on by default.
> I'll prepare a patch for the MerkleTree class and run the repair testing 
> scenarios again to verify their duration.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To