[jira] [Comment Edited] (CASSANDRA-18268) Log completed status with cleanup and garbagecollect operations

Stefan Miklosovic (Jira) Mon, 27 Feb 2023 23:50:37 -0800


    [ 
https://issues.apache.org/jira/browse/CASSANDRA-18268?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17694368#comment-17694368
 ]


Stefan Miklosovic edited comment on CASSANDRA-18268 at 2/28/23 7:49 AM:
------------------------------------------------------------------------

Thanks [~milan_k] for the cooperation. I took a closer look and I believe I 
have improved it a little bit here:

[https://github.com/instaclustr/cassandra/commit/b72e63fc076d13dfce375baefe3eef177928c3cc]

I think that we are logging too much, we were logging the fact that the 
operation started twice.

Also, now when we added the log at the end, it was also logging that the 
operation finished, twice.

I wonder what added value it has that we are logging the start and end (as you 
added it) and then we log it again inside that method.

with these changes in the commit above, it is like this: 
{code:java}
Starting GARBAGE_COLLECT for ks.tb
Compacting (50fa522a-b737-11ed-9b01-f38e71dab31b) .....
Completed GARBAGE_COLLECT for ks.tb with status SUCCESSFUL

or

Starting GARBAGE_COLLECT for ks.tb
No sstables to .... 
Completed GARBAGE_COLLECT for ks.tb with status SUCCESSFUL
{code}
where there are multiple tables in ks, like tb and tb2, doing "nodetool 
garbagecollect ks" prints this:
{code:java}
Starting GARBAGE_COLLECT for ks.tb2
CompactionTask.java:161 - Compacting ...
Completed GARBAGE_COLLECT for ks.tb2 with status SUCCESSFUL
Starting GARBAGE_COLLECT for ks.tb
CompactionTask.java:161 - Compacting ...
Completed GARBAGE_COLLECT for ks.tb with status SUCCESSFUL {code}
Another problem I see in the implementation of that method in general is that 
we are returning SUCCESSFUL but then there is finally block where we log error 
when cleanup was not successful. I wonder if this is the desired behavior. We 
might probably introduce another enum in AllSSTableOpStatus to reflect that and 
return it in that finally block which would override what we would return 
normally however, on the other hand, it is quite bad to fail otherwise 
successful operation just because we can not clean threads.

Do changes related to logging I proposed make sense to you, [~milan_k]?

 

However, I think that we are moving in circles here. We basically have the same 
thing as it is right now. As I understand it, you are primarily interested in 
knowing when all operations you invoked from "nodetool garbagecollect" 
finished. The way it is implemented now is that it is logged _per logical 
database table._ You wrote in the description that:

_there is no log statuses for the start and end of the operation at the node 
level although statuses on individual sstables are captured._

This is not true. I think you meant "individual tables". There is no logging 
_per sstable_. 

So, if you want to do that, I think it should be logged here:

https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/service/StorageService.java#L4021

This basically iterates over all tables you specified in garbagecollect so we 
should probably log it before and after that "for" cycle instead doing this.

[~brandon.williams] what do you think about the rest?


was (Author: smiklosovic):
Thanks [~milan_k] for the cooperation. I took a closer look and I believe I 
have improved it a little bit here:

[https://github.com/instaclustr/cassandra/commit/b72e63fc076d13dfce375baefe3eef177928c3cc]

I think that we are logging too much, we were logging the fact that the 
operation started twice.

Also, now when we added the log at the end, it was also logging that the 
operation finished, twice.

I wonder what added value it has that we are logging the start and end (as you 
added it) and then we log it again inside that method.

with these changes in the commit above, it is like this: 
{code:java}
Starting GARBAGE_COLLECT for ks.tb
Compacting (50fa522a-b737-11ed-9b01-f38e71dab31b) .....
Completed GARBAGE_COLLECT for ks.tb with status SUCCESSFUL

or

Starting GARBAGE_COLLECT for ks.tb
No sstables to .... 
Completed GARBAGE_COLLECT for ks.tb with status SUCCESSFUL
{code}
where there are multiple tables in ks, like tb and tb2, doing "nodetool 
garbagecollect ks" prints this:
{code:java}
Starting GARBAGE_COLLECT for ks.tb2
CompactionTask.java:161 - Compacting ...
Completed GARBAGE_COLLECT for ks.tb2 with status SUCCESSFUL
Starting GARBAGE_COLLECT for ks.tb
CompactionTask.java:161 - Compacting ...
Completed GARBAGE_COLLECT for ks.tb with status SUCCESSFUL {code}
Another problem I see in the implementation of that method in general is that 
we are returning SUCCESSFUL but then there is finally block where we log error 
when cleanup was not successful. I wonder if this is the desired behavior. We 
might probably introduce another enum in AllSSTableOpStatus to reflect that and 
return it in that finally block which would override what we would return 
normally however, on the other hand, it is quite bad to fail otherwise 
successful operation just because we can not clean threads.

Do changes related to logging I proposed make sense to you, [~milan_k]?

[~brandon.williams] what do you think about the rest?

> Log completed status with cleanup and garbagecollect operations
> ---------------------------------------------------------------
>
>                 Key: CASSANDRA-18268
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-18268
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Observability/Logging
>            Reporter: Kan Maung
>            Assignee: Milan Krisko
>            Priority: Normal
>
> Whenever a cleanup or garbagecollect operation is issued from nodetool, there 
> is no log statuses for the start and end of the operation at the node level 
> although statuses on individual sstables are captured. Start can be inferred, 
> but it is difficult to know when the operation has completed because there is 
> no log message.
>  
> [INFO ] cluster_id=5 ip_address=127.1.1.1  CompactionManager.java:353 - 
> Starting Remove deleted data for keyspace1.sstable1
> [INFO ] cluster_id=5 ip_address=127.1.1.1  CompactionManager.java:395 - 
> Finished Remove deleted data for keyspace1.sstable1 successfully
> [INFO ] cluster_id=5 ip_address=127.1.1.1  CompactionManager.java:353 - 
> Starting Remove deleted data for keyspace1.sstable2
> [INFO ] cluster_id=5 ip_address=127.1.1.1  CompactionManager.java:364 - No 
> sstables to GARBAGE_COLLECT for keyspace1.sstable2
> [INFO ] cluster_id=5 ip_address=127.1.1.1  CompactionManager.java:395 - 
> Finished Remove deleted data for keyspace1.sstable2 successfully
>  
> This improvement proposes adding additional for start and end statuses log 
> messages for the above operation at the node level. I.e.,
>  
> [INFO ] cluster_id=5 ip_address=127.1.1.1  CompactionManager.java:395 – 
> Starting GARBAGE_COLLECT
> ….
> [INFO ] cluster_id=5 ip_address=127.1.1.1  CompactionManager.java:395 – 
> Completed GARBAGE_COLLECT



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

[jira] [Comment Edited] (CASSANDRA-18268) Log completed status with cleanup and garbagecollect operations

Reply via email to