[jira] [Commented] (CASSANDRA-19033) Add virtual table with GC pause history

2023-11-26 Thread Jon Haddad (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-19033?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17789873#comment-17789873
 ] 

Jon Haddad commented on CASSANDRA-19033:


My understanding of [JEP 158|https://openjdk.org/jeps/158] was that the logging 
format was to be unified, which seems like it should reduce what we'd 
potentially have to read in, but I'm not actually sure whether it unifies just 
the configuration or the log format itself.  I originally thought it was the 
format, but as I look closer at it, I'm seeing there are quite a few options to 
change the format itself.  I need to compare the output formats in Java 11, 17 
and 21 before I can say for certain.
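
As a rough illustration of what reading pause lines from a unified log could involve, here is a minimal Python sketch. It assumes the default -Xlog decorators (uptime, level, tags) and a G1-style pause line. The sample line in the comment is illustrative rather than captured from a real node, and the names PAUSE_RE, Pause and parse_pause are hypothetical. Other collectors, decorator settings or JVM versions may well produce lines this pattern does not match, which is exactly the open question above.

```python
import re
from typing import NamedTuple, Optional

# Assumed line shape (default decorators: uptime, level, tags), for example:
# [2.345s][info][gc] GC(3) Pause Young (Normal) (G1 Evacuation Pause) 120M->30M(512M) 12.345ms
PAUSE_RE = re.compile(
    r"^\[(?P<uptime>\d+\.\d+)s\]"      # uptime decorator: seconds since JVM start
    r"\[(?P<level>\w+)\s*\]"           # level decorator, possibly space-padded
    r"\[(?P<tags>[\w,]+)\s*\]\s+"      # tag set, possibly space-padded
    r"GC\((?P<gc_id>\d+)\)\s+"         # collection id
    r"(?P<detail>Pause .*?)\s+"        # pause kind, cause and heap usage
    r"(?P<elapsed_ms>\d+\.\d+)ms$"     # pause duration in milliseconds
)


class Pause(NamedTuple):
    uptime_s: float
    gc_id: int
    detail: str
    elapsed_ms: float
    raw_message: str


def parse_pause(line: str) -> Optional[Pause]:
    """Return a Pause for a matching pause line, or None for any other line."""
    raw = line.strip()
    m = PAUSE_RE.match(raw)
    if m is None:
        return None
    return Pause(
        uptime_s=float(m["uptime"]),
        gc_id=int(m["gc_id"]),
        detail=m["detail"],
        elapsed_ms=float(m["elapsed_ms"]),
        raw_message=raw,
    )
```

Keeping the raw line alongside the parsed fields means anything the pattern does not break out is still available to the reader.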

We may want to consider writing to a C* table instead, which I like from the 
simplicity standpoint, and realistically it would have a pretty trivial 
overhead.  Rather than reading from the logs, we could make the GCInspector 
write to the table after a pause.

Thoughts?

Regarding schema, I think it's safe to assume that as a starting point we'd 
want to see the start time, elapsed time, time to stop threads, and the entire 
raw message.  Generational collectors will have information about eden, 
survivor and old gen.  Those could either be stored as JSON or as a map with a 
UDT.  I think the only data available there is the space usage, but I want to 
check on newer versions as well as Shenandoah and ZGC to be sure.  Regional 
collectors like G1 are going to have specific information we'll want to include 
as well.

So far all I can say is that at a minimum we'd want the following:

{noformat}
CREATE TABLE gc_history (
    start_time timestamp PRIMARY KEY,
    total_elapsed_ms int,
    stop_thread_time int, -- unit and type still to be decided; int assumed
    raw_message text
);
{noformat}

Obviously the above is fairly minimal, but I think it would be pretty useful 
even in its limited state.  I'd be able to look at all the pauses in a 
specific window of time, or find all pauses lasting longer than N ms, which are 
the two types of queries I'd do most often.  I could also see a simple tool 
that renders a histogram of GC pause times over a specific window of time, which 
would be incredibly helpful even if it doesn't provide additional diagnostic 
info.
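
To make the two query types and the histogram idea concrete, here is a small Python sketch that works on in-memory records. It is only an illustration: the GcPause class and the function names are hypothetical, the field names are borrowed from the minimal schema, and nothing here reflects how a virtual table would actually serve these queries.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import datetime
from typing import Dict, Iterable, List


@dataclass(frozen=True)
class GcPause:
    start_time: datetime
    total_elapsed_ms: int
    raw_message: str


def pauses_in_window(pauses: Iterable[GcPause], start: datetime, end: datetime) -> List[GcPause]:
    """Pauses whose start_time falls in the half-open window [start, end)."""
    return [p for p in pauses if start <= p.start_time < end]


def pauses_longer_than(pauses: Iterable[GcPause], threshold_ms: int) -> List[GcPause]:
    """Pauses that lasted strictly longer than threshold_ms."""
    return [p for p in pauses if p.total_elapsed_ms > threshold_ms]


def histogram(pauses: Iterable[GcPause], bucket_ms: int = 50) -> Dict[int, int]:
    """Count pauses per fixed-width bucket, keyed by the bucket's lower bound in ms."""
    counts = Counter((p.total_elapsed_ms // bucket_ms) * bucket_ms for p in pauses)
    return dict(sorted(counts.items()))


def render(hist: Dict[int, int], bucket_ms: int = 50) -> str:
    """Render the histogram as one text bar per non-empty bucket."""
    return "\n".join(
        f"{lo:>6}-{lo + bucket_ms - 1:<6} ms | {'#' * n}" for lo, n in hist.items()
    )
```

A histogram over a window is then just render(histogram(pauses_in_window(pauses, start, end))). Fixed-width buckets are the simplest choice; log-scaled buckets might suit long-tailed pause distributions better.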

I'll try to get some examples of different log types this week so we can try to 
break out other useful fields in the schema.




> Add virtual table with GC pause history
> ---
>
> Key: CASSANDRA-19033
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19033
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Feature/Virtual Tables
>Reporter: Jon Haddad
>Priority: Normal
>
> We should be able to view GC pause history in a virtual table. 
> I think the best approach here is to read from the GC logs.  The format was 
> unified in Java 9, and we've dropped older JVM support so I think this is 
> reasonable.  The benefits of using logs are that we can preserve it across 
> restarts and we enable GC logs by default.  
> The downside is people might not have GC logs configured and it seems weird 
> that a feature would just stop working because logs aren't enabled.   Maybe 
> that's OK if we call it out, or error if people try to read from it and the 
> logs aren't enabled.  I think if someone disables -Xlog:gc then an error 
> might be fine as I don't expect it to happen often.  I think I lean towards 
> this from a usability perspective, and Microsoft has a 
> [project|https://github.com/microsoft/gctoolkit] to parse them, but I haven't 
> used it so I'm not sure if it's suitable for us.  
> At a minimum, pause time should be its own field so we can query for pauses 
> over a specific threshold, but there may be other data we want to explicitly 
> split out as well.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-19033) Add virtual table with GC pause history

2023-11-16 Thread Paulo Motta (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-19033?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17786927#comment-17786927
 ] 

Paulo Motta commented on CASSANDRA-19033:
-

Seems like it could be useful to expose formatted GC info via a vtable for 
troubleshooting/tuning. If GC logging is not enabled, I think it's fine to error 
out or perhaps not even load the virtual table.

Would a specific GC logging format be required? Would this support just 
gc.log.current, or compressed rolled-over files as well?

Do you have an idea on what the table schema would look like and possible 
queries?
