Holmistr opened a new issue, #13833:
URL: https://github.com/apache/druid/issues/13833

   ### Description
   
   #### Symptom
   Running JvmMonitorTest on [Azul Platform Prime](https://www.azul.com/products/prime/) (formerly known as Azul Zing, an OpenJDK-based JVM) with the C4 garbage collector results in a timeout:
   
   ```
   JvmMonitorTest.testGcCounts:49 TestTimedOut test timed out after 60000 milliseconds
   ```
   
   #### Investigation and root cause
   We investigated, and additional logging added to GcTrackingEmitter showed:
   ```
   G1 GC:
   event to map = {feed=metrics, metric=jvm/gc/cpu, service=test, host=localhost, gcName=[g1], value=0, gcGen=[old], timestamp=2023-02-21T16:56:15.839Z}
   
   Prime:
   event to map = {feed=metrics, metric=jvm/gc/count, service=test, host=localhost, gcName=[GPGC New], value=0, gcGen=[GPGC New], timestamp=2023-02-21T16:57:17.964Z}
   ```
   
   gcGen is set correctly for G1 by the method at [JvmMonitor.java#L223](https://github.com/apache/druid/blob/master/processing/src/main/java/org/apache/druid/java/util/metrics/JvmMonitor.java#L223). That method has no handling for the Prime memory manager names, so for Prime gcGen ends up holding the raw manager name (`GPGC New` in the log above).
   
   As a result, the switch in the `emit` method ([JvmMonitorTest.java#L89](https://github.com/apache/druid/blob/master/processing/src/test/java/org/apache/druid/java/util/metrics/JvmMonitorTest.java#L89)) receives a string it does not recognize. GcTrackingEmitter therefore never updates {old|young}GcCount, {old|young}GcSeen never returns true, and the test runs until it times out.
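
The failure mode can be illustrated with a small, self-contained sketch. It is not Druid's code: `toGcGen` and `Tracker` are hypothetical stand-ins for the name mapping in JvmMonitor and for GcTrackingEmitter, and the pass-through `default` branch is an assumption inferred from the Prime log line, where gcGen equals the raw manager name.

```java
public class GcGenSketch {
    // Hypothetical stand-in for the manager-name-to-generation mapping.
    // Only the G1 names are listed; this is not Druid's actual list.
    static String toGcGen(String managerName) {
        switch (managerName) {
            case "G1 Young Generation":
                return "young";
            case "G1 Old Generation":
                return "old";
            default:
                // Assumption: unknown names pass through unchanged,
                // matching gcGen=[GPGC New] in the Prime log line.
                return managerName;
        }
    }

    // Hypothetical stand-in for a test emitter that only reacts to
    // the generations "young" and "old".
    static class Tracker {
        boolean youngSeen;
        boolean oldSeen;

        void emit(String gcGen) {
            switch (gcGen) {
                case "young":
                    youngSeen = true;
                    break;
                case "old":
                    oldSeen = true;
                    break;
                default:
                    // Unrecognized generation: nothing is recorded.
                    break;
            }
        }
    }

    public static void main(String[] args) {
        Tracker g1 = new Tracker();
        g1.emit(toGcGen("G1 Young Generation"));
        g1.emit(toGcGen("G1 Old Generation"));
        System.out.println("G1 seen both generations: " + (g1.youngSeen && g1.oldSeen));

        Tracker prime = new Tracker();
        prime.emit(toGcGen("GPGC New"));
        System.out.println("Prime seen any generation: " + (prime.youngSeen || prime.oldSeen));
    }
}
```

In this sketch a loop that waits for both flags would never finish for the `GPGC New` name, which is consistent with the timeout reported above.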
   
   We further found that C4 [is not in the list of explicitly known garbage collectors](https://github.com/apache/druid/blob/master/processing/src/main/java/org/apache/druid/java/util/metrics/JvmMonitor.java#L192-L201) in Druid.
   
   #### Workaround
   There are two possible workarounds that make Prime "look like" G1. To be precise, they change the memory manager names to the values that G1 uses. The workarounds:
   * Adding `-XX:+MimicG1GCMemoryManagerNames`
   * Adding `-XX:GPGCOldGCMemoryManagerName="G1 Old Generation"`
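
To check which names a given JVM reports, with or without the flags above, the collector names can be printed through the standard `java.lang.management` API. This is a minimal sketch; the class name `ListGcManagers` is hypothetical, and it assumes the names Druid sees are the ones exposed by `GarbageCollectorMXBean`.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

public class ListGcManagers {
    // Collects the names of all garbage collector MXBeans of the running JVM.
    static List<String> gcManagerNames() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean bean : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(bean.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        // Prints one collector name per line; the output depends on the JVM and its flags.
        for (String name : gcManagerNames()) {
            System.out.println(name);
        }
    }
}
```

Running it once without the flags and once with a workaround flag should show whether the reported names change to the G1 values.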
   
   #### Fix
   We're happy to contribute the change; it seems rather straightforward. Some guidance from the Druid team would be helpful to make sure we don't overlook any place where this needs to be changed. Any thoughts?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

