Hi Dan,

On 9/11/2020 1:50 pm, Daniel D. Daugherty wrote:
On Sun, 8 Nov 2020 21:43:00 GMT, David Holmes <dhol...@openjdk.org> wrote:

How about this:
   static MonitorList   _in_use_list;
   // The ratio of the current _in_use_list count to the ceiling is used
   // to determine if we are above MonitorUsedDeflationThreshold and need
   // to do an async monitor deflation cycle. The ceiling is increased by
   // AvgMonitorsPerThreadEstimate when a thread is added to the system
   // and is decreased by AvgMonitorsPerThreadEstimate when a thread is
   // removed from the system.
   // Note: If the _in_use_list max exceeds the ceiling, then
   // monitors_used_above_threshold() will use the _in_use_list max instead
   // of the thread-count-derived ceiling because we have used more
   // ObjectMonitors than the estimated average.
   static jint          _in_use_list_ceiling;
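
(Just to make the bookkeeping in that comment concrete, here is a standalone
sketch in plain C++ -- the names and types are invented for illustration and
this is not the actual patch:)

   #include <atomic>
   #include <cstdint>

   // Illustrative stand-ins for the real flag and field (not HotSpot code).
   static const int64_t kAvgMonitorsPerThreadEstimate = 1024;
   static std::atomic<int64_t> ceiling{0};

   // Called when a thread is added to the system.
   static void inc_ceiling() { ceiling += kAvgMonitorsPerThreadEstimate; }

   // Called when a thread is removed from the system.
   static void dec_ceiling() { ceiling -= kAvgMonitorsPerThreadEstimate; }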

Thanks for the comment. So instead of checking the threshold on each OM
allocation, we use this averaging technique to estimate the number of monitors
in use? Can you explain how this came about, rather than the simple/obvious
check at allocation time? Thanks.

I'm not sure I understand your question, but let me take a stab at it anyway...

We used to compare the sum of the in-use counts from all the in-use lists
with the total population of ObjectMonitors. If that ratio was higher than
MonitorUsedDeflationThreshold, then we would do an async deflation cycle.
Since we got rid of TSM, we no longer had a population of already allocated
ObjectMonitors; we had a max value instead. However, when the VM's use
of ObjectMonitors is first spinning up, the max value is typically very close
to the in-use count, so we would always be asking for an async deflation
during that spin-up phase.

I created the idea of a ceiling value that is tied to the thread count and the
AvgMonitorsPerThreadEstimate to replace the population value that we
used to have. By comparing the in-use count against the ceiling value, we
no longer exceed the MonitorUsedDeflationThreshold when the VM's use
of ObjectMonitors is first spinning up, so we no longer do async deflations
continuously during that phase. If the max value exceeds the ceiling value,
then we're using a LOT of ObjectMonitors and, in that case, we compare
the in-use count against the max to determine if we're exceeding the
MonitorUsedDeflationThreshold.
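
In code terms, the check I'm describing boils down to something like this
sketch (plain C++, invented names, integer percentage math -- not the
literal patch):

   #include <algorithm>
   #include <cstdint>

   // Sketch of the decision only: 'threshold' is a percentage, like the
   // MonitorUsedDeflationThreshold flag.
   static bool used_above_threshold(int64_t in_use_count, int64_t in_use_max,
                                    int64_t ceiling, int64_t threshold) {
     // If we have ever used more ObjectMonitors than the thread-count-derived
     // estimate, fall back to the max as the reference value.
     const int64_t reference = std::max(ceiling, in_use_max);
     if (reference == 0) {
       return false;
     }
     const int64_t used_percent = (in_use_count * 100) / reference;
     return used_percent >= threshold;
   }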

Does this help?

It helps, but I'm still wrestling with what MonitorUsedDeflationThreshold actually means now.

So the existing MonitorUsedDeflationThreshold is used as a measure of the proportion of monitors actually in use compared to the number of monitors pre-allocated. If an inflation request requires a new block to be allocated and we're above MonitorUsedDeflationThreshold%, then a request for async deflation occurs (when we actually check).

The new code, IIUC, says: let's assume we expect AvgMonitorsPerThreadEstimate monitors per thread. If the number of monitors in use is > MonitorUsedDeflationThreshold% of (AvgMonitorsPerThreadEstimate * number_of_threads), then we request async deflation.
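
Concretely, taking the 1024 default and, just for illustration, 100 threads
with MonitorUsedDeflationThreshold at 90:

   ceiling                          = 100 threads * 1024 = 102400
   deflation requested when in-use  > 102400 * 90 / 100  = 92160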

So ... obviously we need some kind of watermark-based system for requesting deflation, otherwise there will be far too many deflation requests. And we also don't want to have to check for exceeding the threshold on every monitor allocation. So the deflation thread will wake up periodically and check if the threshold is exceeded.
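
i.e. the shape I have in mind is roughly this (plain C++ sketch; the interval
and the two helpers are placeholders, not the real names):

   #include <chrono>
   #include <thread>

   // Placeholders standing in for the real operations; bodies elided.
   bool is_deflation_threshold_exceeded();  // e.g. the ceiling-based check above
   void run_async_deflation_cycle();

   // Rough shape of a periodic watermark check by the deflation thread.
   void deflation_thread_loop() {
     for (;;) {
       std::this_thread::sleep_for(std::chrono::seconds(1));  // placeholder interval
       if (is_deflation_threshold_exceeded()) {
         run_async_deflation_cycle();
       }
     }
   }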

Okay ... so then it comes down to deciding whether AvgMonitorsPerThreadEstimate is the best way to establish the watermark, and what the default value should be. This doesn't seem like something that an application developer could reasonably try to estimate, so it is just going to be a tuning knob they adjust somewhat arbitrarily. I assume the 1024 default came from tuning something?

Have you looked at the effect these changes have on memory use (i.e. peak RSS)? Did your performance measurements look at using different values? (I can imagine that with enough memory we can effectively disable deflation and so potentially increase performance. OTOH maybe deflation is so infrequent that it is a non-issue.)

I have to confess that I never really thought about the old set of heuristics for this, but the fact that we're changing the heuristics does raise a concern about what impact applications may see.

BTW, MonitorUsedDeflationThreshold should really be diagnostic, not experimental, as real applications may need to tune it (and people often don't want to use experimental flags in production as a matter of policy).

Thanks,
David
-----

-------------

PR: https://git.openjdk.java.net/jdk/pull/642
