On Wed, 26 Mar 2025 14:16:06 GMT, Robert Toyonaga <d...@openjdk.org> wrote:

> > Are there any code that we know of that doesn't fit into a synchronization 
> > pattern similar to the above?
> > I can think of some contrived example where Thread B asks the OS for memory 
> > mappings and uses that to ascertain that a pre-determined address has been 
> > reserved, and how that could lead to an incorrect booking as you described, 
> > but do we really have code like that?
> 
> From what I can tell, it doesn't look like that's happening anywhere, someone 
> else may know better though. Similarly, for uncommit, the base address must 
> be passed over from somewhere else in the JVM so relying on some external 
> synchonization seems reasonable here too. If this problem scenario is not 
> present in the current code and it's not expected it to become a possiblity 
> in the future, then I suppose there's no reason to guard against it. Maybe 
> just a comment explaining the reasoning is good enough (and a warning not to 
> use such patterns).
> 
> > When does a release/uncommit fail? Would that be a JVM bug?
> 
> On Windows, VirtualFree also looks like it only fails if an invalid set of 
> arguments are passed. So if os::pd_release fails it's probably a JVM bug. 
> Uncommit uses mmap, which could fail for a larger variety of reasons. Some 
> reasons are out of control of the JVM. For example: "The number of mapped 
> regions would exceed an implementation-defined limit (per process or per 
> system)." See 
> [here](https://github.com/openjdk/jdk/blob/jdk-25%2B15/src/hotspot/share/memory/metaspace/virtualSpaceNode.cpp#L191)

Right. And that failure is fatal, so there should be no need to fix any NMT 
bookkeeping for that.

> 
> > What state is the memory in when such a failure happens? Do we even know if 
> > the memory is still committed if an uncommit fails?
> 
> If release/uncommit fails, then it would be hard to know what state the 
> target memory is in. If the arguments are invalid (bad base address), the 
> target region may not even be allocated. Or, in the case of uncommit, if the 
> base address is not aligned, maybe the target committed region does indeed 
> exist but the uncommit still fails. So it would be hard to determine how to 
> readjust the NMT accounting afterward.

Agreed. And this would be a pre-existing problem already. If a release/uncommit 
fails, then we have the similar issues for that as well.

> 
> > I don't understand why we don't treat that as a fatal error OR make sure 
> > that all call-sites handles that error, which they don't do today.
> 
> I think release/uncommit failures should be handled by the callers. 
> Currently, uncommit failure is handled in most places by the caller, release 
> failure seems mostly not. Since, at least for uncommit, we could sometimes 
> fail for valid reasons, I think we shouldn't fail fatally in the os:: 
> functions.

I would like to drill a bit deeper into this. Do you have any concrete examples 
of an uncommit failure that should not be handled as a fatal error?

-------------

PR Comment: https://git.openjdk.org/jdk/pull/24084#issuecomment-2754953604

Reply via email to