On Thu, 3 Apr 2025 15:27:15 GMT, Robert Toyonaga <d...@openjdk.org> wrote:

>> ### Summary:
>> This PR makes memory operations atomic with NMT accounting.
>> 
>> ### The problem:
>> In memory related functions like `os::commit_memory` and 
>> `os::reserve_memory` the OS memory operations are currently done before 
>> acquiring the the NMT mutex. And the the virtual memory accounting is done 
>> later in `MemTracker`, after the lock has been acquired. Doing the memory 
>> operations outside of the lock scope can lead to races. 
>> 
>> 1.1 Thread_1 releases range_A.
>> 1.2 Thread_1 tells NMT "range_A has been released".
>> 
>> 2.1 Thread_2 reserves (the now free) range_A.
>> 2.2 Thread_2 tells NMT "range_A is reserved".
>> 
>> Since the sequence (1.1)  (1.2) is not atomic, if Thread_2 begins operating 
>> after (1.1), we can have (1.1) (2.1) (2.2) (1.2). The OS sees two valid 
>> subsequent calls (release range_A, followed by map range_A). But NMT sees 
>> "reserve range_A", "release range_A" and is now out of sync with the OS.
>> 
>> ### Solution:
>> Where memory operations such as reserve, commit, or release virtual memory 
>> happen,  I've expanded the scope of `NmtVirtualMemoryLocker` to protect both 
>> the NMT accounting and the memory operation itself. 
>> 
>> ### Other notes:
>> I also simplified this  pattern found in many places:
>> 
>> if (MemTracker::enabled()) {
>>   MemTracker::NmtVirtualMemoryLocker nvml;
>>   result = pd_some_operation(addr, bytes);
>>   if (result != nullptr) {
>>     MemTracker::record_some_operation(addr, bytes);
>>   }
>> } else {
>>   result = pd_unmap_memory(addr, bytes);
>> }
>> ``` 
>> To:
>> 
>> MemTracker::NmtVirtualMemoryLocker nvml;
>> result = pd_unmap_memory(addr, bytes);
>> MemTracker::record_some_operation(addr, bytes);
>> ``` 
>> This is possible because `NmtVirtualMemoryLocker` now checks 
>> `MemTracker::enabled()`.  `MemTracker::record_some_operation` already checks 
>> `MemTracker::enabled()` and checks against nullptr.  This refactoring 
>> previously wasn't possible because `ThreadCritical` was used before 
>> https://github.com/openjdk/jdk/pull/22745 introduced 
>> `NmtVirtualMemoryLocker`. 
>> 
>> I considered moving the locking and NMT accounting down into platform 
>> specific code:  Ex. lock around { munmap() + MemTracker::record }. The hope 
>> was that this would help reduce the size of the critical section. However, I 
>> found that the OS-specific "pd_" functions are already short and 
>> to-the-point, so doing this wasn't reducing the lock scope very much. 
>> Instead it just makes the code more messy by having to maintain the locking 
>> and NMT accounting in each platform specific i...
>
> Robert Toyonaga has updated the pull request incrementally with one 
> additional commit since the last revision:
> 
>   exclude file mapping tests on AIX.

I ran the tests over the weekend again and now they passed.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/24084#issuecomment-2782761982

Reply via email to