Florian Weimer wrote:
> * Andrew Haley:
>
>> But those running Linux won't benefit from such a change because
>> on Linux there is no transient doubling of process size: all that happens
>> is that the page table entries in the new process are mapped copy on write.
>> The extra pages count towards the overcommit limit, but that's wholly
>> artificial.
>
> On Linux in vm.overcommit_memory=2 mode, the whole heap (not just the
> committed part) counts against the system's total memory allocation
> limit.  After a fork, the heap counts twice.  Copy-on-write is just an
> optimization in this mode, it does not change the physical memory
> requirements of the workload (in vm.overcommit_memory=1 mode, it
> does).

Well, yes, all that vm.overcommit_memory=2 mode does is disable
overcommit, and overcommit is what you need to make this work
properly.  But even in vm.overcommit_memory=2 mode the pages still
aren't copied until written.  All that mode 2 does is prevent the
transient allocation of the pages in the copy of the forked process,
even though the system has all the memory it needs to fulfil the
request.  AIUI...

> A better way seems to be to allocate the heap with PROT_NONE, and
> later use mprotect with PROT_READ|PROT_WRITE (and perhaps PROT_EXEC)
> to allocate chunks from the kernel.  This will fail deterministically
> in the garbage collector if no physical memory is available.  The
> PROT_NONE mapping is only there to reserve a continuous chunk of
> address space (so that calls to malloc or dlopen do not create
> mappings in the middle of the Java heap).  When I tested this some
> time ago, a PROT_NONE mapping did not count towards the system's
> memory allocation limit, hence the potential failure in the mprotect
> call.  The main problem with this approach is that this is not a
> documented way of using the kernel API; it might work accidentally now
> and change behavior in the future.

I can see the sense in this.  In modes 0 and 1 Java won't behave any
differently from the way it does today, except that when it does run
out of real memory there will be a decent traceback rather than a
segfault.

However, I think you don't want PROT_NONE when a system has been
configured with mode 2: in that case, a user has a reasonable
expectation that they can use all of the memory they allocated at VM
startup.  It makes more sense to allocate all the -Xms size
immediately.

Andrew.
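
P.S. For anyone following along, here is a minimal sketch in C of the
reserve-then-commit scheme Florian describes, assuming Linux with
anonymous mmap.  The function names (heap_reserve, heap_commit,
heap_release) are made up for illustration and are not taken from any
VM.  Whether the PROT_NONE reservation is exempt from the commit limit
in mode 2 is Florian's observation from his own testing, not something
this sketch checks or guarantees.

```c
#define _DEFAULT_SOURCE
#include <stddef.h>
#include <sys/mman.h>

/* Reserve `size` bytes of contiguous address space with no access
   rights, so that later malloc or dlopen calls cannot place mappings
   inside the range.  Returns NULL on failure. */
void *heap_reserve(size_t size)
{
    void *p = mmap(NULL, size, PROT_NONE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return p == MAP_FAILED ? NULL : p;
}

/* Make [base + offset, base + offset + len) readable and writable.
   offset and len are assumed to be multiples of the page size.
   Returns 0 on success, or -1 with errno set by mprotect, which gives
   the caller a well-defined place to handle an out-of-memory
   condition. */
int heap_commit(void *base, size_t offset, size_t len)
{
    return mprotect((char *)base + offset, len, PROT_READ | PROT_WRITE);
}

/* Give the whole reservation back to the kernel. */
int heap_release(void *base, size_t size)
{
    return munmap(base, size);
}
```

A collector built this way would call heap_reserve once with the
maximum heap size and heap_commit each time it grows the heap, turning
a failed heap_commit into an ordinary out-of-memory error.  For the
mode 2 case I argue for above, it would simply commit the whole -Xms
range right after reserving it.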