* Volker Simonis:

> Not sure this is related to overcommit settings. According to the
> man-page, fork() only fails with "ENOMEM" if fork() "failed to
> allocate the necessary kernel structures because memory is tight". But
> a failing fork is actually no problem at all. Currently, if fork()
> fails, I just fall back to normal, synchronous dumping. Of course this
> could be made configurable such that a failing asynchronous dump
> wouldn't result in a synchronous dump with its long safepoint timeout
> but instead completely skip the dump altogether.

The problem is a bit more nefarious than that. If memory is tight, but
not too tight, the fork will succeed. The dumper subprocess will do its
thing, and the original process will continue to mutate its heap. But
every page has to be copied the first time the original process writes
to it. This requires memory allocation. If that fails, the OOM killer
is invoked. I don't know which process will be terminated to free
memory. If it's the production VM, that would be quite bad. Maybe the
dumper could adjust its OOM score so that it gets killed first.

(This can happen during regular subprocess start without vfork, of
course, but there, the critical phase should pass quickly.)

Thanks,
Florian
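
To illustrate the OOM score idea above, here is a minimal sketch, assuming Linux and its /proc/self/oom_score_adj interface. The names prefer_self_as_oom_victim, write_dump and dump_in_child are hypothetical and not taken from any existing patch. The forked child raises its own adjustment to the maximum (1000) before it starts dumping, so the OOM killer should prefer it over the parent VM. Raising the value needs no privileges. Only open/write/close are used in the child because the parent is multi-threaded.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: make the calling process the preferred OOM victim.
   Linux-specific. Returns 0 on success, -1 on failure. */
int prefer_self_as_oom_victim(void)
{
    static const char max_adj[] = "1000";   /* maximum oom_score_adj value */
    int fd = open("/proc/self/oom_score_adj", O_WRONLY);
    if (fd < 0)
        return -1;
    ssize_t n = write(fd, max_adj, sizeof max_adj - 1);
    close(fd);
    return n == (ssize_t) (sizeof max_adj - 1) ? 0 : -1;
}

/* Hypothetical stand-in for the real heap dumper; does nothing here. */
void write_dump(void)
{
}

/* Fork a child that runs dump() on its copy-on-write view of the heap.
   Returns 0 if the child exited cleanly, -1 if fork() failed or the child
   died (for example, because the OOM killer picked it).
   For brevity this sketch waits for the child; a real VM would keep
   running and reap the child later. */
int dump_in_child(void (*dump)(void))
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;   /* caller may fall back to a synchronous dump or skip it */

    if (pid == 0) {
        /* Child: best effort only, so a failure here is ignored. */
        (void) prefer_self_as_oom_victim();
        dump();
        _exit(0);
    }

    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return (WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 0 : -1;
}
```

A caller would use dump_in_child(write_dump) and treat -1 as "no asynchronous dump was produced". This only biases the OOM killer's choice. It does not guarantee that the production VM survives.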
