Below is a Java program and a shell extract demonstrating the problem on an Ubuntu-hardy-like system. The fork() fails from within the large Java process (but normal commands started from an independent shell continue to work fine).
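The overcommit knobs in play can also be inspected directly from /proc; here is a minimal sketch, assuming the standard Linux /proc files (the class name is invented for illustration):

import java.io.*;

public class ShowOvercommit {
    static String readFirstLine(String path) throws IOException {
        BufferedReader r = new BufferedReader(new FileReader(path));
        try {
            return r.readLine();
        } finally {
            r.close();
        }
    }

    public static void main(String[] args) throws IOException {
        // 0 = heuristic (the default), 1 = always overcommit, 2 = strict accounting
        System.out.println("vm.overcommit_memory = "
                           + readFirstLine("/proc/sys/vm/overcommit_memory"));
        // Percentage of physical RAM (added to swap) that forms CommitLimit; default 50
        System.out.println("vm.overcommit_ratio  = "
                           + readFirstLine("/proc/sys/vm/overcommit_ratio"));
    }
}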
Linux overcommit heuristics are hard to understand. The failing-fork problem becomes more serious on a server system running only one significant large program that uses most of the memory on the system (perhaps made possible by sysctl'ing vm.overcommit_ratio to a value higher than the default 50%). On systems with two large processes using most of memory, you may see rare failures when both try to fork() at the same time.

$ perl -e 'system("cat BigFork.java"); print "-----\n"; system("javac BigFork.java"); system("java -showversion -Xmx6000m BigFork")'
import java.util.*;
import java.io.*;

public class BigFork {
    static void touchPages(byte[] chunk) {
        final int pageSize = 4096;
        for (int i = 0; i < chunk.length; i += pageSize) {
            chunk[i] = (byte) '!';
        }
    }

    static void showCommittedMemory() throws IOException {
        BufferedReader r = new BufferedReader(
            new InputStreamReader(
                new FileInputStream("/proc/meminfo")));
        System.out.println("-------");
        String line;
        while ((line = r.readLine()) != null) {
            if (line.startsWith("Commit")) {
                System.out.printf("%s%n", line);
            }
        }
        System.out.println("-------");
    }

    public static void main(String[] args) throws Throwable {
        final int chunkSize = 1024 * 1024 * 100;
        List<byte[]> chunks = new ArrayList<byte[]>(100);
        try {
            for (;;) {
                byte[] chunk = new byte[chunkSize];
                touchPages(chunk);
                chunks.add(chunk);
            }
        } catch (OutOfMemoryError e) {
            chunks.set(0, null);        // Free up one chunk
            System.gc();
            int size = chunks.size();
            System.out.printf("size=%.2gGB%n", (double) size / 10);
            showCommittedMemory();
            // Can we fork/exec in our current bloated state?
            Process p = new ProcessBuilder("/bin/true").start();
            p.waitFor();
        }
    }
}
-----
openjdk version "1.7.0-Goobuntu"
OpenJDK Runtime Environment (build 1.7.0-Goobuntu-b59)
OpenJDK 64-Bit Server VM (build 16.0-b03, mixed mode)
size=3.9GB
-------
CommitLimit:   6214700 kB
Committed_AS:  6804248 kB
-------
Exception in thread "main" java.io.IOException: Cannot run program "/bin/true": java.io.IOException: error=12, Cannot allocate memory
        at java.lang.ProcessBuilder.start(ProcessBuilder.java:1018)
        at BigFork.main(BigFork.java:45)
Caused by: java.io.IOException: java.io.IOException: error=12, Cannot allocate memory
        at java.lang.UNIXProcess.<init>(UNIXProcess.java:190)
        at java.lang.ProcessImpl.start(ProcessImpl.java:128)
        at java.lang.ProcessBuilder.start(ProcessBuilder.java:1010)

Martin

On Sun, May 24, 2009 at 02:16, Andrew Haley <a...@redhat.com> wrote:
> Martin Buchholz wrote:
> > I did a little research.
> >
> > The overcommitment policy on Linux is configurable
> > http://lxr.linux.no/linux/Documentation/vm/overcommit-accounting
> > Of course, almost everyone will use the default "heuristic" policy,
> > and in this case the COW memory after fork() is subject to overcommit
> > accounting, which *may* cause the fork to fail.
>
> Sure, it *may*, but I don't think it's at all common.
>
> > http://lkml.indiana.edu/hypermail/linux/kernel/0902.1/01777.html
> > If a solution using clone(CLONE_VM ...) can be made to work,
> > subprocess creation will be a little cheaper and significantly more
> > reliable.
>
> Maybe, but I think that needs to be measured before any changes are made.
> I'm not opposed to such a change that makes a real improvement, but I'm
> not convinced it will. As usual, I'm happy to be proved wrong.
>
> There may be a kernel bug in the case described in the mail above: it
> certainly should be possible to fork a 38 GB process on a system with
> 64 GB RAM.
> If so, I expect that this will be fixed long before any Java
> VM change makes it into production.
>
> Andrew.
>
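P.S. As a stopgap that requires no VM change, one pattern for dodging this failure is to fork a small helper while the JVM's committed address space is still tiny and delegate later exec requests to it. This is only a rough sketch, assuming /bin/sh as the helper; the class and method names are invented for illustration.

import java.io.*;

public class EarlyExecHelper {
    private final Process shell;
    private final Writer toShell;

    // Start the helper early, while this JVM is still small, so the
    // fork()/exec() behind ProcessBuilder.start() is cheap and reliable.
    public EarlyExecHelper() throws IOException {
        shell = new ProcessBuilder("/bin/sh").redirectErrorStream(true).start();
        toShell = new OutputStreamWriter(shell.getOutputStream());
    }

    // Ask the long-lived helper to run a command; no fork() from this (large) JVM.
    public void run(String command) throws IOException {
        toShell.write(command);
        toShell.write('\n');
        toShell.flush();
    }

    // Close the helper's stdin so /bin/sh exits, then reap it.
    public int close() throws IOException, InterruptedException {
        toShell.close();
        return shell.waitFor();
    }

    public static void main(String[] args) throws Exception {
        EarlyExecHelper helper = new EarlyExecHelper();
        // ... the application would allocate its multi-gigabyte heap here ...
        helper.run("/bin/true");   // executed by the small /bin/sh, not by the bloated JVM
        helper.close();
    }
}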