Yonik Seeley wrote:
On Sun, Jan 4, 2009 at 8:07 PM, Mark Miller <markrmil...@gmail.com> wrote:
Forking for a small script on something that can have such a large memory
footprint is just a huge waste of resources. Ideally you might have a tiny
program running, listening on a socket or something, and it can be alerted
and do the actual fork (being small itself). Or some other such workaround,
other than copying a few gig into RAM or swap :)
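
A rough sketch of the small-helper idea described above, in Python. It is an illustration only, not anything Solr ships. The function names (handle_request, serve, request_run), the one-line JSON protocol, and the socket path are all hypothetical. The helper has a small footprint, so the fork it performs is cheap, and the large process only writes to a socket.

```python
import json
import os
import socket
import subprocess


def handle_request(conn):
    """Read one JSON-encoded argv list, run it, and reply with the exit code."""
    with conn, conn.makefile("rwb") as stream:
        argv = json.loads(stream.readline().decode("utf-8"))
        # The fork+exec happens here, inside the small helper process.
        result = subprocess.run(argv)
        reply = json.dumps({"returncode": result.returncode})
        stream.write(reply.encode("utf-8") + b"\n")
        stream.flush()


def serve(path):
    """Accept requests on a Unix domain socket forever (path is hypothetical)."""
    if os.path.exists(path):
        os.unlink(path)
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as server:
        server.bind(path)
        server.listen()
        while True:
            conn, _ = server.accept()
            handle_request(conn)


def request_run(conn, argv):
    """Client side: ask the helper to run argv and return its exit code."""
    with conn, conn.makefile("rwb") as stream:
        stream.write(json.dumps(argv).encode("utf-8") + b"\n")
        stream.flush()
        return json.loads(stream.readline().decode("utf-8"))["returncode"]
```

The large process would connect to the helper's socket and call request_run; a real helper would also need to restrict which commands it accepts.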

Well, fork doesn't actually copy anymore (for a long time now) - it's
really only the page tables that get copied and set to copy-on-write
so the fork is actually pretty lightweight.
Right, copying was the wrong word. It depends. Depending on your Unix variant, it will actually use vfork, or sometimes..., or sometimes you have no option to share. (Because you can screw with the parent, I've seen warnings in the docs for vfork that it's not recommended for use at all - but this could be old now, and was for a particular version of Unix that I don't remember.)
The issue is that the OS is being conservative and checking that there
would be enough RAM+SWAP available if all of the process address space
did have to be copied/allocated (older versions of Linux didn't do
this check and allowed memory overcommit). The OS doesn't know that
the fork will be followed by an exec.
I don't think you can just count on that in a Unix environment. Maybe Linux took care of it, but is that common on all versions of Unix? And what if you have an older version of Linux?
So the workaround of creating more swap is just so that this OS memory
overcommit check passes.  The swap won't actually be used by the fork
+ exec.
Again, only if you're lucky. It depends on the many implementations of fork. A lot of times fork is actually vfork or something, but Solr can't count on it for everybody, I wouldn't think.
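
To make the arithmetic of the commit check concrete, here is a simplified model in Python. It assumes Linux with strict overcommit accounting and uses the CommitLimit and Committed_AS fields of /proc/meminfo. The function names are hypothetical, and charging the whole process size to the child is a deliberate simplification of what the kernel actually accounts.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of integer values (mostly kB)."""
    fields = {}
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        parts = rest.split()
        if parts:
            fields[name.strip()] = int(parts[0])
    return fields


def fork_commit_headroom_kb(meminfo, process_kb):
    """Commit headroom left if the child's whole address space were charged.

    A negative result means the simplified check would refuse the fork,
    even though a fork followed by exec would never touch that memory.
    """
    return meminfo["CommitLimit"] - meminfo["Committed_AS"] - process_kb


def current_headroom_kb(process_kb, path="/proc/meminfo"):
    """Same calculation against the live system (Linux only)."""
    with open(path) as handle:
        return fork_commit_headroom_kb(parse_meminfo(handle.read()), process_kb)
```

With a commit limit of 8 GB, 6 GB already committed, and a 5 GB JVM, the headroom is -3 GB, so the check fails. Adding 4 GB of swap raises the limit to 12 GB and the headroom to +1 GB, which is why extra swap lets the check pass without the swap being used.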
The real fix would be for the JVM to use something like vfork when available.
Which kind of happens behind the scenes already if you're lucky. Some Unix guys don't like it, and I assume that's why it's not the standard (overly concerned with the child process mucking up the parent process).
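
For comparison, this is what "something like vfork" looks like from a program that does have access to it. The sketch below uses Python's os.posix_spawn (Python 3.8+), not the JVM. Whether it avoids the commit charge depends on the C library underneath: recent glibc is understood to implement posix_spawn with a vfork-like clone, but that is an assumption about the platform, not a guarantee. The spawn_and_wait name is hypothetical.

```python
import os


def spawn_and_wait(argv):
    """Start argv[0] via posix_spawn and return its exit code.

    Returns None if the child was terminated by a signal instead of exiting.
    """
    pid = os.posix_spawn(argv[0], argv, dict(os.environ))
    _, status = os.waitpid(pid, 0)
    if os.WIFEXITED(status):
        return os.WEXITSTATUS(status)
    return None
```

For example, spawn_and_wait(["/bin/sh", "-c", "exit 7"]) returns 7 on a system where /bin/sh exists.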

I shouldn't have said copy - the issue is that we are looking for way too much RAM. A JVM using 5 gig will look for another 5 - that's terrible. I don't think we can solve it in a universal way for Unix by relying on forking the JVM, though. It's hit or miss. The real fix can't depend on your OS variant and its version, I wouldn't think.
