On Fri, Sep 30, 2016 at 09:10:21AM +0200, Raimo Niskanen wrote:
> On Wed, Sep 28, 2016 at 09:19:51AM +0200, Raimo Niskanen wrote:
> > Dear misc@
> >
> > I have searched the archives and read the documentation of login.conf(5),
> > ksh(1):ulimit and can not find how to limit the amount of physical memory a
> > process may use.
> >
> > I have the following limits where I have set down ulimit -m and ulimit -l
> > to 10000 kbytes in an attempt to limit the process I spawn which is
> > the Erlang VM.
> >
> > $ ulimit -a
> > time(cpu-seconds) unlimited
> > file(blocks) unlimited
> > coredump(blocks) unlimited
> > data(kbytes) 33554432
> > stack(kbytes) 8192
> > lockedmem(kbytes) 10000
> > memory(kbytes) 10000
> > nofiles(descriptors) 1024
> > processes 1024
> >
> > Note that the machine has got 8 GB of physical memory and 8 GB of swap and
> > that I have set datasize=infinity in /etc/login.conf. I got
> > datasize=33554432 which seems to be the same as kern.shminfo.shmmax.
> > The datasize is twice the physical memory + swap.
> >
> > Then I start the Erlang VM and tell it to allocate an address block of 30000
> > MByte for future use where it will store all literal data in the same block
> > (this is a garbage collector optimization). Not much of this data is
> > actually used.
> >
> > 68196 beam CALL mmap(0,0x753000000,0<PROT_NONE>,0x1002<MAP_PRIVATE|MAP_ANON>,-1,0)
> > 68196 beam RET mmap 11871265173504/0xacbfe8b3000
> >
> > Note the protection flags on the block. No access is allowed. This trick
> > works just fine; here is what top says:
> >
> > load averages: 0.15, 0.13, 0.09 frerin.otp.ericsson.se 08:49:46
> > 48 processes: 47 idle, 1 on processor up 13:49
> > CPU0 states: 0.0% user, 0.0% nice, 0.0% system, 0.0% interrupt, 100% idle
> > CPU1 states: 0.0% user, 0.0% nice, 0.0% system, 0.0% interrupt, 100% idle
> > Memory: Real: 43M/636M act/tot Free: 7028M Cache: 508M Swap: 0K/8155M
> >
> > PID USERNAME PRI NICE SIZE RES STATE WAIT TIME CPU COMMAND
> > 68196 raimo 2 0 29G 15M sleep poll 0:00 1.42% beam
> >
> > So I have a process with a data size of 29 GB on a machine with 16 GB
> > memory + swap. I have also tried to start an additional Erlang VM that
> > also allocates 29 GB of virtual memory which also works.
> >
> > That this is allowed is just fine for me - this trick of allocating a
> > "large enough" PROT_NONE memory to get one address range for some special
> > data type is very useful for the Erlang VM. But I wonder how to limit the
> > actual memory use? Setting down ulimit -m and ulimit -l to 10000 kbytes
> > did not prevent this process from getting 15 MByte of "RES" memory...
> >
> > Is there some way to limit the actual amount of memory for a process when I
> > need to set up the datasize to allow for large unused virtual memory
> > blocks?
>
> I have found clues in getrlimit,setrlimit(2):
>
> RLIMIT_DATA The maximum size (in bytes) of the data segment for a
> process; this includes memory allocated via malloc(3)
> and all other anonymous memory mapped via mmap(2).
> :
> RLIMIT_RSS The maximum size (in bytes) to which a process's
> resident set size may grow. This imposes a limit
> on the amount of physical memory to be given to a
> process; if memory is tight, the system will prefer
> to take memory from processes that are exceeding
> their declared resident set size.
>
> Now I try to figure out the implications of this... If I set the data size
> so the sum of the data sizes for all processes in the system is larger than
> physical memory + swap, then any process may allocate the last block of
> memory in the system so a more important process later will fail to
> allocate?
yes.
>
> And the memoryuse limit is rather toothless since there is no immediate
> check of this limit. When the system gets low on memory; is all that
> happens that processes that exceed their memoryuse limit probably will get
> blocks swapped out?
RLIMIT_DATA *is* enforced, but it could be that PROT_NONE memory is
not counted. I don't know at the moment.
I suppose you are calling mprotect() on the pages you want to read or
modify. Those pages should be counted toward RLIMIT_DATA.
>
> If this is correct then programs that for efficiency reasons allocate
> large address ranges, of which most is rarely used, are hard to control
> safely with this resource limit model, or programs that use this behaviour
> must be considered ill-behaved with this resource limit model...
>
> Or have I misunderstood something?
It is true that the global VM limit (physmem + swap) is not taken into
account when deciding whether a memory request is granted; only
RLIMIT_DATA, which is per-process, is checked. In that sense you are
right: controlling the total amount of memory used by all processes is
not possible.
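
Since the per-process limit is the only knob, a program can also lower
it for itself (or for a child before exec) with setrlimit(2). A rough
sketch, assuming you only want to touch the soft limit; the helper name
set_soft_data_limit() is invented for this example:

```c
#include <sys/resource.h>

/*
 * Lower (or raise, up to the hard limit) the soft RLIMIT_DATA of the
 * calling process to `bytes`. The hard limit is left untouched.
 * Returns 0 on success, -1 on error. (Hypothetical helper name.)
 */
static int set_soft_data_limit(rlim_t bytes)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_DATA, &rl) == -1)
        return -1;

    /* The soft limit may never exceed the hard limit. */
    if (rl.rlim_max != RLIM_INFINITY && bytes > rl.rlim_max)
        bytes = rl.rlim_max;

    rl.rlim_cur = bytes;
    return setrlimit(RLIMIT_DATA, &rl);
}
```

This still limits each process separately; it does nothing about the
sum over all processes.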
-Otto