On Sun, Dec 09, 2012 at 12:26:35AM +0100, Ariane van der Steldt wrote:
> On 11/09/12 08:56, Gerhard Roth wrote:
> >On Thu, 08 Nov 2012 16:22:41 -0500
> >Ted Unangst <t...@tedunangst.com> wrote:
> >>On Thu, Nov 08, 2012 at 13:34, Ilya Bakulin wrote:
> >>
> >>>The problem seems to be in uvm_map_pageable_all() function
> >>>(sys/uvm/uvm_map.c). This function is a "special case of uvm_map_pageable",
> >>>which tries to mlockall() all mapped memory regions.
> >>>Prior to calling uvm_map_pageable_wire(), which actually does locking, it
> >>>tries to count how many memory bytes will be locked, and compares this
> >>>number with uvmexp.wiredmax, which is set by RLIMIT_MEMLOCK.
> >>>The problem is that counting algorithm doesn't take into account that some
> >>>pages have VM_PROT_NONE flag set and hence won't be locked anyway.
> >>>Later in uvm_map_pageable_wire() these pages are skipped when doing actual
> >>>job.
> >>I don't know if this is right. Should prot_none pages not be wired?
> >>
> >>I think the opposite should happen. prot_none pages should be locked
> >>as well. The app may be using prot_none as a way to protect its super
> >>secret secrets from itself. It certainly wouldn't want them being
> >>swapped out.
> >>
> >As long as they have VM_PROT_NONE, they can't be accessed and wiring them
> >is just a waste of resources.
> >
> >If your scenario applies then uvm_map_protect() kicks in. It takes care of
> >wiring pages if the protection changes from VM_PROT_NONE to some different
> >value, though I have to admit that this happens only in case the
> >VM_MAP_WIREFUTURE flag was specified. But that looks acceptable to me.
>
> Tedu is right and you're wrong. PROT_NONE protected pages must be
> wired when calling mlock* functions.
>
> The main argument: malloc protects its bookkeeping data using
> mprotect(PROT_NONE), which you definitely want to wire if you call
> mlockall (either because you want to prevent information leaking to
> disk or you have a time-sensitive program like ntpd and swap hurts).
> As for wasting resources: the kernel has insufficient information to
> fix wasteful programs, nor does it have sufficient information to
> consider PROT_NONE pages on a case-by-case basis.
>
> Also consider that there is a limitation on wired memory, if you are
> concerned about wasting resources.
>
>
> Ilya Bakulin does point out a serious bug in the vmmap code however:
> the resource counting algorithms and locking algorithm count
> differently. The code ought to be in sync; if no developer is going
> to fix the commit-part of the code, I would seriously recommend
> putting Ilya's diff in.
> --
> Ariane

A correction is needed here: malloc uses PROT_NONE for guard pages;
PROT_NONE is not used to protect metadata. However, if the F flag
is used, cached free pages are protected by PROT_NONE. The only other
case is pages pointed to by the return values of malloc(0) calls.

-Otto
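
For readers following the thread, the mismatch Ilya describes in the
quoted text (the counting pass includes PROT_NONE entries, the wiring
pass skips them) can be illustrated with a small userland sketch. This
is not the uvm code: struct sk_entry, the SK_PROT_* bits and all sk_*
functions are hypothetical names made up for illustration, and the
limit is just a plain byte count standing in for the wired-memory limit.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical protection bits; NOT the real VM_PROT_* definitions. */
#define SK_PROT_NONE  0x0
#define SK_PROT_READ  0x1
#define SK_PROT_WRITE 0x2

/* Hypothetical, simplified stand-in for a map entry. */
struct sk_entry {
	size_t start;	/* first byte of the range */
	size_t end;	/* one past the last byte of the range */
	int    prot;	/* SK_PROT_* bits */
};

/* Counting pass as described in the thread: every entry is counted. */
static size_t
sk_count_all(const struct sk_entry *e, size_t n)
{
	size_t i, total = 0;

	for (i = 0; i < n; i++) {
		assert(e[i].end >= e[i].start);
		total += e[i].end - e[i].start;
	}
	return total;
}

/* Wiring pass as described: entries without any access are skipped. */
static size_t
sk_count_wired(const struct sk_entry *e, size_t n)
{
	size_t i, total = 0;

	for (i = 0; i < n; i++) {
		if (e[i].prot == SK_PROT_NONE)
			continue;
		total += e[i].end - e[i].start;
	}
	return total;
}

/*
 * Returns 1 when the limit check (based on the counting pass) would
 * reject a request that the wiring pass itself would keep within the
 * limit, i.e. when the two passes disagree.
 */
static int
sk_spurious_failure(const struct sk_entry *e, size_t n, size_t limit)
{
	return sk_count_all(e, n) > limit && sk_count_wired(e, n) <= limit;
}
```

Whichever policy is chosen for PROT_NONE entries, the point made in the
quoted message is that both passes should apply the same rule, so that
sk_spurious_failure() in this sketch could never return 1.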