if that memset(0) is in vmopen() then i'm not sure it's unnecessary
run these tests to check your patch with different sizes and with/without the memset(0)
bin/package use
cd builtin
nmake test

On Mon, Dec 9, 2013 at 4:53 PM, Roland Mainz <roland.ma...@nrubsig.org> wrote:

> On Fri, Dec 6, 2013 at 5:40 AM, Glenn Fowler <glenn.s.fow...@gmail.com> wrote:
> > On Thu, Dec 5, 2013 at 4:50 PM, Irek Szczesniak <iszczesn...@gmail.com> wrote:
> >>
> >> On Wed, Dec 4, 2013 at 3:02 PM, Glenn Fowler <glenn.s.fow...@gmail.com> wrote:
> >> > On Sun, Dec 1, 2013 at 4:58 PM, Lionel Cons <lionelcons1...@gmail.com> wrote:
> >> >>
> >> >> On 1 December 2013 17:26, Glenn Fowler <glenn.s.fow...@gmail.com> wrote:
> >> >> > I believe this is related to vmalloc changes between 2013-05-31 and
> >> >> > 2013-06-09
> >> >> > re-run the tests with
> >> >> > export VMALLOC_OPTIONS=getmem=safe
> >> >> > if that's the problem then it gives a clue on a general solution
> >> >> > details after confirmation
> >> >> >
> >> >>
> >> >> timex ~/bin/ksh -c 'function nanosort { typeset -A a ; integer k=0; while read i ; do key="$i$((k++))" ; a["$key"]="$i" ; done ; printf "%s\n" "${a[@]}" ; } ; print "${.sh.version}" ; nanosort <xxx >yyy'
> >> >> Version AIJMP 93v- 2013-10-08
> >> >>
> >> >> real 34.60
> >> >> user 33.27
> >> >> sys 1.19
> >> >>
> >> >> VMALLOC_OPTIONS=getmem=safe timex ~/bin/ksh -c 'function nanosort { typeset -A a ; integer k=0; while read i ; do key="$i$((k++))" ; a["$key"]="$i" ; done ; printf "%s\n" "${a[@]}" ; } ; print "${.sh.version}" ; nanosort <xxx >yyy'
> >> >> Version AIJMP 93v- 2013-10-08
> >> >> real 15.34
> >> >> user 14.67
> >> >> sys 0.52
> >> >>
> >> >> So your hunch that VMALLOC_OPTIONS=getmem=safe fixes the problem is
> >> >> correct.
> >> >>
> >> >> What does VMALLOC_OPTIONS=getmem=safe do?
> >> >
> >> >
> >> > vmalloc has an internal discipline/method for getting memory from the
> >> > system
> >> > several methods are available with varying degrees of thread safety etc.
> >> > see src/lib/libast/vmalloc/vmdcsystem.c for the code
> >> > and src/lib/libast/vmalloc/malloc.c for the latest VMALLOC_OPTIONS
> >> > description (vmalloc.3 update shortly)
> >> >
> >> > ** getmemory=f   enable f[,g] getmemory() functions if supported,
> >> > **               all by default
> >> > **   anon:         mmap(MAP_ANON)
> >> > **   break|sbrk:   sbrk()
> >> > **   native:       native malloc()
> >> > **   safe:         safe sbrk() emulation via mmap(MAP_ANON)
> >> > **   zero:         mmap(/dev/zero)
> >> >
> >> > i believe the performance regression with "anon" is that on linux
> >> > mmap(0....MAP_ANON|MAP_PRIVATE...),
> >> > which lets the system decide the address, returns adjacent (when
> >> > possible) region addresses from highest to lowest order
> >> > and the reverse order at minimum tends to fragment more memory
> >> > "zero" has the same hi=>lo characteristic
> >> > i suspect it adversely affects the vmalloc coalescing algorithm but
> >> > have not dug deeper
> >> > for now the probe order in vmalloc/vmdcsystem.c was simply changed to
> >> > favor "safe"
> >>
> >> MAP_FIXED should be avoided because it's only there for special
> >> purposes like the runtime linker ld.so.1 or debuggers.
> >>
> >> Using this for a general-purpose memory allocator causes serious problems:
> >> 1. On some systems this is a privileged operation and only available
> >> for users with root privileges
> >>
> >> 2. On a SPARC T4 with 256GB and Solaris 11.1 the use of 'safe' degraded
> >> the performance from 9 seconds to almost 15 minutes because it utterly
> >> destroys the system's concept of large pages. If two MAP_FIXED mappings
> >> directly follow each other the system downgrades the page size to the
> >> smallest possible size, even trying to break up larger pages, which in
> >> turn must be done by a special daemon (vmtasks)
> >>
> >> 3. MAP_PRIVATE|MAP_FIXED|MAP_ANON may no longer be available in future
> >> versions of Solaris
> >>
> >> 4. Using the 'safe' allocator on SmartOS (solaris 11 clone) triggers a
> >> SEGV:
> >> map(0xFFFFCD800B482000, 1048576, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANON, 4294967295, 0) = 0xFFFFCD800B482000
> >> sigaction(SIGSEGV, 0xFFFFFD7FFFDFDE50, 0xFFFFFD7FFFDFDED0) = 0
> >> Incurred fault #6, FLTBOUNDS %pc = 0x0052FE06
> >> siginfo: SIGSEGV SEGV_MAPERR addr=0xFFFFCD800B582000
> >> Received signal #11, SIGSEGV [caught]
> >> siginfo: SIGSEGV SEGV_MAPERR addr=0xFFFFCD800B582000
> >> lwp_sigmask(SIG_SETMASK, 0x00000400, 0x00000000, 0x00000000, 0x00000000) = 0xFFBFFEFF [0xFFFFFFFF]
> >
> > edit src/lib/libast/vmalloc/vmmaddress.c and change
> > #define VMCHKMEM 0
> > this affects vmalloc detecting overbooked memory but will disable the
> > MAP_FIXED codepath
>
> Erm... Solaris (|__SunOS|) was once (pre-vmalloc-rewrite) "exempt"
> from this functionality since it cannot overcommit memory (except if
> someone uses |MAP_NORESERVE| or uses kernel debugging options in
> /etc/system) ...
>
> ... attached (as
> "astksh20131010_vmalloc_sunos_fragmentation_fix001.diff.txt") is a
> patch which...
> 1. ... restores this exception for Solaris
>
> 2. ... bumps the |mmap()| size to 4MB for 32bit processes and 16MB for
> 64bit processes since both values are more or less the points where
> the fragmentation stops. Note that this does *not* mean it will use so
> much memory... it only means that it reserves this amount of memory
> and the real allocation happens on the first read, write or execute
> access of the matching MMU page. This also means there is no
> performance difference between a 1MB |mmap(MAP_ANON)| and a 128MB
> |mmap(MAP_ANON)| since it only reserves memory but does not
> initialise/allocate it yet... this happens the first time it's
> accessed. The other reasons for the 4MB/16MB size were: x86 has 2MB
> largepages, allowing a ksh process to benefit from such pages,
> additionally most AST (including ksh93) applications consume a few MB
> of memory... so there is a good chance that the "typical"
> application/shell memory consumption completely fits into that 4MB
> chunk. 64bit processes get four times as much memory since it's
> expected that they may operate on much larger datasets (and see the
> comment about fragmentation above)
>
> Just to demonstrate "reservation" vs. "real usage" via Solaris pmap:
> -- snip --
> $ ksh -c 'print hello ; pmap -x $$ ; true' | egrep '16384.*anon'
> FFFFFD7FFDA00000 16384 148 20 - rw--- [ anon ]
> -- snip --
> The test shows that of 16384k only 148k have really been touched...
> the difference (16384-148) is reserved by the shell process but not
> used.
>
> 3. Linux has /proc/sys/vm/overcommit_memory which is either 0 or 1 to
> describe whether the kernel permits overcommitment of memory or not.
> AFAIK a simple function could be written which returns |-1| (does not
> permit overcommitment), |0| (don't know) or |1| (does permit
> overcommitment) ... and if the function returns |-1| vmalloc should do
> the same as on Solaris
>
> 4. The patch removes one unnecessary |memset(p, 0, size)| which was
> touching pages and therefore allocating them
>
> ----
>
> Bye,
> Roland
>
> --
>   __ .  . __
>  (o.\ \/ /.o) roland.ma...@nrubsig.org
>   \__\/\/__/  MPEG specialist, C&&JAVA&&Sun&&Unix programmer
>   /O /==\ O\  TEL +49 641 3992797
>  (;O/ \/ \O;)
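
Point 3 of the quoted message proposes a small function that reports the kernel's overcommit policy as -1, 0 or 1. A minimal sketch of such a function follows. It is not taken from the patch or from vmalloc; the names overcommit_policy() and overcommit_policy_from() are hypothetical. It also assumes the Linux convention that /proc/sys/vm/overcommit_memory holds 0 (heuristic overcommit), 1 (always overcommit) or 2 (strict accounting), so the value 2 is the one mapped to -1 here.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical helper, not part of vmalloc: classify the overcommit
 * policy stored in the file `path`.
 *   -1  kernel does not permit overcommitment (file contains 2)
 *    0  don't know (file missing, unreadable, or unexpected value)
 *    1  kernel permits overcommitment (file contains 0 or 1)
 */
int overcommit_policy_from(const char *path)
{
    FILE *fp;
    int mode;
    int fields;

    assert(path != NULL);
    fp = fopen(path, "r");
    if (fp == NULL)
        return 0;               /* e.g. not Linux, or /proc not mounted */
    fields = fscanf(fp, "%d", &mode);
    fclose(fp);
    if (fields != 1)
        return 0;               /* empty or non-numeric content */
    switch (mode) {
    case 0:                     /* heuristic overcommit */
    case 1:                     /* always overcommit */
        return 1;
    case 2:                     /* strict accounting, no overcommit */
        return -1;
    default:
        return 0;
    }
}

/* Convenience wrapper that reads the usual Linux location. */
int overcommit_policy(void)
{
    return overcommit_policy_from("/proc/sys/vm/overcommit_memory");
}
```

A caller would treat -1 the same way as the Solaris case described above and keep the current behaviour for 0 and 1. The path is a parameter only so the classification can be exercised against a scratch file.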