We also need to worry about applications that require a large amount of
memory outside of Sfio. I know of a couple of local apps that routinely
use 40-50GB of shared memory managed by Vmalloc and CDT. We wouldn't
want to thrash them unnecessarily.

Phong

> From [email protected] Mon Apr 15 08:15:15 2013
> To: Glenn Fowler <[email protected]>
> Cc: [email protected]
> Subject: Re: [ast-developers] mmap() for command substitutions still not  
> living up to its fullest potential?

> On Mon, Apr 15, 2013 at 7:13 AM, Glenn Fowler <[email protected]> wrote:
> >
> > On Mon, 15 Apr 2013 03:07:41 +0200 Lionel Cons wrote:
> >> Based on the recent discussion about using mmap() for reading the
> >> results of command substitutions I did some testing and found that on
> >> Solaris (Solaris 11 and a 64bit build) ksh93 still does not behave
> >> optimally. The primary problem I see is that MANY mmap() calls with a
> >> very small map size (524288 bytes) are executed instead of either
> >> mapping the input file in one large chunk or at least using a chunk
> >> size large enough that the system can use largepages (2M for x86,
> >> 4M/32M/256M for SPARC64) if possible. Using a chunk size of 524288
> >> bytes is a joke.
> >
> >> Is there a specific reason why the code in sfrd.c only maps such
> >> small chunks from a file (I'd expect that a 64bit process could
> >> easily map 16GB each time), or is this a bug?
> >
> > provide some iffe code that spits out the optimal mmap() page size
> > for the current os/arch/configuration and that can be rolled into sfio

> Erm... the "page size" (=the size used for MMU pages) is IMHO the
> wrong property because it (usually) has to be chosen by the kernel
> based on { MMU type, supported page sizes, available contiguous memory
> (as backing store), and for I/O the IOMMU page size and preferred
> page size for the matching I/O device }.

> The issue here is that the "chunk size" which sfio uses to |mmap()|
> parts of a large file is far too small and in most cases prevents the
> use of large pages (at least on i386/AMD64, which only has 4096-byte
> and 2M/4M pages; other platforms have more choices, for example
> UltraSPARC supports page sizes of 8192, 64k, 512k, 4M, 32M, 256M and
> 2G).
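
The point about chunk size versus large pages can be illustrated with a
small C sketch. This is not sfio code: `aligned_pages_in` is a
hypothetical helper, and it only checks the necessary condition that a
whole, naturally aligned page of the given size fits inside the mapped
range. Real kernels apply further conditions before they use a large page.

```c
#include <stdint.h>

/* Hypothetical helper (not part of sfio): number of whole, naturally
 * aligned pages of size pagesz (assumed to be a power of two) that fit
 * inside the address range [addr, addr+len). Overflow is not handled. */
static uint64_t aligned_pages_in(uint64_t addr, uint64_t len, uint64_t pagesz)
{
	uint64_t first = (addr + pagesz - 1) & ~(pagesz - 1);	/* round start up */
	uint64_t last = (addr + len) & ~(pagesz - 1);		/* round end down */

	return last > first ? (last - first) / pagesz : 0;
}
```

Under these assumptions a 524288-byte chunk can never hold a 2M page,
while a 16M chunk starting on a 2M boundary has room for eight of them.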

> I did some digging and found that the following patch fixes the issue
> for 64bit builds:
> -- snip --
> --- original/src/lib/libast/sfio/sfrd.c 2012-09-24 20:11:06.000000000 +0200
> +++ build_i386_64bit_debug/src/lib/libast/sfio/sfrd.c   2013-04-15 03:24:22.892159982 +0200
> @@ -161,18 +161,20 @@

>                         /* make sure current position is page aligned */
>                         if((a = (size_t)(f->here%_Sfpage)) != 0)
>                         {       f->here -= a;
>                                 r += a;
>                         }

>                         /* map minimal requirement */
> +#if _ptr_bits < 64
>                         if(r > (round = (1 + (n+a)/f->size)*f->size) )
>                                 r = round;
> +#endif

>                         if(f->data)
>                                 SFMUNMAP(f, f->data, f->endb-f->data);

>                         for(;;)
>                         {       f->data = (uchar*) sysmmapf((caddr_t)0, (size_t)r,
>                                                         (PROT_READ|PROT_WRITE),
>                                                         MAP_PRIVATE,
> -- snip --

> ... for 32bit builds the problem is not easily fixable because there
> has to be a balance between the available address space (4GB, of which
> usually only 2GB are available for file mappings) and the maximum
> number of open files (e.g. the value returned by $ ulimit -n). With
> nfiles==1024 we get a maximum chunk size of
> $(( (pow(2,32)/2) / 1024. ))==2097152 (which would be acceptable), but
> for nfiles==65536 we get a chunk size of
> $(( (pow(2,32)/2) / 65536. ))==32768, which erases the advantage of
> using |mmap()|.
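
The same arithmetic can be written as a small C sketch. It assumes, as
the paragraph above does, that 2GB of a 32bit address space is usable for
file mappings and that every open file is mapped at the same time;
`chunk_per_file` is a hypothetical name, not an sfio function.

```c
#include <stdint.h>

/* Hypothetical helper (not part of sfio): the share of a 2GB mapping
 * window that each of nfiles simultaneously mapped files would get. */
static uint64_t chunk_per_file(uint64_t nfiles)
{
	/* assumption: half of the 4GB address space is usable for mappings */
	const uint64_t window = (UINT64_C(1) << 32) / 2;

	return nfiles ? window / nfiles : window;
}
```

With these assumptions chunk_per_file(1024) gives 2097152 and
chunk_per_file(65536) gives 32768, matching the figures quoted above.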

> Based on that I'd suggest the following solution:
> 1. Take the patch above to let 64bit libast consumers use an
> "unlimited" chunk size for mappings. This will work in _any_ case because
> a) 64bit address space is vast and b) |sfrd()| will retry with half
> the chunk size if the previous attempt to |mmap()| fails.
> Using an "unlimited" chunk size allows the kernel to pick the best MMU
> page size available (and reduces the syscall overhead to almost zero).

> Optionally we could "clamp" the chunk size to 44 bits, which allows
> 65536 files to be open with 44bit chunks (while still being able to
> use multiple 256G MMU pages for each file mapping) and still leaves
> lots of free virtual address space for memory and stack.

> 2. Optionally for 32bit processes we should add low and high "limits"
> for the chunk size... it should *never* be below 4M and not be higher
> than $(( (pow(2,32)/2) / nfiles )) (unless size is lower than 4M).
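
The "retry with half the chunk size" behaviour that point 1 relies on can
be sketched as follows. This is a simplified illustration, not the actual
sfrd.c loop: `map_with_backoff` and its `floor` parameter are hypothetical,
and it calls plain POSIX mmap() rather than sfio's sysmmapf wrapper.

```c
#include <stddef.h>
#include <sys/types.h>
#include <sys/mman.h>

/* Hypothetical helper (not the sfrd.c code): try to map `want` bytes of
 * fd starting at `offset`, halving the request after each failed mmap()
 * until it would drop below `floor`. On success the mapped length is
 * stored in *got; on failure *got is 0 and MAP_FAILED is returned. */
static void *map_with_backoff(int fd, off_t offset, size_t want,
			      size_t floor, size_t *got)
{
	size_t len;

	if (floor == 0)
		floor = 1;	/* guarantees the loop terminates */
	for (len = want; len >= floor; len /= 2)
	{
		void *p = mmap(NULL, len, PROT_READ, MAP_PRIVATE, fd, offset);

		if (p != MAP_FAILED)
		{
			*got = len;
			return p;
		}
	}
	*got = 0;
	return MAP_FAILED;
}
```

A caller would pass the largest chunk it is willing to map as `want` and
something like the 4M lower limit from point 2 as `floor`, then munmap()
exactly `*got` bytes when done.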

> Does that sound reasonable ?

> ----

> Bye,
> Roland

> -- 
>   __ .  . __
>  (o.\ \/ /.o) [email protected]
>   \__\/\/__/  MPEG specialist, C&&JAVA&&Sun&&Unix programmer
>   /O /==\ O\  TEL +49 641 3992797
>  (;O/ \/ \O;)
> _______________________________________________
> ast-developers mailing list
> [email protected]
> http://lists.research.att.com/mailman/listinfo/ast-developers