On 16 April 2013 20:25, Lionel Cons <[email protected]> wrote:
> On 15 April 2013 14:15, Roland Mainz <[email protected]> wrote:
>> On Mon, Apr 15, 2013 at 7:13 AM, Glenn Fowler <[email protected]> wrote:
>>>
>>> On Mon, 15 Apr 2013 03:07:41 +0200 Lionel Cons wrote:
>>>> Based on the recent discussion about using mmap() for reading the
>>>> results of command substitutions I did some testing and found that on
>>>> Solaris (Solaris 11 and a 64bit build) ksh93 still behaves
>>>> suboptimally. The primary problem I see is that MANY mmap() calls
>>>> with a very small map size (524288 bytes) are executed instead of
>>>> either mapping the input file in one large chunk or at least using a
>>>> chunk size large enough that the system can use largepages (2M for
>>>> x86, 4M/32M/256M for SPARC64) if possible. Using a chunk size of
>>>> 524288 bytes is a joke.
>>>
>>>> Is there a specific reason why the code in sfrd.c only maps such
>>>> small chunks (I'd expect that a 64bit process could easily map 16GB
>>>> each time) from a file or is this a bug?
>>>
>>> provide some iffe code that spits out the optimal mmap() page size
>>> for the current os/arch/configuration and that can be rolled into sfio
>>
>> Erm... the "page size" (=the size used for MMU pages) is IMHO the
>> wrong property because it (usually) has to be chosen by the kernel
>> based on { MMU type, supported page sizes, available contiguous memory
>> (as backing store) ... and for I/O the IOMMU page size and preferred
>> page size for the matching I/O device }.
>>
>> The issue here is that the "chunk size" which sfio uses to |mmap()|
>> parts of a large file is very low and in most cases prevents the use
>> of large pages (at least on i386/AMD64, which only has 4096-byte and
>> 2M/4M pages; other platforms have more choices... for example
>> UltraSPARC supports page sizes like 8192, 64k, 512k, 4M, 32M, 256M
>> and 2G pages).
>>
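(Side note: on Solaris an application can also explicitly request a
given page size for an existing mapping via memcntl(2) with
MC_HAT_ADVISE. A minimal, untested sketch, assuming addr/len describe
a region obtained from mmap():
-- snip --
#include <sys/types.h>
#include <sys/mman.h>

/* advise the kernel to back [addr, addr+len) with 4M pages */
static int advise_large_pages(caddr_t addr, size_t len)
{
	struct memcntl_mha mha;

	mha.mha_cmd = MHA_MAPSIZE_VA;	/* set preferred page size */
	mha.mha_flags = 0;
	mha.mha_pagesize = 4 * 1024 * 1024;
	return memcntl(addr, len, MC_HAT_ADVISE, (caddr_t)&mha, 0, 0);
}
-- snip --
The kernel is free to ignore the advice if the pages cannot be
provided, so this degrades gracefully.)
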
>> I did some digging and found that the following patch fixes the issue
>> for 64bit builds:
>> -- snip --
>> --- original/src/lib/libast/sfio/sfrd.c 2012-09-24 20:11:06.000000000 +0200
>> +++ build_i386_64bit_debug/src/lib/libast/sfio/sfrd.c 2013-04-15 03:24:22.892159982 +0200
>> @@ -161,18 +161,20 @@
>>
>>                         /* make sure current position is page aligned */
>>                         if((a = (size_t)(f->here%_Sfpage)) != 0)
>>                         {       f->here -= a;
>>                                 r += a;
>>                         }
>>
>>                         /* map minimal requirement */
>> +#if _ptr_bits < 64
>>                         if(r > (round = (1 + (n+a)/f->size)*f->size) )
>>                                 r = round;
>> +#endif
>>
>>                         if(f->data)
>>                                 SFMUNMAP(f, f->data, f->endb-f->data);
>>
>>                         for(;;)
>>                         {       f->data = (uchar*)sysmmapf((caddr_t)0, (size_t)r,
>>                                                 (PROT_READ|PROT_WRITE),
>>                                                 MAP_PRIVATE,
>> -- snip --
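
(The effect of the #if is that 64bit builds skip the "map minimal
requirement" clamp and map the whole remaining request in one go;
whether the kernel then actually backs the mapping with large pages
can be checked on Solaris with pmap -s <pid>, which prints the page
size of each segment.)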
>
> We've tested the patch with Solaris 11.1 on an Oracle SPARC-T3 machine.
> Below are the sample numbers, averaged over 2000 samples of a builtin
> grep run (grep -F NoNumber tmpfile;true) over a GB-sized text file:
>
> Without your patch:
> real    5m30.956s
> user    5m2.847s
> sys     0m27.204s
>
> With your patch:
> real    5m8.956s
> user    4m52.592s
> sys     0m11.726s
>
> Notice the significant reduction of time spent in sys!
>
> Another noticeable (impressive!) benefit of the patch was that when
> many applications using sfio for IO were running in parallel but
> working on the same file, Solaris automatically assigned 64k pages to
> hotspots in the file mapping, increasing the throughput even further.
> For one particular application, femtoslice, which runs a few thousand
> iterations over the same file, we measured an astonishing 9% decrease
> in run time when 100 processes run in parallel on the same file. The
> explanation is simple: the file gets mapped into the processes as one
> whole block, and Solaris 11.1 allows MMU data sharing between SPARC
> processors. The sharing and the 64k page size together add up to the
> 9% performance benefit.
>
> +1 for the patch

This problem is still a MAJOR performance issue. IMO it would be good
if the patch could be applied unconditionally for the next alpha
release to see whether there are real problems or not. Our performance
lab staff say there aren't any issues with the patch; we've been using
it since April without regressions.

Lionel
_______________________________________________
ast-developers mailing list
[email protected]
http://lists.research.att.com/mailman/listinfo/ast-developers
