On Tue, 18 Jun 2013 14:40:33 +0200 Lionel Cons wrote:
> On 18 June 2013 10:20, Glenn Fowler <[email protected]> wrote:
> >
> > you showed ast grep strace output but not gnu grep
> sorry, I didn't verify the data sent by my staff
> > gnu /usr/bin/grep does read of varying chunk sizes on my redhat linux
> > for NFS files the chunk sizes were around 32Ki
> chunk size for read() or chunk size for mmap()?
it seems gnu grep uses read(2) to read the file
strace looks like
mmap MAP_ANON
read some large number of bytes, seemingly different each time
munmap
> >
> > I added some test options to the SFIO_OPTIONS env var
> >
> > SFIO_OPTIONS=nomaxmap # disable mmap(), force read()
> > SFIO_OPTIONS=maxmap=-1 # map entire file
> > SFIO_OPTIONS=maxmap=1Gi # map 1Gi chunks etc.
> >
> > as long as the buffer isn't trivially small I don't think lines
> > spanning buffer boundaries will be a drag on timing
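to make the buffer boundary point concrete, here is a rough sketch of scanning a file through a bounded mmap() window and carrying a partial line over to the next window. This is not sfio's code: it is Python rather than C, the names count_matching_lines and maxmap are made up for illustration, and it copies each window instead of scanning it in place. It only shows that the boundary handling is one split and one carry per window.

```python
import mmap
import os


def count_matching_lines(path, needle, maxmap=1 << 20):
    """Count lines containing `needle` (bytes), mapping at most ~`maxmap` bytes at a time.

    Hypothetical helper for illustration; not part of sfio or ast grep.
    """
    gran = mmap.ALLOCATIONGRANULARITY
    # mmap offsets must be a multiple of the allocation granularity,
    # so round the window down to a multiple (but never below one unit).
    window = max(gran, (maxmap // gran) * gran)
    size = os.path.getsize(path)
    count = 0
    carry = b""  # partial line left over from the previous window
    with open(path, "rb") as f:
        offset = 0
        while offset < size:
            length = min(window, size - offset)
            with mmap.mmap(f.fileno(), length,
                           access=mmap.ACCESS_READ, offset=offset) as m:
                data = carry + m[:]  # copy of the window, prefixed by the carry
            lines = data.split(b"\n")
            carry = lines.pop()  # last piece may be an incomplete line
            count += sum(1 for line in lines if needle in line)
            offset += length
    if carry and needle in carry:  # final line without a trailing newline
        count += 1
    return count
```

with a window of a few MiB or more, the carry is at most one line per window, which is why the boundary cost should be lost in the noise.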
> You don't understand the issue related to *sharing* largepage I/O
> pages among processes.
I do understand the issue of sharing a huge file
but sfio also has to play nice with the opposite scenario of dealing with
enough small and large files across the entire system to blow the page cache
the knee I mentioned below is to find a happy medium between the ends
of the spectrum where neither side suffers because the other side is a hog
to show how bad it could get mapping in entire files
set up a test with F different 20Gi+ files (F*20Gi >> physical mem size)
and loop ast grep and gnu grep through the files
try ast grep with a few different SFIO_OPTIONS maxmap settings
my guess is the best throughput will be with sfio mapping << 20Gi chunks
and based on your large page trigger below maybe the "play nice" size
would be in 128Mi..256Mi range
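a minimal sketch of a timing loop for that experiment, in Python. The helper name time_with_options is hypothetical, and the command line and file paths are placeholders to fill in. The settings nomaxmap, maxmap=-1 and maxmap=1Gi are the ones listed above; values like maxmap=128Mi assume the same suffix syntax is accepted.

```python
import os
import subprocess
import time


def time_with_options(cmd, settings):
    """Run `cmd` once per SFIO_OPTIONS value; return {setting: elapsed seconds}.

    Hypothetical helper for illustration only.
    """
    results = {}
    for setting in settings:
        # Copy the current environment and override SFIO_OPTIONS for this run.
        env = dict(os.environ, SFIO_OPTIONS=setting)
        start = time.perf_counter()
        # check=False: grep exits non-zero when nothing matches, which is not an error here.
        subprocess.run(cmd, env=env, stdout=subprocess.DEVNULL, check=False)
        results[setting] = time.perf_counter() - start
    return results


# Example call (command and paths are placeholders, not from this thread):
# time_with_options(["grep", "-c", "pattern", "/data/big1", "/data/big2"],
#                   ["nomaxmap", "maxmap=128Mi", "maxmap=256Mi", "maxmap=1Gi", "maxmap=-1"])
```

each run leaves pages in the cache for the next one, so the order of the settings matters. With F*20Gi >> physical memory the cache should be blown between passes anyway, but it is worth repeating the loop in a different order to check.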
> Scenario:
> We have one very large 20GB+ input file. The machine in question has
> 64GB memory. We run around 400 or more jobs to filter and analyse the
> input file, mostly in parallel. If the mmap() buffers are too small
> then the kernel will not share them among different processes and
> keeps copying, reading (through consumer of mmap()), purging the
> buffers.
> If the mmap() chunk size is large enough (e.g. > 128MB in this
> particular case; it appears to be a kernel threshold of 64 pages with
> 2M each to trigger the kernel to grant 2M pages for mmap() I/O usage)
> the kernel will enable the use of largepages (2M on AMD64, compared to
> the default I/O page size of 4k) and share (without - and this is the
> crucial point - creating private copies) the pages between processes
> which concurrently work on the file.
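the arithmetic behind the 128MB figure, as a small Python sketch. The 64-page threshold and the 2M page size are taken from the report above and have not been verified against any kernel source; the function names are made up for illustration.

```python
PAGE_2M = 2 * 1024 * 1024   # large page size quoted above for AMD64
PAGE_4K = 4 * 1024          # default I/O page size quoted above
THRESHOLD_PAGES = 64        # threshold as reported above (unverified assumption)


def large_page_threshold_bytes():
    """Smallest mapping size that reaches the reported 64 x 2M threshold."""
    return THRESHOLD_PAGES * PAGE_2M


def chunks_needed(file_size, chunk_size):
    """Number of chunk_size mappings needed to cover file_size (ceiling division)."""
    return -(-file_size // chunk_size)


if __name__ == "__main__":
    threshold = large_page_threshold_bytes()
    print(threshold // (1024 * 1024), "MiB")       # 64 * 2 MiB
    print(threshold // PAGE_4K, "4k pages")        # same span in default pages
    print(chunks_needed(20 * 1024**3, threshold))  # mappings to cover a 20 GiB file
```

so a 20Gi file mapped at the threshold size takes 160 mappings, and each mapping covers what would otherwise be 32768 4k pages.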
> >
> > you might be able to set up a few experiments to see if there is
> > a knee in the performance curves where the space-time tradeoffs meet
> I'll ask my staff
_______________________________________________
ast-users mailing list
[email protected]
http://lists.research.att.com/mailman/listinfo/ast-users