On 13 September 2013 18:50, Glenn Fowler <[email protected]> wrote:
>
> On Fri, 13 Sep 2013 17:14:03 +0200 Lionel Cons wrote:
>> On 18 June 2013 10:20, Glenn Fowler <[email protected]> wrote:
>> >
>> > you showed ast grep strace output but not gnu grep
>> > gnu /usr/bin/grep does read of varying chunk sizes on my redhat linux
>> > for NFS files the chunk sizes were around 32Ki
>> >
>> > I added some test options to the SFIO_OPTIONS env var
>> >
>> > SFIO_OPTIONS=nomaxmap # disable mmap(), force read()
>> > SFIO_OPTIONS=maxmap=-1 # map entire file
>> > SFIO_OPTIONS=maxmap=1Gi # map 1Gi chunks etc.
>> >
>> > as long ast the buffer isn't trivially small I don't think lines
>> > spanning buffer boundaries will be a drag on timing
>> >
>> > you might be able to set up a few experiments to see if there is
>> > a knee in the performance curves where the space-time tradeoffs meet
>
>> The test results are in the email below:
>> - bigger mmap() windows for sfio file IO. The break even we calculated
>> is somewhere between 108-140MB window size on an Intel Sandy Bridge
>> four way server and 144-157 for an AMD Jaguar prototype machine
>> (X1150). This is likely to 'fix' most of the grep performance issue.
>> Also forwarding a clarifying comment: " explain them that this is NOT
>> akin to buffer bloat. It only the defines the maximum size of the
>> address space window a process can have for this file. The kernel
>> itself is in charge to select an appropriate number of pages it can
>> donate to pass data through this window"
>
> thanks for the report
>
> based on this for the next alpha the default will be 128Mi on machines
> where sizeof(void*)>=8
>
> it can be tweaked with more feedback from other users / systems
>

Glenn, there is a slight misconception: I said 'break even' (the point
where a larger window starts to make sense), but 'the more, the better'
applies as well. Given that 64-bit platforms have a very large address
space, something like >=1GB might be much better, at least in the
presence of hugepages/largepages and page sharing, i.e. where multiple
processes mapping the same file pages share a mapping.

Lionel
_______________________________________________
ast-users mailing list
[email protected]
http://lists.research.att.com/mailman/listinfo/ast-users
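
For illustration only, and not sfio code: a minimal Python sketch of the windowed mmap() idea discussed in this thread. It maps a file in fixed-size address-space windows, carries a partial line across each window boundary, and counts lines containing a byte string. The function name count_matching_lines and its window parameter are hypothetical; the 128 MiB default merely mirrors the figure mentioned above, and the window is rounded down to the platform's mapping granularity as an assumption of this sketch.

```python
import mmap
import os


def count_matching_lines(path, needle, window=128 * 1024 * 1024):
    """Count lines containing `needle` (bytes), mapping at most `window` bytes at a time.

    Hypothetical helper for illustration; not part of sfio or ast grep.
    """
    gran = mmap.ALLOCATIONGRANULARITY
    # mmap offsets must be multiples of the allocation granularity,
    # so round the window down to a multiple of it (at least one unit).
    window = max(gran, window - window % gran)
    size = os.path.getsize(path)
    count = 0
    carry = b""  # unfinished line left over from the previous window
    with open(path, "rb") as f:
        for offset in range(0, size, window):
            length = min(window, size - offset)
            with mmap.mmap(f.fileno(), length,
                           access=mmap.ACCESS_READ, offset=offset) as m:
                pos = 0
                while True:
                    nl = m.find(b"\n", pos)
                    if nl < 0:
                        # No newline left in this window: keep the tail
                        # and finish the line in the next window.
                        carry += m[pos:]
                        break
                    line = carry + m[pos:nl]
                    carry = b""
                    if needle in line:
                        count += 1
                    pos = nl + 1
    # A final line without a trailing newline is still a line.
    if carry and needle in carry:
        count += 1
    return count
```

The window only bounds how much address space is mapped at once; how many pages are actually resident behind it is left to the kernel, which is the point made in the forwarded comment above. Only the line that straddles a window boundary needs the carry buffer, so with a reasonably large window that cost is paid once per window.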
