Glenn, could you promote SFIO_OPTIONS to the official ksh93(1) and grep(1) manual pages, please?
Lionel ---------- Forwarded message ---------- From: Glenn Fowler <[email protected]> Date: 18 June 2013 10:20 Subject: Re: Thank you for the grep builtin! To: [email protected], [email protected] Cc: [email protected], [email protected], [email protected] you showed ast grep strace output but not gnu grep gnu /usr/bin/grep does read of varying chunk sizes on my redhat linux for NFS files the chunk sizes were around 32Ki I added some test options to the SFIO_OPTIONS env var SFIO_OPTIONS=nomaxmap # disable mmap(), force read() SFIO_OPTIONS=maxmap=-1 # map entire file SFIO_OPTIONS=maxmap=1Gi # map 1Gi chunks etc. as long ast the buffer isn't trivially small I don't think lines spanning buffer boundaries will be a drag on timing you might be able to set up a few experiments to see if there is a knee in the performance curves where the space-time tradeoffs meet On Fri, 7 Jun 2013 16:09:09 +0200 Lionel Cons wrote: > On 6 June 2013 22:00, Glenn Fowler <[email protected]> wrote: > > On Thu, 6 Jun 2013 11:23:27 +0200 Lionel Cons wrote: > >> I'd like say "Thank you" for adding the grep builtin to ksh93. > >> Today we figured out that this gave a major performance boost and > >> helped a lot working around shoddy or broken grep implementations. > > > >> Thank you! > > > > great - any soft numbers on improvement? > I used this test script: > ------------------------------------------------------- > builtin grep > export LC_ALL=C # GNU grep does not fully support multibyte characters > float start > integer i > typeset gr > print '>>>>> small string test:' > for gr in '/usr/bin/grep' 'grep' ; do > printf '>>> testing %s...\n' "$gr" > (( start=SECONDS )) > for (( i=0 ; i < 10000 ; i++ )) ; do > [[ "$( "$gr" foo <<<$'hello\nfoobar\nworld\n' )" != '' ]] > done > printf '> time used: %f\n' $((SECONDS-start)) > done > print '>>>>> large file test:' > seq 1000000 >'tstfile' > for gr in '/usr/bin/grep' 'grep' ; do > printf '>>> testing %s...\n' "$gr" > (( start=SECONDS )) > for (( i=0 ; i < 100 ; i++ )) ; do > [[ "$( "$gr" -E '1((2(.)4)|(5(.)6))45' 'tstfile' )" != '' ]] > done > printf '> time used: %f\n' $((SECONDS-start)) > done > rm 'tstfile' > ------------------------------------------------------- > Running it gives an *impressive* improvements for short string matching: > >>>>> small string test: > >>> testing /usr/bin/grep... > > time used: 19.722883 > >>> testing grep... > > time used: 0.378541 > Large file matching is however crippled by libast because it uses a > sliding window to map only 512k segments of a file: > >>>>> large file test: > >>> testing /usr/bin/grep... > > time used: 1.861391 > >>> testing grep... > > time used: 8.581373 > strace logs a lot of mmap() and mumap() calls: > ... > mmap(NULL, 524288, PROT_READ|PROT_WRITE, MAP_PRIVATE, 3, 0x4380000) = > 0x7f5d188da000 > lseek(3, 71303168, SEEK_SET) = 71303168 > munmap(0x7f5d188da000, 524288) = 0 > mmap(NULL, 524288, PROT_READ|PROT_WRITE, MAP_PRIVATE, 3, 0x4400000) = > 0x7f5d188da000 > lseek(3, 71827456, SEEK_SET) = 71827456 > munmap(0x7f5d188da000, 524288) = 0 > mmap(NULL, 524288, PROT_READ|PROT_WRITE, MAP_PRIVATE, 3, 0x4480000) = > 0x7f5d188da000 > lseek(3, 72351744, SEEK_SET) = 72351744 > munmap(0x7f5d188da000, 524288) = 0 > mmap(NULL, 524288, PROT_READ|PROT_WRITE, MAP_PRIVATE, 3, 0x4500000) = > 0x7f5d188da000 > lseek(3, 72876032, SEEK_SET) = 72876032 > munmap(0x7f5d188da000, 524288) = 0 > mmap(NULL, 524288, PROT_READ|PROT_WRITE, MAP_PRIVATE, 3, 0x4580000) = > 0x7f5d188da000 > lseek(3, 73400320, SEEK_SET) = 73400320 > munmap(0x7f5d188da000, 524288) = 0 > mmap(NULL, 524288, PROT_READ|PROT_WRITE, MAP_PRIVATE, 3, 0x4600000) = > 0x7f5d188da000 > lseek(3, 73924608, SEEK_SET) = 73924608 > munmap(0x7f5d188da000, 524288) = 0 > mmap(NULL, 524288, PROT_READ|PROT_WRITE, MAP_PRIVATE, 3, 0x4680000) = > 0x7f5d188da000 > lseek(3, 74448896, SEEK_SET) = 74448896 > munmap(0x7f5d188da000, 524288) = 0 > mmap(NULL, 524288, PROT_READ|PROT_WRITE, MAP_PRIVATE, 3, 0x4700000) = > 0x7f5d188da000 > lseek(3, 74973184, SEEK_SET) = 74973184 > munmap(0x7f5d188da000, 524288) = 0 > mmap(NULL, 524288, PROT_READ|PROT_WRITE, MAP_PRIVATE, 3, 0x4780000) = > 0x7f5d188da000 > lseek(3, 75497472, SEEK_SET) = 75497472 > munmap(0x7f5d188da000, 524288) = 0 > mmap(NULL, 524288, PROT_READ|PROT_WRITE, MAP_PRIVATE, 3, 0x4800000) = > 0x7f5d188da000 > lseek(3, 76021760, SEEK_SET) = 76021760 > munmap(0x7f5d188da000, 524288) = 0 > mmap(NULL, 524288, PROT_READ|PROT_WRITE, MAP_PRIVATE, 3, 0x4880000) = > 0x7f5d188da000 > lseek(3, 76546048, SEEK_SET) = 76546048 > munmap(0x7f5d188da000, 524288) = 0 > mmap(NULL, 524288, PROT_READ|PROT_WRITE, MAP_PRIVATE, 3, 0x4900000) = > 0x7f5d188da000 > lseek(3, 77070336, SEEK_SET) = 77070336 > munmap(0x7f5d188da000, 524288) = 0 > mmap(NULL, 524288, PROT_READ|PROT_WRITE, MAP_PRIVATE, 3, 0x4980000) = > 0x7f5d188da000 > lseek(3, 77594624, SEEK_SET) = 77594624 > munmap(0x7f5d188da000, 524288) = 0 > ... > The time spend in the mmap() and munmap() syscalls and the overhead in > regex() to handle the segments one by one instead of just one large > segment are the dominating factors, ruining performance. > This issue was discussed in this list earlier, with a patch included. > Applying that patch so that libast maps the whole file improves > performance for large files: > >>>>> large file test: > >>> testing /usr/bin/grep... > > time used: 1.919394 > >>> testing grep... > > time used: 2.619201 > This is still slower than GNU grep but this difference is IMO balanced > by the fact tha AST grep is standard conforming, handles multibyte > characters correctly and has --augmented-regexp. > IMO the only action which must be taken here is to apply the patch to > prevent libast from doing the unnecessary 512k segmentation. > Lionel -- Lionel _______________________________________________ ast-users mailing list [email protected] http://lists.research.att.com/mailman/listinfo/ast-users
