When continuations are too expensive, consider representing the continuation explicitly. In this case, I think a list of paths represents the continuation easily enough.
(define (in-directory5 [orig-dir #f]) (define init-dir (or orig-dir (current-directory))) ;; current state of the sequence is a list of paths to produce; when ;; incrementing past a directory, add the directory's immediate ;; content to the front of the list: (define (next l) (if (directory-exists? (car l)) (append (if orig-dir ;; paths are always complete: (directory-list (car l) #:build? #t) ;; work relative to `orig-dir`: (parameterize ([current-directory init-dir]) (directory-list (car l) #:build? #t))) (cdr l)) (cdr l))) (make-do-sequence (lambda () (values car next (if orig-dir (directory-list orig-dir #:build? #t) (directory-list)) pair? #f #f)))) At Wed, 4 Sep 2013 10:01:03 -0400, Sam Tobin-Hochstadt wrote: > Inspired by a post about a faster directory iteration in Haskell [1], > I decided to try doing the same for Racket. The results are here: > https://gist.github.com/samth/6437192 > > The current implementation uses continuations, which are pretty slow. > The fastest solution would fuse the traversal and processing code, but > Racket's optimizer isn't there yet for any implementation I can come > up with. I tried a few different strategies: > > 1. Build a list of all of the entries, and then just return that. > This is quite fast (3x faster than `in-directory), even for 30k+ > entries, but it does excessive IO when it's not needed. > 2. Use a thread and a channel to communicate. This is about twice as > slow (on my SSD) as the list approach, but it doesn't do any excessive > IO. > 3. Use a thread, a channel, and a vector as a buffer. This can be as > fast as or faster than the list approach with a large buffer size > (1000 elements), but it's quite competitive even with a buffer size of > 5. > > My channel implementation reads one entry ahead of where the current > `in-directory` is. > > If we think it's important never to do _any_ extra I/O, then I think > we should use a modified version of my 2nd implementation. If we're > willing to relax that, we should probably go with the 3rd approach > with a buffer size of 5 or 10. > > Note that even my fastest code is 2.5x slower than Unix `find`, even > when discounting the Racket startup overhead. > > Sam > > [1] https://github.com/JohnLato/posix-paths > _________________________ > Racket Developers list: > http://lists.racket-lang.org/dev _________________________ Racket Developers list: http://lists.racket-lang.org/dev