Jens -- thanks for re-directing us to your article. This saved me quite some time. I noticed the lacking abstractions (slicing, comprehension) and considered sketching a prototype to use with this program. I decided to wait till today because it didn't address what bothered me most:
speed. Racket programs are usually faster than Python but apparently Python hits the sweet spot here (it really just calls the C libraries right below the abstractions). I have run the profiler -- with the expected results. I have fused a bunch of functions -- barely an improvement (5% perhaps). I have toyed with the idea of changing data structures -- but that would be cheating. (I am ok with sets vs hash because I can fake set-like programming that way.) I definitely don't want to change algorithms. I am stumped and I'd love to see how we can fix this -- Matthias On Jan 2, 2015, at 8:42 AM, Jens Axel Søgaard <jensa...@soegaard.net> wrote: > 2015-01-02 13:23 GMT+01:00 Daniel Prager <daniel.a.pra...@gmail.com>: >> Slightly off-topic: >> >> Norvig's code is both concise and -- to my eye -- highly readable. E.g. >> >> def edits1(word): >> splits = [(word[:i], word[i:]) for i in range(len(word) + 1)] >> deletes = [a + b[1:] for a, b in splits if b] >> transposes = [a + b[1] + b[0] + b[2:] for a, b in splits if len(b)>1] >> replaces = [a + c + b[1:] for a, b in splits for c in alphabet if b] >> inserts = [a + c + b for a, b in splits for c in alphabet] >> return set(deletes + transposes + replaces + inserts) >> >> Python's concise list comprehensions, destructuring, string slice operators, >> and operator overloading slot together nicely to make this kind of thing >> easy to knock out. >> >> I was able to achieve the following in Racket, but it's a fair bit of extra >> for a lesser result: >> >>> (define (edits1 word) >>> (define sub substring) >>> (define nth string-ref) >>> (define splits (for/list ([i (range (add1 (string-length word)))]) >>> (cons (sub word 0 i) (sub word i)))) >>> >>> (define (map-splits min-b f [chars '(dummy)]) >>> (for*/list ([split splits] >>> [a (in-value (car split))] >>> [b (in-value (cdr split))] >>> #:when (>= (string-length b) min-b) >>> [ch chars]) >>> (f a b ch))) >>> >>> (let ([deletes (map-splits 1 (λ (a b _) (~a a (sub b 1))))] >>> [transposes (map-splits 2 (λ (a b _) (~a a (nth b 1) (nth b 0) >>> (sub b 2))))] >>> [replaces (map-splits 1 (λ (a b ch) (~a a ch (sub b 1))) >>> alphabet)] >>> [inserts (map-splits 0 (λ (a b ch) (~a a ch b)) alphabet)]) >>> (append deletes transposes replaces inserts))) >> >> >> Comments? Suggestions? > > A DSL to make string manipulations more concise would be nice to have. > An ad hoc attempt can be seen here (the concat macro): > > http://scheme.dk/blog/2007/04/writing-spelling-corrector-in-plt.html > > The blog post was written before for and friends arrived in Racket, > so the code uses srfi 42 instead. > > -- > Jens Axel Søgaard > > ____________________ > Racket Users list: > http://lists.racket-lang.org/users ____________________ Racket Users list: http://lists.racket-lang.org/users