Hi, Bane and all, Am Dienstag, 13. Dezember 2005 14:22 schrieben Sie: > > > In real world situation your KMP will always be fastest on average. > > > I like that we are not using C arrays as then we have advantage > > > of lazyness and save on memory usage. C++ program will be faster > > > on shorter strings but on this large strings will loose due memory > > > latency. and with your test, both programs are very fast.
Yesterday, I did the unspeakable -- I wrote a C-version. Smashes Haskell-performance for short enough Strings (factor 10 for my test, factor 2.2 for Bane's), but once it starts swapping, we catch up, and for really large Strings I dare say we'd win far out. I also managed to get my KMP still faster, using take and drop instead of splitAt helps a lot (Bane, the use of 'break' in my transcript of yours was what slowed it down, I reintroduced searchr''' and both are equal). I'm not quite sure, whether that indeed helps, but I also chose to use listArray for the boxed array of borders. Now it's searchReplace :: String -> String -> String -> String searchReplace "" _ str = str searchReplace src@(c:cs) dst str = process 0 str "" where len = length src pat :: UArray Int Char pat = listArray (0,len-1) src bord = A.listArray (0,len) $ (-1):0: [getBord (pat!i) i + 1 | i <- [1 .. len-1]] getBord s n | m < 0 = m | s == pat!m = m | otherwise = getBord s m where m = bord A.! n bor :: UArray Int Int bor = listArray (0,len) $ A.elems bord getBor s n | m < 0 || s == pat!m = m | otherwise = getBor s m where m = bor!n process n str _ | n >= len = dst ++ process 0 str "" process _ "" mat = reverse mat process 0 (s:st) _ | s == c = process 1 st [s] | otherwise = s:process 0 st "" process n str@(s:st) mat | s == pat!n = process (n+1) st (s:mat) | otherwise = case getBor s n of -1 -> reverse mat ++ process 0 str "" 0 -> reverse mat ++ process 1 st [s] j -> reverse (drop j mat) ++ process (j+1) st (s:take j mat) gives a speedup of roughly 10% on my box versus yesterday's version. > > > > > > Greetings, Bane. > > > >On my 256MB RAM AMD Duron 1200 MHz, Bulat's version is consistently about > >20% > >faster than my KMP on your test -- btw, I unboxed the pat array, which > > gave a > >bit of extra speed, but not much. > > I think that's because on your machine Bulat's version have better > perfromance > with CPU cache. > I don;t know but now your version is 25% faster with my test on P4 > hyperthreaded. Errrr, what's 'hyperthreaded' ? Unfortunately, I'm completely useless with computers. > > your new version: > $ time srchrep.exe > Working:seasearch replace able seaseasearch baker seasearch charlie > True > Done > > > real 0m8.734s > user 0m0.015s > sys 0m0.000s > > Bulat's version: > > [EMAIL PROTECTED] ~/tutorial > $ time replace1.exe > Working:seasearch replace able seaseasearch baker seasearch charlie > True > Done > > > real 0m11.734s > user 0m0.015s > sys 0m0.015s > > 3 secs difference now. > > >And apologies to Sebastian Sylvan, I also included an unboxed version of > >bord, > >built from the boxed version, and that sped things further up -- not much, > >again, but there it is. > > On my machine you got another 10-15% of boost with unboxed arrays. > > >I wonder about this difference, -10% on one system and +20% on another > >system, > >ist that normal? > > Different caching schemes on CPU's perhaps? different memory latencies? > hyperthreading helps your version? more code and data, perhaps because > of that it pays the price on your machine? > > Greetings, Bane. > Well, whatever. Upto now, on my box, Bulat's is still the fastest for your test -- though I've narrowed the gap quite a bit. Cheers, Daniel _______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe