> From: Jared Updike [mailto:[EMAIL PROTECTED] 
> 
> Thanks for posting the code. It works on pretty large data sets (for
> example, a thousand Strings each) and I have a hunch that if I use
> Data.ByteString it would even work fast enough on my quarter meg text
> files (split on words, ~40,000 and ~50,000 words each) to use in place
> of GNU sdiff or diff. Did you use FastPackedString or ByteString to
> get performance you alluded to?


You'll notice that the primary interface is:
  diff :: Eq a => [a] -> [a] -> [Modification a]

So it's not optimised for any kind of String input.

The performance I'm alluding to is not something I've tested with any rigor, or 
even metrics to support. I simply needed to compare a couple of large text 
files which GNU diff didn't handle (I think when the input is over a certain 
size it breaks it into chunks and diffs the chunks). I tried this code, and it 
did the job in a reasonable time (my PC has 1G of memory, and that helps when 
the input lists have to reside entirely in memory). That was using naïve 
String-IO too, so there might well be some mileage in a FPS-specific 
implementation.

I profiled the code by generating large inputs in memory, and then invoking the 
various interfaces, so as to isolate the testing from IO performance. And that 
was just to find the performance hotspots in the algorithm, not to compare 
against other diff implementations.

Alistair
*****************************************************************
Confidentiality Note: The information contained in this message,
and any attachments, may contain confidential and/or privileged
material. It is intended solely for the person(s) or entity to
which it is addressed. Any review, retransmission, dissemination,
or taking of any action in reliance upon this information by
persons or entities other than the intended recipient(s) is
prohibited. If you received this in error, please contact the
sender and delete the material from any computer.
*****************************************************************
_______________________________________________
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe

Reply via email to