Re: [Haskell-cafe] MD5? (was: Haskell performance question)

2007-11-09 Thread Thomas M. DuBuisson
I made some minor changes, fixing up my chunking function (finally) and
thus eliminating the space leak.  Running time is now under 3x that of C!
Yay!  Also, nano-md5 benched at 1.15x 'C' (for files small enough for
strict ByteStrings to do OK).
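
For readers wondering what a leak-free chunking function can look like, here is a minimal sketch. It is not the pureMD5 code; the names foldBlocks and blockSize are made up for illustration, and it assumes a bytestring version that provides L.toStrict. It walks a lazy ByteString in 64-byte blocks (the MD5 block size) with a strict accumulator, so neither a thunk chain nor the consumed input is retained.

```haskell
{-# LANGUAGE BangPatterns #-}
import qualified Data.ByteString      as B
import qualified Data.ByteString.Lazy as L

-- MD5 consumes its input in 64-byte (512-bit) blocks.
blockSize :: Int
blockSize = 64

-- | Strict left fold over the full 64-byte blocks of a lazy ByteString.
-- Returns the final accumulator and the trailing partial block (which a
-- real hash would go on to pad). The bang on 'acc' forces it to weak
-- head normal form at every step, so no chain of thunks builds up.
-- 'foldBlocks' is a hypothetical helper, not part of any library.
foldBlocks :: (a -> B.ByteString -> a) -> a -> L.ByteString -> (a, B.ByteString)
foldBlocks step = go
  where
    go !acc lbs =
      let (blk, rest) = L.splitAt (fromIntegral blockSize) lbs
      in if L.length blk == fromIntegral blockSize
           then go (step acc (L.toStrict blk)) rest
           else (acc, L.toStrict blk)

main :: IO ()
main = do
  -- 150 bytes = two full blocks plus 22 bytes left over.
  let (n, leftover) = foldBlocks (\k _ -> k + 1) (0 :: Int) (L.replicate 150 0)
  print (n, B.length leftover)
```

The accumulator here is just a block counter; in a real hash it would be the running digest state, ideally a record with strict fields so that forcing it really evaluates all of it.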

Get the code:
darcs get http://code.haskell.org/~tommd/pureMD5

On the 2GB benchmark it is even more competitive (see my blog on
sequence.complete.org).  Let me know if you get significantly different
results (and you will if your IO doesn't horribly bottleneck you like it
does on my laptop).

-Tom

> You might like to test against,
>
> http://hackage.haskell.org/cgi-bin/hackage-scripts/package/nano-md5-0.1
>
> which is a strict bytestring openssl binding.
>
> -- Don
-- 
The philosophy behind your actions should never change, on the other
hand, the practicality of them is never constant. - Thomas Main
DuBuisson

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] MD5? (was: Haskell performance question)

2007-11-08 Thread Andrew Coppin

Don Stewart wrote:
> dpiponi:
>> I was getting about 1.5s for the Haskell program and about 0.08s for
>> the C one with the same n=10,000,000.
>
> I'm sure we can do better than that!

That's the spirit! :-D


Speaking of which [yes, I'm going to totally hijack this thread now...], 
does anybody have a Haskell MD5 hash implementation that goes fast? 
IIRC, I found one in MissingH, and it worked great. Except that as soon 
as you feed it a 10 MB file, the standard Unix md5sum executable takes 
about 0.001s to do it, and the Haskell version goes crazy and starts 
eating virtual memory like candy. o_O (Although given a few minutes it 
*does* produce the correct answer. But given that I want to run it over 
an entire CD..)


Given the choice, I'd *like* to find a fast 100% Haskell implementation
- but failing that, (nice) bindings to a fast C implementation will do, I
guess. (I *only* need to compute MD5 hashes for files on disk. I don't
need to do anything more fancy than that...)




Re: [Haskell-cafe] MD5? (was: Haskell performance question)

2007-11-08 Thread Thomas M. DuBuisson
Glad you asked!

http://sequence.complete.org/node/367

I just posted that last night!  Once I get a community.haskell.org
login I will put the code on darcs.

The short of it is:
1) The code is still ugly; I haven't been motivated to clean it up.
2) Manually unrolled, it is ~6 times slower than C.
3) When rolled, it is much slower still.
4) There is some optimizer bug in GHC - this code could be 2x faster, I
feel certain.
5) I benchmarked using a 200MB file, so I think it will handle whatever.

Thomas DuBuisson

On Thu, 2007-11-08 at 22:14 +, Andrew Coppin wrote:
> Don Stewart wrote:
>> dpiponi:
>>> I was getting about 1.5s for the Haskell program and about 0.08s for
>>> the C one with the same n=10,000,000.
>>
>> I'm sure we can do better than that!
>
> That's the spirit! :-D
>
> Speaking of which [yes, I'm going to totally hijack this thread now...],
> does anybody have a Haskell MD5 hash implementation that goes fast?
> IIRC, I found one in MissingH, and it worked great. Except that as soon
> as you feed it a 10 MB file, the standard Unix md5sum executable takes
> about 0.001s to do it, and the Haskell version goes crazy and starts
> eating virtual memory like candy. o_O (Although given a few minutes it
> *does* produce the correct answer. But given that I want to run it over
> an entire CD..)
>
> Given the choice, I'd *like* to find a fast 100% Haskell implementation
> - but failing that, (nice) bindings to a fast C implementation will do, I
> guess. (I *only* need to compute MD5 hashes for files on disk. I don't
> need to do anything more fancy than that...)
 



Re: [Haskell-cafe] MD5? (was: Haskell performance question)

2007-11-08 Thread Don Stewart
andrewcoppin:
> Don Stewart wrote:
>> dpiponi:
>>> I was getting about 1.5s for the Haskell program and about 0.08s for
>>> the C one with the same n=10,000,000.
>>
>> I'm sure we can do better than that!
>
> That's the spirit! :-D
>
> Speaking of which [yes, I'm going to totally hijack this thread now...],
> does anybody have a Haskell MD5 hash implementation that goes fast?
> IIRC, I found one in MissingH, and it worked great. Except that as soon
> as you feed it a 10 MB file, the standard Unix md5sum executable takes
> about 0.001s to do it, and the Haskell version goes crazy and starts
> eating virtual memory like candy. o_O (Although given a few minutes it
> *does* produce the correct answer. But given that I want to run it over
> an entire CD..)
>
> Given the choice, I'd *like* to find a fast 100% Haskell implementation
> - but failing that, (nice) bindings to a fast C implementation will do, I
> guess. (I *only* need to compute MD5 hashes for files on disk. I don't
> need to do anything more fancy than that...)

Start with a fast C version, and translate that into code over
ByteStrings. If it's not within 2x of the C, call the bytestring hackers
hotline, which is on the wiki.
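
As a small illustration of what "translating C into code over ByteStrings" can mean, here is a hedged sketch of one piece of an MD5 port: decoding a block into little-endian 32-bit words, which the RFC 1321 reference C code does in its Decode routine. The names wordLE and blockWords are hypothetical, and this is only one fragment, not a full MD5.

```haskell
import qualified Data.ByteString        as B
import qualified Data.ByteString.Unsafe as BU
import           Data.Bits              (shiftL, (.|.))
import           Data.Word              (Word32)

-- | Read the i-th little-endian 32-bit word of a strict ByteString.
-- Uses unchecked indexing, as the C code would; the caller must
-- guarantee that bytes 4*i .. 4*i+3 exist.
wordLE :: B.ByteString -> Int -> Word32
wordLE bs i =
      byte 0
  .|. (byte 1 `shiftL` 8)
  .|. (byte 2 `shiftL` 16)
  .|. (byte 3 `shiftL` 24)
  where
    byte k = fromIntegral (BU.unsafeIndex bs (4 * i + k)) :: Word32

-- | All complete 32-bit words of a block; trailing bytes that do not
-- fill a whole word are ignored.
blockWords :: B.ByteString -> [Word32]
blockWords bs = [ wordLE bs i | i <- [0 .. B.length bs `div` 4 - 1] ]

main :: IO ()
main = print (blockWords (B.pack [0x78, 0x56, 0x34, 0x12]) == [0x12345678])
```

Building a list of words is the simple version; a tuned port would more likely read the words directly inside the round functions to avoid the intermediate list.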

-- Don


Re: [Haskell-cafe] MD5? (was: Haskell performance question)

2007-11-08 Thread Don Stewart

thomas.dubuisson:
> Glad you asked!
>
> http://sequence.complete.org/node/367
>
> I just posted that last night!  Once I get a community.haskell.org
> login I will put the code on darcs.

Cool. I'll look at this.

You might like to test against,

http://hackage.haskell.org/cgi-bin/hackage-scripts/package/nano-md5-0.1

which is a strict bytestring openssl binding.

-- Don


Re: [Haskell-cafe] MD5? (was: Haskell performance question)

2007-11-08 Thread Stefan O'Rear
On Thu, Nov 08, 2007 at 06:14:20PM -0500, Thomas M. DuBuisson wrote:
> Glad you asked!
>
> http://sequence.complete.org/node/367
>
> I just posted that last night!  Once I get a community.haskell.org
> login I will put the code on darcs.
>
> The short of it is:
> 1) The code is still ugly; I haven't been motivated to clean it up.
> 2) Manually unrolled, it is ~6 times slower than C.
> 3) When rolled, it is much slower still.
> 4) There is some optimizer bug in GHC - this code could be 2x faster, I
> feel certain.
> 5) I benchmarked using a 200MB file, so I think it will handle whatever.

Why did you put yourself through all this pain when you could have just
copied the code from md5sum(1), removed the main function, and foreign
imported its buffer accumulator, wrapping it as a function over lazy
bytestrings?  We have the best foreign function interface in the world.
Reinventing wheels is stupid, especially if the existing wheels are this
easy to use.
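
A rough sketch of that approach, with one substitution: instead of the md5sum(1) sources it binds OpenSSL's MD5_Init / MD5_Update / MD5_Final (the same library nano-md5 wraps), so it assumes libcrypto is installed and the program is linked with -lcrypto. The names md5Lazy and toHex are made up here, and the 256-byte context buffer is an assumed safe upper bound on sizeof(MD5_CTX), not a value taken from the headers.

```haskell
{-# LANGUAGE ForeignFunctionInterface #-}
import qualified Data.ByteString        as B
import qualified Data.ByteString.Lazy   as L
import qualified Data.ByteString.Unsafe as BU
import           Data.Word              (Word8)
import           Foreign.C.Types        (CInt (..), CSize (..))
import           Foreign.Marshal.Alloc  (allocaBytes)
import           Foreign.Ptr            (Ptr, castPtr)
import           Text.Printf            (printf)

-- Opaque stand-in for OpenSSL's MD5_CTX; we only ever pass pointers to it.
data MD5Ctx

foreign import ccall unsafe "openssl/md5.h MD5_Init"
  c_MD5_Init :: Ptr MD5Ctx -> IO CInt
foreign import ccall unsafe "openssl/md5.h MD5_Update"
  c_MD5_Update :: Ptr MD5Ctx -> Ptr Word8 -> CSize -> IO CInt
foreign import ccall unsafe "openssl/md5.h MD5_Final"
  c_MD5_Final :: Ptr Word8 -> Ptr MD5Ctx -> IO CInt

-- | Feed each chunk of a lazy ByteString to the C accumulator in turn,
-- so only one chunk needs to be resident at a time.
md5Lazy :: L.ByteString -> IO B.ByteString
md5Lazy lbs =
  allocaBytes 256 $ \ctx ->   -- assumption: at least sizeof(MD5_CTX)
  allocaBytes 16  $ \out -> do
    _ <- c_MD5_Init ctx
    mapM_ (\chunk -> BU.unsafeUseAsCStringLen chunk $ \(p, n) ->
              c_MD5_Update ctx (castPtr p) (fromIntegral n))
          (L.toChunks lbs)
    _ <- c_MD5_Final out ctx
    B.packCStringLen (castPtr out, 16)   -- copy the 16-byte digest out

-- | Render a digest as lowercase hex.
toHex :: B.ByteString -> String
toHex = concatMap (printf "%02x") . B.unpack

main :: IO ()
main = do
  -- Hash standard input, e.g.  ./md5 < some-file
  digest <- md5Lazy =<< L.getContents
  putStrLn (toHex digest)
```

Return codes are ignored for brevity. The result lives in IO because the accumulator mutates the context buffer; a wrapper that exposes it as a pure function over lazy bytestrings is possible but needs more care than shown here.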

Stefan

