On Sun, 13 Mar 2005 11:37:59 -0800, Christopher Smith <[EMAIL PROTECTED]> wrote:
> Carl Lowenstein wrote:
>
> >I was using K3b on my Thinkpad laptop (300MHz PII). I burned an ISO
> >image to a CDR disc, with the option "verify the MD5 checksum".
> >Burning the image took 3.5 minutes. Calculating the checksum of the
> >original 650MB image took some 20 minutes, and another 20 minutes to
> >read the disc and calculate its checksum. During all of this 40-minute
> >interval the CPU was about 100% busy.
> >
> >Finding this hard to believe, I used the "md5sum" program from GNU
> >core utils 5.2.1 and found that it took 37 seconds to calculate the
> >checksum from the image.
> >
> >Some Google research has turned up < http://www.equi4.com/md5/ >
> >"The MD5 algorithm in different programming languages". Timings vary
> >by several orders of magnitude. Chasing through the K3b sources, I
> >think I have found the algorithm as part of the kdecore Library, and
> >it seems to be written in obfuscated C++, and documented in
> >/usr/share/doc/HTML/en/kdelibs-apidocs\
> >/kdecore/html/kmdcodec_8h-source.html
> >
> >Obviously, some computers are faster than others, and if I was doing
> >this on a 2.8GHz P4 it would run about 10x faster. But this is
> >ridiculous.
> >
> >
> I'd be willing to bet your CPU was getting maxed because of issues
> with your driver, rather than the md5 calculations. Depending on how
> the drivers were set up, scanning the CD as fast as possible may
> result in maxing out the CPU. Check out what happens when you
> do an md5sum of /dev/cdrom (whatever that is linked to).

You know, I have been using computer hardware for nigh on to 40 years
now. And I have not recently seen such a misunderstanding of hardware
vs. software speeds.

The md5 calculation takes 20 minutes when done by K3b on the 650MB ISO
image on the hard drive. It also takes 20 minutes when done directly
by scanning the CD.

Without using K3b:
" time dd if=/dev/scd0 of=/dev/null bs=2k
" 32546+0 records in
" 32546+0 records out
" real 2m20.542s
" user 0m0.869s
" sys 0m10.726s

Observe that the hardware can scan the disc in 2 min 20 sec without
significant CPU load. Also observe that this CD drive is achieving a
performance of (74 min / 2.34 min) = 31.6x.
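
As a check on that arithmetic, here is a minimal Python sketch. It
assumes, as the figure above does, that the disc holds a full 74
minutes of 1x playing time. The function name is my own invention and
is not part of K3b or any other tool.

```python
def read_speed_multiple(disc_minutes, elapsed_seconds):
    """Return how many times faster than 1x the disc was read."""
    return (disc_minutes * 60.0) / elapsed_seconds

# 2m20.542s is the "real" time reported for the plain dd run above.
elapsed = 2 * 60 + 20.542
print(f"{read_speed_multiple(74, elapsed):.1f}x")  # prints 31.6x
```
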
Try piping directly to md5sum:
" time dd if=/dev/scd0 bs=2k | md5sum
. . .
" real 2m53.991s
" user 0m24.075s
" sys 0m24.108s

Calculating md5sum on the fly while reading the CD adds some 33
seconds to the total, and is observed to use 33% to 50% CPU time.
Interestingly enough, calculating the md5sum from a disk image on the
hard drive takes 37 seconds.
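
For anyone who wants to separate the cost of the hash from the cost of
the I/O on their own machine, the following is a rough Python sketch.
It is not what K3b or md5sum actually do internally. It reads a file
in fixed-size chunks, feeds them to the MD5 implementation in Python's
standard hashlib module, and reports the elapsed time. The function
name and the 1MB chunk size are my assumptions.

```python
import hashlib
import time

def timed_md5(path, chunk_size=1024 * 1024):
    """Return (hex digest, elapsed seconds) for the file at path."""
    digest = hashlib.md5()
    start = time.perf_counter()
    with open(path, "rb") as f:
        # Read in chunks so a 650MB image never has to fit in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest(), time.perf_counter() - start
```

Calling timed_md5("image.iso") on a hypothetical image file gives a
digest that should match what md5sum prints for the same file, so the
elapsed time can be compared directly with the 37 seconds above.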

My point is that whatever committee wrote K3b, they chose a very
inefficient implementation of that algorithm. And it came from a
standard KDE library (kdecore).
carl
--
carl lowenstein marine physical lab u.c. san diego
[EMAIL PROTECTED]