I have a feeling a couple of other things are going on here. First, the time reported in GC is pretty low compared to the overall time. I tried something silly: compressing to a file (using the z3f functions), then reading or decompressing the data back from the file. Compared to gzip-string (0.88s), these functions were only about twice as slow (1.5s), not several times slower as with z3:compress-string (4.0s). Furthermore, they do just as many major GCs as z3:compress-string.
Second, gzip-string (on my system) does not take 0.888
seconds of real time to execute. That's just what Chicken reports. If
you comment out the other tests and actually run the script from the
command line using 'time', it takes -substantially- longer: 7.9s! I
expect this on OS X because it is notoriously slow at forking new
processes; it may be a lot faster on e.g. Linux. However, the point
is, only the time for Chicken itself is being measured, not for gzip.
Clearly z3:compress-string could be a lot faster, as I was able to
compress to and decompress from a file in 1/3 the time! But I would
also test gzip-string again on your system and verify you are getting
the performance you think you are.
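
To make the measurement point concrete, here is a minimal sketch in Python (not Chicken) of the same effect: a timer that only counts CPU charged to the calling process misses nearly all the time spent waiting on a forked child. The helper name time_child and the sleeping child command are made up for illustration; the child just sleeps instead of running gzip so the sketch does not depend on gzip being installed.

```python
import subprocess
import sys
import time


def time_child(cmd):
    """Run cmd to completion; return (wall_seconds, parent_cpu_seconds)."""
    wall_start = time.perf_counter()   # wall-clock, like the shell's "real"
    cpu_start = time.process_time()    # CPU charged to this process only
    subprocess.run(cmd, check=True)    # fork/exec the child and wait for it
    cpu = time.process_time() - cpu_start
    wall = time.perf_counter() - wall_start
    return wall, cpu


if __name__ == "__main__":
    # Hypothetical stand-in for the forked gzip: a child that sleeps 0.2s.
    child = [sys.executable, "-c", "import time; time.sleep(0.2)"]
    wall, cpu = time_child(child)
    print(f"wall-clock: {wall:.3f}s, CPU charged to this process: {cpu:.3f}s")
```

The wall-clock figure includes process startup and the child's own run time, while the CPU figure covers only the parent's bookkeeping. That is the same kind of gap as between the 0.888s Chicken reports and the 7.9s the shell reports.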
Test data follows and I attached an updated compress-test.scm.
Test expression is: (string-length (gzip-string *input*))
Evaluating once. Return value is: 29
Repeating 1000 times:
0.888 seconds elapsed
0.008 seconds in (major) GC
43772 mutations
127 minor GCs
6 major GCs
real 0m7.926s
user 0m1.697s
sys 0m5.843s
Test expression is: (string-length (z3:compress-string *input*))
Evaluating once. Return value is: 19
Repeating 1000 times:
4.073 seconds elapsed
0.486 seconds in (major) GC
16924 mutations
20 minor GCs
663 major GCs
real 0m4.109s
Test expression is: (z3:compress-string/file /tmp/b.gz *input*)
Evaluating once. Return value is: (compressed data omitted)
Repeating 1000 times:
1.591 seconds elapsed
0.78 seconds in (major) GC
7196 mutations
2 minor GCs
621 major GCs
Test expression is: (z3:compress-and-decompress-string/file /tmp/b.gz *input*)
Evaluating once. Return value is: "xxxxxxxxx [...]"
Repeating 1000 times:
1.425 seconds elapsed
0.719 seconds in (major) GC
18468 mutations
6 minor GCs
999 major GCs
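
For anyone who wants to compare the runs at a glance, here is a small Python sketch that works out the GC share of each run and the gap between Chicken's reported time and the shell's "real" time for gzip-string. The numbers are copied from the output above; the dictionary layout and function names are just for illustration.

```python
# Figures copied from the benchmark output above (1000 repetitions each).
RUNS = {
    "gzip-string": {"elapsed": 0.888, "gc": 0.008, "major_gcs": 6},
    "z3:compress-string": {"elapsed": 4.073, "gc": 0.486, "major_gcs": 663},
    "z3:compress-string/file": {"elapsed": 1.591, "gc": 0.78, "major_gcs": 621},
    "z3:compress-and-decompress-string/file": {
        "elapsed": 1.425, "gc": 0.719, "major_gcs": 999},
}

# "real" time reported by the shell for the gzip-string run.
GZIP_WALL = 7.926


def gc_share(run):
    """Fraction of the reported elapsed time spent in major GC."""
    return run["gc"] / run["elapsed"]


def wall_to_reported_ratio():
    """How much longer gzip-string took on the wall clock than Chicken reported."""
    return GZIP_WALL / RUNS["gzip-string"]["elapsed"]


if __name__ == "__main__":
    for name, run in RUNS.items():
        print(f"{name}: {run['elapsed']:.3f}s elapsed, "
              f"{gc_share(run):.1%} in major GC, {run['major_gcs']} major GCs")
    print(f"gzip-string wall/reported: {wall_to_reported_ratio():.1f}x")
```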
On 12/21/06, felix winkelmann <[EMAIL PROTECTED]> wrote:
On 12/21/06, Graham Fawcett <[EMAIL PROTECTED]> wrote:
> Hi folks,
>
> Does anyone have code for compressing strings using zlib, lzo or some
> other common library/algorithm? I seem to have z3 working, but the
> performance is really terrible -- I may be doing something wrong, to
> be fair, the documentation is a bit light.
>
> I'm getting ~20x better performance using (process "gzip") to fork
> gzip and compress that way, compared with a string-compression
> procedure that I copied from the z3 test-script:
>
>[...]
>
> I know it's not an apples-to-apples comparison. But why the huge
> difference? My lack of understanding of the z3 egg may be the cause.

Your code looks ok. Note the massive number of (major) garbage
collections. The code just conses a lot. One reason for that is the
repeated (and required) use of substring. The z3lib itself should be
fast enough, but it may be that the interface is too simplistic.

I will add a set of compress/decompress-the-whole-buffer routines
written in C which should be much faster.

cheers,
felix

_______________________________________________
Chicken-users mailing list
[email protected]
http://lists.nongnu.org/mailman/listinfo/chicken-users
compress-test.scm
Description: Binary data
