Hi again,

Here I decided to look at the raw performance of guile-zstd vs guile-zlib when decompressing the ungoogled-chromium source archive into a tarball of roughly 4 GiB.
You'll need to generate the tar.zst and tar.gz yourself, but the script that was used is:

--8<---------------cut here---------------start------------->8---
;; decompress-zstd.scm
(use-modules (ice-9 binary-ports)
             (ice-9 match)
             (statprof)
             (zstd))

(define MiB (expt 2 20))
(define input-file "/tmp/chromium-98.0.4758.102.tar.zst")
(define output-file "/dev/null")

(define (decompression-test)
  (call-with-input-file input-file
    (lambda (port)
      (call-with-zstd-input-port port
        (lambda (input)
          (call-with-output-file output-file
            (lambda (output)
              (let loop ((bv (get-bytevector-n input (* 4 MiB))))
                (match bv
                  ((? eof-object?)
                   #t)
                  (bv
                   (put-bytevector output bv)
                   (loop (get-bytevector-n input (* 4 MiB)))))))))))))

(statprof (lambda () (decompression-test)))
--8<---------------cut here---------------end--------------->8---

Compiled and run:

--8<---------------cut here---------------start------------->8---
$ alias time+
alias time+='command time -f"cpu: %P, mem: %M KiB, wall: %E, sys: %S, usr: %U"'
$ guild compile -O3 /tmp/decompress-zstd.scm
$ time+ guile /tmp/decompress-zstd.scm
%     cumulative   self
time   seconds     seconds  procedure
 48.69     13.93     13.93  anon #x1689100
 45.38     12.98     12.98  %after-gc-thunk
  3.47      0.99      0.99  bytevector->pointer
  0.46     28.59      0.13  zstd.scm:234:2:read!
  0.39      0.11      0.11  get-bytevector-n!
  0.23      0.22      0.07  system/foreign.scm:150:0:write-c-struct
  0.23      0.07      0.07  bytevector-u64-native-set!
  0.15      0.07      0.04  system/foreign.scm:167:0:read-c-struct
  0.15      0.04      0.04  anon #x1688ed0
  0.15      0.04      0.04  assv-ref
  0.15      0.04      0.04  system/foreign.scm:91:9
  0.08      0.26      0.02  system/foreign.scm:182:0:make-c-struct
  0.08      0.02      0.02  put-bytevector
  0.08      0.02      0.02  list?
  0.08      0.02      0.02  sizeof
  0.08      0.02      0.02  pointer->bytevector
  0.08      0.02      0.02  make-bytevector
  0.08      0.02      0.02  bytevector-u64-native-ref
  0.00     28.61      0.00  zstd.scm:273:0:call-with-zstd-input-port
  0.00     28.61      0.00  ice-9/ports.scm:438:0:call-with-input-file
  0.00     28.61      0.00  /tmp/decompress-zstd.scm:16:12
  0.00     28.61      0.00  ice-9/ports.scm:456:0:call-with-output-file
  0.00     28.59      0.00  get-bytevector-n
  0.00     12.98      0.00  anon #x167aed0
  0.00      0.07      0.00  system/foreign.scm:187:0:parse-c-struct
  0.00      0.04      0.00  zstd.scm:57:4
  0.00      0.04      0.00  srfi/srfi-1.scm:452:2:fold
  0.00      0.02      0.00  system/foreign.scm:188:20
---
Sample count: 1298
Total time: 28.614481162 seconds (15.671167152 seconds in GC)
cpu: 153%, mem: 39156 KiB, wall: 0:18.92, sys: 0.50, usr: 28.45
--8<---------------cut here---------------end--------------->8---

And for guile-zlib, after adjusting the script to:

--8<---------------cut here---------------start------------->8---
(use-modules (ice-9 binary-ports)
             (ice-9 match)
             (statprof)
             (zlib))

(define MiB (expt 2 20))
(define input-file "/tmp/chromium-98.0.4758.102.tar.gz")
(define output-file "/dev/null")

(define (decompression-test)
  (call-with-input-file input-file
    (lambda (port)
      (call-with-gzip-input-port port
        (lambda (input)
          (call-with-output-file output-file
            (lambda (output)
              (let loop ((bv (get-bytevector-n input (* 4 MiB))))
                (match bv
                  ((? eof-object?)
                   #t)
                  (bv
                   (put-bytevector output bv)
                   (loop (get-bytevector-n input (* 4 MiB)))))))))))))

(statprof (lambda () (decompression-test)))
--8<---------------cut here---------------end--------------->8---

I got:

--8<---------------cut here---------------start------------->8---
$ time+ guile /tmp/decompress-gzip.scm
%     cumulative   self
time   seconds     seconds  procedure
 71.18     21.21     21.21  anon #x218af40
 20.78      6.19      6.19  %after-gc-thunk
  5.33      1.59      1.59  bytevector->pointer
  2.39     23.51      0.71  zlib.scm:99:4
  0.32      6.29      0.09  zlib.scm:182:2:read!
  0.00     29.80      0.00  /tmp/decompress-gzip.scm:16:12
  0.00     29.80      0.00  get-bytevector-n
  0.00     29.80      0.00  ice-9/ports.scm:456:0:call-with-output-file
  0.00     29.80      0.00  zlib.scm:217:0:call-with-gzip-input-port
  0.00     29.80      0.00  ice-9/ports.scm:438:0:call-with-input-file
  0.00      6.19      0.00  anon #x217ced0
---
Sample count: 1256
Total time: 29.800587574 seconds (8.715080702 seconds in GC)
cpu: 124%, mem: 60772 KiB, wall: 0:24.12, sys: 0.56, usr: 29.54
--8<---------------cut here---------------end--------------->8---

This confirms that guile-zstd is not noticeably faster than guile-zlib in total CPU time (about 28.6 s vs 29.8 s), which is unexpected. Compare to the command line tools:

  $ time+ zstd -cdk /tmp/chromium-98.0.4758.102.tar.zst > /dev/null
  cpu: 99%, mem: 10548 KiB, wall: 0:09.37, sys: 0.30, usr: 9.05

  $ time+ gunzip -ck /tmp/chromium-98.0.4758.102.tar.gz > /dev/null
  cpu: 99%, mem: 2908 KiB, wall: 0:22.29, sys: 0.31, usr: 21.98

where zstd is about 2.4x faster in wall-clock time.

It's unfortunate that the bulk of the time is spent in "anon" (an anonymous procedure?), which doesn't say much.

Perhaps I should open an issue with the guile-zstd project.

Thanks,

Maxim
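
PS: in case anyone wants to recompute the ratios from the wall-clock figures reported above, here is a rough shell sketch. The helper names `to_seconds` and `speedup` are made up for illustration and belong to no tool; the sketch assumes the `M:SS.ss` form that `%E` printed in these runs and does not handle the `H:MM:SS` form used for runs longer than an hour.

```shell
#!/bin/sh
# Hypothetical helper: convert a "%E" wall time such as "0:18.92"
# (minutes:seconds) into seconds with two decimals.
to_seconds() {
    echo "$1" | awk -F: '{ printf "%.2f", $1 * 60 + $2 }'
}

# Hypothetical helper: ratio of the first (slower) wall time to the
# second (faster) one, rounded to two decimals.
speedup() {
    awk -v slow="$(to_seconds "$1")" -v fast="$(to_seconds "$2")" \
        'BEGIN { printf "%.2f", slow / fast }'
}

# Wall-clock times copied from the runs reported above.
echo "CLI:   zstd vs gunzip:         $(speedup 0:22.29 0:09.37)x"
echo "Guile: guile-zstd vs guile-zlib: $(speedup 0:24.12 0:18.92)x"
```

With the numbers above this prints 2.38x for the command line tools and 1.27x for the Guile bindings, both measured on wall-clock time only.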