Hi,

Thanks for looking into my questions. I posted a self-contained example
here https://github.com/coreasync/parallel-gzip with instructions on how to
create test data as well. The results I get on my quite decent hardware are
attached below (the partial 'time' outputs are mangled; I was not sure how
to separate them). I use two separate 'lazy-seq's. I heard somewhere that
they are not free even if no synchronization takes place, as in this case,
but that the cost could be optimized out in a single-threaded situation.
Apologies for jumping to conclusions ... Also, I do not believe that we are
dealing with a significant amount of IO, as those test files easily fit
into O/S buffers.

The two test runs below show that we can easily take advantage of multiple
cores. The Java versions scale well, and so does the Clojure code for
uncompressed files. In all 3 cases the JVM takes a stable 200% of CPU, i.e.
it occupies two cores. The Java and Clojure timings are also quite
consistent.

However, as soon as I add a GZIPInputStream, the Clojure version starts
pegging 400, 500, 600% of CPU, varying over time. I assumed initially that
the effort was spent on some thread synchronization tasks that the JIT was
not able to factor out because more code is involved. Interestingly enough,
YourKit shows only two busy threads, interlaced with empty spaces, almost
as if the JVM were busy doing some kind of housekeeping and hitting the CPU
really hard. Thread dumps did not reveal anything weird: no lock
contention, etc. I tried Java 7 and 8 as well as Clojure 1.7 and 1.8; none
of them makes any difference.
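
For readers who do not want to open the repository, the compressed case
boils down to roughly the following. This is only a minimal sketch and not
the code from the repository: the class and method names are made up for
illustration, and it assumes Java 8, one thread per input file, and a 64 KB
inflater buffer chosen arbitrarily.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.zip.GZIPInputStream;

// Hypothetical name; not the class used in the linked repository.
class ParallelGzipLineCount {

    // Count the lines of a single gzip-compressed file.
    static long countLines(Path file) {
        try (BufferedReader reader = new BufferedReader(new InputStreamReader(
                new GZIPInputStream(Files.newInputStream(file), 64 * 1024),
                StandardCharsets.ISO_8859_1))) {
            long lines = 0;
            while (reader.readLine() != null) {
                lines++;
            }
            return lines;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Count every file on its own thread; results keep the input order.
    static List<Long> countAll(List<Path> files) {
        ExecutorService pool =
                Executors.newFixedThreadPool(Math.max(1, files.size()));
        try {
            List<Future<Long>> futures = new ArrayList<>();
            for (Path file : files) {
                Callable<Long> task = () -> countLines(file);
                futures.add(pool.submit(task));
            }
            List<Long> counts = new ArrayList<>();
            for (Future<Long> future : futures) {
                counts.add(future.get());
            }
            return counts;
        } catch (InterruptedException | ExecutionException e) {
            throw new RuntimeException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        List<Path> files = new ArrayList<>();
        for (String arg : args) {
            files.add(Paths.get(arg));
        }
        long start = System.nanoTime();
        System.out.println(countAll(files));
        System.out.printf("Elapsed time: %.3f msecs%n",
                (System.nanoTime() - start) / 1e6);
    }
}
```

Running it as 'java ParallelGzipLineCount 4.gz 4.gz' should correspond to
the two-file compressed run in the results below, with the caveat that the
real test code may differ in buffering and in how the work is split.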

Understanding where that limitation comes from is quite critical, as I am
trying to use the hardware to the best possible extent.

Thanks in advance for hints and clues ...
AndyL


# create test data
$ curl -o 1 http://norvig.com/big.txt
$ cat 1 1 1 1 1 1 1 1 > 2
$ cat 2 2 2 2 2 2 2 2 > 3
$ cat 3 3 3 3 3 3 3 3 > 4
$ gzip -k 4
$ lein run 4
starting...

uncompressed
Java code:
"Elapsed time: 8258.013802 msecs"
(65769984)
"Elapsed time: 8268.641987 msecs"
Clojure code:
"Elapsed time: 9117.814135 msecs"
(65769984)
"Elapsed time: 9118.270526 msecs"

compressed
Java code:
"Elapsed time: 21522.20167 msecs"
(65769984)
"Elapsed time: 21522.663463 msecs"
Clojure code:
"Elapsed time: 21573.585966 msecs"
(65769984)
"Elapsed time: 21574.013417 msecs"
...finished
$ lein run 4 4
starting...

uncompressed
Java code:
""EEllaappsseedd  ttiimmee::  77226688..0857983348 msec1s "m
secs"
(65769984 65769984)
"Elapsed time: 7280.09169 msecs"
Clojure code:
""EEllaappsseedd  ttiimmee::  99117777..113308627362  mmsseeccss""

(65769984 65769984)
"Elapsed time: 9177.644745 msecs"

compressed
Java code:
"Elapsed time: 22324.81872 msecs"
"Elapsed time: 23122.111874 msecs"
(65769984 65769984)
"Elapsed time: 23122.511818 msecs"
Clojure code:
"Elapsed time: 75968.051536 msecs"
"Elapsed time: 76018.787437 msecs"
(65769984 65769984)
"Elapsed time: 76019.215303 msecs"
...finished

-- 
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en