Do you have a corresponding example of the parallel code?  I'm not sure
which part(s) are being delegated to other threads.

Often the I/O cost of reading the file dominates everything else, so
parallelism doesn't buy you much.
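
For comparison, here is a rough sketch of what I would guess the parallel
version looks like, assuming one future per file. The names count-lines and
count-lines-parallel are made up for illustration and are not from your code.
It reads with a plain loop over .readLine instead of line-seq, so no LazySeq
is involved; timing it against the line-seq version might show whether the
synchronization is really the bottleneck.

```clojure
(require '[clojure.java.io :as io])
(import '(java.util.zip GZIPInputStream)
        '(java.io BufferedReader))

(defn count-lines
  "Counts the lines of a gzipped text file with a plain loop over .readLine.
  No lazy seq is created. (Hypothetical helper, for illustration only.)"
  [path]
  (with-open [^BufferedReader rdr (-> path
                                      io/input-stream
                                      GZIPInputStream.
                                      io/reader)]
    (loop [n 0]
      (if (.readLine rdr)        ; .readLine returns nil at end of stream
        (recur (inc n))
        n))))

(defn count-lines-parallel
  "Starts one future per file, then waits for all of the results.
  (Hypothetical helper; assumes each file is handled by exactly one thread.)"
  [paths]
  (let [futs (mapv #(future (count-lines %)) paths)] ; mapv is eager: all futures start here
    (mapv deref futs)))
```

Usage would be something like (count-lines-parallel ["a.gz" "b.gz"]), where
the file names are placeholders. Remember to call (shutdown-agents) before
exiting a script that uses futures, or the JVM will linger for a while.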

Alan

On Mon, Sep 14, 2015 at 9:10 PM, Andy L <core.as...@gmail.com> wrote:

> Hi,
>
> I would like to ask for some advice regarding a rather unusual
> interaction between lazy-seq and threads. I have code that opens some big
> compressed text files and processes them line by line. Reduced to a
> minimal example, it looks like this:
>
>   (with-open [i (-> "mybigfile.gz"
>                     clojure.java.io/input-stream
>                     java.util.zip.GZIPInputStream.
>                     clojure.java.io/reader)]
>     (count (line-seq i)))
>
> where, for the sake of illustration, the processing is replaced by a
> simple count.
>
> In the single-threaded case, everything works very well, with performance
> numbers close to Java (or even equal with "-XX:MaxInlineLevel=16").
> However, once I run it in multiple threads, using either native Java
> Threads or futures, instead of the expected speedup from parallel
> processing, things are even slower than if they were run sequentially.
> Interestingly enough, the JVM pegs at 500-600% CPU (I have 8 cores). I was
> not sure what the reason was, so in order to rule out some basic
> assumptions, I created a Java equivalent. It runs at 200% CPU and scales
> above 4 cores, which is exactly what I want and matches gzip behavior.
> (I can run almost 6 instances of "gunzip -c mybigfile.gz | wc -l", each
> taking 100% CPU.)
>
> The next logical step was to look into the Clojure sources. What I found
> is that lazy-seq is synchronized:
> https://github.com/clojure/clojure/blob/master/src/jvm/clojure/lang/LazySeq.java
> From what I understand, the JIT optimizes the single-threaded case and
> removes the "synchronized" guards; however, as soon as other threads come
> into play, I am forced to pay the price for synchronization, which causes
> the performance degradation*.
>
> Interestingly enough, the JIT does optimize a version without
> GZIPInputStream, and with multiple threads I get the same results as with
> Java. I have to run it with "-XX:MaxInlineLevel=16" though. With the default
> "-XX:MaxInlineLevel=9", the JIT does not kick in and the performance is not
> there. There is probably another JVM switch that would help hint the JIT
> better; however, I am not convinced that this is the right direction.
>
> I really like the semantics of line-seq, but I would prefer it without the
> "synchronized" part, as in my context there is no way that two threads
> touch the same seq.
>
> I would like to ask for some advice on what my options are here. The last
> resort is to write the handling code in Java, but I really want to avoid
> this.
>
> Best,
> Andy
>
> *My analysis might be wrong of course.
>
> --
> You received this message because you are subscribed to the Google
> Groups "Clojure" group.
> To post to this group, send email to clojure@googlegroups.com
> Note that posts from new members are moderated - please be patient with
> your first post.
> To unsubscribe from this group, send email to
> clojure+unsubscr...@googlegroups.com
> For more options, visit this group at
> http://groups.google.com/group/clojure?hl=en
> ---
> You received this message because you are subscribed to the Google Groups
> "Clojure" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to clojure+unsubscr...@googlegroups.com.
> For more options, visit https://groups.google.com/d/optout.
>

