Do you have a corresponding example of the parallel code? I'm not sure which part(s) are being delegated to other threads.
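For concreteness, here is a rough sketch of the kind of thing I imagine you might be running. It is a guess on my part, not your actual code, and the names count-lines-lazy, count-lines-loop and count-in-futures are made up for illustration. The first counter is your line-seq version wrapped in a function. The second reads with .readLine in a loop, so it never creates a LazySeq per line. Timing one against the other across several futures might help show whether LazySeq is really where the time goes.

```clojure
(require '[clojure.java.io :as io])
(import '(java.io BufferedReader)
        '(java.util.zip GZIPInputStream))

(defn count-lines-lazy
  "Counts lines of a gzip-compressed text file via line-seq
  (hypothetical wrapper around the snippet from the original post)."
  [path]
  (with-open [^BufferedReader rdr (-> path
                                      io/input-stream
                                      GZIPInputStream.
                                      io/reader)]
    (count (line-seq rdr))))

(defn count-lines-loop
  "Counts lines by calling .readLine directly in a loop,
  so no lazy sequence is built."
  [path]
  (with-open [^BufferedReader rdr (-> path
                                      io/input-stream
                                      GZIPInputStream.
                                      io/reader)]
    (loop [n 0]
      ;; .readLine returns nil at end of stream; an empty line is \"\",
      ;; which is truthy in Clojure, so blank lines are still counted.
      (if (.readLine rdr)
        (recur (inc n))
        n))))

(defn count-in-futures
  "Starts one future per path using the given counter fn,
  then blocks until every result is available."
  [counter paths]
  (let [tasks (mapv #(future (counter %)) paths)] ; mapv starts all futures eagerly
    (mapv deref tasks)))
```

With two files (the names here are placeholders), that would be called as (count-in-futures count-lines-loop ["a.gz" "b.gz"]), and the same call with count-lines-lazy gives the comparison.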
Often the I/O cost of reading the file is the dominant cost, so parallelism doesn't buy you much.

Alan

On Mon, Sep 14, 2015 at 9:10 PM, Andy L <core.as...@gmail.com> wrote:
> Hi,
>
> I would like to ask for some advice regarding a somewhat unusual
> interaction between lazy-seq and threads. I have code that opens some big
> compressed text files and processes them line by line. Reduced to a
> minimal example, the code looks like this:
>
> (with-open [i (-> "mybigfile.gz" clojure.java.io/input-stream
>                   java.util.zip.GZIPInputStream. clojure.java.io/reader)]
>   (count (line-seq i)))
>
> where, for the sake of illustration, the processing is replaced by simple
> counting.
>
> In a single thread everything works very well, with performance numbers
> close to Java (or even equal with "-XX:MaxInlineLevel=16"). However, once
> I run it in threads, either native Java Threads or futures, instead of the
> expected speedup from parallel processing, things are even slower than if
> they were run sequentially. Interestingly enough, the JVM pegs at 500-600%
> CPU (I have 8 cores). I was not sure what the reason was, and in order to
> rule out some basic assumptions, I created a Java equivalent. It runs at
> 200% CPU and scales above 4 cores, which is exactly what I want and
> matches gzip behavior. (I can run almost 6 "gunzip -c mybigfile.gz | wc -l"
> processes, each taking 100% CPU.)
>
> The next logical step was to look into the Clojure sources. What I found
> is that lazy-seq is synchronized:
> https://github.com/clojure/clojure/blob/master/src/jvm/clojure/lang/LazySeq.java
> From what I understand, the JIT optimizes the single-thread case and
> removes the "synchronized" guards. However, as soon as other threads come
> into play I am forced to pay the price for synchronization, which causes
> the performance degradation*.
>
> Interestingly enough, the JIT does optimize a version without
> GZIPInputStream, and with it I get the same results as with Java with
> multiple threads. I have to run it with "-XX:MaxInlineLevel=16" though.
> With the default "-XX:MaxInlineLevel=9", the JIT does not kick in and the
> performance is not there. There is probably another JVM switch which
> would help hint the JIT better, however I am not convinced that this is
> the right direction.
>
> I really like the semantics of line-seq, just without the "synchronized"
> part, as in my context there is no way that two threads touch the same
> seq.
>
> I would like to ask for some advice on what my options would be here. The
> last resort is to write the handling code in Java, but I really want to
> avoid this.
>
> Best,
> Andy
>
> *My analysis might be wrong of course.
>
> --
> You received this message because you are subscribed to the Google
> Groups "Clojure" group.
> To post to this group, send email to clojure@googlegroups.com
> Note that posts from new members are moderated - please be patient with
> your first post.
> To unsubscribe from this group, send email to
> clojure+unsubscr...@googlegroups.com
> For more options, visit this group at
> http://groups.google.com/group/clojure?hl=en
> ---
> You received this message because you are subscribed to the Google Groups
> "Clojure" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to clojure+unsubscr...@googlegroups.com.
> For more options, visit https://groups.google.com/d/optout.