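It is hard to say where the root of your problem lies without seeing more
of the code. I would look closely at laziness. In my experience, lazy
evaluation can really hurt parallelization: if a future returns an
unrealized lazy sequence, the actual work is done later, by whichever
thread consumes the result.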
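
To make the laziness point concrete, here is a minimal sketch. I have not
seen your code, so `move`, the `{:x 0}` particle map, `walk-lazy` and
`walk-eager` are hypothetical stand-ins of mine, not your functions. It
only shows the difference between a future that returns a lazy sequence
and one that forces it with `doall`.

```clojure
;; Hypothetical stand-ins: `move` and the particle map are assumptions
;; made for illustration, not taken from the original program.
(defn move
  "Advance a 1-D particle by one random step of -1 or +1."
  [particle]
  (update-in particle [:x] + (if (< (rand) 0.5) -1 1)))

(defn walk-lazy
  "Returns a lazy seq of positions; nothing is computed until it is consumed."
  [particle n-steps]
  (map :x (take n-steps (iterate move particle))))

(defn walk-eager
  "Forces every step before returning, so the work runs on the calling thread."
  [particle n-steps]
  (doall (walk-lazy particle n-steps)))

;; Lazy: each future finishes almost immediately. The walks are computed
;; later, on whichever thread consumes the results.
(def lazy-results
  (mapv (fn [_] (future (walk-lazy {:x 0} 1000))) (range 4)))

;; Eager: each walk is fully realized on the future's own thread.
(def eager-results
  (mapv (fn [_] (future (walk-eager {:x 0} 1000))) (range 4)))
```

If your futures look like the lazy variant, timing only the futures would
hide the real cost, and the steps would be computed serially when the
results are collected. Whether that is what happens in your program I
cannot tell from the description alone.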




On Friday, November 8, 2013 4:42:11 PM UTC-5, Jose M. Perez Sanchez wrote:
>
> Hello everyone:
>
> This is my first post here. I'm a researcher writing numerical
> simulation software in Clojure. Actually, I'm porting an app a coworker and
> I wrote in C/Python (called GEMA) to Clojure. The app has been in use for a
> while at our group, but became very difficult to maintain because it outgrew
> its initial design and is very monolithic. At the same time I wanted
> to learn Functional Programming, so I've been working on the port for a few
> weeks.
>
> The simulations are embarrassingly parallel Random Walk calculations used 
> to study gas diffusion and Helium-3 Magnetic Resonance diffusion 
> measurements in the lungs. At the core of the simulations we do there is a 
> 3D geometrical model of the pulmonary acinus. The new application is 
> designed in a modular fashion; I'm including part of the current README
> file with a description.
>
> I've approached my institution's Technology Transfer Office to request 
> authorization to release the software under an Open Source license, and if 
> everything goes well the code will be published soon. I'm very happy with my
> Clojure journey so far and with all the things I'm learning in the process.
>
> One of the things I've observed is poor scaling with the number of threads 
> for more than 4 threads in an 8-core Intel i7 CPU, as follows:
>
> NT    Time   cpu%x8
>  1   101.9    108
>  2    54.9    220
>  4    36.0    430
>  6    33.9    570
>  8    32.5    700
> 10    32.5    720
>
> Computing times reported are just the time spent in the computation of the 
> NT futures (not total program execution time). CPU x8 percent is measured 
> with "top" in Linux and the % values are approximate, just to give an idea. 
> I'm running on Debian Wheezy with the following Java platform:
>
> JRE: OpenJDK Runtime Environment 1.6.0_27-b27 on Linux 3.2.0-4-amd64 
> (amd64)
> JVM: OpenJDK 64-Bit Server VM (build 20.0-b12 mixed mode)
>
> I'll try on a 16-core machine (4-way Opteron) soon and see what happens
> there. The computation runs over an infinite lazy sequence of random walk
> steps generated with "(iterate move particle)": an "extraction function"
> takes values from step zero up to the highest random walk step number and
> adds (conj) the values to be kept to a vector. The resulting vector for
> each particle is then added (conj) to a global vector for later storage.
>
> I've read the previous post about concurrent performance in AMD 
> processors: 
> https://groups.google.com/forum/#!topic/clojure/48W2eff3caU%5B1-25-false%5D. 
> I need to reread it with more time, though, to check whether any of the
> explanations presented there apply to my application.
>
> Best regards,
>
> Jose Manuel.
>
>

-- 
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en