hi bruno,
i have been using your first version of the parallel collider for quite a while
during model development and also calibration. i saw no differences
between yade-1.07 and your version.
i did some benchmarks with 4 to 16 sandy bridge cores at our bull
cluster. getting more than 16 cores for
Thanks Matthias,
Actually I don't understand your benchmark results. You are the first
one to find no speedup on the colliding part.
It seems the results below were not using the parallel collider, since
the time it takes is exactly the same for all number of threads.
What version is that
On 10/04/14 02:01, Klaus Thoeni wrote:
just to clarify, Test 2 is done by increasing the number of iterations (1x, 3x
and 12x the number of iterations specified in checkPerf.py). This means the
number of interactions should increase as well and, hence, particle velocities
should
Thanks!
If I understand correctly, particle velocities are decreasing with
iterations. So, more iterations means less weight for the collider
overall (hence less effect of parallelizing it).
From your results with 1 million, I see for the collider T(j8)=T(j1)/5.8.
Could you tell if the collider
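To make the reasoning above concrete, here is a minimal sketch (not taken from Yade) of how the collider speedup T(j1)/T(j8) = 5.8 translates into parallel efficiency, and how the overall gain shrinks when the collider takes a smaller share of the run time. The collider fractions in the loop are hypothetical values chosen for illustration only, and the overall estimate assumes an Amdahl-style model in which only the collider gets faster.

```python
def efficiency(speedup, n_threads):
    """Parallel efficiency: speedup divided by the number of threads."""
    return speedup / n_threads


def overall_speedup(collider_fraction, collider_speedup):
    """Amdahl-style estimate, assuming only the collider part is accelerated.

    collider_fraction is the share of single-thread run time spent in the
    collider (between 0 and 1).
    """
    rest = 1.0 - collider_fraction
    return 1.0 / (rest + collider_fraction / collider_speedup)


collider_speedup = 5.8  # T(j1)/T(j8) quoted in the message above
print("collider efficiency on 8 threads:", efficiency(collider_speedup, 8))

# Hypothetical collider shares, NOT measured values.
for fraction in (0.5, 0.2, 0.05):
    gain = overall_speedup(fraction, collider_speedup)
    print("collider share", fraction, "-> overall speedup", round(gain, 3))
```

The smaller the collider share, the closer the overall speedup gets to 1, which is the "less weight for the collider" effect described above.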
Hi guys,
just to let you know. I updated the results on the wiki [1]. Still performance
test but with more iterations and up to 1 million particles.
Cheers,
Klaus
[1] https://yade-dem.org/wiki/Performance_Test#Test_2
___
Mailing list:
Hi Bruno,
just to clarify, Test 2 is done by increasing the number of iterations (1x, 3x
and 12x the number of iterations specified in checkPerf.py). This means the
number of interactions should increase as well and, hence, particle velocities
should decrease because of more interactions.
I
2014-03-31 10:29 GMT+02:00 Bruno Chareyre bruno.chare...@hmg.inpg.fr:
I think we can include this code into the master branch in git.
Let's check the code more precisely and merge it.
For me the code is in its final version and ready to merge if nobody
finds bugs (at least you could run your
Hi Bruno,
I have tested this version of the collider and got a speedup of
about 5-10% with 2 to 6 cores. But these were quasi-static
simulations, so the contact list is not updated very often.
I think we can include this code into the master branch in git.
Let's check the code more
I have tested this version of the collider and got a speedup of
about 5-10% with 2 to 6 cores. But these were quasi-static
simulations, so the contact list is not updated very often.
Thanks Anton for feedback. Testing in quasistatic cases is indeed not
very interesting.
Or, in that case,
Hi guys,
I ran some dynamic tests with my mesh too (some time ago, but I forgot to
check). The implementation is fine and the speedup is only about 6-8%. However, the
simulation has just about 30 particles.
I even have more results for the performance check (with 1 million particles)
which I will
(forwarding to yade-dev)
On 28/02/14 10:13, Klaus Thoeni wrote:
Hi guys,
have a look at this:
https://yade-dem.org/wiki/Performance_Test
Feel free to add your own tests. If you want I can provide the scripts for
the graphs.
Cheers
Klaus
https://yade-dem.org/wiki/Performance_Test
Wow! Speed x6 for 500k particles?!
It was definitely worth trying with larger numbers, it changes the
picture completely when the last points are included.
Very nice page.
Could you also give some absolute timings for completeness? A convenient
value
https://yade-dem.org/wiki/Performance_Test
Wow! Speed x6 for 500k particles?!
It was definitely worth trying with larger numbers, it changes the
picture completely when the last points are included.
Very nice page.
Could you also give some absolute timings for completeness? A convenient
There is apparently a problem with your computer/compilation option/other?
If you run an ordinary simulation with -j4 and many particles do you see
4 cores used?
yes, for normal scripts it is running 4 threads at 4 cores, but
--performance assigns all threads to one core it seems...
Is
: Wednesday, 26 February 2014 08:54
To: yade-dev@lists.launchpad.net
Subject: Re: [Yade-dev] parallel collider - testing needed
There is apparently a problem with your computer/compilation option/other?
If you run an ordinary simulation with -j4 and many particles do you see
4 cores used?
yes
yes, it is faster at -j1:
So this is an independent problem.
For me -j4 is always faster and effectively uses 4 cores, be it with the
old or the new collider.
I have no idea what can be wrong with your processor.
B
It is a good benchmark overall, the problem is that it is hardly
reproducible. Each run can give a really different total time (more than
a factor of 2 between two measured times). Didn't you see that too?
when i run the script with num_balls1D = 10 i get:
Mmmmh... I should try again then (I
after running make install in my build folder I start yade using python
yadeparallel -j4 --performance
Why python in the first place?! I would not be surprised if the number
of cores allocated to python was 1, which may cause yade -j4 to run in
a single thread context.
B
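One way to check the suspicion above is to ask the operating system how many cores the Python process is actually allowed to use. The snippet below is only a sketch using the standard library (nothing Yade-specific): `os.sched_getaffinity` is available on Linux only, so it falls back to `os.cpu_count()` elsewhere, and reading `OMP_NUM_THREADS` assumes the build relies on OpenMP for threading.

```python
import os


def usable_cores():
    """Number of CPU cores the current process is allowed to run on."""
    if hasattr(os, "sched_getaffinity"):  # Linux only
        return len(os.sched_getaffinity(0))
    # Fallback: total core count, which ignores any affinity restriction.
    return os.cpu_count() or 1


print("cores usable by this process:", usable_cores())
# Assumption: threading is driven by OpenMP, so this variable may matter.
print("OMP_NUM_THREADS =", os.environ.get("OMP_NUM_THREADS", "(not set)"))
```

If this reports 1 while the machine has more cores, the process is pinned to a single core and any -j4 setting would run all threads on that core.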
hi guys,
i have also some benchmark results:
for 1 thread
200801
number of bodies 200813
Elapsed 41.6678731441 sec
Performance 4.79986101782 iter/sec
Extrapolation on 1e5 iters 5.78720460335 hours
=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*
Name Count
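The figures in the output above can be cross-checked with a few lines of arithmetic. This sketch assumes that "Performance" is simply iterations divided by elapsed time and that the extrapolation is 1e5 iterations at that rate; both assumptions are consistent with the printed numbers but are not taken from the Yade source.

```python
elapsed_s = 41.6678731441         # "Elapsed" from the output above
perf_iter_per_s = 4.79986101782   # "Performance" from the output above

# Number of iterations implied by the two values (comes out at about 200).
iterations = elapsed_s * perf_iter_per_s

# Time needed for 1e5 iterations at the same rate, converted to hours.
hours_for_1e5 = 1e5 / perf_iter_per_s / 3600.0

print("implied iterations:", round(iterations))
print("extrapolation on 1e5 iters: %.5f hours" % hours_for_1e5)
```

The second value agrees with the 5.78720460335 hours reported in the output.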
Thanks! Comments below.
On 26/02/14 13:33, Matthias Frank wrote:
i have also some benchmark results:
for 1 thread
---
InsertionSortCollider 7
21314382us
Hi Bruno,
2/ Hyperthreading is completely useless for heavy computing tasks,
actually even bad, as your results suggest.
I did some tests by enabling and disabling hyperthreading some time ago.
Conclusions: always disable hyperthreading, as you say it makes no sense for
the kind of things we
Quoting Klaus Thoeni klaus.tho...@gmail.com:
Hi Bruno,
2/ Hyperthreading is completely useless for heavy computing tasks,
actually even bad, as your results suggest.
I did some tests by enabling and disabling hyperthreading some time ago.
Conclusions: always disable hyperthreading, as you
Quoting Bruno Chareyre bruno.chare...@hmg.inpg.fr:
I forgot to mention two things:
1- I tried the benchmark used by Christian for comparisons with PFC [1],
however it seems that this test is very special. I get large differences
between two runs. Basically, it seems the simulation only
I rotated the wall below a little bit to make it slightly inclined. This
is the reason why columns can collapse (not because of truncation error):
I see.
As you mentioned in a previous post we should define two benchmarking
scripts. One for quasi-static simulations and one for dynamic ones.
It is a good benchmark overall, the problem is that it is hardly
reproducible. Each run can give a really different total time (more than
a factor of 2 between two measured times). Didn't you see that too?
when i run the script with num_balls1D = 10 i get:
Welcome to Yade 2014-02-18.git-af75797
TCP
On 25/02/14 10:17, Christian Jakob wrote:
It is a good benchmark overall, the problem is that it is hardly
reproducible. Each run can give a really different total time (more than
a factor of 2 between two measured times). Didn't you see that too?
when i run the script with num_balls1D = 10 i
-dev] parallel collider - testing needed
I forgot to mention two things:
1- I tried the benchmark used by Christian for comparisons with PFC [1],
however it seems that this test is very special. I get large differences
between two runs. Basically, it seems the simulation only depends on truncation
sorry for late reply. Feel free to share the pdf. Originally it was supposed
to be transferred to the wiki, anyway.
I'm thinking about a good way to measure performance for highly dynamic
simulations, now. Maybe the script that martin-niehoff posted[1] would be
useful. It is basically a
Hi Bruno,
I did some tests with your new collider:
My old machine (2 cpu sockets with 4 cores each, Intel(R) Xeon(R)
CPU X5460 @ 3.16GHz) says:
yade-trunk -j4 --performance
Welcome to Yade 2014-02-18.git-af75797
.
number of bodies 200813
Elapsed 74.6882498264 sec
Performance
There is apparently a problem with your computer/compilation option/other?
If you run an ordinary simulation with -j4 and many particles do you see
4 cores used?
Bruno
On 25/02/14 16:26, Christian Jakob wrote:
Hi Bruno,
I did some tests with your new collider:
My old machine (2 cpu
Is there any difference at all on this machine, between -j1 and -j4?
B
On 25/02/14 18:56, Bruno Chareyre wrote:
There is apparently a problem with your computer/compilation option/other?
If you run an ordinary simulation with -j4 and many particles do you see
4 cores used?
Bruno
On
Hi there,
I implemented a parallel version of the InsertionSortCollider. It is
almost ready but not yet pushed to the main trunk, as I have a few
things to check before that.
It would be helpful if some of you could 1/ test that your scripts work
correctly and 2/ benchmark this for N100k and j4.
I forgot to mention two things:
1- I tried the benchmark used by Christian for comparisons with PFC [1],
however it seems that this test is very special. I get large differences
between two runs. Basically, it seems the simulation only depends on
truncation errors: vertical columns of spheres