>> Ha - I'm not the only one doing libMesh stuff on a Friday night, huh?
>> 
>> Any thoughts on our default and overridden threading grainsizes?  I'm
>> playing with them on MIC, and even for a 2D problem with Q2/Q1
>> Navier-Stokes, I can go from 1000 all the way down to 100 with no
>> performance penalty.  Even going down to 10 elements in the smallest
>> range only adds about 10% overhead worst case.
>> 
>> I'm wondering if I ought to just check n_local_elem() each time we
>> build a range and pick a grain size of n_local_elem()/n_threads()/N
>> for some small integer N.
> 
> And therein lies the rub - what to pick??
> 
> We had some issues with the original implementation because our predicated
> iterators are not random access, so splitting a range was nontrivial.  I
> can't remember if anything in the underlying storage would change to make
> this better, but 10 is surprising and certainly would be no good for linear
> triangles on a scalar problem.

And I think it also depends on the total number of elements relative to how
finely you are splitting.  Splitting 10,000 elements into chunks of 10 is not
as bad as doing the same to 1,000,000, as I seem to recall.  But anyway, I
think the environment variable approach would be pretty cool.
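
For concreteness, here is a rough sketch of the heuristic being discussed,
n_local_elem()/n_threads()/N, with an environment variable taking precedence
when it is set.  This is only an illustration and not existing libMesh code:
the function names and the variable name LIBMESH_GRAINSIZE are made up here,
and the element and thread counts are passed in as plain integers rather than
queried from the mesh.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdlib>

// Hypothetical helper, not part of libMesh.
// Grain size = n_local_elem / n_threads / N, never smaller than 1.
// Zero values for n_threads or N are treated as 1 to avoid dividing by zero.
inline std::size_t heuristic_grainsize(std::size_t n_local_elem,
                                       std::size_t n_threads,
                                       std::size_t chunks_per_thread) // the "N"
{
  const std::size_t threads = std::max<std::size_t>(1, n_threads);
  const std::size_t chunks  = std::max<std::size_t>(1, chunks_per_thread);
  return std::max<std::size_t>(1, n_local_elem / (threads * chunks));
}

// Hypothetical override: if the named environment variable holds a positive
// integer, use it as the grain size; otherwise fall back to the heuristic.
// The default variable name is an assumption made for this sketch.
inline std::size_t pick_grainsize(std::size_t n_local_elem,
                                  std::size_t n_threads,
                                  std::size_t chunks_per_thread,
                                  const char * env_name = "LIBMESH_GRAINSIZE")
{
  if (const char * s = std::getenv(env_name))
    {
      char * end = nullptr;
      const long v = std::strtol(s, &end, 10);
      // Accept only a fully parsed, strictly positive value.
      if (end != s && *end == '\0' && v > 0)
        return static_cast<std::size_t>(v);
    }
  return heuristic_grainsize(n_local_elem, n_threads, chunks_per_thread);
}
```

With 10,000 local elements, 4 threads and N = 5 this gives a grain size of
500, and it bottoms out at 1 when there are fewer elements than
threads times N.  What N should be is exactly the open question above.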

-Ben


_______________________________________________
Libmesh-devel mailing list
Libmesh-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/libmesh-devel
