On Fri, Sep 19, 2008 at 11:36:28AM +0200, Niclas Jansson wrote:
> > I also wonder about the following in PXMLMesh::readVertices:
> >
> >   const uint L = floor( (real) num_vertices / (real) num_processes);
> >   const uint R = num_vertices % num_processes;
> >   const uint num_local = (num_vertices + num_processes -
> >                           process_number - 1) / num_processes;
> >
> >   start_index = process_number * L + std::min(process_number, R);
> >   end_index = start_index + ( num_local - 1);
> >
> > I think I can guess what it does, but does it have to be this
> > complicated? Isn't it enough to do something like
> >
> >   const uint n = num_vertices / num_processors;
> >   start_index = n*process_number;
> >   end_index = start_index + n;
> >
> > and then a fix for the last processor:
> >
> >   if (process_number == num_processors - 1)
> >     end_index = num_vertices;
> >
> > ?
>
> But shouldn't that give a bad load balance, for example when N is large,
> R << num_processes and (end_index - start_index) >> R?
>
> Niclas
I don't understand, but maybe I'm missing something. Say N = 1,000,000 and num_processes = 16. Then R = 0. With my scheme above, there will be 62500 vertices on each processor. If we change N to 1,000,001, there will be 62500 on each processor except the last, which will have 62501. If we increase N further, we will have 62502, 62503, etc. until 62515 on the last processor, and after that 62501 on each processor, etc. But maybe I'm missing something important?

--
Anders
_______________________________________________
DOLFIN-dev mailing list
[email protected]
http://www.fenics.org/mailman/listinfo/dolfin-dev
