Niclas Jansson wrote:
> Anders Logg wrote:
>> On Fri, Sep 19, 2008 at 11:36:28AM +0200, Niclas Jansson wrote:
>>
>>>> I also wonder about the following in PXMLMesh::readVertices:
>>>>
>>>> const uint L = floor( (real) num_vertices / (real) num_processes);
>>>> const uint R = num_vertices % num_processes;
>>>> const uint num_local = (num_vertices + num_processes -
>>>> process_number - 1) / num_processes;
>>>>
>>>> start_index = process_number * L + std::min(process_number, R);
>>>> end_index = start_index + ( num_local - 1);
>>>>
>>>> I think I can guess what it does, but does it have to be this
>>>> complicated? Isn't it enough to do something like
>>>>
>>>> const uint n = num_vertices / num_processors;
>>>> start_index = n*process_number;
>>>> end_index = start_index + n;
>>>>
>>>> and then a fix for the last processor:
>>>>
>>>> if (process_number == num_processors - 1)
>>>> end_index = num_vertices;
>>>>
>>>> ?
>>>>
>>> But shouldn't that give a bad load balance, for example when N is large,
>>> R << num_processes and (end_index - start_index) >> R.
>>>
>>> Niclas
>> I don't understand, but maybe I'm missing something.
>>
>> Say N = 1,000,000 and num_processes = 16. Then R = 0. With my scheme
>> above, there will be 62500 vertices on each processor.
>>
>> If we change N to 1,000,001, then there will be 62500 on each
>> processor except the last which will have 62501.
>>
>> If we increase N further, we will have 62502, 62503 etc until 62515 on
>> the last processor, and after that 62501 on each processor etc.
>>
>> But maybe I'm missing something important?
>>
>> --
>> Anders
>>
>
> Ok, it was a bad example. But the point is that the extra elements must
> be distributed across all processors to even out the workload.
>
> For example if N = num_processes**2 + num_processes - 1, the last
> processor would get twice the amount of elements.
>
> And even if the last processor only has a small amount of extra elements,
> for, let's say, 1024 processors, the efficiency would drop since 1023
> processors would be wasting cycles waiting on the last one to finish.
>
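For illustration, the two schemes discussed above can be compared with a
small standalone sketch. This is not the DOLFIN code: the names Range,
Scheme, naive_range, balanced_range, max_load and covers_all are
hypothetical, and the ranges are half-open [begin, end), whereas the quoted
readVertices snippet appears to use an inclusive end_index.

```cpp
#include <algorithm>
#include <cassert>

// Half-open index range [begin, end) owned by one process (hypothetical type).
struct Range
{
  unsigned int begin;
  unsigned int end;
  unsigned int size() const { return end - begin; }
};

// Signature shared by both partitioning schemes:
// (num_vertices, num_processes, process_number) -> Range
typedef Range (*Scheme)(unsigned int, unsigned int, unsigned int);

// Simple scheme: equal chunks of N / P vertices, and the whole remainder
// is assigned to the last process.
Range naive_range(unsigned int num_vertices, unsigned int num_processes,
                  unsigned int process_number)
{
  assert(num_processes > 0);
  const unsigned int n = num_vertices / num_processes;
  Range r;
  r.begin = n * process_number;
  r.end = (process_number == num_processes - 1) ? num_vertices : r.begin + n;
  return r;
}

// Balanced scheme: every process gets L = N / P vertices, and the first
// R = N % P processes get one extra vertex each.
Range balanced_range(unsigned int num_vertices, unsigned int num_processes,
                     unsigned int process_number)
{
  assert(num_processes > 0);
  const unsigned int L = num_vertices / num_processes;
  const unsigned int R = num_vertices % num_processes;
  Range r;
  // Processes before this one hold process_number * L vertices plus one
  // extra vertex for each of them that is among the first R processes.
  r.begin = process_number * L + std::min(process_number, R);
  r.end = r.begin + L + (process_number < R ? 1u : 0u);
  return r;
}

// Largest number of vertices held by any single process under a scheme.
unsigned int max_load(Scheme scheme, unsigned int N, unsigned int P)
{
  unsigned int m = 0;
  for (unsigned int p = 0; p < P; ++p)
    m = std::max(m, scheme(N, P, p).size());
  return m;
}

// Sanity check: the ranges of processes 0..P-1 tile [0, N) with no gaps
// and no overlap.
bool covers_all(Scheme scheme, unsigned int N, unsigned int P)
{
  unsigned int expected_begin = 0;
  for (unsigned int p = 0; p < P; ++p)
  {
    const Range r = scheme(N, P, p);
    if (r.begin != expected_begin)
      return false;
    expected_begin = r.end;
  }
  return expected_begin == N;
}
```

With 16 processes and N = 16*16 + 16 - 1 = 271, max_load(naive_range, 271, 16)
is 31 (the last process holds 31 vertices, the others 16), while
max_load(balanced_range, 271, 16) is 17. For N = 1,000,001 both schemes give
a maximum of 62501, which is why that example does not show the difference.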
To me the issue is not the correctness of the approach, it's that the code
is a bit cryptic. It's hard for me to follow.

Garth

> Niclas
> _______________________________________________
> DOLFIN-dev mailing list
> [email protected]
> http://www.fenics.org/mailman/listinfo/dolfin-dev
