On Wed, Nov 26, 2014 at 10:52 AM, Garth N. Wells <[email protected]> wrote:

>
> On Wed, 26 Nov, 2014 at 9:38 AM, Johan Hake <[email protected]> wrote:
>
>> On Wed, Nov 26, 2014 at 10:28 AM, Garth N. Wells <[email protected]> wrote:
>>
>>>
>>> On Wed, 26 Nov, 2014 at 9:09 AM, Johan Hake <[email protected]> wrote:
>>>
>>>> On Wed, Nov 26, 2014 at 9:43 AM, Garth N. Wells <[email protected]>
>>>> wrote:
>>>>
>>>>> On Wed, 26 Nov, 2014 at 8:32 AM, Johan Hake <[email protected]>
>>>>> wrote:
>>>>>
>>>>>> On Wed, Nov 26, 2014 at 9:22 AM, Garth N. Wells <[email protected]>
>>>>>> wrote:
>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Wed, 26 Nov, 2014 at 7:50 AM, Johan Hake <[email protected]>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> On Wed, Nov 26, 2014 at 8:34 AM, Garth N. Wells <[email protected]>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Tue, 25 Nov, 2014 at 9:48 PM, Johan Hake <[email protected]>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> Hello!
>>>>>>>>>>
>>>>>>>>>> I just pushed some fixes to the JIT interface of DOLFIN. Now one
>>>>>>>>>> can JIT on different MPI groups.
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Nice.
>>>>>>>>>
>>>>>>>>>  Previously JITing was only done on rank 1 of the mpi_comm_world.
>>>>>>>>>> Now it is done on rank 1 of any passed group communicator.
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Do you mean rank 0?
>>>>>>>>>
>>>>>>>>
>>>>>>>> Yes, of course.
>>>>>>>>
>>>>>>>>
>>>>>>>>  There is no demo showing this at the moment, but a test has been added:
>>>>>>>>>>
>>>>>>>>>>   test/unit/python/jit/test_jit_with_mpi_groups.py
>>>>>>>>>>
>>>>>>>>>> Here an expression, a subdomain, and a form are constructed on
>>>>>>>>>> different ranks using groups. It is somewhat tedious, as one needs to
>>>>>>>>>> initialize PETSc with the same group, otherwise PETSc will deadlock
>>>>>>>>>> during initialization (the moment a PETSc la object is constructed).
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> This is ok. It's arguably a design flaw that we don't make the
>>>>>>>>> user handle MPI initialisation manually.
>>>>>>>>>
>>>>>>>>
>>>>>>>> Sure, it is just somewhat tedious. You cannot start your typical
>>>>>>>> script by importing dolfin.
>>>>>>>>
>>>>>>>>  The procedure in Python for this is (a rough sketch follows below):
>>>>>>>>>>
>>>>>>>>>> 1) Construct MPI groups using mpi4py
>>>>>>>>>> 2) Initialize petsc4py using the groups
>>>>>>>>>> 3) Wrap the groups as petsc4py comms (dolfin only supports petsc4py,
>>>>>>>>>>    not mpi4py)
>>>>>>>>>> 4) import dolfin
>>>>>>>>>> 5) Do group-specific stuff:
>>>>>>>>>>    a) Functions and forms: no change needed, as the communicator
>>>>>>>>>>       is passed via the mesh
>>>>>>>>>>    b) domain = CompiledSubDomain("...", mpi_comm=group_comm)
>>>>>>>>>>    c) e = Expression("...", mpi_comm=group_comm)
>>>>>>>>>>
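>>>>>>>>>> A rough sketch of these steps (the 50/50 split of mpi_comm_world and
>>>>>>>>>> the variable names are just illustrative, and I assume PETSc.Comm
>>>>>>>>>> accepts an mpi4py communicator for the wrapping in step 3):
>>>>>>>>>>
>>>>>>>>>>   # 1) construct the group communicators with mpi4py
>>>>>>>>>>   from mpi4py import MPI
>>>>>>>>>>   world = MPI.COMM_WORLD
>>>>>>>>>>   color = 0 if world.rank < world.size // 2 else 1
>>>>>>>>>>   group = world.Split(color, world.rank)
>>>>>>>>>>
>>>>>>>>>>   # 2) initialize petsc4py with the group, before dolfin is imported
>>>>>>>>>>   import petsc4py
>>>>>>>>>>   petsc4py.init(comm=group)
>>>>>>>>>>   from petsc4py import PETSc
>>>>>>>>>>
>>>>>>>>>>   # 3) wrap the mpi4py group as a petsc4py communicator
>>>>>>>>>>   group_comm = PETSc.Comm(group)
>>>>>>>>>>
>>>>>>>>>>   # 4) only now import dolfin
>>>>>>>>>>   import dolfin
>>>>>>>>>>
>>>>>>>>>>   # 5b/c) pass the group communicator explicitly
>>>>>>>>>>   domain = dolfin.CompiledSubDomain("on_boundary", mpi_comm=group_comm)
>>>>>>>>>>   e = dolfin.Expression("x[0]", mpi_comm=group_comm)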
>>>>>>>>>
>>>>>>>>> It's not so clear whether passing the communicator means that the
>>>>>>>>> Expression is only defined/available on group_comm, or if group_comm
>>>>>>>>> is simply to control who does the JIT. Could you clarify this?
>>>>>>>>>
>>>>>>>>
>>>>>>>> My MPI knowledge is not that good. I have only tried to access
>>>>>>>> (and construct) the Expression on ranks included in that group. Also,
>>>>>>>> when I tried to construct one using a group communicator on a rank
>>>>>>>> that is not included in the group, I got an error when calling
>>>>>>>> MPI_size on it. There is probably a perfectly reasonable explanation
>>>>>>>> for this.
>>>>>>>>
>>>>>>>
>>>>>>> Could you clarify what goes on behind-the-scenes with the
>>>>>>> communicator? Is it only used in a call to get the process rank? What do
>>>>>>> the ranks other than zero do?
>>>>>>>
>>>>>>
>>>>>> Not sure what you want to know. Instead of using mpi_comm_world to
>>>>>> construct meshes you use the group communicator. This communicator has
>>>>>> its own local group of ranks. JITing is still done on rank 0 of the
>>>>>> local group, which might be, and most often is, different from rank 0
>>>>>> of mpi_comm_world.
>>>>>>
>>>>>
>>>>> I just want to be clear (and have in the docstring) that
>>>>>
>>>>>    e = Expression("...", mpi_comm=group_comm)
>>>>>
>>>>> is valid only on group_comm (if this is the case), or make clear that
>>>>> the communicator only determines the process that does the JIT.
>>>>>
>>>>
>>>> I see now what you mean. I can update the docstring. As far as I
>>>> understand it, the expression should only be valid on group_comm,
>>>> and rank 0 of that group takes care of the JIT.
>>>>
>>>
>>>
>>> OK, could you make this clear in the docstring?
>>>
>>
>> Sure.
>>
>>  If we required all Expressions to have a domain/mesh, as Martin
>>>>> advocates, things would be clearer.
>>>>>
>>>>
>>>> Sure, but the same question applies to the mesh too. Is it available
>>>> on ranks that are not in the group?
>>>>
>>>
>>>
>>> I think in this case it is clear - a mesh lives only on the processes
>>> belonging to its communicator. The ambiguity with an Expression is that it
>>> doesn't have any data that lives on processes.
>>>
>>
>> Sure.
>>
>>  The group communicator works exactly like the world communicator but now
>>>>>> on just a subset of the processes. There were some sharp edges, with
>>>>>> deadlocks as a consequence, when barriers were taken on the world
>>>>>> communicator. This happens by default when dolfin is imported and PETSc
>>>>>> gets initialized with the world communicator. So we need to initialize
>>>>>> PETSc using the group communicator. Other than that there are no real
>>>>>> differences.
>>>>>>
>>>>>
>>>>> That doesn't sound right. PETSc initialisation does not take a
>>>>> communicator. It is collective on MPI_COMM_WORLD, but each PETSc object
>>>>> takes a communicator at construction, which can be something other than
>>>>> MPI_COMM_WORLD or MPI_COMM_SELF.
>>>>>
>>>>
>>>> Well, for all I know PETSc can be initialized with an mpi_comm. In
>>>> petsc4py that is done by:
>>>>
>>>>   import petsc4py
>>>>   petsc4py.init(comm=group_1)
>>>>   import petsc4py.PETSc as petsc
>>>>
>>>> It turned out that this was required for the Function constructor to
>>>> not deadlock. The line:
>>>>
>>>>     _vector = factory.create_vector();
>>>>
>>>> initializes PETSc with world_comm, which obviously deadlocks.
>>>>
>>>
>>> There must be something else wrong. PetscInitialize does not take a
>>> communicator:
>>>
>>>    http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Sys/PetscInitialize.html
>>>
>>
>>
>> From that web page:
>>
>>   Collective on MPI_COMM_WORLD or PETSC_COMM_WORLD if it has been set
>>
>> So setting PETSC_COMM_WORLD initializes PETSc on a subset of processes.
>>
>
> It is fundamentally wrong that PETSc is 'initialised' with a
> sub-communicator in DOLFIN to avoid a deadlock for the code use sketched in
> this thread. Communicators belong to objects. If PETSc were initialised on a
> sub-communicator, it would not be possible to solve a subsequent problem on
> more processes than are contained in the sub-communicator.
>
> If the code is deadlocking, then the underlying problem should be fixed.
> It's probably a function call somewhere that is erroneously not passing the
> communicator (since in cases where a communicator is not passed, we assume
> it to be MPI_COMM_WORLD - not ideal but keeps it simple for most users).


Found the bugger!

If the first PETScObject is created only on a group,
SubSystemsManager::init_petsc is called. This won't initialize PETSc if it
is already initialized outside, for example with MPI_COMM_WORLD. However,
SLEPc will be initialized using PETSC_COMM_WORLD. If that is set
to MPI_COMM_WORLD we have a deadlock. If, however, we create a PETScVector
on all processes before we create the group one, we are fine, as SLEPc then
gets initialized on all processes.

So the solution is to call:

SubSystemsManager.init_petsc()

on all processes before doing anything.
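
For reference, a rough sketch of how the updated test will be set up (the
50/50 split of mpi_comm_world, the names and the expression strings are just
illustrative; I also assume the Python layer exposes SubSystemsManager and
that PETSc.Comm accepts an mpi4py communicator; if not, creating a throwaway
PETScVector on every process has the same effect):

  from mpi4py import MPI
  from petsc4py import PETSc
  import dolfin

  # Initialize PETSc (and SLEPc) collectively on MPI_COMM_WORLD before any
  # group-local PETSc object is created (assuming the Python layer exposes
  # the call; a throwaway PETScVector on every process works too).
  dolfin.SubSystemsManager.init_petsc()

  # Split mpi_comm_world into two halves and wrap the local half as a
  # petsc4py communicator (dolfin only accepts petsc4py comms).
  world = MPI.COMM_WORLD
  color = 0 if world.rank < world.size // 2 else 1
  group_comm = PETSc.Comm(world.Split(color, world.rank))

  # Group-local JIT: rank 0 of each group does the compilation.
  e = dolfin.Expression("x[0]", mpi_comm=group_comm)
  domain = dolfin.CompiledSubDomain("on_boundary", mpi_comm=group_comm)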

Well, I learned a bunch of MPI-related stuff here :) I will update the
tests accordingly.

Johan



>
>
> Garth
>
>
>
>  Also see:
>>
>> http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Sys/PETSC_COMM_WORLD.html
>>
>> Johan
>>
>>
>>  Why does petsc4py want one? It doesn't make sense to initialise it with
>>> a communicator - a communicator belongs to objects.
>>>
>>> Garth
>>>
>>>
>>>  You might say that this could be avoided by initializing PETSc on all
>>>> ranks with the world communicator before constructing a Function on a
>>>> group. However, it still deadlocks during construction. Here I have just
>>>> assumed it deadlocks at the same line, but I need to double-check this. And
>>>> when I initialized PETSc using the group communicator it just worked. So
>>>> somewhere a collective call on mpi_comm_world is executed when constructing
>>>> a PETScVector.
>>>>
>>>> Johan
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>>
>>>>> Garth
>>>>>
>>>>>
>>>>>
>>>>>> Johan
>>>>>>
>>>>>>
>>>>>>
>>>>>>> Garth
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>  Please try it out and report any sharp edges. A demo would also be
>>>>>>>>>> fun to include :)
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> We could run tests on different communicators to speed them up on
>>>>>>>>> machines with high core counts!
>>>>>>>>>
>>>>>>>>
>>>>>>>> True!
>>>>>>>>
>>>>>>>> Johan
>>>>>>>>
>>>>>>>>
>>>>>>>>  Garth
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>  Johan
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>
_______________________________________________
fenics mailing list
[email protected]
http://fenicsproject.org/mailman/listinfo/fenics
