On Wed, Nov 26, 2014 at 10:25 AM, Martin Sandve Alnæs <[email protected]>
wrote:

> My understanding of groups is that they allow running different code on
> different processes. Thus I don't see any use for creation of a mesh
> object, expression object, etc. outside of the group it is defined on. Is
> there any reason to do that?
>
Not that I am aware of.

> "Why wouldn't it?" - it shouldn't. Exactly  why it's a good check. If the
> user runs different code on different processes however, things like meshes
> and functions may be created in a different order, destroying the form
> uniqueness. If that happens I would like to know early.
>
Ok, so assuming all objects determining the signature have been created in
the same order within the same group, it should be fine. But in general that
might not be the case. On one rank a user might want to create a local mesh,
which skews the id (counter) used for domains. He might then construct a form
using a mesh created after the local mesh.
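
Something like this toy example is what I have in mind (just a sketch,
assuming DOLFIN's communicator-aware mesh constructors; the mesh sizes are
made up):

    from dolfin import *

    # Rank 0 creates an extra local mesh first, so the shared mesh gets a
    # different id (counter) there than on the other ranks.
    if MPI.rank(mpi_comm_world()) == 0:
        local_mesh = UnitSquareMesh(mpi_comm_self(), 8, 8)

    mesh = UnitSquareMesh(mpi_comm_world(), 16, 16)
    V = FunctionSpace(mesh, "CG", 1)
    u, v = TrialFunction(V), TestFunction(V)
    a = u*v*dx  # the jit signature of a may now differ between ranks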


Johan

>  Martin
> On 26 Nov 2014 10:11, "Johan Hake" <[email protected]> wrote:
>
> On Wed, Nov 26, 2014 at 10:05 AM, Martin Sandve Alnæs <[email protected]>
>> wrote:
>>
>>> Surely the group_comm object does not exist on processes outside the
>>> group, and the Expression object construction can only happen within the
>>> group?
>>>
>> There is nothing that prevents you from constructing an MPI group on all
>> processes. However, it seems one cannot do anything with it on ranks that
>> are not included in the group.
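>>
>> For example with mpi4py (just a sketch, the choice of ranks 0 and 1 is
>> arbitrary), the communicator is only usable on the member ranks:
>>
>>     from mpi4py import MPI
>>
>>     world = MPI.COMM_WORLD
>>     group = world.Get_group().Incl([0, 1])  # group of two ranks
>>     group_comm = world.Create(group)        # collective on world
>>
>>     if group_comm != MPI.COMM_NULL:
>>         # Only ranks 0 and 1 get a usable communicator; on the other
>>         # ranks group_comm is MPI.COMM_NULL and e.g. Get_size() fails.
>>         print(group_comm.Get_size())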
>>
>>> I don't see how anything else makes sense. But clear docstring is always
>>> good.
>>>
>> Sure.
>>
>>
>>> Btw, can we assert that the jit signatures match across the group?
>>>
>>
>> Why wouldn't it?
>>
>> Johan
>>
>>
>>
>>> I'm a bit nervous about bugs in nonuniform mpi programs, and that would
>>> be a good early indicator of something funny happening.
>>>
>>> Martin
>>> On 26 Nov 2014 09:43, "Garth N. Wells" <[email protected]> wrote:
>>>
>>>> On Wed, 26 Nov, 2014 at 8:32 AM, Johan Hake <[email protected]> wrote:
>>>>
>>>>> On Wed, Nov 26, 2014 at 9:22 AM, Garth N. Wells <[email protected]>
>>>>> wrote:
>>>>>
>>>>>>
>>>>>>
>>>>>> On Wed, 26 Nov, 2014 at 7:50 AM, Johan Hake <[email protected]>
>>>>>> wrote:
>>>>>>
>>>>>>> On Wed, Nov 26, 2014 at 8:34 AM, Garth N. Wells <[email protected]>
>>>>>>> wrote:
>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On Tue, 25 Nov, 2014 at 9:48 PM, Johan Hake <[email protected]>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Hello!
>>>>>>>>>
>>>>>>>>> I just pushed some fixes to the JIT interface of DOLFIN. Now one
>>>>>>>>> can JIT on different MPI groups.
>>>>>>>>>
>>>>>>>>
>>>>>>>> Nice.
>>>>>>>>
>>>>>>>>> Previously, JITing was only done on rank 1 of mpi_comm_world.
>>>>>>>>> Now it is done on rank 1 of any passed group communicator.
>>>>>>>>>
>>>>>>>>
>>>>>>>> Do you mean rank 0?
>>>>>>>>
>>>>>>>
>>>>>>> Yes, of course.
>>>>>>>
>>>>>>>
>>>>>>>  There is no demo atm showing this but a test has been added:
>>>>>>>>>
>>>>>>>>>   test/unit/python/jit/test_jit_with_mpi_groups.py
>>>>>>>>>
>>>>>>>>> Here an expression, a subdomain, and a form are constructed on
>>>>>>>>> different ranks using groups. It is somewhat tedious, as one needs to
>>>>>>>>> initialize PETSc with the same group, otherwise PETSc will deadlock
>>>>>>>>> during initialization (the moment a PETSc la object is constructed).
>>>>>>>>>
>>>>>>>>
>>>>>>>> This is ok. It's arguably a design flaw that we don't make the user
>>>>>>>> handle MPI initialisation manually.
>>>>>>>>
>>>>>>>
>>>>>>> Sure, it is just somewhat tedious. You cannot start your typical
>>>>>>> script by importing dolfin.
>>>>>>>
>>>>>>>>>  The procedure in Python for this is (a rough sketch follows the list):
>>>>>>>>>
>>>>>>>>> 1) Construct MPI groups using mpi4py
>>>>>>>>> 2) Initialize petsc4py using the groups
>>>>>>>>> 3) Wrap the groups as petsc4py communicators (dolfin only supports
>>>>>>>>>    petsc4py, not mpi4py)
>>>>>>>>> 4) import dolfin
>>>>>>>>> 5) Do group-specific stuff:
>>>>>>>>>    a) Functions and forms: no change needed, as the communicator
>>>>>>>>>       is passed via the mesh
>>>>>>>>>    b) domain = CompiledSubDomain("...", mpi_comm=group_comm)
>>>>>>>>>    c) e = Expression("...", mpi_comm=group_comm)
>>>>>>>>>
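>>>>>>>>> Roughly like this (just a sketch; I use Split instead of explicit
>>>>>>>>> groups for brevity, and assume petsc4py.init accepts a communicator
>>>>>>>>> and that PETSc.Comm can wrap an mpi4py communicator):
>>>>>>>>>
>>>>>>>>>     from mpi4py import MPI
>>>>>>>>>     import petsc4py
>>>>>>>>>
>>>>>>>>>     # 1) a group communicator (here: split world by rank parity)
>>>>>>>>>     world = MPI.COMM_WORLD
>>>>>>>>>     group_comm = world.Split(color=world.Get_rank() % 2)
>>>>>>>>>
>>>>>>>>>     # 2) initialize petsc4py on the group
>>>>>>>>>     petsc4py.init(comm=group_comm)
>>>>>>>>>
>>>>>>>>>     # 3) wrap the mpi4py communicator as a petsc4py communicator
>>>>>>>>>     from petsc4py import PETSc
>>>>>>>>>     petsc_group_comm = PETSc.Comm(group_comm)
>>>>>>>>>
>>>>>>>>>     # 4) only now import dolfin
>>>>>>>>>     from dolfin import *
>>>>>>>>>
>>>>>>>>>     # 5) group-specific objects take the group communicator
>>>>>>>>>     e = Expression("x[0]", mpi_comm=petsc_group_comm)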
>>>>>>>>
>>>>>>>> It's not so clear whether passing the communicator means that the
>>>>>>>> Expression is only defined/available on group_comm, or if group_comm is
>>>>>>>> simply to control who does the JIT. Could you clarify this?
>>>>>>>>
>>>>>>>
>>>>>>> My knowledge of MPI is not that good. I have only tried to access
>>>>>>> (and construct) the Expression on ranks included in that group. Also,
>>>>>>> when I tried to construct one using a group communicator on a rank that
>>>>>>> is not included in the group, I got an error when calling MPI_size on
>>>>>>> it. There is probably a perfectly reasonable explanation for this.
>>>>>>>
>>>>>>
>>>>>> Could you clarify what goes on behind-the-scenes with the
>>>>>> communicator? Is it only used in a call to get the process rank? What do
>>>>>> the ranks other than zero do?
>>>>>>
>>>>>
>>>>> Not sure what you want to know. Instead of using mpi_comm_world to
>>>>> construct meshes you use the group communicator. This communicator has its
>>>>> own local group of ranks. JITing is still done on rank 0 of the local
>>>>> group, which might be, and most often is, different from the rank 0
>>>>> process of mpi_comm_world.
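>>>>>
>>>>> For example (a sketch with mpi4py; the group membership is made up and
>>>>> at least two processes are assumed), the local rank 0 of a group is in
>>>>> general a different process than world rank 0:
>>>>>
>>>>>     from mpi4py import MPI
>>>>>
>>>>>     world = MPI.COMM_WORLD
>>>>>     # a group made of the last two world ranks
>>>>>     group = world.Get_group().Incl([world.Get_size() - 2,
>>>>>                                     world.Get_size() - 1])
>>>>>     group_comm = world.Create(group)
>>>>>
>>>>>     if group_comm != MPI.COMM_NULL and group_comm.Get_rank() == 0:
>>>>>         # JITing happens here, typically not on world rank 0
>>>>>         print("local rank 0 is world rank", world.Get_rank())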
>>>>>
>>>>
>>>> I just want to be clear (and have in the docstring) that
>>>>
>>>>    e = Expression("...", mpi_comm=group_comm)
>>>>
>>>> is valid only on group_comm (if this is the case), or make clear that
>>>> the communicator only determines the process that does the JIT.
>>>>
>>>> If we required all Expressions to have a domain/mesh, as Martin
>>>> advocates, things would be clearer.
>>>>
>>>>>  The group communicator works exactly like the world communicator, but
>>>>> now on just a subset of the processes. There were some sharp edges, with
>>>>> deadlocks as a consequence, when barriers were taken on the world
>>>>> communicator. This happens by default when dolfin is imported and PETSc
>>>>> gets initialized with the world communicator. So we need to initialize
>>>>> PETSc using the group communicator. Other than that there are no real
>>>>> differences.
>>>>>
>>>>
>>>> That doesn't sound right. PETSc initialisation does not take a
>>>> communicator. It is collective on MPI_COMM_WORLD, but each PETSc object
>>>> takes a communicator at construction, which can be something other than
>>>> MPI_COMM_WORLD or MPI_COMM_SELF.
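>>>>
>>>> For instance (a petsc4py sketch; the split and sizes are arbitrary, and
>>>> petsc4py is assumed to accept an mpi4py communicator here), a Vec can
>>>> live on a subcommunicator even though initialisation was collective on
>>>> MPI_COMM_WORLD:
>>>>
>>>>     from mpi4py import MPI
>>>>     from petsc4py import PETSc
>>>>
>>>>     # a subcommunicator obtained by splitting world on rank parity
>>>>     sub = MPI.COMM_WORLD.Split(color=MPI.COMM_WORLD.Get_rank() % 2)
>>>>
>>>>     # each PETSc object takes its own communicator at creation
>>>>     x = PETSc.Vec().create(comm=sub)
>>>>     x.setSizes(100)  # arbitrary global size
>>>>     x.setFromOptions()
>>>>     x.set(1.0)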
>>>>
>>>> Garth
>>>>
>>>>
>>>>> Johan
>>>>>
>>>>>
>>>>>
>>>>>> Garth
>>>>>>
>>>>>>
>>>>>>
>>>>>>  Please try it out and report any sharp edges. A demo would also be
>>>>>>>>> fun to include :)
>>>>>>>>>
>>>>>>>>
>>>>>>>> We could run tests on different communicators to speed them up on
>>>>>>>> machines with high core counts!
>>>>>>>>
>>>>>>>
>>>>>>> True!
>>>>>>>
>>>>>>> Johan
>>>>>>>
>>>>>>>
>>>>>>>  Garth
>>>>>>>>
>>>>>>>>
>>>>>>>>  Johan
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>
>>
_______________________________________________
fenics mailing list
[email protected]
http://fenicsproject.org/mailman/listinfo/fenics
