I'm afraid the fix uncovered an issue in the ds21 component that will require 
Mellanox to address - I'm unsure of the timetable for that to happen.


> On Jan 3, 2020, at 6:28 AM, Ralph Castain via devel 
> <devel@lists.open-mpi.org> wrote:
> 
> I committed something upstream in PMIx master and v3.1 that probably resolves 
> this - another user reported it over there and provided a patch. I can 
> probably backport it to v2.x and give you a patch for OMPI v3.1.
> 
> 
>> On Jan 3, 2020, at 3:25 AM, Jeff Squyres (jsquyres) via devel 
>> <devel@lists.open-mpi.org> wrote:
>> 
>> Is there a configure test we can add to make this kind of behavior the 
>> default?
>> 
>> 
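A minimal sketch of the sort of probe such a configure test could compile and
run (this is an assumption, not an existing Open MPI/PMIx configure check): it
verifies that process-shared pthread mutexes, which the ds21 component relies
on, actually work on the build platform, so configure could exclude ds21 (or
default to ds12) when the probe fails. The real ds21 code initializes its mutex
inside a shared-memory segment, which this simplified probe does not exercise.

    /* probe.c: exit 0 if process-shared pthread mutexes are usable */
    #include <pthread.h>
    #include <stdlib.h>

    int main(void)
    {
        pthread_mutexattr_t attr;
        pthread_mutex_t mtx;

        if (pthread_mutexattr_init(&attr) != 0)
            return EXIT_FAILURE;
        /* ask for a mutex that can be shared across processes */
        if (pthread_mutexattr_setpshared(&attr, PTHREAD_PROCESS_SHARED) != 0)
            return EXIT_FAILURE;
        if (pthread_mutex_init(&mtx, &attr) != 0)
            return EXIT_FAILURE;

        pthread_mutex_destroy(&mtx);
        pthread_mutexattr_destroy(&attr);
        return EXIT_SUCCESS;
    }
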
>>> On Jan 1, 2020, at 11:50 PM, Marco Atzeri via devel 
>>> <devel@lists.open-mpi.org> wrote:
>>> 
>>> thanks Ralph
>>> 
>>> gds = ^ds21
>>> works as expected
>>> 
>>>> On 31.12.2019 at 19:27, Ralph Castain via devel wrote:
>>>> PMIx likely defaults to the ds12 component - which will work fine but be a 
>>>> tad slower than ds21. It is likely something to do with the way Cygwin 
>>>> handles memory locks. You can avoid the error message by simply adding 
>>>> "gds = ^ds21" to your default MCA param file (the PMIx one - it should be 
>>>> named pmix-mca-params.conf).
>>>> Artem - any advice here?
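For concreteness, a minimal sketch of that workaround (the exact location of
the file depends on the install prefix, so the path below is an assumption):

    # e.g. <prefix>/etc/pmix-mca-params.conf
    gds = ^ds21

The leading "^" excludes the listed component, so PMIx falls back to ds12 as
described above.
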
>>>>> On Dec 25, 2019, at 9:56 AM, Marco Atzeri via devel 
>>>>> <devel@lists.open-mpi.org> wrote:
>>>>> 
>>>>> I have no multi-node setup around for testing.
>>>>> 
>>>>> I will need to set one up after the holidays.
>>>>> 
>>>>>> On 24.12.2019 at 23:27, Jeff Squyres (jsquyres) wrote:
>>>>>> That actually looks like a legit error -- it's failing to initialize a 
>>>>>> shared mutex.
>>>>>> I'm not sure what the consequence is of this failure, though, since the 
>>>>>> job seemed to run ok.
>>>>>> Are you able to run multi-node jobs ok?
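For example, one way to check, assuming two reachable nodes (node1 and node2
are placeholders) with the binary available at the same path on both:

    mpirun -n 2 --host node1,node2 ./hello_c.exe
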
>>>>>>> On Dec 22, 2019, at 1:20 AM, Marco Atzeri via devel 
>>>>>>> <devel@lists.open-mpi.org> wrote:
>>>>>>> 
>>>>>>> Hi Developers,
>>>>>>> 
>>>>>>> Cygwin 64bit, openmpi-3.1.5-1
>>>>>>> While testing the Cygwin package before releasing it, I see some 
>>>>>>> spurious error messages that I have never seen before and that do not 
>>>>>>> seem to indicate an actual error:
>>>>>>> 
>>>>>>> $ mpirun -n 4 ./hello_c.exe
>>>>>>> [LAPTOP-82F08ILC:02395] PMIX ERROR: INIT in file 
>>>>>>> /cygdrive/d/cyg_pub/devel/openmpi/v3.1/openmpi-3.1.5-1.x86_64/src/openmpi-3.1.5/opal/mca/pmix/pmix2x/pmix/src/mca/gds/ds21/gds_ds21_lock_pthread.c
>>>>>>>  at line 188
>>>>>>> [LAPTOP-82F08ILC:02395] PMIX ERROR: SUCCESS in file 
>>>>>>> /cygdrive/d/cyg_pub/devel/openmpi/v3.1/openmpi-3.1.5-1.x86_64/src/openmpi-3.1.5/opal/mca/pmix/pmix2x/pmix/src/mca/common/dstore/dstore_base.c
>>>>>>>  at line 2432
>>>>>>> Hello, world, I am 0 of 4, (Open MPI v3.1.5, package: Open MPI 
>>>>>>> Marco@LAPTOP-82F08ILC Distribution, ident: 3.1.5, repo rev: v3.1.5, Nov 
>>>>>>> 15, 2019, 116)
>>>>>>> Hello, world, I am 1 of 4, (Open MPI v3.1.5, package: Open MPI 
>>>>>>> Marco@LAPTOP-82F08ILC Distribution, ident: 3.1.5, repo rev: v3.1.5, Nov 
>>>>>>> 15, 2019, 116)
>>>>>>> Hello, world, I am 2 of 4, (Open MPI v3.1.5, package: Open MPI 
>>>>>>> Marco@LAPTOP-82F08ILC Distribution, ident: 3.1.5, repo rev: v3.1.5, Nov 
>>>>>>> 15, 2019, 116)
>>>>>>> Hello, world, I am 3 of 4, (Open MPI v3.1.5, package: Open MPI 
>>>>>>> Marco@LAPTOP-82F08ILC Distribution, ident: 3.1.5, repo rev: v3.1.5, Nov 
>>>>>>> 15, 2019, 116)
>>>>>>> [LAPTOP-82F08ILC:02395] [[20101,0],0] unable to open debugger attach 
>>>>>>> fifo
>>>>>>> 
>>>>>>> Is there a known workaround?
>>>>>>> I have not found anything on the issue list.
>>>>>>> 
>>>>>>> Regards
>>>>>>> Marco
>> 
>> 
>> -- 
>> Jeff Squyres
>> jsquy...@cisco.com
>> 
> 
> 

