Sorry, typo: 314, not 313.

Brock Palen
www.umich.edu/~brockp
Center for Advanced Computing
bro...@umich.edu
(734)936-1985



On May 17, 2011, at 2:02 PM, Brock Palen wrote:

> Thanks, I thought of looking at ompi_info after I sent that note, sigh.
> 
> SEND_INPLACE appears to improve performance over regular SEND for larger 
> messages in my synthetic benchmarks.  It also appears that SEND_INPLACE still 
> allows our code to run.
> 
> We are working on getting devs access to our system and code. 
> 
> Brock Palen
> www.umich.edu/~brockp
> Center for Advanced Computing
> bro...@umich.edu
> (734)936-1985
> 
> 
> 
> On May 16, 2011, at 11:49 AM, George Bosilca wrote:
> 
>> Here is the output of the "ompi_info --param btl openib":
>> 
>>                MCA btl: parameter "btl_openib_flags" (current value: <306>,
>>                         data source: default value)
>>                         BTL bit flags (general flags: SEND=1, PUT=2, GET=4,
>>                         SEND_INPLACE=8, RDMA_MATCHED=64,
>>                         HETEROGENEOUS_RDMA=256; flags only used by the "dr"
>>                         PML (ignored by others): ACK=16, CHECKSUM=32,
>>                         RDMA_COMPLETION=128; flags only used by the "bfo"
>>                         PML (ignored by others): FAILOVER_SUPPORT=512)
>> 
>> So the flags value 305 means: HETEROGENEOUS_RDMA | CHECKSUM | ACK | SEND. Most 
>> of these flags are totally useless in the current version of Open MPI (DR is 
>> not supported), so the only combination that really matters is SEND | 
>> HETEROGENEOUS_RDMA.
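>> 
>> As a quick sanity check against the flag values listed above:
>> 
>>     HETEROGENEOUS_RDMA (256) + CHECKSUM (32) + ACK (16) + SEND (1) = 305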
>> 
>> If you want to enable the send protocol, try first with SEND | SEND_INPLACE 
>> (9); if that doesn't work, downgrade to SEND (1).
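>> 
>> For example, something along these lines should do it (just a sketch; the 
>> process count, BTL list, and binary name are placeholders):
>> 
>>     # SEND | SEND_INPLACE = 1 + 8 = 9
>>     mpirun -np 32 --mca btl openib,sm,self --mca btl_openib_flags 9 ./your_app
>> 
>>     # fallback: SEND only
>>     mpirun -np 32 --mca btl openib,sm,self --mca btl_openib_flags 1 ./your_app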
>> 
>> george.
>> 
>> On May 16, 2011, at 11:33 , Samuel K. Gutierrez wrote:
>> 
>>> 
>>> On May 16, 2011, at 8:53 AM, Brock Palen wrote:
>>> 
>>>> 
>>>> 
>>>> 
>>>> On May 16, 2011, at 10:23 AM, Samuel K. Gutierrez wrote:
>>>> 
>>>>> Hi,
>>>>> 
>>>>> Just out of curiosity - what happens when you add the following MCA 
>>>>> option to your openib runs?
>>>>> 
>>>>> -mca btl_openib_flags 305
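>>>>> 
>>>>> i.e., something like the following (process count and binary name are just 
>>>>> placeholders):
>>>>> 
>>>>>     mpirun -np 32 -mca btl_openib_flags 305 ./IMB-MPI1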
>>>> 
>>>> You, sir, found the magic combination.
>>> 
>>> :-)  - cool.
>>> 
>>> Developers - does this smell like a registered memory availability hang?
>>> 
>>>> I verified this lets IMB and CRASH progress past their lockup points;
>>>> I will have a user test this.
>>> 
>>> Please let us know what you find.
>>> 
>>>> Is this an OK option to put in our environment?  What does 305 mean?
>>> 
>>> There may be a performance hit associated with this configuration, but if 
>>> it lets your users run, then I don't see a problem with adding it to your 
>>> environment.
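>>> 
>>> If you do decide to set it site-wide, one way (just a sketch; adjust the 
>>> prefix path to your install) is via the MCA parameters file or an 
>>> environment variable:
>>> 
>>>     # in <prefix>/etc/openmpi-mca-params.conf (or $HOME/.openmpi/mca-params.conf)
>>>     btl_openib_flags = 305
>>> 
>>>     # or, equivalently, in the shell environment
>>>     export OMPI_MCA_btl_openib_flags=305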
>>> 
>>> If I'm reading things correctly, 305 turns off RDMA PUT/GET and turns on 
>>> SEND.
>>> 
>>> OpenFabrics gurus - please correct me if I'm wrong :-).
>>> 
>>> Samuel Gutierrez
>>> Los Alamos National Laboratory
>>> 
>>> 
>>>> 
>>>> 
>>>> Brock Palen
>>>> www.umich.edu/~brockp
>>>> Center for Advanced Computing
>>>> bro...@umich.edu
>>>> (734)936-1985
>>>> 
>>>>> 
>>>>> Thanks,
>>>>> 
>>>>> Samuel Gutierrez
>>>>> Los Alamos National Laboratory
>>>>> 
>>>>> On May 13, 2011, at 2:38 PM, Brock Palen wrote:
>>>>> 
>>>>>> On May 13, 2011, at 4:09 PM, Dave Love wrote:
>>>>>> 
>>>>>>> Jeff Squyres <jsquy...@cisco.com> writes:
>>>>>>> 
>>>>>>>> On May 11, 2011, at 3:21 PM, Dave Love wrote:
>>>>>>>> 
>>>>>>>>> We can reproduce it with IMB.  We could provide access, but we'd have 
>>>>>>>>> to
>>>>>>>>> negotiate with the owners of the relevant nodes to give you 
>>>>>>>>> interactive
>>>>>>>>> access to them.  Maybe Brock's would be more accessible?  (If you
>>>>>>>>> contact me, I may not be able to respond for a few days.)
>>>>>>>> 
>>>>>>>> Brock has replied off-list that he, too, is able to reliably reproduce 
>>>>>>>> the issue with IMB, and is working to get access for us.  Many thanks 
>>>>>>>> for your offer; let's see where Brock's access takes us.
>>>>>>> 
>>>>>>> Good.  Let me know if we can be useful.
>>>>>>> 
>>>>>>>>>> -- we have not closed this issue,
>>>>>>>>> 
>>>>>>>>> Which issue?   I couldn't find a relevant-looking one.
>>>>>>>> 
>>>>>>>> https://svn.open-mpi.org/trac/ompi/ticket/2714
>>>>>>> 
>>>>>>> Thanks.  In case it's useful info: it hangs for me with 1.5.3 & np=32 on
>>>>>>> ConnectX with more than one collective (I can't recall which).
>>>>>> 
>>>>>> Extra data point: that ticket said it ran with mpi_preconnect_mpi 1, but 
>>>>>> that doesn't help here; both my production code (CRASH) and IMB still 
>>>>>> hang.
>>>>>> 
>>>>>> 
>>>>>> Brock Palen
>>>>>> www.umich.edu/~brockp
>>>>>> Center for Advanced Computing
>>>>>> bro...@umich.edu
>>>>>> (734)936-1985
>>>>>> 
>>>>>>> 
>>>>>>> -- 
>>>>>>> Excuse the typing -- I have a broken wrist
>>>>>>> 
>> 
>> George Bosilca
>> Research Assistant Professor
>> Innovative Computing Laboratory
>> Department of Electrical Engineering and Computer Science
>> University of Tennessee, Knoxville
>> http://web.eecs.utk.edu/~bosilca/