Sorry for the confusion; I was asking George which one wins.  I'm not active in 
the MX portion of the OMPI code base, so I don't know which one is better / 
should be used.


On Jun 18, 2010, at 8:19 AM, guillaume ranquet wrote:

> 
> Hello,
> 
> Sorry for the very long delay; I didn't understand that you were waiting
> for an answer from my side on this (the debate seemed to be between
> maintainers). Do not hesitate to bug me if I haven't answered after a
> few days.
> 
> To answer briefly:
> - yes, I've tested the patch submitted in this thread by Scott, and it
> solved my issues.
> - no, I haven't tested the patch submitted by George; I can give it a
> quick try if needed.
> 
> As for "which one wins", I'm quite sure you have more clues than me on
> the subject :)
> 
> 
> On 06/07/2010 09:49 PM, Jeff Squyres wrote:
>> George --
>> 
>> Scott's patch was different than the one you applied.  Apparently, his fixes 
>> this user's problem (I don't know if Guillaume tested yours).
>> 
>> Which one wins?
>> 
>> 
>> 
>> On Jun 3, 2010, at 9:49 AM, Scott Atchley wrote:
>> 
>>> On Jun 3, 2010, at 8:54 AM, guillaume ranquet wrote:
>>> 
>>>> granquet@bordeplage-15 ~ $ mpirun --mca btl mx,openib,sm,self --mca pml
>>>> ^cm --mca mpi_leave_pinned 0 ~/bwlat/mpi_helloworld
>>>> [bordeplage-15.bordeaux.grid5000.fr:02707] Error in mx_init (error No MX
>>>> device entry in /dev.)
>>>> Hello world from process 0 of 1
>>>> 
>>>> it works :)
>>> 
>>> Jeff, you may want to change this message to opal_output_verbose(). It is 
>>> in $OMPI/ompi/mca/common/common_mx.c.
>>> 
>>>>> Ok. I think that OMPI is trying to open the MX MTL first. It fails at
>>>>> mx_init() (the first error message) but it had already created some
>>>>> mpool resources. It then tries to open the MX BTL and it skips the MX
>>>>> initialization and returns SUCCESS. The MX BTL then tries to call
>>>>> mx_get_info() which fails and prints the second message.
>>>>> 
>>>>> Try the attached patch. It tries to clean up if mx_init() fails and
>>>>> does not return SUCCESS on subsequent attempts to initialize MX.
>>>>> 
>>>>> Scott
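
For readers following the thread, the failure mode Scott describes above
(a failed mx_init() followed by a second caller being told SUCCESS) can be
sketched roughly as below. This is only an illustration under stated
assumptions, not Scott's patch and not the actual code in common_mx.c:
every name here (sketch_common_mx_initialize, fake_mx_init, the flags) is
hypothetical, and the real MX call is replaced by a stand-in that always
fails when no device is present.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical return codes; not the OMPI constants. */
enum { SKETCH_SUCCESS = 0, SKETCH_ERROR = -1 };

static int  init_refcount  = 0;     /* successful users so far (e.g. MTL, BTL) */
static bool init_failed    = false; /* remembers that an earlier attempt failed */
static bool mpool_created  = false; /* models a resource set up before mx_init() */
static bool device_present = false; /* models "No MX device entry in /dev" */

/* Stand-in for mx_init(): fails when no device is present. */
static int fake_mx_init(void)
{
    return device_present ? SKETCH_SUCCESS : SKETCH_ERROR;
}

/* Shared initialization called once per component that wants MX. */
int sketch_common_mx_initialize(void)
{
    if (init_failed) {
        /* A previous attempt failed: do not report SUCCESS to later callers. */
        return SKETCH_ERROR;
    }
    if (init_refcount > 0) {
        /* Already initialized by another component: just count the new user. */
        init_refcount++;
        return SKETCH_SUCCESS;
    }

    mpool_created = true;           /* resource created before the MX call */
    if (fake_mx_init() != SKETCH_SUCCESS) {
        fprintf(stderr, "Error in mx_init\n");
        mpool_created = false;      /* clean up what was created */
        init_failed = true;         /* make later attempts fail too */
        return SKETCH_ERROR;
    }

    init_refcount = 1;
    return SKETCH_SUCCESS;
}

/* Matching teardown; releases the resource when the last user is done. */
void sketch_common_mx_finalize(void)
{
    assert(init_refcount > 0);
    if (--init_refcount == 0) {
        mpool_created = false;
    }
}
```

With no device present, both the first and the second call return an
error, so a second component never goes on to use an uninitialized
library.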
>>>> 
>>>> I tried your patch and it seems to correct the issue:
>>>> 
>>>> configured with:  --prefix=$HOME/openmpi-1.4.2-nomx-bin/
>>>> --with-openib=/usr --with-mx=/usr
>>>> 
>>>> $ ~/openmpi-1.4.2-nomx-bin/bin/mpirun ~/bwlat/mpi_helloworld
>>>> [bordeplage-15.bordeaux.grid5000.fr:22406] Error in mx_init (error No MX
>>>> device entry in /dev.)
>>>> Hello world from process 0 of 1
>>> 
>>> Excellent.
>>> 
>>>> don't hesitate if you need further testing :)
>>> 
>>> Thanks for all your assistance!
>>> 
>>>> do you plan on applying this patch on next release? (1.4.3?)
>>> 
>>> Jeff, I leave this up to you and George.
>>> 
>>> Scott
>>> 
>> 
>> 
> 
> 


-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/

