Sorry for the confusion; I was asking George which one wins. I'm not active in the MX portion of the OMPI code base, so I don't know which one is better / should be used.

On Jun 18, 2010, at 8:19 AM, guillaume ranquet wrote:

> Hello,
>
> sorry for the very long delay, I didn't understood you waited an answer
> from my side on this. (the debate seemed to be between maintainers)
> do not hesitate to bug me if I'm not answering after some days.
>
> to answer shortly:
> -yes I've tested the patch submited on this thread by Scott and it
> solved my issues.
> -no, I havent tested the patch submited by George, I can have a quick
> try if needed.
>
> as of "which one wins", I'm quite sure you have more clues than me on
> the subjet :)
>
>
> On 06/07/2010 09:49 PM, Jeff Squyres wrote:
>> George --
>>
>> Scott's patch was different than the one you applied. Apparently, his fixes
>> this user's problem (I don't know if Guillaume tested yours).
>>
>> Which one wins?
>>
>>
>> On Jun 3, 2010, at 9:49 AM, Scott Atchley wrote:
>>
>>> On Jun 3, 2010, at 8:54 AM, guillaume ranquet wrote:
>>>
>>>> granquet@bordeplage-15 ~ $ mpirun --mca btl mx,openib,sm,self --mca pml
>>>> ^cm --mca mpi_leave_pinned 0 ~/bwlat/mpi_helloworld
>>>> [bordeplage-15.bordeaux.grid5000.fr:02707] Error in mx_init (error No MX
>>>> device entry in /dev.)
>>>> Hello world from process 0 of 1
>>>>
>>>> it works :)
>>>
>>> Jeff, you may want to change this message to opal_output_verbose(). It is
>>> in $OMPI/ompi/mca/common/common_mx.c.
>>>
>>>>> Ok. I think that OMPI is trying to open the MX MTL first. It fails at
>>>>> mx_init() (the first error message) but it had already created some
>>>>> mpool resources. It then tries to open the MX BTL and it skips the MX
>>>>> initialization and returns SUCCESS. The MX BTL then tries to call
>>>>> mx_get_info() which fails and prints the second message.
>>>>>
>>>>> Try the attached patch. It tries to clean up if mx_init() fails and
>>>>> does not return SUCCESS on subsequent attempts to initialize MX.
>>>>>
>>>>> Scott
>>>>
>>>> I tried your patch and it seems to correct the issue:
>>>>
>>>> configured with: --prefix=$HOME/openmpi-1.4.2-nomx-bin/
>>>> --with-openib=/usr --with-mx=/usr
>>>>
>>>> $ ~/openmpi-1.4.2-nomx-bin/bin/mpirun ~/bwlat/mpi_helloworld
>>>> [bordeplage-15.bordeaux.grid5000.fr:22406] Error in mx_init (error No MX
>>>> device entry in /dev.)
>>>> Hello world from process 0 of 1
>>>
>>> Excellent.
>>>
>>>> don't hesitate if you need further testing :)
>>>
>>> Thanks for all your assistance!
>>>
>>>> do you plan on applying this patch on next release? (1.4.3?)
>>>
>>> Jeff, I leave this up to you and George.
>>>
>>> Scott
>>>
>>
>

-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/