FWIW: some message appears to be sent down to the child procs on
termination, and that is what causes that issue. I’m not sure where it comes
from, but it is probably due to the restoration of the usock OOB component.


> On Apr 25, 2016, at 7:25 AM, Josh Hursey <jjhur...@open-mpi.org> wrote:
> 
> IBM had a stale version of ompi-tests. I have sync'ed that repo, and will try 
> again later today.
> 
> The loop spawn error will take some digging. I'll see what we can find.
> 
> On Mon, Apr 25, 2016 at 9:14 AM, Gilles Gouaillardet 
> <gilles.gouaillar...@gmail.com> wrote:
> This is a known bug that is being discussed at
> https://github.com/open-mpi/ompi/pull/1473/commits/0d1431f02c6b2876cdeee4fd783d6b6807dfff2a
> It affects big-endian machines or builds using 8-byte Fortran integers.
> 
> Cheers,
> 
> Gilles
> 
> 
> On Monday, April 25, 2016, Adrian Reber <adr...@lisas.de> wrote:
> Errors like that (Win::Get_attr: Got wrong value for disp unit) are from
> my ppc64 machine: https://mtt.open-mpi.org/index.php?do_redir=2295
> 
> The MTT setup is checking out the tests from github directly:
> 
> [Test get: ibm]
> module = SCM
> scm_module = Git
> scm_url = https://github.com/open-mpi/ompi-tests.git
> scm_subdir = ibm
> 
> I am not sure whether Ralph meant those errors, but they only happen on
> ppc64 and not on x86_64 with a very similar MTT configuration file.
> 
>                 Adrian
> 
> On Mon, Apr 25, 2016 at 10:50:03PM +0900, Gilles Gouaillardet wrote:
> > The Cisco MTT looks clean.
> > Since the ompi-tests repo is private, it cannot be pulled automatically unless
> > a password is saved (https) or a public key has been uploaded to GitHub (ssh).
> > For that reason, I would not simply assume the latest test suite is used :-(
> > FWIW, Jeff uses an internally mirrored repo for ompi-tests, so the Cisco
> > clusters should use the latest test suites.
> >
> > Geoffrey,
> > can you please comment on the config of the IBM cluster?
> >
> > Cheers,
> >
> > Gilles
> >
> > On Monday, April 25, 2016, Ralph Castain <r...@open-mpi.org> wrote:
> >
> > > I don’t know - this isn’t on my machine, but rather in the weekend and
> > > nightly MTT reports. I’m assuming folks are running the latest test suite,
> > > but...
> > >
> > >
> > > On Apr 25, 2016, at 6:20 AM, Gilles Gouaillardet
> > > <gilles.gouaillar...@gmail.com> wrote:
> > >
> > > Ralph,
> > >
> > > can you make sure the IBM test suite is up to date?
> > > I pushed a fix for datatypes a few days ago, and it should be fine now.
> > >
> > > I will double-check this tomorrow anyway.
> > >
> > > Cheers,
> > >
> > > Gilles
> > >
> > > On Monday, April 25, 2016, Ralph Castain <r...@open-mpi.org> wrote:
> > >
> > >> I’m seeing some consistent errors in the 1.10.3rc MTT results and would
> > >> appreciate it if folks could check them out:
> > >>
> > >> ONESIDED:
> > >> onesided/cxx_win_attr:
> > >> [**ERROR**]: MPI_COMM_WORLD rank 0, file cxx_win_attr.cc:50:
> > >> Win::Get_attr: Got wrong value for disp unit
> > >> [**ERROR**]: MPI_COMM_WORLD rank 1, file cxx_win_attr.cc:50:
> > >> Win::Get_attr: Got wrong value for disp
> > >>
> > >>
> > >> DATATYPE:
> > >> datatype/predefined-datatype-name
> > >> MPI_LONG_LONG                    != MPI_LONG_LONG_INT
> > >>
> > >>
> > >> LOOP SPAWN:
> > >> too many retries sending message to <addr>, giving up
> > >>
> > >> Thanks
> > >> Ralph
> > >>
> > >> _______________________________________________
> > >> devel mailing list
> > >> de...@open-mpi.org <>
> > >> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel 
> > >> <http://www.open-mpi.org/mailman/listinfo.cgi/devel>
> > >> Link to this post:
> > >> http://www.open-mpi.org/community/lists/devel/2016/04/18809.php 
> > >> <http://www.open-mpi.org/community/lists/devel/2016/04/18809.php>
> > >
