Re: [OMPI devel] v1.7.0rc7
On 02/23/13 14:45, Ralph Castain wrote:
> This release candidate is the last one we expect to have before
> release, so please test it. Can be downloaded from the usual place:
>
> http://www.open-mpi.org/software/ompi/v1.7/

I haven't looked at this very carefully yet. Maybe someone can confirm
what I'm seeing?

If I try to "mpirun `pwd`", the job should fail (since I'm launching a
directory rather than an executable). With v1.7, however, the return
status is 0. (The error message also suggests some confusion.)

My experiment is to run

    mpirun `pwd`
    echo status is $status

Here is v1.7:

--
Open MPI tried to fork a new process via the "execve" system call but
failed. This is an unusual error because Open MPI checks many things
before attempting to launch a child process. This error may be
indicative of another problem on the target host. Your job will now
abort.

  Local host:        /workspace/eugene/v1.7-testing
  Application name:  Permission denied
--
status is 0

Here is v1.6:

--
mpirun noticed that the job aborted, but has no info as to the process
that caused that situation.
--
status is 1
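For reference, the expected exit-status propagation can be sketched in a
few lines of C. This is a toy launcher, not Open MPI's actual launch
path: execv() on a directory fails (EACCES), the child exits nonzero,
and the parent's exit status reflects that -- the v1.6 behavior above.

    #include <stdio.h>
    #include <stdlib.h>
    #include <sys/types.h>
    #include <sys/wait.h>
    #include <unistd.h>

    /* Toy launcher: fork/exec the given command and propagate the
     * child's failure to our own exit status. */
    int main(int argc, char *argv[])
    {
        if (argc < 2) return 1;

        pid_t pid = fork();
        if (pid == 0) {
            execv(argv[1], &argv[1]);
            perror("execv");      /* only reached if execv() failed */
            exit(1);              /* propagate the failure */
        }

        int status;
        waitpid(pid, &status, 0);
        return WIFEXITED(status) ? WEXITSTATUS(status) : 1;
    }

Run against a directory (e.g. "./launcher `pwd`"), this exits 1, which
is what one would expect mpirun to do as well.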
Re: [OMPI devel] RFC: Remove windows support
No other issues were raised about this, and today was the timeout.

On the call today, Ralph volunteered to do the work:

- svn rm the Windows-specific components
- remove all the #if Windows-specific code

He'll be doing that over the next week or so.

On Feb 18, 2013, at 1:34 PM, Ralph Castain wrote:

> Thanks Marco - I was hoping that would be the case!
>
> On Feb 18, 2013, at 8:42 AM, marco atzeri wrote:
>
>> On 2/18/2013 5:10 PM, Jeff Squyres (jsquyres) wrote:
>>> WHAT: Remove all Windows code from the trunk.
>>>
>>> WHY: This issue keeps coming up over and over and over...
>>>
>> [cut]
>>> 2. Remove all Windows code. This involves some wholesale removal of
>>> components as well as a bunch of #if code throughout the code base.
>>>
>>> ==> Removing this code can probably be done in multiple SVN commits:
>>>
>>> 2a. Removing Windows-only components (which, given the rate of change
>>> that we are planning for the trunk, may well need to be re-written if
>>> they are ever re-introduced into the tree).
>>
>> Cygwin does not use them. I'm currently building the trunk packages with
>>
>> --enable-mca-no-build=paffinity,installdirs-windows,timer-windows,shmem-sysv,if-windows,shmem-windows
>>
>> to specifically exclude them.
>>
>>> 2b. Removing "#if WINDOWS" code (e.g., in opal/util/*, etc.). This
>>> code may not be changing as much as the rest of the trunk, and may be
>>> suitable for svn reverting someday.
>>>
>>> This does kill Cygwin support, too. I realize we have a downstream
>>> packager for Cygwin, but the fact that we can't get any developer
>>> support for Windows -- despite multiple appeals -- seems to imply that
>>> the Windows Open MPI audience is very, very small. So while it feels a
>>> bit sad to kill it, it may still be the Right Thing to do.
>>
>> I assume it is __WINDOWS__. That is not defined on Cygwin, so the
>> build should survive.
>>
>>> This is a proposal, and is open for discussion.
>>
>> Regards
>> Marco

--
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/
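For context, the kind of guard being removed looks roughly like the
snippet below. It is a hypothetical example, not code from the tree;
the guard macro is assumed to be __WINDOWS__, per Marco's note.

    #include <stdio.h>

    /* Hypothetical "#if WINDOWS" pattern of the sort being removed. */
    static const char *path_sep(void)
    {
    #if defined(__WINDOWS__)
        return "\\";   /* Windows-specific branch: to be deleted */
    #else
        return "/";    /* POSIX branch: becomes the only code path */
    #endif
    }

    int main(void)
    {
        printf("%s\n", path_sep());
        return 0;
    }

Since Cygwin does not define __WINDOWS__, deleting the first branch
leaves Cygwin builds taking the POSIX path they already take today.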
Re: [MTT devel] fix zombie commit
On Feb 26, 2013, at 2:11 AM, Mike Dubman wrote:

> On Mon, Feb 25, 2013 at 6:24 PM, Jeff Squyres (jsquyres) wrote:
>> Looking at the code, you're checking for zombie status before MTT
>> kills the proc. Am I reading that right?
>
> I don't think the order matters. If the process is not a zombie yet
> and is about to be killed by MTT later, that is a good flow. If the
> process is already a zombie, MTT will not be able to kill it anyway,
> and can stop waiting and switch to the new task.

No, the _kill_proc() routine does both a kill() and a waitpid(). The
waitpid() should reap the zombie. I.e., if the process has died, MTT
simply hasn't reaped it yet. Hence, it's a zombie.

>> If so, then it could well be that the process has exited but not yet
>> been reaped (because _kill_proc() hasn't been invoked yet). If this
>> is the case, is the real cause of the problem that the OUTread and
>> ERRread aren't being closed when the child process exits, and
>> therefore we keep looping looking for new output from them?
>
> Yep, sounds like it can be the cause; need to look into this code.

Ok. It would be interesting to see if the process dies, but:

1) MTT is still blocking in select() (i.e., OUTread and ERRread aren't
   returning 0 from sysread upon process death)

2) $done is somehow not getting set to 0, and therefore MTT is still
   looping until the timeout expires

--
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/
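To make the kill()/waitpid() point concrete, here is a minimal sketch
in C (MTT itself is Perl; the function name and signal choice are
illustrative, not MTT's actual code):

    #include <signal.h>
    #include <sys/types.h>
    #include <sys/wait.h>
    #include <unistd.h>

    /* Signal the child, then waitpid() it.  The waitpid() is what
     * reaps a zombie: a child that has already exited lingers as a
     * zombie only until its parent waits on it. */
    static int kill_and_reap(pid_t pid)
    {
        int status;

        kill(pid, SIGTERM);   /* harmless if the child is already a zombie */
        if (waitpid(pid, &status, 0) == pid) {
            return 0;         /* reaped: the zombie (if any) is gone */
        }
        return -1;
    }

    int main(void)
    {
        pid_t pid = fork();
        if (pid == 0) {
            _exit(0);         /* child exits immediately -> zombie */
        }
        sleep(1);             /* child is now a zombie */
        return kill_and_reap(pid) == 0 ? 0 : 1;
    }

The point being: the kill() on an already-dead child is a no-op, but
the waitpid() still reaps it, so the kill-then-wait order is fine.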
Re: [OMPI devel] v1.7.0rc7
Hi,

> This release candidate is the last one we expect to have
> before release, so please test it. Can be downloaded from
> the usual place:
>
> http://www.open-mpi.org/software/ompi/v1.7/
>
> Latest changes include:
>
> * update of the alps/lustre configure code
> * fixed solaris hwloc code
> * various mxm updates
> * removed java bindings (delayed until later release)
> * improved the --report-bindings output
> * a variety of minor cleanups

My rankfiles don't work.

tyr rankfiles 106 ompi_info | grep "MPI:"
                Open MPI: 1.7rc7
tyr rankfiles 107 mpiexec -report-bindings -rf rf_ex_linpc hostname
--
All nodes which are allocated for this job are already filled.
--
tyr rankfiles 108 mpiexec -report-bindings -rf rf_ex_sunpc hostname
--
All nodes which are allocated for this job are already filled.
--
tyr rankfiles 109 mpiexec -report-bindings -rf rf_ex_sunpc_linpc hostname
--
All nodes which are allocated for this job are already filled.
--
tyr rankfiles 110

They work as expected with openmpi-1.6.4.

tyr rankfiles 99 ompi_info | grep "MPI:"
                Open MPI: 1.6.4rc4r28039
tyr rankfiles 100 mpiexec -report-bindings -rf rf_ex_linpc hostname
[linpc0:17655] MCW rank 0 bound to socket 0[core 0-1] socket 1[core 0-1]: [B B][B B] (slot list 0:0-1,1:0-1)
linpc0
linpc1
[linpc1:06707] MCW rank 1 bound to socket 0[core 0-1]: [B B][. .] (slot list 0:0-1)
linpc1
[linpc1:06707] MCW rank 2 bound to socket 1[core 0]: [. .][B .] (slot list 1:0)
[linpc1:06707] MCW rank 3 bound to socket 1[core 1]: [. .][. B] (slot list 1:1)
linpc1
tyr rankfiles 101 mpiexec -report-bindings -rf rf_ex_sunpc hostname
[sunpc0:22706] MCW rank 0 bound to socket 0[core 0-1] socket 1[core 0-1]: [B B][B B] (slot list 0:0-1,1:0-1)
sunpc0
sunpc1
[sunpc1:25189] MCW rank 1 bound to socket 0[core 0-1]: [B B][. .] (slot list 0:0-1)
sunpc1
[sunpc1:25189] MCW rank 2 bound to socket 1[core 0]: [. .][B .] (slot list 1:0)
[sunpc1:25189] MCW rank 3 bound to socket 1[core 1]: [. .][. B] (slot list 1:1)
sunpc1
tyr rankfiles 102 mpiexec -report-bindings -rf rf_ex_sunpc_linpc hostname
[linpc1:06777] MCW rank 0 bound to socket 0[core 0-1] socket 1[core 0-1]: [B B][B B] (slot list 0:0-1,1:0-1)
linpc1
sunpc1
[sunpc1:25226] MCW rank 1 bound to socket 0[core 0-1]: [B B][. .] (slot list 0:0-1)
sunpc1
[sunpc1:25226] MCW rank 2 bound to socket 1[core 0]: [. .][B .] (slot list 1:0)
[sunpc1:25226] MCW rank 3 bound to socket 1[core 1]: [. .][. B] (slot list 1:1)
sunpc1
tyr rankfiles 103

Kind regards

Siegmar
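For reference, a rankfile producing the rf_ex_linpc bindings shown in
the 1.6.4 output would look something like the following. This is a
plausible reconstruction from the -report-bindings slot lists, not
necessarily Siegmar's actual file:

    rank 0=linpc0 slot=0:0-1,1:0-1
    rank 1=linpc1 slot=0:0-1
    rank 2=linpc1 slot=1:0
    rank 3=linpc1 slot=1:1

Each "slot" entry uses the socket:core notation, so for example rank 2
should be bound to socket 1, core 0 on linpc1 -- exactly what the 1.6.4
report shows and what 1.7rc7 rejects with "All nodes ... already
filled."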
Re: [OMPI devel] v1.7.0rc7
These warnings are now fixed (r28106). Thanks for reporting them.

  George.

On Feb 26, 2013, at 04:27, marco atzeri wrote:

>   CC       to_self.o
> /pub/devel/openmpi/openmpi-1.7rc7-1/src/openmpi-1.7rc7/test/datatype/to_self.c:
> In function ‘create_indexed_constant_gap_ddt’:
> /pub/devel/openmpi/openmpi-1.7rc7-1/src/openmpi-1.7rc7/test/datatype/to_self.c:48:5:
> warning: ‘MPI_Type_struct’ is deprecated (declared at
> ../../ompi/include/mpi.h:1579): MPI_Type_struct is superseded by
> MPI_Type_create_struct in MPI-2.0
> /pub/devel/openmpi/openmpi-1.7rc7-1/src/openmpi-1.7rc7/test/datatype/to_self.c:
> In function ‘create_indexed_gap_ddt’:
> /pub/devel/openmpi/openmpi-1.7rc7-1/src/openmpi-1.7rc7/test/datatype/to_self.c:89:5:
> warning: ‘MPI_Address’ is deprecated (declared at
> ../../ompi/include/mpi.h:1057): MPI_Address is superseded by
> MPI_Get_address in MPI-2.0
> /pub/devel/openmpi/openmpi-1.7rc7-1/src/openmpi-1.7rc7/test/datatype/to_self.c:90:5:
> warning: ‘MPI_Address’ is deprecated (declared at
> ../../ompi/include/mpi.h:1057): MPI_Address is superseded by
> MPI_Get_address in MPI-2.0
> /pub/devel/openmpi/openmpi-1.7rc7-1/src/openmpi-1.7rc7/test/datatype/to_self.c:93:5:
> warning: ‘MPI_Type_struct’ is deprecated (declared at
> ../../ompi/include/mpi.h:1579): MPI_Type_struct is superseded by
> MPI_Type_create_struct in MPI-2.0
> /pub/devel/openmpi/openmpi-1.7rc7-1/src/openmpi-1.7rc7/test/datatype/to_self.c:99:5:
> warning: ‘MPI_Address’ is deprecated (declared at
> ../../ompi/include/mpi.h:1057): MPI_Address is superseded by
> MPI_Get_address in MPI-2.0
> /pub/devel/openmpi/openmpi-1.7rc7-1/src/openmpi-1.7rc7/test/datatype/to_self.c:100:5:
> warning: ‘MPI_Address’ is deprecated (declared at
> ../../ompi/include/mpi.h:1057): MPI_Address is superseded by
> MPI_Get_address in MPI-2.0
> /pub/devel/openmpi/openmpi-1.7rc7-1/src/openmpi-1.7rc7/test/datatype/to_self.c:105:5:
> warning: ‘MPI_Type_struct’ is deprecated (declared at
> ../../ompi/include/mpi.h:1579): MPI_Type_struct is superseded by
> MPI_Type_create_struct in MPI-2.0
> /pub/devel/openmpi/openmpi-1.7rc7-1/src/openmpi-1.7rc7/test/datatype/to_self.c:
> In function ‘create_indexed_gap_optimized_ddt’:
> /pub/devel/openmpi/openmpi-1.7rc7-1/src/openmpi-1.7rc7/test/datatype/to_self.c:139:5:
> warning: ‘MPI_Type_struct’ is deprecated (declared at
> ../../ompi/include/mpi.h:1579): MPI_Type_struct is superseded by
> MPI_Type_create_struct in MPI-2.0
> /pub/devel/openmpi/openmpi-1.7rc7-1/src/openmpi-1.7rc7/test/datatype/to_self.c:
> In function ‘do_test_for_ddt’:
> /pub/devel/openmpi/openmpi-1.7rc7-1/src/openmpi-1.7rc7/test/datatype/to_self.c:307:5:
> warning: ‘MPI_Type_extent’ is deprecated (declared at
> ../../ompi/include/mpi.h:1541): MPI_Type_extent is superseded by
> MPI_Type_get_extent in MPI-2.0
>   CCLD     to_self.exe
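For anyone updating similar test code, the replacements the warnings
point to look roughly like this. It is a sketch of the general
MPI-1-to-MPI-2 substitution, not the actual r28106 change to to_self.c:

    #include <mpi.h>

    /* MPI_Address        -> MPI_Get_address
     * MPI_Type_struct    -> MPI_Type_create_struct
     * MPI_Type_extent    -> MPI_Type_get_extent */
    int main(int argc, char *argv[])
    {
        struct { int i; double d; } s;
        int blocklens[2] = { 1, 1 };
        MPI_Aint displs[2], lb, extent;
        MPI_Datatype types[2] = { MPI_INT, MPI_DOUBLE }, newtype;

        MPI_Init(&argc, &argv);

        /* was: MPI_Address(&s.i, &displs[0]); etc. */
        MPI_Get_address(&s.i, &displs[0]);
        MPI_Get_address(&s.d, &displs[1]);
        displs[1] -= displs[0];   /* make displacements relative */
        displs[0] = 0;

        /* was: MPI_Type_struct(2, blocklens, displs, types, &newtype); */
        MPI_Type_create_struct(2, blocklens, displs, types, &newtype);
        MPI_Type_commit(&newtype);

        /* was: MPI_Type_extent(newtype, &extent); */
        MPI_Type_get_extent(newtype, &lb, &extent);

        MPI_Type_free(&newtype);
        MPI_Finalize();
        return 0;
    }

Note that MPI_Type_get_extent also returns the lower bound, so callers
of the deprecated MPI_Type_extent need the extra lb argument.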