Re: [OMPI devel] v1.7.0rc7

2013-02-27 Thread Ralph Castain
This is now fixed and CMR'd - thanks!

On Feb 26, 2013, at 1:36 PM, Eugene Loh wrote:

> On 02/23/13 14:45, Ralph Castain wrote:
>> This release candidate is the last one we expect to have before release, so 
>> please test it. Can be downloaded from the usual place:
>> http://www.open-mpi.org/software/ompi/v1.7/
> 
> I haven't looked at this very carefully yet.  Maybe someone can confirm what 
> I'm seeing?  If I try to "mpirun `pwd`", the job should fail (since I'm 
> launching a directory rather than an executable).  With v1.7, however, the 
> return status is 0.  (The error message also suggests some confusion.)
> 
> My experiment is to run
> 
> mpirun `pwd`
> echo status is $status
> 
> Here is v1.7:
> 
> --
> Open MPI tried to fork a new process via the "execve" system call but
> failed.  This is an unusual error because Open MPI checks many things
> before attempting to launch a child process.  This error may be
> indicative of another problem on the target host.  Your job will now
> abort.
> 
>   Local host:/workspace/eugene/v1.7-testing
>   Application name:  Permission denied
> --
> status is 0
> 
> Here is v1.6:
> 
> --
> mpirun noticed that the job aborted, but has no info as to the process
> that caused that situation.
> --
> status is 1
> ___
> devel mailing list
> de...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/devel




Re: [OMPI devel] v1.7.0rc7

2013-02-27 Thread Jeff Squyres (jsquyres)
On Feb 25, 2013, at 10:27 PM, marco atzeri wrote:

> plus the additional ones
> 
>   ERROR.patch : ERROR is already defined, so another label
> is needed for "goto ERROR"

Snipped.

I finally filed a ticket about this: 
https://svn.open-mpi.org/trac/ompi/ticket/3527

We talked about this on the weekly call yesterday.  The RMs said they would 
evaluate the combined patch and see how much risk it poses this close to a 
release.  If it doesn't make 1.7.0, it'll go into 1.7.1.

-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to: 
http://www.cisco.com/web/about/doing_business/legal/cri/




Re: [OMPI devel] v1.7.0rc7

2013-02-26 Thread Eugene Loh

On 02/23/13 14:45, Ralph Castain wrote:
> This release candidate is the last one we expect to have before release, so 
> please test it. Can be downloaded from the usual place:
> http://www.open-mpi.org/software/ompi/v1.7/


I haven't looked at this very carefully yet.  Maybe someone can confirm what
I'm seeing?  If I try to "mpirun `pwd`", the job should fail (since I'm
launching a directory rather than an executable).  With v1.7, however, the
return status is 0.  (The error message also suggests some confusion.)


My experiment is to run

mpirun `pwd`
echo status is $status

Here is v1.7:

--
Open MPI tried to fork a new process via the "execve" system call but
failed.  This is an unusual error because Open MPI checks many things
before attempting to launch a child process.  This error may be
indicative of another problem on the target host.  Your job will now
abort.

  Local host:/workspace/eugene/v1.7-testing
  Application name:  Permission denied
--
status is 0

Here is v1.6:

--
mpirun noticed that the job aborted, but has no info as to the process
that caused that situation.
--
status is 1
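
For reference, the expected behavior can be reproduced outside Open MPI. The following is a minimal C sketch, assuming a POSIX system; `run_and_report` is a hypothetical helper written for illustration and is not taken from the Open MPI sources. It forks, calls execve() on the given path, and has the child exit with a non-zero status when execve() fails, so that the parent can pass the failure on instead of reporting 0. On Linux, execve() on a directory fails with EACCES ("Permission denied"), which matches the text in the v1.7 message above.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper, not Open MPI code: fork a child, try to execve
 * `path`, and return the child's exit status (or -1 if fork/waitpid
 * fails).  If execve fails, the child exits with 127 so the parent can
 * tell that the launch did not succeed. */
int run_and_report(const char *path)
{
    assert(path != NULL);

    pid_t pid = fork();
    if (pid < 0) {
        return -1;
    }
    if (pid == 0) {
        char *const argv[] = { (char *)path, NULL };
        char *const envp[] = { NULL };
        execve(path, argv, envp);
        /* Only reached when execve failed. */
        fprintf(stderr, "execve(%s): %s\n", path, strerror(errno));
        _exit(127);
    }

    int status = 0;
    if (waitpid(pid, &status, 0) < 0) {
        return -1;
    }
    /* Propagate the failure instead of reporting success. */
    return WIFEXITED(status) ? WEXITSTATUS(status) : 128 + WTERMSIG(status);
}
```

Calling `run_and_report("/")` from a small driver and using the result as the process exit code gives the shell a non-zero status, which is what the v1.6 run shows.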


Re: [OMPI devel] v1.7.0rc7

2013-02-26 Thread Siegmar Gross
Hi

> This release candidate is the last one we expect to have
> before release, so please test it. Can be downloaded from
> the usual place:
> 
> http://www.open-mpi.org/software/ompi/v1.7/
> 
> Latest changes include:
> 
> * update of the alps/lustre configure code
> * fixed solaris hwloc code
> * various mxm updates
> * removed java bindings (delayed until later release)
> * improved the --report-bindings output
> * a variety of minor cleanups


My rankfiles don't work.

tyr rankfiles 106 ompi_info | grep "MPI:"
Open MPI: 1.7rc7
tyr rankfiles 107 mpiexec -report-bindings -rf rf_ex_linpc hostname
--
All nodes which are allocated for this job are already filled.
--
tyr rankfiles 108 mpiexec -report-bindings -rf rf_ex_sunpc hostname
--
All nodes which are allocated for this job are already filled.
--
tyr rankfiles 109 mpiexec -report-bindings -rf rf_ex_sunpc_linpc hostname
--
All nodes which are allocated for this job are already filled.
--
tyr rankfiles 110 



They work as expected for openmpi-1.6.4.

tyr rankfiles 99 ompi_info | grep "MPI:"
Open MPI: 1.6.4rc4r28039
tyr rankfiles 100 mpiexec -report-bindings -rf rf_ex_linpc hostname
[linpc0:17655] MCW rank 0 bound to socket 0[core 0-1]
  socket 1[core 0-1]: [B B][B B] (slot list 0:0-1,1:0-1)
linpc0
linpc1
[linpc1:06707] MCW rank 1 bound to socket 0[core 0-1]:
  [B B][. .] (slot list 0:0-1)
linpc1
[linpc1:06707] MCW rank 2 bound to socket 1[core 0]:
  [. .][B .] (slot list 1:0)
[linpc1:06707] MCW rank 3 bound to socket 1[core 1]:
  [. .][. B] (slot list 1:1)
linpc1

tyr rankfiles 101 mpiexec -report-bindings -rf rf_ex_sunpc hostname
[sunpc0:22706] MCW rank 0 bound to socket 0[core 0-1]
  socket 1[core 0-1]: [B B][B B] (slot list 0:0-1,1:0-1)
sunpc0
sunpc1
[sunpc1:25189] MCW rank 1 bound to socket 0[core 0-1]:
  [B B][. .] (slot list 0:0-1)
sunpc1
[sunpc1:25189] MCW rank 2 bound to socket 1[core 0]:
  [. .][B .] (slot list 1:0)
[sunpc1:25189] MCW rank 3 bound to socket 1[core 1]:
  [. .][. B] (slot list 1:1)
sunpc1

tyr rankfiles 102 mpiexec -report-bindings -rf rf_ex_sunpc_linpc hostname
[linpc1:06777] MCW rank 0 bound to socket 0[core 0-1]
  socket 1[core 0-1]: [B B][B B] (slot list 0:0-1,1:0-1)
linpc1
sunpc1
[sunpc1:25226] MCW rank 1 bound to socket 0[core 0-1]:
  [B B][. .] (slot list 0:0-1)
sunpc1
[sunpc1:25226] MCW rank 2 bound to socket 1[core 0]:
  [. .][B .] (slot list 1:0)
[sunpc1:25226] MCW rank 3 bound to socket 1[core 1]:
  [. .][. B] (slot list 1:1)
sunpc1
tyr rankfiles 103 


Kind regards

Siegmar



Re: [OMPI devel] v1.7.0rc7

2013-02-26 Thread George Bosilca
These warnings are now fixed (r28106). Thanks for reporting them.

  George.

On Feb 26, 2013, at 04:27 , marco atzeri wrote:

>  CC   to_self.o
> /pub/devel/openmpi/openmpi-1.7rc7-1/src/openmpi-1.7rc7/test/datatype/to_self.c:
>  In function ‘create_indexed_constant_gap_ddt’:
> /pub/devel/openmpi/openmpi-1.7rc7-1/src/openmpi-1.7rc7/test/datatype/to_self.c:48:5:
>  warning: ‘MPI_Type_struct’ is deprecated (declared at 
> ../../ompi/include/mpi.h:1579): MPI_Type_struct is superseded by 
> MPI_Type_create_struct in MPI-2.0
> /pub/devel/openmpi/openmpi-1.7rc7-1/src/openmpi-1.7rc7/test/datatype/to_self.c:
>  In function ‘create_indexed_gap_ddt’:
> /pub/devel/openmpi/openmpi-1.7rc7-1/src/openmpi-1.7rc7/test/datatype/to_self.c:89:5:
>  warning: ‘MPI_Address’ is deprecated (declared at 
> ../../ompi/include/mpi.h:1057): MPI_Address is superseded by MPI_Get_address 
> in MPI-2.0
> /pub/devel/openmpi/openmpi-1.7rc7-1/src/openmpi-1.7rc7/test/datatype/to_self.c:90:5:
>  warning: ‘MPI_Address’ is deprecated (declared at 
> ../../ompi/include/mpi.h:1057): MPI_Address is superseded by MPI_Get_address 
> in MPI-2.0
> /pub/devel/openmpi/openmpi-1.7rc7-1/src/openmpi-1.7rc7/test/datatype/to_self.c:93:5:
>  warning: ‘MPI_Type_struct’ is deprecated (declared at 
> ../../ompi/include/mpi.h:1579): MPI_Type_struct is superseded by 
> MPI_Type_create_struct in MPI-2.0
> /pub/devel/openmpi/openmpi-1.7rc7-1/src/openmpi-1.7rc7/test/datatype/to_self.c:99:5:
>  warning: ‘MPI_Address’ is deprecated (declared at 
> ../../ompi/include/mpi.h:1057): MPI_Address is superseded by MPI_Get_address 
> in MPI-2.0
> /pub/devel/openmpi/openmpi-1.7rc7-1/src/openmpi-1.7rc7/test/datatype/to_self.c:100:5:
>  warning: ‘MPI_Address’ is deprecated (declared at 
> ../../ompi/include/mpi.h:1057): MPI_Address is superseded by MPI_Get_address 
> in MPI-2.0
> /pub/devel/openmpi/openmpi-1.7rc7-1/src/openmpi-1.7rc7/test/datatype/to_self.c:105:5:
>  warning: ‘MPI_Type_struct’ is deprecated (declared at 
> ../../ompi/include/mpi.h:1579): MPI_Type_struct is superseded by 
> MPI_Type_create_struct in MPI-2.0
> /pub/devel/openmpi/openmpi-1.7rc7-1/src/openmpi-1.7rc7/test/datatype/to_self.c:
>  In function ‘create_indexed_gap_optimized_ddt’:
> /pub/devel/openmpi/openmpi-1.7rc7-1/src/openmpi-1.7rc7/test/datatype/to_self.c:139:5:
>  warning: ‘MPI_Type_struct’ is deprecated (declared at 
> ../../ompi/include/mpi.h:1579): MPI_Type_struct is superseded by 
> MPI_Type_create_struct in MPI-2.0
> /pub/devel/openmpi/openmpi-1.7rc7-1/src/openmpi-1.7rc7/test/datatype/to_self.c:
>  In function ‘do_test_for_ddt’:
> /pub/devel/openmpi/openmpi-1.7rc7-1/src/openmpi-1.7rc7/test/datatype/to_self.c:307:5:
>  warning: ‘MPI_Type_extent’ is deprecated (declared at 
> ../../ompi/include/mpi.h:1541): MPI_Type_extent is superseded by 
> MPI_Type_get_extent in MPI-2.0
>  CCLD to_self.exe




Re: [OMPI devel] v1.7.0rc7

2013-02-25 Thread marco atzeri

On 2/26/2013 3:37 AM, Jeff Squyres (jsquyres) wrote:
> Marco --
>
> Is it just these 2 patches:
>
> r28059 [[BR]]
> Patch for Cygwin support: use correct DSO/shared library prefix and
> suffix.  Thanks to Marco Atzeri for reporting the issue and providing
> an initial patch.
>
> r28060 [[BR]]
> Patch for Cygwin support: Use S_IRWXU for shmget() and include
> .  Thanks to Marco Atzeri for reporting the issue and
> providing an initial patch.




plus the additional ones

   ERROR.patch : ERROR is already defined, so another label
                 is needed for "goto ERROR"
   interface.patch : same for interface; it seems to be a
                 struct defined somewhere else
   min.patch : min is already defined as a macro
                 (I also saw a MIN macro defined in the source)
   mpifh.patch : to avoid an undefined error,
                 libmpi_mpifh_la_LIBADD needs
                 $(top_builddir)/ompi/libmpi.la
   winsock.patch : defense against  usage

attached here
http://www.open-mpi.org/community/lists/devel/2012/12/11855.php
https://svn.open-mpi.org/trac/ompi/ticket/3437
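
The kind of clash that ERROR.patch and min.patch work around can be shown with a short C sketch. It rests on assumptions: the two `#define` lines only simulate a platform header that claims the names `ERROR` and `min`, and `copy_checked`, `smaller_of`, and the `cleanup` label are hypothetical names chosen for illustration. None of this is taken from the actual patches.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Stand-ins for what a platform header might define (assumption: these
 * two lines simulate the clashing macros; they are not copied from any
 * real header). */
#define ERROR 0
#define min(a, b) (((a) < (b)) ? (a) : (b))

/* With ERROR defined as above, "goto ERROR;" expands to "goto 0;" and
 * no longer compiles, so the label gets a name that no header is likely
 * to claim. */
int copy_checked(char *dst, size_t dst_len, const char *src)
{
    int rc = 0;

    if (dst == NULL || src == NULL || dst_len == 0) {
        rc = -1;
        goto cleanup;          /* was: goto ERROR; */
    }
    if (strlen(src) >= dst_len) {
        rc = -1;
        goto cleanup;
    }
    strcpy(dst, src);

cleanup:
    return rc;
}

/* Likewise, a function named min() would be rewritten by the macro, so
 * the sketch uses a differently named helper. */
int smaller_of(int a, int b)
{
    return (a < b) ? a : b;
}
```

Renaming the label or the helper keeps the code independent of whatever the system headers define, which is less fragile than adding `#undef` lines.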

All tests passed, but I noticed these warnings:

  CC   to_self.o
/pub/devel/openmpi/openmpi-1.7rc7-1/src/openmpi-1.7rc7/test/datatype/to_self.c: 
In function ‘create_indexed_constant_gap_ddt’:
/pub/devel/openmpi/openmpi-1.7rc7-1/src/openmpi-1.7rc7/test/datatype/to_self.c:48:5: 
warning: ‘MPI_Type_struct’ is deprecated (declared at 
../../ompi/include/mpi.h:1579): MPI_Type_struct is superseded by 
MPI_Type_create_struct in MPI-2.0
/pub/devel/openmpi/openmpi-1.7rc7-1/src/openmpi-1.7rc7/test/datatype/to_self.c: 
In function ‘create_indexed_gap_ddt’:
/pub/devel/openmpi/openmpi-1.7rc7-1/src/openmpi-1.7rc7/test/datatype/to_self.c:89:5: 
warning: ‘MPI_Address’ is deprecated (declared at 
../../ompi/include/mpi.h:1057): MPI_Address is superseded by 
MPI_Get_address in MPI-2.0
/pub/devel/openmpi/openmpi-1.7rc7-1/src/openmpi-1.7rc7/test/datatype/to_self.c:90:5: 
warning: ‘MPI_Address’ is deprecated (declared at 
../../ompi/include/mpi.h:1057): MPI_Address is superseded by 
MPI_Get_address in MPI-2.0
/pub/devel/openmpi/openmpi-1.7rc7-1/src/openmpi-1.7rc7/test/datatype/to_self.c:93:5: 
warning: ‘MPI_Type_struct’ is deprecated (declared at 
../../ompi/include/mpi.h:1579): MPI_Type_struct is superseded by 
MPI_Type_create_struct in MPI-2.0
/pub/devel/openmpi/openmpi-1.7rc7-1/src/openmpi-1.7rc7/test/datatype/to_self.c:99:5: 
warning: ‘MPI_Address’ is deprecated (declared at 
../../ompi/include/mpi.h:1057): MPI_Address is superseded by 
MPI_Get_address in MPI-2.0
/pub/devel/openmpi/openmpi-1.7rc7-1/src/openmpi-1.7rc7/test/datatype/to_self.c:100:5: 
warning: ‘MPI_Address’ is deprecated (declared at 
../../ompi/include/mpi.h:1057): MPI_Address is superseded by 
MPI_Get_address in MPI-2.0
/pub/devel/openmpi/openmpi-1.7rc7-1/src/openmpi-1.7rc7/test/datatype/to_self.c:105:5: 
warning: ‘MPI_Type_struct’ is deprecated (declared at 
../../ompi/include/mpi.h:1579): MPI_Type_struct is superseded by 
MPI_Type_create_struct in MPI-2.0
/pub/devel/openmpi/openmpi-1.7rc7-1/src/openmpi-1.7rc7/test/datatype/to_self.c: 
In function ‘create_indexed_gap_optimized_ddt’:
/pub/devel/openmpi/openmpi-1.7rc7-1/src/openmpi-1.7rc7/test/datatype/to_self.c:139:5: 
warning: ‘MPI_Type_struct’ is deprecated (declared at 
../../ompi/include/mpi.h:1579): MPI_Type_struct is superseded by 
MPI_Type_create_struct in MPI-2.0
/pub/devel/openmpi/openmpi-1.7rc7-1/src/openmpi-1.7rc7/test/datatype/to_self.c: 
In function ‘do_test_for_ddt’:
/pub/devel/openmpi/openmpi-1.7rc7-1/src/openmpi-1.7rc7/test/datatype/to_self.c:307:5: 
warning: ‘MPI_Type_extent’ is deprecated (declared at 
../../ompi/include/mpi.h:1541): MPI_Type_extent is superseded by 
MPI_Type_get_extent in MPI-2.0

  CCLD to_self.exe


Is this expected?
Or should the test be updated to the MPI-2.0 conventions?

Regards
Marco



Re: [OMPI devel] v1.7.0rc7

2013-02-25 Thread Jeff Squyres (jsquyres)
On Feb 25, 2013, at 6:30 PM, Pavel Mezentsev wrote:

> I've tried to build it but got different errors with different compilers.
> 
> With Intel (2011.5.220) and pgi (13.2) I get the following error:
> CC   bcol_iboffload_module.lo
> bcol_iboffload_module.c(37): catastrophic error: cannot open source file 
> "ompi/mca/common/netpatterns/common_netpatterns.h"
>   #include "ompi/mca/common/netpatterns/common_netpatterns.h"

This is a clear error.

Pasha?

> I failed to find that file anywhere among the sources.
> 
> With pathscale (4.0.12.1) I get the following:
>   PPFC mpi-f08-interfaces-callbacks.lo
> 
> module mpi_f08_interfaces_callbacks
>^
> pathf95-855 pathf95: ERROR MPI_F08_INTERFACES_CALLBACKS, File = 
> mpi-f08-interfaces-callbacks.F90, Line = 9, Column = 8 

I don't have access to the Pathscale compiler.  Without more detail, it's hard 
to say what's wrong here.

I've pinged my pathscale contact; perhaps he can shed some light on this...





Re: [OMPI devel] v1.7.0rc7

2013-02-25 Thread Jeff Squyres (jsquyres)
Marco -- 

Is it just these 2 patches:

r28059 [[BR]]
Patch for Cygwin support: use correct DSO/shared library prefix and
suffix.  Thanks to Marco Atzeri for reporting the issue and providing
an initial patch.

r28060 [[BR]]
Patch for Cygwin support: Use S_IRWXU for shmget() and include
.  Thanks to Marco Atzeri for reporting the issue and
providing an initial patch.
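
As an illustration of the second commit message, here is a minimal C sketch of a shmget() call that passes S_IRWXU as the permission bits. It assumes a POSIX system with System V shared memory; `create_owner_only_segment` is a hypothetical helper and not the code from r28060. POSIX declares S_IRWXU in `<sys/stat.h>`, so the sketch includes that header explicitly.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/ipc.h>
#include <sys/shm.h>
#include <sys/stat.h>   /* S_IRWXU, per POSIX */

/* Hypothetical helper: create a private System V segment that only the
 * owner may read, write, or execute, then mark it for removal.
 * Returns 0 on success, -1 if shmget() or shmctl() fails. */
int create_owner_only_segment(size_t size)
{
    assert(size > 0);

    int id = shmget(IPC_PRIVATE, size, IPC_CREAT | IPC_EXCL | S_IRWXU);
    if (id < 0) {
        return -1;
    }
    /* Remove immediately; this sketch only demonstrates the flags. */
    if (shmctl(id, IPC_RMID, NULL) < 0) {
        return -1;
    }
    return 0;
}
```

Using the symbolic constant rather than a literal octal mode keeps the call portable to platforms where the header is needed for the permission macros to be visible.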


On Feb 25, 2013, at 4:40 PM, marco atzeri wrote:

> On 2/23/2013 11:45 PM, Ralph Castain wrote:
>> This release candidate is the last one we expect to have before release, so 
>> please test it. Can be downloaded from the usual place:
>> 
>> http://www.open-mpi.org/software/ompi/v1.7/
>> 
>> Latest changes include:
>> 
>> * update of the alps/lustre configure code
>> * fixed solaris hwloc code
>> * various mxm updates
>> * removed java bindings (delayed until later release)
>> * improved the --report-bindings output
>> * a variety of minor cleanups
>> 
> 
> any reason to not include the cygwin patches added to 1.6.4 ?
> 
> Marco
> 






Re: [OMPI devel] v1.7.0rc7

2013-02-25 Thread Pavel Mezentsev
I've tried to build it but got different errors with different compilers.

With Intel (2011.5.220) and pgi (13.2) I get the following error:
CC   bcol_iboffload_module.lo
bcol_iboffload_module.c(37): catastrophic error: cannot open source file
"ompi/mca/common/netpatterns/common_netpatterns.h"
  #include "ompi/mca/common/netpatterns/common_netpatterns.h"

I failed to find that file anywhere among the sources.

With pathscale (4.0.12.1) I get the following:
  PPFC mpi-f08-interfaces-callbacks.lo

module mpi_f08_interfaces_callbacks
   ^
pathf95-855 pathf95: ERROR MPI_F08_INTERFACES_CALLBACKS, File =
mpi-f08-interfaces-callbacks.F90, Line = 9, Column = 8
  The compiler has detected errors in module
"MPI_F08_INTERFACES_CALLBACKS".  No module information file will be created
for this module.


 attribute_val_in,attribute_val_out,flag,ierror) &
  ^

pathf95-1691 pathf95: ERROR MPI_COMM_COPY_ATTR_FUNCTION, File =
mpi-f08-interfaces-callbacks.F90, Line = 66, Column = 75
  For "FLAG", LOGICAL(KIND=4) not allowed with BIND(C)


attribute_val_in,attribute_val_out,flag,ierror) &
 ^

pathf95-1691 pathf95: ERROR MPI_WIN_COPY_ATTR_FUNCTION, File =
mpi-f08-interfaces-callbacks.F90, Line = 91, Column = 74
  For "FLAG", LOGICAL(KIND=4) not allowed with BIND(C)


 attribute_val_in,attribute_val_out,flag,ierror) &
  ^

pathf95-1691 pathf95: ERROR MPI_TYPE_COPY_ATTR_FUNCTION, File =
mpi-f08-interfaces-callbacks.F90, Line = 116, Column = 75
  For "FLAG", LOGICAL(KIND=4) not allowed with BIND(C)

SUBROUTINE MPI_Grequest_cancel_function(extra_state,complete,ierror) &
^
pathf95-1691 pathf95: ERROR MPI_GREQUEST_CANCEL_FUNCTION, File =
mpi-f08-interfaces-callbacks.F90, Line = 195, Column = 53
  For "COMPLETE", LOGICAL(KIND=4) not allowed with BIND(C)

pathf95: PathScale(TM) Fortran Version 4.0.12.1 (f14) Tue Feb 26, 2013
 06:33:40
pathf95: 429 source lines
pathf95: 5 Error(s), 0 Warning(s), 0 Other message(s), 0 ANSI(s)
pathf95: "explain pathf95-message number" gives more information about each
message
make[2]: *** [mpi-f08-interfaces-callbacks.lo] Error 1
make[2]: Leaving directory
`/tmp/mpi_install_tmp21558/openmpi-1.7rc7/ompi/mpi/fortran/base'
make[1]: *** [all-recursive] Error 1
make[1]: Leaving directory `/tmp/mpi_install_tmp21558/openmpi-1.7rc7/ompi'
make: *** [all-recursive] Error 1

I am not a Fortran guy and don't really know what the problem is here.

In all cases I configured with only the compilers set through environment
variables and --prefix. I managed to build 1.6.3 with all 3 compilers
mentioned above, using the same configure lines, without any errors.

Not sure about the problem with pathscale but the first problem seems to be
a real error. Or did I miss something?

Regards, Pavel Mezentsev.




Re: [OMPI devel] v1.7.0rc7

2013-02-25 Thread Ralph Castain

On Feb 25, 2013, at 1:40 PM, marco atzeri wrote:

> On 2/23/2013 11:45 PM, Ralph Castain wrote:
>> This release candidate is the last one we expect to have before release, so 
>> please test it. Can be downloaded from the usual place:
>> 
>> http://www.open-mpi.org/software/ompi/v1.7/
>> 
>> Latest changes include:
>> 
>> * update of the alps/lustre configure code
>> * fixed solaris hwloc code
>> * various mxm updates
>> * removed java bindings (delayed until later release)
>> * improved the --report-bindings output
>> * a variety of minor cleanups
>> 
> 
> any reason to not include the cygwin patches added to 1.6.4 ?

I don't believe they were ever CMR'd for 1.7.0, so they were never moved.

> 
> Marco
> 




Re: [OMPI devel] v1.7.0rc7

2013-02-25 Thread marco atzeri

On 2/23/2013 11:45 PM, Ralph Castain wrote:
> This release candidate is the last one we expect to have before release, so 
> please test it. Can be downloaded from the usual place:
>
> http://www.open-mpi.org/software/ompi/v1.7/
>
> Latest changes include:
>
> * update of the alps/lustre configure code
> * fixed solaris hwloc code
> * various mxm updates
> * removed java bindings (delayed until later release)
> * improved the --report-bindings output
> * a variety of minor cleanups



any reason to not include the cygwin patches added to 1.6.4 ?

Marco