Re: [OMPI devel] v1.7.0rc7

2013-02-25 Thread marco atzeri

On 2/26/2013 3:37 AM, Jeff Squyres (jsquyres) wrote:

> Marco --
>
> Is it just these 2 patches:
>
> r28059 [[BR]]
> Patch for Cygwin support: use correct DSO/shared library prefix and
> suffix.  Thanks to Marco Atzeri for reporting the issue and providing
> an initial patch.
>
> r28060 [[BR]]
> Patch for Cygwin support: Use S_IRWXU for shmget() and include
> .  Thanks to Marco Atzeri for reporting the issue and
> providing an initial patch.




plus the additional ones

   ERROR.patch : ERROR is already defined, so another label
                 is needed for "goto ERROR"
   interface.patch : same for interface, which seems to be
                 a struct somewhere else
   min.patch : min is already defined as a macro
                 (I also saw a MIN macro defined in the source)
   mpifh.patch : to avoid an undefined error,
                 libmpi_mpifh_la_LIBADD needs
                 $(top_builddir)/ompi/libmpi.la
   winsock.patch : defense against  usage

attached here
http://www.open-mpi.org/community/lists/devel/2012/12/11855.php
https://svn.open-mpi.org/trac/ompi/ticket/3437
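
The name clashes described in the patch list above can be reproduced in a few lines. The sketch below is only an illustration, not the content of the attached patches: the two #defines stand in for whatever the system headers define on Cygwin, and copy_bounded, smaller_of and the cleanup label are hypothetical names chosen for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Stand-ins for definitions that a platform header might bring in.
 * Assumption: on Cygwin the real ones come from system headers. */
#define ERROR 0
#define min(a, b) (((a) < (b)) ? (a) : (b))

/* With the macros above in scope, neither of these would compile:
 *     goto ERROR;                          becomes "goto 0;"
 *     static size_t min(size_t a, size_t b) { ... }
 *                                          the macro mangles the definition
 * Renaming the label and the helper sidesteps both clashes. */

/* Hypothetical helper, named so that it cannot collide with min(). */
static size_t smaller_of(size_t a, size_t b)
{
    return (a < b) ? a : b;
}

/* Copy at most dst_len - 1 bytes of src into dst and NUL-terminate it.
 * Returns the number of bytes copied, or -1 on invalid arguments. */
int copy_bounded(char *dst, size_t dst_len, const char *src, size_t src_len)
{
    int rc = -1;
    size_t n;

    if (NULL == dst || NULL == src || 0 == dst_len) {
        goto cleanup;               /* label renamed from ERROR */
    }
    n = smaller_of(dst_len - 1, src_len);
    assert(n < dst_len);            /* room is left for the terminator */
    memcpy(dst, src, n);
    dst[n] = '\0';
    rc = (int) n;

cleanup:
    return rc;
}
```

With a 4-byte buffer and the 6-character source "abcdef", copy_bounded() stores "abc" and returns 3.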

All tests passed, but I noticed these warnings:

  CC   to_self.o
/pub/devel/openmpi/openmpi-1.7rc7-1/src/openmpi-1.7rc7/test/datatype/to_self.c: 
In function ‘create_indexed_constant_gap_ddt’:
/pub/devel/openmpi/openmpi-1.7rc7-1/src/openmpi-1.7rc7/test/datatype/to_self.c:48:5: 
warning: ‘MPI_Type_struct’ is deprecated (declared at 
../../ompi/include/mpi.h:1579): MPI_Type_struct is superseded by 
MPI_Type_create_struct in MPI-2.0
/pub/devel/openmpi/openmpi-1.7rc7-1/src/openmpi-1.7rc7/test/datatype/to_self.c: 
In function ‘create_indexed_gap_ddt’:
/pub/devel/openmpi/openmpi-1.7rc7-1/src/openmpi-1.7rc7/test/datatype/to_self.c:89:5: 
warning: ‘MPI_Address’ is deprecated (declared at 
../../ompi/include/mpi.h:1057): MPI_Address is superseded by 
MPI_Get_address in MPI-2.0
/pub/devel/openmpi/openmpi-1.7rc7-1/src/openmpi-1.7rc7/test/datatype/to_self.c:90:5: 
warning: ‘MPI_Address’ is deprecated (declared at 
../../ompi/include/mpi.h:1057): MPI_Address is superseded by 
MPI_Get_address in MPI-2.0
/pub/devel/openmpi/openmpi-1.7rc7-1/src/openmpi-1.7rc7/test/datatype/to_self.c:93:5: 
warning: ‘MPI_Type_struct’ is deprecated (declared at 
../../ompi/include/mpi.h:1579): MPI_Type_struct is superseded by 
MPI_Type_create_struct in MPI-2.0
/pub/devel/openmpi/openmpi-1.7rc7-1/src/openmpi-1.7rc7/test/datatype/to_self.c:99:5: 
warning: ‘MPI_Address’ is deprecated (declared at 
../../ompi/include/mpi.h:1057): MPI_Address is superseded by 
MPI_Get_address in MPI-2.0
/pub/devel/openmpi/openmpi-1.7rc7-1/src/openmpi-1.7rc7/test/datatype/to_self.c:100:5: 
warning: ‘MPI_Address’ is deprecated (declared at 
../../ompi/include/mpi.h:1057): MPI_Address is superseded by 
MPI_Get_address in MPI-2.0
/pub/devel/openmpi/openmpi-1.7rc7-1/src/openmpi-1.7rc7/test/datatype/to_self.c:105:5: 
warning: ‘MPI_Type_struct’ is deprecated (declared at 
../../ompi/include/mpi.h:1579): MPI_Type_struct is superseded by 
MPI_Type_create_struct in MPI-2.0
/pub/devel/openmpi/openmpi-1.7rc7-1/src/openmpi-1.7rc7/test/datatype/to_self.c: 
In function ‘create_indexed_gap_optimized_ddt’:
/pub/devel/openmpi/openmpi-1.7rc7-1/src/openmpi-1.7rc7/test/datatype/to_self.c:139:5: 
warning: ‘MPI_Type_struct’ is deprecated (declared at 
../../ompi/include/mpi.h:1579): MPI_Type_struct is superseded by 
MPI_Type_create_struct in MPI-2.0
/pub/devel/openmpi/openmpi-1.7rc7-1/src/openmpi-1.7rc7/test/datatype/to_self.c: 
In function ‘do_test_for_ddt’:
/pub/devel/openmpi/openmpi-1.7rc7-1/src/openmpi-1.7rc7/test/datatype/to_self.c:307:5: 
warning: ‘MPI_Type_extent’ is deprecated (declared at 
../../ompi/include/mpi.h:1541): MPI_Type_extent is superseded by 
MPI_Type_get_extent in MPI-2.0

  CCLD to_self.exe


Is this expected?
Or should the test be updated to the MPI-2.0 conventions?

Regards
Marco



Re: [OMPI devel] v1.7.0rc7

2013-02-25 Thread Jeff Squyres (jsquyres)
On Feb 25, 2013, at 6:30 PM, Pavel Mezentsev wrote:

> I've tried to build it but got different errors with different compilers.
> 
> With Intel (2011.5.220) and pgi (13.2) I get the following error:
> CC   bcol_iboffload_module.lo
> bcol_iboffload_module.c(37): catastrophic error: cannot open source file 
> "ompi/mca/common/netpatterns/common_netpatterns.h"
>   #include "ompi/mca/common/netpatterns/common_netpatterns.h"

This is a clear error.

Pasha?

> I failed to find that file anywhere among the sources.
> 
> With pathscale (4.0.12.1) I get the following:
>   PPFC mpi-f08-interfaces-callbacks.lo
> 
> module mpi_f08_interfaces_callbacks
>^
> pathf95-855 pathf95: ERROR MPI_F08_INTERFACES_CALLBACKS, File = 
> mpi-f08-interfaces-callbacks.F90, Line = 9, Column = 8 

I don't have access to the Pathscale compiler.  Without more detail, it's hard 
to say what's wrong here.

I've pinged my pathscale contact; perhaps he can shed some light on this...

-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to: 
http://www.cisco.com/web/about/doing_business/legal/cri/




Re: [OMPI devel] v1.7.0rc7

2013-02-25 Thread Jeff Squyres (jsquyres)
Marco -- 

Is it just these 2 patches:

r28059 [[BR]]
Patch for Cygwin support: use correct DSO/shared library prefix and
suffix.  Thanks to Marco Atzeri for reporting the issue and providing
an initial patch.

r28060 [[BR]]
Patch for Cygwin support: Use S_IRWXU for shmget() and include
.  Thanks to Marco Atzeri for reporting the issue and
providing an initial patch.
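
For readers unfamiliar with the second change, here is a hedged sketch of what passing S_IRWXU to shmget() looks like. It is not the r28060 patch itself: shm_roundtrip is a hypothetical helper, and the sketch assumes a system where System V shared memory is available. S_IRWXU is the POSIX name for the owner read/write/execute permission bits and is declared in <sys/stat.h>.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <string.h>
#include <sys/ipc.h>
#include <sys/shm.h>
#include <sys/stat.h>   /* S_IRWXU */

/* Hypothetical helper: create a private segment of `size` bytes that only
 * the owner may access, write `msg` into it, read it back, and remove the
 * segment again. Returns 0 on success and -1 on any failure. */
int shm_roundtrip(size_t size, const char *msg)
{
    int rc = -1;
    int id;
    void *seg;

    assert(NULL != msg);
    if (strlen(msg) + 1 > size) {
        return -1;                      /* message would not fit */
    }

    /* Symbolic permission bits instead of a hard-coded octal mode. */
    id = shmget(IPC_PRIVATE, size, IPC_CREAT | IPC_EXCL | S_IRWXU);
    if (id < 0) {
        return -1;
    }

    seg = shmat(id, NULL, 0);
    if ((void *) -1 != seg) {
        strcpy((char *) seg, msg);
        rc = (0 == strcmp((const char *) seg, msg)) ? 0 : -1;
        shmdt(seg);
    }

    shmctl(id, IPC_RMID, NULL);         /* always remove the segment */
    return rc;
}
```

Calling shm_roundtrip(4096, "hello") is expected to return 0 where shmget() is usable; a failure there points at the platform's shared-memory support rather than at the permission flags.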


On Feb 25, 2013, at 4:40 PM, marco atzeri wrote:

> On 2/23/2013 11:45 PM, Ralph Castain wrote:
>> This release candidate is the last one we expect to have before release, so 
>> please test it. Can be downloaded from the usual place:
>> 
>> http://www.open-mpi.org/software/ompi/v1.7/
>> 
>> Latest changes include:
>> 
>> * update of the alps/lustre configure code
>> * fixed solaris hwloc code
>> * various mxm updates
>> * removed java bindings (delayed until later release)
>> * improved the --report-bindings output
>> * a variety of minor cleanups
>> 
> 
> any reason to not include the cygwin patches added to 1.6.4 ?
> 
> Marco
> 
> ___
> devel mailing list
> de...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/devel


-- 
Jeff Squyres
jsquy...@cisco.com




Re: [OMPI devel] v1.7.0rc7

2013-02-25 Thread Pavel Mezentsev
I've tried to build it but got different errors with different compilers.

With Intel (2011.5.220) and pgi (13.2) I get the following error:
CC   bcol_iboffload_module.lo
bcol_iboffload_module.c(37): catastrophic error: cannot open source file
"ompi/mca/common/netpatterns/common_netpatterns.h"
  #include "ompi/mca/common/netpatterns/common_netpatterns.h"

I failed to find that file anywhere among the sources.

With pathscale (4.0.12.1) I get the following:
  PPFC mpi-f08-interfaces-callbacks.lo

module mpi_f08_interfaces_callbacks
   ^
pathf95-855 pathf95: ERROR MPI_F08_INTERFACES_CALLBACKS, File =
mpi-f08-interfaces-callbacks.F90, Line = 9, Column = 8
  The compiler has detected errors in module
"MPI_F08_INTERFACES_CALLBACKS".  No module information file will be created
for this module.


 attribute_val_in,attribute_val_out,flag,ierror) &
  ^

pathf95-1691 pathf95: ERROR MPI_COMM_COPY_ATTR_FUNCTION, File =
mpi-f08-interfaces-callbacks.F90, Line = 66, Column = 75
  For "FLAG", LOGICAL(KIND=4) not allowed with BIND(C)


attribute_val_in,attribute_val_out,flag,ierror) &
 ^

pathf95-1691 pathf95: ERROR MPI_WIN_COPY_ATTR_FUNCTION, File =
mpi-f08-interfaces-callbacks.F90, Line = 91, Column = 74
  For "FLAG", LOGICAL(KIND=4) not allowed with BIND(C)


 attribute_val_in,attribute_val_out,flag,ierror) &
  ^

pathf95-1691 pathf95: ERROR MPI_TYPE_COPY_ATTR_FUNCTION, File =
mpi-f08-interfaces-callbacks.F90, Line = 116, Column = 75
  For "FLAG", LOGICAL(KIND=4) not allowed with BIND(C)

SUBROUTINE MPI_Grequest_cancel_function(extra_state,complete,ierror) &
^
pathf95-1691 pathf95: ERROR MPI_GREQUEST_CANCEL_FUNCTION, File =
mpi-f08-interfaces-callbacks.F90, Line = 195, Column = 53
  For "COMPLETE", LOGICAL(KIND=4) not allowed with BIND(C)

pathf95: PathScale(TM) Fortran Version 4.0.12.1 (f14) Tue Feb 26, 2013
 06:33:40
pathf95: 429 source lines
pathf95: 5 Error(s), 0 Warning(s), 0 Other message(s), 0 ANSI(s)
pathf95: "explain pathf95-message number" gives more information about each
message
make[2]: *** [mpi-f08-interfaces-callbacks.lo] Error 1
make[2]: Leaving directory
`/tmp/mpi_install_tmp21558/openmpi-1.7rc7/ompi/mpi/fortran/base'
make[1]: *** [all-recursive] Error 1
make[1]: Leaving directory `/tmp/mpi_install_tmp21558/openmpi-1.7rc7/ompi'
make: *** [all-recursive] Error 1

I am not a Fortran person and don't really know what the problem is here.

In all cases I configured with only the compilers set through environment
variables and with --prefix. I managed to build 1.6.3 using all 3 mentioned
compilers with the same configure lines without any errors.

Not sure about the problem with pathscale but the first problem seems to be
a real error. Or did I miss something?

Regards, Pavel Mezentsev.


2013/2/26 Ralph Castain 

>
> On Feb 25, 2013, at 1:40 PM, marco atzeri  wrote:
>
> > On 2/23/2013 11:45 PM, Ralph Castain wrote:
> >> This release candidate is the last one we expect to have before
> release, so please test it. Can be downloaded from the usual place:
> >>
> >> http://www.open-mpi.org/software/ompi/v1.7/
> >>
> >> Latest changes include:
> >>
> >> * update of the alps/lustre configure code
> >> * fixed solaris hwloc code
> >> * various mxm updates
> >> * removed java bindings (delayed until later release)
> >> * improved the --report-bindings output
> >> * a variety of minor cleanups
> >>
> >
> > any reason to not include the cygwin patches added to 1.6.4 ?
>
> I don't believe they were ever CMR'd for 1.7.0, so they were never moved
>
> >
> > Marco
> >


Re: [OMPI devel] v1.7.0rc7

2013-02-25 Thread Ralph Castain

On Feb 25, 2013, at 1:40 PM, marco atzeri wrote:

> On 2/23/2013 11:45 PM, Ralph Castain wrote:
>> This release candidate is the last one we expect to have before release, so 
>> please test it. Can be downloaded from the usual place:
>> 
>> http://www.open-mpi.org/software/ompi/v1.7/
>> 
>> Latest changes include:
>> 
>> * update of the alps/lustre configure code
>> * fixed solaris hwloc code
>> * various mxm updates
>> * removed java bindings (delayed until later release)
>> * improved the --report-bindings output
>> * a variety of minor cleanups
>> 
> 
> any reason to not include the cygwin patches added to 1.6.4 ?

I don't believe they were ever CMR'd for 1.7.0, so they were never moved over.

> 
> Marco
> 




Re: [OMPI devel] v1.7.0rc7

2013-02-25 Thread marco atzeri

On 2/23/2013 11:45 PM, Ralph Castain wrote:

> This release candidate is the last one we expect to have before release, so
> please test it. Can be downloaded from the usual place:
>
> http://www.open-mpi.org/software/ompi/v1.7/
>
> Latest changes include:
>
> * update of the alps/lustre configure code
> * fixed solaris hwloc code
> * various mxm updates
> * removed java bindings (delayed until later release)
> * improved the --report-bindings output
> * a variety of minor cleanups



Any reason not to include the Cygwin patches added to 1.6.4?

Marco



Re: [OMPI devel] RFC: orte_db moving to opal

2013-02-25 Thread Ralph Castain
Hi folks

I have completed the move of the orte db framework to opal. You can see the 
changes here:

https://bitbucket.org/rhc/ompi-trunk

Unless someone has an objection, I'll commit this on Wed (2/27) of this week.
Ralph


On Feb 16, 2013, at 12:25 PM, Ralph Castain wrote:

> Hi folks
> 
> We had a design meeting last week on moving the BTLs to the OPAL layer. One 
> of the requirements for doing so is that we move the ORTE db framework down 
> to OPAL so it can support the revised modex. I'll be committing that change 
> during the next week.
> 
> Shouldn't impact anyone - I suspect I'm the only one with "off-trunk" 
> components for that framework - so this is just a "heads-up" to the change.
> 
> Please holler if you have any concerns.
> Ralph
> 




Re: [MTT devel] fix zombie commit

2013-02-25 Thread Jeff Squyres (jsquyres)
On Feb 24, 2013, at 6:59 AM, Mike Dubman wrote:

> What protection do you mean? Check that /proc/pid/status exists? It is done 
> in Grep()

Ah, excellent -- I hadn't noticed that.

> We observe that process which was launched by mtt and hangs (mtt detect 
> timeout and starts do_command procedure), later enters into "defunct" state.

Looking at the code, you're checking for zombie status before MTT kills the 
proc.  Am I reading that right?

If so, then it could well be that the process has exited but not yet been 
reaped (because _kill_proc() hasn't been invoked yet).  If this is the case, is 
the real cause of the problem that the OUTread and ERRread aren't being closed 
when the child process exits, and therefore we keep looping looking for new 
output from them?
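
The window described above, where a child has exited but has not yet been reaped, can be observed directly on Linux. What follows is a minimal C sketch and not MTT code: proc_state and zombie_until_reaped are hypothetical helpers, and the sketch assumes a mounted /proc and default SIGCHLD handling in the parent.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Return the one-letter state from /proc/<pid>/status ('R', 'S', 'Z', ...),
 * or 0 if the entry cannot be read (process gone or /proc unavailable). */
char proc_state(pid_t pid)
{
    char path[64], line[256], state = 0;
    FILE *fp;

    assert(pid > 0);
    snprintf(path, sizeof path, "/proc/%ld/status", (long) pid);
    fp = fopen(path, "r");
    if (NULL == fp) {
        return 0;
    }
    while (fgets(line, sizeof line, fp) != NULL) {
        /* The line looks like "State:\tZ (zombie)". */
        if (sscanf(line, "State: %c", &state) == 1) {
            break;
        }
    }
    fclose(fp);
    return state;
}

/* Fork a child that exits at once and watch its state before and after
 * waitpid(). Returns 1 if the child showed up as 'Z' before being reaped
 * and had no /proc entry afterwards, 0 otherwise, -1 if fork() failed. */
int zombie_until_reaped(void)
{
    struct timespec pause = { 0, 10 * 1000 * 1000 };   /* 10 ms */
    int tries, saw_zombie = 0;
    pid_t pid = fork();

    if (pid < 0) {
        return -1;
    }
    if (0 == pid) {
        _exit(0);                   /* child: exit immediately */
    }

    /* Parent: poll for up to about 2 seconds without reaping. */
    for (tries = 0; tries < 200 && !saw_zombie; ++tries) {
        saw_zombie = ('Z' == proc_state(pid));
        if (!saw_zombie) {
            nanosleep(&pause, NULL);
        }
    }

    waitpid(pid, NULL, 0);          /* reaping removes the zombie entry */
    return (saw_zombie && 0 == proc_state(pid)) ? 1 : 0;
}
```

A check for zombie status made before the parent calls waitpid() would therefore report a zombie for any child that has already exited, which is consistent with the reading of the code given above.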

-- 
Jeff Squyres
jsquy...@cisco.com