[OMPI users] win: cmake: release+debug

2010-12-01 Thread Hicham Mouline
Hi,

Following the instructions from Readme.windows, I've used cmake and 4 build
directories to generate release and debug win32 and x64 builds. When it came
to install, I wondered: there are 4 directories involved, bin, lib, share
and include.

Are include and share identical across the 4 configurations? If so, it'd be
good to have a cmake way to share those directories in one place. As the debug
libraries have a "d" appended to their names, they could also coexist in the same
lib directory as the release libs.

on a win64 box, I could see:
\Program Files\openmpi\bin and bin\debug: 64bit release and debug mpic++ and
co (though I don't see the benefit of debug mpic++)
\Program Files\openmpi\lib: debug and release 64bit libs
\Program Files\openmpi\include: common? include 
\Program Files\openmpi\share: common? share
\Program Files(x86)\openmpi: same as above but for 32bit

on a win32 box, 
\Program Files(x86)\openmpi: same as above but _only_ for 32bit

Is a layout like this already easily doable?

rds,



Re: [OMPI users] win: mpic++ -showme reports duplicate .libs

2010-12-01 Thread Hicham Mouline
> -Original Message-
> From: Shiqing Fan [mailto:f...@hlrs.de]
> Sent: 01 December 2010 11:29
> To: Open MPI Users
> Cc: Hicham Mouline
> Subject: Re: [OMPI users] win: mpic++ -showme reports duplicate .libs
> 
> Hi Hicham,
> 
> Thanks for noticing it. It's now been fixed on trunk.
> 
> 
> Regards,
> Shiqing
> 
> On 2010-12-1 10:02 AM, Hicham Mouline wrote:
> > Hello,
> >
> >> mpic++ -showme:link
> /TP /EHsc /link /LIBPATH:"C:/Program Files (x86)/openmpi/lib" libmpi.lib
> libopen-pal.lib libopen-rte.lib libmpi_cxx.lib libmpi.lib libopen-pal.lib
> libopen-rte.lib advapi32.lib Ws2_32.lib shlwapi.lib
> >
> > reports using the 4 mpi libs twice.
> >
> > I've followed the cmake way in README.windows.
> >
> > Is this intended or have I gone wrong somewhere?
> >
> > rds,

That was fast. I'm glad these get sorted so quickly.
This output is used by the FindMPI module in CMake which, after some emails with
its maintainer, will hopefully be extended to work on Windows as well.

regards,



Re: [OMPI users] How to avoid abort when calling MPI_Finalize without calling MPI_File_close?

2010-12-01 Thread Jeff Squyres
On Dec 1, 2010, at 10:28 AM, Rob Latham wrote:

> under openmpi, this test program fails because openmpi is trying to
> help you out.  I'm going to need some help from the openmpi folks
> here, but the backtrace makes it look like MPI_Finalize is setting the
> "no more mpi calls allowed" flag, and then goes and calls some mpi
> routines to clean up the opened files:

Rob -- I think you're right.

I'll file a ticket, but I don't know exactly when this will be addressed.  
James: if you can find a good solution and send a patch, that would be most 
appreciated.

-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/




Re: [OMPI users] SIGPIPE handling?

2010-12-01 Thread Jeff Squyres
On Dec 1, 2010, at 4:12 PM, Jesse Ziser wrote:

> Sorry, one more question: I don't completely understand the version 
> numbering, but can/will this fix go into 1.5.1 at some point?  I notice that 
> the trunk is labeled as 1.7.

Here's an explanation of our version numbering:

http://www.open-mpi.org/software/ompi/versions/

Short version is:

- v1.4: our "super stable" / mature series.  Someday it will be retired.
- v1.5: our "feature" series -- not quite as mature as the v1.4 series.  
Someday it will transition to be the next "super stable" series: v1.6.
- SVN development trunk/v1.7: what will eventually become the v1.7 series 
(i.e., our next "feature" series).

So v1.5 is an official release series.  But it's still under active development 
and having features added.  v1.4 is only having bug fixes applied to it -- it's 
in the stable/production portion of its lifespan.


> Thanks again
> 
> Jesse Ziser wrote:
>> It turned out I was using development version 1.5.0.  After going back to 
>> the release version, I found that there was another problem on my end, which 
>> had nothing to do with OpenMPI.  So thanks for the help; all is well.  (And 
>> sorry for the belated reply.)
>> Ralph Castain wrote:
>>> After digging around a little, I found that you must be using the OMPI 
>>> devel trunk as no release version contains this code. I also looked to see 
>>> why it was done, and found that the concern was with an inadvertent sigpipe 
>>> that can occur internal to OMPI due to a race condition.
>>> 
>>> So I modified the trunk a little. We will ignore the first few sigpipe 
>>> errors we get, but will then abort with an appropriate error.
>>> 
>>> HTH
>>> Ralph
>>> 
>>> On Nov 24, 2010, at 5:08 PM, Jesse Ziser wrote:
>>> 
 Hello,
 
 I've noticed that OpenMPI does not seem to detect when something 
 downstream of it fails.  Specifically, I think it does not handle SIGPIPE 
 or pass it down to its young, but it still prints an error message every 
 time it occurs.
 
 For example, running a command like this:
 
 mpirun -np 1 ./mpi-cat /dev/null
 
 (where mpi-cat is just a simple program that initializes MPI and then 
 copies its input to its output) hangs after the dd quits, and produces an 
 eternity of repetitions of this error message:
 
 [[35845,0],0] reports a SIGPIPE error on fd 13
 
 I am unsure whether this is the intended behavior, but it certainly seems 
 unfortunate from my perspective.  Is there any way to make it exit 
 nicely, preferably with a single error, whenever what it's trying to write 
 to doesn't exist anymore?  I think I could even submit a patch to make it 
 quit on SIGPIPE, if it is agreed that that makes sense.
 
 Here's the source for my mpi-cat example:
 
 #include <stdio.h>
 #include <mpi.h>
 
 int main (int iArgC, char *apArgV [])
 {
     int iRank;
 
     MPI_Init (&iArgC, &apArgV);
 
     MPI_Comm_rank (MPI_COMM_WORLD, &iRank);
 
     if (iRank == 0)
     {
         while(1)
             if(putchar(getchar()) < 0)
                 break;
     }
 
     MPI_Finalize ();
 
     return (0);
 }
 
 
 Thank you,
 
 Jesse Ziser
 Applied Research Laboratories:
 The University of Texas at Austin


-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/




Re: [OMPI users] SIGPIPE handling?

2010-12-01 Thread Ralph Castain
I can schedule it into the 1.5 series, but I don't think it will make 1.5.1 
(too close to release). Have to ask...

On Dec 1, 2010, at 2:12 PM, Jesse Ziser wrote:

> Sorry, one more question: I don't completely understand the version 
> numbering, but can/will this fix go into 1.5.1 at some point?  I notice that 
> the trunk is labeled as 1.7.
> 
> Thanks again
> 
> Jesse Ziser wrote:
>> It turned out I was using development version 1.5.0.  After going back to 
>> the release version, I found that there was another problem on my end, which 
>> had nothing to do with OpenMPI.  So thanks for the help; all is well.  (And 
>> sorry for the belated reply.)
>> Ralph Castain wrote:
>>> After digging around a little, I found that you must be using the OMPI 
>>> devel trunk as no release version contains this code. I also looked to see 
>>> why it was done, and found that the concern was with an inadvertent sigpipe 
>>> that can occur internal to OMPI due to a race condition.
>>> 
>>> So I modified the trunk a little. We will ignore the first few sigpipe 
>>> errors we get, but will then abort with an appropriate error.
>>> 
>>> HTH
>>> Ralph
>>> 
>>> On Nov 24, 2010, at 5:08 PM, Jesse Ziser wrote:
>>> 
 Hello,
 
 I've noticed that OpenMPI does not seem to detect when something 
 downstream of it fails.  Specifically, I think it does not handle SIGPIPE 
 or pass it down to its young, but it still prints an error message every 
 time it occurs.
 
 For example, running a command like this:
 
 mpirun -np 1 ./mpi-cat /dev/null
 
 (where mpi-cat is just a simple program that initializes MPI and then 
 copies its input to its output) hangs after the dd quits, and produces an 
 eternity of repetitions of this error message:
 
 [[35845,0],0] reports a SIGPIPE error on fd 13
 
 I am unsure whether this is the intended behavior, but it certainly seems 
 unfortunate from my perspective.  Is there any way to make it exit 
 nicely, preferably with a single error, whenever what it's trying to write 
 to doesn't exist anymore?  I think I could even submit a patch to make it 
 quit on SIGPIPE, if it is agreed that that makes sense.
 
 Here's the source for my mpi-cat example:
 
 #include <stdio.h>
 #include <mpi.h>
 
 int main (int iArgC, char *apArgV [])
 {
     int iRank;
 
     MPI_Init (&iArgC, &apArgV);
 
     MPI_Comm_rank (MPI_COMM_WORLD, &iRank);
 
     if (iRank == 0)
     {
         while(1)
             if(putchar(getchar()) < 0)
                 break;
     }
 
     MPI_Finalize ();
 
     return (0);
 }
 
 
 Thank you,
 
 Jesse Ziser
 Applied Research Laboratories:
 The University of Texas at Austin




Re: [OMPI users] SIGPIPE handling?

2010-12-01 Thread Jesse Ziser
Sorry, one more question: I don't completely understand the version 
numbering, but can/will this fix go into 1.5.1 at some point?  I notice 
that the trunk is labeled as 1.7.


Thanks again

Jesse Ziser wrote:
It turned out I was using development version 1.5.0.  After going back 
to the release version, I found that there was another problem on my 
end, which had nothing to do with OpenMPI.  So thanks for the help; all 
is well.  (And sorry for the belated reply.)


Ralph Castain wrote:
After digging around a little, I found that you must be using the OMPI 
devel trunk as no release version contains this code. I also looked to 
see why it was done, and found that the concern was with an 
inadvertent sigpipe that can occur internal to OMPI due to a race 
condition.


So I modified the trunk a little. We will ignore the first few sigpipe 
errors we get, but will then abort with an appropriate error.


HTH
Ralph

On Nov 24, 2010, at 5:08 PM, Jesse Ziser wrote:


Hello,

I've noticed that OpenMPI does not seem to detect when something 
downstream of it fails.  Specifically, I think it does not handle 
SIGPIPE or pass it down to its young, but it still prints an error 
message every time it occurs.


For example, running a command like this:

 mpirun -np 1 ./mpi-cat /dev/null

(where mpi-cat is just a simple program that initializes MPI and then 
copies its input to its output) hangs after the dd quits, and 
produces an eternity of repetitions of this error message:


 [[35845,0],0] reports a SIGPIPE error on fd 13

I am unsure whether this is the intended behavior, but it certainly 
 seems unfortunate from my perspective.  Is there any way to make it 
exit nicely, preferably with a single error, whenever what it's 
trying to write to doesn't exist anymore?  I think I could even 
submit a patch to make it quit on SIGPIPE, if it is agreed that that 
makes sense.


Here's the source for my mpi-cat example:

 #include <stdio.h>
 #include <mpi.h>
 
 int main (int iArgC, char *apArgV [])
 {
     int iRank;
 
     MPI_Init (&iArgC, &apArgV);
 
     MPI_Comm_rank (MPI_COMM_WORLD, &iRank);
 
     if (iRank == 0)
     {
         while(1)
             if(putchar(getchar()) < 0)
                 break;
     }
 
     MPI_Finalize ();
 
     return (0);
 }
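
For comparison, here is a sketch of a SIGPIPE-tolerant variant of mpi-cat, assuming a POSIX environment. It only changes the application's own behaviour; it does not silence the SIGPIPE reports printed by mpirun itself:

#include <stdio.h>
#include <signal.h>
#include <mpi.h>

int main (int argc, char *argv[])
{
    int rank, c;

    /* ignore SIGPIPE so a broken pipe shows up as a write error
       instead of a signal, and the copy loop can exit cleanly */
    signal (SIGPIPE, SIG_IGN);

    MPI_Init (&argc, &argv);
    MPI_Comm_rank (MPI_COMM_WORLD, &rank);

    if (rank == 0)
    {
        while ((c = getchar ()) != EOF)
            if (putchar (c) == EOF)     /* the downstream consumer went away */
                break;
    }

    MPI_Finalize ();
    return 0;
}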


Thank you,

Jesse Ziser
Applied Research Laboratories:
The University of Texas at Austin




[OMPI users] Open MPI vs IBM MPI performance help

2010-12-01 Thread Price, Brian M (N-KCI)
OpenMPI version: 1.4.3
Platform: IBM P5, 32 processors, 256 GB memory, Symmetric Multi-Threading (SMT) 
enabled
Application: starts up 48 processes and does MPI using MPI_Barrier, MPI_Get, 
MPI_Put (lots of transfers, large amounts of data)
Issue:  When implemented using Open MPI vs. IBM's MPI ('poe' from HPC Toolkit), 
the application runs 3-5 times slower.
I suspect that IBM's MPI implementation takes advantage of some knowledge about 
these data transfers that Open MPI does not.
Any suggestions?
Thanks,
Brian Price
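
For readers unfamiliar with the pattern described above, here is a generic sketch of one-sided transfers with MPI_Put between MPI_Win_fence synchronization points. It is an illustration of the MPI API only, not the poster's application (which synchronizes with MPI_Barrier); the element count and target rank are arbitrary:

#include <stdlib.h>
#include <mpi.h>

int main (int argc, char *argv[])
{
    const int n = 1 << 20;                 /* illustrative element count */
    int rank, size, i;
    double *win_buf, *src;
    MPI_Win win;

    MPI_Init (&argc, &argv);
    MPI_Comm_rank (MPI_COMM_WORLD, &rank);
    MPI_Comm_size (MPI_COMM_WORLD, &size);

    /* expose one buffer in a window, use a second one as the origin of the put */
    MPI_Alloc_mem (n * sizeof (double), MPI_INFO_NULL, &win_buf);
    src = malloc (n * sizeof (double));
    for (i = 0; i < n; ++i)
        src[i] = rank;

    MPI_Win_create (win_buf, n * sizeof (double), sizeof (double),
                    MPI_INFO_NULL, MPI_COMM_WORLD, &win);

    MPI_Win_fence (0, win);                /* open the access epoch */
    MPI_Put (src, n, MPI_DOUBLE, (rank + 1) % size, 0, n, MPI_DOUBLE, win);
    MPI_Win_fence (0, win);                /* complete all transfers */

    MPI_Win_free (&win);
    MPI_Free_mem (win_buf);
    free (src);
    MPI_Finalize ();
    return 0;
}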



Re: [OMPI users] How to avoid abort when calling MPI_Finalize without calling MPI_File_close?

2010-12-01 Thread James Overfelt
On Wed, Dec 1, 2010 at 8:28 AM, Rob Latham  wrote:
> On Mon, Nov 22, 2010 at 04:40:14PM -0700, James Overfelt wrote:
>> Hello,
>>
>>     I have a small test case where a file created with MPI_File_open
>> is still open at the time MPI_Finalize is called.  In the actual
>> program there are lots of open files and it would be nice to avoid the
>> resulting "Your MPI job will now abort." by either having MPI_Finalize
>> close the files or honor the error handler and return an error code
>> without an abort.
>>
>>   I've tried with with OpenMPI 1.4.3 and 1.5 with the same results.
>> Attached are the configure, compile and source files and the whole
>> program follows.
>
> under MPICH2, this simple test program does not abort.  You leak a lot
> of resources (e.g. info structure allocated is not freed) but it
> sounds like you are well aware of that.
>
> under openmpi, this test program fails because openmpi is trying to
> help you out.  I'm going to need some help from the openmpi folks
> here, but the backtrace makes it look like MPI_Finalize is setting the
> "no more mpi calls allowed" flag, and then goes and calls some mpi
> routines to clean up the opened files:
>
> Breakpoint 1, 0xb7f7c346 in PMPI_Barrier () from 
> /home/robl/work/soft/openmpi-1.4/lib/libmpi.so.0
> (gdb) where
> #0  0xb7f7c346 in PMPI_Barrier () from 
> /home/robl/work/soft/openmpi-1.4/lib/libmpi.so.0
> #1  0xb78a4c25 in mca_io_romio_dist_MPI_File_close () from 
> /home/robl/work/soft/openmpi-1.4/lib/openmpi/mca_io_romio.so
> #2  0xb787e8b3 in mca_io_romio_file_close () from 
> /home/robl/work/soft/openmpi-1.4/lib/openmpi/mca_io_romio.so
> #3  0xb7f591b1 in file_destructor () from 
> /home/robl/work/soft/openmpi-1.4/lib/libmpi.so.0
> #4  0xb7f58f28 in ompi_file_finalize () from 
> /home/robl/work/soft/openmpi-1.4/lib/libmpi.so.0
> #5  0xb7f67eb3 in ompi_mpi_finalize () from 
> /home/robl/work/soft/openmpi-1.4/lib/libmpi.so.0
> #6  0xb7f82828 in PMPI_Finalize () from 
> /home/robl/work/soft/openmpi-1.4/lib/libmpi.so.0
> #7  0x0804f9c2 in main (argc=1, argv=0xbfffed94) at file_error.cc:17
>
> Why is there an MPI_Barrier in the close path?  It has to do with our
> implementation of shared file pointers.  If you run this test on a file system
> that does not support shared file pointers ( PVFS, for example), you might get
> a little further.
>
> So, I think the ball is back in the OpenMPI court: they have to
> re-jigger the order of the destructors so that closing files comes a
> little earlier in the shutdown process.
>
> ==rob
>


Rob,

  Thank you, that is the answer I was hoping for:  I'm not crazy and
it should be an easy fix.  I'll look through the OpenMPI source code
and maybe suggest a fix.

jro
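
In the meantime, the user-level workaround is simply to close every MPI_File handle before calling MPI_Finalize. A minimal sketch (the file name "data.out" is hypothetical and error handling is omitted):

#include <mpi.h>

int main (int argc, char *argv[])
{
    MPI_File fh;

    MPI_Init (&argc, &argv);

    MPI_File_open (MPI_COMM_WORLD, "data.out",
                   MPI_MODE_CREATE | MPI_MODE_WRONLY, MPI_INFO_NULL, &fh);

    /* ... MPI-IO work ... */

    MPI_File_close (&fh);   /* closing every handle here avoids the abort in MPI_Finalize */
    MPI_Finalize ();
    return 0;
}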



Re: [OMPI users] Problem with MPI_Comm_accept in a dynamic client/server application

2010-12-01 Thread Kalin Kanov

Hi Shiqing,

I am using OpenMPI version 1.4.2

Here is the output of ompi_info:
 Package: Open MPI Kalin Kanov@LAZAR Distribution
Open MPI: 1.4.2
   Open MPI SVN revision: r23093
   Open MPI release date: May 04, 2010
Open RTE: 1.4.2
   Open RTE SVN revision: r23093
   Open RTE release date: May 04, 2010
OPAL: 1.4.2
   OPAL SVN revision: r23093
   OPAL release date: May 04, 2010
Ident string: 1.4.2
  Prefix: C:/Program Files/openmpi-1.4.2/installed
 Configured architecture: x86 Windows-5.2
  Configure host: LAZAR
   Configured by: Kalin Kanov
   Configured on: 18:00 04.10.2010 г.
  Configure host: LAZAR
Built by: Kalin Kanov
Built on: 18:00 04.10.2010 г.
  Built host: LAZAR
  C bindings: yes
C++ bindings: yes
  Fortran77 bindings: no
  Fortran90 bindings: no
 Fortran90 bindings size: na
  C compiler: cl
 C compiler absolute: cl
C++ compiler: cl
   C++ compiler absolute: cl
  Fortran77 compiler: CMAKE_Fortran_COMPILER-NOTFOUND
  Fortran77 compiler abs: none
  Fortran90 compiler:
  Fortran90 compiler abs: none
 C profiling: yes
   C++ profiling: yes
 Fortran77 profiling: no
 Fortran90 profiling: no
  C++ exceptions: no
  Thread support: no
   Sparse Groups: no
  Internal debug support: no
 MPI parameter check: runtime
Memory profiling support: no
Memory debugging support: no
 libltdl support: no
   Heterogeneous support: no
 mpirun default --prefix: yes
 MPI I/O support: yes
   MPI_WTIME support: gettimeofday
Symbol visibility support: yes
   FT Checkpoint support: yes  (checkpoint thread: no)
   MCA backtrace: none (MCA v2.0, API v2.0, Component v1.4.2)
   MCA paffinity: windows (MCA v2.0, API v2.0, Component v1.4.2)
   MCA carto: auto_detect (MCA v2.0, API v2.0, Component v1.4.2)
   MCA maffinity: first_use (MCA v2.0, API v2.0, Component v1.4.2)
   MCA timer: windows (MCA v2.0, API v2.0, Component v1.4.2)
 MCA installdirs: windows (MCA v2.0, API v2.0, Component v1.4.2)
 MCA installdirs: env (MCA v2.0, API v2.0, Component v1.4.2)
 MCA installdirs: config (MCA v2.0, API v2.0, Component v1.4.2)
 MCA crs: none (MCA v2.0, API v2.0, Component v1.4.2)
 MCA dpm: orte (MCA v2.0, API v2.0, Component v1.4.2)
  MCA pubsub: orte (MCA v2.0, API v2.0, Component v1.4.2)
   MCA allocator: basic (MCA v2.0, API v2.0, Component v1.4.2)
   MCA allocator: bucket (MCA v2.0, API v2.0, Component v1.4.2)
MCA coll: basic (MCA v2.0, API v2.0, Component v1.4.2)
MCA coll: hierarch (MCA v2.0, API v2.0, Component v1.4.2)
MCA coll: self (MCA v2.0, API v2.0, Component v1.4.2)
MCA coll: sm (MCA v2.0, API v2.0, Component v1.4.2)
MCA coll: sync (MCA v2.0, API v2.0, Component v1.4.2)
   MCA mpool: rdma (MCA v2.0, API v2.0, Component v1.4.2)
   MCA mpool: sm (MCA v2.0, API v2.0, Component v1.4.2)
 MCA pml: ob1 (MCA v2.0, API v2.0, Component v1.4.2)
 MCA bml: r2 (MCA v2.0, API v2.0, Component v1.4.2)
 MCA btl: self (MCA v2.0, API v2.0, Component v1.4.2)
 MCA btl: sm (MCA v2.0, API v2.0, Component v1.4.2)
 MCA btl: tcp (MCA v2.0, API v2.0, Component v1.4.2)
MCA topo: unity (MCA v2.0, API v2.0, Component v1.4.2)
 MCA osc: pt2pt (MCA v2.0, API v2.0, Component v1.4.2)
 MCA osc: rdma (MCA v2.0, API v2.0, Component v1.4.2)
 MCA iof: hnp (MCA v2.0, API v2.0, Component v1.4.2)
 MCA iof: orted (MCA v2.0, API v2.0, Component v1.4.2)
 MCA iof: tool (MCA v2.0, API v2.0, Component v1.4.2)
 MCA oob: tcp (MCA v2.0, API v2.0, Component v1.4.2)
MCA odls: process (MCA v2.0, API v2.0, Component v1.4.2)
 MCA ras: ccp (MCA v2.0, API v2.0, Component v1.4.2)
   MCA rmaps: round_robin (MCA v2.0, API v2.0, Component v1.4.2)
   MCA rmaps: seq (MCA v2.0, API v2.0, Component v1.4.2)
 MCA rml: ftrm (MCA v2.0, API v2.0, Component v1.4.2)
 MCA rml: oob (MCA v2.0, API v2.0, Component v1.4.2)
  MCA routed: binomial (MCA v2.0, API v2.0, Component v1.4.2)
  MCA routed: linear (MCA v2.0, API v2.0, Component v1.4.2)
 MCA plm: ccp (MCA v2.0, API v2.0, Component v1.4.2)
 MCA plm: process (MCA v2.0, API v2.0, Component v1.4.2)
  MCA errmgr: default (MCA v2.0, API v2.0, Component v1.4.2)
 MCA ess: env (MCA v2.0, API v2.0, Component v1.4.2)
 MCA ess: hnp (MCA 

Re: [OMPI users] How to avoid abort when calling MPI_Finalize without calling MPI_File_close?

2010-12-01 Thread Rob Latham
On Mon, Nov 22, 2010 at 04:40:14PM -0700, James Overfelt wrote:
> Hello,
> 
> I have a small test case where a file created with MPI_File_open
> is still open at the time MPI_Finalize is called.  In the actual
> program there are lots of open files and it would be nice to avoid the
> resulting "Your MPI job will now abort." by either having MPI_Finalize
> close the files or honor the error handler and return an error code
> without an abort.
> 
>   I've tried with with OpenMPI 1.4.3 and 1.5 with the same results.
> Attached are the configure, compile and source files and the whole
> program follows.

under MPICH2, this simple test program does not abort.  You leak a lot
of resources (e.g. info structure allocated is not freed) but it
sounds like you are well aware of that. 

under openmpi, this test program fails because openmpi is trying to
help you out.  I'm going to need some help from the openmpi folks
here, but the backtrace makes it look like MPI_Finalize is setting the
"no more mpi calls allowed" flag, and then goes and calls some mpi
routines to clean up the opened files:

Breakpoint 1, 0xb7f7c346 in PMPI_Barrier () from 
/home/robl/work/soft/openmpi-1.4/lib/libmpi.so.0
(gdb) where
#0  0xb7f7c346 in PMPI_Barrier () from 
/home/robl/work/soft/openmpi-1.4/lib/libmpi.so.0
#1  0xb78a4c25 in mca_io_romio_dist_MPI_File_close () from 
/home/robl/work/soft/openmpi-1.4/lib/openmpi/mca_io_romio.so
#2  0xb787e8b3 in mca_io_romio_file_close () from 
/home/robl/work/soft/openmpi-1.4/lib/openmpi/mca_io_romio.so
#3  0xb7f591b1 in file_destructor () from 
/home/robl/work/soft/openmpi-1.4/lib/libmpi.so.0
#4  0xb7f58f28 in ompi_file_finalize () from 
/home/robl/work/soft/openmpi-1.4/lib/libmpi.so.0
#5  0xb7f67eb3 in ompi_mpi_finalize () from 
/home/robl/work/soft/openmpi-1.4/lib/libmpi.so.0
#6  0xb7f82828 in PMPI_Finalize () from 
/home/robl/work/soft/openmpi-1.4/lib/libmpi.so.0
#7  0x0804f9c2 in main (argc=1, argv=0xbfffed94) at file_error.cc:17

Why is there an MPI_Barrier in the close path?  It has to do with our
implementation of shared file pointers.  If you run this test on a file system
that does not support shared file pointers ( PVFS, for example), you might get
a little further.

So, I think the ball is back in the OpenMPI court: they have to
re-jigger the order of the destructors so that closing files comes a
little earlier in the shutdown process.

==rob

-- 
Rob Latham
Mathematics and Computer Science Division
Argonne National Lab, IL USA


Re: [OMPI users] Problem with MPI_Comm_accept in a dynamic client/server application

2010-12-01 Thread Shiqing Fan

Hi Kalin,

Which version of Open MPI did you use? It seems that the ess component 
couldn't be selected. Could you please send me the output of ompi_info?



Regards,
Shiqing

On 2010-11-30 12:32 AM, Kalin Kanov wrote:

Hi Shiqing,

I must have missed your response among all the e-mails that get sent 
to the mailing list. Here are a little more details about the issues 
that I am having. My client/server programs seem to run sometimes, but 
then after a successful run I always seem to get the error that I 
included in my first post. The way that I run the programs is by 
running the server application first, which generates the port string, 
etc. I then proceed to run the client application with a new call to 
mpirun. After getting the errors that I e-mailed about I also tried to 
run ompi-clean, but the results are the following:


>ompi-clean
[Lazar:05984] [[INVALID],INVALID] ORTE_ERROR_LOG: Not found in file ..\..\orte\runtime\orte_init.c at line 125
--------------------------------------------------------------------------
It looks like orte_init failed for some reason; your parallel process is
likely to abort.  There are many reasons that a parallel process can
fail during orte_init; some of which are due to configuration or
environment problems.  This failure appears to be an internal failure;
here's some additional information (which may only be relevant to an
Open MPI developer):

  orte_ess_base_select failed
  --> Returned value Not found (-13) instead of ORTE_SUCCESS
--------------------------------------------------------------------------



Any help with this issue will be greatly appreciated.

Thank you,
Kalin


On 27.10.2010 г. 05:52, Shiqing Fan wrote:

  Hi Kalin,

Sorry for the late reply.

I checked the code and got confused. (I'm not an MPI expert.) I'm just
wondering how to start the server and client in the same mpirun command
while the client needs a hand-entered port name, which is only produced by the
server at runtime.

I found a similar program on the Internet (see attached) that works
well on my Windows. In this program, the generated port name is
sent among the processes with MPI_Send.
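
For reference, here is a minimal sketch of the dynamic client/server pattern under discussion. It is an illustration only, not the attached program: error handling is omitted and the port name is assumed to be passed out of band, e.g. copied onto the client's command line:

#include <stdio.h>
#include <mpi.h>

int main (int argc, char *argv[])
{
    char port[MPI_MAX_PORT_NAME];
    MPI_Comm inter;

    MPI_Init (&argc, &argv);

    if (argc == 1)                       /* no argument: act as the server */
    {
        MPI_Open_port (MPI_INFO_NULL, port);
        printf ("port name: %s\n", port);   /* hand this string to the client */
        fflush (stdout);
        MPI_Comm_accept (port, MPI_INFO_NULL, 0, MPI_COMM_SELF, &inter);
        MPI_Close_port (port);
    }
    else                                 /* argument given: treat it as the port name */
    {
        MPI_Comm_connect (argv[1], MPI_INFO_NULL, 0, MPI_COMM_SELF, &inter);
    }

    MPI_Comm_disconnect (&inter);
    MPI_Finalize ();
    return 0;
}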


Regards,
Shiqing


On 2010-10-13 11:09 PM, Kalin Kanov wrote:

Hi there,

I am trying to create a client/server application with OpenMPI, which
has been installed on a Windows machine by following the instructions
(with CMake) in the README.WINDOWS file in the OpenMPI distribution
(version 1.4.2). I have run other test applications that compile fine
under the Visual Studio 2008 Command Prompt. However, I get the
following errors on the server side when accepting a new client that
is trying to connect:

[Lazar:02716] [[47880,1],0] ORTE_ERROR_LOG: Not found in file ..\..\orte\mca\grpcomm\base\grpcomm_base_allgather.c at line 222
[Lazar:02716] [[47880,1],0] ORTE_ERROR_LOG: Not found in file ..\..\orte\mca\grpcomm\basic\grpcomm_basic_module.c at line 530
[Lazar:02716] [[47880,1],0] ORTE_ERROR_LOG: Not found in file ..\..\ompi\mca\dpm\orte\dpm_orte.c at line 363
[Lazar:2716] *** An error occurred in MPI_Comm_accept
[Lazar:2716] *** on communicator MPI_COMM_WORLD
[Lazar:2716] *** MPI_ERR_INTERN: internal error
[Lazar:2716] *** MPI_ERRORS_ARE_FATAL (your MPI job will now abort)
--------------------------------------------------------------------------
mpirun has exited due to process rank 0 with PID 476 on
node Lazar exiting without calling "finalize". This may
have caused other processes in the application to be
terminated by signals sent by mpirun (as reported here).
--------------------------------------------------------------------------

The server and client code is attached. I have struggled with this
problem for quite a while, so please let me know what the issue might
be. I have looked at the archives and the FAQ, and the only similar thing
I have found had to do with different versions of OpenMPI being
installed, but I only have one version, and I believe it is the one
being used.

Thank you,
Kalin





--
--
Shiqing Fan                  http://www.hlrs.de/people/fan
High Performance Computing   Tel.: +49 711 685 87234
  Center Stuttgart (HLRS)    Fax.: +49 711 685 65832
Address: Allmandring 30      email: f...@hlrs.de
70569 Stuttgart






--
--
Shiqing Fan  http://www.hlrs.de/people/fan
High Performance Computing   Tel.: +49 711 685 87234
  Center Stuttgart (HLRS)    Fax.: +49 711 685 65832
Address: Allmandring 30      email: f...@hlrs.de
70569 Stuttgart



Re: [OMPI users] win: mpic++ -showme reports duplicate .libs

2010-12-01 Thread Shiqing Fan

Hi Hicham,

Thanks for noticing it. It's now been fixed on trunk.


Regards,
Shiqing

On 2010-12-1 10:02 AM, Hicham Mouline wrote:

Hello,


mpic++ -showme:link

/TP /EHsc /link /LIBPATH:"C:/Program Files (x86)/openmpi/lib" libmpi.lib
libopen-pal.lib libopen-rte.lib libmpi_cxx.lib libmpi.lib libopen-pal.lib
libopen-rte.lib advapi32.lib Ws2_32.lib shlwapi.lib

reports using the 4 mpi libs twice.

I've followed the cmake way in README.windows.

Is this intended or have I gone wrong somewhere?

rds,





--
--
Shiqing Fan  http://www.hlrs.de/people/fan
High Performance Computing   Tel.: +49 711 685 87234
  Center Stuttgart (HLRS)    Fax.: +49 711 685 65832
Address: Allmandring 30      email: f...@hlrs.de
70569 Stuttgart



Re: [OMPI users] failure to launch MPMD program on win32 w 1.4.3

2010-12-01 Thread Shiqing Fan

Hi Hicham,

I've had this issue with -np 3 : -np 3 but not with -np 2: -np 2 or 
-np 1: -np 4 or other combinations.
I've also rebuilt from vs2008 with the libs advapi32.lib Ws2_32.lib 
shlwapi.lib as visible in the text file: 
share\openmpi\mpic++.exe-wrapper-data.txt, and the problem seemed to 
stop happening.


so now it is working.
Great! But I don't see the cause of the problem. If it's missing the 
linking libraries, the compiler should already complain at linking time.


I assume I will be able to do this on several windows boxes? Do they 
need to be all 32bit or 64bit or can I mix?
Yes, you can mix 32 and 64 bit, but you have to take care of the 
executables on each machine. And for running on multiple windows boxes, 
please refer to the windows readme file. In order to simplify the WMI 
configuration process, you may also use the small tool I attached for 
configure users (change the file extension to .exe):


Syntax: wmi-config <add|del> <domain\user> [<domain\user>] ...

For example:
   wmi-config add LOCAL_COMPUTER\user
   wmi-config add DOMAIN1\user1 DOMAIN2\user2
   wmi-config del DOMAIN1\user1


Regards,
Shiqing

--
--
Shiqing Fan  http://www.hlrs.de/people/fan
High Performance Computing   Tel.: +49 711 685 87234
  Center Stuttgart (HLRS)    Fax.: +49 711 685 65832
Address: Allmandring 30      email: f...@hlrs.de
70569 Stuttgart



wmi-config.ex_
Description: Binary data


Re: [OMPI users] failure to launch MPMD program on win32 w 1.4.3

2010-12-01 Thread Hicham Mouline
> -Original Message-
> From: Shiqing Fan [mailto:f...@hlrs.de]
> Sent: 30 November 2010 23:39
> To: Open MPI Users
> Cc: Hicham Mouline; Rainer Keller
> Subject: Re: [OMPI users] failure to launch MPMD program on win32 w
> 1.4.3
> 
> Hi,
> 
> I don't have boost on my Windows, so I made a very similar program just
> using MPI, and everything works just fine for me:
> 
> D:\work\OpenMPI\tests\CXX>more hello.cpp
> 
> # include "mpi.h"
> 
> using namespace std;
> 
> int main ( int argc, char *argv[] )
> {
>int rank, size;
> 
>MPI::Init ( argc, argv );
>size = MPI::COMM_WORLD.Get_size ( );
>rank = MPI::COMM_WORLD.Get_rank ( );
> 
>printf("Process # %d \n", rank);
> 
>MPI::Finalize ( );
>return 0;
> }
> 
> 
> D:\work\OpenMPI\tests\CXX>mpirun -np 3 hello.exe : -np 3 hello.exe
> Process # 2
> Process # 4
> Process # 0
> Process # 3
> Process # 5
> Process # 1
> 
> 
> May be something related to boost?
> 
> 
> Regards,
> Shiqing
> 
I've had this issue with -np 3 : -np 3 but not with -np 2: -np 2 or -np 1: -np 
4 or other combinations.
I've also rebuilt from vs2008 with the libs advapi32.lib Ws2_32.lib shlwapi.lib 
as visible in the text file: share\openmpi\mpic++.exe-wrapper-data.txt, and the 
problem seemed to stop happening.

so now it is working.

I assume I will be able to do this on several windows boxes? Do they need to be 
all 32bit or 64bit or can I mix?

regards,