[OMPI users] EuroPVM/MPI 2008 First Call for Papers

2007-12-18 Thread George Bosilca

Please accept our apologies for multiple copies.

Begin forwarded message:


EuroPVM/MPI 2008 CALL FOR PAPERS

15th European PVM/MPI Users' Group Meeting
Dublin, Ireland, September 7 - 10, 2008
web: http://pvmmpi08.ucd.ie
organized by UCD School of Computer Science and Informatics

BACKGROUND AND TOPICS

PVM (Parallel Virtual Machine) and MPI (Message Passing Interface)
have evolved into the standard interfaces for high-performance parallel
programming in the message-passing paradigm. EuroPVM/MPI is the most
prominent meeting dedicated to the latest developments of PVM and MPI,
their use, including support tools, and implementation, and to
applications using these interfaces. The EuroPVM/MPI meeting naturally
encourages discussions of new message-passing and other parallel and
distributed programming paradigms beyond MPI and PVM.

The 15th European PVM/MPI Users' Group Meeting will be a forum for users
and developers of PVM, MPI, and other message-passing programming
environments. Through the presentation of contributed papers, vendor
presentations, poster presentations and invited talks, attendees will
have the opportunity to share ideas and experiences to contribute to the
improvement and furthering of message-passing and related parallel
programming paradigms.

Topics of interest for the meeting include, but are not limited to:

   * PVM and MPI implementation issues and improvements
   * Latest extensions to PVM and MPI
   * PVM and MPI for high-performance computing, clusters and grid
     environments
   * New message-passing and hybrid parallel programming paradigms
   * Interaction between message-passing software and hardware
   * Fault tolerance in message-passing programs
   * Performance evaluation of PVM and MPI applications
   * Tools and environments for PVM and MPI
   * Algorithms using the message-passing paradigm
   * Applications in science and engineering based on message-passing

Submissions on applications demonstrating both the potential and
shortcomings of MPI and PVM are particularly welcome.

As in the previous years, the special session 'ParSim' will focus on
numerical simulation for parallel engineering environments. EuroPVM/MPI 2008
will also hold the 'Outstanding Papers' session, where the best papers
selected by the program committee will be presented.


SUBMISSION INFORMATION

Contributors are invited to submit a full paper as a PDF document not
exceeding 8 pages in English (2 pages for poster abstracts). The title
page should contain an abstract of at most 100 words and five specific
keywords. The paper must be formatted according to the Springer LNCS
guidelines (http://www.springer.de/comp/lncs/authors.html). The use of
LaTeX for the preparation of the contribution, as well as submission in
camera-ready format, is strongly recommended. Style files can be found at
http://www.springer.de/comp/lncs/authors.html. New work that is not yet
mature enough for a full paper, short observations, and similar brief
announcements are invited for the poster session. Contributions to the
poster session should be submitted in the form of a two-page abstract.
All these contributions will be fully peer reviewed by the program
committee.
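
For contributors new to the format, a minimal LNCS skeleton looks roughly
like this (a sketch only; check it against the Springer style files linked
above, which are authoritative):

\documentclass{llncs}
\begin{document}
\title{Contribution Title}
\author{First Author \and Second Author}
\institute{Affiliation, City, Country}
\maketitle
\begin{abstract}
At most 100 words summarizing the contribution.
\keywords{five specific keywords}
\end{abstract}
% body: at most 8 pages (2 pages for poster abstracts)
\section{Introduction}
\end{document}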

Submissions to the special session 'Current Trends in Numerical
Simulation for Parallel Engineering Environments' (ParSim 2008) are
handled and reviewed by the respective session chairs. For more
information please refer to the ParSim website
(http://www.lrr.in.tum.de/Par/arch/events/parsim08/).

All accepted submissions are expected to be presented at the conference
by one of the authors, which requires registration for the conference.

IMPORTANT DATES

EuroPVM/MPI Conference
  Submission of full papers and poster abstracts   April 6th, 2008
  Notification of authors                          May 9th, 2008
  Camera-ready papers                              June 7th, 2008
  Tutorials                                        September 7th, 2008
  Conference                                       September 8th-10th, 2008

For up-to-date information, visit the conference web site at
http://pvmmpi08.ucd.ie.

ParSim Session
  Submission of papers                             May 5th, 2008
  Notification to authors                          May 26th, 2008
  Camera-ready papers                              June 7th, 2008


PROCEEDINGS

The conference proceedings consisting of abstracts of invited talks,
full papers, and two page abstracts for the posters will be  
published by

Springer in the LNCS series. Authors are strongly encouraged to read
carefully the recommendations for publications to facilitate the
publication procedure.

In addition, selected papers of the conference, including those from  
the

'Outstanding Papers' session, will be considered for publication in a
special issue of a journal in an extended format.

GENERAL CHAIR
Jack Dongarra (University of Tennessee) 

PROGRAM CHAIRS
Alexey Lastovetsky (University College Dublin)  
Tahar Kechadi (University College Dublin)   


Re: [OMPI users] Gigabit ethernet (PCI Express) and openmpi v1.2.4

2007-12-18 Thread Allan Menezes

Hi Allan,

This suggests that your chipset is not able to handle the full PCI-E
speed on more than 3 ports. This usually depends on the way the PCI-E
links are wired through the ports and on the capacity of the chipset
itself. As an example, we were never able to reach full-speed
performance with Myrinet 10G on IBM e325 nodes because of chipset
limitations; we had to have the nodes changed to solve the issue.
Running several instances of NPtcp should show the bandwidth
limit of the PCI-E bus on your machine.


Aurelien
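
A minimal sketch of such a simultaneous run, using the address layout Allan
gives below (an illustration, not from the thread; whether NPtcp can select
its listen port varies by NetPIPE version -- the -p PORT flag here is a
stand-in, check ./NPtcp's usage output for the real option):

on a2, start one receiver per NIC, each on its own port:
  ./NPtcp -p 5001 &
  ./NPtcp -p 5002 &
  ./NPtcp -p 5003 &

on a1, start one transmitter per NIC in parallel and wait for all three:
  ./NPtcp -h 192.168.1.2 -p 5001 -n 50 > eth0.out &
  ./NPtcp -h 192.168.5.2 -p 5002 -n 50 > eth2.out &
  ./NPtcp -h 192.168.8.2 -p 5003 -n 50 > eth3.out &
  wait

Summing the three per-interface results approximates the aggregate PCI-E
bandwidth.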
Hi Aurelien or anybody else,
How do you run several instances of NPtcp simultaneously between two
identical nodes a1, a2 through 3 similar gigabit ethernet cards with
different subnets and switches?
a1: eth0 - 192.168.1.1 eth2 - 192.168.5.1 eth3 - 192.168.8.1
a2: eth0 - 192.168.1.2 eth2 - 192.168.5.2 eth3 - 192.168.8.2
This is the way I do it currently:
on a1: ./NPtcp
ssh a2
a2> ./NPtcp -h 192.168.1.1 -n 50
for the TCP bandwidth of eth0, and likewise for eth2 and eth3.
I do not know how to do it simultaneously, i.e. check the total bandwidth
of eth0+eth2+eth3 at the same time in one invocation of ./NPtcp.
I can do it with mpirun and NPmpi.
Can someone please tell me how to do it with NPtcp, because I do not know.

Regards,
Allan Menezes




Re: [OMPI users] Fwd: R npRmpi

2007-12-18 Thread Dirk Eddelbuettel

On 18 December 2007 at 16:08, Randy Heiland wrote:
| The pkg in question is here:  http://www.stats.uwo.ca/faculty/yu/Rmpi/
| 
| The question is:  has anyone on this list got OpenMPI working for  
| this pkg?  Any suggestions?

Yes -- I happen to maintain GNU R, a number of R packages (eg r-cran-*) and
more for Debian and am also part of Debian's Open MPI maintainer group. I
also use Rmpi at work.

Dr Yu and I sorted out all relevant issues a few weeks ago and the most
current Rmpi (ie 0.5-5) works out of the box on Debian and Ubuntu, and is
current in Debian.  It should "just work" on any other recent Linux and Unix
distro.  If not please report back what configure reports and where it fails.
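
For anyone building from source instead, an install along these lines
should work (a sketch only; the --with-Rmpi-* configure arguments and the
paths shown are assumptions to be checked against the Rmpi page above):

  R CMD INSTALL \
    --configure-args="--with-Rmpi-type=OPENMPI \
      --with-Rmpi-include=/usr/include/openmpi \
      --with-Rmpi-libpath=/usr/lib/openmpi/lib" \
    Rmpi_0.5-5.tar.gz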

[ As an aside, we do have a current bug in Debian unstable with Open MPI as
we're trying to make the transition between LAM, MPICH and Open MPI more
bullet-proof. If you use just Open MPI you should already be fine. ]

Greetings from Chicago,  Dirk

| 
| thanks, Randy
| 
| 
| Begin forwarded message:
| 
| >
| > Subject: R npRmpi
| >
| > Been looking into the npRmpi problem
| >
| > I can get a segfault executing
| >> mpi.spawn.Rslaves()
| >
| > The documentation .html files under npRmpi contain the following:
| >
| > "mpi.spawn.Rslaves to spawn R slaves on selected hosts. This is
| > a LAM-MPI specific function."
| >
| >> lamhosts()
| > sh: lamnodes: command not found
| >
| > The documentation for nearly all mpi.xxx.xxx calls sends you to
| > www.lam-mpi.org for more information.
| >
| > Looks for all the world like this package depends on LAM-MPI, which
| > is not installed on Quarry. I don't think pointing the build
| > at an OpenMPI install will help. The .c sources will compile
| > just fine but when R goes to use them it refers to LAM-MPI
| > dependent functions and behaves badly.
| >
| 
-- 
Three out of two people have difficulties with fractions.



Re: [OMPI users] Fwd: R npRmpi

2007-12-18 Thread Caird, Andrew J

Dr. Yu sent me a version of this intended for OpenMPI back in September.
I was just today getting around to trying it, although I noticed that it
doesn't work with R v2.6, so my plans just changed a little.

If Dr. Yu gives permission, I'll send to you what he sent to me, or
perhaps he'll post it to this list.

--andy


> -Original Message-
> From: users-boun...@open-mpi.org 
> [mailto:users-boun...@open-mpi.org] On Behalf Of Randy Heiland
> Sent: Tuesday, December 18, 2007 4:08 PM
> To: us...@open-mpi.org
> Cc: hpa-ad...@iu.edu
> Subject: [OMPI users] Fwd: R npRmpi
> 
> The pkg in question is here:  http://www.stats.uwo.ca/faculty/yu/Rmpi/
> 
> The question is:  has anyone on this list got OpenMPI working 
> for this pkg?  Any suggestions?
> 
> thanks, Randy
> 
> 
> 
> 
> Begin forwarded message:
> 
> 
>   
>   
>   Subject: R npRmpi
> 
>   Been looking into the npRmpi problem
> 
>   I can get a segfault executing
> 
>   mpi.spawn.Rslaves()
> 
> 
>   The documentation .html files under npRmpi contain the following:
> 
>   "mpi.spawn.Rslaves to spawn R slaves on selected hosts. This is
>   a LAM-MPI specific function."
> 
> 
>   lamhosts()
> 
>   sh: lamnodes: command not found
> 
>   The documentation for nearly all mpi.xxx.xxx calls sends you to
>   www.lam-mpi.org for more information.
> 
>   Looks for all the world like this package depends on LAM-MPI, which
>   is not installed on Quarry. I don't think pointing the build
>   at an OpenMPI install will help. The .c sources will compile
>   just fine but when R goes to use them it refers to LAM-MPI
>   dependent functions and behaves badly.
> 
> 
> 
> 



[OMPI users] Fwd: R npRmpi

2007-12-18 Thread Randy Heiland

The pkg in question is here:  http://www.stats.uwo.ca/faculty/yu/Rmpi/

The question is:  has anyone on this list got OpenMPI working for  
this pkg?  Any suggestions?


thanks, Randy


Begin forwarded message:



Subject: R npRmpi

Been looking into the npRmpi problem

I can get a segfault executing

mpi.spawn.Rslaves()


The documentation .html files under npRmpi contain the following:

"mpi.spawn.Rslaves to spawn R slaves on selected hosts. This is
a LAM-MPI specific function."


lamhosts()

sh: lamnodes: command not found

The documentation for nearly all mpi.xxx.xxx calls sends you to
www.lam-mpi.org for more information.

Looks for all the world like this package depends on LAM-MPI, which
is not installed on Quarry. I don't think pointing the build
at an OpenMPI install will help. The .c sources will compile
just fine but when R goes to use them it refers to LAM-MPI
dependent functions and behaves badly.





Re: [OMPI users] Bug in oob_tcp_[in|ex]clude?

2007-12-18 Thread Jeff Squyres

On Dec 18, 2007, at 11:12 AM, Marco Sbrighi wrote:


Assumedly this(these) statement(s) are in a config file that is being
read by Open MPI, such as $HOME/.openmpi/mca-params.conf?


I've tried many combinations: only in $HOME/.openmpi/mca-params.conf,
only on the command line, and both; but none seems to work correctly.
Nevertheless, what I'm expecting is that if something is specified in
$HOME/.openmpi/mca-params.conf and then specified differently on the
command line, the latter should be assumed, I think.


The only difference in putting values in these locations should be the
order of precedence in which they are read.  As you stated, values on
the command line override everything else.  See
http://www.open-mpi.org/faq/?category=tuning#setting-mca-params.
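
For example (a sketch using the parameter from this thread), with

  # $HOME/.openmpi/mca-params.conf
  oob_tcp_include = eth1

in the per-user file, a value given on the command line still wins:

  mpirun --mca oob_tcp_include eth0 -np 2 ./a.out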

Yes, it does.  Specifying the same MCA param twice on the command line
results in undefined behavior -- it will only take one of them, and I
assume it'll take the first (but I'd have to check the code to be sure).


OK, I can obtain the same behaviour using only one statement:
--mca oob_tcp_include eth1,lo,eth0,ib0,ib1


FWIW, I traced the history of this code -- it looks like it dates all  
the way back to LAM/MPI, where if you specify "--mca foo bar --mca foo  
yow", then foo will get the value "bar,yow".  So it *is* intended  
(albeit undocumented!) behavior.  Who knew!  :-)


note that using --mca mpi_show_mca_params, what I'm seeing in the report
is the same for both statements (twice and single):

...
[node255:30188] oob_tcp_debug=0
[node255:30188] oob_tcp_include=eth1,lo,eth0,ib0,ib1
[node255:30188] oob_tcp_exclude=
...


So far, this is all consistent and expected.

Could you try with 1.2.3 or 1.2.4 (1.2.4 is the most recent; 1.2.5 is
due out "soon" -- it *may* get out before the holiday break, but no
promises...)?


we have 1.2.3 on another cluster and it exhibits the same behaviour as
1.2.2 (BTW the other cluster has the same eth ifaces)


Crud.


If you can't upgrade, let me know and I can provide a debugging patch
that will give us a little more insight into what is happening on your
machines.  Thanks.


It is quite difficult for us to upgrade the open-mpi now. We have the
official Cisco packages installed, and as far as I know 1.2.2-1 is the
only official Cisco open-mpi distribution today



Here's a patch to the OMPI 1.2.2 source that adds some printf's in the  
OOB TCP interface selection logic that should show exactly what each  
process decides.  You should be able to run this with as few as 2  
processes to see what the decision-making process is for each of them.


[11:24] svbu-mpi:/home/jsquyres/openmpi-1.2.2 % diff -u orte/mca/oob/tcp/oob_tcp.c.orig orte/mca/oob/tcp/oob_tcp.c
--- orte/mca/oob/tcp/oob_tcp.c.orig 2007-12-18 11:21:08.0 -0800
+++ orte/mca/oob/tcp/oob_tcp.c  2007-12-18 11:22:29.0 -0800
@@ -1344,11 +1344,15 @@
         char name[32];
         opal_ifindextoname(i, name, sizeof(name));
         if (mca_oob_tcp_component.tcp_include != NULL &&
-            strstr(mca_oob_tcp_component.tcp_include,name) == NULL)
+            strstr(mca_oob_tcp_component.tcp_include,name) == NULL) {
+            opal_output(0, "TCP OOB skipping %s because it's not in include (%s)\n", name, mca_oob_tcp_component.tcp_include);
             continue;
+        }
         if (mca_oob_tcp_component.tcp_exclude != NULL &&
-            strstr(mca_oob_tcp_component.tcp_exclude,name) != NULL)
+            strstr(mca_oob_tcp_component.tcp_exclude,name) != NULL) {
+            opal_output(0, "TCP OOB skipping %s because it's in exclude (%s)\n", name, mca_oob_tcp_component.tcp_exclude);
             continue;
+        }
         opal_ifindextoaddr(i, (struct sockaddr*)&addr, sizeof(addr));
         if(opal_ifcount() > 1 &&
            opal_ifislocalhost((struct sockaddr*)&addr))
@@ -1356,6 +1360,7 @@
         if(ptr != contact_info) {
             ptr += sprintf(ptr, ";");
         }
+        opal_output(0, "TCP OOB adding interface: %s\n", name);
         ptr += sprintf(ptr, "tcp://%s:%d", inet_ntoa(addr.sin_addr),
                        ntohs(mca_oob_tcp_component.tcp_listen_port));
         }

I attached the patch as well in case my mail client / the mailing list  
munges it.
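
A quick way to exercise it, as a sketch (any small MPI program will do; the
hostnames are placeholders): rebuild and install the patched 1.2.2, then

  mpirun -np 2 --host node001,node002 --mca oob_tcp_include eth1 ./a.out

and each process will log which interfaces the OOB layer skipped or added.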


--
Jeff Squyres
Cisco Systems



ompi-1.2.2-oob-tcp-verbose.patch
Description: Binary data




Re: [OMPI users] Torque and OpenMPI 1.2

2007-12-18 Thread Ralph H Castain
Hate to be a party-pooper, but the answer is "no" in OpenMPI 1.2. We don't
allow the use of a hostfile in a Torque environment in that version.

We have changed this for v1.3, but you'll have to wait for that release.

Sorry
Ralph



On 12/18/07 11:12 AM, "pat.o'bry...@exxonmobil.com" wrote:

> Tim,
>  Will OpenMPI 1.2.1 allow the use of a "hostfile"?
>  Thanks,
>   Pat
> 
> J.W. (Pat) O'Bryant,Jr.
> Business Line Infrastructure
> Technical Systems, HPC
> Office: 713-431-7022
> 
> 
> 
> 
> Tim Prins, sent by Open MPI Users (users-bounces@open-mpi.org)
> 12/18/07 11:57 AM
> To: Open MPI Users
> Subject: Re: [OMPI users] Torque and OpenMPI 1.2
> Please respond to: Open MPI Users
> 
> Open MPI v1.2 had some problems with the TM configuration code which was
> fixed in v1.2.1. So any version v1.2.1 or later should work fine (and,
> as you indicate, 1.2.4 works fine).
> 
> Tim
> 
> On Tuesday 18 December 2007 12:48:40 pm pat.o'bry...@exxonmobil.com
> wrote:
>> Jeff,
>>    Here is the result of the "pbs-config". By the way, I have successfully
>> built OpenMPI 1.2.4 on this same system. The "config.log" for OpenMPI 1.2.4
>> shows the correct Torque path. That is not surprising since the "configure"
>> script for OpenMPI 1.2.4 uses "pbs-config" while the configure script for
>> OpenMPI 1.2 does not.
>> ---
>> # pbs-config --libs
>> -L/usr/local/pbs/x86_64/lib -ltorque -Wl,--rpath
>> -Wl,/usr/local/pbs/x86_64/lib
>> ---
>> 
>>    Now, to address your concern about the nodes, my users are not "adding
>> nodes" to those provided by Torque. They are using a "proper subset" of the
>> nodes.  Also, I believe I read this comment on the OpenMPI web site which
>> seems to imply an oversight as far as the "-hostfile" is concerned.
>> ---
>> Can I specify a hostfile or use
>> the --host option to mpirun when running in a Torque / PBS environment?
>> As of version v1.2.1, no.
>> Open MPI will fail to launch processes properly when a hostfile is specified
>> on the mpirun command line, or if the mpirun [--host] option is used.
>> 
>> We're working on correcting the error. A future version of Open MPI will
>> likely launch on the hosts specified either in the hostfile or via the
>> --host option as long as they are a proper subset of the hosts allocated to
>> the Torque / PBS Pro job.
>> ---
>> Thanks,
>> 
>> J.W. (Pat) O'Bryant,Jr.
>> Business Line Infrastructure
>> Technical Systems, HPC
>> Office: 713-431-7022




Re: [OMPI users] Torque and OpenMPI 1.2

2007-12-18 Thread pat . o'bryant
Tim,
 Will OpenMPI 1.2.1 allow the use of a "hostfile"?
 Thanks,
  Pat

J.W. (Pat) O'Bryant,Jr.
Business Line Infrastructure
Technical Systems, HPC
Office: 713-431-7022




Tim Prins, sent by Open MPI Users (users-bounces@open-mpi.org)
12/18/07 11:57 AM
To: Open MPI Users
Subject: Re: [OMPI users] Torque and OpenMPI 1.2
Please respond to: Open MPI Users

Open MPI v1.2 had some problems with the TM configuration code which was
fixed in v1.2.1. So any version v1.2.1 or later should work fine (and,
as you indicate, 1.2.4 works fine).

Tim

On Tuesday 18 December 2007 12:48:40 pm pat.o'bry...@exxonmobil.com wrote:
> Jeff,
>    Here is the result of the "pbs-config". By the way, I have successfully
> built OpenMPI 1.2.4 on this same system. The "config.log" for OpenMPI 1.2.4
> shows the correct Torque path. That is not surprising since the "configure"
> script for OpenMPI 1.2.4 uses "pbs-config" while the configure script for
> OpenMPI 1.2 does not.
> ---
> # pbs-config --libs
> -L/usr/local/pbs/x86_64/lib -ltorque -Wl,--rpath
> -Wl,/usr/local/pbs/x86_64/lib
> ---
>
>    Now, to address your concern about the nodes, my users are not "adding
> nodes" to those provided by Torque. They are using a "proper subset" of the
> nodes.  Also, I believe I read this comment on the OpenMPI web site which
> seems to imply an oversight as far as the "-hostfile" is concerned.
> ---
> Can I specify a hostfile or use
> the --host option to mpirun when running in a Torque / PBS environment?
> As of version v1.2.1, no.
> Open MPI will fail to launch processes properly when a hostfile is specified
> on the mpirun command line, or if the mpirun [--host] option is used.
>
> We're working on correcting the error. A future version of Open MPI will
> likely launch on the hosts specified either in the hostfile or via the
> --host option as long as they are a proper subset of the hosts allocated to
> the Torque / PBS Pro job.
> ---
> Thanks,
>
> J.W. (Pat) O'Bryant,Jr.
> Business Line Infrastructure
> Technical Systems, HPC
> Office: 713-431-7022




Re: [OMPI users] Torque and OpenMPI 1.2

2007-12-18 Thread Tim Prins
Open MPI v1.2 had some problems with the TM configuration code which was fixed 
in v1.2.1. So any version v1.2.1 or later should work fine (and, as you 
indicate, 1.2.4 works fine).

Tim

On Tuesday 18 December 2007 12:48:40 pm pat.o'bry...@exxonmobil.com wrote:
> Jeff,
> Here is the result of the "pbs-config". By the way, I have successfully
> built OpenMPI 1.2.4 on this same system. The "config.log" for OpenMPI 1.2.4
> shows the correct Torque path. That is not surprising since the "configure"
> script for OpenMPI 1.2.4 uses "pbs-config" while the configure script for
> OpenMPI 1.2 does not.
> ---
> # pbs-config --libs
> -L/usr/local/pbs/x86_64/lib -ltorque -Wl,--rpath
> -Wl,/usr/local/pbs/x86_64/lib
> ---
>
> Now, to address your concern about the nodes, my users are not "adding
> nodes" to those provided by Torque. They are using a "proper subset" of the
> nodes.  Also,  I believe I read this comment on the OpenMPI web site which
> seems to imply an oversight as far as the "-hostfile" is concerned.
> ---
> Can I specify a hostfile or use
> the --host option to mpirun when running in a Torque / PBS environment?
> As of version v1.2.1, no.
> Open MPI will fail to launch processes properly when a hostfile is specified
> on the mpirun command line, or if the mpirun [--host] option is used.
>
>
> We're working on correcting the error. A future version of Open MPI will
> likely launch on the hosts specified either in the hostfile or via the
> --host option as long as they are a proper subset of the hosts allocated to
> the Torque / PBS Pro job.
> ---
>
> Thanks,
>
> J.W. (Pat) O'Bryant,Jr.
> Business Line Infrastructure
> Technical Systems, HPC
> Office: 713-431-7022
>




Re: [OMPI users] Torque and OpenMPI 1.2

2007-12-18 Thread Jeff Squyres
Well that's fun.  Is this the library location where Torque put them by
default?  What does "pbs-config --libs" return?

Also -- I second Reuti's question: what is the nature of your
requirement such that you need to be able to run outside of the nodes
that have been allocated to a job?  Are you running on multiple
clusters simultaneously, or something along those lines?


On Dec 18, 2007, at 11:09 AM, pat.o'bry...@exxonmobil.com wrote:


We have Torque as an mpi job scheduler. Additionally, I have some
users that want to modify the contents of "-hostfile" when they execute
"mpirun".  To allow the modification of the hostfile, I downloaded OpenMPI
1.2 and attempted to do a "configure" with the options shown below:

./configure --prefix /opt/openmpi-1.2 --with-openib=/usr/local/ofed
--with-tm=/usr/local/pbs CC=icc CXX=icpc F77=ifort FC=ifort
--with-threads=posix --enable-mpi-threads

The configure fails with the following messages:
--
checking tm.h presence... yes
checking for tm.h... yes
looking for library in lib
checking for tm_finalize in -ltorque... no
looking for library in lib64
checking for tm_finalize in -ltorque... no
configure: error: TM support requested but not found.  Aborting
--

In looking at the configure script, there are typos of "hapy" for "happy".
Correcting those made no difference. The "config.log" lists an "-L"
parameter that isn't the correct path for Torque. Our release of Torque,
2.2.0, contains libraries under "/usr/local/pbs/x86_64" not
"/usr/local/pbs" so the links will fail. I am assuming that the "configure"
script does not figure out the correct path for Torque 2.2.0 libraries.


Config log messages:
--
configure:78250: result: no
configure:78274: result: looking for library in lib64
configure:78276: checking for tm_finalize in -ltorque
configure:78306: icc -o conftest -O3 -DNDEBUG -finline-functions
-fno-strict-aliasing -restrict -pthread  -I/usr/local/pbs/include
-L/usr/local/pbs/lib64 conftest.c -ltorque  -lnsl -lutil   >&5
ld: cannot find -ltorque
configure:78312: $? = 1
configure: failed program was:
| /* confdefs.h.  */
--

So is there a "configure" script that works with Torque 2.2 and OpenMPI 1.2?

Thanks,

J.W. (Pat) O'Bryant,Jr.
Business Line Infrastructure
Technical Systems, HPC
Office: 713-431-7022




--
Jeff Squyres
Cisco Systems


Re: [OMPI users] Hi

2007-12-18 Thread Amit Kumar Saha
On 12/18/07, SaiGiridhar Ramasamy  wrote:
>
> Can you elaborate further?

http://www-fp.mcs.anl.gov/CCST/research/reports_pre1998/comp_bio/stalk/pgapack.html

HTH,
Amit

-- 
Amit Kumar Saha
Writer, Programmer, Researcher
http://amitsaha.in.googlepages.com
http://amitksaha.blogspot.com


Re: [OMPI users] Hi

2007-12-18 Thread SaiGiridhar Ramasamy
Can you elaborate further?


Re: [OMPI users] Hi

2007-12-18 Thread Amit Kumar Saha
On 12/18/07, SaiGiridhar Ramasamy  wrote:
>
>
> Great. I have some hands-on experience with MPI (tracking a target); it
> involved GA. We just had an intro on parallel GA too. I prefer any kind of
> application which can be finished in 2 or 3 months.

How about trying out 'PGAPack' then?




-- 
Amit Kumar Saha
Writer, Programmer, Researcher
http://amitsaha.in.googlepages.com
http://amitksaha.blogspot.com


Re: [OMPI users] Hi

2007-12-18 Thread SaiGiridhar Ramasamy
Great. I have some hands-on experience with MPI (tracking a target); it
involved GA. We just had an intro on parallel GA too. I prefer any kind of
application which can be finished in 2 or 3 months.


Re: [OMPI users] Hi

2007-12-18 Thread Amit Kumar Saha
On 12/18/07, SaiGiridhar Ramasamy  wrote:
> Hi,
> As a final-year project, not just to test.

Okay, so are you experienced with parallel programming? What kind of
HPC applications are you looking for?

Three months back, I had no exposure to parallel programming.
Subsequently, I worked on a project titled "Review of Parallel
Computing" where I got familiar with PVM and MPI (Open MPI), then
implemented a simple parallel search algorithm, and also reviewed some
issues related to parallel image processing.

Now, I am starting work with "Parallel Genetic Algorithms", since it
is one of my research areas.

So, it really depends on what you want to do.

There are a lot of HPC applications you can work on - mathematical,
biological, computer vision, etc.

HTH,
Amit

-- 
Amit Kumar Saha
Writer, Programmer, Researcher
http://amitsaha.in.googlepages.com
http://amitksaha.blogspot.com


Re: [OMPI users] Hi

2007-12-18 Thread SaiGiridhar Ramasamy
Hi,
As a final-year project, not just to test.


Re: [OMPI users] Hi

2007-12-18 Thread Amit Kumar Saha
On 12/18/07, SaiGiridhar Ramasamy  wrote:
> Hi all,
> I've an operational cluster and am soon about to form another; can
> anyone suggest any HPC application?

A HPC application to do a test run on your cluster?

--Amit

-- 
Amit Kumar Saha
Writer, Programmer, Researcher
http://amitsaha.in.googlepages.com
http://amitksaha.blogspot.com


[OMPI users] Hi

2007-12-18 Thread SaiGiridhar Ramasamy
Hi all,
   I've an operational cluster and am soon about to form another; can
anyone suggest any HPC application?


Re: [OMPI users] Torque and OpenMPI 1.2

2007-12-18 Thread Reuti

On 18.12.2007, at 17:09, pat.o'bry...@exxonmobil.com wrote:

We have Torque as an mpi job scheduler. Additionally, I have some
users that want to modify the contents of "-hostfile" when they execute


Why do they want to modify the hostfile? They should stay with the  
granted machines and slots.


-- Reuti


"mpirun".  To allow the modification of the hostfile, I downloaded OpenMPI
1.2 and attempted to do a "configure" with the options shown below:

./configure --prefix /opt/openmpi-1.2 --with-openib=/usr/local/ofed
--with-tm=/usr/local/pbs CC=icc CXX=icpc F77=ifort FC=ifort
--with-threads=posix --enable-mpi-threads

The configure fails with the following messages:
--
checking tm.h presence... yes
checking for tm.h... yes
looking for library in lib
checking for tm_finalize in -ltorque... no
looking for library in lib64
checking for tm_finalize in -ltorque... no
configure: error: TM support requested but not found.  Aborting
--

In looking at the configure script, there are typos of "hapy" for "happy".
Correcting those made no difference. The "config.log" lists an "-L"
parameter that isn't the correct path for Torque. Our release of Torque,
2.2.0, contains libraries under "/usr/local/pbs/x86_64" not
"/usr/local/pbs" so the links will fail. I am assuming that the "configure"
script does not figure out the correct path for Torque 2.2.0 libraries.


Config log messages:
--

configure:78250: result: no
configure:78274: result: looking for library in lib64
configure:78276: checking for tm_finalize in -ltorque
configure:78306: icc -o conftest -O3 -DNDEBUG -finline-functions
-fno-strict-aliasing -restrict -pthread  -I/usr/local/pbs/include
-L/usr/local/pbs/lib64 conftest.c -ltorque  -lnsl -lutil   >&5
ld: cannot find -ltorque
configure:78312: $? = 1
configure: failed program was:
| /* confdefs.h.  */
--

So is there a "configure" script that works with Torque 2.2 and OpenMPI 1.2?

Thanks,

J.W. (Pat) O'Bryant,Jr.
Business Line Infrastructure
Technical Systems, HPC
Office: 713-431-7022





Re: [OMPI users] Bug in oob_tcp_[in|ex]clude?

2007-12-18 Thread Marco Sbrighi
On Mon, 2007-12-17 at 20:58 -0500, Brian Dobbins wrote:
> Hi Marco and Jeff,
> 
>   My own knowledge of OpenMPI's internals is limited, but I thought
> I'd add my less-than-two-cents...
> 
> > I've found only a way in order to have tcp connections binded only to
> > the eth1 interface, using both the following MCA directives in the
> > command line:
> >
> > mpirun --mca oob_tcp_include eth1 --mca oob_tcp_include
> > lo,eth0,ib0,ib1 .
> >
> > This sounds to me like a bug.
> 
> 
> Yes, it does.  Specifying the same MCA param twice on the command line
> results in undefined behavior -- it will only take one of them, and I
> assume it'll take the first (but I'd have to check the code to be sure).
> 
>   I think that Marco intended to write:
>   mpirun  --mca oob_tcp_include eth1 --mca oob_tcp_exclude
> lo,eth0,ib0,ib1 ... 

no, I intended to write exactly what I wrote. The double statement is
reported by --mca mpi_show_mca_params exactly as if I had written one
statement only, as follows:

--mca oob_tcp_include eth1,lo,eth0,ib0,ib1

> 
>   Is this correct?  So you're not specifying include twice, you're
> specifying include and exclude, so each interface is explicitly stated
> in one list or the other.  I remember encountering this behaviour as
> well, in a slightly different format, but I can't seem to reproduce it
> now either. 

notice, the two lists are never intersecting.

>  That said, with these options, won't the MPI traffic (as opposed to
> the OOB traffic) still use the eth1,ib0 and ib1 interfaces?  You'd
> need to add '-mca btl_tcp_include eth1' in order to say it should only
> go over that NIC, I think. 

Yes, I know; in fact -mca btl_tcp_[if]_exclude lo,eth0,ib0,ib1
works fine (it seems). I have been using this MCA parameter since open-mpi
1.2.1, and the trouble with oob_tcp_[if]_[in|ex]clude sounded quite
strange to me; after all, the code used for the parser should be more or
less the same.
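
Put together, the per-user config being aimed at here would look something
like this sketch (based only on the parameter names in this thread; on
1.2.3 and later the first parameter is spelled oob_tcp_if_include, as Jeff
notes elsewhere in the thread):

  # $HOME/.openmpi/mca-params.conf
  oob_tcp_include = eth1
  btl_tcp_if_exclude = lo,eth0,ib0,ib1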

> 
>   As for the 'connection errors', two bizarre things to check are,
> first, that all of your nodes using eth1 actually have
> correct /etc/hosts mappings to the other nodes.  One system I ran on
> had this problem when some nodes had an IP address for node002 as one
> thing, and another node had node002's IP address as something
> different.   This should be easy enough by trying to run on one node
> first, then two nodes that you're sure have the correct addresses. 

Yes, I've already verified that. 

> 
>   .. The second situation is if you're launching an MPMD program.
> Here, you need to use '-gmca ' instead of '-mca '.
> 

No, currently I'm using only SPMD ones, and I hope to use them for the
rest of the century :-)

>   Hope some of that is at least a tad useful.  :) 
> 

Thanks you very much Brian,

Marco 

>   Cheers,
>   - Brian
> 
-- 
-
 Marco Sbrighi  m.sbri...@cineca.it

 HPC Group
 CINECA Interuniversity Computing Centre
 via Magnanelli, 6/3
 40033 Casalecchio di Reno (Bo) ITALY
 tel. 051 6171516



Re: [OMPI users] Bug in oob_tcp_[in|ex]clude?

2007-12-18 Thread Marco Sbrighi
On Mon, 2007-12-17 at 17:19 -0500, Jeff Squyres wrote:
> On Dec 17, 2007, at 8:35 AM, Marco Sbrighi wrote:
> 
> > I'm using Open MPI 1.2.2 over OFED 1.2 on an 256 nodes, dual Opteron,
> > dual core, Linux cluster. Of course, with Infiniband 4x interconnect.
> >
> > Each cluster node is equipped with 4 (or more) ethernet interface,
> > namely 2 gigabit ones plus 2 IPoIB. The two gig are named  eth0,eth1,
> > while the two IPoIB are named ib0,ib1.
> >
> > It happens that the eth0 is a management network, with poor
> > performances, and furthermore we wouldn't use the ib* to carry MPI's
> > traffic (neither OOB or TCP), so we would like the eth1 is used for  
> > open
> > MPI OOB and TCP.
> >
> > In order to drive the OOB over only eth1 I've tried various  
> > combinations
> > of oob_tcp_[ex|in]clude MCA statements, starting from the obvious
> >
> > oob_tcp_exclude = lo,eth0,ib0,ib1
> >
> > then trying the othe obvious:
> >
> > oob_tcp_include = eth1
> 
> This one statement (_include) should be sufficient.

I agree with your interpretation, but what I'm experiencing here is "it
should", but in fact it doesn't.

> 
> Assumedly this(these) statement(s) are in a config file that is being  
> read by Open MPI, such as $HOME/.openmpi/mca-params.conf?

I've tried many combinations: only in $HOME/.openmpi/mca-params.conf,
only on the command line, and both; but none seems to work correctly.
Nevertheless, what I'm expecting is that if something is specified in
$HOME/.openmpi/mca-params.conf and then specified differently on the
command line, the latter should be assumed, I think.
> 
> > and both at the same time.
> >
> > Next I've tried the following:
> >
> > oob_tcp_exclude = eth0
> >
> > but after the job starts, I still have a lot of tcp connections
> > established using eth0 or ib0 or ib1.
> > Furthermore It happens the following error:
> >
> >   [node191:03976] [0,1,14]-[0,1,12] mca_oob_tcp_peer_complete_connect:
> > connection failed: Connection timed out (110) - retrying
> 
> This is quite odd.  :-(
> 
> > I've found only a way in order to have tcp connections binded only to
> > the eth1 interface, using both the following MCA directives in the
> > command line:
> >
> > mpirun  --mca oob_tcp_include eth1 --mca oob_tcp_include  
> > lo,eth0,ib0,ib1 .
> >
> > This sounds me as bug.
> 
> Yes, it does.  Specifying the same MCA param twice on the command line
> results in undefined behavior -- it will only take one of them, and I
> assume it'll take the first (but I'd have to check the code to be sure).

OK, I can obtain the same behaviour using only one statement: 
--mca oob_tcp_include eth1,lo,eth0,ib0,ib1

note that using --mca mpi_show_mca_params, what I'm seeing in the report
is the same for both statements (twice and single):

...
[node255:30188] oob_tcp_debug=0
[node255:30188] oob_tcp_include=eth1,lo,eth0,ib0,ib1
[node255:30188] oob_tcp_exclude=
...


> 
> > Is there someone able to reproduce this behaviour?
> > If this is a bug, are there fixes?
> 
> 
> I'm unfortunately unable to reproduce this behavior.  I have a test  
> cluster with 2 IP interfaces: ib0, eth0.  I have tried several  
> combinations of MCA params with 1.2.2:
> 
> --mca oob_tcp_include ib0
> --mca oob_tcp_include ib0,bogus
> --mca oob_tcp_include eth0
> --mca oob_tcp_include eth0,bogus
> --mca oob_tcp_exclude ib0
> --mca oob_tcp_exclude ib0,bogus
> --mca oob_tcp_exclude eth0
> --mca oob_tcp_exclude eth0,bogus
> 
> All do as they are supposed to -- including or excluding ib0 or eth0.
> 
> I do note, however, that the handling of these parameters changed in  
> 1.2.3 -- as well as their names.  The names changed to  
> "oob_tcp_if_include" and "oob_tcp_if_exclude" to match other MCA  
> parameter name conventions from other components.
> 
> Could you try with 1.2.3 or 1.2.4 (1.2.4 is the most recent; 1.2.5 is  
> due out "soon" -- it *may* get out before the holiday break, but no  
> promises...)?

we have 1.2.3 on another cluster and it exhibits the same behaviour as
1.2.2 (BTW the other cluster has the same eth ifaces)

> 
> If you can't upgrade, let me know and I can provide a debugging patch  
> that will give us a little more insight into what is happening on your  
> machines.  Thanks.

It is quite difficult for us to upgrade the open-mpi now. We have the
official Cisco packages installed, and as far as I know 1.2.2-1 is the
only official Cisco open-mpi distribution today

In any case I would like to try your debug patch.

Thanks

Marco 

> 
-- 
-
 Marco Sbrighi  m.sbri...@cineca.it

 HPC Group
 CINECA Interuniversity Computing Centre
 via Magnanelli, 6/3
 40033 Casalecchio di Reno (Bo) ITALY
 tel. 051 6171516



Re: [OMPI users] MPI::Intracomm::Spawn and cluster configuration

2007-12-18 Thread Ralph H Castain



On 12/18/07 7:35 AM, "Elena Zhebel"  wrote:

> Thanks a lot! Now it works!
> The solution is to use mpirun -n 1 -hostfile my.hosts *.exe and pass MPI_Info
> Key to the Spawn function!
> 
> One more question: is it necessary to start my "master" program with
> mpirun -n 1 -hostfile my_hostfile -host my_master_host my_master.exe ?

No, it isn't necessary - assuming that my_master_host is the first host
listed in your hostfile! If you are only executing one my_master.exe (i.e.,
you gave -n 1 to mpirun), then we will automatically map that process onto
the first host in your hostfile.

If you want my_master.exe to go on someone other than the first host in the
file, then you have to give us the -host option.

> 
> Are there other possibilities for easy start?
> I would say just to run ./my_master.exe , but then the master process doesn't
> know about the available in the network hosts.

You can set the hostfile parameter in your environment instead of on the
command line. Just set OMPI_MCA_rds_hostfile_path = my.hosts.

You can then just run ./my_master.exe on the host where you want the master
to reside - everything should work the same.
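
For example, in a bash shell:

  export OMPI_MCA_rds_hostfile_path=my.hosts
  ./my_master.exe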

Just as an FYI: the name of that environmental variable is going to change
in the 1.3 release, but everything will still work the same.

Hope that helps
Ralph
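
To make the MPI_Info mechanism described in the quoted exchange below
concrete, a minimal master might look like this C sketch (the hostnames,
slave binary name, and process count are placeholders; the C++
MPI::Intracomm::Spawn takes the same Info argument):

#include <mpi.h>

int main(int argc, char **argv)
{
    MPI_Comm slaves;
    MPI_Info info;
    int errcodes[4];

    MPI_Init(&argc, &argv);

    /* hosts for the slaves; they must also appear in the hostfile
       given to mpirun */
    MPI_Info_create(&info);
    MPI_Info_set(info, "host", "host1,host2");

    /* spawn 4 slaves, mapped onto host1 and host2 */
    MPI_Comm_spawn("./my_slave.exe", MPI_ARGV_NULL, 4, info, 0,
                   MPI_COMM_SELF, &slaves, errcodes);
    MPI_Info_free(&info);

    /* ... communicate with the slaves over the "slaves" intercomm ... */

    MPI_Finalize();
    return 0;
}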


>  
> Thanks and regards,
> Elena
> 
> 
> -Original Message-
> From: Ralph H Castain [mailto:r...@lanl.gov]
> Sent: Monday, December 17, 2007 5:49 PM
> To: Open MPI Users ; Elena Zhebel
> Cc: Ralph H Castain
> Subject: Re: [OMPI users] MPI::Intracomm::Spawn and cluster configuration
> 
> 
> 
> 
> On 12/17/07 8:19 AM, "Elena Zhebel"  wrote:
> 
>> Hello Ralph,
>> 
>> Thank you for your answer.
>> 
>> I'm using OpenMPI 1.2.3. , compiler glibc232, Linux Suse 10.0.
>> My "master" executable runs only on the one local host, then it spawns
>> "slaves" (with MPI::Intracomm::Spawn).
>> My question was: how to determine the hosts where these "slaves" will be
>> spawned?
>> You said: "You have to specify all of the hosts that can be used by
>> your job
>> in the original hostfile". How can I specify the host file? I can not
>> find it
>> in the documentation.
> 
> Hmmm...sorry about the lack of documentation. I always assumed that the MPI
> folks in the project would document such things since it has little to do
> with the underlying run-time, but I guess that fell through the cracks.
> 
> There are two parts to your question:
> 
> 1. how to specify the hosts to be used for the entire job. I believe that is
> somewhat covered here:
> http://www.open-mpi.org/faq/?category=running#simple-spmd-run
> 
> That FAQ tells you what a hostfile should look like, though you may already
> know that. Basically, we require that you list -all- of the nodes that both
> your master and slave programs will use.
> 
> 2. how to specify which nodes are available for the master, and which for
> the slave.
> 
> You would specify the host for your master on the mpirun command line with
> something like:
> 
> mpirun -n 1 -hostfile my_hostfile -host my_master_host my_master.exe
> 
> This directs Open MPI to map that specified executable on the specified host
> - note that my_master_host must have been in my_hostfile.
> 
> Inside your master, you would create an MPI_Info key "host" that has a value
> consisting of a string "host1,host2,host3" identifying the hosts you want
> your slave to execute upon. Those hosts must have been included in
> my_hostfile. Include that key in the MPI_Info array passed to your Spawn.
> 
> We don't currently support providing a hostfile for the slaves (as opposed
> to the host-at-a-time string above). This may become available in a future
> release - TBD.
> 
> Hope that helps
> Ralph
> 
>> 
>> Thanks and regards,
>> Elena
>> 
>> -Original Message-
>> From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org] On
>> Behalf Of Ralph H Castain
>> Sent: Monday, December 17, 2007 3:31 PM
>> To: Open MPI Users 
>> Cc: Ralph H Castain
>> Subject: Re: [OMPI users] MPI::Intracomm::Spawn and cluster
>> configuration
>> 
>> On 12/12/07 5:46 AM, "Elena Zhebel"  wrote:
>>> 
>>> 
>>> Hello,
>>> 
>>> I'm working on a MPI application where I'm using OpenMPI instead of
>>> MPICH.
>>> 
>>> In my "master" program I call the function MPI::Intracomm::Spawn which
>> spawns
>>> "slave" processes. It is not clear for me how to spawn the "slave"
>> processes
>>> over the network. Currently "master" creates "slaves" on the same
>>> host.
>>> 
>>> If I use 'mpirun --hostfile openmpi.hosts' then processes are spawn
>>> over
>> the
>>> network as expected. But now I need to spawn processes over the
>>> network
>> from
>>> my own executable using MPI::Intracomm::Spawn, how can I achieve it?
>>> 
>> 
>> I'm not sure from your description exactly what you are trying to do,
>> nor in
>> what environment this is all 

Re: [OMPI users] Gigabit ethernet (PCI Express) and openmpi v1.2.4

2007-12-18 Thread Allan Menezes
Just to add: my whole cluster is Intel EM64T or x86_64, and with
openmpi v1.2.4 I was getting, for two PCI Express Intel gigabit cards and a
PCI Express SysKonnect gigabit ethernet card (888, 892 and 892 Mbps measured
individually with NPtcp), a summed bandwidth of 1950 Mbps between two
identical systems connected by three gigabit switches. But by changing to
the beta version of openmpi, version 1.3a1r16973 nightly, and recompiling
NPtcp (which does not matter, since it uses gcc) and NPmpi, which uses the
newer mpicc, I get for the same setup between two separate identical nodes
2583 Mbps, which is close to three times a single link! The MTU was the
default of 1500 for all eth cards in both trials. I am using Fedora
Core 8, x86_64, for the operating system.

Allan Menezes


Re: [OMPI users] Gigabit ethernet (PCI Express) and openmpi v1.2.4

2007-12-18 Thread Allan Menezes

Hi,
I found the problem. It's a bug in openmpi v1.2.4, I think, as the tests
below confirm (and a big THANKS to George!). I compiled openmpi
v1.3a1r16973 and tried the same tests with the same mca-params.conf file,
and got for three PCI Express gigabit ethernet cards a total bandwidth
of 2583 Mbps, which is close to 892+892+888=2672 Mbps, i.e. a linear
increase in b/w, everything else the same except for a recompilation of
netpipe's NPmpi and NPtcp. NPmpi is compiled with mpicc, whereas NPtcp is
compiled with gcc!
I am now going to do some benchmarking of my basement cluster with hpl and
openmpi v1.3a1r16973 for the increase in performance and stability. V1.2.4
is stable and completes all 18 hpl tests without errors!
With openmpi v1.2.4 and NPmpi compiled with its mpicc, and using the shared
memory commands below in --(a), I get for ./NPmpi -u 1 negative
numbers for performance above approx 200 Mbytes.

Some sort of overflow in v1.2.4.
Thank you,
Regards,
Allan Menezes

Hi George,
The following test peaks at 8392 Mbps:

mpirun --prefix /opt/opnmpi124b --host a1,a1 -mca btl tcp,sm,self -np 2 ./NPmpi

on a1, and on a2

mpirun --prefix /opt/opnmpi124b --host a2,a2 -mca btl tcp,sm,self -np 2 ./NPmpi

gives 8565 Mbps.   --(a)

on a1:
mpirun --prefix /opt/opnmpi124b --host a1,a1 -np 2 ./NPmpi
gives 8424 Mbps; on a2:
mpirun --prefix /opt/opnmpi124b --host a2,a2 -np 2 ./NPmpi
gives 8372 Mbps.

So there's enough memory and processor b/w to give 2.7 Gbps for 3 PCI
Express eth cards, especially from --(a), between a1 and a2. Thank you for
your help. Any assistance would be greatly appreciated!
Regards, Allan Menezes

You should run a shared memory test, to see what's the max memory
bandwidth you can get.
  Thanks,
    george.

On Dec 17, 2007, at 7:14 AM, Gleb Natapov wrote:



On Sun, Dec 16, 2007 at 06:49:30PM -0500, Allan Menezes wrote:

Hi,
How many PCI-Express gigabit ethernet cards does OpenMPI version 1.2.4
support with a corresponding linear increase in bandwidth, measured with
netpipe NPmpi and openmpi mpirun?
With two PCI Express cards I get a B/W of 1.75 Gbps, at 892 Mbps each, and
for three PCI Express cards (one built into the motherboard) I get
1.95 Gbps. They are all around 890 Mbps individually, measured with
netpipe NPtcp and NPmpi and openmpi. For two there seems to be a linear
increase in b/w, but not for three PCI Express gigabit eth cards.
I have tuned the cards using netpipe and the $HOME/.openmpi/mca-params.conf
file for latency and percentage b/w.
Please advise.

What is in your $HOME/.openmpi/mca-params.conf? Maybe you are hitting your
chipset limit here. What is your HW configuration? Can you try to run
NPtcp on each interface simultaneously and see what BW you get.

--
Gleb.