Re: [OMPI devel] [OMPI users] OpenMPI fails with np > 65

2014-08-13 Thread Lenny Verkhovsky
Thanks, Josh.
Then I guess I will solve it internally ☺


Lenny Verkhovsky
SW Engineer,  Mellanox Technologies
www.mellanox.com

Office:+972 74 712 9244
Mobile:  +972 54 554 0233
Fax:+972 72 257 9400

From: devel [mailto:devel-boun...@open-mpi.org] On Behalf Of Joshua Ladd
Sent: Wednesday, August 13, 2014 7:37 PM
To: Open MPI Developers
Subject: Re: [OMPI devel] [OMPI users] OpenMPI fails with np > 65

Ah, I see. That change didn't make it into the release branch (I don't know if 
it was never CMRed or what, I have a vague recollection of it passing through.) 
If you need that change, then I recommend checking out the trunk at r30875. 
This was back when the trunk was in a more stable state.
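
For reference, a checkout of that revision would look roughly like the line below (the
repository URL is assumed from the project's SVN hosting of that era, so adjust if needed):

svn co -r 30875 https://svn.open-mpi.org/svn/ompi/trunk ompi-trunk-r30875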

Best,

Josh

On Wed, Aug 13, 2014 at 9:29 AM, Lenny Verkhovsky 
<len...@mellanox.com> wrote:
Hi,
I needed the following commit

r30875 | vasily | 2014-02-27 13:29:47 +0200 (Thu, 27 Feb 2014) | 3 lines
OPENIB BTL/CONNECT: Add support for AF_IB addressing in rdmacm.

Following Gilles's mail about the known #4857 issue I got the update, and now I can run
with more than 65 hosts.
(Thanks, Gilles.)

Since I am facing another problem, I probably should try 1.8rc as you suggested.
Thanks.
Lenny Verkhovsky
SW Engineer,  Mellanox Technologies
www.mellanox.com

Office: +972 74 712 9244
Mobile: +972 54 554 0233
Fax: +972 72 257 9400

From: devel [mailto:devel-boun...@open-mpi.org] On Behalf Of Joshua Ladd
Sent: Wednesday, August 13, 2014 4:20 PM
To: Open MPI Developers
Subject: Re: [OMPI devel] [OMPI users] OpenMPI fails with np > 65

Lenny,
Is there any particular reason that you're using the trunk? The reason I ask is 
because the trunk is in an unusually high state of flux at the moment with a 
major move underway. If you're trying to use OMPI for production grade runs, I 
would strongly advise picking up one of the stable releases in the 1.8.x 
series. At this time, 1.8.1 is available as the most current stable release. The
1.8.2rc3 release candidate is also available:

http://www.open-mpi.org/software/ompi/v1.8/
Best,
Josh



On Wed, Aug 13, 2014 at 5:19 AM, Gilles Gouaillardet 
<gilles.gouaillar...@iferc.org> wrote:
Lenny,

That looks related to #4857, which has been fixed in the trunk since r32517.

Could you please update your Open MPI library and try again?

Gilles


On 2014/08/13 17:00, Lenny Verkhovsky wrote:

Following Jeff's suggestion, adding the devel mailing list.



Hi All,

I am currently facing a strange situation where I can't run OMPI on more than 65
nodes.

It seems like an environment issue that does not allow me to open more
connections.

Any ideas ?

Log attached, more info below in the mail.



Running OMPI from trunk

[node-119.ssauniversal.ssa.kodiak.nx:02996] [[56978,0],65] ORTE_ERROR_LOG: 
Error in file base/ess_base_std_orted.c at line 288



Thanks.

Lenny Verkhovsky

SW Engineer,  Mellanox Technologies

www.mellanox.com





Office: +972 74 712 9244

Mobile: +972 54 554 0233

Fax: +972 72 257 9400



From: users [mailto:users-boun...@open-mpi.org] On Behalf Of Lenny Verkhovsky

Sent: Tuesday, August 12, 2014 1:13 PM

To: Open MPI Users

Subject: Re: [OMPI users] OpenMPI fails with np > 65





Hi,



Config:

./configure --enable-openib-rdmacm-ibaddr --prefix /home/sources/ompi-bin 
--enable-mpirun-prefix-by-default --with-openib=/usr/local --enable-debug 
--disable-openib-connectx-xrc



Run:

/home/sources/ompi-bin/bin/mpirun -np 65 --host 
ko0067,ko0069,ko0070,ko0074,ko0076,ko0079,ko0080,ko0082,ko0085,ko0087,ko0088,ko0090,ko0096,ko0098,ko0099,ko0101,ko0103,ko0107,ko0111,ko0114,ko0116,ko0125,ko0128,ko0134,ko0141,ko0144,ko0145,ko0148,ko0149,ko0150,ko0152,ko0154,ko0156,ko0157,ko0158,ko0162,ko0164,ko0166,ko0168,ko0170,ko0174,ko0178,ko0181,ko0185,ko0190,ko0192,ko0195,ko0197,ko0200,ko0203,ko0205,ko0207,ko0209,ko0210,ko0211,ko0213,ko0214,ko0217,ko0218,ko0223,ko0228,ko0229,ko0231,ko0235,ko0237
 --mca btl openib,self  --mca btl_openib_cpc_include rdmacm --mca pml ob1 --mca 
btl_openib_if_include mthca0:1 --mca plm_base_verbose 5 --debug-daemons 
hostname 2>&1|tee > /tmp/mpi.log



Environment:

 According to the attached log, it's an rsh environment.





Output attached



Notes:

The problem is always with the last node: 64 connections work, 65 connections
fail.

node-119.ssauniversal.ssa.kodiak.nx == ko0237



mpi.log line 1034:

--

An invalid value was supplied for an enum variable.

Re: [OMPI devel] [OMPI users] OpenMPI fails with np > 65

2014-08-13 Thread Lenny Verkhovsky
Hi,
I needed the following commit

r30875 | vasily | 2014-02-27 13:29:47 +0200 (Thu, 27 Feb 2014) | 3 lines
OPENIB BTL/CONNECT: Add support for AF_IB addressing in rdmacm.

Following Gilles's mail about the known #4857 issue I got the update, and now I can run
with more than 65 hosts.
(Thanks, Gilles.)

Since I am facing another problem, I probably should try 1.8rc as you suggested.
Thanks.
Lenny Verkhovsky
SW Engineer,  Mellanox Technologies
www.mellanox.com

Office:+972 74 712 9244
Mobile:  +972 54 554 0233
Fax:+972 72 257 9400

From: devel [mailto:devel-boun...@open-mpi.org] On Behalf Of Joshua Ladd
Sent: Wednesday, August 13, 2014 4:20 PM
To: Open MPI Developers
Subject: Re: [OMPI devel] [OMPI users] OpenMPI fails with np > 65

Lenny,
Is there any particular reason that you're using the trunk? The reason I ask is 
because the trunk is in an unusually high state of flux at the moment with a 
major move underway. If you're trying to use OMPI for production grade runs, I 
would strongly advise picking up one of the stable releases in the 1.8.x 
series. At this time, 1.8.1 is available as the most current stable release. The
1.8.2rc3 release candidate is also available:

http://www.open-mpi.org/software/ompi/v1.8/
Best,
Josh




On Wed, Aug 13, 2014 at 5:19 AM, Gilles Gouaillardet 
<gilles.gouaillar...@iferc.org> wrote:
Lenny,

That looks related to #4857, which has been fixed in the trunk since r32517.

Could you please update your Open MPI library and try again?

Gilles


On 2014/08/13 17:00, Lenny Verkhovsky wrote:

Following Jeff's suggestion, adding the devel mailing list.



Hi All,

I am currently facing a strange situation where I can't run OMPI on more than 65
nodes.

It seems like an environment issue that does not allow me to open more
connections.

Any ideas ?

Log attached, more info below in the mail.



Running OMPI from trunk

[node-119.ssauniversal.ssa.kodiak.nx:02996] [[56978,0],65] ORTE_ERROR_LOG: 
Error in file base/ess_base_std_orted.c at line 288



Thanks.

Lenny Verkhovsky

SW Engineer,  Mellanox Technologies

www.mellanox.com





Office: +972 74 712 9244

Mobile: +972 54 554 0233

Fax: +972 72 257 9400



From: users [mailto:users-boun...@open-mpi.org] On Behalf Of Lenny Verkhovsky

Sent: Tuesday, August 12, 2014 1:13 PM

To: Open MPI Users

Subject: Re: [OMPI users] OpenMPI fails with np > 65





Hi,



Config:

./configure --enable-openib-rdmacm-ibaddr --prefix /home/sources/ompi-bin 
--enable-mpirun-prefix-by-default --with-openib=/usr/local --enable-debug 
--disable-openib-connectx-xrc



Run:

/home/sources/ompi-bin/bin/mpirun -np 65 --host 
ko0067,ko0069,ko0070,ko0074,ko0076,ko0079,ko0080,ko0082,ko0085,ko0087,ko0088,ko0090,ko0096,ko0098,ko0099,ko0101,ko0103,ko0107,ko0111,ko0114,ko0116,ko0125,ko0128,ko0134,ko0141,ko0144,ko0145,ko0148,ko0149,ko0150,ko0152,ko0154,ko0156,ko0157,ko0158,ko0162,ko0164,ko0166,ko0168,ko0170,ko0174,ko0178,ko0181,ko0185,ko0190,ko0192,ko0195,ko0197,ko0200,ko0203,ko0205,ko0207,ko0209,ko0210,ko0211,ko0213,ko0214,ko0217,ko0218,ko0223,ko0228,ko0229,ko0231,ko0235,ko0237
 --mca btl openib,self  --mca btl_openib_cpc_include rdmacm --mca pml ob1 --mca 
btl_openib_if_include mthca0:1 --mca plm_base_verbose 5 --debug-daemons 
hostname 2>&1|tee > /tmp/mpi.log



Environment:

 According to the attached log, it's an rsh environment.





Output attached



Notes:

The problem is always with the last node: 64 connections work, 65 connections
fail.

node-119.ssauniversal.ssa.kodiak.nx == ko0237



mpi.log line 1034:

--

An invalid value was supplied for an enum variable.

  Variable : orte_debug_daemons

  Value: 1,1

  Valid values : 0: f|false|disabled, 1: t|true|enabled

--



mpi.log line 1059:

[node-119.ssauniversal.ssa.kodiak.nx:02996] [[56978,0],65] ORTE_ERROR_LOG: 
Error in file base/ess_base_std_orted.c at line 288







Lenny Verkhovsky

SW Engineer,  Mellanox Technologies

www.mellanox.com





Office: +972 74 712 9244

Mobile: +972 54 554 0233

Fax: +972 72 257 9400



From: users [mailto:users-boun...@open-mpi.org] On Behalf Of Ralph Castain

Sent: Monday, August 11, 2014 4:53 PM

To: Open MPI Users

Subject: Re: [OMPI users] OpenMPI fails with np > 65



Okay, let's start with the basics :-)



How was this configured? What environment are you running in (rsh, slurm, ??)? 
If y

Re: [OMPI devel] [OMPI users] OpenMPI fails with np > 65

2014-08-13 Thread Lenny Verkhovsky
Following Jeff's suggestion, adding the devel mailing list.

Hi All,
I am currently facing a strange situation where I can't run OMPI on more than 65
nodes.
It seems like an environment issue that does not allow me to open more
connections.
Any ideas ?
Log attached, more info below in the mail.

Running OMPI from trunk
[node-119.ssauniversal.ssa.kodiak.nx:02996] [[56978,0],65] ORTE_ERROR_LOG: 
Error in file base/ess_base_std_orted.c at line 288

Thanks.
Lenny Verkhovsky
SW Engineer,  Mellanox Technologies
www.mellanox.com

Office:+972 74 712 9244
Mobile:  +972 54 554 0233
Fax:+972 72 257 9400

From: users [mailto:users-boun...@open-mpi.org] On Behalf Of Lenny Verkhovsky
Sent: Tuesday, August 12, 2014 1:13 PM
To: Open MPI Users
Subject: Re: [OMPI users] OpenMPI fails with np > 65


Hi,

Config:
./configure --enable-openib-rdmacm-ibaddr --prefix /home/sources/ompi-bin 
--enable-mpirun-prefix-by-default --with-openib=/usr/local --enable-debug 
--disable-openib-connectx-xrc

Run:
/home/sources/ompi-bin/bin/mpirun -np 65 --host 
ko0067,ko0069,ko0070,ko0074,ko0076,ko0079,ko0080,ko0082,ko0085,ko0087,ko0088,ko0090,ko0096,ko0098,ko0099,ko0101,ko0103,ko0107,ko0111,ko0114,ko0116,ko0125,ko0128,ko0134,ko0141,ko0144,ko0145,ko0148,ko0149,ko0150,ko0152,ko0154,ko0156,ko0157,ko0158,ko0162,ko0164,ko0166,ko0168,ko0170,ko0174,ko0178,ko0181,ko0185,ko0190,ko0192,ko0195,ko0197,ko0200,ko0203,ko0205,ko0207,ko0209,ko0210,ko0211,ko0213,ko0214,ko0217,ko0218,ko0223,ko0228,ko0229,ko0231,ko0235,ko0237
 --mca btl openib,self  --mca btl_openib_cpc_include rdmacm --mca pml ob1 --mca 
btl_openib_if_include mthca0:1 --mca plm_base_verbose 5 --debug-daemons 
hostname 2>&1|tee > /tmp/mpi.log

Environment:
 According to the attached log, it's an rsh environment.


Output attached

Notes:
The problem is always with the last node: 64 connections work, 65 connections
fail.
node-119.ssauniversal.ssa.kodiak.nx == ko0237

mpi.log line 1034:
--
An invalid value was supplied for an enum variable.
  Variable : orte_debug_daemons
  Value: 1,1
  Valid values : 0: f|false|disabled, 1: t|true|enabled
--

mpi.log line 1059:
[node-119.ssauniversal.ssa.kodiak.nx:02996] [[56978,0],65] ORTE_ERROR_LOG: 
Error in file base/ess_base_std_orted.c at line 288



Lenny Verkhovsky
SW Engineer,  Mellanox Technologies
www.mellanox.com

Office:+972 74 712 9244
Mobile:  +972 54 554 0233
Fax:+972 72 257 9400

From: users [mailto:users-boun...@open-mpi.org] On Behalf Of Ralph Castain
Sent: Monday, August 11, 2014 4:53 PM
To: Open MPI Users
Subject: Re: [OMPI users] OpenMPI fails with np > 65

Okay, let's start with the basics :-)

How was this configured? What environment are you running in (rsh, slurm, ??)? 
If you configured --enable-debug, then please run it with

--mca plm_base_verbose 5 --debug-daemons

and send the output


On Aug 11, 2014, at 12:07 AM, Lenny Verkhovsky 
<len...@mellanox.com> wrote:

I don't think so.
It's always the 66th node, even if I swap the 65th and 66th.
I also get the same error when setting np=66 while having only 65 hosts in the
hostfile.
(I am using only the tcp btl.)


Lenny Verkhovsky
SW Engineer,  Mellanox Technologies
www.mellanox.com

Office:+972 74 712 9244
Mobile:  +972 54 554 0233
Fax:+972 72 257 9400

From: users [mailto:users-boun...@open-mpi.org] On Behalf Of Ralph Castain
Sent: Monday, August 11, 2014 1:07 AM
To: Open MPI Users
Subject: Re: [OMPI users] OpenMPI fails with np > 65

Looks to me like your 65th host is missing the dstore library - is it possible 
you don't have your paths set correctly on all hosts in your hostfile?


On Aug 10, 2014, at 1:13 PM, Lenny Verkhovsky 
<len...@mellanox.com> wrote:


Hi all,

Trying to run OpenMPI (trunk revision 32428), I faced a problem running
OMPI with more than 65 procs.
It looks like MPI fails to open the 66th connection even when running `hostname`
over tcp.
It also seems to be unrelated to a specific host.
All hosts are Ubuntu 12.04.1 LTS.

mpirun -np 66 --hostfile /proj/SSA/Mellanox/tmp//20140810_070156_hostfile.txt 
--mca btl tcp,self hostname
[nodename] [[4452,0],65] ORTE_ERROR_LOG: Error in file 
base/ess_base_std_orted.c at line 288

...
It looks like an environment issue, but I can't find any related limit.
Any ideas ?
Thanks.
Lenny Verkhovsky
SW Engineer,  Mellanox Technologies
www.mellanox.com

Office:+972 74 712 9244
Mobile:  +972 54 554 0233
Fax:+972 72 257 9400
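
In case it helps others hitting the same wall, the generic per-user and launcher limits
that can cap the number of simultaneous connections/daemons in an rsh/ssh-launched run
can be checked roughly like this (generic diagnostics, not specific to this cluster):

$ ulimit -n    # max open file descriptors per process
$ ulimit -u    # max user processes
$ grep -i maxstartups /etc/ssh/sshd_config    # concurrent unauthenticated ssh connections, if ssh does the launching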

___
users mailing list
us...@open-mpi.org
Subscription: http://www.open-mpi.org/mailman/listi

Re: [OMPI devel] crash when using coll_tuned_use_dynamic_rules option with 1.4

2010-01-24 Thread Lenny Verkhovsky
It's a known issue;
try providing a file with rules.
https://svn.open-mpi.org/trac/ompi/ticket/2087
Lenny.
On Fri, Jan 22, 2010 at 8:25 PM, Holger Berger  wrote:

> Hi,
>
> I tracked this down a bit, and my impression is that this piece of code in
> coll_tuned_component.c
>
>if (ompi_coll_tuned_use_dynamic_rules) {
>
>  mca_base_param_reg_string(&mca_coll_tuned_component.super.collm_version,
>  "dynamic_rules_filename",
>  "Filename of configuration file that
> contains the dynamic (@runtime) decision function rules",
>  false, false,
> ompi_coll_tuned_dynamic_rules_filename,
>  &ompi_coll_tuned_dynamic_rules_filename);
>if( ompi_coll_tuned_dynamic_rules_filename ) {
>OPAL_OUTPUT((ompi_coll_tuned_stream,"coll:tuned:component_open
> Reading collective rules file [%s]",
> ompi_coll_tuned_dynamic_rules_filename));
>rc = ompi_coll_tuned_read_rules_config_file(
> ompi_coll_tuned_dynamic_rules_filename,
>
> &(mca_coll_tuned_component.all_base_rules), COLLCOUNT);
>if( rc >= 0 ) {
>OPAL_OUTPUT((ompi_coll_tuned_stream,"coll:tuned:module_open
> Read %d valid rules\n", rc));
>} else {
>OPAL_OUTPUT((ompi_coll_tuned_stream,"coll:tuned:module_open
> Reading collective rules file failed\n"));
>mca_coll_tuned_component.all_base_rules = NULL;
>}
>}
>
>}
>
> Does not initialize the msg_rules as ompi_coll_tuned_read_rules_config_file
> does it by calling
> ompi_coll_tuned_mk_msg_rules in the case that
>
> ompi_coll_tuned_use_dynamic_rules is TRUE
> and
> ompi_coll_tuned_dynamic_rules_filename is FALSE
>
> which leads to a crash in line
>  if( (NULL == base_com_rule) || (0 == base_com_rule->n_msg_sizes))
> in coll_tuned_dynamic_rules.c:361
> as base_com_rule seems to be uninitialized, but NOT zero, and points
> somewhere...
>
>
> That is probably not intended, as it prohibits the selection of an
> algorithm
> by switch like -mca coll_tuned_alltoall_algorithm 2.
>
> Hope that helps fixing it...
>
>
>
>
>
> --
> Holger Berger
> System Integration and Support
> HPCE Division NEC Deutschland GmbH
> Tel: +49-711-6877035 hber...@hpce.nec.com
> Fax: +49-711-6877145 http://www.nec.com/de
> NEC Deutschland GmbH, Hansaallee 101, 40549 Düsseldorf
> Geschäftsführer Yuya Momose
> Handelsregister Düsseldorf HRB 57941; VAT ID DE129424743
>
> ___
> devel mailing list
> de...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>


Re: [OMPI devel] [OMPI users] openmpi 1.4 broken -mca coll_tuned_use_dynamic_rules 1

2009-12-30 Thread Lenny Verkhovsky
The only workaround that I found is a file with dynamic rules.
This is an example that George sent me once. It worked for me, and will do until
this is fixed.

" Lenny,

You asked for dynamic rules but it looks like you didn't provide them.
Dynamic rules allow the user to specify which algorithm to be used for each
collective based on a set of rules. I corrected the current behavior, so it
will not crash. However, as you didn't provide dynamic rules, it will just
switch back to default behavior (i.e. ignore the
coll_tuned_use_dynamic_rules MCA parameter).

As an example, here is a set of dynamic rules. I added some comment to
clarify it, but if you have any questions please ask.

2 # num of collectives
3 # ID = 3 Alltoall collective (ID in coll_tuned.h)
1 # number of com sizes
64 # comm size 64
2 # number of msg sizes
0 3 0 0 # for message size 0, bruck 1, topo 0, 0 segmentation
8192 2 0 0 # 8k+, pairwise 2, no topo or segmentation
# end of collective rule
#
2 # ID = 2 Allreduce collective (ID in coll_tuned.h)
1 # number of com sizes
1 # comm size 2
2 # number of msg sizes
0 1 0 0 # for message size 0, basic linear 1, topo 0, 0 segmentation
1024 2 0 0 # for messages size > 1024, nonoverlapping 2, topo 0, 0
segmentation
# end of collective rule
#

And here is what I have in my $(HOME)/.openmpi/mca-params.conf to activate
them:
#
# Dealing with collective
#
coll_base_verbose = 0

coll_tuned_use_dynamic_rules = 1
coll_tuned_dynamic_rules_filename = **the name of the file where you saved
the rules **

"

On Wed, Dec 30, 2009 at 4:44 PM, Daniel Spångberg <dani...@mkem.uu.se>wrote:

> Interesting. I found your issue before I sent my report, but I did not
> realise that this was the same problem. I see now that your example is
> really for openmpi 1.3.4++
>
> Do you know of a workaround? I have not used a rule file before and seem
> to be unable to find the documentation for how to use one, unfortunately.
>
> Daniel
>
> On 2009-12-30 15:17:17, Lenny Verkhovsky <lenny.verkhov...@gmail.com> wrote:
>
>
>  This is a known issue:
>>https://svn.open-mpi.org/trac/ompi/ticket/2087
>> Maybe its priority should be raised.
>> Lenny.
>>
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
>


Re: [OMPI devel] [OMPI users] openmpi 1.4 broken -mca coll_tuned_use_dynamic_rules 1

2009-12-30 Thread Lenny Verkhovsky
This is a known issue:
https://svn.open-mpi.org/trac/ompi/ticket/2087
Maybe its priority should be raised.
Lenny.

On Wed, Dec 30, 2009 at 12:13 PM, Daniel Spångberg wrote:

> Dear OpenMPI list,
>
> I have used the dynamic rules for collectives to be able to select one
> specific algorithm. With the latest versions of openmpi this seems to be
> broken. Just enabling coll_tuned_use_dynamic_rules causes the code to
> segfault. However, I do not provide a file with rules, since I just want to
> modify the behavior of one routine.
>
> I have tried the below example code on openmpi 1.3.2, 1.3.3, 1.3.4, and
> 1.4. It *works* on 1.3.2, 1.3.3, but segfaults on 1.3.4 and 1.4. I have
> confirmed this on Scientific Linux 5.2, and 5.4. I have also successfully
> reproduced the crash using version 1.4 running on debian etch. All running
> on amd64, compiled from source without other options to configure than
> --prefix. The crash occurs whether I use the intel 11.1 compiler (via env
> CC) or gcc. It also occurs no matter the btl is set to openib,self tcp,self
> sm,self or combinations of those. See below for ompi_info and other info. I
> have tried MPI_Alltoall, MPI_Alltoallv, and MPI_Allreduce which behave the
> same.
>
> #include <stdlib.h>
> #include <mpi.h>
>


>
> int main(int argc, char **argv)
> {
>  int rank,size;
>  char *buffer, *buffer2;
>
>  MPI_Init(&argc,&argv);
>
>  MPI_Comm_size(MPI_COMM_WORLD,&size);
>  MPI_Comm_rank(MPI_COMM_WORLD,&rank);
>
>  buffer=calloc(100*size,1);
>  buffer2=calloc(100*size,1);
>
>  MPI_Alltoall(buffer,100,MPI_BYTE,buffer2,100,MPI_BYTE,MPI_COMM_WORLD);
>
>  MPI_Finalize();
>  return 0;
> }
>
> Demonstrated behaviour:
>
> $ ompi_info
> Package: Open MPI daniels@arthur Distribution
>Open MPI: 1.4
>   Open MPI SVN revision: r22285
>   Open MPI release date: Dec 08, 2009
>Open RTE: 1.4
>   Open RTE SVN revision: r22285
>   Open RTE release date: Dec 08, 2009
>OPAL: 1.4
>   OPAL SVN revision: r22285
>   OPAL release date: Dec 08, 2009
>Ident string: 1.4
>  Prefix:
> /home/daniels/src/MISC/openmpi-1.4/openmpi-1.4_install
>  Configured architecture: x86_64-unknown-linux-gnu
>  Configure host: arthur
>   Configured by: daniels
>   Configured on: Tue Dec 29 16:54:37 CET 2009
>  Configure host: arthur
>Built by: daniels
>Built on: Tue Dec 29 17:04:36 CET 2009
>  Built host: arthur
>  C bindings: yes
>C++ bindings: yes
>  Fortran77 bindings: yes (all)
>  Fortran90 bindings: yes
>  Fortran90 bindings size: small
>  C compiler: gcc
> C compiler absolute: /usr/bin/gcc
>C++ compiler: g++
>   C++ compiler absolute: /usr/bin/g++
>  Fortran77 compiler: gfortran
>  Fortran77 compiler abs: /usr/bin/gfortran
>  Fortran90 compiler: gfortran
>  Fortran90 compiler abs: /usr/bin/gfortran
> C profiling: yes
>   C++ profiling: yes
> Fortran77 profiling: yes
> Fortran90 profiling: yes
>  C++ exceptions: no
>  Thread support: posix (mpi: no, progress: no)
>   Sparse Groups: no
>  Internal debug support: no
> MPI parameter check: runtime
> Memory profiling support: no
> Memory debugging support: no
> libltdl support: yes
>   Heterogeneous support: no
>  mpirun default --prefix: no
> MPI I/O support: yes
>   MPI_WTIME support: gettimeofday
> Symbol visibility support: yes
>   FT Checkpoint support: no  (checkpoint thread: no)
>   MCA backtrace: execinfo (MCA v2.0, API v2.0, Component v1.4)
>  MCA memory: ptmalloc2 (MCA v2.0, API v2.0, Component v1.4)
>   MCA paffinity: linux (MCA v2.0, API v2.0, Component v1.4)
>
>   MCA carto: auto_detect (MCA v2.0, API v2.0, Component v1.4)
>   MCA carto: file (MCA v2.0, API v2.0, Component v1.4)
>   MCA maffinity: first_use (MCA v2.0, API v2.0, Component v1.4)
>   MCA timer: linux (MCA v2.0, API v2.0, Component v1.4)
> MCA installdirs: env (MCA v2.0, API v2.0, Component v1.4)
> MCA installdirs: config (MCA v2.0, API v2.0, Component v1.4)
> MCA dpm: orte (MCA v2.0, API v2.0, Component v1.4)
>  MCA pubsub: orte (MCA v2.0, API v2.0, Component v1.4)
>   MCA allocator: basic (MCA v2.0, API v2.0, Component v1.4)
>   MCA allocator: bucket (MCA v2.0, API v2.0, Component v1.4)
>MCA coll: basic (MCA v2.0, API v2.0, Component v1.4)
>MCA coll: hierarch (MCA v2.0, API v2.0, Component v1.4)
>MCA coll: inter (MCA v2.0, API v2.0, Component v1.4)
>MCA coll: self (MCA v2.0, API v2.0, Component v1.4)
>MCA coll: sm (MCA v2.0, API v2.0, Component v1.4)
>MCA coll: sync (MCA v2.0, API v2.0, Component v1.4)
>MCA coll: tuned (MCA v2.0, 

Re: [OMPI devel] SEGFAULT in mpi_init from paffinity with intel 11.1.059 compiler

2009-12-16 Thread Lenny Verkhovsky
Hi,
Can you provide the output of "cat /proc/cpuinfo"?
I am not optimistic that it will help, but still...
thanks
Lenny.

On Wed, Dec 16, 2009 at 6:01 PM, Daan van Rossum wrote:

> Hi Terry,
>
> Thanks for your hint. I tried configure --enable-debug and even compiled it
> with all kinds of manual debug flags turned on, but it doesn't help to get
> rid of this problem. So it definitely is not an optimization flaw.
> One more interesting test would be to try an older version of the Intel
> compiler. But the next older version that I have is 10.0.015, which is too
> old for the operating system (must be >10.1).
>
>
> A good thing is that this bug is very easy to test. You only need one line
> of MPI code and one process in the execution.
>
> A few more test cases:
>  rank 0=node01 slot=1-7
> and
>  rank 0=node01 slot=0,2-7
> and
>  rank 0=node01 slot=0-1,3-7
> work WELL.
> But
>  rank 0=node01 slot=0-2,4-7
> FAILS.
>
> As long as either slot 0, 1, OR 2 is excluded from the list, it's all right.
> Excluding a different slot, like slot 3, does not help.
>
>
> I'll try to get hold of an Intel v10.1 compiler version.
>
> Best,
> Daan
>
> * on Monday, 14.12.09 at 14:57, Terry Dontje  wrote:
>
> > I don't really want to throw fud on this list but we've seen all
> > sorts of oddities with OMPI 1.3.4 being built with Intel's 11.1
> > compiler versus their 11.0 or other compilers (gcc, Sun Studio, pgi,
> > and pathscale).  I have not tested your specific failing case but
> > considering your issue doesn't show up with gcc I am wondering if
> > there is some sort of optimization issue with the 11.1 compiler.
> >
> > It might be interesting to see if using certain optimization levels
> > with the Intel 11.1 compiler produces a working OMPI library.
> >
> > --td
> >
> > Daan van Rossum wrote:
> > >Hi Ralph,
> > >
> > >I took the Dec 10th snapshot, but got exactly the same behavior as with
> version 1.3.4.
> > >
> > >I just noticed that even this rankfile doesn't work, with a single
> process:
> > > rank 0=node01 slot=0-3
> > >
> > >
> > >[node01:31105] mca:base:select:(paffinity) Querying component [linux]
> > >[node01:31105] mca:base:select:(paffinity) Query of component [linux]
> set priority to 10
> > >[node01:31105] mca:base:select:(paffinity) Selected component [linux]
> > >[node01:31105] paffinity slot assignment: slot_list == 0-3
> > >[node01:31105] paffinity slot assignment: rank 0 runs on cpu #0 (#0)
> > >[node01:31105] paffinity slot assignment: rank 0 runs on cpu #1 (#1)
> > >[node01:31105] paffinity slot assignment: rank 0 runs on cpu #2 (#2)
> > >[node01:31105] paffinity slot assignment: rank 0 runs on cpu #3 (#3)
> > >[node01:31106] mca:base:select:(paffinity) Querying component [linux]
> > >[node01:31106] mca:base:select:(paffinity) Query of component [linux]
> set priority to 10
> > >[node01:31106] mca:base:select:(paffinity) Selected component [linux]
> > >[node01:31106] paffinity slot assignment: slot_list == 0-3
> > >[node01:31106] paffinity slot assignment: rank 0 runs on cpu #0 (#0)
> > >[node01:31106] paffinity slot assignment: rank 0 runs on cpu #1 (#1)
> > >[node01:31106] paffinity slot assignment: rank 0 runs on cpu #2 (#2)
> > >[node01:31106] paffinity slot assignment: rank 0 runs on cpu #3 (#3)
> > >[node01:31106] *** An error occurred in MPI_Comm_rank
> > >[node01:31106] *** on a NULL communicator
> > >[node01:31106] *** Unknown error
> > >[node01:31106] *** MPI_ERRORS_ARE_FATAL (your MPI job will now abort)
> > >forrtl: severe (174): SIGSEGV, segmentation fault occurred
> > >
> > >
> > >The spawned compute process doesn't sense that it should skip the
> setting paffinity...
> > >
> > >
> > >I saw the posting from last July about a similar problem (the problem
> that I mentioned on the bottom, with the slot=0:* notation not working). But
> that is a different problem (besides, that is still not working as it
> seems).
> > >
> > >Best,
> > >Daan
> > >
> > >* on Saturday, 12.12.09 at 18:48, Ralph Castain 
> wrote:
> > >
> > >>This looks like an uninitialized variable that gnu c handles one way
> and intel another. Someone recently contributed a patch to the ompi trunk to
> fix just such a  thing in this code area - don't know if it addresses this
> problem or not.
> > >>
> > >>Can you try the ompi trunk (a nightly tarball from the last day or so
> forward) and see if this still occurs?
> > >>
> > >>Thanks
> > >>Ralph
> > >>
> > >>On Dec 11, 2009, at 4:06 PM, Daan van Rossum wrote:
> > >>
> > >>>Hi all,
> > >>>
> > >>>There's a problem with ompi 1.3.4 when compiled with the intel
> 11.1.059 C compiler, related to the built-in processor binding
> functionality. The problem does not occur when ompi is compiled with the
> gnu c compiler.
> > >>>
> > >>>A mpi program execution fails (segfault) on mpi_init() when the
> following rank file is used:
> > >>>rank 0=node01 slot=0-3
> > >>>rank 1=node01 slot=0-3
> > >>>but runs 

Re: [OMPI devel] [PATCH] Improving heterogeneous IB clusters support.

2009-11-29 Thread Lenny Verkhovsky
We are currently without a heterogeneous cluster and can't really check this
patch.
Lenny.

On Wed, Nov 25, 2009 at 12:14 AM, Jeff Squyres  wrote:

> On Nov 16, 2009, at 10:46 AM, Vasily Philipov wrote:
>
>  Here is a new patch for heterogeneous cluster support.
>>
>>
> Voltaire / IBM / Sun -- please review and test this patch.  You guys care
> about this stuff more than I do.  :-)
>
> My comments below.
>
>  diff -r 521e5f4b161a ompi/mca/btl/openib/btl_openib.c
>> --- a/ompi/mca/btl/openib/btl_openib.c  Fri Nov 06 12:00:16 2009 -0800
>> +++ b/ompi/mca/btl/openib/btl_openib.c  Mon Nov 16 17:41:48 2009 +0200
>> @@ -39,6 +39,8 @@
>> #include "ompi/runtime/ompi_cr.h"
>> #endif
>>
>> +#include "btl_openib_ini.h"
>> +
>> #include "btl_openib.h"
>> #include "btl_openib_frag.h"
>> #include "btl_openib_proc.h"
>> @@ -287,6 +289,158 @@
>>return rc;
>> }
>>
>> +const char* btl_openib_get_transport_name(mca_btl_openib_transport_type_t
>> transport_type)
>> +{
>> +switch(transport_type) {
>> +case MCA_BTL_OPENIB_TRANSPORT_RDMAOE:
>> +return "MCA_BTL_OPENIB_TRANSPORT_RDMAOE";
>> +
>> +case MCA_BTL_OPENIB_TRANSPORT_IB:
>> +return "MCA_BTL_OPENIB_TRANSPORT_IB";
>> +
>> +case MCA_BTL_OPENIB_TRANSPORT_IWARP:
>> +return "MCA_BTL_OPENIB_TRANSPORT_IWARP";
>> +
>> +case MCA_BTL_OPENIB_TRANSPORT_UNKNOWN:
>> +default:
>> +return "MCA_BTL_OPENIB_TRANSPORT_UNKNOWN";
>> +}
>> +}
>>
>
> Do you want to make a char** array of these names rather than a function?
>  Doesn't really matter too much to me, but I thought I'd ask.
>
>  +mca_btl_openib_transport_type_t
>> mca_btl_openib_get_transport_type(mca_btl_openib_module_t* openib_btl)
>> +{
>> +#ifdef OMPI_HAVE_RDMAOE
>> +switch(openib_btl->ib_port_attr.transport) {
>>
>
> Are you 100% sure that all the other device drivers will fill in
> ib_port_attr.transport?  That's new in Mellanox's RDMAoE support, right?
>
>  +case RDMA_TRANSPORT_IB:
>> +return MCA_BTL_OPENIB_TRANSPORT_IB;
>> +
>> +case RDMA_TRANSPORT_IWARP:
>> +return MCA_BTL_OPENIB_TRANSPORT_IWARP;
>> +
>> +case RDMA_TRANSPORT_RDMAOE:
>> +return MCA_BTL_OPENIB_TRANSPORT_RDMAOE;
>> +
>> +default:
>> +return MCA_BTL_OPENIB_TRANSPORT_UNKNOWN;
>> +}
>> +#else
>> +#ifdef HAVE_STRUCT_IBV_DEVICE_TRANSPORT_TYPE
>> +switch(openib_btl->device->ib_dev->transport_type) {
>> +case IBV_TRANSPORT_IB:
>> +return MCA_BTL_OPENIB_TRANSPORT_IB;
>> +
>> +case IBV_TRANSPORT_IWARP:
>> +return MCA_BTL_OPENIB_TRANSPORT_IWARP;
>> +
>> +case IBV_TRANSPORT_UNKNOWN:
>> +default:
>> +return MCA_BTL_OPENIB_TRANSPORT_UNKNOWN;
>> +}
>> +#endif
>> +return MCA_BTL_OPENIB_TRANSPORT_IB;
>> +#endif
>> +}
>>
>
> Can you put in some comments explaining the above logic -- i.e., the rules
> about how transport_type and transport (what horrible names :-( ) are filled
> in, and why you check them in the order that you check them?
>
>  +static int mca_btl_openib_tune_endpoint(mca_btl_openib_module_t*
>> openib_btl,
>> +mca_btl_base_endpoint_t*
>> endpoint)
>> +{
>> +int ret = OMPI_SUCCESS;
>> +
>> +char* recv_qps = NULL;
>> +
>> +ompi_btl_openib_ini_values_t values;
>> +
>> +if(mca_btl_openib_get_transport_type(openib_btl) !=
>> endpoint->rem_info.rem_transport_type) {
>> +orte_show_help("help-mpi-btl-openib.txt",
>> +"conflicting transport types", true,
>> +orte_process_info.nodename,
>> +ibv_get_device_name(openib_btl->device->ib_dev),
>> +(openib_btl->device->ib_dev_attr).vendor_id,
>> +(openib_btl->device->ib_dev_attr).vendor_part_id,
>> +
>>  
>> btl_openib_get_transport_name(mca_btl_openib_get_transport_type(openib_btl)),
>> +
>>  endpoint->endpoint_proc->proc_ompi->proc_hostname,
>> +endpoint->rem_info.rem_vendor_id,
>> +endpoint->rem_info.rem_vendor_part_id,
>> +
>>  btl_openib_get_transport_name(endpoint->rem_info.rem_transport_type));
>> +
>> +return OMPI_ERROR;
>>
>
> I *love* the consistent use of show_help().  Bravo!  :-)
>
> Can you put in some comments about what exactly you're checking for?  For
> example, I see that the above logic is checking whether the transport types
> are different.  How exactly would we get to this point if the transport
> types are different?  Wouldn't we simply not try to connect them?  I.e., why
> is this an *error* rather than a "OMPI won't try to connect these
> endpoints"?
>
>  +}
>> +
>> +memset(&values, 0, sizeof(ompi_btl_openib_ini_values_t));
>> +ret = ompi_btl_openib_ini_query(endpoint->rem_info.rem_vendor_id,
>> +  endpoint->rem_info.rem_vendor_part_id,
>> );
>> +
>> +if (OMPI_SUCCESS 

Re: [OMPI devel] segv in coll tuned

2009-10-12 Thread Lenny Verkhovsky
Not since I started testing it :)
It fails somewhere in the ompi_coll_tuned_get_target_method_params function; I
am taking a look right now.

On Mon, Oct 12, 2009 at 3:33 PM, Terry Dontje <terry.don...@sun.com> wrote:

> Does that test also pass sometimes?  I am seeing some random set of tests
> segv'ing in the SM btl, using a v1.3 derivative.
>
> --td
> Lenny Verkhovsky wrote:
>
>> Hi,
>> I experience the following error with the current trunk, r22090. It also
>> occurs in the 1.3 branch.
>> #~/work/svn/ompi/branches/1.3//build_x86-64/install/bin/mpirun -H witch21
>> -np 4 -mca coll_tuned_use_dynamic_rules 1 ./IMB-MPI1  Sometimes it's an error,
>> and sometimes it's a segv. It reproduces with np>4.
>> [witch21:26540] *** An error occurred in MPI_Barrier
>> [witch21:26540] *** on communicator MPI COMMUNICATOR 3 SPLIT FROM 0
>> [witch21:26540] *** MPI_ERR_ARG: invalid argument of some other kind
>> [witch21:26540] *** MPI_ERRORS_ARE_FATAL (your MPI job will now abort)
>> --
>> mpirun has exited due to process rank 0 with PID 26540 on
>> node witch21 exiting without calling "finalize". This may
>> have caused other processes in the application to be
>> terminated by signals sent by mpirun (as reported here).
>> --
>> 3 total processes killed (some possibly by mpirun during cleanup)
>>
>> thanks
>> Lenny.
>> 
>>
>> ___
>> devel mailing list
>> de...@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>>
>>
>
> ___
> devel mailing list
> de...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>


[OMPI devel] segv in coll tuned

2009-10-12 Thread Lenny Verkhovsky
Hi,
I experience the following error with the current trunk, r22090. It also occurs
in the 1.3 branch.
#~/work/svn/ompi/branches/1.3//build_x86-64/install/bin/mpirun -H witch21
-np 4 -mca coll_tuned_use_dynamic_rules 1 ./IMB-MPI1
Sometimes it's an error, and sometimes it's a segv. It reproduces with np>4.
[witch21:26540] *** An error occurred in MPI_Barrier
[witch21:26540] *** on communicator MPI COMMUNICATOR 3 SPLIT FROM 0
[witch21:26540] *** MPI_ERR_ARG: invalid argument of some other kind
[witch21:26540] *** MPI_ERRORS_ARE_FATAL (your MPI job will now abort)
--
mpirun has exited due to process rank 0 with PID 26540 on
node witch21 exiting without calling "finalize". This may
have caused other processes in the application to be
terminated by signals sent by mpirun (as reported here).
--
3 total processes killed (some possibly by mpirun during cleanup)

thanks
Lenny.


Re: [OMPI devel] bug?

2009-10-01 Thread Lenny Verkhovsky
I will take a look;
originally it was supposed to bind the process to CPU #1 and CPU #3.

On Fri, Sep 25, 2009 at 4:57 PM, Eugene Loh  wrote:

> Thanks, filed as https://svn.open-mpi.org/trac/ompi/ticket/2030
>
> Ralph Castain wrote:
>
>  Circling some off-list comments back to the list...while we could and
>>  should error-out easier, this really isn't a supportable operation.  What
>> the cmd
>>
>> mpirun -n 2 -slot-list 1,3 foo
>>
>> appears to do is cause us to launch a 2-process job consisting of  vpid=1
>> and vpid=3, as opposed to the normal vpid=0 and 1.
>>
>> Not only is ORTE not prepared to handle this scenario, I believe it  will
>> cause problems in some areas within OMPI.
>>
>> I can try to make it fail nicer - someone with more knowledge of the
>>  intended slot-list behavior would have to make it do what they  actually
>> intended, or at least explain what is supposed o happen.
>>
>> Ralph
>>
>> On Sep 24, 2009, at 7:03 PM, Eugene Loh wrote:
>>
>>  mpirun -V
>>> mpirun (Open MPI) 1.4a1-1
>>>
>>> Ralph Castain wrote:
>>>
>>>  Sigh - you really need to remember to tell us what version you're
 talking about.

 On Sep 24, 2009, at 5:39 PM, Eugene Loh wrote:

  I assume this is a bug?
>
> % mpirun -np 2 -slot-list 1,3 hostname
> [saem9:10337] [[455,0],0] ORTE_ERROR_LOG: Not found in file base/
>  odls_base_default_fns.c at line 875
> [saem9:10337] *** Process received signal ***
> [saem9:10337] Signal: Segmentation fault (11)
> [saem9:10337] Signal code: Address not mapped (1)
> [saem9:10337] Failing at address: 0x4c
> [saem9:10337] [ 0] [0xe600]
> [saem9:10337] [ 1] /home/eugene/CTperf/test-CT821/paff_bug2/src/
>  myopt/lib/libopen-rte.so.0(orte_plm_base_launch_apps+0x78a)   
> [0xf7f8c206]
> [saem9:10337] [ 2] /home/eugene/CTperf/test-CT821/paff_bug2/src/
>  myopt/lib/openmpi/mca_plm_rsh.so [0xf7d13564]
> [saem9:10337] [ 3] mpirun [0x804b49d]
> [saem9:10337] [ 4] mpirun [0x804a456]
> [saem9:10337] [ 5] /lib/libc.so.6(__libc_start_main+0xdc)  [0xf7d348ac]
> [saem9:10337] [ 6] mpirun(orte_daemon_recv+0x201) [0x804a3b1]
> [saem9:10337] *** End of error message ***
> Segmentation fault
>


  ___
>>> devel mailing list
>>> de...@open-mpi.org
>>> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>>>
>>
>>
>> ___
>> devel mailing list
>> de...@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>>
>
>
> ___
> devel mailing list
> de...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>


Re: [OMPI devel] Error message improvement

2009-09-09 Thread Lenny Verkhovsky
fixed in r21956
__FUNCTION__ was replaced with __func__
thanks.
Lenny.

On Wed, Sep 9, 2009 at 2:59 PM, N.M. Maclaren <n...@cam.ac.uk> wrote:

> On Sep 9 2009, George Bosilca wrote:
>
>> On Sep 9, 2009, at 14:16 , Lenny Verkhovsky wrote:
>>
>>  Is a C99-compliant compiler something unusual,
>>> or is there a policy among OMPI developers/users that prevents me
>>> from using __func__ instead of hardcoded strings in the code?
>>>
>>
>> __func__ is what you should use. We take care of having it defined in
>>  _all_ cases. If the compiler doesn't support it we define it manually  (to
>> __FUNCTION__ or to __FILE__ in the worst case), so it is always  available
>> (even if it doesn't contain what one might expect, such as in the case of
>> __FILE__).
>>
>
> That's a good, practical solution.  A slight rider is that you shouldn't
> be clever with it - such as using it in preprocessor statements.  I tried
> some tests at one stage, and there were 'interesting' variations on how
> different compilers interpreted C99.  Let alone the fact that it might
> map to something else, with different rules.  If you need to play such
> games, use hard-coded names.
>
> Things may have stabilised since then, but I wouldn't bet on it.
>
> Regards,
> Nick Maclaren.
>
>
> ___
> devel mailing list
> de...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>


Re: [OMPI devel] Error message improvement

2009-09-09 Thread Lenny Verkhovsky
Hi All,
Is a C99-compliant compiler something unusual,
or is there a policy among OMPI developers/users that prevents me
from using __func__ instead of hardcoded strings in the code?
Thanks.
Lenny.

On Wed, Sep 9, 2009 at 1:48 PM, Nysal Jan <jny...@gmail.com> wrote:

> __FUNCTION__ is not portable.
> __func__ is, but it needs a C99-compliant compiler.
>
> --Nysal
>
> On Tue, Sep 8, 2009 at 9:06 PM, Lenny Verkhovsky <
> lenny.verkhov...@gmail.com> wrote:
>
>> fixed in r21952
>> thanks.
>>
>> On Tue, Sep 8, 2009 at 5:08 PM, Arthur Huillet 
>> <arthur.huil...@bull.net>wrote:
>>
>>> Lenny Verkhovsky wrote:
>>>
>>>> Why not use __FUNCTION__ in all our error messages?
>>>>
>>>
>>> Sounds good, this way the function names are always correct.
>>>
>>> --
>>> Greetings, A. Huillet
>>>
>>> ___
>>> devel mailing list
>>> de...@open-mpi.org
>>> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>>>
>>
>>
>> ___
>> devel mailing list
>> de...@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>>
>
>
> ___
> devel mailing list
> de...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>


Re: [OMPI devel] Error message improvement

2009-09-08 Thread Lenny Verkhovsky
fixed in r21952
thanks.

On Tue, Sep 8, 2009 at 5:08 PM, Arthur Huillet <arthur.huil...@bull.net>wrote:

> Lenny Verkhovsky wrote:
>
>> Why not use __FUNCTION__ in all our error messages?
>>
>
> Sounds good, this way the function names are always correct.
>
> --
> Greetings, A. Huillet
>
> ___
> devel mailing list
> de...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>


Re: [OMPI devel] Error message improvement

2009-09-08 Thread Lenny Verkhovsky
Why not use __FUNCTION__ in all our error messages?
diff -r 686ec286164a ompi/communicator/communicator.h
--- a/ompi/communicator/communicator.h Tue Sep 08 14:39:03 2009 +0200
+++ b/ompi/communicator/communicator.h Tue Sep 08 15:48:06 2009 +0200
@@ -313,7 +313,7 @@
 {
 #if OPAL_ENABLE_DEBUG
  if(peer_id >= comm->c_remote_group->grp_proc_count) {
- opal_output(0, "ompi_comm_lookup_peer: invalid peer index (%d)", peer_id);
+ opal_output(0, "%s: invalid peer index (%d)", __FUNCTION__,peer_id);
  return (struct ompi_proc_t *) NULL;
  }
 #endif


On Tue, Sep 8, 2009 at 4:49 PM, Arthur Huillet wrote:

> Hi,
>
> please find attached a patch to ompi/communicator/communicator.h that fixes
> the error message displayed by ompi_comm_peer_lookup() so the function name
> that appears is correct.
>
> --
> Greetings, A. Huillet
>
>
> ___
> devel mailing list
> de...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>


[OMPI devel] VMware and OpenMPI

2009-08-27 Thread Lenny Verkhovsky
Hi all,
Does OpenMPI support VMware ?
I am trying to run OpenMPI 1.3.3 on VMware, and it got stuck during the OSU
benchmarks and IMB.
It looks like a random deadlock; I wonder if anyone has ever tried it?
thanks,
Lenny.


Re: [OMPI devel] Heads up on new feature to 1.3.4

2009-08-17 Thread Lenny Verkhovsky
In a multi-job environment, can't we just start binding processes on the
first available and unused socket?
I mean the first job/user will start binding itself from socket 0,
and the next job/user will start binding itself from socket 2, for instance.
Lenny.
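
For reference, the behaviour being debated below maps onto mpirun switches along these
lines (option spellings as introduced in the 1.3.4/1.4 series, application name is a
placeholder, so treat this as a sketch):

mpirun -np 4 --bysocket --bind-to-socket --report-bindings ./my_app   # spread ranks across sockets, bind each to its socket
mpirun -np 4 --bind-to-core --report-bindings ./my_app                # bind each rank to a single core instead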

On Mon, Aug 17, 2009 at 6:02 AM, Ralph Castain  wrote:

>
> On Aug 16, 2009, at 8:16 PM, Eugene Loh wrote:
>
>  Chris Samuel wrote:
>
> - "Eugene Loh"   wrote:
>
>
>  This is an important discussion.
>
>
>  Indeed! My big fear is that people won't pick up the significance
> of the change and will complain about performance regressions
> in the middle of an OMPI stable release cycle.
>
>  2) The proposed OMPI bind-to-socket default is less severe. In the
> general case, it would allow multiple jobs to bind in the same way
> without oversubscribing any core or socket. (This comment added to
> the trac ticket.)
>
>
>  That's a nice clarification, thanks. I suspect though that the
> same issue we have with MVAPICH would occur if two 4 core jobs
> both bound themselves to the first socket.
>
>
>  Okay, so let me point out a second distinction from MVAPICH:  the default
> policy would be to spread out over sockets.
>
> Let's say you have two sockets, with four cores each.  Let's say you submit
> two four-core jobs.  The first job would put two processes on the first
> socket and two processes on the second.  The second job would do the same.
> The loading would be even.
>
> I'm not saying there couldn't be problems.  It's just that MVAPICH2 (at
> least what I looked at) has multiple shortfalls.  The binding is to fill up
> one socket after another (which decreases memory bandwidth per process and
> increases chances of collisions with other jobs) and binding is to core
> (increasing chances of oversubscribing cores).  The proposed OMPI behavior
> distributes over sockets (improving memory bandwidth per process and
> reducing collisions with other jobs) and binding is to sockets (reducing
> chances of oversubscribing cores, whether due to other MPI jobs or due to
> multithreaded processes).  So, the proposed OMPI behavior mitigates the
> problems.
>
> It would be even better to have binding selections adapt to other bindings
> on the system.
>
> In any case, regardless of what the best behavior is, I appreciate the
> point about changing behavior in the middle of a stable release.  Arguably,
> leaving significant performance on the table in typical situations is a bug
> that warrants fixing even in the middle of a release, but I won't try to
> settle that debate here.
>
>
> I think the problem here, Eugene, is that performance benchmarks are far
> from the typical application. We have repeatedly seen this - optimizing for
> benchmarks frequently makes applications run less efficiently. So I concur
> with Chris on this one - let's not go -too- benchmark happy and hurt the
> regular users.
>
> Here at LANL, binding to-socket instead of to-core hurts performance by
> ~5-10%, depending on the specific application. Of course, either binding
> method is superior to no binding at all...
>
> UNLESS you have a threaded application, in which case -any- binding can be
> highly detrimental to performance.
>
> So going slow on this makes sense. If we provide the capability, but leave
> it off by default, then people can test it against real applications and see
> the impact. Then we can better assess the right default settings.
>
> Ralph
>
>
>  3) Defaults (if I understand correctly) can be set differently
> on each cluster.
>
>
>  Yes, but the defaults should be sensible for the majority of
> clusters.  If the majority do indeed share nodes between jobs
> then I would suggest that the default should be off and the
> minority who don't share nodes should have to enable it.
>
>
>  In debates on this subject, I've heard people argue that:
>
> *) Though nodes are getting fatter, most are still thin.
>
> *) Resource managers tend to space share the cluster.
>  ___
> devel mailing list
> de...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>
>
>
> ___
> devel mailing list
> de...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>


Re: [OMPI devel] sm_coll segv

2009-08-10 Thread Lenny Verkhovsky
I also have another question
$ompi_info -aa|grep mpool |grep sm
  MCA coll: parameter "coll_sm_mpool" (current value: "sm", data source:
default value)
  MCA mpool: parameter "mpool_sm_allocator" (current value: "bucket", data
source: default value)

What do these names mean, and don't they have to be the same?
Lenny.

On Mon, Aug 10, 2009 at 5:11 PM, Lenny Verkhovsky <
lenny.verkhov...@gmail.com> wrote:

> Don't these allocations of bshe->smbhe_keys require some kind of memory
> translation from one proc's memory space to another (in the bootstrap_init
> function, ompi/mca/coll/sm/coll_sm_module.c)?
> If local rank 0 allocates (gets attached to) memory, others can't read it
> without proper translation.
> Lenny
>
> On Mon, Aug 10, 2009 at 2:26 PM, Lenny Verkhovsky <
> lenny.verkhov...@gmail.com> wrote:
>
>> We saw these segvs too, with and without setting the sm btl.
>>
>> On Fri, Aug 7, 2009 at 10:51 AM, Ralph Castain <r...@open-mpi.org> wrote:
>>
>>>
>>>
>>> On Thu, Aug 6, 2009 at 3:18 PM, Jeff Squyres <jsquy...@cisco.com> wrote:
>>>
>>>> Ok, with Terry's help, I found a segv in the coll sm.  If you run
>>>> without the sm btl, there's an obvious bad parameter that we're passing 
>>>> that
>>>> results in a segv.
>>>>
>>>> LANL -- can you confirm / deny that these are the segv's that you were
>>>> seeing?
>>>
>>>
>>> Yes we can deny that those are the segv's we were seeing - we definitely
>>> had the sm btl active. I'll rerun the test on Monday and add the stacktrace
>>> to your ticket.
>>>
>>> Ralph
>>>
>>>
>>>>
>>>> While fixing this, I noticed that the sm btl and sm coll are sharing an
>>>> mpool when both are running.  This probably used to be a good idea way back
>>>> when (e.g., when we were using a lot more shmem than we needed and core
>>>> counts were lower), but it seems like a bad idea now (e.g., the btl/sm is
>>>> fairly specific about the size of the mpool that is created -- it's just 
>>>> big
>>>> enough for its data structures).
>>>>
>>>> I'm therefore going to change the mpool string names that btl/sm and
>>>> coll/sm are looking for so that they get unique sm mpool modules.
>>>>
>>>> --
>>>> Jeff Squyres
>>>> jsquy...@cisco.com
>>>>
>>>> ___
>>>> devel mailing list
>>>> de...@open-mpi.org
>>>> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>>>>
>>>
>>>
>>> ___
>>> devel mailing list
>>> de...@open-mpi.org
>>> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>>>
>>
>>
>


Re: [OMPI devel] sm_coll segv

2009-08-10 Thread Lenny Verkhovsky
Don't these allocations of bshe->smbhe_keys require some kind of memory
translation from one proc's memory space to another (in the bootstrap_init
function, ompi/mca/coll/sm/coll_sm_module.c)?
If local rank 0 allocates (gets attached to) memory, others can't read it
without proper translation.
Lenny

On Mon, Aug 10, 2009 at 2:26 PM, Lenny Verkhovsky <
lenny.verkhov...@gmail.com> wrote:

> We saw these segvs too, with and without setting the sm btl.
>
> On Fri, Aug 7, 2009 at 10:51 AM, Ralph Castain <r...@open-mpi.org> wrote:
>
>>
>>
>> On Thu, Aug 6, 2009 at 3:18 PM, Jeff Squyres <jsquy...@cisco.com> wrote:
>>
>>> Ok, with Terry's help, I found a segv in the coll sm.  If you run without
>>> the sm btl, there's an obvious bad parameter that we're passing that results
>>> in a segv.
>>>
>>> LANL -- can you confirm / deny that these are the segv's that you were
>>> seeing?
>>
>>
>> Yes we can deny that those are the segv's we were seeing - we definitely
>> had the sm btl active. I'll rerun the test on Monday and add the stacktrace
>> to your ticket.
>>
>> Ralph
>>
>>
>>>
>>> While fixing this, I noticed that the sm btl and sm coll are sharing an
>>> mpool when both are running.  This probably used to be a good idea way back
>>> when (e.g., when we were using a lot more shmem than we needed and core
>>> counts were lower), but it seems like a bad idea now (e.g., the btl/sm is
>>> fairly specific about the size of the mpool that is created -- it's just big
>>> enough for its data structures).
>>>
>>> I'm therefore going to change the mpool string names that btl/sm and
>>> coll/sm are looking for so that they get unique sm mpool modules.
>>>
>>> --
>>> Jeff Squyres
>>> jsquy...@cisco.com
>>>
>>> ___
>>> devel mailing list
>>> de...@open-mpi.org
>>> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>>>
>>
>>
>> ___
>> devel mailing list
>> de...@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>>
>
>


Re: [OMPI devel] sm_coll segv

2009-08-10 Thread Lenny Verkhovsky
We saw these segvs too, with and without setting the sm btl.

On Fri, Aug 7, 2009 at 10:51 AM, Ralph Castain  wrote:

>
>
> On Thu, Aug 6, 2009 at 3:18 PM, Jeff Squyres  wrote:
>
>> Ok, with Terry's help, I found a segv in the coll sm.  If you run without
>> the sm btl, there's an obvious bad parameter that we're passing that results
>> in a segv.
>>
>> LANL -- can you confirm / deny that these are the segv's that you were
>> seeing?
>
>
> Yes we can deny that those are the segv's we were seeing - we definitely
> had the sm btl active. I'll rerun the test on Monday and add the stacktrace
> to your ticket.
>
> Ralph
>
>
>>
>> While fixing this, I noticed that the sm btl and sm coll are sharing an
>> mpool when both are running.  This probably used to be a good idea way back
>> when (e.g., when we were using a lot more shmem than we needed and core
>> counts were lower), but it seems like a bad idea now (e.g., the btl/sm is
>> fairly specific about the size of the mpool that is created -- it's just big
>> enough for its data structures).
>>
>> I'm therefore going to change the mpool string names that btl/sm and
>> coll/sm are looking for so that they get unique sm mpool modules.
>>
>> --
>> Jeff Squyres
>> jsquy...@cisco.com
>>
>> ___
>> devel mailing list
>> de...@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>>
>
>
> ___
> devel mailing list
> de...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>


Re: [OMPI devel] rankfile relative host claiming option patch

2009-06-26 Thread Lenny Verkhovsky
Thanks, Ralph,

So, if there are no other comments,
I will commit it on Sunday.

Thanks,
Lenny.

On Fri, Jun 26, 2009 at 6:37 AM, Ralph Castain <r...@open-mpi.org> wrote:

> Forget that comment, Lenny - I think this actually looks fine. The relative
> notation currently is only used in the allocators, not the mappers, so this
> is fine.
>
> Sorry for the confusion.
> Ralph
>
>
> On Jun 25, 2009, at 2:50 PM, Ralph Castain wrote:
>
> Question: for all other mappers, the relative rank is given with respect to
>> the allocation. It looks here like you are doing it relative to the list of
>> nodes, which is compiled from the allocation passed through hostfile and
>> -host options.
>>
>> Do you want to conform to the behavior of the other mappers? Or do
>> something different here?
>>
>> On Jun 25, 2009, at 10:10 AM, Lenny Verkhovsky wrote:
>>
>> Hi,
>>> Proposed small patch to extend current rankfile syntax to be compliant
>>> with orte_hosts syntax
>>> making it possible to claim relative hosts from the hostfile/scheduler
>>> by using +n# as the hostname, where 0 <= # < np
>>> ex:
>>> cat ~/work/svn/hpc/dev/test/Rankfile/rankfile
>>> rank 0=+n0 slot=0
>>> rank 1=+n0 slot=1
>>> rank 2=+n1 slot=2
>>> rank 3=+n1 slot=1
>>> for your review and blessing before I commit it to the trunk.
>>> I also ask to add it to 1.3 branch.
>>> thanks,
>>> Lenny.
>>>
>>>
>>> Index: orte/mca/rmaps/rank_file/help-rmaps_rank_file.txt
>>> ===
>>> --- orte/mca/rmaps/rank_file/help-rmaps_rank_file.txt (revision 21529)
>>> +++ orte/mca/rmaps/rank_file/help-rmaps_rank_file.txt (working copy)
>>> @@ -56,6 +56,9 @@
>>> Please review your rank-slot assignments and your host allocation to
>>> ensure
>>> a proper match.
>>>
>>> +[bad-index]
>>> +Rankfile claimed host %s by index that is bigger than number of
>>> allocated hosts.
>>> +
>>> [orte-rmaps-rf:alloc-error]
>>> There are not enough slots available in the system to satisfy the %d
>>> slots
>>> that were requested by the application:
>>> Index: orte/mca/rmaps/rank_file/rmaps_rank_file_lex.h
>>> ===
>>> --- orte/mca/rmaps/rank_file/rmaps_rank_file_lex.h (revision 21529)
>>> +++ orte/mca/rmaps/rank_file/rmaps_rank_file_lex.h (working copy)
>>> @@ -75,6 +75,7 @@
>>> #define ORTE_RANKFILE_NEWLINE 13
>>> #define ORTE_RANKFILE_IPV6 14
>>> #define ORTE_RANKFILE_SLOT 15
>>> +#define ORTE_RANKFILE_RELATIVE 16
>>>
>>> #if defined(c_plusplus) || defined(__cplusplus)
>>> }
>>> Index: orte/mca/rmaps/rank_file/rmaps_rank_file.c
>>> ===
>>> --- orte/mca/rmaps/rank_file/rmaps_rank_file.c (revision 21529)
>>> +++ orte/mca/rmaps/rank_file/rmaps_rank_file.c (working copy)
>>> @@ -273,11 +273,11 @@
>>>  orte_vpid_t total_procs;
>>>  opal_list_t node_list;
>>>  opal_list_item_t *item;
>>> - orte_node_t *node, *nd;
>>> + orte_node_t *node, *nd, *root_node;
>>>  orte_vpid_t rank, vpid_start;
>>>  orte_std_cntr_t num_nodes, num_slots;
>>>  orte_rmaps_rank_file_map_t *rfmap;
>>> - orte_std_cntr_t slots_per_node;
>>> + orte_std_cntr_t slots_per_node, relative_index, tmp_cnt;
>>>  int rc;
>>>
>>>  /* convenience def */
>>> @@ -411,7 +411,25 @@
>>>  0 == strcmp(nd->name, rfmap->node_name)) {
>>>  node = nd;
>>>  break;
>>> - }
>>> + } else if (NULL != rfmap->node_name &&
>>> + (('+' == rfmap->node_name[0]) &&
>>> + (('n' == rfmap->node_name[1]) ||
>>> + ('N' == rfmap->node_name[1])))) {
>>> +
>>> + relative_index=atoi(strtok(rfmap->node_name,"+n"));
>>> + if ( relative_index >= opal_list_get_size (&node_list) || ( 0 >
>>> relative_index)){
>>> + orte_show_help("help-rmaps_rank_file.txt","bad-index",
>>> true,rfmap->node_name);
>>> + ORTE_ERROR_LOG(ORTE_ERR_BAD_PARAM);
>>> + return ORTE_ERR_BAD_PARAM;
>>> + }
>>> + root_node = (orte_node_t*) opal_list_get_first(&node_list);
>>> + for(tmp_cnt=0; tmp_cnt<relative_index; tmp_cnt++) {
>>> + root_node = (orte_node_t*) opal_list_get_

[OMPI devel] rankfile relative host claiming option patch

2009-06-25 Thread Lenny Verkhovsky
Hi,
Here is a proposed small patch that extends the current rankfile syntax to be
compliant with the orte_hosts syntax, making it possible to claim relative
hosts from the hostfile/scheduler by using "+n#" as the hostname, where 0 <= # < np.
For example:
cat ~/work/svn/hpc/dev/test/Rankfile/rankfile
rank 0=+n0 slot=0
rank 1=+n0 slot=1
rank 2=+n1 slot=2
rank 3=+n1 slot=1
This is for your review and blessing before I commit it to the trunk.
I would also like to add it to the 1.3 branch.
thanks,
Lenny.


Index: orte/mca/rmaps/rank_file/help-rmaps_rank_file.txt
===
--- orte/mca/rmaps/rank_file/help-rmaps_rank_file.txt (revision 21529)
+++ orte/mca/rmaps/rank_file/help-rmaps_rank_file.txt (working copy)
@@ -56,6 +56,9 @@
 Please review your rank-slot assignments and your host allocation to ensure
 a proper match.

+[bad-index]
+Rankfile claimed host %s by index that is bigger than number of allocated
hosts.
+
 [orte-rmaps-rf:alloc-error]
 There are not enough slots available in the system to satisfy the %d slots
 that were requested by the application:
Index: orte/mca/rmaps/rank_file/rmaps_rank_file_lex.h
===
--- orte/mca/rmaps/rank_file/rmaps_rank_file_lex.h (revision 21529)
+++ orte/mca/rmaps/rank_file/rmaps_rank_file_lex.h (working copy)
@@ -75,6 +75,7 @@
 #define ORTE_RANKFILE_NEWLINE 13
 #define ORTE_RANKFILE_IPV6 14
 #define ORTE_RANKFILE_SLOT 15
+#define ORTE_RANKFILE_RELATIVE 16

 #if defined(c_plusplus) || defined(__cplusplus)
 }
Index: orte/mca/rmaps/rank_file/rmaps_rank_file.c
===
--- orte/mca/rmaps/rank_file/rmaps_rank_file.c (revision 21529)
+++ orte/mca/rmaps/rank_file/rmaps_rank_file.c (working copy)
@@ -273,11 +273,11 @@
  orte_vpid_t total_procs;
  opal_list_t node_list;
  opal_list_item_t *item;
- orte_node_t *node, *nd;
+ orte_node_t *node, *nd, *root_node;
  orte_vpid_t rank, vpid_start;
  orte_std_cntr_t num_nodes, num_slots;
  orte_rmaps_rank_file_map_t *rfmap;
- orte_std_cntr_t slots_per_node;
+ orte_std_cntr_t slots_per_node, relative_index, tmp_cnt;
  int rc;

  /* convenience def */
@@ -411,7 +411,25 @@
  0 == strcmp(nd->name, rfmap->node_name)) {
  node = nd;
  break;
- }
+ } else if (NULL != rfmap->node_name &&
+ (('+' == rfmap->node_name[0]) &&
+ (('n' == rfmap->node_name[1]) ||
+ ('N' == rfmap->node_name[1])))) {
+
+ relative_index=atoi(strtok(rfmap->node_name,"+n"));
+ if ( relative_index >= opal_list_get_size (&node_list) || ( 0 > relative_index)){
+ orte_show_help("help-rmaps_rank_file.txt","bad-index", true,rfmap->node_name);
+ ORTE_ERROR_LOG(ORTE_ERR_BAD_PARAM);
+ return ORTE_ERR_BAD_PARAM;
+ }
+ root_node = (orte_node_t*) opal_list_get_first(&node_list);
+ for(tmp_cnt=0; tmp_cnt<relative_index; tmp_cnt++) {
+ root_node = (orte_node_t*) opal_list_get_next(root_node);
@@ -631,6 +649,7 @@
  case ORTE_RANKFILE_IPV6:
  case ORTE_RANKFILE_STRING:
  case ORTE_RANKFILE_INT:
+ case ORTE_RANKFILE_RELATIVE:
  if(ORTE_RANKFILE_INT == token) {
  sprintf(buff,"%d", orte_rmaps_rank_file_value.ival);
  value = buff;
Index: orte/mca/rmaps/rank_file/rmaps_rank_file_lex.l
===
--- orte/mca/rmaps/rank_file/rmaps_rank_file_lex.l (revision 21529)
+++ orte/mca/rmaps/rank_file/rmaps_rank_file_lex.l (working copy)
@@ -111,6 +111,9 @@
  orte_rmaps_rank_file_value.sval = yytext;
  return ORTE_RANKFILE_HOSTNAME; }

+\+n[0-9]+ { orte_rmaps_rank_file_value.sval = yytext;
+ return ORTE_RANKFILE_RELATIVE; }
+
 . { orte_rmaps_rank_file_value.sval = yytext;
  return ORTE_RANKFILE_ERROR; }


rankfile.patch
Description: Binary data
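
As a standalone illustration of the syntax the patch introduces (not the ORTE
code itself), the sketch below shows how a "+n#" token can be validated and
turned into an index into the allocated node list; the helper name and the
hard-coded count of allocated hosts are made up for the example.

#include <stdio.h>
#include <stdlib.h>

/* Return the relative index encoded in a "+n#" / "+N#" token, or -1 if the
 * token is not a relative spec or the index is out of range -- the
 * "bad-index" case handled by the patch above. */
static int resolve_relative_node(const char *token, int num_allocated_nodes)
{
    char *end = NULL;
    long idx;

    if (NULL == token || '+' != token[0] ||
        ('n' != token[1] && 'N' != token[1])) {
        return -1;
    }
    idx = strtol(token + 2, &end, 10);
    if (end == token + 2 || '\0' != *end ||
        idx < 0 || idx >= num_allocated_nodes) {
        return -1;
    }
    return (int)idx;
}

int main(void)
{
    const char *examples[] = { "+n0", "+N3", "+n9", "witch2" };
    int i;

    for (i = 0; i < 4; i++) {
        printf("%-8s -> %d\n", examples[i],
               resolve_relative_node(examples[i], 4 /* allocated hosts (example) */));
    }
    return 0;
}

With such a rankfile the job is still launched against a normal allocation,
e.g. something like mpirun -np 4 -hostfile hosts -rf rankfile ./a.out (the
hostfile name and the binary are placeholders).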


Re: [OMPI devel] why does --rankfile need hostlist?

2009-06-22 Thread Lenny Verkhovsky
I personally prefer the way it is now.
This way guarantees me total control over mapping and allocating slots.
When I am using the rankfile mapper, I know exactly what I am putting where;
the OS can easily oversubscribe my CPUs with processes not mapped by the
rankfile. I am also not sure how it will affect users that have schedulers.
I am also not sure that users who are used to working with a hostfile would
change their scripts to match the mapper.
Lenny.

On Mon, Jun 22, 2009 at 1:23 AM, Ralph Castain <r...@open-mpi.org> wrote:

> Had a chance to think about how this might be done, and looked at it for
> awhile after getting home. I -think- I found a way to do it, but there are a
> couple of caveats:
> 1. Len's point about oversubscribing without warning would definitely hold
> true - this would positively be a "user beware" option
>
> 2. there could be no RM-provided allocation, hostfile, or -host options
> specified. Basically, I would be adding the "read rankfile" option to the
> end of the current allocation determination procedure
>
> I would still allow more procs than shown in the rankfile (mapping the rest
> bynode on the nodes specified in the rankfile - can't do byslot because I
> don't know how many slots are on each node), which means the only change in
> behavior would be the forced bynode mapping of unspecified procs.
>
> So use of this option will entail some risks and a slight difference in
> behavior, but would relieve you from the burden of having to provide a
> hostfile. I'm not personally convinced it is worth the risk and probable
> user complaints of "it didn't work", but since we don't use this option, I
> don't have a strong opinion on the matter.
>
> Let's just avoid going back-and-forth over wanting it, or how it should be
> implemented - let's get it all ironed out, and then implement it once, like
> we finally did at the end with the whole hostfile thing.
>
> Let me know if you want me to do this - it obviously isn't at the top of my
> priority list, but still could be done in the next few weeks.
>
> Ralph
>
>
> On Jun 21, 2009, at 9:00 AM, Lenny Verkhovsky wrote:
>
> Sorry for the delay in response,
> I totally agree with Ralph that it's not as easy as it seems,
> 1. rankfile mapper uses already allocated machines ( by scheduler or
> hostfile ), by using rankfile as a hostfile we can run into problem where
> trying to use unallocated nodes, what can hang the run.
> 2. we can't define in rankfile number of slots on each machine, which means
> oversubscribing can take place without any warning.
> 3. I personally dont see any problem using hostfile, even if it has
> redundant info, hostfile and rankfile belong to different layers in the
> system and solve different problems. The original hostfile ( if I recall
> correctly ) could bind rank to the node, but the syntax wasn't very flexible
> and clear.
> Lenny.
>
> On Sun, Jun 21, 2009 at 5:15 PM, Ralph Castain <r...@open-mpi.org> wrote:
>
>> Let me suggest a two-step process, then:
>> 1. let's change the error message as this is easily done and thus can be
>> done now
>>
>> 2. I can look at how to eat the rankfile as a hostfile. This may not even
>> be possible - the problem is that the entire system is predicated on certain
>> ordering due to our framework architecture. So we get an allocation, and
>> then do a mapping against that allocation, filtering the allocation through
>> hostfiles, -host, and other options.
>>
>> By the time we reach the rankfile mapper, we have already determined that
>> we don't have an allocation and have to abort. It is the rankfile mapper
>> itself that looks for the -rankfile option, so the system can have no
>> knowledge that someone has specified that option before that point - and
>> thus, even if I could parse the rankfile, I don't know it was given!
>>
>> What will take time is to figure out a way to either:
>>
>> (a) allow us to run the mapper even though we don't have any nodes we know
>> about, and allow the mapper to insert the nodes itself - without causing
>> non-rankfile uses to break (which could be a major feat); or
>>
>> (b) have the overall system check for the rankfile option and pass it as a
>> hostfile as well, assuming that a hostfile wasn't also given, no RM-based
>> allocation exists, etc. - which breaks our abstraction rules and also opens
>> a possible can of worms.
>>
>> Either way, I also then have to teach the hostfile parser how to realize
>> it is a rankfile format and convert the info in it into what we expected to
>> receive from a hostfile - another non-trivial problem.
>>
>> I'm willing to give 

Re: [OMPI devel] why does --rankfile need hostlist?

2009-06-21 Thread Lenny Verkhovsky
Sorry for the delay in responding.
I totally agree with Ralph that it's not as easy as it seems:
1. The rankfile mapper uses machines that are already allocated (by a scheduler
or hostfile); by using the rankfile as a hostfile we can run into a situation
where we try to use unallocated nodes, which can hang the run.
2. We can't define the number of slots on each machine in the rankfile, which
means oversubscription can take place without any warning.
3. I personally don't see any problem with using a hostfile, even if it has
redundant info; hostfile and rankfile belong to different layers in the
system and solve different problems. The original hostfile (if I recall
correctly) could bind a rank to a node, but the syntax wasn't very flexible
or clear.
Lenny.

On Sun, Jun 21, 2009 at 5:15 PM, Ralph Castain  wrote:

> Let me suggest a two-step process, then:
> 1. let's change the error message as this is easily done and thus can be
> done now
>
> 2. I can look at how to eat the rankfile as a hostfile. This may not even
> be possible - the problem is that the entire system is predicated on certain
> ordering due to our framework architecture. So we get an allocation, and
> then do a mapping against that allocation, filtering the allocation through
> hostfiles, -host, and other options.
>
> By the time we reach the rankfile mapper, we have already determined that
> we don't have an allocation and have to abort. It is the rankfile mapper
> itself that looks for the -rankfile option, so the system can have no
> knowledge that someone has specified that option before that point - and
> thus, even if I could parse the rankfile, I don't know it was given!
>
> What will take time is to figure out a way to either:
>
> (a) allow us to run the mapper even though we don't have any nodes we know
> about, and allow the mapper to insert the nodes itself - without causing
> non-rankfile uses to break (which could be a major feat); or
>
> (b) have the overall system check for the rankfile option and pass it as a
> hostfile as well, assuming that a hostfile wasn't also given, no RM-based
> allocation exists, etc. - which breaks our abstraction rules and also opens
> a possible can of worms.
>
> Either way, I also then have to teach the hostfile parser how to realize it
> is a rankfile format and convert the info in it into what we expected to
> receive from a hostfile - another non-trivial problem.
>
> I'm willing to give it a try - just trying to make clear why my response
> was negative. It isn't as simple as it sounds...which is why Len and I
> didn't pursue it when this was originally developed.
>
> Ralph
>
>
> On Sun, Jun 21, 2009 at 5:28 AM, Terry Dontje wrote:
>
>> Being a part of these discussions I can understand your reticence to
>> reopen this discussion.  However, I think this is a major usability issue
>> with this feature which actually is fairly important in order to get things
>> to run performant. Which IMO is important.
>>
>> That being said I think there are one of two things that could be done to
>> mitigate the issue.
>>
>> 1.  To eliminate the element of surprise by changing mpirun to eat
>> rankfile without the hostfile.
>> 2.  To change the error message to something understandable by the user
>> such that they
>> know they might be missing the hostfile option.
>>
>> Again I understand this topic is frustrating and there are some boundaries
>> with the design that make these two option orthogonal to each other but I
>> really believe we need to make the rankfile option something that is easily
>> usable by our users.
>>
>>
>> --td
>>
>> Ralph Castain wrote:
>>
>>> Having gone around in circles on hostfile-related issues for over five
>>> years now, I honestly have little motivation to re-open the entire
>>> discussion again. It doesn't seem to be that daunting a requirement for
>>> those who are using it, so I'm inclined to just leave well enough alone.
>>>
>>> :-)
>>>
>>>
>>> On Fri, Jun 19, 2009 at 2:21 PM, Eugene Loh <eugene@sun.com> wrote:
>>>
>>>Ralph Castain wrote:
>>>
>>>> The two files have a slightly different format
>>>
>>>Agreed.
>>>
>>>> and completely different meaning.
>>>
>>>Somewhat agreed.  They're both related to mapping processes onto a
>>>cluster.
>>>
>>>> The hostfile specifies how many slots are on a node. The rankfile
>>>> specifies a rank and what node/slot it is to be mapped onto.
>>>
>>>Agreed.
>>>
>>>> Rankfiles can use relative node indexing and refer to nodes
>>>> received from a resource manager - i.e., without any hostfile.

>>>This is the main part I'm concerned about.  E.g.,
>>>
>>>% cat rankfile
>>>rank 0=node0 slot=0
>>>rank 1=node1 slot=0
>>>% mpirun -np 2 -rf rankfile ./a.out
>>>
>>>  --
>>>Rankfile claimed host node1 that was not allocated or
>>>oversubscribed it's slots:
>>>
>>>
>>>  
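
For reference, the failing example above is exactly the case where the
rankfile names hosts that were never allocated; with the current behavior the
same rankfile works once those hosts come in through an allocation such as a
hostfile (host names and slot counts below are placeholders):

% cat hostfile
node0 slots=2
node1 slots=2
% mpirun -np 2 -hostfile hostfile -rf rankfile ./a.out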

Re: [OMPI devel] Mtt Fails

2009-04-22 Thread Lenny Verkhovsky
As I understood, it will be fixed in 1.3.2.

thanks, Ralph.

On Wed, Apr 22, 2009 at 4:15 PM, Ralph Castain <r...@open-mpi.org> wrote:

> If you look at the code in that test, it has a --openmpi option you are
> supposed to set so that it runs properly for OMPI. Not sure if that's the
> problem here or not.
>
> Did this used to run?
>
> Note also that the test has a hardcoded version of 2.0 in it. I'm not sure
> if that could also be part of the problem.
>
>
>
>   On Wed, Apr 22, 2009 at 6:04 AM, Lenny Verkhovsky <
> lenny.verkhov...@gmail.com> wrote:
>
>>   Hi all,
>>
>> I have MTT failures complaining about MPI2, but before I am opening a
>> ticket, pls, have a look.
>>
>> $/hpc/home/USERS/mtt/mtt-scratch/20090421220402_moo1_17859/installs/oma-nightly-1.3--gcc--1.3r404/install/bin/mpirun
>> --host moo1,moo1,moo2,moo2,moo3,moo3,moo4,moo4 -np 8 --mca
>> btl_openib_use_eager_rdma 1 --mca btl self,sm,openib
>> /hpc/home/USERS/mtt/mtt-scratch/20090421220402_moo1_17859/installs/ogHK/tests/mpicxx/cxx-test-suite/src/mpi2c++_dynamics_test
>>
>> MPI-2 C++ bindings MPI-2 dynamics test suite
>> --
>> Open MPI Version 2.0
>>
>> *** There are delays built into some of the tests
>> *** Please let them complete
>> *** No test should take more than 10 seconds
>>
>> Test suite running on 8 nodes
>>
>> * MPI-2 Dynamics...
>>   - Looking for "connect" program... PASS
>>   - MPI::Get_version... FAIL
>>
>> MPI2C++ test suite: NODE 0 - 2) ERROR in MPI::Get_version should be 2.1
>> MPI2C++ test suite: all ranks failed
>> MPI2C++ test suite: minor error
>> MPI2C++ test suite: attempting to finalize...
>> MPI2C++ test suite: terminated
>>
>>
>> ___
>> devel mailing list
>> de...@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>>
>
>
> ___
> devel mailing list
> de...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>


[OMPI devel] Mtt Fails

2009-04-22 Thread Lenny Verkhovsky
Hi all,

I have MTT failures complaining about MPI2, but before I am opening a
ticket, pls, have a look.

$/hpc/home/USERS/mtt/mtt-scratch/20090421220402_moo1_17859/installs/oma-nightly-1.3--gcc--1.3r404/install/bin/mpirun
--host moo1,moo1,moo2,moo2,moo3,moo3,moo4,moo4 -np 8 --mca
btl_openib_use_eager_rdma 1 --mca btl self,sm,openib
/hpc/home/USERS/mtt/mtt-scratch/20090421220402_moo1_17859/installs/ogHK/tests/mpicxx/cxx-test-suite/src/mpi2c++_dynamics_test

MPI-2 C++ bindings MPI-2 dynamics test suite
--
Open MPI Version 2.0

*** There are delays built into some of the tests
*** Please let them complete
*** No test should take more than 10 seconds

Test suite running on 8 nodes

* MPI-2 Dynamics...
  - Looking for "connect" program... PASS
  - MPI::Get_version... FAIL

MPI2C++ test suite: NODE 0 - 2) ERROR in MPI::Get_version should be 2.1
MPI2C++ test suite: all ranks failed
MPI2C++ test suite: minor error
MPI2C++ test suite: attempting to finalize...
MPI2C++ test suite: terminated
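
For reference, the value the failing check looks at is the one returned by
MPI::Get_version, which wraps MPI_Get_version; the suite's banner prints a
hardcoded "Version 2.0" (as Ralph notes), while the failing check expects the
returned value to be 2.1. A minimal standalone check of the same value
(illustrative program, not part of the suite) looks roughly like this:

#include <stdio.h>
#include <mpi.h>

int main(int argc, char **argv)
{
    int major, minor;

    MPI_Init(&argc, &argv);
    MPI_Get_version(&major, &minor);
    /* The test suite above expects 2.1 here. */
    printf("MPI_Get_version reports %d.%d\n", major, minor);
    MPI_Finalize();
    return 0;
}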


Re: [OMPI devel] trac 1857: SM btl hangs when msg >=4k, Performance degradation ???

2009-04-12 Thread Lenny Verkhovsky
Sorry, guys, I tested it on the 1.3 branch; the trunk version (1.4a1r20980)
seems to be fixed.

BUT,

the default value of mpool_sm_min_size in 1.4a1r20980 is 67108864

when I set it to 0, there is a performance degradation; is that OK?

$LD_LIBRARY_PATH=~/work/svn/ompi/trunk/build_x86-64/install/lib/
install/bin/mpirun -np 2 -mca btl sm,self -mca mpool_sm_min_size 0
~/work/svn/hpc/tools/benchmarks/OMB-3.1.1/osu_bw
# OSU MPI Bandwidth Test v3.1.1
# Size Bandwidth (MB/s)
1 1.20
2 3.39
4 6.93
8 14.09
16 27.80
32 50.58
64 101.08
128 173.23
256 257.81
512 436.86
1024 674.51
2048 856.80
4096 573.87
8192 607.55
16384 660.58
32768 685.23
65536 687.45
131072 690.52
262144 687.48
524288 676.77
1048576 675.74
2097152 676.89
4194304 677.28
lennyb@dellix7 ~/work/svn/ompi/trunk/build_x86-64
$LD_LIBRARY_PATH=~/work/svn/ompi/trunk/build_x86-64/install/lib/
install/bin/mpirun -np 2 -mca btl sm,self
~/work/svn/hpc/tools/benchmarks/OMB-3.1.1/osu_bw
# OSU MPI Bandwidth Test v3.1.1
# Size Bandwidth (MB/s)
1 1.72
2 3.70
4 7.43
8 13.45
16 29.83
32 52.66
64 105.08
128 181.16
256 288.16
512 426.83
1024 690.21
2048 867.00
4096 567.53
8192 667.35
16384 806.97
32768 892.95
65536 989.62
131072 1009.25
262144 1018.35
524288 1037.32
1048576 1048.75
2097152 1057.51
4194304 1062.16

Lenny.

On 4/12/09, Lenny Verkhovsky <lenny.verkhov...@gmail.com> wrote:
>
> r20980 It still get stacked
>
> LD_LIBRARY_PATH=~/work/svn/hpc/dev/ompi_1_3_trunk/build_x86-64/install/lib/
> ~/work/svn/hpc/dev/ompi_1_3_trunk/build_x86-64/install/bin/mpirun -np 2 -mca
> btl self,sm ./osu_bw
>
> # OSU MPI Bandwidth Test v3.1.1
> # Size Bandwidth (MB/s)
> 1 1.46
> 2 3.66
> 4 7.29
> 8 14.64
> 16 29.44
> 32 56.94
> 64 112.25
> 128 189.02
> 256 278.26
> 512 448.58
> 1024 686.25
> 2048 865.27
>
>
>
> On 4/8/09, Jeff Squyres <jsquy...@cisco.com> wrote:
>>
>> Ditto -- works for me too.  Huzzah!
>>
>>
>> On Apr 7, 2009, at 8:39 PM, Eugene Loh wrote:
>>
>>  George Bosilca wrote:
>>>
>>> > This is interesting. I cannot trigger this deadlock on my AMD cluster
>>> > even when I set the sm_min_size to zero. However, on a Intel cluster
>>> > this can be triggered pretty easily.
>>> >
>>> > Anyway, I think I finally understood where the problem is coming
>>> > from.  r20952 and r20953 are commits that, in addition to the ones
>>> > from  yesterday, fix the problem. Please read the log on r20953 to see
>>> > how  this happens.
>>> >
>>> > Of course, please stress it before we move it to the 1.3 branch.
>>>
>>> Okay, this fix works for me.
>>> ___
>>> devel mailing list
>>> de...@open-mpi.org
>>> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>>>
>>
>>
>> --
>> Jeff Squyres
>> Cisco Systems
>>
>> ___
>> devel mailing list
>> de...@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>>
>
>


Re: [OMPI devel] trac 1857: SM btl hangs when msg >=4k

2009-04-12 Thread Lenny Verkhovsky
r20980: it still gets stuck.

LD_LIBRARY_PATH=~/work/svn/hpc/dev/ompi_1_3_trunk/build_x86-64/install/lib/
~/work/svn/hpc/dev/ompi_1_3_trunk/build_x86-64/install/bin/mpirun -np 2 -mca
btl self,sm ./osu_bw

# OSU MPI Bandwidth Test v3.1.1
# Size Bandwidth (MB/s)
1 1.46
2 3.66
4 7.29
8 14.64
16 29.44
32 56.94
64 112.25
128 189.02
256 278.26
512 448.58
1024 686.25
2048 865.27



On 4/8/09, Jeff Squyres  wrote:
>
> Ditto -- works for me too.  Huzzah!
>
>
> On Apr 7, 2009, at 8:39 PM, Eugene Loh wrote:
>
>  George Bosilca wrote:
>>
>> > This is interesting. I cannot trigger this deadlock on my AMD cluster
>> > even when I set the sm_min_size to zero. However, on a Intel cluster
>> > this can be triggered pretty easily.
>> >
>> > Anyway, I think I finally understood where the problem is coming
>> > from.  r20952 and r20953 are commits that, in addition to the ones
>> > from  yesterday, fix the problem. Please read the log on r20953 to see
>> > how  this happens.
>> >
>> > Of course, please stress it before we move it to the 1.3 branch.
>>
>> Okay, this fix works for me.
>> ___
>> devel mailing list
>> de...@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>>
>
>
> --
> Jeff Squyres
> Cisco Systems
>
> ___
> devel mailing list
> de...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>


Re: [OMPI devel] trac 1857: SM btl hangs when msg >=4k

2009-04-07 Thread Lenny Verkhovsky
r20948 still hangs, changing mpool_sm_min_size solves it.

Lenny.

On Tue, Apr 7, 2009 at 3:42 AM, Eugene Loh  wrote:

> George Bosilca wrote:
>
> You're right, the sentence was messed-up. My intent was to say that I
>>  found the problem, made a fix and once this fix applied to the trunk I  was
>> not able to reproduce the deadlock.
>>
>
> But you were able to reproduce the deadlock before you made the fix?
>
> Anyhow, if I get fresh bits (through r20947) and I back out r20944 (either
> in the source code or simply by setting the mpool_sm_min_size MCA parameter
> to 0), I get deadlock.
>
> Based on your description of the bug I forced osu_bw to send 1024 non-
>> blocking sends (instead of the default 64), and I still don't get the
>>  deadlock. I'm trilled ...
>>
>
> Yes, that's a good test.  You're sure you had mpool_sm_min_size set to 0?
>  I just don't have the same luck you do.  I get the hang even with your
> fixes.
>
>
> On Apr 6, 2009, at 19:56 , Eugene Loh wrote:
>>
>> George Bosilca wrote:
>>>
>>> I got some free time (yeh haw) and took a look at the OB1 PML in  order
  to fix the issue. I think I found the problem, as I'm unable  to reproduce
 this error.

>>>
>>> Sorry, this sentence has me baffled.  Are you unable to reproduce  the
>>> problem before the fixes or afterwards?  The first step is to  reproduce the
>>> problem, right?  To do so:
>>>
>>> A) Back out r20944.  Easy way to do that is just
>>>
>>>  % setenv OMPI_MCA_mpool_sm_min_size 0
>>>
>>> B)  Check that osu_bw.c hangs when using sm and you reach rendezvous
>>>  message size.
>>>
>>> C)  Introduce your changes and make sure that osu_bw.c runs to
>>>  completion.
>>>
>>> Can you please give it a try with 20946 and  20947 but without 20944?

>>>
>>> osu_bw.c hangs for me.  The PML fix did not seem to work.
>>>
>>
>> ___
> devel mailing list
> de...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>


Re: [OMPI devel] trac 1857: SM btl hangs when msg >=4k

2009-04-06 Thread Lenny Verkhovsky
Hi,

Changing the default value is an easy fix. This fix will not add new possible
bugs/deadlocks/paths where no one has gone before at the PML level.
This fix can be added to Open MPI 1.3, which is currently blocked due to the OSU
failure.

The PML fix can be done later (IMHO).

Lenny.

On Sat, Apr 4, 2009 at 1:46 AM, Eugene Loh  wrote:

> What's next on this ticket?  It's supposed to be a blocker.  Again, the
> issue is that osu_bw deluges a receiver with rendezvous messages, but the
> receiver does not have enough eager frags to acknowledge them all.  We see
> this now that the sizing of the mmap file has changed and there's less
> headroom to grow the free lists.  Possible fixes are:
>
> A) Just make the mmap file default size larger (though less overkill than
> we used to have).
> B) Fix the PML code that is supposed to deal with cases like this.  (At
> least I think the PML has code that's intended for this purpose.)
>
>
> Eugene Loh wrote:
>
> In osu_bw, process 0 pumps lots of Isend's to process 1, and process 1 in
>> turn sets up lots of matching Irecvs.  Many messages are in flight.  The
>> question is what happens when resources are exhausted and OMPI cannot handle
>> so much in-flight traffic.  Let's specifically consider the case of long,
>> rendezvous messages.  There are at least two situations.
>>
>> 1) When the sender no longer has any fragments (nor can grow its free list
>> any more), it queues a send up with add_request_to_send_pending() and
>> somehow life is good.  The PML seems to handle this case "correctly".
>>
>> 2) When the receiver -- specifically
>> mca_pml_ob1_recv_request_ack_send_btl() -- no longer has any fragments to
>> send ACKs back to confirm readiness for rendezvous, the resource-exhaustion
>> signal travels up the call stack to mca_pml_ob1_recv_request_ack_send(), who
>> does a MCA_PML_OB1_ADD_ACK_TO_PENDING().  In short, the PML adds the ACK to
>> pckt_pending.  Somehow, this code path doesn't work.
>>
>> The reason we see the problem now is that I added "autosizing" of the
>> shared-memory area.  We used to mmap *WAY* too much shared-memory for
>> small-np jobs.  (Yes, that's a subjective statement.)  Meanwhile, at
>> large-np, we didn't mmap enough and jobs wouldn't start.  (Objective
>> statement there.)  So, I added heuristics to size the shared area
>> "appropriately".  The heuristics basically targetted the needs of
>> MPI_Init().  If you want fragment free lists to grow on demand after
>> MPI_Init(), you now basically have to bump mpool_sm_min_size up explicitly.
>>
>> I'd like feedback on a fix.  Here are two options:
>>
>> A) Someone (could be I) increases the default resources.  E.g., we could
>> start with a larger eager free list.  Or, I could change those "heuristics"
>> to allow some amount of headroom for free lists to grow on demand.  Either
>> way, I'd appreciate feedback on how big to set these things.
>>
>> B) Someone (not I, since I don't know how) fixes the ob1 PML to handle
>> scenario 2 above correctly.
>>
>
> ___
> devel mailing list
> de...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>


Re: [OMPI devel] Infinite Loop: ompi_free_list_wait

2009-03-26 Thread Lenny Verkhovsky
What error are you getting from the compilation failure?

Lenny.

On 3/23/09, Timothy Hayes  wrote:
>
> That's a relief to know, although I'm still a bit concerned. I'm looking at
> the code for the OpenMPI 1.3 trunk and in the ob1 component I can see the
> following sequence:
>
> mca_pml_ob1_recv_frag_callback_match -> append_frag_to_list ->
> MCA_PML_OB1_RECV_FRAG_ALLOC -> OMPI_FREE_LIST_WAIT -> __ompi_free_list_wait
>
> so I'm guessing unless the deadlock issue has been resolved for that
> function, it will still fail non deterministically. I'm quite eager to give
> it a try, but my component doesn't compile as is with the 1.3 source. Is it
> trivial to convert it?
>
> Or maybe you were suggesting that I go into the code of ob1 myself and
> manually change every _wait to _get?
>
> Kind regards
> Tim
>
> 2009/3/23 George Bosilca 
>
>> It is a known problem. When the freelist is empty going in the
>> ompi_free_list_wait will block the process until at least one fragment
>> became available. As a fragment can became available only when returned by
>> the BTL, this can lead to deadlocks in some cases. The workaround is to ban
>> the usage of the blocking _wait function, and replace it with the
>> non-blocking version _get. The PML has all the required logic to deal with
>> the cases where a fragment cannot be allocated. We changed most of the BTLs
>> to use _get instead of _wait few months ago.
>>
>>  Thanks,
>>george.
>>
>> On Mar 23, 2009, at 11:58 , Timothy Hayes wrote:
>>
>>  Hello,
>>>
>>> I'm working on an OpenMPI BTL component and am having a recurring
>>> problem, I was wondering if anyone could shed some light on it. I have a
>>> component that's quite straight forward, it uses a pair of lightweight
>>> sockets to take advantage of being in a virtualised environment
>>> (specifically Xen). My code is a bit messy and has lots of inefficiencies,
>>> but the logic seems sound enough. I've been able to execute a few simple
>>> programs successfully using the component, and they work most of the time.
>>>
>>> The problem I'm having is actually happening in higher layers,
>>> specifically in my asynchronous receive handler, when I call the callback
>>> function (cbfunc) that was set by the PML in the BTL initialisation phase.
>>> It seems to be getting stuck in an infinite loop at __ompi_free_list_wait(),
>>> in this function there is a condition variable which should get set
>>> eventually but just doesn't. I've stepped through it with GDB and I get a
>>> backtrace of something like this:
>>>
>>> mca_btl_xen_endpoint_recv_handler -> mca_btl_xen_endpoint_start_recv ->
>>> mca_pml_ob1_recv_frag_callback -> mca_pml_ob1_recv_frag_match ->
>>> __ompi_free_list_wait -> opal_condition_wait
>>>
>>> and from there it just loops. Although this is happening in higher
>>> levels, I haven't noticed something like this happening in any of the other
>>> BTL components so chances are there's something in my code that's causing
>>> this. I very much doubt that it's actually waiting for a list item to be
>>> returned since this infinite loop can occur non deterministically and
>>> sometimes even on the first receive callback.
>>>
>>> I'm really not too sure what else to include with this e-mail. I could
>>> send my source code (a bit nasty right now) if it would be helpful, but I'm
>>> hoping that someone might have noticed this problem before or something
>>> similar. Maybe I'm making a common mistake. Any advice would be really
>>> appreciated!
>>>
>>> I'm using OpenMPI 1.2.9 from the SVN tag repository.
>>>
>>> Kind regards
>>> Tim Hayes
>>> ___
>>> devel mailing list
>>> de...@open-mpi.org
>>> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>>>
>>
>> ___
>> devel mailing list
>> de...@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>>
>>
>
> ___
> devel mailing list
> de...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>
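
A rough, self-contained mock of the pattern George describes above (this is
not the OMPI free-list API; the names and structure are invented for
illustration): with a single-threaded progress path, blocking until a fragment
appears cannot work when only that same path can return fragments, while the
non-blocking variant lets the caller defer the work and retry on a later
progress call.

#include <stdio.h>

/* Mock "free list": pretend it is currently empty. */
static int fragments_available = 0;

/* Non-blocking get: take a fragment if one is there, otherwise fail. */
static int freelist_get(void)
{
    if (fragments_available > 0) {
        fragments_available--;
        return 0;
    }
    return -1;
}

int main(void)
{
    /* A blocking style would spin here:
     *     while (freelist_get() < 0) { }
     * With a single-threaded progress engine nothing can return a fragment
     * while we spin, so the loop never exits -- the hang reported above.
     *
     * Non-blocking style: try once, and on failure queue the match/ACK so
     * normal progress can return fragments before we retry. */
    if (freelist_get() < 0) {
        printf("no fragment available; deferring and retrying later\n");
    } else {
        printf("got a fragment\n");
    }
    return 0;
}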


Re: [OMPI devel] Infinite Loop: ompi_free_list_wait

2009-03-23 Thread Lenny Verkhovsky
Did you try it with the Open MPI 1.3.1 version?

There have been a few changes and bug fixes (for example r20591, a fix in the
ob1 PML).

Lenny.

2009/3/23 Timothy Hayes 

> Hello,
>
> I'm working on an OpenMPI BTL component and am having a recurring problem,
> I was wondering if anyone could shed some light on it. I have a component
> that's quite straight forward, it uses a pair of lightweight sockets to take
> advantage of being in a virtualised environment (specifically Xen). My code
> is a bit messy and has lots of inefficiencies, but the logic seems sound
> enough. I've been able to execute a few simple programs successfully using
> the component, and they work most of the time.
>
> The problem I'm having is actually happening in higher layers, specifically
> in my asynchronous receive handler, when I call the callback function
> (cbfunc) that was set by the PML in the BTL initialisation phase. It seems
> to be getting stuck in an infinite loop at __ompi_free_list_wait(), in this
> function there is a condition variable which should get set eventually but
> just doesn't. I've stepped through it with GDB and I get a backtrace of
> something like this:
>
> mca_btl_xen_endpoint_recv_handler -> mca_btl_xen_endpoint_start_recv ->
> mca_pml_ob1_recv_frag_callback -> mca_pml_ob1_recv_frag_match ->
> __ompi_free_list_wait -> opal_condition_wait
>
> and from there it just loops. Although this is happening in higher levels,
> I haven't noticed something like this happening in any of the other BTL
> components so chances are there's something in my code that's causing this.
> I very much doubt that it's actually waiting for a list item to be returned
> since this infinite loop can occur non deterministically and sometimes even
> on the first receive callback.
>
> I'm really not too sure what else to include with this e-mail. I could send
> my source code (a bit nasty right now) if it would be helpful, but I'm
> hoping that someone might have noticed this problem before or something
> similar. Maybe I'm making a common mistake. Any advice would be really
> appreciated!
>
> I'm using OpenMPI 1.2.9 from the SVN tag repository.
>
> Kind regards
> Tim Hayes
>
> ___
> devel mailing list
> de...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>


Re: [OMPI devel] PML/ob1 problem

2009-03-03 Thread Lenny Verkhovsky
sorry, missed this commit.
Thanks, George,

On 3/3/09, George Bosilca <bosi...@eecs.utk.edu> wrote:
> Which solution seems to be working ?
>
>  This bug was fixed a while ago in the trunk
> (https://svn.open-mpi.org/trac/ompi/changeset/20591) and in
> the 1.3 branch. It even made it in the 1.3.2.
>
>   george.
>
>
>  On Mar 3, 2009, at 05:01 , Lenny Verkhovsky wrote:
>
>
> > Seems to be working.
> > George, can you commit it, pls.
> >
> > Thanks
> > Lenny.
> >
> >
> > On Thu, Feb 19, 2009 at 3:05 PM, Jeff Squyres <jsquy...@cisco.com> wrote:
> >
> > > George -- any thoughts on this one?
> > >
> > > On Feb 11, 2009, at 1:01 AM, Mike Dubman wrote:
> > >
> > >
> > > >
> > > > Hello guys,
> > > >
> > > > I'm running some experimental tcp btl which implements rdma GET method
> and
> > > > advertises it in its flags of the btl API.
> > > > The btl`s send() method returns rc=1 to select fast path for PML.
> (this
> > > > optimization was added in revision 18551 in v1.3)
> > > >
> > > > It seems that in PML/ob1,
> mca_pml_ob1_send_request_start_rdma() function
> > > > does not treat right such combination (btl GET + fastpath rc>0) and
> going
> > > > into deadlock, i.e.
> > > >
> > > > +++ pml_ob1_sendreq.c +670
> > > > At this line, sendreq->req_state is 0
> > > >
> > > > +++ pml_ob1_sendreq.c +800
> > > > At this line, if btl has GET method and btl`s send() returned fastpath
> > > > hint - the call to
> mca_pml_ob1_rndv_completion_request() will decrement
> > > > sendreq->req_state by one, leaving it to -1.
> > > >
> > > > This value of -1 will keep
> send_request_pml_complete_check() from
> > > > completing request on PML level.
> > > >
> > > > The PML logic (in
> mca_pml_ob1_send_request_start_rdma) for PUT operation
> > > > initializes req_state to "2" in pml_ob1_sendreq.c +791, but leaves
> req_state
> > > > to 0 for GET operations.
> > > >
> > > > Please suggest.
> > > >
> > > > Thanks
> > > >
> > > > Mike.
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > > > ___
> > > > devel mailing list
> > > > de...@open-mpi.org
> > > > http://www.open-mpi.org/mailman/listinfo.cgi/devel
> > > >
> > >
> > >
> > > --
> > > Jeff Squyres
> > > Cisco Systems
> > >
> > > ___
> > > devel mailing list
> > > de...@open-mpi.org
> > > http://www.open-mpi.org/mailman/listinfo.cgi/devel
> > >
> > >
> > ___
> > devel mailing list
> > de...@open-mpi.org
> > http://www.open-mpi.org/mailman/listinfo.cgi/devel
> >
>
>  ___
>  devel mailing list
>  de...@open-mpi.org
>  http://www.open-mpi.org/mailman/listinfo.cgi/devel
>


[OMPI devel] r20436 brakes

2009-02-05 Thread Lenny Verkhovsky
Hi,
I think this is the fix for the broken trunk;
it was submitted in r20439.



Index: orte/tools/orte-bootproxy/Makefile.am
===
--- orte/tools/orte-bootproxy/Makefile.am   (revision 20438)
+++ orte/tools/orte-bootproxy/Makefile.am   (working copy)
@@ -25,7 +25,7 @@

 install-exec-hook:
test -z "$(bindir)" || $(mkdir_p) "$(DESTDIR)$(bindir)"
-   cp $(top_builddir)/orte/tools/orte-bootproxy/orte-bootproxy.sh
$(DESTDIR)$(bindir)
+   cp $(top_srcdir)/orte/tools/orte-bootproxy/orte-bootproxy.sh
$(DESTDIR)$(bindir)
chmod 755 $(DESTDIR)$(bindir)/orte-bootproxy.sh

 endif # OMPI_INSTALL_BINARIES



Lenny.


Re: [OMPI devel] BTL/sm meeting on Wed after Forum

2009-01-28 Thread Lenny Verkhovsky
Any chance of a conference call? This is a very interesting issue, but
we can't attend in person :(

On Tue, Jan 27, 2009 at 10:38 PM, Jeff Squyres  wrote:
> On Jan 27, 2009, at 3:37 PM, Eugene Loh wrote:
>
>>> Jeff Squyres
>>> Rich Graham
>>> Brian Barrett
>>> George Bosilca
>>> Thomas Herault
>>> Terry Dontje
>>
>> Is this Feb 11?  How would it be if I attended?
>
> I'm a bozo -- I forgot to list you (even though Terry and I explicitly
> discussed this); sorry!  Yes, it would be great if you attended.
>
> --
> Jeff Squyres
> Cisco Systems
>
> ___
> devel mailing list
> de...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>


Re: [OMPI devel] OpenMPI rpm build 1.3rc3r20226 build failed

2009-01-13 Thread Lenny Verkhovsky
I don't want to move the change (the default value of the flag), since there
are important people for whom it works :)
I also think that this is a VT issue, but I guess we are the only ones
who experience the errors.

We can now override these params from the environment as a workaround:
Mike committed the buildrpm.sh script to the trunk in r20253, which allows
overriding params from the environment.

We observed the problem on CentOS 5.2 with the bundled gcc and on RedHat 5.2
with the bundled gcc.

#uname -a
Linux elfit1 2.6.18-92.el5 #1 SMP Tue Jun 10 18:51:06 EDT 2008 x86_64
x86_64 x86_64 GNU/Linux

#lsb_release -a
LSB Version:
:core-3.1-amd64:core-3.1-ia32:core-3.1-noarch:graphics-3.1-amd64:graphics-3.1-ia32:graphics-3.1-noarch
Distributor ID: CentOS
Description:CentOS release 5.2 (Final)
Release:5.2
Codename:   Final

gcc version 4.1.2 20071124 (Red Hat 4.1.2-42)

Best regards,
Lenny.


On Tue, Jan 13, 2009 at 4:40 PM, Jeff Squyres <jsquy...@cisco.com> wrote:
> I'm still guessing that this is a distro / compiler issue -- I can build
> with the default flags just fine...?
>
> Can you specify what distro / compiler you were using?
>
> Also, if you want to move the changes that have been made to buildrpm.sh to
> the v1.3 branch, just file a CMR.  That file is not included in release
> tarballs, so Tim can move it over at any time.
>
>
>
> On Jan 13, 2009, at 6:35 AM, Lenny Verkhovsky wrote:
>
>> it seems that setting use_default_rpm_opt_flags to 0 solves the problem.
>> Maybe vt developers should take a look on it.
>>
>> Lenny.
>>
>>
>> On Sun, Jan 11, 2009 at 2:40 PM, Jeff Squyres <jsquy...@cisco.com> wrote:
>>>
>>> This sounds like a distro/compiler version issue.
>>>
>>> Can you narrow down the issue at all?
>>>
>>>
>>> On Jan 11, 2009, at 3:23 AM, Lenny Verkhovsky wrote:
>>>
>>>> it doesnt happen if I do autogen, configure and make install,
>>>> only when I try to make an rpm from the tar file.
>>>>
>>>>
>>>>
>>>> On Thu, Jan 8, 2009 at 9:43 PM, Jeff Squyres <jsquy...@cisco.com> wrote:
>>>>>
>>>>> This doesn't happen in a normal build of the same tree?
>>>>>
>>>>> I ask because both 1.3r20226 builds fine for me manually (i.e.,
>>>>> ./configure;make and buildrpm.sh).
>>>>>
>>>>>
>>>>> On Jan 8, 2009, at 8:15 AM, Lenny Verkhovsky wrote:
>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> I am trying to build rpm from nightly snaposhots of 1.3
>>>>>>
>>>>>> with the downloaded buildrpm.sh and ompi.spec file from
>>>>>> http://svn.open-mpi.org/svn/ompi/branches/v1.3/contrib/dist/linux/
>>>>>>
>>>>>> I am getting this error
>>>>>> .
>>>>>> Making all in vtlib
>>>>>> make[5]: Entering directory
>>>>>>
>>>>>> `/hpc/home/USERS/lennyb/work/svn/release/scripts/dist-1.3--1/OMPI/BUILD/
>>>>>> openmpi-1.3rc3r20226/ompi/contrib/vt/vt/vtlib'
>>>>>> gcc  -DHAVE_CONFIG_H -I. -I.. -I../tools/opari/lib
>>>>>> -I../extlib/otf/otflib -I../extlib/otf/otflib -D_GNU_SOURCE
>>>>>> -DBINDIR=\"/opt/openmpi/1.3rc3r20226-V00/gcc/bin\"
>>>>>> -DDATADIR=\"/opt/openmpi/1.3rc3r20226-V00/gcc/share\" -DRFG
>>>>>> -DVT_MEMHOOK -DVT_IOWRAP  -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2
>>>>>> -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m64
>>>>>> -mtune=generic -MT vt_comp_gnu.o -MD -MP -MF .deps/vt_comp_gnu.Tpo -c
>>>>>> -o
>>>>>> vt_comp_gnu.o vt_comp_gnu.c
>>>>>> gcc  -DHAVE_CONFIG_H -I. -I.. -I../tools/opari/lib
>>>>>> -I../extlib/otf/otflib -I../extlib/otf/otflib -D_GNU_SOURCE
>>>>>> -DBINDIR=\"/opt/openmpi/1.3rc3r20226-V00/gcc/bin\"
>>>>>> -DDATADIR=\"/opt/openmpi/1.3rc3r20226-V00/gcc/share\" -DRFG
>>>>>> -DVT_MEMHOOK -DVT_IOWRAP  -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2
>>>>>> -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m64
>>>>>> -mtune=generic -MT vt_memhook.o -MD -MP -MF .deps/vt_memhook.Tpo -c -o
>>>>>> vt_memhook.o vt_memhook.c
>>>>>> gcc  -DHAVE_CONFIG_H -I. -I.. -I../tools/opari/lib
>>>>>> -I../extlib/otf/otflib -I../extlib/otf/otflib -D_GNU_SOURCE
>>>>>> -DBINDIR=\"/opt/openm

Re: [OMPI devel] OpenMPI rpm build 1.3rc3r20226 build failed

2009-01-13 Thread Lenny Verkhovsky
It seems that setting use_default_rpm_opt_flags to 0 solves the problem.
Maybe the VT developers should take a look at it.

Lenny.


On Sun, Jan 11, 2009 at 2:40 PM, Jeff Squyres <jsquy...@cisco.com> wrote:
> This sounds like a distro/compiler version issue.
>
> Can you narrow down the issue at all?
>
>
> On Jan 11, 2009, at 3:23 AM, Lenny Verkhovsky wrote:
>
>> it doesnt happen if I do autogen, configure and make install,
>> only when I try to make an rpm from the tar file.
>>
>>
>>
>> On Thu, Jan 8, 2009 at 9:43 PM, Jeff Squyres <jsquy...@cisco.com> wrote:
>>>
>>> This doesn't happen in a normal build of the same tree?
>>>
>>> I ask because both 1.3r20226 builds fine for me manually (i.e.,
>>> ./configure;make and buildrpm.sh).
>>>
>>>
>>> On Jan 8, 2009, at 8:15 AM, Lenny Verkhovsky wrote:
>>>
>>>> Hi,
>>>>
>>>> I am trying to build rpm from nightly snaposhots of 1.3
>>>>
>>>> with the downloaded buildrpm.sh and ompi.spec file from
>>>> http://svn.open-mpi.org/svn/ompi/branches/v1.3/contrib/dist/linux/
>>>>
>>>> I am getting this error
>>>> .
>>>> Making all in vtlib
>>>> make[5]: Entering directory
>>>> `/hpc/home/USERS/lennyb/work/svn/release/scripts/dist-1.3--1/OMPI/BUILD/
>>>> openmpi-1.3rc3r20226/ompi/contrib/vt/vt/vtlib'
>>>> gcc  -DHAVE_CONFIG_H -I. -I.. -I../tools/opari/lib
>>>> -I../extlib/otf/otflib -I../extlib/otf/otflib -D_GNU_SOURCE
>>>> -DBINDIR=\"/opt/openmpi/1.3rc3r20226-V00/gcc/bin\"
>>>> -DDATADIR=\"/opt/openmpi/1.3rc3r20226-V00/gcc/share\" -DRFG
>>>> -DVT_MEMHOOK -DVT_IOWRAP  -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2
>>>> -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m64
>>>> -mtune=generic -MT vt_comp_gnu.o -MD -MP -MF .deps/vt_comp_gnu.Tpo -c -o
>>>> vt_comp_gnu.o vt_comp_gnu.c
>>>> gcc  -DHAVE_CONFIG_H -I. -I.. -I../tools/opari/lib
>>>> -I../extlib/otf/otflib -I../extlib/otf/otflib -D_GNU_SOURCE
>>>> -DBINDIR=\"/opt/openmpi/1.3rc3r20226-V00/gcc/bin\"
>>>> -DDATADIR=\"/opt/openmpi/1.3rc3r20226-V00/gcc/share\" -DRFG
>>>> -DVT_MEMHOOK -DVT_IOWRAP  -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2
>>>> -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m64
>>>> -mtune=generic -MT vt_memhook.o -MD -MP -MF .deps/vt_memhook.Tpo -c -o
>>>> vt_memhook.o vt_memhook.c
>>>> gcc  -DHAVE_CONFIG_H -I. -I.. -I../tools/opari/lib
>>>> -I../extlib/otf/otflib -I../extlib/otf/otflib -D_GNU_SOURCE
>>>> -DBINDIR=\"/opt/openmpi/1.3rc3r20226-V00/gcc/bin\"
>>>> -DDATADIR=\"/opt/openmpi/1.3rc3r20226-V00/gcc/share\" -DRFG
>>>> -DVT_MEMHOOK -DVT_IOWRAP  -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2
>>>> -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m64
>>>> -mtune=generic -MT vt_memreg.o -MD -MP -MF .deps/vt_memreg.Tpo -c -o
>>>> vt_memreg.o vt_memreg.c
>>>> gcc  -DHAVE_CONFIG_H -I. -I.. -I../tools/opari/lib
>>>> -I../extlib/otf/otflib -I../extlib/otf/otflib -D_GNU_SOURCE
>>>> -DBINDIR=\"/opt/openmpi/1.3rc3r20226-V00/gcc/bin\"
>>>> -DDATADIR=\"/opt/openmpi/1.3rc3r20226-V00/gcc/share\" -DRFG
>>>> -DVT_MEMHOOK -DVT_IOWRAP  -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2
>>>> -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m64
>>>> -mtune=generic -MT vt_iowrap.o -MD -MP -MF .deps/vt_iowrap.Tpo -c -o
>>>> vt_iowrap.o vt_iowrap.c
>>>> mv -f .deps/vt_memreg.Tpo .deps/vt_memreg.Po
>>>> gcc  -DHAVE_CONFIG_H -I. -I.. -I../tools/opari/lib
>>>> -I../extlib/otf/otflib -I../extlib/otf/otflib -D_GNU_SOURCE
>>>> -DBINDIR=\"/opt/openmpi/1.3rc3r20226-V00/gcc/bin\"
>>>> -DDATADIR=\"/opt/openmpi/1.3rc3r20226-V00/gcc/share\" -DRFG
>>>> -DVT_MEMHOOK -DVT_IOWRAP  -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2
>>>> -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m64
>>>> -mtune=generic -MT vt_iowrap_helper.o -MD -MP -MF
>>>> .deps/vt_iowrap_helper.Tpo -c -o vt_iowrap_helper.o vt_iowrap_helper.c
>>>> mv -f .deps/vt_memhook.Tpo .deps/vt_memhook.Po
>>>> gcc  -DHAVE_CONFIG_H -I. -I.. -I../tools/opari/lib
>>>> -I../extlib/otf/otflib -I../extlib/otf/otflib -D_GNU_SOURCE
>>>> -DBINDIR=\"/opt/openmpi/1.3rc3r20226-V00/gcc/bin\&

Re: [OMPI devel] size of shared-memory backing file + maffinity

2009-01-13 Thread Lenny Verkhovsky
Actually, the size is supposed to be the same.
It is just supposed to bind each process to its closest memory node, instead of
leaving that to the OS.

see:
mpool_sm_module.c:82:opal_maffinity_base_bind(, 1,
mpool_sm->mem_node);


Best regards
Lenny.

On Mon, Jan 12, 2009 at 10:02 PM, Eugene Loh  wrote:
> I'm trying to understand how much shared memory is allocated when maffinity
> is on.
>
> The sm BTL sets up a file that is mmapped into each local process's address
> space so that the processes on a node can communicate via shared memory.
>
> Actually, when maffinity indicates that there are multiple "memory nodes" on
> the node, then a separate file is set up and mmapped for each "memory node".
>
> There is an MCA parameter named "[mpool_sm_per_]peer_size", which by default
> is 32 Mbytes.  The idea is that there are n processes on the node, then the
> size of the file to be mmapped in is n*32M.
>
> But, if there are multiple "memory nodes", will there be that much more
> shared memory?  That is, is the total amount of shared memory that's mmapped
> into all the processes:
>
>  mem_nodes * num_local_procs * peer_size
>
> Or, should the file for a memory node be created with size proportional to
> the number of processes that correspond to that memory node?
> ___
> devel mailing list
> de...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>
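
Purely to make the question above concrete, here is the arithmetic for the two
possible sizings, with example numbers (2 memory nodes, 8 local processes, the
default 32 MB peer size); this illustrates the question, not the answer:

#include <stdio.h>

int main(void)
{
    const long long peer_size = 32LL * 1024 * 1024; /* default 32 MB per peer */
    const long long nprocs = 8;                     /* local processes (example) */
    const long long mem_nodes = 2;                  /* memory nodes (example) */

    /* Reading 1: one file per memory node, each sized nprocs * peer_size. */
    printf("per-memory-node files, each full size: %lld MB total\n",
           mem_nodes * nprocs * peer_size / (1024 * 1024));

    /* Reading 2: each file sized only for the procs on that memory node. */
    printf("files sized per local procs on the node: %lld MB total\n",
           nprocs * peer_size / (1024 * 1024));
    return 0;
}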


Re: [OMPI devel] OpenMPI rpm build 1.3rc3r20226 build failed

2009-01-11 Thread Lenny Verkhovsky
It doesn't happen if I do autogen, configure, and make install;
only when I try to make an RPM from the tar file.



On Thu, Jan 8, 2009 at 9:43 PM, Jeff Squyres <jsquy...@cisco.com> wrote:
> This doesn't happen in a normal build of the same tree?
>
> I ask because both 1.3r20226 builds fine for me manually (i.e.,
> ./configure;make and buildrpm.sh).
>
>
> On Jan 8, 2009, at 8:15 AM, Lenny Verkhovsky wrote:
>
>> Hi,
>>
>> I am trying to build rpm from nightly snaposhots of 1.3
>>
>> with the downloaded buildrpm.sh and ompi.spec file from
>> http://svn.open-mpi.org/svn/ompi/branches/v1.3/contrib/dist/linux/
>>
>> I am getting this error
>> .
>> Making all in vtlib
>> make[5]: Entering directory
>> `/hpc/home/USERS/lennyb/work/svn/release/scripts/dist-1.3--1/OMPI/BUILD/
>> openmpi-1.3rc3r20226/ompi/contrib/vt/vt/vtlib'
>> gcc  -DHAVE_CONFIG_H -I. -I.. -I../tools/opari/lib
>> -I../extlib/otf/otflib -I../extlib/otf/otflib -D_GNU_SOURCE
>> -DBINDIR=\"/opt/openmpi/1.3rc3r20226-V00/gcc/bin\"
>> -DDATADIR=\"/opt/openmpi/1.3rc3r20226-V00/gcc/share\" -DRFG
>> -DVT_MEMHOOK -DVT_IOWRAP  -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2
>> -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m64
>> -mtune=generic -MT vt_comp_gnu.o -MD -MP -MF .deps/vt_comp_gnu.Tpo -c -o
>> vt_comp_gnu.o vt_comp_gnu.c
>> gcc  -DHAVE_CONFIG_H -I. -I.. -I../tools/opari/lib
>> -I../extlib/otf/otflib -I../extlib/otf/otflib -D_GNU_SOURCE
>> -DBINDIR=\"/opt/openmpi/1.3rc3r20226-V00/gcc/bin\"
>> -DDATADIR=\"/opt/openmpi/1.3rc3r20226-V00/gcc/share\" -DRFG
>> -DVT_MEMHOOK -DVT_IOWRAP  -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2
>> -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m64
>> -mtune=generic -MT vt_memhook.o -MD -MP -MF .deps/vt_memhook.Tpo -c -o
>> vt_memhook.o vt_memhook.c
>> gcc  -DHAVE_CONFIG_H -I. -I.. -I../tools/opari/lib
>> -I../extlib/otf/otflib -I../extlib/otf/otflib -D_GNU_SOURCE
>> -DBINDIR=\"/opt/openmpi/1.3rc3r20226-V00/gcc/bin\"
>> -DDATADIR=\"/opt/openmpi/1.3rc3r20226-V00/gcc/share\" -DRFG
>> -DVT_MEMHOOK -DVT_IOWRAP  -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2
>> -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m64
>> -mtune=generic -MT vt_memreg.o -MD -MP -MF .deps/vt_memreg.Tpo -c -o
>> vt_memreg.o vt_memreg.c
>> gcc  -DHAVE_CONFIG_H -I. -I.. -I../tools/opari/lib
>> -I../extlib/otf/otflib -I../extlib/otf/otflib -D_GNU_SOURCE
>> -DBINDIR=\"/opt/openmpi/1.3rc3r20226-V00/gcc/bin\"
>> -DDATADIR=\"/opt/openmpi/1.3rc3r20226-V00/gcc/share\" -DRFG
>> -DVT_MEMHOOK -DVT_IOWRAP  -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2
>> -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m64
>> -mtune=generic -MT vt_iowrap.o -MD -MP -MF .deps/vt_iowrap.Tpo -c -o
>> vt_iowrap.o vt_iowrap.c
>> mv -f .deps/vt_memreg.Tpo .deps/vt_memreg.Po
>> gcc  -DHAVE_CONFIG_H -I. -I.. -I../tools/opari/lib
>> -I../extlib/otf/otflib -I../extlib/otf/otflib -D_GNU_SOURCE
>> -DBINDIR=\"/opt/openmpi/1.3rc3r20226-V00/gcc/bin\"
>> -DDATADIR=\"/opt/openmpi/1.3rc3r20226-V00/gcc/share\" -DRFG
>> -DVT_MEMHOOK -DVT_IOWRAP  -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2
>> -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m64
>> -mtune=generic -MT vt_iowrap_helper.o -MD -MP -MF
>> .deps/vt_iowrap_helper.Tpo -c -o vt_iowrap_helper.o vt_iowrap_helper.c
>> mv -f .deps/vt_memhook.Tpo .deps/vt_memhook.Po
>> gcc  -DHAVE_CONFIG_H -I. -I.. -I../tools/opari/lib
>> -I../extlib/otf/otflib -I../extlib/otf/otflib -D_GNU_SOURCE
>> -DBINDIR=\"/opt/openmpi/1.3rc3r20226-V00/gcc/bin\"
>> -DDATADIR=\"/opt/openmpi/1.3rc3r20226-V00/gcc/share\" -DRFG
>> -DVT_MEMHOOK -DVT_IOWRAP  -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2
>> -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m64
>> -mtune=generic -MT rfg_regions.o -MD -MP -MF .deps/rfg_regions.Tpo -c -o
>> rfg_regions.o rfg_regions.c
>> vt_iowrap.c:1242: error: expected declaration specifiers or '...' before
>> numeric constant
>> vt_iowrap.c:1243: error: conflicting types for '__fprintf_chk'
>> mv -f .deps/vt_comp_gnu.Tpo .deps/vt_comp_gnu.Po
>> make[5]: *** [vt_iowrap.o] Error 1
>> make[5]: *** Waiting for unfinished jobs
>> mv -f .deps/vt_iowrap_helper.Tpo .deps/vt_iowrap_helper.Po
>> mv -f .deps/rfg_regions.Tpo .deps/rfg_regions.Po
>> make[5]: Leaving directory
>> `/hpc/home/USERS/lennyb/work/svn/release/scripts/dist-1.3--1/OMPI/BUILD/
>> openmpi-1.3rc3r20226/ompi/contrib/vt/vt/

[OMPI devel] OpenMPI rpm build 1.3rc3r20226 build failed

2009-01-08 Thread Lenny Verkhovsky
Hi,

I am trying to build an RPM from the nightly snapshots of 1.3

with the downloaded buildrpm.sh and ompi.spec file from
http://svn.open-mpi.org/svn/ompi/branches/v1.3/contrib/dist/linux/

I am getting this error
.
Making all in vtlib
make[5]: Entering directory
`/hpc/home/USERS/lennyb/work/svn/release/scripts/dist-1.3--1/OMPI/BUILD/
openmpi-1.3rc3r20226/ompi/contrib/vt/vt/vtlib'
gcc  -DHAVE_CONFIG_H -I. -I.. -I../tools/opari/lib
-I../extlib/otf/otflib -I../extlib/otf/otflib -D_GNU_SOURCE
-DBINDIR=\"/opt/openmpi/1.3rc3r20226-V00/gcc/bin\"
-DDATADIR=\"/opt/openmpi/1.3rc3r20226-V00/gcc/share\" -DRFG
-DVT_MEMHOOK -DVT_IOWRAP  -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2
-fexceptions -fstack-protector --param=ssp-buffer-size=4 -m64
-mtune=generic -MT vt_comp_gnu.o -MD -MP -MF .deps/vt_comp_gnu.Tpo -c -o
vt_comp_gnu.o vt_comp_gnu.c
gcc  -DHAVE_CONFIG_H -I. -I.. -I../tools/opari/lib
-I../extlib/otf/otflib -I../extlib/otf/otflib -D_GNU_SOURCE
-DBINDIR=\"/opt/openmpi/1.3rc3r20226-V00/gcc/bin\"
-DDATADIR=\"/opt/openmpi/1.3rc3r20226-V00/gcc/share\" -DRFG
-DVT_MEMHOOK -DVT_IOWRAP  -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2
-fexceptions -fstack-protector --param=ssp-buffer-size=4 -m64
-mtune=generic -MT vt_memhook.o -MD -MP -MF .deps/vt_memhook.Tpo -c -o
vt_memhook.o vt_memhook.c
gcc  -DHAVE_CONFIG_H -I. -I.. -I../tools/opari/lib
-I../extlib/otf/otflib -I../extlib/otf/otflib -D_GNU_SOURCE
-DBINDIR=\"/opt/openmpi/1.3rc3r20226-V00/gcc/bin\"
-DDATADIR=\"/opt/openmpi/1.3rc3r20226-V00/gcc/share\" -DRFG
-DVT_MEMHOOK -DVT_IOWRAP  -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2
-fexceptions -fstack-protector --param=ssp-buffer-size=4 -m64
-mtune=generic -MT vt_memreg.o -MD -MP -MF .deps/vt_memreg.Tpo -c -o
vt_memreg.o vt_memreg.c
gcc  -DHAVE_CONFIG_H -I. -I.. -I../tools/opari/lib
-I../extlib/otf/otflib -I../extlib/otf/otflib -D_GNU_SOURCE
-DBINDIR=\"/opt/openmpi/1.3rc3r20226-V00/gcc/bin\"
-DDATADIR=\"/opt/openmpi/1.3rc3r20226-V00/gcc/share\" -DRFG
-DVT_MEMHOOK -DVT_IOWRAP  -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2
-fexceptions -fstack-protector --param=ssp-buffer-size=4 -m64
-mtune=generic -MT vt_iowrap.o -MD -MP -MF .deps/vt_iowrap.Tpo -c -o
vt_iowrap.o vt_iowrap.c
mv -f .deps/vt_memreg.Tpo .deps/vt_memreg.Po
gcc  -DHAVE_CONFIG_H -I. -I.. -I../tools/opari/lib
-I../extlib/otf/otflib -I../extlib/otf/otflib -D_GNU_SOURCE
-DBINDIR=\"/opt/openmpi/1.3rc3r20226-V00/gcc/bin\"
-DDATADIR=\"/opt/openmpi/1.3rc3r20226-V00/gcc/share\" -DRFG
-DVT_MEMHOOK -DVT_IOWRAP  -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2
-fexceptions -fstack-protector --param=ssp-buffer-size=4 -m64
-mtune=generic -MT vt_iowrap_helper.o -MD -MP -MF
.deps/vt_iowrap_helper.Tpo -c -o vt_iowrap_helper.o vt_iowrap_helper.c
mv -f .deps/vt_memhook.Tpo .deps/vt_memhook.Po
gcc  -DHAVE_CONFIG_H -I. -I.. -I../tools/opari/lib
-I../extlib/otf/otflib -I../extlib/otf/otflib -D_GNU_SOURCE
-DBINDIR=\"/opt/openmpi/1.3rc3r20226-V00/gcc/bin\"
-DDATADIR=\"/opt/openmpi/1.3rc3r20226-V00/gcc/share\" -DRFG
-DVT_MEMHOOK -DVT_IOWRAP  -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2
-fexceptions -fstack-protector --param=ssp-buffer-size=4 -m64
-mtune=generic -MT rfg_regions.o -MD -MP -MF .deps/rfg_regions.Tpo -c -o
rfg_regions.o rfg_regions.c
vt_iowrap.c:1242: error: expected declaration specifiers or '...' before
numeric constant
vt_iowrap.c:1243: error: conflicting types for '__fprintf_chk'
mv -f .deps/vt_comp_gnu.Tpo .deps/vt_comp_gnu.Po
make[5]: *** [vt_iowrap.o] Error 1
make[5]: *** Waiting for unfinished jobs
mv -f .deps/vt_iowrap_helper.Tpo .deps/vt_iowrap_helper.Po
mv -f .deps/rfg_regions.Tpo .deps/rfg_regions.Po
make[5]: Leaving directory
`/hpc/home/USERS/lennyb/work/svn/release/scripts/dist-1.3--1/OMPI/BUILD/
openmpi-1.3rc3r20226/ompi/contrib/vt/vt/vtlib'
make[4]: *** [all-recursive] Error 1
make[4]: Leaving directory
`/hpc/home/USERS/lennyb/work/svn/release/scripts/dist-1.3--1/OMPI/BUILD/
openmpi-1.3rc3r20226/ompi/contrib/vt/vt'
make[3]: *** [all] Error 2
make[3]: Leaving directory
`/hpc/home/USERS/lennyb/work/svn/release/scripts/dist-1.3--1/OMPI/BUILD/
openmpi-1.3rc3r20226/ompi/contrib/vt/vt'
make[2]: *** [all-recursive] Error 1
make[2]: Leaving directory
`/hpc/home/USERS/lennyb/work/svn/release/scripts/dist-1.3--1/OMPI/BUILD/
openmpi-1.3rc3r20226/ompi/contrib/vt'
make[1]: *** [all-recursive] Error 1
make[1]: Leaving directory
`/hpc/home/USERS/lennyb/work/svn/release/scripts/dist-1.3--1/OMPI/BUILD/
openmpi-1.3rc3r20226/ompi'
make: *** [all-recursive] Error 1
error: Bad exit status from /var/tmp/rpm-tmp.32080 (%build)


RPM build errors:
Bad exit status from /var/tmp/rpm-tmp.32080 (%build)



RPM build errors:
Bad exit status from /var/tmp/rpm-tmp.32080 (%build)


full error.log attached


thanks,
Lenny.
Installing 
/hpc/home/USERS/lennyb/work/svn/release/scripts/dist-1.3--1/rpmroot/SRPMS/openmpi-1.3rc3r20226-V00.src.rpm
Executing(%prep): /bin/sh -e /var/tmp/rpm-tmp.32623
+ umask 022
+ cd 

Re: [OMPI devel] Disappearing for US holidays...

2008-12-22 Thread Lenny Verkhovsky
Happy holidays

On Mon, Dec 22, 2008 at 4:09 PM, Ralph Castain  wrote:
> Ditto here. In fact, LANL is once again shutting off external email access
> beginning the evening of the 24th, resuming on Jan 5th.
>
> Those of you who need me know how to reach me via alternative channels  :-))
>
> Ralph
>
> On Dec 22, 2008, at 7:04 AM, Jeff Squyres wrote:
>
>> FYI: I'm going mostly offline for the next two weeks.  Depending on how
>> stir-crazy I get, I may still surface periodically for some email.  So don't
>> expect quick replies from me until January.  :-)
>>
>> Due to Cisco corporate policy, I'll be shutting down most of my MTT runs
>> as well (power savings, yadda yadda yadda -- we're not likely to be changing
>> much in the code base over the holidays, anyway).
>>
>> --
>> Jeff Squyres
>> Cisco Systems
>>
>> ___
>> devel mailing list
>> de...@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>
> ___
> devel mailing list
> de...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>


Re: [OMPI devel] [OMPI users] OpenMPI with openib partitions

2008-10-07 Thread Lenny Verkhovsky
Hi Matt,

It seems that the right way to do it is the following:

-mca btl openib,self -mca btl_openib_ib_pkey_val 33033

where the value is the decimal form of the pkey (in your case 0x8109 = 33033),
and there is no need for the btl_openib_ib_pkey_ix value.

ex.

mpirun -np 2 -H witch2,witch3 -mca btl openib,self -mca
btl_openib_ib_pkey_val 32769 ./mpi_p1_4_1_2 -t lt
LT (2) (size min max avg) 1 3.511429 3.511429 3.511429

If it's not working, check cat /sys/class/infiniband/mthca0/ports/1/pkeys/*
for the pkeys and the SM; maybe it's a setup issue.

Pasha is currently checking this issue.

Best regards,

Lenny.





On 10/7/08, Jeff Squyres <jsquy...@cisco.com> wrote:
>
> FWIW, if this configuration is for all of your users, you might want to
> specify these MCA params in the default MCA param file, or the environment,
> ...etc.  Just so that you don't have to specify it on every mpirun command
> line.
>
> See http://www.open-mpi.org/faq/?category=tuning#setting-mca-params.
>
>
> On Oct 7, 2008, at 5:43 AM, Lenny Verkhovsky wrote:
>
>  Sorry, misunderstood the question,
>>
>> thanks for Pasha the right command line will be
>>
>> -mca btl openib,self -mca btl_openib_of_pkey_val 0x8109 -mca
>> btl_openib_of_pkey_ix 1
>>
>> ex.
>>
>> #mpirun -np 2 -H witch2,witch3 -mca btl openib,self -mca
>> btl_openib_of_pkey_val 0x8001 -mca btl_openib_of_pkey_ix 1 ./mpi_p1_4_TRUNK
>> -t lt
>> LT (2) (size min max avg) 1 3.443480 3.443480 3.443480
>>
>>
>> Best regards
>>
>> Lenny.
>>
>>
>> On 10/6/08, Jeff Squyres <jsquy...@cisco.com> wrote: On Oct 5, 2008, at
>> 1:22 PM, Lenny Verkhovsky wrote:
>>
>> you should probably use -mca tcp,self  -mca btl_openib_if_include ib0.8109
>>
>>
>> Really?  I thought we only took OpenFabrics device names in the
>> openib_if_include MCA param...?  It looks like ib0.8109 is an IPoIB device
>> name.
>>
>>
>>
>> Lenny.
>>
>>
>> On 10/3/08, Matt Burgess <burgess.m...@gmail.com> wrote:
>> Hi,
>>
>>
>> I'm trying to get openmpi working over openib partitions. On this cluster,
>> the partition number is 0x109. The ib interfaces are pingable over the
>> appropriate ib0.8109 interface:
>>
>> d2:/opt/openmpi-ib # ifconfig ib0.8109
>> ib0.8109  Link encap:UNSPEC  HWaddr
>> 80-00-00-4A-FE-80-00-00-00-00-00-00-00-00-00-00
>> inet addr:10.21.48.2  Bcast:10.21.255.255  Mask:255.255.0.0
>> inet6 addr: fe80::202:c902:26:ca01/64 Scope:Link
>> UP BROADCAST RUNNING MULTICAST  MTU:65520  Metric:1
>> RX packets:16811 errors:0 dropped:0 overruns:0 frame:0
>> TX packets:15848 errors:0 dropped:1 overruns:0 carrier:0
>> collisions:0 txqueuelen:256
>> RX bytes:102229428 (97.4 Mb)  TX bytes:102324172 (97.5 Mb)
>>
>>
>> I have tried the following:
>>
>> /opt/openmpi-ib/1.2.6/bin/mpirun -np 2 -machinefile machinefile -mca btl
>> openib,self -mca btl_openib_max_btls 1 -mca btl_openib_ib_pkey_val 0x8109
>> -mca btl_openib_ib_pkey_ix 1 /cluster/pallas/x86_64-ib/IMB-MPI1
>>
>> but I just get a RETRY EXCEEDED ERROR. Is there a MCA parameter I am
>> missing?
>>
>> I was successful using tcp only:
>>
>> /opt/openmpi-ib/1.2.6/bin/mpirun -np 2 -machinefile machinefile -mca btl
>> tcp,self -mca btl_openib_max_btls 1 -mca btl_openib_ib_pkey_val 0x8109
>> /cluster/pallas/x86_64-ib/IMB-MPI1
>>
>>
>>
>> Thanks,
>> Matt Burgess
>>
>> ___
>> users mailing list
>> us...@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>>
>> ___
>> users mailing list
>> us...@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>>
>>
>> --
>> Jeff Squyres
>> Cisco Systems
>>
>>
>> ___
>> users mailing list
>> us...@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>>
>>
>
> --
> Jeff Squyres
> Cisco Systems
>
>


Re: [OMPI devel] RDMA_CM

2008-09-25 Thread Lenny Verkhovsky
I think it's an sm bug again. I tested with the latest revision, I think it
was r19588 (before Jeff shut the svn down).
I ran the mpi_p test (BW between pairs of nodes) with many nodes and it
got stuck; it also works without sm.  I am sorry I couldn't test it
earlier.
# i=1 ; while [ 1 ] ; do echo " ** i=$i  ";
/home/USERS/lenny/OMPI_ORTE_TRUNK/bin/mpirun -np 84 -hostfile hostfile
/home/USERS/lenny/TESTS/TRUNK/mpi_p1_4_TRUNK -t bw ; let i=i+1; sleep 1 ;
done
  ** i=1 
BW (84) (size min max avg) 1048576 660.152249 2075.115025 1325.838953
  ** i=2 
[stuck]

P.S. I will be on vacation until 5-Oct; I hope to follow the mails and run a
few tests.
Best Regards
Lenny.
On Thu, Sep 25, 2008 at 6:44 PM, Jeff Squyres <jsquy...@cisco.com> wrote:

> Note that there *are* other changes to the openib BTL in that branch
> besides just the CPC (meaning: changing the CPC meant changing other things
> as well).
>
> So if you can run with the trunk and you can't run with this branch, then
> there may be something different wrong with the hg tree other than just the
> RDMA CM stuff...
>
> Let me know what you find.
>
>
> On Sep 25, 2008, at 9:21 AM, Lenny Verkhovsky wrote:
>
>>  after a few more tests it seems like -mca btl_openib_cpc_include oob hangs
>> too.
>>
>> so, maybe it's something environmental.
>>
>> let me recheck it.
>>
>>
>> On 9/25/08, Jeff Squyres <jsquy...@cisco.com> wrote: On Sep 25, 2008, at
>> 7:25 AM, Lenny Verkhovsky wrote:
>>
>> I have RDMACM hanging on np=16 (dual core dual cpu).
>>
>>
>> Yuck.  I've run all of the intel tests at 32 procs (4ppn).  What exactly
>> did you run and where exactly did it hang?  Can you get stack traces?
>>
>> it seems like it got hung on the last machine (
>> witch1,witch2,witch3,witch4)
>>
>> when I ctrl-c the mpirun, I got defunct procs on the last machine.
>>
>> #ps -ef |grep  mpi
>> root 5321 5320 98 14:09 ? 00:03:47 [mpi_p_TRUNK_rdm] 
>> root 5322 5320 98 14:09 ? 00:03:47 [mpi_p_TRUNK_rdm] 
>> root 5323 5320 98 14:09 ? 00:03:47 [mpi_p_TRUNK_rdm] 
>> root 5324 5320 98 14:09 ? 00:03:47 [mpi_p_TRUNK_rdm] 
>>
>>
>> Are you seeing ORTE problems?
>>
>> --
>> Jeff Squyres
>> Cisco Systems
>>
>>
>>
>
> --
> Jeff Squyres
> Cisco Systems
>
>


[OMPI devel] Fwd: svn error

2008-09-25 Thread Lenny Verkhovsky
sorry, didn't see Jeff's mail

-- Forwarded message --
From: Lenny Verkhovsky <lenny.verkhov...@gmail.com>
List-Post: devel@lists.open-mpi.org
Date: Thu, Sep 25, 2008 at 9:07 PM
Subject: svn error
To: Open MPI Developers <de...@open-mpi.org>


I experience

#svn up
svn: PROPFIND request failed on '/svn/ompi/trunk'
svn: PROPFIND of '/svn/ompi/trunk': 403 Forbidden (http://svn.open-mpi.org)

Lenny


[OMPI devel] svn error

2008-09-25 Thread Lenny Verkhovsky
I experience

#svn up
svn: PROPFIND request failed on '/svn/ompi/trunk'
svn: PROPFIND of '/svn/ompi/trunk': 403 Forbidden (http://svn.open-mpi.org)

Lenny


[OMPI devel] #1506

2008-09-23 Thread Lenny Verkhovsky
Hi George,

It seems like there is some data corruption in the Reduce_scatter function.

I discovered it when I added -DCHECK to the IMB benchmark, and it seems to have
been there for ages.

It runs with Voltaire MPI, but fails with OMPI. You will get a segv with
IMB 3.1 and an error with IMB 3.0.

host#VER=TRUNK ; /home/USERS/lenny/OMPI_ORTE_${VER}/bin/mpirun -np 2 -H
witch8 /home/BENCHMARKS/PALLAS/IMB_3.0v/src/IMB-MPI1_${VER} Reduce_scatter

#---
# Intel (R) MPI Benchmark Suite V3.0v modified by Voltaire, MPI-1 part
#---
# Date : Tue Sep 23 18:05:35 2008
# Machine : x86_64
# System : Linux
# Release : 2.6.16.46-0.12-smp
# Version : #1 SMP Thu May 17 14:00:09 UTC 2007
# MPI Version : 2.0
# MPI Thread Environment: MPI_THREAD_SINGLE

#
# Minimum message length in bytes: 0
# Maximum message length in bytes: 67108864
#
# MPI_Datatype : MPI_BYTE
# MPI_Datatype for reductions : MPI_FLOAT
# MPI_Op : MPI_SUM
#
#

# List of Benchmarks to run:

# Reduce_scatter

#-
# Benchmarking Reduce_scatter
# #processes = 2
#-
#Benchmarking #procs #bytes #repetitions t_min[usec] t_max[usec] t_avg[usec]
defects
Reduce_scatter 2 0 1000 0.05 0.05 0.05 0.00
0: Error Reduce_scatter, size = 4, sample #0
Process 0: Got invalid buffer:
Buffer entry: 817291591680.00
pos: 0
Process 0: Expected buffer:
Buffer entry: 0.00
Reduce_scatter 2 4 1000 0.98 1.06 1.02 1.00
Application error code 1 occurred
[witch8:10190] MPI_ABORT invoked on rank 0 in communicator MPI_COMM_WORLD
with errorcode 17
--
mpirun has exited due to process rank 0 with PID 10190 on
node witch8 exiting without calling "finalize". This may
have caused other processes in the application to be
terminated by signals sent by mpirun (as reported here).
--


[OMPI devel] Any problems with https://svn.open-mpi.org/trac/ompi/ ??

2008-09-18 Thread Lenny Verkhovsky
Any problems with https://svn.open-mpi.org/trac/ompi/ ??

I can't open a new ticket :(

Internal Server Error
The server encountered an internal error or misconfiguration and was unable
to complete your request.

Please contact the server administrator, osl-sysad...@osl.iu.edu and inform
them of the time the error occurred, and anything you might have done that
may have caused the error.

More information about this error may be available in the server error log.




Apache/2.0.52 (Red Hat) Server at svn.open-mpi.org Port 443

L.


[OMPI devel] mtt IBM reduce_scatter_in_place test failure

2008-09-16 Thread Lenny Verkhovsky
I am running mtt tests on our cluster and I found an error in the
IBM reduce_scatter_in_place test for np>8:

/home/USERS/lenny/OMPI_1_3_TRUNK/bin/mpirun -np 10 -H witch2
./reduce_scatter_in_place

[**WARNING**]: MPI_COMM_WORLD rank 4, file reduce_scatter_in_place.c:80:
bad answer (0) at index 0 of 1000 (should be 4)
[**WARNING**]: MPI_COMM_WORLD rank 3, file reduce_scatter_in_place.c:80:
[**WARNING**]: MPI_COMM_WORLD rank 2, file reduce_scatter_in_place.c:80:
bad answer (20916) at index 0 of 1000 (should be 2)
bad answer (0) at index 0 of 1000 (should be 3)
[**WARNING**]: MPI_COMM_WORLD rank 5, file reduce_scatter_in_place.c:80:
bad answer (0) at index 0 of 1000 (should be 5)
[**WARNING**]: MPI_COMM_WORLD rank 6, file reduce_scatter_in_place.c:80:
bad answer (0) at index 0 of 1000 (should be 6)
[**WARNING**]: MPI_COMM_WORLD rank 7, file reduce_scatter_in_place.c:80:
[**WARNING**]: MPI_COMM_WORLD rank 8, file reduce_scatter_in_place.c:80:
bad answer (0) at index 0 of 1000 (should be 8)
bad answer (0) at index 0 of 1000 (should be 7)
[**WARNING**]: MPI_COMM_WORLD rank 9, file reduce_scatter_in_place.c:80:
bad answer (0) at index 0 of 1000 (should be 9)
[**WARNING**]: MPI_COMM_WORLD rank 0, file reduce_scatter_in_place.c:80:
bad answer (-516024720) at index 0 of 1000 (should be 0)
[**WARNING**]: MPI_COMM_WORLD rank 1, file reduce_scatter_in_place.c:80:
bad answer (28112) at index 0 of 1000 (should be 1)

I think that the error is in the test itself.

--- sources/test_get__ibm/ibm/collective/reduce_scatter_in_place.c
2005-09-28 18:11:37.0 +0300
+++ installs/LKcC/tests/ibm/ibm/collective/reduce_scatter_in_place.c
2008-09-16 19:32:48.0 +0300
@@ -64,7 +64,7 @@ int main(int argc, char **argv)
  ompitest_error(__FILE__, __LINE__, "Doh! Rank %d was not able to allocate
enough memory. MPI test aborted!\n", myself);
  }

- for (j = 1; j <= MAXLEN; j *= 10) {
+ for (j = 1; j < MAXLEN; j *= 10) {
  for (i = 0; i < tasks; i++) {
  recvcounts[i] = j;
  }

I am not sure if this is the right fix, or who can review/commit it to the
test trunk.
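
For context, the communication pattern the test exercises is MPI_Reduce_scatter
over growing block sizes. A minimal self-contained sketch (with my own buffer
sizes, fill values and checks; the IBM test itself uses the MPI_IN_PLACE variant
and different verification data) would look something like this:

#include <stdio.h>
#include <stdlib.h>
#include <mpi.h>

#define MAXLEN 1000              /* assumed value, for illustration only */

int main(int argc, char **argv)
{
    int rank, size, i, j;
    int *recvcounts, *sendbuf, *recvbuf;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    recvcounts = malloc(size * sizeof(int));
    sendbuf    = malloc((size_t)size * MAXLEN * sizeof(int));
    recvbuf    = malloc(MAXLEN * sizeof(int));

    for (j = 1; j <= MAXLEN; j *= 10) {
        for (i = 0; i < size; i++)
            recvcounts[i] = j;               /* j elements go to each rank */
        for (i = 0; i < size * j; i++)
            sendbuf[i] = 1;                  /* every rank contributes 1 */

        MPI_Reduce_scatter(sendbuf, recvbuf, recvcounts, MPI_INT,
                           MPI_SUM, MPI_COMM_WORLD);

        /* after the element-wise sum, every element of this rank's block
           should equal the number of ranks */
        for (i = 0; i < j; i++)
            if (recvbuf[i] != size)
                printf("rank %d: bad answer (%d) at index %d of %d (should be %d)\n",
                       rank, recvbuf[i], i, j, size);
    }

    free(recvcounts);
    free(sendbuf);
    free(recvbuf);
    MPI_Finalize();
    return 0;
}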

Best regards

Lenny.


Re: [OMPI devel] ticket #1469

2008-09-11 Thread Lenny Verkhovsky
It now seems to be fixed with r19538.

On 9/10/08, Ralph Castain <r...@lanl.gov> wrote:
>
> I'm sorry - I can't even make sense of this. If you think you can reproduce
> it, then you are welcome to fix it. I cannot reproduce it, and hence can do
> nothing further about it.
>
> Ralph
>
>
> On Sep 10, 2008, at 2:01 AM, Lenny Verkhovsky wrote:
>
>  Hi Ralph,
>>
>> I can recreate this failure; I think it is caused by the fact that we do not
>> launch orted on the last node (though I didn't check that), since np < number
>> of hosts.
>>
>> I used the following configure line: ../configure
>> --prefix=/home/USERS/lenny/OMPI_ORTE_TRUNK
>>
>> on OMPI 1.4a1r19522
>> Hope it helped.
>>
>> #mpirun -np 3 -H witch2 ./spawn_multiple
>> Parent: 1 of 3, witch2 (1 in init)
>> Parent: 0 of 3, witch2 (1 in init)
>> Parent: 2 of 3, witch2 (1 in init)
>> #mpirun -np 3 -H witch2,witch3 ./spawn_multiple
>> Parent: 0 of 3, witch2 (0 in init)
>> Parent: 2 of 3, witch2 (0 in init)
>> Parent: 1 of 3, witch3 (0 in init)
>> #mpirun -np 3 -H witch2,witch3,witch4 ./spawn_multiple
>> Parent: 0 of 3, witch2 (0 in init)
>> Parent: 1 of 3, witch3 (0 in init)
>> Parent: 2 of 3, witch4 (0 in init)
>> #mpirun -np 3 -H witch2,witch3,witch4,witch5 ./spawn_multiple
>> Parent: 0 of 3, witch2 (0 in init)
>> Parent: 1 of 3, witch3 (0 in init)
>> Parent: 2 of 3, witch4 (0 in init)
>> [witch1:04806] *** Process received signal ***
>> [witch1:04806] Signal: Segmentation fault (11)
>> [witch1:04806] Signal code: Address not mapped (1)
>> [witch1:04806] Failing at address: 0x38
>> [witch1:04806] [ 0] /lib64/libpthread.so.0 [0x2af5324e9c10]
>> [witch1:04806] [ 1]
>> /home/USERS/lenny/OMPI_ORTE_TRUNK/lib/libopen-rte.so.0(orte_plm_base_app_report_launch+0x27a)
>> [0x2af531de3dca]
>> [witch1:04806] [ 2] /home/USERS/lenny/OMPI_ORTE_TRUNK/lib/libopen-pal.so.0
>> [0x2af531f161bb]
>> [witch1:04806] [ 3] /home/USERS/lenny/OMPI_ORTE_TRUNK/bin/mpirun
>> [0x40378f]
>> [witch1:04806] [ 4] /home/USERS/lenny/OMPI_ORTE_TRUNK/lib/libopen-pal.so.0
>> [0x2af531f161bb]
>> [witch1:04806] [ 5]
>> /home/USERS/lenny/OMPI_ORTE_TRUNK/lib/libopen-pal.so.0(opal_progress+0x9e)
>> [0x2af531f0bf5e]
>> [witch1:04806] [ 6]
>> /home/USERS/lenny/OMPI_ORTE_TRUNK/lib/libopen-rte.so.0(orte_trigger_event+0x44)
>> [0x2af531dc6c84]
>> [witch1:04806] [ 7]
>> /home/USERS/lenny/OMPI_ORTE_TRUNK/lib/libopen-rte.so.0(orte_plm_base_app_report_launch+0x20b)
>> [0x2af531de3d5b]
>> [witch1:04806] [ 8] /home/USERS/lenny/OMPI_ORTE_TRUNK/lib/libopen-pal.so.0
>> [0x2af531f161bb]
>> [witch1:04806] [ 9]
>> /home/USERS/lenny/OMPI_ORTE_TRUNK/lib/libopen-pal.so.0(opal_progress+0x9e)
>> [0x2af531f0bf5e]
>> [witch1:04806] [10]
>> /home/USERS/lenny/OMPI_ORTE_TRUNK/lib/libopen-rte.so.0(orte_plm_base_launch_apps+0x227)
>> [0x2af531de47e7]
>> [witch1:04806] [11]
>> /home/USERS/lenny/OMPI_ORTE_TRUNK/lib/openmpi/mca_plm_rsh.so
>> [0x2af532c38d3d]
>> [witch1:04806] [12]
>> /home/USERS/lenny/OMPI_ORTE_TRUNK/lib/libopen-rte.so.0(orte_plm_base_receive_process_msg+0x456)
>> [0x2af531de3086]
>> [witch1:04806] [13] /home/USERS/lenny/OMPI_ORTE_TRUNK/lib/libopen-pal.so.0
>> [0x2af531f161bb]
>> [witch1:04806] [14] /home/USERS/lenny/OMPI_ORTE_TRUNK/bin/mpirun
>> [0x4033bc]
>> [witch1:04806] [15] /home/USERS/lenny/OMPI_ORTE_TRUNK/bin/mpirun
>> [0x402c23]
>> [witch1:04806] [16] /lib64/libc.so.6(__libc_start_main+0xf4)
>> [0x2af532610154]
>> [witch1:04806] [17] /home/USERS/lenny/OMPI_ORTE_TRUNK/bin/mpirun
>> [0x402b79]
>> [witch1:04806] *** End of error message ***
>> Segmentation fault
>>
>> Lenny.
>>
>>
>


Re: [OMPI devel] mtt IBM SPAWN error

2008-09-04 Thread Lenny Verkhovsky
isn't it related to https://svn.open-mpi.org/trac/ompi/ticket/1469 ?

On 6/30/08, Lenny Verkhovsky <lenny.verkhov...@gmail.com> wrote:
>
> I am not familiar with the IBM spawn test, but maybe this is the right
> behavior: if the spawn test allocates 3 ranks on the node and then allocates
> another 3, then this test is supposed to fail due to max_slots=4.
>
> But it fails with the following hostfile as well, BUT WITH A DIFFERENT
> ERROR.
>
> #cat hostfile2
> witch2 slots=4 max_slots=4
> witch3 slots=4 max_slots=4
> witch1:/home/BENCHMARKS/IBM # /home/USERS/lenny/OMPI_ORTE_18772/bin/mpirun
> -np 3 -hostfile hostfile2 dynamic/spawn
> bash: orted: command not found
> [witch1:22789]
> --
> A daemon (pid 22791) died unexpectedly with status 127 while attempting
> to launch so we are aborting.
> There may be more information reported by the environment (see above).
> This may be because the daemon was unable to find all the needed shared
> libraries on the remote node. You may set your LD_LIBRARY_PATH to have the
> location of the shared libraries on the remote nodes and this will
> automatically be forwarded to the remote nodes.
> --
> [witch1:22789]
> --
> mpirun was unable to cleanly terminate the daemons on the nodes shown
> below. Additional manual cleanup may be required - please refer to
> the "orte-clean" tool for assistance.
> ------
> witch3 - daemon did not report back when launched
>
> On Mon, Jun 30, 2008 at 9:38 AM, Lenny Verkhovsky <
> lenny.verkhov...@gmail.com> wrote:
>
>> Hi,
>> trying to run mtt I failed to run IBM spawn test. It fails only when using
>> hostfile, and not when using host list.
>> ( OMPI from TRUNK )
>>
>> This is working :
>> #mpirun -np 3 -H witch2 dynamic/spawn
>>
>> This Fails:
>> # cat hostfile
>> witch2 slots=4 max_slots=4
>>
>> #mpirun -np 3 -hostfile hostfile dynamic/spawn
>> [witch1:12392]
>> --
>> There are not enough slots available in the system to satisfy the 3 slots
>> that were requested by the application:
>>   dynamic/spawn
>>
>> Either request fewer slots for your application, or make more slots
>> available
>> for use.
>> --
>> [witch1:12392]
>> --
>> A daemon (pid unknown) died unexpectedly on signal 1  while attempting to
>> launch so we are aborting.
>>
>> There may be more information reported by the environment (see above).
>> This may be because the daemon was unable to find all the needed shared
>> libraries on the remote node. You may set your LD_LIBRARY_PATH to have the
>> location of the shared libraries on the remote nodes and this will
>> automatically be forwarded to the remote nodes.
>> --
>> mpirun: clean termination accomplished
>>
>>
>> Using hostfile1 also works
>> #cat hostfile1
>> witch2
>> witch2
>> witch2
>>
>>
>> Best Regards
>> Lenny.
>>
>
>


Re: [OMPI devel] TRUNK error ( MAN page ) since r19120

2008-08-04 Thread Lenny Verkhovsky
automake (GNU automake) 1.10

autoconf (GNU Autoconf) 2.61

ltmain.sh (GNU libtool) 1.5.24 (1.1220.2.455 2007/06/24 02:13:29)


On 8/3/08, Jeff Squyres <jsquy...@cisco.com> wrote:
>
> What version of the GNU auto tools are you using?
>
> On Aug 3, 2008, at 9:42 AM, Lenny Verkhovsky wrote:
>
>  I downloaded from the trunk,
>>
>> ./autogen.sh
>>
>> ./configure .
>>
>> make all install
>>
>> config.log attached
>>
>>
>> On 8/3/08, Jeff Squyres <jsquy...@cisco.com> wrote: We are not seeing
>> this error; can you please send all the required info?  This is not enough
>> info to diagnose why you are seeing it and we are not.
>>
>>   http://www.open-mpi.org/community/help/
>>
>>
>>
>> On Aug 3, 2008, at 8:53 AM, Lenny Verkhovsky wrote:
>>
>> Hi,
>>
>> I experience this error since r19120
>>
>> #make all install
>>
>> ( a lot of output )
>>
>> creating libmpi.la
>> (cd .libs && rm -f libmpi.la && ln -s ../libmpi.la libmpi.la)
>> Creating mpi/man/man3/MPI.3 man page...
>> /bin/sh: mpi/man/man3/MPI.3: No such file or directory
>> make[2]: *** [mpi/man/man3/MPI.3] Error 1
>> make[2]: Leaving directory
>> `/home/USERS/lenny/OMPI_ORTE_CODE/ompi-trunk_19120/build/ompi'
>> make[1]: *** [all-recursive] Error 1
>> make[1]: Leaving directory
>> `/home/USERS/lenny/OMPI_ORTE_CODE/ompi-trunk_19120/build/ompi'
>> make: *** [all-recursive] Error 1
>>
>>
>> p.s. of course I  run autogen and configure.
>>
>>
>> Lenny.
>>
>> ___
>> devel mailing list
>> de...@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>>
>>
>> --
>> Jeff Squyres
>> Cisco Systems
>>
>> ___
>> devel mailing list
>> de...@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>>
>>
>>
>> ___
>> devel mailing list
>> de...@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>>
>
>
> --
> Jeff Squyres
> Cisco Systems
>
> ___
> devel mailing list
> de...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>


[OMPI devel] new Open MPI team member

2008-08-04 Thread Lenny Verkhovsky
Hi all,
I would like to introduce to you Mike Dubman. He recently joined Voltaire and
will replace Sharon Melamed, who has moved to another position within Voltaire,
as HPC team manager.

Mike will need a login and password to the Open MPI SVN repository.

His email is mike.o...@gmail.com

We are sending updated http://www.open-mpi.org/community/contribute/open-
pi-corporate-contributor-agrement.pdf by mail as well.


Best Regards,
Lenny.


Re: [OMPI devel] TRUNK error ( MAN page ) since r19120

2008-08-03 Thread Lenny Verkhovsky
I downloaded from the trunk,

./autogen.sh

./configure .

make all install

config.log attached

On 8/3/08, Jeff Squyres <jsquy...@cisco.com> wrote:
>
> We are not seeing this error; can you please send all the required info?
>  This is not enough info to diagnose why you are seeing it and we are not.
>
>http://www.open-mpi.org/community/help/
>
>
> On Aug 3, 2008, at 8:53 AM, Lenny Verkhovsky wrote:
>
>  Hi,
>>
>> I experience this error since r19120
>>
>> #make all install
>>
>> ( a lot of output )
>>
>> creating libmpi.la
>> (cd .libs && rm -f libmpi.la && ln -s ../libmpi.la libmpi.la)
>> Creating mpi/man/man3/MPI.3 man page...
>> /bin/sh: mpi/man/man3/MPI.3: No such file or directory
>> make[2]: *** [mpi/man/man3/MPI.3] Error 1
>> make[2]: Leaving directory
>> `/home/USERS/lenny/OMPI_ORTE_CODE/ompi-trunk_19120/build/ompi'
>> make[1]: *** [all-recursive] Error 1
>> make[1]: Leaving directory
>> `/home/USERS/lenny/OMPI_ORTE_CODE/ompi-trunk_19120/build/ompi'
>> make: *** [all-recursive] Error 1
>>
>>
>> p.s. of course I  run autogen and configure.
>>
>>
>> Lenny.
>>
>> ___
>> devel mailing list
>> de...@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>>
>
>
> --
> Jeff Squyres
> Cisco Systems
>
> ___
> devel mailing list
> de...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>




[OMPI devel] TRUNK error ( MAN page ) since r19120

2008-08-03 Thread Lenny Verkhovsky
Hi,

I experience this error since r19120

#make all install

( a lot of output )

creating libmpi.la
(cd .libs && rm -f libmpi.la && ln -s ../libmpi.la libmpi.la)
Creating mpi/man/man3/MPI.3 man page...
/bin/sh: mpi/man/man3/MPI.3: No such file or directory
make[2]: *** [mpi/man/man3/MPI.3] Error 1
make[2]: Leaving directory
`/home/USERS/lenny/OMPI_ORTE_CODE/ompi-trunk_19120/build/ompi'
make[1]: *** [all-recursive] Error 1
make[1]: Leaving directory
`/home/USERS/lenny/OMPI_ORTE_CODE/ompi-trunk_19120/build/ompi'
make: *** [all-recursive] Error 1

p.s. of course I  run autogen and configure.


Lenny.


Re: [OMPI devel] Change in slot_list specification

2008-07-30 Thread Lenny Verkhovsky
patch to 1.3 attached

On 7/30/08, Lenny Verkhovsky <lenny.verkhov...@gmail.com> wrote:
>
> few more details:
>
> 1. added new mca param rmaps_base_slot_list (r19062)
>
> 2. new -slot-list option to mpirun ( r19062)
>
> 3. old opal_paffinity_base_slot_list will be invisible ( r19096)
>
> 4. few bug fixes ( r19004)
>
> On 7/30/08, Lenny Verkhovsky <lenny.verkhov...@gmail.com> wrote:
>>
>> if there is no objection I want to bring it to 1.3
>>
>> ( r19062)
>>
>> On 7/28/08, Ralph Castain <r...@lanl.gov> wrote:
>>>
>>> Just an FYI for those of you working with slot_lists.
>>>
>>> Lenny, Jeff and I have changed the mca param associated with how you
>>> specify the slot list you want the rank_file mapper to use. This was done to
>>> avoid the possibility of ORTE processes such as mpirun and orted
>>> accidentally binding themselves to cores. The prior param was identical to
>>> the one used to tell MPI procs their core bindings - so if someone ever
>>> modified the paffinity system to detect the param and automatically
>>> perform the binding, mpirun and orted could both bind themselves to the
>>> specified cores...which isn't what we would want.
>>>
>>> The new param is "rmaps_base_slot_list". To make life easier, we also
>>> added a new orterun cmd line option --slot-list which acts as a shorthand
>>> for the new mca param.
>>>
>>> Ralph
>>>
>>> ___
>>> devel mailing list
>>> de...@open-mpi.org
>>> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>>>
>>
>>
>


patch.patch
Description: Binary data


Re: [OMPI devel] Change in slot_list specification

2008-07-30 Thread Lenny Verkhovsky
A few more details:

1. added a new mca param, rmaps_base_slot_list (r19062)

2. added a new -slot-list option to mpirun (r19062)

3. the old opal_paffinity_base_slot_list will be invisible (r19096)

4. a few bug fixes (r19004)
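
For illustration, after these changes a slot list can be given either way, e.g.
(a sketch only; the application name and the slot range "0-3" are placeholders):

mpirun --slot-list "0-3" -np 4 ./app
mpirun -mca rmaps_base_slot_list "0-3" -np 4 ./app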

On 7/30/08, Lenny Verkhovsky <lenny.verkhov...@gmail.com> wrote:
>
> if there is no objection I want to bring it to 1.3
>
> ( r19062)
>
> On 7/28/08, Ralph Castain <r...@lanl.gov> wrote:
>>
>> Just an FYI for those of you working with slot_lists.
>>
>> Lenny, Jeff and I have changed the mca param associated with how you
>> specify the slot list you want the rank_file mapper to use. This was done to
>> avoid the possibility of ORTE processes such as mpirun and orted
>> accidentally binding themselves to cores. The prior param was identical to
>> the one used to tell MPI procs their core bindings - so if someone ever
>> modified the paffinity system to detect the param and automatically
>> perform the binding, mpirun and orted could both bind themselves to the
>> specified cores...which isn't what we would want.
>>
>> The new param is "rmaps_base_slot_list". To make life easier, we also
>> added a new orterun cmd line option --slot-list which acts as a shorthand
>> for the new mca param.
>>
>> Ralph
>>
>> ___
>> devel mailing list
>> de...@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>>
>
>


Re: [OMPI devel] Change in slot_list specification

2008-07-30 Thread Lenny Verkhovsky
if there is no objection I want to bring it to 1.3

( r19062)

On 7/28/08, Ralph Castain  wrote:
>
> Just an FYI for those of you working with slot_lists.
>
> Lenny, Jeff and I have changed the mca param associated with how you
> specify the slot list you want the rank_file mapper to use. This was done to
> avoid the possibility of ORTE processes such as mpirun and orted
> accidentally binding themselves to cores. The prior param was identical to
> the one used to tell MPI procs their core bindings - so if someone ever
> modified the paffinity system to detect the param and automatically
> perform the binding, mpirun and orted could both bind themselves to the
> specified cores...which isn't what we would want.
>
> The new param is "rmaps_base_slot_list". To make life easier, we also added
> a new orterun cmd line option --slot-list which acts as a shorthand for the
> new mca param.
>
> Ralph
>
> ___
> devel mailing list
> de...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>


Re: [OMPI devel] Change in hostfile behavior

2008-07-29 Thread Lenny Verkhovsky
For two separate runs we can use the slot_list parameter
(opal_paffinity_base_slot_list) to get paffinity:

1: mpirun -mca opal_paffinity_base_slot_list "0-1"

2: mpirun -mca opal_paffinity_base_slot_list "2-3"

On 7/28/08, Ralph Castain  wrote:
>
> Actually, this is true today regardless of this change. If two separate
> mpirun invocations share a node and attempt to use paffinity, they will
> conflict with each other. The problem isn't caused by the hostfile
> sub-allocation. The problem is that the two mpiruns have no knowledge of
> each other's actions, and hence assign node ranks to each process
> independently.
>
> Thus, we would have two procs that think they are node rank=0 and should
> therefore bind to the 0 processor, and so on up the line.
>
> Obviously, if you run within one mpirun and have two app_contexts, the
> hostfile sub-allocation is fine - mpirun will track node rank across the
> app_contexts. It is only the use of multiple mpiruns that share nodes that
> causes the problem.
>
> Several of us have discussed this problem and have a proposed solution for
> 1.4. Once we get past 1.3 (someday!), we'll bring it to the group.
>
>
> On Jul 28, 2008, at 10:44 AM, Tim Mattox wrote:
>
>  My only concern is how will this interact with PLPA.
>> Say two Open MPI jobs each use "half" the cores (slots) on a
>> particular node...  how would they be able to bind themselves to
>> a disjoint set of cores?  I'm not asking you to solve this Ralph, I'm
>> just pointing it out so we can maybe warn users that if both jobs sharing
>> a node try to use processor affinity, we don't make that magically work
>> well,
>> and that we would expect it to do quite poorly.
>>
>> I could see disabling paffinity and/or warning if it was enabled for
>> one of these "fractional" nodes.
>>
>> On Mon, Jul 28, 2008 at 11:43 AM, Ralph Castain  wrote:
>>
>>> Per an earlier telecon, I have modified the hostfile behavior slightly to
>>> allow hostfiles to subdivide allocations.
>>>
>>> Briefly: given an allocation, we allow users to specify --hostfile on a
>>> per-app_context basis. In this mode, the hostfile info is used to filter
>>> the
>>> nodes that will be used for that app_context. However, the prior
>>> implementation only filtered the nodes themselves - i.e., it was a binary
>>> filter that allowed you to include or exclude an entire node.
>>>
>>> The change now allows you to include a specified #slots for a given node
>>> as
>>> opposed to -all- slots from that node. You are limited to the #slots
>>> included in the original allocation. I just realized that I hadn't output
>>> a
>>> warning if you attempt to violate this condition - will do so shortly.
>>> Rather than just abort if this happens, I set the allocation to that of
>>> the
>>> original - please let me know if you would prefer it to abort.
>>>
>>> If you have interest in this behavior, please check it out and let me
>>> know
>>> if this meets needs.
>>>
>>> Ralph
>>>
>>> ___
>>> devel mailing list
>>> de...@open-mpi.org
>>> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>>>
>>>
>>
>>
>> --
>> Tim Mattox, Ph.D. - http://homepage.mac.com/tmattox/
>> tmat...@gmail.com || timat...@open-mpi.org
>> I'm a bright... http://www.the-brights.net/
>> ___
>> devel mailing list
>> de...@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>>
>
> ___
> devel mailing list
> de...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>


Re: [OMPI devel] trunk hangs since r19010

2008-07-28 Thread Lenny Verkhovsky
Only openib works for me too,

but Glebs once told me that it's illegal and that I always need to use the
self btl.

On 7/28/08, Jeff Squyres <jsquy...@cisco.com> wrote:
>
> FWIW, all my MTT runs are hanging as well.
>
>
> On Jul 28, 2008, at 10:37 AM, Brad Benton wrote:
>
>> My experience is the same as Lenny's.  I've tested on x86_64 and ppc64
>> systems and tests using --mca btl  openib,self hang in all cases.
>>
>> --brad
>>
>>
>> 2008/7/28 Lenny Verkhovsky <lenny.verkhov...@gmail.com>
>> I failed to run on different nodes or on the same node via self,openib
>>
>>
>>
>>
>> On 7/28/08, Ralph Castain <r...@lanl.gov> wrote:
>> I checked this out some more and I believe it is ticket #1378 related. We
>> lock up if SM is included in the BTL's, which is what I had done on my test.
>> If I ^sm, I can run fine.
>>
>>
>> On Jul 28, 2008, at 6:41 AM, Ralph Castain wrote:
>>
>>  It could also be something new. Brad and I noted on Fri that IB was
>>> locking up as soon as we tried any cross-node communications. Hadn't seen
>>> that before, and at least I haven't explored it further - planned to do so
>>> today.
>>>
>>>
>>> On Jul 28, 2008, at 6:01 AM, Lenny Verkhovsky wrote:
>>>
>>  I believe it is.
>>>>
>>>> On 7/28/08, Jeff Squyres <jsquy...@cisco.com> wrote: On Jul 28, 2008,
>>>> at 7:51 AM, Jeff Squyres wrote:
>>>>
>>>> Is this related to r1378?
>>>>
>>>> Gah -- I meant #1378, meaning the "PML ob1 deadlock" ticket.
>>>>
>>>>
>>>>
>>>> On Jul 28, 2008, at 7:13 AM, Lenny Verkhovsky wrote:
>>>>
>>>> Hi,
>>>>
>>>> I experience hanging of tests ( latency ) since r19010
>>>>
>>>>
>>>> Best Regards
>>>>
>>>> Lenny.
>>>>
>>>> ___
>>>> devel mailing list
>>>> de...@open-mpi.org
>>>> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>>>>
>>>>
>>>> --
>>>> Jeff Squyres
>>>> Cisco Systems
>>>>
>>>>
>>>>
>>>> --
>>>> Jeff Squyres
>>>> Cisco Systems
>>>>
>>>> ___
>>>> devel mailing list
>>>> de...@open-mpi.org
>>>> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>>>>
>>>> ___
>>>> devel mailing list
>>>> de...@open-mpi.org
>>>> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>>>>
>>>
>>> ___
>>> devel mailing list
>>> de...@open-mpi.org
>>> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>>>
>>
>>
>> ___
>> devel mailing list
>> de...@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>>
>>
>> ___
>> devel mailing list
>> de...@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>>
>> ___
>> devel mailing list
>> de...@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>>
>
>
> --
> Jeff Squyres
> Cisco Systems
>
> ___
> devel mailing list
> de...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>


Re: [OMPI devel] trunk hangs since r19010

2008-07-28 Thread Lenny Verkhovsky
I failed to run on different nodes or on the same node via self,openib



On 7/28/08, Ralph Castain <r...@lanl.gov> wrote:
>
> I checked this out some more and I believe it is ticket #1378 related. We
> lock up if SM is included in the BTL's, which is what I had done on my test.
> If I ^sm, I can run fine.
>
> On Jul 28, 2008, at 6:41 AM, Ralph Castain wrote:
>
> It could also be something new. Brad and I noted on Fri that IB was locking
> up as soon as we tried any cross-node communications. Hadn't seen that
> before, and at least I haven't explored it further - planned to do so today.
>
> On Jul 28, 2008, at 6:01 AM, Lenny Verkhovsky wrote:
>
> I believe it is.
>
> On 7/28/08, Jeff Squyres <jsquy...@cisco.com> wrote:
>>
>> On Jul 28, 2008, at 7:51 AM, Jeff Squyres wrote:
>>
>>  Is this related to r1378?
>>>
>>
>> Gah -- I meant #1378, meaning the "PML ob1 deadlock" ticket.
>>
>>
>>  On Jul 28, 2008, at 7:13 AM, Lenny Verkhovsky wrote:
>>>
>>>  Hi,
>>>>
>>>> I experience hanging of tests ( latency ) since r19010
>>>>
>>>>
>>>> Best Regards
>>>>
>>>> Lenny.
>>>>
>>>> ___
>>>> devel mailing list
>>>> de...@open-mpi.org
>>>> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>>>>
>>>
>>>
>>> --
>>> Jeff Squyres
>>> Cisco Systems
>>>
>>>
>>
>> --
>> Jeff Squyres
>> Cisco Systems
>>
>> ___
>> devel mailing list
>> de...@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>>
>
> ___
> devel mailing list
> de...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>
>
> ___
> devel mailing list
> de...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>
>
>
> ___
> devel mailing list
> de...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>


Re: [OMPI devel] Funny warning message

2008-07-28 Thread Lenny Verkhovsky
It seems that the error crept into the help file.

Index: ompi/mca/btl/openib/help-mpi-btl-openib.txt
===
--- ompi/mca/btl/openib/help-mpi-btl-openib.txt (revision 19054)
+++ ompi/mca/btl/openib/help-mpi-btl-openib.txt (working copy)
@@ -497,7 +497,7 @@
 #
 [non optimal rd_win]
 WARNING: rd_win specification is non optimal. For maximum performance it is
-advisable to configure rd_win smaller then (rd_num - rd_low), but currently
+advisable to configure rd_win bigger then (rd_num - rd_low), but currently
 rd_win = %d and (rd_num - rd_low) = %d.
 #
 [apm without lmc]

Best regards

Lenny

On 7/28/08, Ralph Castain  wrote:
>
> Just got this warning today while trying to test IB connections. Last I
> checked, 32 was indeed smaller than 192...
>
> --
> WARNING: rd_win specification is non optimal. For maximum performance it is
> advisable to configure rd_win smaller then (rd_num - rd_low), but currently
> rd_win = 32 and (rd_num - rd_low) = 192.
> --
>
> Ralph
>
> ___
> devel mailing list
> de...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>


Re: [OMPI devel] trunk hangs since r19010

2008-07-28 Thread Lenny Verkhovsky
I believe it is.

On 7/28/08, Jeff Squyres <jsquy...@cisco.com> wrote:
>
> On Jul 28, 2008, at 7:51 AM, Jeff Squyres wrote:
>
>  Is this related to r1378?
>>
>
> Gah -- I meant #1378, meaning the "PML ob1 deadlock" ticket.
>
>
>  On Jul 28, 2008, at 7:13 AM, Lenny Verkhovsky wrote:
>>
>>  Hi,
>>>
>>> I experience hanging of tests ( latency ) since r19010
>>>
>>>
>>> Best Regards
>>>
>>> Lenny.
>>>
>>> ___
>>> devel mailing list
>>> de...@open-mpi.org
>>> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>>>
>>
>>
>> --
>> Jeff Squyres
>> Cisco Systems
>>
>>
>
> --
> Jeff Squyres
> Cisco Systems
>
> ___
> devel mailing list
> de...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>


Re: [OMPI devel] Fwd: [OMPI bugs] [Open MPI] #1250: Performance problem on SM

2008-07-23 Thread Lenny Verkhovsky
Can this also be a reason for the segv on NUMA nodes (#1382) that I can't
recreate?

On 7/23/08, Jeff Squyres  wrote:
>
> On Jul 23, 2008, at 10:37 AM, Terry Dontje wrote:
>
>  This seems to work for me too.  What is interesting is my experiments have
>> shown that if you run on RH5.1 you don't need to set mpi_yield_when_idle to
>> 0.
>>
>
> Yes, this makes sense -- on RHEL5.1, it's a much newer Linux kernel and
> PLPA works as expected there.  So ODLS uses the values that PLPA passes back
> and all is good.
>
> On older Linux kernels, we're effectively returning "not supported" from
> paffinity, and therefore ODLS (rightly) assumes that it can't know anything
> and puts us into the "oversubscribed" state.
>
> I'm working on a fix.
>
> --
> Jeff Squyres
> Cisco Systems
>
> ___
> devel mailing list
> de...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>


[OMPI devel] Fwd: [OMPI bugs] [Open MPI] #1250: Performance problem on SM

2008-07-23 Thread Lenny Verkhovsky
Sorry Terry, :).

-- Forwarded message --
From: Lenny Verkhovsky <lenny.verkhov...@gmail.com>
List-Post: devel@lists.open-mpi.org
Date: Jul 23, 2008 2:22 PM
Subject: Re: [OMPI devel] [OMPI bugs] [Open MPI] #1250: Performance problem
on SM
To: Lenny Berkhovsky <lenny.verkhov...@gmail.com>



On 7/23/08, Terry Dontje <terry.don...@sun.com> wrote:
>
> I didn't see any attached results on the email.
>
> --td
> Lenny Verkhovsky wrote:
>
>>
>> I rechecked it on the same node; still no degradation.
>>
>> See the results attached.
>>
>>
>> On 7/22/08, *Open MPI* <b...@open-mpi.org <mailto:b...@open-mpi.org>>
>> wrote:
>>
>>#1250: Performance problem on SM
>>
>>  +---
>>Reporter:  bosilca  |Owner:  bosilca
>>Type:  defect   |   Status:  assigned
>>Priority:  blocker  |Milestone:  Open MPI 1.3
>>  Version:   |   Resolution:
>>Keywords:   |
>>
>>  +---
>>
>>
>>Comment(by tdd):
>>
>>  Hmmm, Lennyve isn't your mpirun above going across nodes and not
>>on the
>>  same node?  I am running netpipe on a single node.
>>
>>
>>--
>>Ticket URL:
>><https://svn.open-mpi.org/trac/ompi/ticket/1250#comment:20>
>>
>>Open MPI <http://www.open-mpi.org/>
>>
>>
>>___
>>bugs mailing list
>>b...@open-mpi.org <mailto:b...@open-mpi.org>
>>http://www.open-mpi.org/mailman/listinfo.cgi/bugs
>>
>>
>> 
>>
>> ___
>> devel mailing list
>> de...@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>>
>>
>
>


NPmpi.log
Description: Binary data


Re: [OMPI devel] [OMPI bugs] [Open MPI] #1250: Performance problem on SM

2008-07-23 Thread Lenny Verkhovsky
I rechecked it on the same node; still no degradation.

See the results attached.

On 7/22/08, Open MPI  wrote:
>
> #1250: Performance problem on SM
>
> +---
> Reporter:  bosilca  |Owner:  bosilca
> Type:  defect   |   Status:  assigned
> Priority:  blocker  |Milestone:  Open MPI 1.3
>   Version:   |   Resolution:
> Keywords:   |
>
> +---
>
>
> Comment(by tdd):
>
>   Hmmm, Lennyve isn't your mpirun above going across nodes and not on the
>   same node?  I am running netpipe on a single node.
>
>
> --
> Ticket URL: 
>
> Open MPI 
>
>
> ___
> bugs mailing list
> b...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/bugs
>


[OMPI devel] PathScale compiler ( ticket #1326 )

2008-07-16 Thread Lenny Verkhovsky
Hi, Jeff,

I succeeded in compiling and running a simple C++ app with the PathScale 3.2
evaluation version; see below.

Maybe you had some installation / licence problems.

I was in touch with Ben (v...@pathscale.com) from PathScale support, who helped
me with the installation.

#head config.log
This file contains any messages produced by compilers while
running configure, to aid debugging if configure makes a mistake.
It was created by Open MPI configure 1.4a1, which was
generated by GNU Autoconf 2.61. Invocation command line was

  $ ../configure --with-memory-manager=ptmalloc2 --with-openib
--prefix=/home/USERS/lenny/OMPI_COMP_PATH CC=pathcc CXX=pathCC FC=pathf90
F77=pathf90 F90=pathf90

## - ##
## Platform. ##

/home/USERS/lenny/OMPI_COMP_PATH/bin/mpiCC -o hello_c_plus hello++.cc
In file included from /usr/include/c++/4.1.2/backward/iostream.h:31,
  from hello++.cc:34:
/usr/include/c++/4.1.2/backward/backward_warning.h:32:2: warning: #warning
This file includes at least one deprecated or antiquated header. Please
consider using one of the 32 headers found in section 17.4.1.2 of the C++
standard. Examples include substituting the <X> header for the <X.h> header
for C++ includes, or <iostream> instead of the deprecated header
<iostream.h>. To disable this warning use -Wno-deprecated.

witch1:/home/USERS/lenny/TESTS/COMPILERS #
/home/USERS/lenny/OMPI_COMP_PATH/bin/mpirun -np 2 -H witch16,witch17
./hello_c_plus
Hello World! I am 0 of 2
Hello World! I am 1 of 2


hello++.cc
Description: Binary data


Re: [OMPI devel] IBCM error

2008-07-14 Thread Lenny Verkhovsky
Seems to be fixed.

On 7/14/08, Lenny Verkhovsky <lenny.verkhov...@gmail.com> wrote:
>
> ../configure --with-memory-manager=ptmalloc2 --with-openib
>
> I guess not. I always use same configure line, and only recently I started
> to see this error.
>
> On 7/13/08, Jeff Squyres <jsquy...@cisco.com> wrote:
>>
>> I think you said opposite things: Lenny's command line did not
>> specifically ask for ibcm, but it was used anyway.  Lenny -- did you
>> explicitly request it somewhere else (e.g., env var or MCA param file)?
>>
>> I suspect that you did not; I suspect (without looking at the code again)
>> that ibcm tried to select itself and failed on the ibcm_listen() call, so it
>> fell back to oob.  This might have to be another workaround in OMPI, perhaps
>> something like this:
>>
>> if (ibcm_listen() fails)
>>   if (ibcm explicitly requested)
>>   print_warning()
>>   fail to use ibcm
>>
>> Has this been filed as a bug at openfabrics.org?  I don't think that I
>> filed it when Brad and I were testing on RoadRunner -- it would probably be
>> good if someone filed it.
>>
>>
>>
>> On Jul 13, 2008, at 8:56 AM, Lenny Verkhovsky wrote:
>>
>>  Pasha is right, I didn't disable it.
>>>
>>> On 7/13/08, Pavel Shamis (Pasha) <pa...@dev.mellanox.co.il> wrote: Jeff
>>> Squyres wrote:
>>> Brad and I did some scale testing of IBCM and saw this error sometimes.
>>>  It seemed to happen with higher frequency when you increased the number of
>>> processes on a single node.
>>>
>>> I talked to Sean Hefty about it, but we never figured out a definitive
>>> cause or solution.  My best guess is that there is something wonky about
>>> multiple processes simultaneously interacting with the IBCM kernel driver
>>> from userspace; but I don't know jack about kernel stuff, so that's a total
>>> SWAG.
>>>
>>> Thanks for reminding me of this issue; I admit that I had forgotten about
>>> it.  :-(  Pasha -- should IBCM not be the default?
>>> It is not default. I guess Lenny configured it explicitly, is not it ?
>>>
>>> Pasha.
>>>
>>>
>>>
>>>
>>>
>>> On Jul 13, 2008, at 7:08 AM, Lenny Verkhovsky wrote:
>>>
>>> Hi,
>>>
>>> I am getting this error sometimes.
>>>
>>> /home/USERS/lenny/OMPI_COMP_PATH/bin/mpirun -np 100 -hostfile
>>> /home/USERS/lenny/TESTS/COMPILERS/hostfile
>>> /home/USERS/lenny/TESTS/COMPILERS/hello
>>> [witch24][[32428,1],96][../../../../../ompi/mca/btl/openib/connect/btl_openib_connect_ibcm.c:769:ibcm_component_query]
>>> failed to ib_cm_listen 10 times: rc=-1, errno=22
>>> Hello world! I'm 0 of 100 on witch2
>>>
>>>
>>> Best Regards
>>>
>>> Lenny.
>>>
>>>
>>> ___
>>> devel mailing list
>>> de...@open-mpi.org
>>> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>>>
>>>
>>>
>>> ___
>>> devel mailing list
>>> de...@open-mpi.org
>>> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>>>
>>> ___
>>> devel mailing list
>>> de...@open-mpi.org
>>> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>>>
>>
>>
>> --
>> Jeff Squyres
>> Cisco Systems
>>
>> ___
>> devel mailing list
>> de...@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>>
>
>


Re: [OMPI devel] IBCM error

2008-07-14 Thread Lenny Verkhovsky
../configure --with-memory-manager=ptmalloc2 --with-openib

I guess not. I always use the same configure line, and it is only recently
that I started to see this error.

On 7/13/08, Jeff Squyres <jsquy...@cisco.com> wrote:
>
> I think you said opposite things: Lenny's command line did not specifically
> ask for ibcm, but it was used anyway.  Lenny -- did you explicitly request
> it somewhere else (e.g., env var or MCA param file)?
>
> I suspect that you did not; I suspect (without looking at the code again)
> that ibcm tried to select itself and failed on the ibcm_listen() call, so it
> fell back to oob.  This might have to be another workaround in OMPI, perhaps
> something like this:
>
> if (ibcm_listen() fails)
>   if (ibcm explicitly requested)
>   print_warning()
>   fail to use ibcm
>
> Has this been filed as a bug at openfabrics.org?  I don't think that I
> filed it when Brad and I were testing on RoadRunner -- it would probably be
> good if someone filed it.
>
>
>
> On Jul 13, 2008, at 8:56 AM, Lenny Verkhovsky wrote:
>
>>  Pasha is right, I didn't disable it.
>>
>> On 7/13/08, Pavel Shamis (Pasha) <pa...@dev.mellanox.co.il> wrote: Jeff
>> Squyres wrote:
>> Brad and I did some scale testing of IBCM and saw this error sometimes.
>>  It seemed to happen with higher frequency when you increased the number of
>> processes on a single node.
>>
>> I talked to Sean Hefty about it, but we never figured out a definitive
>> cause or solution.  My best guess is that there is something wonky about
>> multiple processes simultaneously interacting with the IBCM kernel driver
>> from userspace; but I don't know jack about kernel stuff, so that's a total
>> SWAG.
>>
>> Thanks for reminding me of this issue; I admit that I had forgotten about
>> it.  :-(  Pasha -- should IBCM not be the default?
>> It is not default. I guess Lenny configured it explicitly, is not it ?
>>
>> Pasha.
>>
>>
>>
>>
>>
>> On Jul 13, 2008, at 7:08 AM, Lenny Verkhovsky wrote:
>>
>> Hi,
>>
>> I am getting this error sometimes.
>>
>> /home/USERS/lenny/OMPI_COMP_PATH/bin/mpirun -np 100 -hostfile
>> /home/USERS/lenny/TESTS/COMPILERS/hostfile
>> /home/USERS/lenny/TESTS/COMPILERS/hello
>> [witch24][[32428,1],96][../../../../../ompi/mca/btl/openib/connect/btl_openib_connect_ibcm.c:769:ibcm_component_query]
>> failed to ib_cm_listen 10 times: rc=-1, errno=22
>> Hello world! I'm 0 of 100 on witch2
>>
>>
>> Best Regards
>>
>> Lenny.
>>
>>
>> ___
>> devel mailing list
>> de...@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>>
>>
>>
>> ___
>> devel mailing list
>> de...@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>>
>> ___
>> devel mailing list
>> de...@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>>
>
>
> --
> Jeff Squyres
> Cisco Systems
>
> ___
> devel mailing list
> de...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>


Re: [OMPI devel] IBCM error

2008-07-13 Thread Lenny Verkhovsky
Pasha is right, I didn't disable it.

On 7/13/08, Pavel Shamis (Pasha) <pa...@dev.mellanox.co.il> wrote:
>
> Jeff Squyres wrote:
>
>> Brad and I did some scale testing of IBCM and saw this error sometimes.
>>  It seemed to happen with higher frequency when you increased the number of
>> processes on a single node.
>>
>> I talked to Sean Hefty about it, but we never figured out a definitive
>> cause or solution.  My best guess is that there is something wonky about
>> multiple processes simultaneously interacting with the IBCM kernel driver
>> from userspace; but I don't know jack about kernel stuff, so that's a total
>> SWAG.
>>
>> Thanks for reminding me of this issue; I admit that I had forgotten about
>> it.  :-(  Pasha -- should IBCM not be the default?
>>
> It is not default. I guess Lenny configured it explicitly, is not it ?
>
> Pasha.
>
>
>>
>>
>> On Jul 13, 2008, at 7:08 AM, Lenny Verkhovsky wrote:
>>
>>  Hi,
>>>
>>> I am getting this error sometimes.
>>>
>>> /home/USERS/lenny/OMPI_COMP_PATH/bin/mpirun -np 100 -hostfile
>>> /home/USERS/lenny/TESTS/COMPILERS/hostfile
>>> /home/USERS/lenny/TESTS/COMPILERS/hello
>>> [witch24][[32428,1],96][../../../../../ompi/mca/btl/openib/connect/btl_openib_connect_ibcm.c:769:ibcm_component_query]
>>> failed to ib_cm_listen 10 times: rc=-1, errno=22
>>> Hello world! I'm 0 of 100 on witch2
>>>
>>>
>>> Best Regards
>>>
>>> Lenny.
>>>
>>>
>>> ___
>>> devel mailing list
>>> de...@open-mpi.org
>>> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>>>
>>
>>
>>
> ___
> devel mailing list
> de...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>


[OMPI devel] IBCM error

2008-07-13 Thread Lenny Verkhovsky
Hi,

I am getting this error sometimes.

/home/USERS/lenny/OMPI_COMP_PATH/bin/mpirun -np 100 -hostfile
/home/USERS/lenny/TESTS/COMPILERS/hostfile
/home/USERS/lenny/TESTS/COMPILERS/hello
[witch24][[32428,1],96][../../../../../ompi/mca/btl/openib/connect/btl_openib_connect_ibcm.c:769:ibcm_component_query]
failed to ib_cm_listen 10 times: rc=-1, errno=22
Hello world! I'm 0 of 100 on witch2

Best Regards

Lenny.


Re: [OMPI devel] CARTO slot definition

2008-07-02 Thread Lenny Verkhovsky
Hi all,
Since there are no objections, I will commit the patch to the trunk and the 1.3
branch.
See also my comments below.
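
For reference, the renaming amounts to something like the following in a carto
file (an illustrative sketch only; the node/edge type strings are free-form and
the exact grammar is described on the wiki page referenced below). A description
that today might read

NODE slot slot0
NODE Infiniband mthca0
CONNECTION slot0 mthca0:1

would become

EDGE socket socket0
EDGE Infiniband mthca0
BRANCH socket0 mthca0:1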


2008/6/29 Ralph Castain <r...@lanl.gov>:

> I believe this would help reduce the confusion a great deal. While the
> current carto syntax is what a mathematician would expect, computer users and
> developers have a well-established definition for the term "node" that
> conflicts with that used in carto.
>
> Some thoughts to add:  in carto, do you want to create a definition for
> CORE as well as SOCKET? That way, someone could provide info down to either
> level of granularity. In our subsequent frameworks, we could assume that all
> cores in a socket have the same graph analysis unless the core was described
> separately.
>

To the best of my knowledge, all cores on the same socket have the same
distance on the graph, so I don't see any need to add more code and definitions
for now.


>
>
> Likewise, you may want to support a NODE that describes connectivity
> between computing nodes. Again, one could assume that all sockets on the
> node share the same graph unless the socket was described separately – or
> perhaps allow someone to describe the graph for the socket to get to the
> edge of the node, and then let the node description handle comm between
> nodes.
>

Interesting idea, I suppose it can be implemented in the future.


>
>
> Up to you – I'm just trying to think of ways we could bring this closer to
> the topo description required in other modules elsewhere in the code to
> avoid having multiple files.
>
> Ralph
>




>
>
> On 6/29/08 7:44 AM, "Lenny Verkhovsky" <lenny.verkhov...@gmail.com> wrote:
>
>   Hi all,
> We have ambiguous definitions of "slot" in the rankfile, hostfile and carto
> components.
> Since "slot" is already well defined in the hostfile and rankfile (a "slot"
> is a processing unit, which can be a processor number or a socket:core pair),
> I propose to change the carto file syntax and make it more graph oriented.
> This won't have any effect on the code.
>
> In new carto syntax
>
> NODE will be changed to EDGE
> CONNECTION will be changed to BRANCH
> SLOT will be changed to SOCKET.
>
> Any comments are welcome.
> few words about carto can be found at
> https://svn.open-mpi.org/trac/ompi/wiki/OnHostTopologyDescription
>
>   Index: opal/mca/carto/file/help-opal-carto-file.txt
> ===
> --- opal/mca/carto/file/help-opal-carto-file.txt(revision 18772)
> +++ opal/mca/carto/file/help-opal-carto-file.txt(working copy)
> @@ -27,27 +27,27 @@
>  #
>  [expected node type]
> -File: %s line: %d expected node type (free string). received %s
> +File: %s line: %d expected Edge type (free string). received %s
>  #
>  [expected node name]
> -File: %s line: %d expected Node name (free string). received %s
> +File: %s line: %d expected Edge name (free string). received %s
>  #
>  [expected Connection]
> -File: %s line: %d expected Node connection (node name:weight). received %s
> +File: %s line: %d expected Edge branch (edge name:weight). received %s
>  #
>  [expected deceleration]
> -File: %s line: %d expected Node deceleration (NODE) or connection
> deceleration (CONNECTION). received %s
> +File: %s line: %d expected Edge declaration (EDGE) or branch declaration
> (BRANCH). received %s
>  #
>  [incorrect connection]
> -File: %s line: %d - %s - incorrect connection
> +File: %s line: %d - %s - incorrect branch
>  #
>  [vertex not found]
> -File: %s line: %d - Node %s is not in the graph
> +File: %s line: %d - Edge %s is not in the graph
>  #
>  [unknown token]
> Index: opal/mca/carto/file/carto_file_lex.l
> ===
> --- opal/mca/carto/file/carto_file_lex.l(revision 18772)
> +++ opal/mca/carto/file/carto_file_lex.l(working copy)
> @@ -80,13 +80,13 @@
>
> -NODE   { carto_file_value.sval = yytext;
> +EDGE   { carto_file_value.sval = yytext;
>   return OPAL_CARTO_FILE_NODE_DECELERATION; }
> -CONNECTION { carto_file_value.sval = yytext;
> +BRANCH { carto_file_value.sval = yytext;
>   return OPAL_CARTO_FILE_CONNECTION_DECELERATION; }
> -CON_BI_DIR { carto_file_value.sval = yytext;
> +BRANCH_BI_DIR { carto_file_value.sval = yytext;
>   return OPAL_CARTO_FILE_BIDIRECTION_CONNECTION; }
>  [0-9]  { carto_file_value.ival = atol(yytext);
> Index: opal/mca/carto/file/carto_file.h
> ===
> --- opal/mca/carto/file

Re: [OMPI devel] mtt IBM SPAWN error

2008-06-30 Thread Lenny Verkhovsky
I saw it. But I think it is something else, since it works if I run it with a
host list:

#mpirun -np 3 -H witch2,witch3  dynamic/spawn
#


On Mon, Jun 30, 2008 at 4:03 PM, Ralph H Castain <r...@lanl.gov> wrote:

> Well, that error indicates that it was unable to launch the daemon on
> witch3
> for some reason. If you look at the error reported by bash, you will see
> that the "orted" binary wasn't found!
>
> Sounds like a path error - you might check to see if witch3 has the
> binaries
> installed, and if they are where you told the system to look...
>
> Ralph
>
>
>
> On 6/30/08 5:21 AM, "Lenny Verkhovsky" <lenny.verkhov...@gmail.com> wrote:
>
> > I am not familiar with spawn test of IBM, but maybe this is right
> behavior,
> > if spawn test allocates 3 ranks on the node, and then allocates another 3
> > then this test suppose to fail due to max_slots=4.
> >
> > But it fails with the fallowing hostfile as well BUT WITH A DIFFERENT
> ERROR.
> >
> > #cat hostfile2
> > witch2 slots=4 max_slots=4
> > witch3 slots=4 max_slots=4
> > witch1:/home/BENCHMARKS/IBM #
> /home/USERS/lenny/OMPI_ORTE_18772/bin/mpirun -np
> > 3 -hostfile hostfile2 dynamic/spawn
> > bash: orted: command not found
> > [witch1:22789]
> >
> --
> > A daemon (pid 22791) died unexpectedly with status 127 while attempting
> > to launch so we are aborting.
> > There may be more information reported by the environment (see above).
> > This may be because the daemon was unable to find all the needed shared
> > libraries on the remote node. You may set your LD_LIBRARY_PATH to have
> the
> > location of the shared libraries on the remote nodes and this will
> > automatically be forwarded to the remote nodes.
> >
> --
> > [witch1:22789]
> >
> --
> > mpirun was unable to cleanly terminate the daemons on the nodes shown
> > below. Additional manual cleanup may be required - please refer to
> > the "orte-clean" tool for assistance.
> >
> --
> > witch3 - daemon did not report back when launched
> >
> > On Mon, Jun 30, 2008 at 9:38 AM, Lenny Verkhovsky <
> lenny.verkhov...@gmail.com>
> > wrote:
> >> Hi,
> >> trying to run mtt I failed to run IBM spawn test. It fails only when
> using
> >> hostfile, and not when using host list.
> >> ( OMPI from TRUNK )
> >>
> >> This is working :
> >> #mpirun -np 3 -H witch2 dynamic/spawn
> >>
> >> This Fails:
> >> # cat hostfile
> >> witch2 slots=4 max_slots=4
> >> #mpirun -np 3 -hostfile hostfile dynamic/spawn
> >> [witch1:12392]
> >>
> --
> >> There are not enough slots available in the system to satisfy the 3
> slots
> >> that were requested by the application:
> >>   dynamic/spawn
> >>
> >> Either request fewer slots for your application, or make more slots
> available
> >> for use.
> >>
> --
> >> [witch1:12392]
> >>
> --
> >> A daemon (pid unknown) died unexpectedly on signal 1  while attempting
> to
> >> launch so we are aborting.
> >>
> >> There may be more information reported by the environment (see above).
> >>
> >> This may be because the daemon was unable to find all the needed shared
> >> libraries on the remote node. You may set your LD_LIBRARY_PATH to have
> the
> >> location of the shared libraries on the remote nodes and this will
> >> automatically be forwarded to the remote nodes.
> >>
> --
> >> mpirun: clean termination accomplished
> >>
> >>
> >> Using hostfile1 also works
> >> #cat hostfile1
> >> witch2
> >> witch2
> >> witch2
> >>
> >>
> >> Best Regards
> >> Lenny.
> >>
> >
>
>
>
>


Re: [OMPI devel] mtt IBM SPAWN error

2008-06-30 Thread Lenny Verkhovsky
I am not familiar with the IBM spawn test, but maybe this is the right behavior:
if the spawn test allocates 3 ranks on the node and then allocates another 3,
then this test is supposed to fail due to max_slots=4.

But it fails with the following hostfile as well BUT WITH A DIFFERENT ERROR.

#cat hostfile2
witch2 slots=4 max_slots=4
witch3 slots=4 max_slots=4
witch1:/home/BENCHMARKS/IBM # /home/USERS/lenny/OMPI_ORTE_18772/bin/mpirun
-np 3 -hostfile hostfile2 dynamic/spawn
bash: orted: command not found
[witch1:22789]
--
A daemon (pid 22791) died unexpectedly with status 127 while attempting
to launch so we are aborting.
There may be more information reported by the environment (see above).
This may be because the daemon was unable to find all the needed shared
libraries on the remote node. You may set your LD_LIBRARY_PATH to have the
location of the shared libraries on the remote nodes and this will
automatically be forwarded to the remote nodes.
--
[witch1:22789]
--
mpirun was unable to cleanly terminate the daemons on the nodes shown
below. Additional manual cleanup may be required - please refer to
the "orte-clean" tool for assistance.
--
witch3 - daemon did not report back when launched

On Mon, Jun 30, 2008 at 9:38 AM, Lenny Verkhovsky <
lenny.verkhov...@gmail.com> wrote:

> Hi,
> trying to run MTT I failed to run the IBM spawn test. It fails only when using
> a hostfile, and not when using a host list.
> ( OMPI from TRUNK )
>
> This is working :
> #mpirun -np 3 -H witch2 dynamic/spawn
>
> This Fails:
> # cat hostfile
> witch2 slots=4 max_slots=4
>
> #mpirun -np 3 -hostfile hostfile dynamic/spawn
> [witch1:12392]
> --
> There are not enough slots available in the system to satisfy the 3 slots
> that were requested by the application:
>   dynamic/spawn
>
> Either request fewer slots for your application, or make more slots
> available
> for use.
> --
> [witch1:12392]
> --
> A daemon (pid unknown) died unexpectedly on signal 1  while attempting to
> launch so we are aborting.
>
> There may be more information reported by the environment (see above).
> This may be because the daemon was unable to find all the needed shared
> libraries on the remote node. You may set your LD_LIBRARY_PATH to have the
> location of the shared libraries on the remote nodes and this will
> automatically be forwarded to the remote nodes.
> --
> mpirun: clean termination accomplished
>
>
> Using hostfile1 also works
> #cat hostfile1
> witch2
> witch2
> witch2
>
>
> Best Regards
> Lenny.
>


[OMPI devel] mtt IBM SPAWN error

2008-06-30 Thread Lenny Verkhovsky
Hi,
trying to run MTT I failed to run the IBM spawn test. It fails only when using
a hostfile, and not when using a host list.
( OMPI from TRUNK )

This is working :
#mpirun -np 3 -H witch2 dynamic/spawn

This Fails:
# cat hostfile
witch2 slots=4 max_slots=4

#mpirun -np 3 -hostfile hostfile dynamic/spawn
[witch1:12392]
--
There are not enough slots available in the system to satisfy the 3 slots
that were requested by the application:
  dynamic/spawn

Either request fewer slots for your application, or make more slots
available
for use.
--
[witch1:12392]
--
A daemon (pid unknown) died unexpectedly on signal 1  while attempting to
launch so we are aborting.

There may be more information reported by the environment (see above).
This may be because the daemon was unable to find all the needed shared
libraries on the remote node. You may set your LD_LIBRARY_PATH to have the
location of the shared libraries on the remote nodes and this will
automatically be forwarded to the remote nodes.
--
mpirun: clean termination accomplished


Using hostfile1 also works
#cat hostfile1
witch2
witch2
witch2


Best Regards
Lenny.


[OMPI devel] CARTO slot definition

2008-06-29 Thread Lenny Verkhovsky
Hi all,
We have ambiguous definitions of "slot" in the rankfile, hostfile and carto
components.
Since "slot" is already well defined in the hostfile and rankfile (a "slot" is a
processing unit, given either as a processor number or as a socket:core pair),
I propose to change the carto file syntax and make it more graph oriented. This
won't have any effect on the code.

In the new carto syntax:

NODE will be changed to EDGE
CONNECTION will be changed to BRANCH
SLOT will be changed to SOCKET.

Any comments are welcome.
A few words about carto can be found at
https://svn.open-mpi.org/trac/ompi/wiki/OnHostTopologyDescription
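
A small example in the proposed syntax (node names and weights are illustrative,
following the format shown in the patch below):

  EDGE    Memory      mem0
  EDGE    socket      socket0
  EDGE    Infiniband  mthca0
  #
  BRANCH  mem0     socket0:20
  BRANCH  socket0  mthca0:20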

  Index: opal/mca/carto/file/help-opal-carto-file.txt
===
--- opal/mca/carto/file/help-opal-carto-file.txt(revision 18772)
+++ opal/mca/carto/file/help-opal-carto-file.txt(working copy)
@@ -27,27 +27,27 @@
 #
 [expected node type]
-File: %s line: %d expected node type (free string). received %s
+File: %s line: %d expected Edge type (free string). received %s
 #
 [expected node name]
-File: %s line: %d expected Node name (free string). received %s
+File: %s line: %d expected Edge name (free string). received %s
 #
 [expected Connection]
-File: %s line: %d expected Node connection (node name:weight). received %s
+File: %s line: %d expected Edge branch (edge name:weight). received %s
 #
 [expected deceleration]
-File: %s line: %d expected Node deceleration (NODE) or connection
deceleration (CONNECTION). received %s
+File: %s line: %d expected Edge declaration (EDGE) or branch declaration
(BRANCH). received %s
 #
 [incorrect connection]
-File: %s line: %d - %s - incorrect connection
+File: %s line: %d - %s - incorrect branch
 #
 [vertex not found]
-File: %s line: %d - Node %s is not in the graph
+File: %s line: %d - Edge %s is not in the graph
 #
 [unknown token]
Index: opal/mca/carto/file/carto_file_lex.l
===
--- opal/mca/carto/file/carto_file_lex.l(revision 18772)
+++ opal/mca/carto/file/carto_file_lex.l(working copy)
@@ -80,13 +80,13 @@

-NODE   { carto_file_value.sval = yytext;
+EDGE   { carto_file_value.sval = yytext;
  return OPAL_CARTO_FILE_NODE_DECELERATION; }
-CONNECTION { carto_file_value.sval = yytext;
+BRANCH { carto_file_value.sval = yytext;
  return OPAL_CARTO_FILE_CONNECTION_DECELERATION; }
-CON_BI_DIR { carto_file_value.sval = yytext;
+BRANCH_BI_DIR { carto_file_value.sval = yytext;
  return OPAL_CARTO_FILE_BIDIRECTION_CONNECTION; }
 [0-9]  { carto_file_value.ival = atol(yytext);
Index: opal/mca/carto/file/carto_file.h
===
--- opal/mca/carto/file/carto_file.h(revision 18772)
+++ opal/mca/carto/file/carto_file.h(working copy)
@@ -21,49 +21,49 @@
 /**
  * @file#this is a comment
 # Node declaration   Node type (Free string)   Node name (Free string)
-# (Reserve word) (slot is a reserve word   (free string)
-# for CPU slot)
+# (Reserve word) (socket is a reserve word   (free string)
+# for CPU socket)
 #===
-  NODE   Memorymem0
-  NODE   Memorymem1
-  NODE   Memorymem2
-  NODE   Memorymem3
+  EDGE   Memorymem0
+  EDGE   Memorymem1
+  EDGE   Memorymem2
+  EDGE   Memorymem3
 #
-  NODE   slot  slot0
-  NODE   slot  slot1
-  NODE   slot  slot2
-  NODE   slot  slot3
+  EDGE   socket  socket0
+  EDGE   socket  socket1
+  EDGE   socket  socket2
+  EDGE   socket  socket3
 #
-  NODE   Infinibandmthca0
-  NODE   Infinibandmthca1
+  EDGE   Infinibandmthca0
+  EDGE   Infinibandmthca1
 #
-  NODE   Ethernet  eth0
-  NODE   Ethernet  eth1
+  EDGE   Ethernet  eth0
+  EDGE   Ethernet  eth1
 #
 #
 # Connection decleration  From node   To node:weight   To node:weight
..
 # (Reserve word)  (declered   (declered(declered
 #  above)  above)   above)
 
#===
-  CONNECTION  mem0  

Re: [OMPI devel] PML selection logic

2008-06-29 Thread Lenny Verkhovsky
We can also make a few different param files for typical setups ( large cluster
/ minimum latency / max BW etc. ).
The desired param file could be chosen by a configure flag and be placed in
$prefix/etc/openmpi-mca-params.conf
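
For example, a "large cluster" profile might look like this (the parameter
choices are only an illustration, using parameters mentioned elsewhere in this
thread):

  # $prefix/etc/openmpi-mca-params.conf
  btl = openib,self
  opal_paffinity_alone = 1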

On Sat, Jun 28, 2008 at 3:55 PM, Jeff Squyres  wrote:

> Agreed.  I have a few ideas in this direction as well (random thoughts that
> might as well be transcribed somewhere):
>
> - some kind of configure --enable-large-system (whatever) option is a Good
> Thing
>
> - it would be good if the configure option simply set [MCA parameter?]
> defaults wherever possible (vs. #if-selecting code).  I think one of the
> biggest lessons learned from Open MPI is that everyone's setup is different
> -- having the ability to mix and match various run-time options, while not
> widely used, is absolutely critical in some scenarios.  So it might be good
> if --enable-large-system sets a bunch of default parameters that some
> sysadmins may still want/need to override.
>
> - decision to run the modex: I haven't seen all of Ralph's work in this
> area, but I wonder if it's similar to the MPI handle parameter checks: it
> could be a multi-value MCA parameter, such as: "never", "always",
> "when-ompi-determines-its-necessary", etc., where the last value can use
> multiple criteria to know if it's necessary to do a modex (e.g., job size,
> when spawn occurs, whether the "pml" [or other critical] MCA param[s] were
> specified, ...etc.).
>
>
>
> On Jun 26, 2008, at 9:26 AM, Ralph H Castain wrote:
>
> Just to complete this thread...
>>
>> Brian raised a very good point, so we identified it on the weekly telecon
>> as
>> a subject that really should be discussed at next week's technical
>> meeting.
>> I think we can find a reasonable answer, but there are several ways it can
>> be done. So rather than doing our usual piecemeal approach to the
>> solution,
>> it makes sense to begin talking about a more holistic design for
>> accommodating both needs.
>>
>> Thanks Brian for pointing out the bigger picture.
>> Ralph
>>
>>
>>
>> On 6/24/08 8:22 AM, "Brian W. Barrett"  wrote:
>>
>> yeah, that could be a problem, but it's such a minority case and we've got
>>> to draw the line somewhere.
>>>
>>> Of course, it seems like this is a never ending battle between two
>>> opposing forces...  The desire to do the "right thing" all the time at
>>> small and medium scale and the desire to scale out to the "big thing".
>>> It seems like in the quest to kill off the modex, we've run into these
>>> pretty often.
>>>
>>> The modex doesn't hurt us at small scale (indeed, we're probably ok with
>>> the routed communication pattern up to 512 nodes or so if we don't do
>>> anything stupid, maybe further).  Is it time to admit defeat in this
>>> argument and have a configure option that turns off the modex (at the
>>> cost
>>> of some of these correctness checks) for the large machines, but keeps
>>> things simple for the common case?  I'm sure there are other things where
>>> this will come up, so perhaps a --enable-large-scale?  Maybe it's a dumb
>>> idea, but it seems like we've made a lot of compromises lately around
>>> this, where no one ends up really happy with the solution :/.
>>>
>>> Brian
>>>
>>>
>>> On Tue, 24 Jun 2008, George Bosilca wrote:
>>>
>>> Brian hinted a possible bug in one of his replies. How does this work in
 the
 case of dynamic processes? We can envision several scenarios, but lets
 take a
 simple: 2 jobs that get connected with connect/accept. One might publish
 the
 PML name (simply because the -mca argument was on) and one might not?

 george.

 On Jun 24, 2008, at 8:28 AM, Jeff Squyres wrote:

 Also sounds good to me.
>
> Note that the most difficult part of the forward-looking plan is that
> we
> usually can't tell the difference between "something failed to
> initialize"
> and "you don't have support for feature X".
>
> I like the general philosophy of: running out of the box always works
> just
> fine, but if you/the sysadmin is smart, you can get performance
> improvements.
>
>
> On Jun 23, 2008, at 4:18 PM, Shipman, Galen M. wrote:
>
> I concur
>> - galen
>>
>> On Jun 23, 2008, at 3:44 PM, Brian W. Barrett wrote:
>>
>> That sounds like a reasonable plan to me.
>>>
>>> Brian
>>>
>>> On Mon, 23 Jun 2008, Ralph H Castain wrote:
>>>
>>> Okay, so let's explore an alternative that preserves the support you
 are
 seeking for the "ignorant user", but doesn't penalize everyone else.
 What we
 could do is simply set things up so that:

 1. if -mca plm xyz is provided, then no modex data is added

 2. if it is not provided, then only rank=0 inserts the data. All
 other
 procs
 simply check their own selection against the one given by 

[OMPI devel] Trunk problems

2008-06-25 Thread Lenny Verkhovsky
Hi,
I downloaded a new version from the trunk and got the following:
1. opal_output for no reason ( probably something was forgotten )
2. it got stuck.


/home/USERS/lenny/OMPI_ORTE_TRUNK/bin/mpirun -np 2 -hostfile hostfile_w4_8
./osu_bw
[witch4:20920] Using eager rdma: 1
[witch4:20921] Using eager rdma: 1
# OSU MPI Bandwidth Test (Version 2.1)
# Size  Bandwidth (MB/s)

( got stuck )


Lenny.


Re: [OMPI devel] BW benchmark hangs after r 18551

2008-06-23 Thread Lenny Verkhovsky
Hi,
Segfault bug fixed in r18706.

Best Regards
Lenny.
On Thu, Jun 19, 2008 at 5:37 PM, Lenny Verkhovsky <
lenny.verkhov...@gmail.com> wrote:

> Sorry,
> I checked it without sm.
>
> pls ignore this mail.
>
>
>
> On Thu, Jun 19, 2008 at 4:32 PM, Lenny Verkhovsky <
> lenny.verkhov...@gmail.com> wrote:
>
>> Hi,
>> I found what caused the problem in both cases.
>>
>> --- ompi/mca/btl/sm/btl_sm.c(revision 18675)
>> +++ ompi/mca/btl/sm/btl_sm.c(working copy)
>> @@ -812,7 +812,7 @@
>>   */
>>  MCA_BTL_SM_FIFO_WRITE(endpoint, endpoint->my_smp_rank,
>>endpoint->peer_smp_rank, frag->hdr, false, rc);
>> -return (rc < 0 ? rc : 1);
>> +   return OMPI_SUCCESS;
>>  }
>> I am just not sure if it's OK.
>>
>> Lenny.
>>   On Wed, Jun 18, 2008 at 3:21 PM, Lenny Verkhovsky <
>> lenny.verkhov...@gmail.com> wrote:
>>
>>> Hi,
>>> I am not sure if it related,
>>> but I applied your patch ( r18667 )  to r 18656 ( one before NUMA )
>>> together with disabling sendi,
>>> The result still the same ( hanging ).
>>>
>>>
>>>
>>>
>>>  On Tue, Jun 17, 2008 at 2:10 PM, George Bosilca <bosi...@eecs.utk.edu>
>>> wrote:
>>>
>>>> Lenny,
>>>>
>>>> I guess you're running the latest version. If not, please update, Galen
>>>> and myself corrected some bugs last week. If you're using the latest (and
>>>> greatest) then ... well I imagine there is at least one bug left.
>>>>
>>>> There is a quick test you can do. In the btl_sm.c in the module
>>>> structure at the beginning of the file, please replace the sendi function 
>>>> by
>>>> NULL. If this fix the problem, then at least we know that it's a sm send
>>>> immediate problem.
>>>>
>>>>  Thanks,
>>>>george.
>>>>
>>>>
>>>> On Jun 17, 2008, at 7:54 AM, Lenny Verkhovsky wrote:
>>>>
>>>> Hi, George,
>>>>>
>>>>> I have a problem running BW benchmark on 100 rank cluster after r18551.
>>>>> The BW is mpi_p that runs mpi_bandwidth with 100K between all pairs.
>>>>>
>>>>>
>>>>> #mpirun -np 100 -hostfile hostfile_w  ./mpi_p_18549 -t bw -s 10
>>>>> BW (100) (size min max avg)  10 576.734030  2001.882416
>>>>> 1062.698408
>>>>> #mpirun -np 100 -hostfile hostfile_w ./mpi_p_18551 -t bw -s 10
>>>>> mpirun: killing job...
>>>>> ( it hangs even after 10 hours ).
>>>>>
>>>>>
>>>>> It doesn't happen if I run --bynode or btl openib,self only.
>>>>>
>>>>>
>>>>> Lenny.
>>>>>
>>>>
>>>>
>>>
>>
>


Re: [OMPI devel] BW benchmark hangs after r 18551

2008-06-17 Thread Lenny Verkhovsky
It seems like we have 2 bugs here.
1. After committing NUMA awareness we see a segfault.
2. Before the NUMA commit (r18656) we see application hangs.
3. I checked both with and without sendi, same results.
4. It hangs most of the time, but sometimes large msgs ( >1M ) are working.


I will keep investigating :)


VER=TRUNK; //home/USERS/lenny/OMPI_ORTE_${VER}/bin/mpicc -o mpi_p_${VER}
/opt/vltmpi/OPENIB/mpi/examples/mpi_p.c ;
/home/USERS/lenny/OMPI_ORTE_${VER}/bin/mpirun -np 100 -hostfile hostfile_w
./mpi_p_${VER} -t bw -s 400
[witch17:09798] *** Process received signal ***
[witch17:09798] Signal: Segmentation fault (11)
[witch17:09798] Signal code: Address not mapped (1)
[witch17:09798] Failing at address: (nil)
[witch17:09798] [ 0] /lib64/libpthread.so.0 [0x2b1d13530c10]
[witch17:09798] [ 1]
/home/USERS/lenny/OMPI_ORTE_TRUNK/lib/openmpi/mca_btl_sm.so [0x2b1d1557a68a]
[witch17:09798] [ 2]
/home/USERS/lenny/OMPI_ORTE_TRUNK/lib/openmpi/mca_bml_r2.so [0x2b1d14e1b12f]
[witch17:09798] [ 3]
/home/USERS/lenny/OMPI_ORTE_TRUNK/lib/libopen-pal.so.0(opal_progress+0x5a)
[0x2b1d12f6a6da]
[witch17:09798] [ 4] /home/USERS/lenny/OMPI_ORTE_TRUNK/lib/libmpi.so.0
[0x2b1d12cafd28]
[witch17:09798] [ 5]
/home/USERS/lenny/OMPI_ORTE_TRUNK/lib/libmpi.so.0(PMPI_Waitall+0x91)
[0x2b1d12cd9d71]
[witch17:09798] [ 6] ./mpi_p_TRUNK(main+0xd32) [0x401ca2]
[witch17:09798] [ 7] /lib64/libc.so.6(__libc_start_main+0xf4)
[0x2b1d13657154]
[witch17:09798] [ 8] ./mpi_p_TRUNK [0x400ea9]
[witch17:09798] *** End of error message ***
[witch1:24955]
--
mpirun noticed that process rank 62 with PID 9798 on node witch17 exited on
signal 11 (Segmentation fault).
--
witch1:/home/USERS/lenny/TESTS/NUMA #
witch1:/home/USERS/lenny/TESTS/NUMA #
witch1:/home/USERS/lenny/TESTS/NUMA #
witch1:/home/USERS/lenny/TESTS/NUMA # VER=18551;
//home/USERS/lenny/OMPI_ORTE_${VER}/bin/mpicc -o mpi_p_${VER}
/opt/vltmpi/OPENIB/mpi/examples/mpi_p.c ;
/home/USERS/lenny/OMPI_ORTE_${VER}/bin/mpirun -np 100 -hostfile hostfile_w
./mpi_p_${VER} -t bw -s 400
BW (100) (size min max avg)  400654.496755  2121.899985
1156.171067
witch1:/home/USERS/lenny/TESTS/NUMA
#




On Tue, Jun 17, 2008 at 2:10 PM, George Bosilca <bosi...@eecs.utk.edu>
wrote:

> Lenny,
>
> I guess you're running the latest version. If not, please update, Galen and
> myself corrected some bugs last week. If you're using the latest (and
> greatest) then ... well I imagine there is at least one bug left.
>
> There is a quick test you can do. In the btl_sm.c in the module structure
> at the beginning of the file, please replace the sendi function by NULL. If
> this fix the problem, then at least we know that it's a sm send immediate
> problem.
>
>  Thanks,
>george.
>
>
> On Jun 17, 2008, at 7:54 AM, Lenny Verkhovsky wrote:
>
> Hi, George,
>>
>> I have a problem running BW benchmark on 100 rank cluster after r18551.
>> The BW is mpi_p that runs mpi_bandwidth with 100K between all pairs.
>>
>>
>> #mpirun -np 100 -hostfile hostfile_w  ./mpi_p_18549 -t bw -s 10
>> BW (100) (size min max avg)  10 576.734030  2001.882416
>> 1062.698408
>> #mpirun -np 100 -hostfile hostfile_w ./mpi_p_18551 -t bw -s 10
>> mpirun: killing job...
>> ( it hangs even after 10 hours ).
>>
>>
>> It doesn't happen if I run --bynode or btl openib,self only.
>>
>>
>> Lenny.
>>
>
>


[OMPI devel] BW benchmark hangs after r 18551

2008-06-17 Thread Lenny Verkhovsky
Hi, George,

I have a problem running BW benchmark on 100 rank cluster after r18551.
The BW is mpi_p that runs mpi_bandwidth with 100K between all pairs.


#mpirun -np 100 -hostfile hostfile_w  ./mpi_p_18549 -t bw -s 10
BW (100) (size min max avg)  10 576.734030  2001.882416
1062.698408
#mpirun -np 100 -hostfile hostfile_w ./mpi_p_18551 -t bw -s 10
mpirun: killing job...
( it hangs even after 10 hours ).


It doesn't happen if I run --bynode or btl openib,self only.


Lenny.


Re: [OMPI devel] SM BTL NUMA awareness patches

2008-06-12 Thread Lenny Verkhovsky
OK,
I will commit it next week.
I did see a performance improvement in the worst-case scenario. I believe that with
increasing numbers of CPUs the improvement will be more noticeable.

On Thu, Jun 12, 2008 at 1:00 AM, Brad Benton <bradford.ben...@gmail.com>
wrote:

> Lenny,
>
> I've looked over the code more and did some initial tests with it.  It
> didn't seem to hurt anything in the default case.  I also consulted with
> George and he would like to see these patches get in for 1.3.  Since it
> seems to do no harm in the default case, I am okay with that as well.  So,
> unless anyone else has objections, please go ahead and apply this to the
> trunk.
>
> BTW, in your testing, were you able to measure any noticeable performance
> improvements?
>
> Thanks & Regards,
> --brad
>
>
>
> On Tue, Jun 10, 2008 at 2:32 PM, Brad Benton <bradford.ben...@gmail.com>
> wrote:
>
>> Hi Lenny,
>>
>> My apologies for not replying sooner.  I would like to look at these
>> patches a bit more.  Since this adds a feature (NUMA awareness in the SM
>> BTL) as well as introduces interface changes to the maffinity framework, I
>> would also like to get George's opinion before deciding whether or not go
>> bring this into the trunk before branching for 1.3.
>>
>> Regards,
>> --Brad
>>
>>
>>
>> On Tue, Jun 10, 2008 at 10:52 AM, Lenny Verkhovsky <
>> lenny.verkhov...@gmail.com> wrote:
>>
>>> Hi,
>>> I didn't want to bring it on the teleconference
>>> but I want to commit Gleb's NUMA awareness patch before you branching
>>> trunk.
>>> Since I didn't get any objection / response about it I guess it's OK.
>>>
>>> Best Regards
>>> Lenny,
>>>
>>>  -- Forwarded message --
>>> From: Lenny Verkhovsky <lenny.verkhov...@gmail.com>
>>> Date: Tue, Jun 3, 2008 at 2:38 PM
>>> Subject: [OMPI devel] SM BTL NUMA awareness patches
>>>  To: Open MPI Developers <de...@open-mpi.org>
>>>
>>>
>>> Hi, all,
>>> If there are no comments for this patch
>>> I can commit it.
>>>
>>> Lenny.
>>>
>>>  -- Forwarded message --
>>> From: Gleb Natapov <gl...@voltaire.com>
>>> Date: 2008/5/28
>>> Subject: [OMPI devel] SM BTL NUMA awareness patches
>>> To: de...@open-mpi.org
>>>
>>>
>>>  Hi,
>>>
>>> Attached two patches implement NUMA awareness in SM BTL. The first one
>>> adds two new functions to maffinity framework required by the second
>>> patch. The functions are:
>>>
>>>  opal_maffinity_base_node_name_to_id() - gets a string that represents a
>>> memory node name and translates
>>> it to memory node id.
>>>  opal_maffinity_base_bind()- binds an address range to
>>> specific
>>> memory node.
>>>
>>> The bind() function cannot be implemented by all maffinity components.
>>> (There is no way first_use maffinity component can implement such
>>> functionality). In this case this function can be set to NULL.
>>>
>>> The second one adds NUMA awareness support to SM BTL and SM MPOOL. Each
>>> process determines what CPU it is running on and exchange this info with
>>> other local processes. Each process creates separate MPOOL for every
>>> memory node available and use them to allocate memory on specific memory
>>> nodes if needed. For instance circular buffer memory is always allocated
>>> on memory node local to receiver process.
>>>
>>> To use this on a Linux machine carto file with HW topology description
>>> should
>>> be provided. Processes should be bound to specific CPU (by specifying
>>> rank file for instance) and session directory should be created on tmpfs
>>> file system (otherwise Linux ignores memory binding commands) by
>>> setting orte_tmpdir_base parameter to point to tmpfs mount point.
>>>
>>> Questions and suggestions are always welcome.
>>>
>>> --
>>>Gleb.
>>>
>>> ___
>>> devel mailing list
>>> de...@open-mpi.org
>>> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>>>
>>>
>>>
>>
>


[OMPI devel] SM BTL NUMA awareness patches

2008-06-03 Thread Lenny Verkhovsky
Hi, all,
If there are no comments for this patch
I can commit it.

Lenny.

-- Forwarded message --
From: Gleb Natapov 
List-Post: devel@lists.open-mpi.org
Date: 2008/5/28
Subject: [OMPI devel] SM BTL NUMA awareness patches
To: de...@open-mpi.org


Hi,

Attached two patches implement NUMA awareness in SM BTL. The first one
adds two new functions to maffinity framework required by the second
patch. The functions are:

 opal_maffinity_base_node_name_to_id() - gets a string that represents a
memory node name and translates
it to memory node id.
 opal_maffinity_base_bind()- binds an address range to specific
memory node.

The bind() function cannot be implemented by all maffinity components.
(There is no way first_use maffinity component can implement such
functionality). In this case this function can be set to NULL.
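
A minimal usage sketch (the segment field names and the "mem0" node name are
assumptions based on the code and carto examples in this thread; the address
variables are hypothetical):

    opal_maffinity_base_segment_t seg;
    int node_id, rc;

    seg.mbs_start_addr = cb_addr;   /* e.g. a receiver's circular-buffer region */
    seg.mbs_len = cb_len;

    rc = opal_maffinity_base_node_name_to_id("mem0", &node_id);
    if (OPAL_SUCCESS == rc) {
        /* may return OPAL_ERR_NOT_IMPLEMENTED, e.g. for the first_use component */
        rc = opal_maffinity_base_bind(&seg, 1, node_id);
    }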

The second one adds NUMA awareness support to SM BTL and SM MPOOL. Each
process determines what CPU it is running on and exchange this info with
other local processes. Each process creates separate MPOOL for every
memory node available and use them to allocate memory on specific memory
nodes if needed. For instance circular buffer memory is always allocated
on memory node local to receiver process.

To use this on a Linux machine carto file with HW topology description
should
be provided. Processes should be bound to specific CPU (by specifying
rank file for instance) and session directory should be created on tmpfs
file system (otherwise Linux ignores memory binding commands) by
setting orte_tmpdir_base parameter to point to tmpfs mount point.
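
Putting that recipe together, a run might look like this (paths and the /dev/shm
tmpfs mount point are illustrative; the carto file itself is given through the
carto framework's own parameter):

  #mpirun -np 4 -hostfile hostfile \
          -mca rmaps_rank_file_path rankfile \
          -mca orte_tmpdir_base /dev/shm ./app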

Questions and suggestions are always welcome.

--
   Gleb.

___
devel mailing list
de...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/devel
commit 883db5e1ce8c3b49cc1376e6acf9c2d5d0d77983
Author: Gleb Natapov 
Date:   Tue May 27 14:55:11 2008 +0300

Add functions to maffinity.

diff --git a/opal/mca/maffinity/base/base.h b/opal/mca/maffinity/base/base.h
index c44efed..339e6a1 100644
--- a/opal/mca/maffinity/base/base.h
+++ b/opal/mca/maffinity/base/base.h
@@ -105,6 +105,9 @@ OPAL_DECLSPEC int opal_maffinity_base_select(void);
  */
 OPAL_DECLSPEC int opal_maffinity_base_set(opal_maffinity_base_segment_t *segments, size_t num_segments);
 
+OPAL_DECLSPEC int opal_maffinity_base_node_name_to_id(char *, int *);
+OPAL_DECLSPEC int opal_maffinity_base_bind(opal_maffinity_base_segment_t *, size_t, int);
+
 /**
  * Shut down the maffinity MCA framework.
  *
diff --git a/opal/mca/maffinity/base/maffinity_base_wrappers.c b/opal/mca/maffinity/base/maffinity_base_wrappers.c
index ec843eb..eef5c7d 100644
--- a/opal/mca/maffinity/base/maffinity_base_wrappers.c
+++ b/opal/mca/maffinity/base/maffinity_base_wrappers.c
@@ -31,3 +31,33 @@ int opal_maffinity_base_set(opal_maffinity_base_segment_t *segments,
 }
 return opal_maffinity_base_module->maff_module_set(segments, num_segments);
 }
+
+int opal_maffinity_base_node_name_to_id(char *node_name, int *node_id)
+{
+if (!opal_maffinity_base_selected) {
+return OPAL_ERR_NOT_FOUND;
+}
+
+if (!opal_maffinity_base_module->maff_module_name_to_id) {
+*node_id = 0;
+return OPAL_ERR_NOT_IMPLEMENTED;
+}
+
+return opal_maffinity_base_module->maff_module_name_to_id(node_name,
+node_id);
+}
+
+int opal_maffinity_base_bind(opal_maffinity_base_segment_t *segments,
+size_t num_segments, int node_id)
+{
+if (!opal_maffinity_base_selected) {
+return OPAL_ERR_NOT_FOUND;
+}
+
+if (!opal_maffinity_base_module->maff_module_bind) {
+return OPAL_ERR_NOT_IMPLEMENTED;
+}
+
+return opal_maffinity_base_module->maff_module_bind(segments, num_segments,
+node_id);
+}
diff --git a/opal/mca/maffinity/first_use/maffinity_first_use_module.c b/opal/mca/maffinity/first_use/maffinity_first_use_module.c
index a68c2a9..0ae33e1 100644
--- a/opal/mca/maffinity/first_use/maffinity_first_use_module.c
+++ b/opal/mca/maffinity/first_use/maffinity_first_use_module.c
@@ -41,7 +41,9 @@ static const opal_maffinity_base_module_1_0_0_t loc_module = {
 first_use_module_init,
 
 /* Module function pointers */
-first_use_module_set
+first_use_module_set,
+NULL,
+NULL
 };
 
 int opal_maffinity_first_use_component_query(mca_base_module_t **module, int *priority)
diff --git a/opal/mca/maffinity/libnuma/maffinity_libnuma_module.c b/opal/mca/maffinity/libnuma/maffinity_libnuma_module.c
index 1fc2231..b2b109c 100644
--- a/opal/mca/maffinity/libnuma/maffinity_libnuma_module.c
+++ b/opal/mca/maffinity/libnuma/maffinity_libnuma_module.c
@@ -20,6 +20,7 @@
 
 #include 
 #include 
+#include 
 
 #include "opal/constants.h"
 #include "opal/mca/maffinity/maffinity.h"
@@ -33,6 +34,8 @@
 

Re: [OMPI devel] [RFC] mca_base_select()

2008-05-11 Thread Lenny Verkhovsky
Hi,
I tried r18423 with the rank_file component and got a segfault.
( I increase the priority of the component if rmaps_rank_file_path exists. )
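
For reference, a minimal sketch of what such a query function looks like, using
the signature from the RFC quoted below (the module symbol and the parameter
index variable are assumptions):

    static int rank_file_query(mca_base_module_t **module, int *priority)
    {
        char *path = NULL;
        /* illustrative: raise the priority only when a rank file was given */
        mca_base_param_lookup_string(rank_file_path_param_index, &path);
        *priority = (NULL != path && '\0' != path[0]) ? 100 : 0;
        *module = (mca_base_module_t *) &orte_rmaps_rank_file_module;
        return ORTE_SUCCESS;
    }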


/home/USERS/lenny/OMPI_ORTE_SMD/bin/mpirun -np 4 -hostfile hostfile_ompi
-mca rmaps_rank_file_path rankfile -mca paffinity_base_verbose 5 ./mpi_p_SMD
-t bw -output 1 -order 1
[witch1:25456] mca:base:select: Querying component [linux]
[witch1:25456] mca:base:select: Query of component [linux] set priority to
10
[witch1:25456] mca:base:select: Selected component [linux]
[witch1:25456] *** Process received signal ***
[witch1:25456] Signal: Segmentation fault (11)
[witch1:25456] Signal code: Invalid permissions (2)
[witch1:25456] Failing at address: 0x2b2875530030
[witch1:25456] [ 0] /lib64/libpthread.so.0 [0x2b28759dfc10]
[witch1:25456] [ 1] /home/USERS/lenny/OMPI_ORTE_SMD/lib/libopen-pal.so.0
[0x2b28753e2bb6]
[witch1:25456] [ 2] /home/USERS/lenny/OMPI_ORTE_SMD/lib/libopen-pal.so.0
[0x2b28753e23b6]
[witch1:25456] [ 3] /home/USERS/lenny/OMPI_ORTE_SMD/lib/libopen-pal.so.0
[0x2b28753e22fd]
[witch1:25456] [ 4]
/home/USERS/lenny/OMPI_ORTE_SMD/lib/libopen-rte.so.0(orte_util_encode_pidmap+0x2f4)
[0x2b287527f412]
[witch1:25456] [ 5]
/home/USERS/lenny/OMPI_ORTE_SMD/lib/libopen-rte.so.0(orte_odls_base_default_get_add_procs_data+0x989)
[0x2b28752934f5]
[witch1:25456] [ 6]
/home/USERS/lenny/OMPI_ORTE_SMD/lib/libopen-rte.so.0(orte_plm_base_launch_apps+0x1a3)
[0x2b287529e60b]
[witch1:25456] [ 7]
/home/USERS/lenny/OMPI_ORTE_SMD/lib/openmpi/mca_plm_rsh.so [0x2b287612f788]
[witch1:25456] [ 8] /home/USERS/lenny/OMPI_ORTE_SMD/bin/mpirun [0x4032bf]
[witch1:25456] [ 9] /home/USERS/lenny/OMPI_ORTE_SMD/bin/mpirun [0x402b53]
[witch1:25456] [10] /lib64/libc.so.6(__libc_start_main+0xf4)
[0x2b2875b06154]
[witch1:25456] [11] /home/USERS/lenny/OMPI_ORTE_SMD/bin/mpirun [0x402aa9]
[witch1:25456] *** End of error message ***
Segmentation fault




On Tue, May 6, 2008 at 9:09 PM, Josh Hursey  wrote:

> This has been committed in r18381
>
> Please let me know if you have any problems with this commit.
>
> Cheers,
> Josh
>
> On May 5, 2008, at 10:41 AM, Josh Hursey wrote:
>
> > Awesome.
> >
> > The branch is updated to the latest trunk head. I encourage folks to
> > check out this repository and make sure that it builds on their
> > system. A normal build of the branch should be enough to find out if
> > there are any cut-n-paste problems (though I tried to be careful,
> > mistakes do happen).
> >
> > I haven't heard any problems so this is looking like it will come in
> > tomorrow after the teleconf. I'll ask again there to see if there are
> > any voices of concern.
> >
> > Cheers,
> > Josh
> >
> > On May 5, 2008, at 9:58 AM, Jeff Squyres wrote:
> >
> >> This all sounds good to me!
> >>
> >> On Apr 29, 2008, at 6:35 PM, Josh Hursey wrote:
> >>
> >>> What:  Add mca_base_select() and adjust frameworks & components to
> >>> use
> >>> it.
> >>> Why:   Consolidation of code for general goodness.
> >>> Where: https://svn.open-mpi.org/svn/ompi/tmp-public/jjh-mca-play
> >>> When:  Code ready now. Documentation ready soon.
> >>> Timeout: May 6, 2008 (After teleconf) [1 week]
> >>>
> >>> Discussion:
> >>> ---
> >>> For a number of years a few developers have been talking about
> >>> creating a MCA base component selection function. For various
> >>> reasons
> >>> this was never implemented. Recently I decided to give it a try.
> >>>
> >>> A base select function will allow Open MPI to provide completely
> >>> consistent selection behavior for many of its frameworks (18 of 31
> >>> to
> >>> be exact at the moment). The primary goal of this work is to
> >>> improving
> >>> code maintainability through code reuse. Other benefits also result
> >>> such as a slightly smaller memory footprint.
> >>>
> >>> The mca_base_select() function represented the most commonly used
> >>> logic for component selection: Select the one component with the
> >>> highest priority and close all of the not selected components. This
> >>> function can be found at the path below in the branch:
> >>> opal/mca/base/mca_base_components_select.c
> >>>
> >>> To support this I had to formalize a query() function in the
> >>> mca_base_component_t of the form:
> >>> int mca_base_query_component_fn(mca_base_module_t **module, int
> >>> *priority);
> >>>
> >>> This function is specified after the open and close component
> >>> functions in this structure as to allow compatibility with
> >>> frameworks
> >>> that do not use the base selection logic. Frameworks that do *not*
> >>> use
> >>> this function are *not* effected by this commit. However, every
> >>> component in the frameworks that use the mca_base_select function
> >>> must
> >>> adjust their component query function to fit that specified above.
> >>>
> >>> 18 frameworks in Open MPI have been changed. I have updated all of
> >>> the
> >>> components in the 18 frameworks available in the trunk on my branch.
> >>> The effected frameworks are:
> >>> - OPAL Carto
> >>> - 

[OMPI devel] NO IP address found

2008-05-06 Thread Lenny Verkhovsky
Hi,

running the BW benchmark with btl_openib_max_lmc >= 2 causes warnings ( MPI from
the TRUNK )


 #mpirun --bynode -np 40 -hostfile hostfile_ompi_arbel  -mca
btl_openib_max_lmc 2  ./mpi_p_LMC  -t bw -s 40
BW (40) (size min max avg)  40  321.493757  342.972837
329.493715

 #mpirun --bynode -np 40 -hostfile hostfile_ompi_arbel  -mca
btl_openib_max_lmc 3  ./mpi_p_LMC  -t bw -s 40
[witch9][[7493,1],7][../../../../../ompi/mca/btl/openib/connect/btl_openib_connect_rdmacm.c:989:create_message]
No IP address found
[witch2][[7493,1],0][../../../../../ompi/mca/btl/openib/connect/btl_openib_connect_rdmacm.c:989:create_message]
No IP address found
[witch10][[7493,1],9][../../../../../ompi/mca/btl/openib/connect/btl_openib_connect_rdmacm.c:989:create_message]
No IP address found
[witch6][[7493,1],4][../../../../../ompi/mca/btl/openib/connect/btl_openib_connect_rdmacm.c:989:create_message]
No IP address found
[witch4][[7493,1],2][../../../../../ompi/mca/btl/openib/connect/btl_openib_connect_rdmacm.c:989:create_message]
No IP address found
[witch7][[7493,1],5][../../../../../ompi/mca/btl/openib/connect/btl_openib_connect_rdmacm.c:989:create_message]
No IP address found
[witch2][[7493,1],10][../../../../../ompi/mca/btl/openib/connect/btl_openib_connect_rdmacm.c:989:create_message]
No IP address found
[witch9][[7493,1],17][../../../../../ompi/mca/btl/openib/connect/btl_openib_connect_rdmacm.c:989:create_message]
No IP address found
[witch5][[7493,1],3][../../../../../ompi/mca/btl/openib/connect/btl_openib_connect_rdmacm.c:989:create_message]
No IP address found
[witch8][[7493,1],6][../../../../../ompi/mca/btl/openib/connect/btl_openib_connect_rdmacm.c:989:create_message]
No IP address found
[witch6][[7493,1],14][../../../../../ompi/mca/btl/openib/connect/btl_openib_connect_rdmacm.c:989:create_message]
No IP address found
[witch10][[7493,1],19][../../../../../ompi/mca/btl/openib/connect/btl_openib_connect_rdmacm.c:989:create_message]
No IP address found
[witch5][[7493,1],13][../../../../../ompi/mca/btl/openib/connect/btl_openib_connect_rdmacm.c:989:create_message]
No IP address found
[witch4][[7493,1],12][../../../../../ompi/mca/btl/openib/connect/btl_openib_connect_rdmacm.c:989:create_message]
No IP address found
[witch9][[7493,1],27][../../../../../ompi/mca/btl/openib/connect/btl_openib_connect_rdmacm.c:989:create_message]
No IP address found
[witch5][[7493,1],23][../../../../../ompi/mca/btl/openib/connect/btl_openib_connect_rdmacm.c:989:create_message]
No IP address found
[witch2][[7493,1],20][../../../../../ompi/mca/btl/openib/connect/btl_openib_connect_rdmacm.c:989:create_message]
No IP address found
[witch9][[7493,1],37][../../../../../ompi/mca/btl/openib/connect/btl_openib_connect_rdmacm.c:989:create_message]
No IP address found
[witch7][[7493,1],35][../../../../../ompi/mca/btl/openib/connect/btl_openib_connect_rdmacm.c:989:create_message]
No IP address found
[witch4][[7493,1],32][../../../../../ompi/mca/btl/openib/connect/btl_openib_connect_rdmacm.c:989:create_message]
No IP address found
[witch4][[7493,1],22][../../../../../ompi/mca/btl/openib/connect/btl_openib_connect_rdmacm.c:989:create_message]
No IP address found
[witch5][[7493,1],33][../../../../../ompi/mca/btl/openib/connect/btl_openib_connect_rdmacm.c:989:create_message]
No IP address found
[witch2][[7493,1],30][../../../../../ompi/mca/btl/openib/connect/btl_openib_connect_rdmacm.c:989:create_message]
No IP address found
[witch8][[7493,1],16][../../../../../ompi/mca/btl/openib/connect/btl_openib_connect_rdmacm.c:989:create_message]
No IP address found
[witch7][[7493,1],15][../../../../../ompi/mca/btl/openib/connect/btl_openib_connect_rdmacm.c:989:create_message]
No IP address found
[witch10][[7493,1],39][../../../../../ompi/mca/btl/openib/connect/btl_openib_connect_rdmacm.c:989:create_message]
No IP address found
[witch7][[7493,1],25][../../../../../ompi/mca/btl/openib/connect/btl_openib_connect_rdmacm.c:989:create_message]
No IP address found
[witch10][[7493,1],29][../../../../../ompi/mca/btl/openib/connect/btl_openib_connect_rdmacm.c:989:create_message]
No IP address found
[witch6][[7493,1],34][../../../../../ompi/mca/btl/openib/connect/btl_openib_connect_rdmacm.c:989:create_message]
No IP address found
[witch8][[7493,1],26][../../../../../ompi/mca/btl/openib/connect/btl_openib_connect_rdmacm.c:989:create_message]
No IP address found
[witch6][[7493,1],24][../../../../../ompi/mca/btl/openib/connect/btl_openib_connect_rdmacm.c:989:create_message]
No IP address found
[witch8][[7493,1],36][../../../../../ompi/mca/btl/openib/connect/btl_openib_connect_rdmacm.c:989:create_message]
No IP address found
BW (40) (size min max avg)  40  312.622582  334.037277
324.014814

Using -mca btl openib,self causes warnings with LMC >= 10.


Best regards
Lenny.


Re: [OMPI devel] Intel MPI Benchmark(IMB) using OpenMPI - Segmentation-fault error message.

2008-05-01 Thread Lenny Verkhovsky
On 5/1/08, Mukesh K Srivastava <srimk...@gmail.com> wrote:
>
> Hi Lenny.
>
> Thanks for responding. To correct more - would like to know few things.
>
> (a) I did modify make_mpich makefile present in IMB-3.1/src folder giving
> the path for openmpi. Here I am using same mpirun as built from
> openmpi(v-1.2.5) also did mention in PATH & LD_LIBRARY_PATH.
>
> (b) What is the command on console to run any new additional file with MPI
> API contents call. Do I need to add in Makefile.base of IMB-3.1/src folder
> or mentioning in console as a command it takes care alongwith "$mpirun
> IMB-MPI1"
>
> (c) Does IMB-3.1 need INB(Infiniband) or TCP support to complete it's
> Benchmark routine call, means do I need to configure and build OpnMPI with
> Infiniband stack too?
>

IMB is a set of benchmarks that can be run on one or more machines.
It calls the MPI API, which does all the communication.
MPI decides how to run ( IB, TCP or shared memory ) according to
priorities and all the possible ways to connect to another host.

You can write your own benchmark or test program, compile it with mpicc and
run it, for example:
#mpicc -o hello_world hello_world.c
#mpirun -np 2 -H host1,host2 ./hello_world


#cat hello_world.c
/*
* Hewlett-Packard Co., High Performance Systems Division
*
* Function: - example: simple "hello world"
*
* $Revision: 1.1.2.1 $
*/

#include <stdio.h>
#include <stdlib.h>
#include <mpi.h>

int main(int argc, char *argv[])
{
    int rank, size, len;
    char name[MPI_MAX_PROCESSOR_NAME];
    int to_wait = 0, sleep_diff = 0, max_limit = 0;
    double sleep_start = 0.0, sleep_now = 0.0;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    MPI_Get_processor_name(name, &len);

    if (argc > 1)
    {
        to_wait = atoi(argv[1]);   /* optional: seconds to busy-wait */
    }

    /* busy loop for debugging needs */
    if (to_wait)
    {
        sleep_start = MPI_Wtime();
        while (1)
        {
            max_limit++;
            if (max_limit > 1000000000)   /* safety cap on the number of iterations */
            {
                fprintf(stdout, " exit loop, to_wait: %d, \n", to_wait);
                break;
            }

            sleep_now = MPI_Wtime();
            sleep_diff = (int)(sleep_now - sleep_start);
            if (sleep_diff >= to_wait)
            {
                break;
            }
        }
    }

    if (rank == 0)   /* only the first rank will print this message */
    {
        printf("Hello world! I'm %d of %d on %s\n", rank, size, name);
    }

    MPI_Finalize();
    return 0;
}






(d) I don't see any README in IMB-3.1 or anu user-guide which tells how to
> execute rather it simply tells about each 17 benchmark and flags to be used.
>
> BR
>
>
> On 4/30/08, Lenny Verkhovsky <lenny.verkhov...@gmail.com> wrote:
> >
> >
> >
> >
> > On 4/30/08, Mukesh K Srivastava <srimk...@gmail.com> wrote:
> > >
> > > Hi.
> > >
> > > I am using IMB-3.1, an Intel MPI Benchmark tool with OpenMPI(v-1.2.5).
> > > In /IMB-3.1/src/make_mpich file, I had only given the decalartion for
> > > MPI_HOME, which takes care for CC, OPTFLAGS & CLINKER. Building IMB_MPI1,
> > > IMP-EXT & IMB-IO happens succesfully.
> > >
> > > I get proper results of IMB Benchmark with command "-np 1" as mpirun
> > > IMB-MPI1, but for "-np 2", I get below errors -
> > >
> > > -
> > > [mukesh@n161 src]$ mpirun -np 2 IMB-MPI1
> > > [n161:13390] *** Process received signal ***
> > > [n161:13390] Signal: Segmentation fault (11)
> > > [n161:13390] Signal code: Address not mapped (1)
> > > [n161:13390] Failing at address: (nil)
> > > [n161:13390] [ 0] /lib64/tls/libpthread.so.0 [0x399e80c4f0]
> > > [n161:13390] [ 1]
> > > /home/mukesh/openmpi/prefix/lib/openmpi/mca_btl_sm.so [0x2a9830f8b4]
> > > [n161:13390] [ 2]
> > > /home/mukesh/openmpi/prefix/lib/openmpi/mca_btl_sm.so [0x2a983109e3]
> > > [n161:13390] [ 3]
> > > /home/mukesh/openmpi/prefix/lib/openmpi/mca_btl_sm.so(mca_btl_sm_component_progress+0xbc)
> > > [0x2a9830fc50]
> > > [n161:13390] [ 4]
> > > /home/mukesh/openmpi/prefix/lib/openmpi/mca_bml_r2.so(mca_bml_r2_progress+0x4b)
> > > [0x2a97fce447]
> > > [n161:13390] [ 5]
> > > /home/mukesh/openmpi/prefix/lib/libopen-pal.so.0(opal_progress+0xbc)
> > > [0x2a958fc343]
> > > [n161:13390] [ 6]
> > > /home/mukesh/openmpi/prefix/lib/openmpi/mca_oob_tcp.so(mca_oob_tcp_msg_wait+0x22)
> > > [0x2a962e9e22]
> > > [n161:13390] [ 7]
> > > /home/mukesh/openmpi/prefix/lib/openmpi/mca_oob_tcp.so(mca_oob_tcp_recv+0x677)
> > > [0x2a962f1aab]
> > > [n161:13390] [ 8]
> > > /home/mukesh/openmpi/prefix/lib/libopen-rte.so.0(mca_oob_recv_packed+0x46)
> > > [0x2a9579d243]
> > > [n161:13390] [ 9]
> > > /home/mukesh/openmpi/prefix/lib/openmpi/mca_gpr_proxy.so(orte_gpr_proxy_put+0x2f3)
> > > [0x2a96508c8f]
> > >

[OMPI devel] Fwd: Intel MPI Benchmark(IMB) using OpenMPI - Segmentation-fault error message.

2008-04-30 Thread Lenny Verkhovsky
On 4/30/08, Mukesh K Srivastava  wrote:
>
> Hi.
>
> I am using IMB-3.1, an Intel MPI Benchmark tool with OpenMPI(v-1.2.5). In
> /IMB-3.1/src/make_mpich file, I had only given the decalartion for MPI_HOME,
> which takes care for CC, OPTFLAGS & CLINKER. Building IMB_MPI1, IMP-EXT &
> IMB-IO happens succesfully.
>
> I get proper results of IMB Benchmark with command "-np 1" as mpirun
> IMB-MPI1, but for "-np 2", I get below errors -
>
> -
> [mukesh@n161 src]$ mpirun -np 2 IMB-MPI1
> [n161:13390] *** Process received signal ***
> [n161:13390] Signal: Segmentation fault (11)
> [n161:13390] Signal code: Address not mapped (1)
> [n161:13390] Failing at address: (nil)
> [n161:13390] [ 0] /lib64/tls/libpthread.so.0 [0x399e80c4f0]
> [n161:13390] [ 1] /home/mukesh/openmpi/prefix/lib/openmpi/mca_btl_sm.so
> [0x2a9830f8b4]
> [n161:13390] [ 2] /home/mukesh/openmpi/prefix/lib/openmpi/mca_btl_sm.so
> [0x2a983109e3]
> [n161:13390] [ 3]
> /home/mukesh/openmpi/prefix/lib/openmpi/mca_btl_sm.so(mca_btl_sm_component_progress+0xbc)
> [0x2a9830fc50]
> [n161:13390] [ 4]
> /home/mukesh/openmpi/prefix/lib/openmpi/mca_bml_r2.so(mca_bml_r2_progress+0x4b)
> [0x2a97fce447]
> [n161:13390] [ 5]
> /home/mukesh/openmpi/prefix/lib/libopen-pal.so.0(opal_progress+0xbc)
> [0x2a958fc343]
> [n161:13390] [ 6]
> /home/mukesh/openmpi/prefix/lib/openmpi/mca_oob_tcp.so(mca_oob_tcp_msg_wait+0x22)
> [0x2a962e9e22]
> [n161:13390] [ 7]
> /home/mukesh/openmpi/prefix/lib/openmpi/mca_oob_tcp.so(mca_oob_tcp_recv+0x677)
> [0x2a962f1aab]
> [n161:13390] [ 8]
> /home/mukesh/openmpi/prefix/lib/libopen-rte.so.0(mca_oob_recv_packed+0x46)
> [0x2a9579d243]
> [n161:13390] [ 9]
> /home/mukesh/openmpi/prefix/lib/openmpi/mca_gpr_proxy.so(orte_gpr_proxy_put+0x2f3)
> [0x2a96508c8f]
> [n161:13390] [10]
> /home/mukesh/openmpi/prefix/lib/libopen-rte.so.0(orte_smr_base_set_proc_state+0x425)
> [0x2a957c391d]
> [n161:13390] [11]
> /home/mukesh/openmpi/prefix/lib/libmpi.so.0(ompi_mpi_init+0xa1e)
> [0x2a9559f042]
> [n161:13390] [12]
> /home/mukesh/openmpi/prefix/lib/libmpi.so.0(PMPI_Init_thread+0xcb)
> [0x2a955e1c5b]
> [n161:13390] [13] IMB-MPI1(main+0x33) [0x403543]
> [n161:13390] [14] /lib64/tls/libc.so.6(__libc_start_main+0xdb)
> [0x399e11c3fb]
> [n161:13390] [15] IMB-MPI1 [0x40347a]
> [n161:13390] *** End of error message ***
> [n161:13391] *** Process received signal ***
> [n161:13391] Signal: Segmentation fault (11)
> [n161:13391] Signal code: Address not mapped (1)
> [n161:13391] Failing at address: (nil)
> [n161:13391] [ 0] /lib64/tls/libpthread.so.0 [0x399e80c4f0]
> [n161:13391] [ 1] /home/mukesh/openmpi/prefix/lib/openmpi/mca_btl_sm.so
> [0x2a9830f8b4]
> [n161:13391] [ 2] /home/mukesh/openmpi/prefix/lib/openmpi/mca_btl_sm.so
> [0x2a983109e3]
> [n161:13391] [ 3]
> /home/mukesh/openmpi/prefix/lib/openmpi/mca_btl_sm.so(mca_btl_sm_component_progress+0xbc)
> [0x2a9830fc50]
> [n161:13391] [ 4]
> /home/mukesh/openmpi/prefix/lib/openmpi/mca_bml_r2.so(mca_bml_r2_progress+0x4b)
> [0x2a97fce447]
> [n161:13391] [ 5]
> /home/mukesh/openmpi/prefix/lib/libopen-pal.so.0(opal_progress+0xbc)
> [0x2a958fc343]
> [n161:13391] [ 6]
> /home/mukesh/openmpi/prefix/lib/openmpi/mca_oob_tcp.so(mca_oob_tcp_msg_wait+0x22)
> [0x2a962e9e22]
> [n161:13391] [ 7]
> /home/mukesh/openmpi/prefix/lib/openmpi/mca_oob_tcp.so(mca_oob_tcp_recv+0x677)
> [0x2a962f1aab]
> [n161:13391] [ 8]
> /home/mukesh/openmpi/prefix/lib/libopen-rte.so.0(mca_oob_recv_packed+0x46)
> [0x2a9579d243]
> [n161:13391] [ 9] /home/mukesh/openmpi/prefix/lib/libopen-rte.so.0
> [0x2a9579e910]
> [n161:13391] [10]
> /home/mukesh/openmpi/prefix/lib/libopen-rte.so.0(mca_oob_xcast+0x140)
> [0x2a9579d824]
> [n161:13391] [11]
> /home/mukesh/openmpi/prefix/lib/libmpi.so.0(ompi_mpi_init+0xaf1)
> [0x2a9559f115]
> [n161:13391] [12]
> /home/mukesh/openmpi/prefix/lib/libmpi.so.0(PMPI_Init_thread+0xcb)
> [0x2a955e1c5b]
> [n161:13391] [13] IMB-MPI1(main+0x33) [0x403543]
> [n161:13391] [14] /lib64/tls/libc.so.6(__libc_start_main+0xdb)
> [0x399e11c3fb]
> [n161:13391] [15] IMB-MPI1 [0x40347a]
> [n161:13391] *** End of error message ***
>
> -
>
> Query#1: Any clue for above?


It worked for me.

1. Maybe this mpirun belongs to another MPI installation.
2. Try defining the hosts explicitly ( -H host1,host2 ).
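
For example (the second host name here is hypothetical):

#which mpirun
#mpirun --version
#mpirun -np 2 -H n161,n162 IMB-MPI1 PingPong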



Query#2:  How can I include seperate exe file and have the IMB for it, e.g,
> writing a hello.c with MPI elementary API calls, compiling with mpicc and
> performing IMB for the same exe.?


You have all the sources;
maybe you can find something in IMB's README.

Best Regards,
Lenny

BR
>
> ___
> devel mailing list
> de...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>


Re: [OMPI devel] Intel MPI Benchmark(IMB) using OpenMPI - Segmentation-fault error message.

2008-04-30 Thread Lenny Verkhovsky
On 4/30/08, Mukesh K Srivastava  wrote:
>
> Hi.
>
> I am using IMB-3.1, an Intel MPI Benchmark tool with OpenMPI(v-1.2.5). In
> /IMB-3.1/src/make_mpich file, I had only given the decalartion for MPI_HOME,
> which takes care for CC, OPTFLAGS & CLINKER. Building IMB_MPI1, IMP-EXT &
> IMB-IO happens succesfully.
>
> I get proper results of IMB Benchmark with command "-np 1" as mpirun
> IMB-MPI1, but for "-np 2", I get below errors -
>
> -
> [mukesh@n161 src]$ mpirun -np 2 IMB-MPI1
> [n161:13390] *** Process received signal ***
> [n161:13390] Signal: Segmentation fault (11)
> [n161:13390] Signal code: Address not mapped (1)
> [n161:13390] Failing at address: (nil)
> [n161:13390] [ 0] /lib64/tls/libpthread.so.0 [0x399e80c4f0]
> [n161:13390] [ 1] /home/mukesh/openmpi/prefix/lib/openmpi/mca_btl_sm.so
> [0x2a9830f8b4]
> [n161:13390] [ 2] /home/mukesh/openmpi/prefix/lib/openmpi/mca_btl_sm.so
> [0x2a983109e3]
> [n161:13390] [ 3]
> /home/mukesh/openmpi/prefix/lib/openmpi/mca_btl_sm.so(mca_btl_sm_component_progress+0xbc)
> [0x2a9830fc50]
> [n161:13390] [ 4]
> /home/mukesh/openmpi/prefix/lib/openmpi/mca_bml_r2.so(mca_bml_r2_progress+0x4b)
> [0x2a97fce447]
> [n161:13390] [ 5]
> /home/mukesh/openmpi/prefix/lib/libopen-pal.so.0(opal_progress+0xbc)
> [0x2a958fc343]
> [n161:13390] [ 6]
> /home/mukesh/openmpi/prefix/lib/openmpi/mca_oob_tcp.so(mca_oob_tcp_msg_wait+0x22)
> [0x2a962e9e22]
> [n161:13390] [ 7]
> /home/mukesh/openmpi/prefix/lib/openmpi/mca_oob_tcp.so(mca_oob_tcp_recv+0x677)
> [0x2a962f1aab]
> [n161:13390] [ 8]
> /home/mukesh/openmpi/prefix/lib/libopen-rte.so.0(mca_oob_recv_packed+0x46)
> [0x2a9579d243]
> [n161:13390] [ 9]
> /home/mukesh/openmpi/prefix/lib/openmpi/mca_gpr_proxy.so(orte_gpr_proxy_put+0x2f3)
> [0x2a96508c8f]
> [n161:13390] [10]
> /home/mukesh/openmpi/prefix/lib/libopen-rte.so.0(orte_smr_base_set_proc_state+0x425)
> [0x2a957c391d]
> [n161:13390] [11]
> /home/mukesh/openmpi/prefix/lib/libmpi.so.0(ompi_mpi_init+0xa1e)
> [0x2a9559f042]
> [n161:13390] [12]
> /home/mukesh/openmpi/prefix/lib/libmpi.so.0(PMPI_Init_thread+0xcb)
> [0x2a955e1c5b]
> [n161:13390] [13] IMB-MPI1(main+0x33) [0x403543]
> [n161:13390] [14] /lib64/tls/libc.so.6(__libc_start_main+0xdb)
> [0x399e11c3fb]
> [n161:13390] [15] IMB-MPI1 [0x40347a]
> [n161:13390] *** End of error message ***
> [n161:13391] *** Process received signal ***
> [n161:13391] Signal: Segmentation fault (11)
> [n161:13391] Signal code: Address not mapped (1)
> [n161:13391] Failing at address: (nil)
> [n161:13391] [ 0] /lib64/tls/libpthread.so.0 [0x399e80c4f0]
> [n161:13391] [ 1] /home/mukesh/openmpi/prefix/lib/openmpi/mca_btl_sm.so
> [0x2a9830f8b4]
> [n161:13391] [ 2] /home/mukesh/openmpi/prefix/lib/openmpi/mca_btl_sm.so
> [0x2a983109e3]
> [n161:13391] [ 3]
> /home/mukesh/openmpi/prefix/lib/openmpi/mca_btl_sm.so(mca_btl_sm_component_progress+0xbc)
> [0x2a9830fc50]
> [n161:13391] [ 4]
> /home/mukesh/openmpi/prefix/lib/openmpi/mca_bml_r2.so(mca_bml_r2_progress+0x4b)
> [0x2a97fce447]
> [n161:13391] [ 5]
> /home/mukesh/openmpi/prefix/lib/libopen-pal.so.0(opal_progress+0xbc)
> [0x2a958fc343]
> [n161:13391] [ 6]
> /home/mukesh/openmpi/prefix/lib/openmpi/mca_oob_tcp.so(mca_oob_tcp_msg_wait+0x22)
> [0x2a962e9e22]
> [n161:13391] [ 7]
> /home/mukesh/openmpi/prefix/lib/openmpi/mca_oob_tcp.so(mca_oob_tcp_recv+0x677)
> [0x2a962f1aab]
> [n161:13391] [ 8]
> /home/mukesh/openmpi/prefix/lib/libopen-rte.so.0(mca_oob_recv_packed+0x46)
> [0x2a9579d243]
> [n161:13391] [ 9] /home/mukesh/openmpi/prefix/lib/libopen-rte.so.0
> [0x2a9579e910]
> [n161:13391] [10]
> /home/mukesh/openmpi/prefix/lib/libopen-rte.so.0(mca_oob_xcast+0x140)
> [0x2a9579d824]
> [n161:13391] [11]
> /home/mukesh/openmpi/prefix/lib/libmpi.so.0(ompi_mpi_init+0xaf1)
> [0x2a9559f115]
> [n161:13391] [12]
> /home/mukesh/openmpi/prefix/lib/libmpi.so.0(PMPI_Init_thread+0xcb)
> [0x2a955e1c5b]
> [n161:13391] [13] IMB-MPI1(main+0x33) [0x403543]
> [n161:13391] [14] /lib64/tls/libc.so.6(__libc_start_main+0xdb)
> [0x399e11c3fb]
> [n161:13391] [15] IMB-MPI1 [0x40347a]
> [n161:13391] *** End of error message ***
>
> -
>
> Query#1: Any clue for above?


It worked for me.

1. Maybe this mpirun belongs to another MPI installation.
2. Try defining the hosts explicitly ( -H host1,host2 ).



Query#2:  How can I include seperate exe file and have the IMB for it, e.g,
> writing a hello.c with MPI elementary API calls, compiling with mpicc and
> performing IMB for the same exe.?


You have all the sources;
maybe you can find something in IMB's README.

Best Regards,
Lenny

BR
>
> ___
> devel mailing list
> de...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>


Re: [OMPI devel] Loadbalancing

2008-04-28 Thread Lenny Verkhovsky
They can also use the RankMapping (rank file) policy for a precise mapping.
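
For example, a rank file along these lines (host names and CPU numbers are
illustrative) can be passed with -mca rmaps_rank_file_path:

  rank 0=host1 slot=0
  rank 1=host1 slot=1
  rank 2=host2 slot=0
  rank 3=host2 slot=1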

On 4/25/08, Jeff Squyres  wrote:
>
> Kewl!
>
> I added ticket 1277 so that we are sure to document this for v1.3.
>
>
>
> On Apr 23, 2008, at 11:09 AM, Ralph H Castain wrote:
>
> > I added a new "loadbalance" feature to OMPI today in r18252.
> >
> > Brief summary: adding --loadbalance to the mpirun cmd line will
> > cause the
> > round-robin mapper to balance your specified #procs across the
> > available
> > nodes.
> >
> > More detail:
> > Several users had noted that mapping byslot always caused us to
> > preferentially load the first nodes in an allocation, potentially
> > leaving
> > other nodes unused. If they mapped bynode, of course, this wouldn't
> > happen -
> > but then they were forced to a specific rank-to-node relationship.
> >
> > What they wanted was to have the ranks numbered byslot, but to have
> > the ppn
> > balanced across the entire allocation.
> >
> > This is now supported via the --loadbalance cmd line option. Here is
> > an
> > example of its affect (again, remember that loadbalance only impacts
> > mapping
> > byslot):
> >
> >             no-lb     lb       bynode
> > node0:   0,1,2,3   0,1,2    0,3,6
> > node1:   4,5,6     3,4      1,4
> > node2:   (unused)  5,6      2,5
> >
> >
> > As you can see, the affect of --loadbalance is to balance the ppn
> > across all
> > the available nodes while retaining byslot rank associations. In
> > this case,
> > instead of leaving one node unused, we take advantage of all available
> > resources.
> >
> > Hope this proves helpful
> > Ralph
> >
> >
> > ___
> > devel mailing list
> > de...@open-mpi.org
> > http://www.open-mpi.org/mailman/listinfo.cgi/devel
>
>
>
> --
> Jeff Squyres
> Cisco Systems
>
>
> ___
> devel mailing list
> de...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>


[OMPI devel] Unbelievable situation BUG

2008-04-27 Thread Lenny Verkhovsky
Hi, all 

I faced the "Unbelievable situation" error

while running the IMB benchmark.

 

 

/home/USERS/lenny/OMPI_ORTE_LMC/bin/mpirun -np 96 --bynode  -hostfile
hostfile_ompi -mca btl_openib_max_lmc 1 ./IMB-MPI1 PingPong PingPing
Sendrecv Exchange Allreduce Reduce Reduce_scatter Bcast Barrier

 

 

 

#

# Benchmarking Allreduce

# #processes = 96

#

#Benchmarking  #procs    #bytes  #repetitions  t_min[usec]  t_max[usec]  t_avg[usec]
Allreduce          96         0          1000         0.02         0.03         0.02
Allreduce          96         4          1000       297.88       298.07       297.95
Allreduce          96         8          1000       296.15       296.32       296.24
Allreduce          96        16          1000       297.99       298.17       298.09
Allreduce          96        32          1000       296.97       297.20       297.04
Allreduce          96        64          1000       298.43       298.64       298.49
Allreduce          96       128          1000       296.86       297.07       296.93
Allreduce          96       256          1000       298.00       298.30       298.09
Allreduce          96       512          1000       296.79       296.96       296.85
Allreduce          96      1024          1000       299.23       299.39       299.31
Allreduce          96      2048          1000       295.51       295.64       295.57
Allreduce          96      4096          1000       246.02       246.13       246.08
Allreduce          96      8192          1000       492.52       492.74       492.63
Allreduce          96     16384          1000      5380.59      5381.47      5381.10
Allreduce          96     32768          1000      5372.86      5373.69      5373.36
Allreduce          96     65536           640      5470.41      5471.88      5471.16
Allreduce          96    131072           320      5554.52      5556.82         .75

[witch24:15639] Unbelievable situation ... we got a duplicated fragment
with seq number of 0 (expected 65534) from witch23

[witch24:15639] Unbelievable situation ... we got a duplicated fragment
with seq number of 65116 (expected 65534) from witch23

[witch24:15639] *** Process received signal ***

[witch24:15639] Signal: Segmentation fault (11)

[witch24:15639] Signal code: Address not mapped (1)

[witch24:15639] Failing at address: 0x632457d0

[witch24:15639] [ 0] /lib64/libpthread.so.0 [0x2b7929a9bc10]

[witch24:15639] [ 1]
/home/USERS/lenny/OMPI_ORTE_LMC/lib/openmpi/mca_allocator_bucket.so
[0x2b792aa47d34]

[witch24:15639] [ 2]
/home/USERS/lenny/OMPI_ORTE_LMC/lib/openmpi/mca_pml_ob1.so
[0x2b792b172163]

[witch24:15639] [ 3]
/home/USERS/lenny/OMPI_ORTE_LMC/lib/openmpi/mca_btl_openib.so
[0x2b792b6b0772]

[witch24:15639] [ 4]
/home/USERS/lenny/OMPI_ORTE_LMC/lib/openmpi/mca_btl_openib.so
[0x2b792b6b15ff]

[witch24:15639] [ 5]
/home/USERS/lenny/OMPI_ORTE_LMC/lib/openmpi/mca_bml_r2.so
[0x2b792b38307f]

[witch24:15639] [ 6]
/home/USERS/lenny/OMPI_ORTE_LMC/lib/libopen-pal.so.0(opal_progress+0x4a)
[0x2b79294cd16a]

[witch24:15639] [ 7] /home/USERS/lenny/OMPI_ORTE_LMC/lib/libmpi.so.0
[0x2b79292163a8]

[witch24:15639] [ 8]
/home/USERS/lenny/OMPI_ORTE_LMC/lib/openmpi/mca_coll_tuned.so
[0x2b792c077cb7]

[witch24:15639] [ 9]
/home/USERS/lenny/OMPI_ORTE_LMC/lib/openmpi/mca_coll_tuned.so
[0x2b792c07b296]

[witch24:15639] [10]
/home/USERS/lenny/OMPI_ORTE_LMC/lib/libmpi.so.0(PMPI_Allreduce+0x1e7)
[0x2b7929229907]

[witch24:15639] [11] ./IMB-MPI1(IMB_allreduce+0x8e) [0x40764e]

[witch24:15639] [12] ./IMB-MPI1(main+0x3aa) [0x4034ea]

[witch24:15639] [13] /lib64/libc.so.6(__libc_start_main+0xf4)
[0x2b7929bc2154]

[witch24:15639] [14] ./IMB-MPI1 [0x4030a9]

[witch24:15639] *** End of error message ***


--

Best Regards,

Lenny.

 



Re: [OMPI devel] RMAPS rank_file component patch and modifications for review

2008-04-01 Thread Lenny Verkhovsky
Hi,
is there any elegant way to register an mpi parameter that will actually be
a pointer or alias to a hidden opal parameter?
I still want to leave the opal_paffinity_alone flag untouched but instead
expose mpi_paffinity_alone for the user.

thanks
Lenny.
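(For illustration only: one low-tech way to expose an mpi_-level name on top
of the hidden opal-level parameter, sketched against the mca_base_param
helpers of that era. The function placement, help string, and the idea of
forwarding the value are my assumptions; Jeff's synonym suggestion quoted
below is the cleaner answer.)

#include "opal/mca/base/mca_base_param.h"

static void register_mpi_paffinity_alone(void)
{
    int opal_index, value = 0;

    /* find the hidden opal-level parameter and read its current value */
    opal_index = mca_base_param_find("opal", NULL, "paffinity_alone");
    if (opal_index < 0) {
        return;  /* opal param not registered yet; nothing to alias */
    }
    mca_base_param_lookup_int(opal_index, &value);

    /* register a user-visible mpi_paffinity_alone with the same default... */
    mca_base_param_reg_int_name("mpi", "paffinity_alone",
                                "Alias for opal_paffinity_alone",
                                false, false, value, &value);

    /* ...and push whatever the user set back into the opal parameter */
    mca_base_param_set_int(opal_index, value);
}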

On Mon, Mar 31, 2008 at 2:55 PM, Jeff Squyres <jsquy...@cisco.com> wrote:

> On Mar 27, 2008, at 8:02 AM, Lenny Verkhovsky wrote:
> >
> >> - I don't think we can delete the MCA param ompi_paffinity_alone; it
> >> exists in the v1.2 series and has historical precedent.
> > It will not be deleted,
> > It will just use the same infrastructure ( slot_list parameter and
> > opal_base functions ). It will be transparent for the user.
> >
> > The user has three ways to set it up:
> > 1. mca opal_paffinity_alone 1
> >   This will set paffinity as it did before
> > 2. mca opal_paffinity_slot_list "slot_list"
> >   Used to define slots that will be used for all ranks on all
> > nodes.
> > 3. mca rmaps_rank_file_path rankfile
> >   Assigning ranks to CPUs according to the file
>
>
> I don't see the MCA parameter "mpi_paffinity_alone" anymore:
>
> -
> [4:54] svbu-mpi:~/svn/ompi2 % ompi_info --param all all | grep
> paffinity_alone
> MCA opal: parameter "opal_paffinity_alone" (current
> value: "0")
> [4:54] svbu-mpi:~/svn/ompi2 %
> -
>
> My point is that I don't think we should delete this parameter; there
> is historical precedence for it (and it has been documented on the web
> page for a long, long time).  Perhaps it can now simply be a synonym
> for opal_paffinity_alone (registered in the MPI layer, not opal).
>
> --
>  Jeff Squyres
> Cisco Systems
>
> ___
> devel mailing list
> de...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>


Re: [OMPI devel] segfault on host not found error.

2008-04-01 Thread Lenny Verkhovsky
yes, it seems to be fixed.
thanks.

On Mon, Mar 31, 2008 at 9:17 PM, Ralph H Castain <r...@lanl.gov> wrote:

> I am unable to replicate the segfault. However, I was able to get the job
> to
> hang. I fixed that behavior with r18044.
>
> Perhaps you can test this again and let me know what you see. A gdb stack
> trace would be more helpful.
>
> Thanks
> Ralph
>
>
>
> On 3/31/08 5:13 AM, "Lenny Verkhovsky" <len...@voltaire.com> wrote:
>
> >
> >
> >
> > I accidentally ran a job with a hostfile where one of the hosts was not
> > properly mounted. As a result I got an error and a segfault.
> >
> >
> > /home/USERS/lenny/OMPI_ORTE_TRUNK/bin/mpirun -np 29 -hostfile hostfile
> > ./mpi_p01 -t lt
> > bash: /home/USERS/lenny/OMPI_ORTE_TRUNK/bin/orted: No such file or
> > directory
> > 
> > --
> > A daemon (pid 9753) died unexpectedly with status 127 while attempting
> > to launch so we are aborting.
> >
> > There may be more information reported by the environment (see above).
> >
> > This may be because the daemon was unable to find all the needed shared
> > libraries on the remote node. You may set your LD_LIBRARY_PATH to have
> > the
> > location of the shared libraries on the remote nodes and this will
> > automatically be forwarded to the remote nodes.
> > 
> > --
> > 
> > --
> > mpirun was unable to start the specified application as it encountered
> > an error.
> > More information may be available above.
> > 
> > --
> > [witch1:09745] *** Process received signal ***
> > [witch1:09745] Signal: Segmentation fault (11)
> > [witch1:09745] Signal code: Address not mapped (1)
> > [witch1:09745] Failing at address: 0x3c
> > [witch1:09745] [ 0] /lib64/libpthread.so.0 [0x2aff223ebc10]
> > [witch1:09745] [ 1]
> > /home/USERS/lenny/OMPI_ORTE_TRUNK//lib/libopen-rte.so.0 [0x2aff21cdfe21]
> > [witch1:09745] [ 2]
> > /home/USERS/lenny/OMPI_ORTE_TRUNK//lib/openmpi/mca_rml_oob.so
> > [0x2aff22c398f1]
> > [witch1:09745] [ 3]
> > /home/USERS/lenny/OMPI_ORTE_TRUNK//lib/openmpi/mca_oob_tcp.so
> > [0x2aff22d426ee]
> > [witch1:09745] [ 4]
> > /home/USERS/lenny/OMPI_ORTE_TRUNK//lib/openmpi/mca_oob_tcp.so
> > [0x2aff22d433fb]
> > [witch1:09745] [ 5]
> > /home/USERS/lenny/OMPI_ORTE_TRUNK//lib/openmpi/mca_oob_tcp.so
> > [0x2aff22d4485b]
> > [witch1:09745] [ 6]
> > /home/USERS/lenny/OMPI_ORTE_TRUNK//lib/libopen-pal.so.0 [0x2aff21e1242b]
> > [witch1:09745] [ 7] /home/USERS/lenny/OMPI_ORTE_TRUNK/bin/mpirun
> > [0x403203]
> > [witch1:09745] [ 8]
> > /home/USERS/lenny/OMPI_ORTE_TRUNK//lib/libopen-pal.so.0 [0x2aff21e1242b]
> > [witch1:09745] [ 9]
> > /home/USERS/lenny/OMPI_ORTE_TRUNK//lib/libopen-pal.so.0(opal_progress+0x
> > 8b) [0x2aff21e060cb]
> > [witch1:09745] [10]
> > /home/USERS/lenny/OMPI_ORTE_TRUNK//lib/libopen-rte.so.0(orte_trigger_eve
> > nt+0x20) [0x2aff21cc6940]
> > [witch1:09745] [11]
> > /home/USERS/lenny/OMPI_ORTE_TRUNK//lib/libopen-rte.so.0(orte_wakeup+0x2d
> > ) [0x2aff21cc776d]
> > [witch1:09745] [12]
> > /home/USERS/lenny/OMPI_ORTE_TRUNK//lib/openmpi/mca_plm_rsh.so
> > [0x2aff22b34756]
> > [witch1:09745] [13]
> > /home/USERS/lenny/OMPI_ORTE_TRUNK//lib/libopen-rte.so.0 [0x2aff21cc6ea7]
> > [witch1:09745] [14]
> > /home/USERS/lenny/OMPI_ORTE_TRUNK//lib/libopen-pal.so.0 [0x2aff21e1242b]
> > [witch1:09745] [15]
> > /home/USERS/lenny/OMPI_ORTE_TRUNK//lib/libopen-pal.so.0(opal_progress+0x
> > 8b) [0x2aff21e060cb]
> > [witch1:09745] [16]
> > /home/USERS/lenny/OMPI_ORTE_TRUNK//lib/libopen-rte.so.0(orte_plm_base_da
> > emon_callback+0xad) [0x2aff21ce068d]
> > [witch1:09745] [17]
> > /home/USERS/lenny/OMPI_ORTE_TRUNK//lib/openmpi/mca_plm_rsh.so
> > [0x2aff22b34e5e]
> > [witch1:09745] [18] /home/USERS/lenny/OMPI_ORTE_TRUNK/bin/mpirun
> > [0x402e13]
> > [witch1:09745] [19] /home/USERS/lenny/OMPI_ORTE_TRUNK/bin/mpirun
> > [0x402873]
> > [witch1:09745] [20] /lib64/libc.so.6(__libc_start_main+0xf4)
> > [0x2aff22512154]
> > [witch1:09745] [21] /home/USERS/lenny/OMPI_ORTE_TRUNK/bin/mpirun
> > [0x4027c9]
> > [witch1:09745] *** End of error message ***
> > Segmentation fault (core dumped)
> >
> >
> > Best Regards,
> > Lenny.
> >
> >
> > ___
> > devel mailing list
> > de...@open-mpi.org
> > http://www.open-mpi.org/mailman/listinfo.cgi/devel
>
>
> ___
> devel mailing list
> de...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>


Re: [OMPI devel] RMAPS rank_file component patch and modifications for review

2008-03-31 Thread Lenny Verkhovsky
OK, 
I am putting it back.



> -Original Message-
> From: terry.don...@sun.com [mailto:terry.don...@sun.com]
> Sent: Monday, March 31, 2008 2:59 PM
> To: Open MPI Developers
> Cc: Lenny Verkhovsky; Sharon Melamed
> Subject: Re: [OMPI devel] RMAPS rank_file component patch and
> modifications for review
> 
> Jeff Squyres wrote:
> > On Mar 27, 2008, at 8:02 AM, Lenny Verkhovsky wrote:
> >
> >>> - I don't think we can delete the MCA param ompi_paffinity_alone; it
> >>> exists in the v1.2 series and has historical precedent.
> >>>
> >> It will not be deleted,
> >> It will just use the same infrastructure ( slot_list parameter and
> >> opal_base functions ). It will be transparent for the user.
> >>
> >> The user has three ways to set it up:
> >> 1. mca opal_paffinity_alone 1
> >>This will set paffinity as it did before
> >> 2. mca opal_paffinity_slot_list "slot_list"
> >>Used to define slots that will be used for all ranks on all
> >> nodes.
> >> 3. mca rmaps_rank_file_path rankfile
> >>Assigning ranks to CPUs according to the file
> >>
> >
> >
> > I don't see the MCA parameter "mpi_paffinity_alone" anymore:
> >
> > -
> > [4:54] svbu-mpi:~/svn/ompi2 % ompi_info --param all all | grep
> > paffinity_alone
> >  MCA opal: parameter "opal_paffinity_alone" (current
> > value: "0")
> > [4:54] svbu-mpi:~/svn/ompi2 %
> > -
> >
> > My point is that I don't think we should delete this parameter; there
> > is historical precedence for it (and it has been documented on the web
> > page for a long, long time).  Perhaps it can now simply be a synonym
> > for opal_paffinity_alone (registered in the MPI layer, not opal).
> >
> >
> I agree with Jeff on the above.  This would cause a lot of busy work for
> our customers and internal setups.
> 
> --td



[OMPI devel] segfault on host not found error.

2008-03-31 Thread Lenny Verkhovsky



I accidentally ran a job with a hostfile where one of the hosts was not
properly mounted. As a result I got an error and a segfault.


/home/USERS/lenny/OMPI_ORTE_TRUNK/bin/mpirun -np 29 -hostfile hostfile
./mpi_p01 -t lt
bash: /home/USERS/lenny/OMPI_ORTE_TRUNK/bin/orted: No such file or
directory

--
A daemon (pid 9753) died unexpectedly with status 127 while attempting
to launch so we are aborting.

There may be more information reported by the environment (see above).

This may be because the daemon was unable to find all the needed shared
libraries on the remote node. You may set your LD_LIBRARY_PATH to have
the
location of the shared libraries on the remote nodes and this will
automatically be forwarded to the remote nodes.

--

--
mpirun was unable to start the specified application as it encountered
an error.
More information may be available above.

--
[witch1:09745] *** Process received signal ***
[witch1:09745] Signal: Segmentation fault (11)
[witch1:09745] Signal code: Address not mapped (1)
[witch1:09745] Failing at address: 0x3c
[witch1:09745] [ 0] /lib64/libpthread.so.0 [0x2aff223ebc10]
[witch1:09745] [ 1]
/home/USERS/lenny/OMPI_ORTE_TRUNK//lib/libopen-rte.so.0 [0x2aff21cdfe21]
[witch1:09745] [ 2]
/home/USERS/lenny/OMPI_ORTE_TRUNK//lib/openmpi/mca_rml_oob.so
[0x2aff22c398f1]
[witch1:09745] [ 3]
/home/USERS/lenny/OMPI_ORTE_TRUNK//lib/openmpi/mca_oob_tcp.so
[0x2aff22d426ee]
[witch1:09745] [ 4]
/home/USERS/lenny/OMPI_ORTE_TRUNK//lib/openmpi/mca_oob_tcp.so
[0x2aff22d433fb]
[witch1:09745] [ 5]
/home/USERS/lenny/OMPI_ORTE_TRUNK//lib/openmpi/mca_oob_tcp.so
[0x2aff22d4485b]
[witch1:09745] [ 6]
/home/USERS/lenny/OMPI_ORTE_TRUNK//lib/libopen-pal.so.0 [0x2aff21e1242b]
[witch1:09745] [ 7] /home/USERS/lenny/OMPI_ORTE_TRUNK/bin/mpirun
[0x403203]
[witch1:09745] [ 8]
/home/USERS/lenny/OMPI_ORTE_TRUNK//lib/libopen-pal.so.0 [0x2aff21e1242b]
[witch1:09745] [ 9]
/home/USERS/lenny/OMPI_ORTE_TRUNK//lib/libopen-pal.so.0(opal_progress+0x
8b) [0x2aff21e060cb]
[witch1:09745] [10]
/home/USERS/lenny/OMPI_ORTE_TRUNK//lib/libopen-rte.so.0(orte_trigger_eve
nt+0x20) [0x2aff21cc6940]
[witch1:09745] [11]
/home/USERS/lenny/OMPI_ORTE_TRUNK//lib/libopen-rte.so.0(orte_wakeup+0x2d
) [0x2aff21cc776d]
[witch1:09745] [12]
/home/USERS/lenny/OMPI_ORTE_TRUNK//lib/openmpi/mca_plm_rsh.so
[0x2aff22b34756]
[witch1:09745] [13]
/home/USERS/lenny/OMPI_ORTE_TRUNK//lib/libopen-rte.so.0 [0x2aff21cc6ea7]
[witch1:09745] [14]
/home/USERS/lenny/OMPI_ORTE_TRUNK//lib/libopen-pal.so.0 [0x2aff21e1242b]
[witch1:09745] [15]
/home/USERS/lenny/OMPI_ORTE_TRUNK//lib/libopen-pal.so.0(opal_progress+0x
8b) [0x2aff21e060cb]
[witch1:09745] [16]
/home/USERS/lenny/OMPI_ORTE_TRUNK//lib/libopen-rte.so.0(orte_plm_base_da
emon_callback+0xad) [0x2aff21ce068d]
[witch1:09745] [17]
/home/USERS/lenny/OMPI_ORTE_TRUNK//lib/openmpi/mca_plm_rsh.so
[0x2aff22b34e5e]
[witch1:09745] [18] /home/USERS/lenny/OMPI_ORTE_TRUNK/bin/mpirun
[0x402e13]
[witch1:09745] [19] /home/USERS/lenny/OMPI_ORTE_TRUNK/bin/mpirun
[0x402873]
[witch1:09745] [20] /lib64/libc.so.6(__libc_start_main+0xf4)
[0x2aff22512154]
[witch1:09745] [21] /home/USERS/lenny/OMPI_ORTE_TRUNK/bin/mpirun
[0x4027c9]
[witch1:09745] *** End of error message ***
Segmentation fault (core dumped)


Best Regards,
Lenny.




Re: [OMPI devel] trunk segfault

2008-03-27 Thread Lenny Verkhovsky
yes, thanks.



On Thu, Mar 27, 2008 at 2:07 PM, Jeff Squyres <jsquy...@cisco.com> wrote:

> Lenny --
>
> Did this get fixed?  We were mucking with some mca param stuff on the
> trunk yesterday; not sure if it was related to this failure or not.
>
>
> On Mar 26, 2008, at 10:34 AM, Lenny Verkhovsky wrote:
> > Hi, all
> >
> > I compiled and builded source from trunk
> > and it causes segfault
> >
> > /home/USERS/lenny/OMPI_ORTE_NEW/bin/mpirun -np 1 -H witch17 /home/
> > USERS/lenny/TESTS/ORTE/mpi_p01_NEW -t lt
> >
> >
> --
> > It looks like MPI_INIT failed for some reason; your parallel process
> > is
> > likely to abort.  There are many reasons that a parallel process can
> > fail during MPI_INIT; some of which are due to configuration or
> > environment
> > problems.  This failure appears to be an internal failure; here's some
> > additional information (which may only be relevant to an Open MPI
> > developer):
> >   mca_mpi_register_params() failed
> >   --> Returned "Error" (-1) instead of "Success" (0)
> >
> --
> > [witch17:01220] *** Process received signal ***
> > [witch17:01220] Signal: Segmentation fault (11)
> > [witch17:01220] Signal code:  (128)
> > [witch17:01220] Failing at address: (nil)
> > [witch17:01220] [ 0] /lib64/libpthread.so.0 [0x2aadf7072c10]
> > [witch17:01220] [ 1] /home/USERS/lenny/OMPI_ORTE_NEW/lib/libopen-
> > pal.so.0(free+0x56) [0x2aadf6acb6d6]
> > [witch17:01220] [ 2] /home/USERS/lenny/OMPI_ORTE_NEW/lib/libopen-
> > pal.so.0(opal_argv_free+0x25) [0x2aadf6ab9635]
> > [witch17:01220] [ 3] /home/USERS/lenny/OMPI_ORTE_NEW/lib/libmpi.so.0
> > [0x2aadf67f4206]
> > [witch17:01220] [ 4] /home/USERS/lenny/OMPI_ORTE_NEW/lib/libmpi.so.
> > 0(MPI_Init+0xf0) [0x2aadf68117c0]
> > [witch17:01220] [ 5] /home/USERS/lenny/TESTS/ORTE/mpi_p01_NEW(main
> > +0xef) [0x40109f]
> > [witch17:01220] [ 6] /lib64/libc.so.6(__libc_start_main+0xf4)
> > [0x2aadf7199154]
> > [witch17:01220] [ 7] /home/USERS/lenny/TESTS/ORTE/mpi_p01_NEW
> > [0x400ee9]
> > [witch17:01220] *** End of error message ***
> >
> --
> > mpirun noticed that process rank 0 with PID 1220 on node witch17
> > exited on signal 11 (Segmentation fault).
> > ___
> > devel mailing list
> > de...@open-mpi.org
> > http://www.open-mpi.org/mailman/listinfo.cgi/devel
>
>
> --
> Jeff Squyres
> Cisco Systems
>
> ___
> devel mailing list
> de...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>


Re: [OMPI devel] RMAPS rank_file component patch and modifications for review

2008-03-27 Thread Lenny Verkhovsky
No, I just tried to see some printouts during the run.
In the code I use:

opal_output_verbose(0, 0,"LNY100 opal_paffinity_base_slot_list_set ver=%d
",0);
opal_output_verbose(1, 0,"LNY101 opal_paffinity_base_slot_list_set ver=%d
",1);
OPAL_OUTPUT_VERBOSE((1, 0,"VERBOSE LNY102 opal_paffinity_base_slot_list_set
ver=%d ",1));
but all I see is the first line (since I used level 0).
I suppose that to see the second line I must configure with --enable-debug,
but that is not working for me either.
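(For comparison, a sketch of the verbosity-gated form Tim suggests further
down in the quoted discussion; opal_paffinity_base_output is the stream index
his note refers to, and slot_list/rank are placeholder variables.)

opal_output_verbose(10, opal_paffinity_base_output,
                    "paffinity: applying slot list %s to rank %d",
                    slot_list, rank);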



On Thu, Mar 27, 2008 at 2:02 PM, Jeff Squyres <jsquy...@cisco.com> wrote:

> Are you using BTL_OUTPUT or something else from btl_base_error.h?
>
>
> On Mar 27, 2008, at 7:49 AM, Lenny Verkhovsky wrote:
> > Hi,
> > thanks for the comments. I will definetly implement all of them and
> > commit the code as soon as I finished.
> >
> > Also I experience few problems with using opal_verbose_output,
> > either there is a bugs or I am doing something wrong.
> >
> >
> > /home/USERS/lenny/OMPI_ORTE_DEBUG/bin/mpirun -mca mca_verbose 0 -mca
> > paffinity_base_verbose 1 --byslot -np 2 -hostfile hostfile -mca
> > btl_openib_max_lmc 1  -mca opal_paffinity_alone 1 -mca
> > btl_openib_verbose 1  /home/USERS/lenny/TESTS/ORTE/mpi_p01_debug -t lt
> >
> >
> > /home/USERS/lenny/TESTS/ORTE/mpi_p01_debug: symbol lookup error: /
> > home/USERS/lenny/OMPI_ORTE_DEBUG//lib/openmpi/mca_btl_openib.so:
> > undefined symbol: mca_btl_base_out
> > /home/USERS/lenny/TESTS/ORTE/mpi_p01_debug: symbol lookup error: /
> > home/USERS/lenny/OMPI_ORTE_DEBUG//lib/openmpi/mca_btl_openib.so:
> > undefined symbol: mca_btl_base_out
> >
> --
> > mpirun has exited due to process rank 1 with PID 5896 on
> > node witch17 exiting without calling "finalize". This may
> > have caused other processes in the application to be
> > terminated by signals sent by mpirun (as reported here).
> >
> >
> > On Wed, Mar 26, 2008 at 2:50 PM, Ralph H Castain <r...@lanl.gov> wrote:
> > I would tend to echo Tim's suggestions. I note that you do lookup
> > that opal
> > mca param in orte as well. I know you sent me a note about that off-
> > list - I
> > apologize for not getting to it yet, but was swamped yesterday.
> >
> > I think the solution suggested in #1 below is the right approach.
> > Looking up
> > opal params in orte or ompi is probably not a good idea. We have had
> > problems in the past where params were looked up in multiple places as
> > people -do- sometimes change the names (ahem...).
> >
> > Also, I would suggest using the macro version of verbose
> > OPAL_OUTPUT_VERBOSE
> > so that it compiles out for non-debug builds - up to you. Many of us
> > use it
> > as we don't need the output from optimized builds.
> >
> > Other than that, I think this looks fine. I do truly appreciate the
> > cleanup
> > of ompi_mpi_init.
> >
> > Ralph
> >
> >
> >
> > On 3/26/08 6:09 AM, "Tim Prins" <tpr...@cs.indiana.edu> wrote:
> >
> > > Hi Lenny,
> > >
> > > This looks good. But I have a couple of suggestions (which others
> > may
> > > disagree with):
> > >
> > > 1. You register an opal mca parameter, but look it up in ompi,
> > then call
> > > a opal function with the result. What if you had a function
> > > opal_paffinity_base_set_slots(long rank) (or some other name, I
> > don't
> > > care) which looked up the mca parameter and then setup the slots
> > as you
> > > are doing if it is found. This would make things a bit cleaner IMHO.
> > >
> > > 2. the functions in the paffinity base should be prefixed with
> > > 'opal_paffinity_base_'
> > >
> > > 3. Why was the ompi_debug_flag added? It is not used anywhere.
> > >
> > > 4. You probably do not need to add the opal debug flag. There is
> > already
> > > a 'paffinity_base_verbose' flag which should suit your purposes
> > fine. So
> > > you should just be able to replace all of the conditional output
> > > statements in paffinity with something like
> > > opal_output_verbose(10, opal_paffinity_base_output, ...),
> > > where 10 is the verbosity level number.
> > >
> > > Tim
> > >
> > >
> > > Lenny Verkhovsky wrote:
> > >>
> > >>
> > >> Hi, all
> > >>
> > >> Attached patch for modified Rank_File RMAPS component.
>

Re: [OMPI devel] RMAPS rank_file component patch and modifications for review

2008-03-27 Thread Lenny Verkhovsky


> -Original Message-
> From: Jeff Squyres [mailto:jsquy...@cisco.com]
> Sent: Thursday, March 27, 2008 1:38 PM
> To: Lenny Verkhovsky
> Cc: Ralph H Castain; Sharon Melamed; Open MPI Developers
> Subject: Re: RMAPS rank_file component patch and modifications for review
> 
> A few more comments on top of what Tim / Ralph said:
> 
> - opal_paffinity MCA params should be defined and registered in the
> opal paffinity base (in the base open function so that ompi_info can
> still see them), not opal/runtime/opal_params.c.
OK.

> 
> - I don't have a problem with setting the paffinity slot list from
> ompi_mpi_init, but we should probably make the corresponding MCA
> parameter be an "mpi_*" name; because this is functionality that is
> being exported through the MPI layer.  Additionally, the name
> "mpi_" will make more sense to users; they don't know
> anything about opal/orte -- "mpi_" resonates with running
> their MPI job.
I think in opal_paffinity_base it makes more sense and ompi_mpi_init
will look cleaner.

> 
> - I don't think we can delete the MCA param ompi_paffinity_alone; it
> exists in the v1.2 series and has historical precedent.
It will not be deleted,
It will just use the same infrastructure ( slot_list parameter and
opal_base functions ). It will be transparent for the user.

The user has three ways to set it up (example invocations below):
1.  mca opal_paffinity_alone 1
This will set paffinity as it did before
2.  mca opal_paffinity_slot_list "slot_list"
Used to define slots that will be used for all ranks on all
nodes.
3.  mca rmaps_rank_file_path rankfile
Assigning ranks to CPUs according to the file

rmaps_rank_file_path can be used together with opal_paffinity_slot_list;
in that case, all ranks not covered by the rankfile are assigned according
to the opal_paffinity_slot_list mca parameter.
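For illustration, the three options would be selected on the command line
roughly like this (the rank count, hostfile, slot list, and application name
are made up; the slot-list syntax follows the rankfile examples elsewhere in
this thread):

mpirun -np 8 -hostfile hostfile -mca opal_paffinity_alone 1 ./app
mpirun -np 8 -hostfile hostfile -mca opal_paffinity_slot_list "0:*" ./app
mpirun -np 8 -hostfile hostfile -mca rmaps_rank_file_path rankfile ./app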


> 
> - Note that symbols that are static don't have to abide by the prefix
> rule.  I'm not saying you need to change anything -- you don't -- I
> just notice that you made some symbols both static and use the prefix
> rule.  That's fine, but if you want to use shorter symbol names for
> static symbols, that's fine too.
> 
> 
> 
> On Mar 26, 2008, at 6:01 AM, Lenny Verkhovsky wrote:
> >
> > Hi, all
> > Attached patch for modified Rank_File RMAPS component.
> >
> > 1.introduced new general purpose debug flags
> >   mpi_debug
> >   opal_debug
> >
> > 2.introduced new mca parameter opal_paffinity_slot_list
> > 3.ompi_mpi_init cleaned from opal paffinity functions
> > 4.opal paffinity functions moved to new file opal/mca/paffinity/
> > base/paffinity_base_service.c
> > 5.rank_file component files were renamed according to prefix
> > policy
> > 6.global variables renamed as well.
> > 7.few bug fixes that were brought during previous discussions.
> > 8.If user defines opal_paffinity_alone and rmaps_rank_file_path
> > or opal_paffinity_slot_list,
> > then he gets a Warning that only opal_paffinity_alone will be used.
> >
> > .
> > Best Regards,
> > Lenny.
> >
> > 
> 
> 
> --
> Jeff Squyres
> Cisco Systems




[OMPI devel] RMAPS rank_file component patch and modifications for review

2008-03-26 Thread Lenny Verkhovsky
 

Hi, all

Attached patch for modified Rank_File RMAPS component.

 

1.introduced new general purpose debug flags

  mpi_debug 

  opal_debug

 

2.introduced new mca parameter opal_paffinity_slot_list

3.ompi_mpi_init cleaned from opal paffinity functions

4.opal paffinity functions moved to new file
opal/mca/paffinity/base/paffinity_base_service.c

5.rank_file component files were renamed according to prefix policy 

6.global variables renamed as well.

7.few bug fixes that were brought during previous discussions. 

8.If user defines opal_paffinity_alone and rmaps_rank_file_path or
opal_paffinity_slot_list, 

then he gets a Warning that only opal_paffinity_alone will be used.

 

.

Best Regards,

Lenny.

 



rank_file.patch
Description: rank_file.patch


Re: [OMPI devel] rankfile questions

2008-03-19 Thread Lenny Verkhovsky

Hi,

> -Original Message-
> From: devel-boun...@open-mpi.org [mailto:devel-boun...@open-mpi.org]
On
> Behalf Of Ralph Castain
> Sent: Wednesday, March 19, 2008 3:19 AM
> To: Open MPI Developers
> Subject: Re: [OMPI devel] rankfile questions
> 
> Not trying to pile on here...but I do have a question.
> 
> This commit inserted a bunch of affinity-specific code in ompi_mpi_init.c.
> Was this truly necessary?
> 
> It seems to me this violates our code architecture. Affinity-specific code
> belongs in the opal_p[m]affinity functions. Why aren't we just calling a
> "opal_paffinity_set_my_processor" function (or whatever name you like) in
> mpi_init, and doing all this paffinity stuff there?

This is the only place where this code is used. These functions process
the info from ODLS and set paffinity appropriately. Moving this code to
OPAL will cause unnecessary changes in the paffinity base API.

> 
> It would make mpi_init a lot cleaner, and preserve the code standards we
> have had since the beginning.
> 
> In addition, the code that has been added returns ORTE error and success
> codes. Given the location, it should be OMPI error and success codes - if we
> move it to where I think it belongs (in OPAL), then those codes should
> obviously be OPAL codes.


Will be cleaned up,
thanks.

> 
> If I'm missing some reason why these things can't be done, please
> enlighten
> me. Otherwise, it would be nice if this could be cleaned up.
> 
> Thanks
> Ralph
> 
> On 3/18/08 8:39 AM, "Jeff Squyres"  wrote:
> 
> > On Mar 18, 2008, at 9:32 AM, Jeff Squyres wrote:
> >
> >> I notice that rankfile didn't compile properly on some platforms and
> >> issued warnings on other platforms.  Thanks to Ralph for cleaning it
> >> up...
> >>
> >> 1. I see a getenv("slot_list") in the MPI side of the code; it looks
> >> like $slot_list is set by the odls for the MPI process.  Why isn't it
> >> an MCA parameter?  That's what all other values passed by the orted to
> >> the MPI process appear to be.

"slot_list" consist of socket:core pair for the rank to be bind to. This
info changes according to rankfile and different for each node and rank,
therefore it cannot be passed via mca parameter.
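(A minimal sketch of the hand-off described above, assuming the odls really
does export the rank's slot list under the environment variable name
"slot_list"; the parsing and binding step is only indicated by a comment.)

#include <stdlib.h>

static void apply_slot_list_from_env(void)
{
    char *slot_list = getenv("slot_list");

    if (NULL != slot_list) {
        /* parse the socket:core entries and bind through the paffinity
         * framework */
    }
}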

> >>
> >> 2. I see that ompi_mpi_params.c is now registering 2 rmaps-level MCA
> >> parameters.  Why?  Shouldn't these be in ORTE somewhere?

If you mean paffinity_alone and rank_file_debug, then 
1. paffinity_alone was there before.
2. After getting some answers from Ralph about orte_debug in
ompi_mpi_init, I intend to introduce an ompi_debug mca parameter that will
be used in this library, and rank_file_debug will be removed.

> >
> >
> > A few more notes:
> >
> > 3. Most of the files in orte/mca/rmaps/rankfile do not obey the prefix
> > rule.  I think that they should be renamed.

The rank_file component was copied from round_robin; I thought it would be
strange if it looked different.

> >
> > 4. A quick look through rankfile_lex.l seems to show that there are
> > global variables that are not protected by the prefix rule (or
> > static).  Ditto in rmaps_rf.c.  These should be fixed.

What do you mean?
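(Editorial illustration of the prefix rule being discussed, with hypothetical
names: a symbol visible outside its file must carry the framework/component
prefix, while a symbol used in only one file can simply be made static.)

int orte_rmaps_rank_file_done = 0;   /* exported: carries the component prefix */
static int lex_done = 0;             /* file-local: a short name is fine here  */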

> >
> > 5. rank_file_done was instantiated in both rankfile_lex.l and
> > rmaps_rf.c (causing a duplicate symbol linker error on OS X).  I
> > removed it from rmaps_rf.c (it was declared "extern" in
> > rankfile_lex.h, assumedly to indicate that it is "owned" by the lex.l
> > file...?).
thanks

> >
> > 6. svn:ignore was not set in the new rankfile directory.
Will be fixed.


I guess due to the heavy network traffic nowadays, all these comments
came now and not 2 weeks ago when I sent the code for reviews :) :) :).

Best Regards,
Lenny.

> 
> 
> ___
> devel mailing list
> de...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/devel



[OMPI devel] rankfile mapping RMAPS component.

2008-02-26 Thread Lenny Verkhovsky
Hi,

You can check Rankfile mapping component of RMAPS at
/tmp-public/rank_file/

Notes:
1.  This is based on rhc-step2b revision 17573
2.  Used plpa1.1
3.  if a rankfile is present, the rankfile component of RMAPS gets a
high priority.
4.  all ranks not specified in the rankfile are assigned using the -byslot
or -bynode policy. It's highly recommended to assign all ranks in the
rankfile if this component is used.
5.  usage: mpirun -mca rmaps_rankfile_path rankfile ./app
6.  you can use -mca mpi_paffinity_debug 1 to check CPU binding.
7.  example:

#cat hostfile
host1 
host2   
host3 
host4

#cat rankfile
rank 1=host1 slot=1:0,1
rank 0=host2 slot=0:*
rank 2=host4 slot=1-2
rank 3=host3 slot=0:1,1:0-2

# mpirun -np 2 -hostfile hostfile -mca rmaps_rankfile_path rankfile ./app


Explanation:
rank 1 will be bound to host1, socket1 core0 and socket1 core1
rank 0 will be bound to host2, socket0, all cores
rank 2 will be bound to host4, CPU #1 and CPU #2
rank 3 will be bound to host3, socket0 core1, socket1 core0, socket1
core1, socket1 core2.


Important!!!
* It is the user's responsibility to provide correct CPU, socket, or core
numbers to bind to.
* It is also the user's responsibility to provide correct hostnames.
* There are machines with not sequential socket numbering.
* use cat /proc/cpuinfo to check CPU, socket and core numbering on your
machine.
* Example:

processor   : 3 /*  CPU number */
vendor_id   : GenuineIntel
cpu family  : 6
model   : 15
model name  : Intel(R) Xeon(R) CPU5110  @ 1.60GHz
stepping: 6
cpu MHz : 1595.957
cache size  : 4096 KB
physical id : 3 /* Socket id */
siblings: 2
core id : 1 /* Core id */
cpu cores   : 2
fpu : yes
fpu_exception   : yes
cpuid level : 10
wp  : yes
flags   : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge
mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm syscall
nx lm constant_tsc pni monitor ds_cpl vmx tm2 cx16 xtpr dca lahf_lm
bogomips: 3192.10
clflush size: 64
cache_alignment : 64
address sizes   : 36 bits physical, 48 bits virtual
power management:
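(A quicker way to pull out just the fields that matter for a rankfile -- a
convenience suggestion, not part of the original instructions:)

grep -E 'processor|physical id|core id' /proc/cpuinfo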


Best Regards,
Lenny.




Re: [OMPI devel] [OMPI svn-full] svn:open-mpi r17584

2008-02-25 Thread Lenny Verkhovsky



> -Original Message-
> From: Jeff Squyres [mailto:jsquy...@cisco.com]
> Sent: Monday, 25 February 2008 16:52
> To: de...@open-mpi.org
> Cc: Lenny Verkhovsky
> Subject: Re: [OMPI svn-full] svn:open-mpi r17584
> 
> Lenny --
> 
> Is this the patch that Sharon was working on? I literally just
> created a new branch for bringing in plpa v1.1.  Should I do it on
> your rank_file branch instead?
>

No need to.
Yes, it's Sharon's patch,

I made a new branch for the rank mapping RMAPS component; it is a copy of Ralph's
rhc-step2b branch with the new PLPA 1.1.
I also had to patch the files with Sharon's patch because of the new PLPA API.

Are you planning to merge the new PLPA 1.1 into the trunk?


> 
> On Feb 25, 2008, at 9:46 AM, lenn...@osl.iu.edu wrote:
> 
> > Author: lennyve
> > Date: 2008-02-25 09:46:28 EST (Mon, 25 Feb 2008)
> > New Revision: 17584
> > URL: https://svn.open-mpi.org/trac/ompi/changeset/17584
> >
> > Log:
> > Added patched files due to PLPA.1.1 API
> >
> > Added:
> >   tmp-public/rank_file/opal/mca/paffinity/linux/Makefile.am
> >   tmp-public/rank_file/opal/mca/paffinity/linux/configure.m4
> >   tmp-public/rank_file/opal/mca/paffinity/linux/configure.params
> >   tmp-public/rank_file/opal/mca/paffinity/linux/paffinity_linux.h
> >   tmp-public/rank_file/opal/mca/paffinity/linux/
> > paffinity_linux_component.c
> >   tmp-public/rank_file/opal/mca/paffinity/linux/
> > paffinity_linux_module.c
> >
> > Added: tmp-public/rank_file/opal/mca/paffinity/linux/Makefile.am
> > ==============================================================================
> > +++ tmp-public/rank_file/opal/mca/paffinity/linux/Makefile.am
> > 2008-02-25 09:46:28 EST (Mon, 25 Feb 2008)
> > @@ -0,0 +1,53 @@
> > +#
> > +# Copyright (c) 2004-2005 The Trustees of Indiana University and
> > Indiana
> > +# University Research and Technology
> > +# Corporation.  All rights reserved.
> > +# Copyright (c) 2004-2005 The University of Tennessee and The
> > University
> > +# of Tennessee Research Foundation.  All
> > rights
> > +# reserved.
> > +# Copyright (c) 2004-2005 High Performance Computing Center
> > Stuttgart,
> > +# University of Stuttgart.  All rights
> > reserved.
> > +# Copyright (c) 2004-2005 The Regents of the University of
> > California.
> > +# All rights reserved.
> > +# Copyright (c) 2007  Cisco Systems, Inc.  All rights reserved.
> > +# $COPYRIGHT$
> > +#
> > +# Additional copyrights may follow
> > +#
> > +# $HEADER$
> > +#
> > +
> > +SUBDIRS = plpa
> > +
> > +# To find plpa_bottom.h
> > +AM_CPPFLAGS = -I$(top_srcdir)/opal/mca/paffinity/linux/plpa/src/
> > libplpa
> > +
> > +sources = \
> > +paffinity_linux.h \
> > +paffinity_linux_component.c \
> > +paffinity_linux_module.c
> > +
> > +# Make the output library in this directory, and name it either
> > +# mca__.la (for DSO builds) or libmca__.la
> > +# (for static builds).
> > +
> > +if OMPI_BUILD_paffinity_linux_DSO
> > +component_noinst =
> > +component_install = mca_paffinity_linux.la
> > +else
> > +component_noinst = libmca_paffinity_linux.la
> > +component_install =
> > +endif
> > +
> > +mcacomponentdir = $(pkglibdir)
> > +mcacomponent_LTLIBRARIES = $(component_install)
> > +mca_paffinity_linux_la_SOURCES = $(sources)
> > +mca_paffinity_linux_la_LDFLAGS = -module -avoid-version
> > +mca_paffinity_linux_la_LIBADD = \
> > +$(top_ompi_builddir)/opal/mca/paffinity/linux/plpa/src/
> > libplpa/libplpa_included.la
> > +
> > +noinst_LTLIBRARIES = $(component_noinst)
> > +libmca_paffinity_linux_la_SOURCES =$(sources)
> > +libmca_paffinity_linux_la_LDFLAGS = -module -avoid-version
> > +libmca_paffinity_linux_la_LIBADD = \
> > +$(top_ompi_builddir)/opal/mca/paffinity/linux/plpa/src/
> > libplpa/libplpa_included.la
> >
> > Added: tmp-public/rank_file/opal/mca/paffinity/linux/configure.m4
> > ==============================================================================
> > --- (empty file)
> > +++ tmp-public/rank_file/opal/mca/paffinity/linux/configure.m4
> > 2008-02-25 09:46:28 EST (Mon, 25 Feb 2008)
> > @@ -0,0 
