Re: [OMPI devel] OpenMPI, PLPA and Linux cpuset/cgroup support

2009-07-16 Thread Chris Samuel

- "Ralph Castain"  wrote:

> Sounds like a problem in PLPA - I'll have to defer
> to them.

Understood, thanks for that update.  I'll try and
find some time to look inside PLPA too. 

> Our primary PLPA person is on vacation this week, so
> you might not hear back from him until later next week
> when he gets through his inbox mountain.

I can quite sympathise!  I'm away on leave next week
so it might be a little time until we can resynchronise. :-)

> PLPA may have its own mailing list too - not really sure.

Just seen the link from Terry (thanks!), will take a
look at how busy it is.

All the best,
Chris
-- 
Christopher Samuel - (03) 9925 4751 - Systems Manager
 The Victorian Partnership for Advanced Computing
 P.O. Box 201, Carlton South, VIC 3053, Australia
VPAC is a not-for-profit Registered Research Agency


Re: [OMPI devel] DDT and spawn issue?

2009-07-16 Thread Jeff Squyres

On Jul 15, 2009, at 3:57 PM, George Bosilca wrote:


Regarding the latency issue, there is not much to say. The
platform we tested on is clearly older than what other people test on,
but that is all there is to it. The two versions (before and after the
datatype move) have the same latency, so there is no reason to focus on
the latency number.




Ok.  I asked about it because you guys cited a high number on some  
cluster, but didn't provide any details about the makeup of that  
cluster.  Hence, it sounded like the DDT changes increased the  
latency.  If not, great!


--
Jeff Squyres
Cisco Systems



Re: [OMPI devel] [OMPI svn-full] svn:open-mpi r21703

2009-07-16 Thread Jeff Squyres

Thanks George!

On Jul 16, 2009, at 3:13 PM, bosi...@osl.iu.edu wrote:


Author: bosilca
Date: 2009-07-16 15:13:30 EDT (Thu, 16 Jul 2009)
New Revision: 21703
URL: https://svn.open-mpi.org/trac/ompi/changeset/21703

Log:
Get rid of the ompi_convertor.h header file. Replace all references to ompi_convertor
by opal_convertor.
Cleanup the pcie BTL.

Removed:
  trunk/ompi/datatype/ompi_convertor.h
Text files modified:
  trunk/ompi/datatype/Makefile.am  | 1 -
  trunk/ompi/mca/btl/pcie/btl_pcie.c   |15 ---
  trunk/ompi/mca/btl/pcie/btl_pcie.h   | 4 ++--
  trunk/ompi/mca/btl/pcie/btl_pcie_component.c | 1 -
  4 files changed, 10 insertions(+), 11 deletions(-)

Modified: trunk/ompi/datatype/Makefile.am
==============================================================================

--- trunk/ompi/datatype/Makefile.am (original)
+++ trunk/ompi/datatype/Makefile.am	2009-07-16 15:13:30 EDT (Thu, 16 Jul 2009)

@@ -20,7 +20,6 @@
#

headers = \
-ompi_convertor.h \
ompi_datatype.h \
ompi_datatype_internal.h


Deleted: trunk/ompi/datatype/ompi_convertor.h
==============================================================================
--- trunk/ompi/datatype/ompi_convertor.h	2009-07-16 15:13:30 EDT (Thu, 16 Jul 2009)

+++ (empty file)
@@ -1,89 +0,0 @@
-/* -*- Mode: C; c-basic-offset:4 ; -*- */
-/*
- * Copyright (c) 2009  The University of Tennessee and The University
- * of Tennessee Research Foundation.  All rights
- * reserved.
- * Copyright (c) 2009  Oak Ridge National Labs.  All rights reserved.
- * $COPYRIGHT$
- *
- * Additional copyrights may follow
- *
- * $HEADER$
- */
-
-#ifndef OMPI_CONVERTOR_H
-#define OMPI_CONVERTOR_H
-
-#include "ompi_config.h"
-
-#include 
-
-#include "opal/datatype/opal_convertor.h"
-#include "ompi/datatype/ompi_datatype.h"
-
-/*
- * XXX TODO To be deleted again.
- * Very small interface to have code, which depends on ompi_convertor_prepare... interface
- * to work, still...
- *
- * However, still any header #include "opal/datatype/opal_convertor.h" will need
- * to be renamed to #include "ompi/datatype/ompi_convertor.h"
- */
-#warning "This header file should only be included as a convenience. Please use the opal_convert.h header, functions and macros"
-
-#define ompi_convertor_t opal_convertor_t
-
-static inline int32_t ompi_convertor_prepare_for_send( opal_convertor_t* convertor,
-                                                       const ompi_datatype_t* datatype,
-                                                       int32_t count,
-                                                       const void* pUserBuf)
-{
-    return opal_convertor_prepare_for_send( convertor,
-                                            &(datatype->super),
-                                            count,
-                                            pUserBuf);
-}
-
-static inline int32_t ompi_convertor_copy_and_prepare_for_send( const opal_convertor_t* pSrcConv,
-                                                                const ompi_datatype_t* datatype,
-                                                                int32_t count,
-                                                                const void* pUserBuf,
-                                                                int32_t flags,
-                                                                opal_convertor_t* convertor )
-{
-    return opal_convertor_copy_and_prepare_for_send( pSrcConv,
-                                                     &(datatype->super),
-                                                     count,
-                                                     pUserBuf,
-                                                     flags,
-                                                     convertor );
-}
-
-
-static inline int32_t ompi_convertor_prepare_for_recv( opal_convertor_t* convertor,
-                                                       const ompi_datatype_t* datatype,
-                                                       int32_t count,
-                                                       const void* pUserBuf )
-{
-    return opal_convertor_prepare_for_recv( convertor,
-                                            &(datatype->super),
-                                            count,
-                                            pUserBuf );
-}
-
-static inline int32_t ompi_convertor_copy_and_prepare_for_recv( const opal_convertor_t* pSrcConv,
-                                                                const ompi_datatype_t* datatype,
-                                                                int32_t count,
-                                                                const void* pUserBuf,
-                                                                int32_t

Re: [OMPI devel] XML output

2009-07-16 Thread Ralph Castain

Yo Greg

Any way your user can send me the print statements? I can't find
anything wrong with the code - I'm wondering if he has some
non-printing character in there that is causing a problem.


Thanks
Ralph

On Jul 16, 2009, at 12:37 PM, Ralph Castain wrote:

Weird. It doesn't look like it is actually interleaving, does it? It  
looks more like a leading tag was incorrectly inserted between the m  
and i in "mixing" for some reason.


I'll take a look at the code to see what might have triggered that...

On Jul 16, 2009, at 12:16 PM, Greg Watson wrote:


Ralph,

One of our users is seeing the following output with the XML option  
enabled (1.3.3):


time_mix_freq = 17
Time mixing option:
  avgfit -- time averaging
  with timestep chosen to fit exactly into one day  
or coupling interval
Averaging time steps are at step numbers2,17 each  
day

 

It appears that the XML tags for the same task are being  
interleaved. Any idea if this is fixable?


Thanks,

Greg
___
devel mailing list
de...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/devel






Re: [OMPI devel] [OMPI svn-full] svn:open-mpi r21698

2009-07-16 Thread Jeff Squyres

Even worse, I was trying to fix this by changing btl_sctp.c from

#include "opal/datatype/opal_convertor.h"

to

#include "ompi/datatype/ompi_convertor.h"

But then I get

#warning "This header file should only be included as a convenience. Please use the opal_convert.h header, functions and macros"


And the top of that file has the following:

/*
 * XXX TODO To be deleted again.
 * Very small interface to have code, which depends on ompi_convertor_prepare... interface
 * to work, still...
 *
 * However, still any header #include "opal/datatype/opal_convertor.h" will need
 * to be renamed to #include "ompi/datatype/ompi_convertor.h"
 */

WTF?

I dislike that this header was put in as a workaround to not have to
update other code to match the new interface (i.e., they're all
1-liner inline functions).  This is significantly icky and should be
fixed.  If you're going to update the interface, then update it.
Don't put a patchwork around making it look like you didn't update it.
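
To illustrate the point: each wrapper in the workaround header only forwards to the OPAL function with &(datatype->super), so a caller can make that call directly. The following is a minimal sketch, not Open MPI code. The struct definitions and the body of opal_convertor_prepare_for_send are mock stand-ins (assumptions) so that the example compiles on its own, and prepare_send is a hypothetical caller.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Mock stand-ins (assumptions): the real types live in opal/datatype and
 * ompi/datatype and carry far more state than shown here. */
typedef struct { int id; } opal_datatype_t;
typedef struct { opal_datatype_t super; } ompi_datatype_t;
typedef struct {
    const opal_datatype_t *pDesc;
    const void *pBaseBuf;
    int32_t count;
} opal_convertor_t;

/* Mock of the OPAL-level function; it only records its arguments. */
static int32_t opal_convertor_prepare_for_send(opal_convertor_t *convertor,
                                               const opal_datatype_t *datatype,
                                               int32_t count,
                                               const void *pUserBuf)
{
    convertor->pDesc = datatype;
    convertor->pBaseBuf = pUserBuf;
    convertor->count = count;
    return 0;
}

/* Hypothetical caller updated to the new interface: it passes the embedded
 * OPAL datatype itself instead of going through an ompi_convertor_* wrapper. */
static int32_t prepare_send(opal_convertor_t *convertor,
                            const ompi_datatype_t *datatype,
                            int32_t count, const void *buf)
{
    assert(convertor != NULL && datatype != NULL);
    return opal_convertor_prepare_for_send(convertor, &(datatype->super),
                                           count, buf);
}
```

The only change at the call site is the explicit &(datatype->super), which is exactly what the removed inline wrappers did on the caller's behalf.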






On Jul 16, 2009, at 2:31 PM, Jeff Squyres (jsquyres) wrote:


George --

This does not compile.

btl_sctp.c: In function `mca_btl_sctp_prepare_dst':
btl_sctp.c:339: error: implicit declaration of function `ompi_datatype_type_lb'
btl_sctp.c:339: error: `ompi_datatype_t' undeclared (first use in this function)
btl_sctp.c:339: error: (Each undeclared identifier is reported only once
btl_sctp.c:339: error: for each function it appears in.)
btl_sctp.c:339: error: syntax error before ')' token

Is there a missing header file?



On Jul 16, 2009, at 2:25 PM,  wrote:

> Author: bosilca
> Date: 2009-07-16 14:25:08 EDT (Thu, 16 Jul 2009)
> New Revision: 21698
> URL: https://svn.open-mpi.org/trac/ompi/changeset/21698
>
> Log:
> No opal datatype functions in the BTL. The datatype attached to the
> convertor is an ompi_datatype_t so calling the ompi level functions
> is the way to go.
>
> Text files modified:
>trunk/ompi/mca/btl/pcie/btl_pcie.c | 2 +-
>trunk/ompi/mca/btl/sctp/btl_sctp.c | 2 +-
>2 files changed, 2 insertions(+), 2 deletions(-)
>
> Modified: trunk/ompi/mca/btl/pcie/btl_pcie.c
> ==============================================================================
> --- trunk/ompi/mca/btl/pcie/btl_pcie.c  (original)
> +++ trunk/ompi/mca/btl/pcie/btl_pcie.c  2009-07-16 14:25:08 EDT (Thu, 16 Jul 2009)
> @@ -360,7 +360,7 @@
>  if(NULL == frag) {
>  return NULL;
>  }
> -ompi_datatype_type_lb(convertor->pDesc, &lb);
> +ompi_datatype_type_lb((ompi_datatype_t*)convertor->pDesc, &lb);
>  frag->segment.seg_addr.pval = convertor->pBaseBuf + lb +
>  convertor->bConverted;
>  if(NULL == registration) {
>
> Modified: trunk/ompi/mca/btl/sctp/btl_sctp.c
> ==============================================================================
> --- trunk/ompi/mca/btl/sctp/btl_sctp.c  (original)
> +++ trunk/ompi/mca/btl/sctp/btl_sctp.c  2009-07-16 14:25:08 EDT (Thu, 16 Jul 2009)
> @@ -336,7 +336,7 @@
>  return NULL;
>  }
>
> -opal_datatype_type_lb(convertor->pDesc, &lb);
> +ompi_datatype_type_lb((ompi_datatype_t*)convertor->pDesc, &lb);
>  frag->segments->seg_len = *size;
>  frag->segments->seg_addr.pval = convertor->pBaseBuf + lb + convertor->bConverted;
>
> ___
> svn-full mailing list
> svn-f...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/svn-full
>


--
Jeff Squyres
Cisco Systems





--
Jeff Squyres
Cisco Systems



Re: [OMPI devel] [OMPI svn-full] svn:open-mpi r21698

2009-07-16 Thread George Bosilca

There was a missing header. 21701 should fix the problem.

  george.

On Jul 16, 2009, at 14:31 , Jeff Squyres wrote:


George --

This does not compile.

btl_sctp.c: In function `mca_btl_sctp_prepare_dst':
btl_sctp.c:339: error: implicit declaration of function `ompi_datatype_type_lb'
btl_sctp.c:339: error: `ompi_datatype_t' undeclared (first use in this function)
btl_sctp.c:339: error: (Each undeclared identifier is reported only once
btl_sctp.c:339: error: for each function it appears in.)
btl_sctp.c:339: error: syntax error before ')' token

Is there a missing header file?



On Jul 16, 2009, at 2:25 PM,  wrote:


Author: bosilca
Date: 2009-07-16 14:25:08 EDT (Thu, 16 Jul 2009)
New Revision: 21698
URL: https://svn.open-mpi.org/trac/ompi/changeset/21698

Log:
No opal datatype functions in the BTL. The datatype attached to the
convertor is an ompi_datatype_t so calling the ompi level functions
is the way to go.

Text files modified:
  trunk/ompi/mca/btl/pcie/btl_pcie.c | 2 +-
  trunk/ompi/mca/btl/sctp/btl_sctp.c | 2 +-
  2 files changed, 2 insertions(+), 2 deletions(-)

Modified: trunk/ompi/mca/btl/pcie/btl_pcie.c
==============================================================================
--- trunk/ompi/mca/btl/pcie/btl_pcie.c  (original)
+++ trunk/ompi/mca/btl/pcie/btl_pcie.c  2009-07-16 14:25:08 EDT (Thu, 16 Jul 2009)
@@ -360,7 +360,7 @@
 if(NULL == frag) {
 return NULL;
 }
-ompi_datatype_type_lb(convertor->pDesc, &lb);
+ompi_datatype_type_lb((ompi_datatype_t*)convertor->pDesc, &lb);
 frag->segment.seg_addr.pval = convertor->pBaseBuf + lb +
 convertor->bConverted;
 if(NULL == registration) {

Modified: trunk/ompi/mca/btl/sctp/btl_sctp.c
==============================================================================
--- trunk/ompi/mca/btl/sctp/btl_sctp.c  (original)
+++ trunk/ompi/mca/btl/sctp/btl_sctp.c  2009-07-16 14:25:08 EDT (Thu, 16 Jul 2009)
@@ -336,7 +336,7 @@
 return NULL;
 }

-opal_datatype_type_lb(convertor->pDesc, &lb);
+ompi_datatype_type_lb((ompi_datatype_t*)convertor->pDesc, &lb);
 frag->segments->seg_len = *size;
 frag->segments->seg_addr.pval = convertor->pBaseBuf + lb + convertor->bConverted;






--
Jeff Squyres
Cisco Systems





Re: [OMPI devel] XML output

2009-07-16 Thread Ralph Castain
Weird. It doesn't look like it is actually interleaving, does it? It  
looks more like a leading tag was incorrectly inserted between the m  
and i in "mixing" for some reason.


I'll take a look at the code to see what might have triggered that...

On Jul 16, 2009, at 12:16 PM, Greg Watson wrote:


Ralph,

One of our users is seeing the following output with the XML option  
enabled (1.3.3):


time_mix_freq = 17
Time mixing option:
  avgfit -- time averaging
  with timestep chosen to fit exactly into one day  
or coupling interval
Averaging time steps are at step numbers2,17 each  
day

 

It appears that the XML tags for the same task are being  
interleaved. Any idea if this is fixable?


Thanks,

Greg




Re: [OMPI devel] [OMPI svn-full] svn:open-mpi r21698

2009-07-16 Thread Jeff Squyres

George --

This does not compile.

btl_sctp.c: In function `mca_btl_sctp_prepare_dst':
btl_sctp.c:339: error: implicit declaration of function `ompi_datatype_type_lb'
btl_sctp.c:339: error: `ompi_datatype_t' undeclared (first use in this function)
btl_sctp.c:339: error: (Each undeclared identifier is reported only once
btl_sctp.c:339: error: for each function it appears in.)
btl_sctp.c:339: error: syntax error before ')' token

Is there a missing header file?



On Jul 16, 2009, at 2:25 PM,  wrote:


Author: bosilca
Date: 2009-07-16 14:25:08 EDT (Thu, 16 Jul 2009)
New Revision: 21698
URL: https://svn.open-mpi.org/trac/ompi/changeset/21698

Log:
No opal datatype functions in the BTL. The datatype attached to the
convertor is an ompi_datatype_t so calling the ompi level functions
is the way to go.

Text files modified:
   trunk/ompi/mca/btl/pcie/btl_pcie.c | 2 +-
   trunk/ompi/mca/btl/sctp/btl_sctp.c | 2 +-
   2 files changed, 2 insertions(+), 2 deletions(-)

Modified: trunk/ompi/mca/btl/pcie/btl_pcie.c
==============================================================================
--- trunk/ompi/mca/btl/pcie/btl_pcie.c  (original)
+++ trunk/ompi/mca/btl/pcie/btl_pcie.c  2009-07-16 14:25:08 EDT (Thu, 16 Jul 2009)
@@ -360,7 +360,7 @@
 if(NULL == frag) {
 return NULL;
 }
-ompi_datatype_type_lb(convertor->pDesc, &lb);
+ompi_datatype_type_lb((ompi_datatype_t*)convertor->pDesc, &lb);
 frag->segment.seg_addr.pval = convertor->pBaseBuf + lb +
 convertor->bConverted;
 if(NULL == registration) {

Modified: trunk/ompi/mca/btl/sctp/btl_sctp.c
==============================================================================
--- trunk/ompi/mca/btl/sctp/btl_sctp.c  (original)
+++ trunk/ompi/mca/btl/sctp/btl_sctp.c  2009-07-16 14:25:08 EDT (Thu, 16 Jul 2009)
@@ -336,7 +336,7 @@
 return NULL;
 }

-opal_datatype_type_lb(convertor->pDesc, &lb);
+ompi_datatype_type_lb((ompi_datatype_t*)convertor->pDesc, &lb);
 frag->segments->seg_len = *size;
 frag->segments->seg_addr.pval = convertor->pBaseBuf + lb + convertor->bConverted;






--
Jeff Squyres
Cisco Systems



[OMPI devel] XML output

2009-07-16 Thread Greg Watson

Ralph,

One of our users is seeing the following output with the XML option  
enabled (1.3.3):


time_mix_freq = 17
Time mixing option:
  avgfit -- time averaging
  with timestep chosen to fit exactly into one day or  
coupling interval
Averaging time steps are at step numbers2,17 each  
day

 

It appears that the XML tags for the same task are being interleaved.  
Any idea if this is fixable?


Thanks,

Greg


[OMPI devel] default btl eager_limit

2009-07-16 Thread Terry Dontje
I was playing around with some really silly fragment sizes (sub 72 
bytes) when I ran into some asserts in the btl_openib_sendi.  I traced 
the assert to be caused by mca_pml_ob1_send_request_start_btl() 
calculating the true eager_limit with the following line:


  size_t eager_limit = btl->btl_eager_limit - sizeof(mca_pml_ob1_hdr_t);

If btl_eager_limit ends up being less than sizeof(mca_pml_ob1_hdr_t),
the subtraction wraps around (size_t is unsigned), so the calculated
eager_limit is a very large number, which triggers an assert later on
in the stack.


It seems to me that it would be nice to insert some checks in  
mca_btl_base_param_register() to make sure btl_eager_limit is > 
sizeof(mca_pml_ob1_hdr_t).  Am I missing a reason why this was not done 
in the first place?
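
The wraparound follows from size_t being unsigned. Below is a minimal sketch of the kind of check being proposed, assuming a made-up header size: HDR_SIZE stands in for sizeof(mca_pml_ob1_hdr_t), and both function names are hypothetical, not part of Open MPI.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for sizeof(mca_pml_ob1_hdr_t); the real value
 * depends on the Open MPI build. */
#define HDR_SIZE ((size_t)32)

/* Unchecked form of the quoted line: because size_t is unsigned, the
 * result wraps to a huge value when btl_eager_limit < HDR_SIZE. */
static size_t eager_payload_unchecked(size_t btl_eager_limit)
{
    return btl_eager_limit - HDR_SIZE;
}

/* Guarded form: returns 0 and stores the payload size on success, or -1
 * when the limit is not strictly larger than the header. */
static int eager_payload_checked(size_t btl_eager_limit, size_t *payload)
{
    assert(payload != NULL);
    if (btl_eager_limit <= HDR_SIZE) {
        return -1;
    }
    *payload = btl_eager_limit - HDR_SIZE;
    return 0;
}
```

A registration-time check of this shape would reject a too-small btl_eager_limit up front instead of letting the wrapped value reach the send path.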


--td


Re: [OMPI devel] OpenMPI, PLPA and Linux cpuset/cgroup support

2009-07-16 Thread Terry Dontje

There are some mailing lists for PLPA at:

http://www.open-mpi.org/community/lists/plpa.php

--td
Ralph Castain wrote:
Sounds like a problem in PLPA - I'll have to defer to them. Our 
primary PLPA person is on vacation this week, so you might not hear 
back from him until later next week when he gets through his inbox 
mountain.


PLPA may have its own mailing list too - not really sure.

On Jul 15, 2009, at 10:24 PM, Chris Samuel wrote:



- "Ralph Castain"  wrote:


Looking at your command line, did you remember to set -mca
mpi_paffinity_alone 1?


Ahh, no, sorry, still feeling my way with this..


If not, we won't set affinity on the processes.


Now it fails immediately with:

 Setting processor affinity failed
 --> Returned "Invalid argument" (-11) instead of "Success" (0)

wrapped in a bunch of OpenMPI messages explaining that it
failed on start.

The strace looks much the same as before.

[csamuel@tango047 CPI]$ fgrep affinity cpi-trace.txt
10853 execve("/usr/local/openmpi/1.3.3-gcc/bin/mpiexec", ["mpiexec", 
"-mca", "mpi_paffinity_alone", "1", "-npernode", "4", 
"/home/csamuel/Sources/Tests/CPI/"...], [/* 56 vars */]) = 0

10853 sched_getaffinity(0, 128,  { 3c }) = 8
10853 sched_setaffinity(0, 8,  { 0 })   = -1 EFAULT (Bad address)
10854 sched_getaffinity(0, 128,  
10854 <... sched_getaffinity resumed>  { 3c }) = 8
10854 sched_setaffinity(0, 8,  { 0 } 
10854 <... sched_setaffinity resumed> ) = -1 EFAULT (Bad address)
10857 sched_getaffinity(0, 128,  
10857 <... sched_getaffinity resumed>  { 3c }) = 8
10857 sched_setaffinity(0, 8,  { 0 } 
10857 <... sched_setaffinity resumed> ) = -1 EFAULT (Bad address)
10856 sched_getaffinity(0, 128,  
10856 <... sched_getaffinity resumed>  { 3c }) = 8
10856 sched_setaffinity(0, 8,  { 0 } 
10856 <... sched_setaffinity resumed> ) = -1 EFAULT (Bad address)
10855 sched_getaffinity(0, 128,  
10855 <... sched_getaffinity resumed>  { 3c }) = 8
10855 sched_setaffinity(0, 8,  { 0 } 
10855 <... sched_setaffinity resumed> ) = -1 EFAULT (Bad address)
10857 sched_setaffinity(10857, 8,  { 8 } 
10857 <... sched_setaffinity resumed> ) = 0
10856 sched_setaffinity(10856, 8,  { 4 } 
10856 <... sched_setaffinity resumed> ) = 0
10854 sched_setaffinity(10854, 8,  { 1 } 
10854 <... sched_setaffinity resumed> ) = -1 EINVAL (Invalid argument)


cheers,
Chris
--
Christopher Samuel - (03) 9925 4751 - Systems Manager
The Victorian Partnership for Advanced Computing
P.O. Box 201, Carlton South, VIC 3053, Australia
VPAC is a not-for-profit Registered Research Agency




Re: [OMPI devel] OpenMPI, PLPA and Linux cpuset/cgroup support

2009-07-16 Thread Ralph Castain
Sounds like a problem in PLPA - I'll have to defer to them. Our  
primary PLPA person is on vacation this week, so you might not hear  
back from him until later next week when he gets through his inbox  
mountain.


PLPA may have its own mailing list too - not really sure.

On Jul 15, 2009, at 10:24 PM, Chris Samuel wrote:



- "Ralph Castain"  wrote:


Looking at your command line, did you remember to set -mca
mpi_paffinity_alone 1?


Ahh, no, sorry, still feeling my way with this..


If not, we won't set affinity on the processes.


Now it fails immediately with:

 Setting processor affinity failed
 --> Returned "Invalid argument" (-11) instead of "Success" (0)

wrapped in a bunch of OpenMPI messages explaining that it
failed on start.

The strace looks much the same as before.

[csamuel@tango047 CPI]$ fgrep affinity cpi-trace.txt
10853 execve("/usr/local/openmpi/1.3.3-gcc/bin/mpiexec", ["mpiexec",
"-mca", "mpi_paffinity_alone", "1", "-npernode", "4",
"/home/csamuel/Sources/Tests/CPI/"...], [/* 56 vars */]) = 0

10853 sched_getaffinity(0, 128,  { 3c }) = 8
10853 sched_setaffinity(0, 8,  { 0 })   = -1 EFAULT (Bad address)
10854 sched_getaffinity(0, 128,  
10854 <... sched_getaffinity resumed>  { 3c }) = 8
10854 sched_setaffinity(0, 8,  { 0 } 
10854 <... sched_setaffinity resumed> ) = -1 EFAULT (Bad address)
10857 sched_getaffinity(0, 128,  
10857 <... sched_getaffinity resumed>  { 3c }) = 8
10857 sched_setaffinity(0, 8,  { 0 } 
10857 <... sched_setaffinity resumed> ) = -1 EFAULT (Bad address)
10856 sched_getaffinity(0, 128,  
10856 <... sched_getaffinity resumed>  { 3c }) = 8
10856 sched_setaffinity(0, 8,  { 0 } 
10856 <... sched_setaffinity resumed> ) = -1 EFAULT (Bad address)
10855 sched_getaffinity(0, 128,  
10855 <... sched_getaffinity resumed>  { 3c }) = 8
10855 sched_setaffinity(0, 8,  { 0 } 
10855 <... sched_setaffinity resumed> ) = -1 EFAULT (Bad address)
10857 sched_setaffinity(10857, 8,  { 8 } 
10857 <... sched_setaffinity resumed> ) = 0
10856 sched_setaffinity(10856, 8,  { 4 } 
10856 <... sched_setaffinity resumed> ) = 0
10854 sched_setaffinity(10854, 8,  { 1 } 
10854 <... sched_setaffinity resumed> ) = -1 EINVAL (Invalid argument)


cheers,
Chris
--
Christopher Samuel - (03) 9925 4751 - Systems Manager
The Victorian Partnership for Advanced Computing
P.O. Box 201, Carlton South, VIC 3053, Australia
VPAC is a not-for-profit Registered Research Agency




[OMPI devel] Question on MPI_Info

2009-07-16 Thread Prasadcse Perera
Hi all,
I have been trying some simple code to write a file using parallel I/O on
Open MPI. Here I specify the MPI_Info value as 0, and the execution
terminates with this message for any number of processes:

*** An error occurred in MPI_File_open
*** on communicator MPI_COMM_WORLD
*** MPI_ERR_INFO: invalid info object
*** MPI_ERRORS_ARE_FATAL (your MPI job will now abort)

The source file is attached here, and I'm using the openmpi-1.3.3a1r21566 build.
-- 
http://www.codeproject.com/script/Articles/MemberArticles.aspx?amid=3489381

#include 
#include 


int main(int args, char* argv[]){
  int size, rank;
  MPI::Init();
  rank = MPI::COMM_WORLD.Get_rank();
  size = MPI::COMM_WORLD.Get_size();
  int amode, size_int;

  char *fname, *drep;
  MPI_Datatype etype, filetype;
  MPI_Info info;
  MPI_Status status;
  MPI_Offset disp;
  MPI_File fh;
  fname = "testfile.txt";
  drep = "native";

  amode = MPI_MODE_CREATE | MPI_MODE_WRONLY;

  size_int = sizeof(size_int);
  info = 0;

  MPI_File_open(MPI_COMM_WORLD, fname, amode, info, &fh);
  disp = rank * size_int;
  etype = MPI_INTEGER;
  filetype = MPI_INTEGER;

  MPI_File_set_view(fh, disp, etype, filetype, drep, info);
  MPI_File_write(fh, &rank, 1, MPI_INTEGER, &status);

  MPI_File_close(&fh);
  MPI_Finalize();
}


Re: [OMPI devel] selectively bind MPI to one HCA out of available ones

2009-07-16 Thread neeraj
Thanks a lot, Pasha.
You saved me a lot of time.
Thanks

Regards

Neeraj Chourasia (MTS)
Computational Research Laboratories Ltd.
(A wholly Owned Subsidiary of TATA SONS Ltd)
B-101, ICC Trade Towers, Senapati Bapat Road
Pune 411016 (Mah) INDIA
(O) +91-20-6620 9863  (Fax) +91-20-6620 9862
M: +91.9225520634




From: "Pavel Shamis (Pasha)"
Sent by: devel-boun...@open-mpi.org
Date: 07/16/2009 01:50 PM
Please respond to: pa...@dev.mellanox.co.il; Open MPI Developers
To: Open MPI Developers
cc: Open MPI Users
Subject: Re: [OMPI devel] selectively bind MPI to one HCA out of available ones






Hi,
You can select the IB device used with the openib BTL by using the following parameters:
 MCA btl: parameter "btl_openib_if_include" (current value: <none>, data source: default value)
          Comma-delimited list of devices/ports to be used (e.g. "mthca0,mthca1:2"; empty value means to
          use all ports found).  Mutually exclusive with btl_openib_if_exclude.
 MCA btl: parameter "btl_openib_if_exclude" (current value: <none>, data source: default value)
          Comma-delimited list of device/ports to be excluded (empty value means to not exclude any
          ports).  Mutually exclusive with btl_openib_if_include.

For example, if you want to use the first port on mthca0, your command line
will look like:

mpirun -np. --mca btl_openib_if_include mthca0:1 

Pasha

nee...@crlindia.com wrote:
>
> Hi all,
>
> I have a cluster where both HCAs on each blade are active, but
> connected to different subnets.
> Is there an option in MPI to select one HCA out of the available
> ones? I know it can be done by making changes in the Open MPI code, but I
> need a clean interface, such as an option at MPI launch time, to select
> mthca0 or mthca1.
>
> Any help is appreciated. By the way, I just checked MVAPICH and the
> feature is there.
>
> Regards
>
> Neeraj Chourasia (MTS)
> Computational Research Laboratories Ltd.
> (A wholly Owned Subsidiary of TATA SONS Ltd)
> B-101, ICC Trade Towers, Senapati Bapat Road
> Pune 411016 (Mah) INDIA
> (O) +91-20-6620 9863  (Fax) +91-20-6620 9862
> M: +91.9225520634
>
> =-=-= Notice: The information contained in this 
> e-mail message and/or attachments to it may contain confidential or 
> privileged information. If you are not the intended recipient, any 
> dissemination, use, review, distribution, printing or copying of the 
> information contained in this e-mail message and/or attachments to it 
> are strictly prohibited. If you have received this communication in 
> error, please notify us by reply e-mail or telephone and immediately 
> and permanently delete the message and any attachments. Internet 
> communications cannot be guaranteed to be timely, secure, error or 
> virus-free. The sender does not accept liability for any errors or 
> omissions.Thank you =-=-=
>
> 
>




Re: [OMPI devel] selectively bind MPI to one HCA out of available ones

2009-07-16 Thread Pavel Shamis (Pasha)

Hi,
You can select the IB device used with the openib BTL by using the following parameters:
MCA btl: parameter "btl_openib_if_include" (current value: <none>, data source: default value)
         Comma-delimited list of devices/ports to be used (e.g. "mthca0,mthca1:2"; empty value means to
         use all ports found).  Mutually exclusive with btl_openib_if_exclude.
MCA btl: parameter "btl_openib_if_exclude" (current value: <none>, data source: default value)
         Comma-delimited list of device/ports to be excluded (empty value means to not exclude any
         ports).  Mutually exclusive with btl_openib_if_include.

For example, if you want to use the first port on mthca0, your command line
will look like:


mpirun -np. --mca btl_openib_if_include mthca0:1 
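
The parameter help text above describes entries of the form "device" or "device:port", separated by commas. As an illustration of that format only, here is a minimal sketch of splitting one entry; parse_if_entry is a hypothetical helper, not the parser used by the openib BTL, and the convention that port 0 means "all ports" as well as the 1..255 port range are assumptions made for the example.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Split one "device[:port]" entry such as "mthca0:1" or "mthca1".
 * On success returns 0, copies the device name into dev, and stores the
 * port in *port (0 when no port was given, taken here to mean all ports).
 * Returns -1 on a malformed entry or a too-small dev buffer. */
static int parse_if_entry(const char *entry, char *dev, size_t devlen,
                          int *port)
{
    const char *colon;
    size_t n;

    assert(entry != NULL && dev != NULL && port != NULL);
    colon = strchr(entry, ':');
    n = colon ? (size_t)(colon - entry) : strlen(entry);
    if (n == 0 || n >= devlen) {
        return -1;
    }
    memcpy(dev, entry, n);
    dev[n] = '\0';
    *port = 0;
    if (colon != NULL) {
        char *end;
        long p = strtol(colon + 1, &end, 10);
        /* Reject an empty, non-numeric, or out-of-range port. */
        if (end == colon + 1 || *end != '\0' || p < 1 || p > 255) {
            return -1;
        }
        *port = (int)p;
    }
    return 0;
}
```

With this sketch, "mthca0:1" yields device "mthca0" and port 1, while "mthca1" yields port 0.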

Pasha

nee...@crlindia.com wrote:


Hi all,

I have a cluster where both HCAs on each blade are active, but
connected to different subnets.
Is there an option in MPI to select one HCA out of the available
ones? I know it can be done by making changes in the Open MPI code, but I
need a clean interface, such as an option at MPI launch time, to select
mthca0 or mthca1.


Any help is appreciated. By the way, I just checked MVAPICH and the
feature is there.


Regards

Neeraj Chourasia (MTS)
Computational Research Laboratories Ltd.
(A wholly Owned Subsidiary of TATA SONS Ltd)
B-101, ICC Trade Towers, Senapati Bapat Road
Pune 411016 (Mah) INDIA
(O) +91-20-6620 9863  (Fax) +91-20-6620 9862
M: +91.9225520634









Re: [OMPI devel] [OMPI users] where can i get a tracing tool

2009-07-16 Thread Matthias Jurenz
Eugene,

great, thank you very much!

We now also have write access to the FAQ, so it's easier for us to make
changes to it.


Matthias

On Wed, 2009-07-15 at 09:17 -0700, Eugene Loh wrote:
> Done.  Hit "reload" on the URL below, check out an SVN repository, or 
> wait for these changes to be pushed to the live site.
> 
> Matthias Jurenz wrote:
> 
> >Could you also mention the tool 'otfprofile' under the section 7,
> >please?
> >
> >On Tue, 2009-07-14 at 18:54 -0700, Eugene Loh wrote:
> >  
> >
> >>P.S.  Until the page goes live, I'll also leave it at
> >>http://www.osl.iu.edu/~eloh/faq/?category=perftools .
> >>
> 
> 
-- 
Matthias Jurenz
Technische Universität Dresden
Center for Information Services and High Performance Computing (ZIH)
01062 Dresden, Germany

Phone : (+49) 351/463-31945
Fax   : (+49) 351/463-37773
e-mail: matthias.jur...@tu-dresden.de
WWW   : http://www.tu-dresden.de/zih




Re: [OMPI devel] OpenMPI, PLPA and Linux cpuset/cgroup support

2009-07-16 Thread Chris Samuel

- "Ralph Castain"  wrote:

> Looking at your command line, did you remember to set -mca  
> mpi_paffinity_alone 1?

Ahh, no, sorry, still feeling my way with this..

> If not, we won't set affinity on the processes.

Now it fails immediately with:

  Setting processor affinity failed
  --> Returned "Invalid argument" (-11) instead of "Success" (0)

wrapped in a bunch of OpenMPI messages explaining that it
failed on start.

The strace looks much the same as before.

[csamuel@tango047 CPI]$ fgrep affinity cpi-trace.txt
10853 execve("/usr/local/openmpi/1.3.3-gcc/bin/mpiexec", ["mpiexec", "-mca", 
"mpi_paffinity_alone", "1", "-npernode", "4", 
"/home/csamuel/Sources/Tests/CPI/"...], [/* 56 vars */]) = 0
10853 sched_getaffinity(0, 128,  { 3c }) = 8
10853 sched_setaffinity(0, 8,  { 0 })   = -1 EFAULT (Bad address)
10854 sched_getaffinity(0, 128,  <unfinished ...>
10854 <... sched_getaffinity resumed>  { 3c }) = 8
10854 sched_setaffinity(0, 8,  { 0 } <unfinished ...>
10854 <... sched_setaffinity resumed> ) = -1 EFAULT (Bad address)
10857 sched_getaffinity(0, 128,  <unfinished ...>
10857 <... sched_getaffinity resumed>  { 3c }) = 8
10857 sched_setaffinity(0, 8,  { 0 } <unfinished ...>
10857 <... sched_setaffinity resumed> ) = -1 EFAULT (Bad address)
10856 sched_getaffinity(0, 128,  <unfinished ...>
10856 <... sched_getaffinity resumed>  { 3c }) = 8
10856 sched_setaffinity(0, 8,  { 0 } <unfinished ...>
10856 <... sched_setaffinity resumed> ) = -1 EFAULT (Bad address)
10855 sched_getaffinity(0, 128,  <unfinished ...>
10855 <... sched_getaffinity resumed>  { 3c }) = 8
10855 sched_setaffinity(0, 8,  { 0 } <unfinished ...>
10855 <... sched_setaffinity resumed> ) = -1 EFAULT (Bad address)
10857 sched_setaffinity(10857, 8,  { 8 } <unfinished ...>
10857 <... sched_setaffinity resumed> ) = 0
10856 sched_setaffinity(10856, 8,  { 4 } <unfinished ...>
10856 <... sched_setaffinity resumed> ) = 0
10854 sched_setaffinity(10854, 8,  { 1 } <unfinished ...>
10854 <... sched_setaffinity resumed> ) = -1 EINVAL (Invalid argument)
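
One reading of the trace, offered as an assumption rather than a diagnosis: sched_getaffinity() reports mask 3c (CPUs 2-5) as the CPUs allowed in this cpuset, and the failing call asks for mask 1 (CPU 0), which lies outside it, while the calls with masks 4 and 8 succeed. Below is a minimal sketch of mapping a local rank onto the n-th allowed CPU instead of onto CPU n; nth_allowed_cpu is a hypothetical helper, not PLPA or Open MPI code.

```c
#include <assert.h>
#include <limits.h>

/* Return the CPU index of the n-th (0-based) set bit in the allowed
 * mask, or -1 if fewer than n+1 CPUs are allowed. */
static int nth_allowed_cpu(unsigned long allowed, int n)
{
    const int nbits = (int)(sizeof(allowed) * CHAR_BIT);
    int cpu;

    assert(n >= 0);
    for (cpu = 0; cpu < nbits; cpu++) {
        if (allowed & (1UL << cpu)) {
            if (n == 0) {
                return cpu;
            }
            n--;
        }
    }
    return -1;
}
```

With the mask from the trace (0x3c), local rank 0 would map to CPU 2 rather than CPU 0, so the requested CPU is always one that the cpuset permits.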


cheers,
Chris
-- 
Christopher Samuel - (03) 9925 4751 - Systems Manager
 The Victorian Partnership for Advanced Computing
 P.O. Box 201, Carlton South, VIC 3053, Australia
VPAC is a not-for-profit Registered Research Agency