Re: [OMPI devel] FreeBSD timer_base_open error?

2008-03-26 Thread George Bosilca
I was working off-list with Brad on this. Brian is right, the logic in
configure.m4 is wrong. It overwrites timer_linux_happy with "yes" if
the host matches "i?86-*|x86_64*|ia64-*|powerpc-*|powerpc64-*|sparc*-*".
On FreeBSD the host is i386-unknown-freebsd6.2, so it matches.


Here is a quick and dirty patch. I just moved the selection logic
around a little bit, without any major modifications.


  george.

Index: configure.m4
===
--- configure.m4  (revision 17970)
+++ configure.m4  (working copy)
@@ -40,14 +40,12 @@
  [timer_linux_happy="yes"],
  [timer_linux_happy="no"])])

-AS_IF([test "$timer_linux_happy" = "yes"],
-  [AS_IF([test -r "/proc/cpuinfo"],
- [timer_linux_happy="yes"],
- [timer_linux_happy="no"])])
-
case "${host}" in
i?86-*|x86_64*|ia64-*|powerpc-*|powerpc64-*|sparc*-*)
-timer_linux_happy="yes"
+AS_IF([test "$timer_linux_happy" = "yes"],
+  [AS_IF([test -r "/proc/cpuinfo"],
+ [timer_linux_happy="yes"],
+ [timer_linux_happy="no"])])
 ;;
*)
 timer_linux_happy="no"



On Mar 25, 2008, at 10:31 PM, Brian Barrett wrote:

On Mar 25, 2008, at 6:16 PM, Jeff Squyres wrote:

"linux" is the name of the component.  It looks like
opal/mca/timer/linux/timer_linux_component.c is doing some checks during
component open() and returning an error if it can't be used (e.g., if
it's not on linux).

The timer components are a little different than normal MCA
frameworks; they *must* be compiled in libopen-pal statically, and
there will only be one of them built.

In this case, I'm guessing that linux was built simply because nothing
else was selected to be built, but then its component_open() function
failed because it didn't find /proc/cpuinfo.



This is actually incorrect.  The linux component looks for
/proc/cpuinfo and builds if it finds that file.  There's a base component
that's built if nothing else is found.  The configure logic for the
linux component is probably not the right thing to do -- it should
probably be modified to check both that that file is readable (there
are systems that call themselves "linux" but don't have a
/proc/cpuinfo) and that we're actually on Linux.

Brian

--
  Brian Barrett

  There is an art . . . to flying. The knack lies in learning how to
  throw yourself at the ground and miss.
  Douglas Adams, 'The Hitchhikers Guide to the Galaxy'



___
devel mailing list
de...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/devel






[OMPI devel] RMAPS rank_file component patch and modifications for review

2008-03-26 Thread Lenny Verkhovsky
 

Hi, all

Attached is a patch for the modified Rank_File RMAPS component.

1. Introduced new general-purpose debug flags:
     mpi_debug
     opal_debug
2. Introduced a new MCA parameter, opal_paffinity_slot_list.
3. Cleaned the opal paffinity functions out of ompi_mpi_init.
4. Moved the opal paffinity functions to a new file,
   opal/mca/paffinity/base/paffinity_base_service.c.
5. Renamed the rank_file component files according to the prefix policy.
6. Renamed the global variables as well.
7. A few bug fixes that came up during previous discussions.
8. If the user defines opal_paffinity_alone together with
   rmaps_rank_file_path or opal_paffinity_slot_list, a warning is
   printed that only opal_paffinity_alone will be used.
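
The condition in item 8 can be read as a simple predicate. The sketch below is only an illustration of that reading (opal_paffinity_alone combined with either of the other two parameters); the function name and the way the values are passed in are hypothetical and are not taken from the patch.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: returns true when the warning from item 8 should be
 * printed, i.e. opal_paffinity_alone is set AND at least one of
 * rmaps_rank_file_path / opal_paffinity_slot_list is also set.
 * Unset string parameters are modelled as NULL or "". */
bool sketch_paffinity_conflict(bool paffinity_alone,
                               const char *rank_file_path,
                               const char *slot_list)
{
    bool has_rank_file = rank_file_path != NULL && rank_file_path[0] != '\0';
    bool has_slot_list = slot_list != NULL && slot_list[0] != '\0';

    return paffinity_alone && (has_rank_file || has_slot_list);
}
```

When the predicate is true, the caller would print the warning and carry on with opal_paffinity_alone only.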

 


Best Regards,

Lenny.

 



rank_file.patch
Description: rank_file.patch


Re: [OMPI devel] RMAPS rank_file component patch and modifications for review

2008-03-26 Thread Tim Prins

Hi Lenny,

This looks good. But I have a couple of suggestions (which others may 
disagree with):


1. You register an opal mca parameter, but look it up in ompi, then call
an opal function with the result. What if you had a function
opal_paffinity_base_set_slots(long rank) (or some other name, I don't
care) which looked up the mca parameter and then set up the slots as you
are doing if it is found? This would make things a bit cleaner IMHO.


2. The functions in the paffinity base should be prefixed with
'opal_paffinity_base_'.


3. Why was the ompi_debug_flag added? It is not used anywhere.

4. You probably do not need to add the opal debug flag. There is already 
a 'paffinity_base_verbose' flag which should suit your purposes fine. So 
you should just be able to replace all of the conditional output 
statements in paffinity with something like

opal_output_verbose(10, opal_paffinity_base_output, ...),
where 10 is the verbosity level number.
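
To make suggestion 1 concrete, here is a minimal sketch of the shape such a function could take. Everything prefixed with sketch_ is hypothetical. In particular, the parameter lookup is replaced by a read of the OMPI_MCA_opal_paffinity_slot_list environment variable, purely as a stand-in for the MCA parameter system, and the binding step only counts its calls.

```c
#include <stdio.h>
#include <stdlib.h>

/* Number of times the (stand-in) binding step ran; only here so the
 * behaviour of the sketch can be observed. */
int sketch_apply_calls = 0;

/* Stand-in for the MCA parameter lookup. Assumption: the value is taken
 * from the environment; real code would query the MCA parameter system. */
const char *sketch_lookup_slot_list(void)
{
    return getenv("OMPI_MCA_opal_paffinity_slot_list");
}

/* Stand-in for the code that binds the process to its slots. */
int sketch_apply_slot_list(long rank, const char *slot_list)
{
    ++sketch_apply_calls;
    printf("rank %ld -> slots %s\n", rank, slot_list);
    return 0;
}

/* Hypothetical shape of opal_paffinity_base_set_slots(long rank): the
 * lookup and the setup both live in opal, so callers in ompi only pass
 * the rank. Returns 0 on success or when the parameter is not set. */
int sketch_paffinity_base_set_slots(long rank)
{
    const char *slot_list = sketch_lookup_slot_list();

    if (slot_list == NULL || slot_list[0] == '\0') {
        return 0;   /* parameter not found: nothing to do */
    }
    return sketch_apply_slot_list(rank, slot_list);
}
```

With this layout the caller never sees the parameter name, so a later rename only has to be made in one place.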

Tim






[OMPI devel] Debug output

2008-03-26 Thread Ralph H Castain
My apologies - some timing debug output of mine inadvertently was included
in a commit. I am working now to correct that, but need to rebuild the
system to ensure the fix is correct.

Will remove that detailed output shortly.
Ralph




Re: [OMPI devel] RMAPS rank_file component patch and modifications for review

2008-03-26 Thread Ralph H Castain
I would tend to echo Tim's suggestions. I note that you also look up that
opal mca param in orte. I know you sent me a note about that off-list - I
apologize for not getting to it yet, but was swamped yesterday.

I think the solution suggested in #1 below is the right approach. Looking up
opal params in orte or ompi is probably not a good idea. We have had
problems in the past where params were looked up in multiple places as
people -do- sometimes change the names (ahem...).

Also, I would suggest using the macro version of the verbose output call,
OPAL_OUTPUT_VERBOSE, so that it compiles out for non-debug builds - up to
you. Many of us use it as we don't need the output from optimized builds.
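
For readers unfamiliar with the "compiles out" idea, here is a minimal sketch of how such a macro can be built. The names SKETCH_ENABLE_DEBUG, SKETCH_OUTPUT_VERBOSE and sketch_output_verbose are hypothetical stand-ins, not the actual OPAL definitions, and the second argument is a plain verbosity number rather than an output stream id.

```c
#include <stdarg.h>
#include <stdio.h>

/* Stand-in for a verbose output function: prints only when the requested
 * level does not exceed the current verbosity. Returns 1 if it printed,
 * 0 otherwise. */
int sketch_output_verbose(int level, int verbosity, const char *fmt, ...)
{
    va_list ap;

    if (level > verbosity) {
        return 0;
    }
    va_start(ap, fmt);
    vfprintf(stderr, fmt, ap);
    va_end(ap);
    fputc('\n', stderr);
    return 1;
}

/* Macro wrapper: in debug builds it forwards to the function; otherwise it
 * expands to nothing, so neither the call nor its arguments are compiled.
 * The argument list is passed in an extra pair of parentheses so that a
 * single-argument macro can carry a variable number of arguments. */
#ifdef SKETCH_ENABLE_DEBUG
#define SKETCH_OUTPUT_VERBOSE(args) sketch_output_verbose args
#else
#define SKETCH_OUTPUT_VERBOSE(args)
#endif
```

A call site would look like SKETCH_OUTPUT_VERBOSE((10, verbosity, "bound rank %ld", rank)); note the double parentheses.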

Other than that, I think this looks fine. I do truly appreciate the cleanup
of ompi_mpi_init.

Ralph



On 3/26/08 6:09 AM, "Tim Prins"  wrote:

> Hi Lenny,
> 
> This looks good. But I have a couple of suggestions (which others may
> disagree with):
> 
> 1. You register an opal mca parameter, but look it up in ompi, then call
> an opal function with the result. What if you had a function
> opal_paffinity_base_set_slots(long rank) (or some other name, I don't
> care) which looked up the mca parameter and then set up the slots as you
> are doing if it is found? This would make things a bit cleaner IMHO.
> 
> 2. The functions in the paffinity base should be prefixed with
> 'opal_paffinity_base_'.
> 
> 3. Why was the ompi_debug_flag added? It is not used anywhere.
> 
> 4. You probably do not need to add the opal debug flag. There is already
> a 'paffinity_base_verbose' flag which should suit your purposes fine. So
> you should just be able to replace all of the conditional output
> statements in paffinity with something like
> opal_output_verbose(10, opal_paffinity_base_output, ...),
> where 10 is the verbosity level number.
> 
> Tim
> 
> 




Re: [OMPI devel] Debug output

2008-03-26 Thread Ralph H Castain
Fixed with r17977 - again, my apologies


On 3/26/08 6:42 AM, "Ralph H Castain"  wrote:

> My apologies - some timing debug output of mine inadvertently was included
> in a commit. I am working now to correct that, but need to rebuild the
> system to ensure the fix is correct.
> 
> Will remove that detailed output shortly.
> Ralph
> 
> 




[OMPI devel] trunk segfault

2008-03-26 Thread Lenny Verkhovsky
Hi, all

I compiled and built the source from trunk, and it causes a segfault:

/home/USERS/lenny/OMPI_ORTE_NEW/bin/mpirun -np 1 -H witch17 /home/USERS/lenny/TESTS/ORTE/mpi_p01_NEW -t lt

--
It looks like MPI_INIT failed for some reason; your parallel process is
likely to abort.  There are many reasons that a parallel process can
fail during MPI_INIT; some of which are due to configuration or environment
problems.  This failure appears to be an internal failure; here's some
additional information (which may only be relevant to an Open MPI
developer):
  mca_mpi_register_params() failed
  --> Returned "Error" (-1) instead of "Success" (0)
--
[witch17:01220] *** Process received signal ***
[witch17:01220] Signal: Segmentation fault (11)
[witch17:01220] Signal code:  (128)
[witch17:01220] Failing at address: (nil)
[witch17:01220] [ 0] /lib64/libpthread.so.0 [0x2aadf7072c10]
[witch17:01220] [ 1] /home/USERS/lenny/OMPI_ORTE_NEW/lib/libopen-pal.so.0(free+0x56) [0x2aadf6acb6d6]
[witch17:01220] [ 2] /home/USERS/lenny/OMPI_ORTE_NEW/lib/libopen-pal.so.0(opal_argv_free+0x25) [0x2aadf6ab9635]
[witch17:01220] [ 3] /home/USERS/lenny/OMPI_ORTE_NEW/lib/libmpi.so.0 [0x2aadf67f4206]
[witch17:01220] [ 4] /home/USERS/lenny/OMPI_ORTE_NEW/lib/libmpi.so.0(MPI_Init+0xf0) [0x2aadf68117c0]
[witch17:01220] [ 5] /home/USERS/lenny/TESTS/ORTE/mpi_p01_NEW(main+0xef) [0x40109f]
[witch17:01220] [ 6] /lib64/libc.so.6(__libc_start_main+0xf4) [0x2aadf7199154]
[witch17:01220] [ 7] /home/USERS/lenny/TESTS/ORTE/mpi_p01_NEW [0x400ee9]
[witch17:01220] *** End of error message ***
--
mpirun noticed that process rank 0 with PID 1220 on node witch17 exited on
signal 11 (Segmentation fault).


Re: [OMPI devel] [OMPI svn] svn:open-mpi r17956

2008-03-26 Thread Josh Hursey

Jeff,

I think this commit is not quite correct. I'm working on a patch to
fix it at the moment, but just wanted to give a heads up for anyone
who is experiencing the same problem I am.


Before this commit I could set "opal_event_include=select" in  
my .openmpi/mca-params.conf file and the event engine would only use  
'select' for all OMPI/ORTE processes. This commit overrides this  
selection by forcing that all MPI apps use "all". I noticed the break  
since the FT builds (which require 'select' at the moment) were failing.


The fix might be as easy as checking whether the user specified
anything other than the default, and forcing "all" only if the user
did not define anything. Thoughts?
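
As a rough illustration of that idea (not the actual change; the function name and the way the user's value is obtained are hypothetical), the decision could look like this:

```c
#include <stddef.h>
#include <string.h>

/* Decide which value opal_event_include should end up with.
 *   user_value    - what the user configured, or NULL if nothing was set
 *   default_value - the built-in default (for example "poll")
 * The user's choice wins whenever it differs from the default; otherwise
 * the value is forced to "all". */
const char *sketch_pick_event_include(const char *user_value,
                                      const char *default_value)
{
    if (user_value != NULL && strcmp(user_value, default_value) != 0) {
        return user_value;   /* e.g. "select" from mca-params.conf */
    }
    return "all";            /* nothing user-specified: safe to force */
}
```

With "select" in mca-params.conf this returns "select", which is what the FT builds need; with nothing set it returns "all".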


-- Josh

On Mar 25, 2008, at 1:18 PM, jsquy...@osl.iu.edu wrote:


Author: jsquyres
Date: 2008-03-25 13:18:17 EDT (Tue, 25 Mar 2008)
New Revision: 17956
URL: https://svn.open-mpi.org/trac/ompi/changeset/17956

Log:
Fix #1253: default libevent to use select/poll and only use the other
mechanisms (such as epoll) if someone (ompi_mpi_init()) requests
otherwise.  See big comment in opal/event/event.c for a full
explanation.

Text files modified:
  trunk/ompi/runtime/ompi_mpi_init.c |    32 +++-
  trunk/opal/event/event.c           |    37 -
  trunk/orte/orted/orted_main.c      |    31 ---
  3 files changed, 59 insertions(+), 41 deletions(-)

Modified: trunk/ompi/runtime/ompi_mpi_init.c
===

--- trunk/ompi/runtime/ompi_mpi_init.c  (original)
+++ trunk/ompi/runtime/ompi_mpi_init.c  2008-03-25 13:18:17 EDT (Tue, 25 Mar 2008)
@@ -234,15 +234,37 @@
/* see comment below about sched_yield */
int num_processors;
#endif
-
-/* Join the run-time environment - do the things that don't hit
-   the registry */

-if (ORTE_SUCCESS != (ret = opal_init())) {
-error = "ompi_mpi_init: opal_init failed";
+/* Setup enough to check get/set MCA params */
+
+if (ORTE_SUCCESS != (ret = opal_init_util())) {
+error = "ompi_mpi_init: opal_init_util failed";
goto error;
}

+/* _After_ opal_init_util() but _before_ orte_init(), we need to
+   set an MCA param that tells libevent that it's ok to use any
+   mechanism in libevent that si available on this platform (e.g.,
+   epoll and friends).  Per opal/event/event.s, we default to
+   select/poll -- but we know that MPI processes won't be using
+   pty's with the event engine, so it's ok to relax this
+   constraint and let any fd-monitoring mechanism be used. */
+ret = mca_base_param_reg_string_name("opal", "event_include",
+ "Internal orted MCA param: tell opal_init() to use a specific mechanism in libevent",
+ false, false, "all", NULL);
+if (ret >= 0) {
+/* We have to explicitly "set" the MCA param value here
+   because libevent initialization will re-register the MCA
+   param and therefore override the default. Setting the value
+   here puts the desired value ("all") in different storage
+   that is not overwritten if/when the MCA param is
+   re-registered.  Note that we do *NOT* set this value as an
+   environment variable, just so that it won't be inherited by
+   any spawned processes and potentially cause unintented
+   side-effects with launching ORTE tools... */
+mca_base_param_set_string(ret, "all");
+}
+
/* check to see if we want timing information */
param = mca_base_param_reg_int_name("ompi", "timing",
"Request that critical timing loops be measured",


Modified: trunk/opal/event/event.c
===

--- trunk/opal/event/event.c  (original)
+++ trunk/opal/event/event.c  2008-03-25 13:18:17 EDT (Tue, 25 Mar 2008)
@@ -256,15 +256,42 @@

#if OPAL_HAVE_WORKING_EVENTOPS

-/**
- * Retrieve the upper level specified event system, if any.
+/* Retrieve the upper level specified event system, if any.
+ * Default to select() on OS X and poll() everywhere else because
+ * various parts of OMPI / ORTE use libevent with pty's.  pty's
+ * *only* work with select on OS X (tested on Tiger and Leopard);
+ * we *know* that both select and poll works with pty's everywhere
+ * else we care about (other mechansisms such as epoll *may* work
+ * with pty's -- we have not tested comprehensively with newer
+ * versions of Linux, etc.).  So the safe thing to do is:
+ *
+ * - On OS X, default to using "select" only
+ * - Everywhere else, default to using "poll" only (because poll
+ *   is more scalable than select)
+ *
+ * An upper layer may override

Re: [OMPI devel] [OMPI svn] svn:open-mpi r17956

2008-03-26 Thread Josh Hursey
My fix is in r17980. I did some limited testing with and without C/R  
and things look fine. Wider testing may be in order, but I think MTT  
should take care of that this evening.


Cheers,
Josh


Re: [OMPI devel] FreeBSD timer_base_open error?

2008-03-26 Thread Brian W. Barrett

George -

Good catch -- that's going to cause a problem :).  But I think we should 
add yet another check to also make sure that we're on Linux.  So the three 
tests would be:


  1) Am I on a platform that we have timer assembly support for?
 (That's the long list of architectures that we recently,
 and incorrectly, added).
  2) Am I on Linux (since we really only know how to parse
 /proc/cpuinfo on Linux)
  3) Is /proc/cpuinfo readable (Because we have a couple architectures
 that are reported by config.guess as Linux, but don't have
 /proc/cpuinfo).

Make sense?
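
The three tests can be sketched outside of m4 as follows. This is only an illustration in C of the combined condition, not the configure logic itself; the function names are hypothetical, and "am I on Linux" is approximated by a substring match on the host triplet.

```c
#include <assert.h>
#include <fnmatch.h>
#include <string.h>
#include <unistd.h>

/* Test 1: is the host triplet one of the architectures from the
 * configure.m4 case statement? */
int sketch_arch_has_timer_support(const char *host)
{
    static const char *patterns[] = {
        "i?86-*", "x86_64*", "ia64-*", "powerpc-*", "powerpc64-*", "sparc*-*"
    };
    size_t i;

    assert(host != NULL);
    for (i = 0; i < sizeof(patterns) / sizeof(patterns[0]); ++i) {
        if (fnmatch(patterns[i], host, 0) == 0) {
            return 1;
        }
    }
    return 0;
}

/* Test 2: does the host triplet name Linux? (Assumption: a substring
 * match is good enough for this sketch.) */
int sketch_host_is_linux(const char *host)
{
    assert(host != NULL);
    return strstr(host, "linux") != NULL;
}

/* Test 3 and the combination: the component is usable only if all three
 * checks pass. The cpuinfo path is a parameter so it can be varied. */
int sketch_timer_linux_happy(const char *host, const char *cpuinfo_path)
{
    return sketch_arch_has_timer_support(host)
        && sketch_host_is_linux(host)
        && access(cpuinfo_path, R_OK) == 0;
}
```

For the FreeBSD host from this thread, i386-unknown-freebsd6.2, test 1 passes but test 2 fails, so the linux timer component would not be selected.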

Brian





Re: [OMPI devel] [OMPI svn] svn:open-mpi r17956

2008-03-26 Thread Jeff Squyres

Sorry about that, Josh -- thanks for fixing it.

I added one more very minor change on top of r17980.



Re: [OMPI devel] [OMPI svn] svn:open-mpi r17956

2008-03-26 Thread Josh Hursey

No worries. Thanks for the adjustment.

Cheers,
Josh


Re: [OMPI devel] [OMPI svn-full] svn:open-mpi r17983

2008-03-26 Thread Jeff Squyres

What's Interix?

On Mar 26, 2008, at 7:20 PM, bosi...@osl.iu.edu wrote:

Author: bosilca
Date: 2008-03-26 19:20:33 EDT (Wed, 26 Mar 2008)
New Revision: 17983
URL: https://svn.open-mpi.org/trac/ompi/changeset/17983

Log:
Add support for Interix.

Added:
  trunk/config/ompi_interix.m4   (contents, props changed)
Text files modified:
  trunk/acinclude.m4 | 1 +
  trunk/configure.ac | 3 +++
  2 files changed, 4 insertions(+), 0 deletions(-)

Modified: trunk/acinclude.m4
==============================================================================
--- trunk/acinclude.m4  (original)
+++ trunk/acinclude.m4  2008-03-26 19:20:33 EDT (Wed, 26 Mar 2008)
@@ -108,6 +108,7 @@
# Include the macros for Windows checking
#
m4_include(config/ompi_microsoft.m4)
+m4_include(config/ompi_interix.m4)

#
# The config/mca_no_configure_components.m4 file is generated by

Added: trunk/config/ompi_interix.m4
==============================================================================
--- (empty file)
+++ trunk/config/ompi_interix.m4  2008-03-26 19:20:33 EDT (Wed, 26 Mar 2008)

@@ -0,0 +1,56 @@
+dnl -*- shell-script -*-
+dnl
+dnl Copyright (c)  2008 The University of Tennessee and The University
+dnl of Tennessee Research Foundation.  All rights
+dnl reserved.
+dnl $COPYRIGHT$
+dnl
+dnl Additional copyrights may follow
+dnl
+dnl $HEADER$
+dnl
+
+##
+#
+# OMPI_INTERIX
+#
+# Detect if the environment is SUA/SFU (i.e. Interix) and modify
+# the compiling environment accordingly.
+#
+# USAGE:
+#   OMPI_INTERIX()
+#
+##
+AC_DEFUN([OMPI_INTERIX],[
+
+AC_MSG_CHECKING(for Interix environment)
+AC_TRY_COMPILE([],
+   [#if !defined(__INTERIX)
+#error Normal Unix environment
+#endif],
+   is_interix=yes,
+   is_interix=no)
+AC_MSG_RESULT([$is_interix])
+if test "$is_interix" = "yes"; then
+
+ompi_show_subtitle "Interix detection"
+
+if ! test -d /usr/include/port; then
+AC_MSG_WARN([Compiling Open MPI under Interix require an up-to-date])
+AC_MSG_WARN([version of libport. Please ask your system administrator])
+AC_MSG_WARN([to install it (pkg_update -L libport).])
+AC_MSG_ERROR([*** Cannot continue])
+fi
+#
+# These are the minimum requirements for Interix ...
+#
+AC_MSG_WARN([-lport was added to the linking flags])
+LDFLAGS="-lport $LDFLAGS"
+AC_MSG_WARN([-D_ALL_SOURCE -D_USE_LIBPORT was added to the compilation flags])
+CFLAGS="-D_ALL_SOURCE -D_USE_LIBPORT -I/usr/include/port $CFLAGS"
+CPPFLAGS="-D_ALL_SOURCE -D_USE_LIBPORT -I/usr/include/port $CPPFLAGS"
+CXXFLAGS="-D_ALL_SOURCE -D_USE_LIBPORT -I/usr/include/port $CXXFLAGS"
+
+fi
+
+])

Modified: trunk/configure.ac
==============================================================================
--- trunk/configure.ac  (original)
+++ trunk/configure.ac  2008-03-26 19:20:33 EDT (Wed, 26 Mar 2008)
@@ -192,6 +192,9 @@
AM_CONDITIONAL(OMPI_NEED_WINDOWS_REPLACEMENTS,
   test "$ompi_cv_c_compiler_vendor" = "microsoft" )

+# Do all Interix detections if necessary
+OMPI_INTERIX
+
# Does the compiler support "ident"-like constructs?

OMPI_CHECK_IDENT([CC], [CFLAGS], [c], [C])
___
svn-full mailing list
svn-f...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/svn-full



--
Jeff Squyres
Cisco Systems



Re: [OMPI devel] [OMPI svn-full] svn:open-mpi r17983

2008-03-26 Thread George Bosilca
Interix (also known as SUA or SFU) is the POSIX layer integrated with the
latest versions of Windows (such as Vista and Server 2003). It provides
fork, rsh, and basically most of the tools we need.
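
For anyone who wants to check a machine before running configure, here is
a rough shell sketch of what the detection amounts to. It is not part of
r17983: the commit tests the __INTERIX preprocessor macro through
AC_TRY_COMPILE, whereas this sketch assumes that `uname -s` reports a name
beginning with "Interix" on such systems. The function names is_interix
and interix_cppflags are hypothetical; only the flag values are taken from
the patch.

```shell
#!/bin/sh
# Hypothetical helper (not in the commit): classify a system name.
# With no argument it falls back to `uname -s`; the assumption is that
# Interix/SUA/SFU reports a name starting with "Interix".
is_interix() {
    case "${1:-$(uname -s)}" in
        Interix*) echo yes ;;
        *)        echo no ;;
    esac
}

# Print the extra preprocessor flags that OMPI_INTERIX prepends when
# Interix is detected; print nothing on any other system.
interix_cppflags() {
    if [ "$(is_interix "$1")" = "yes" ]; then
        echo "-D_ALL_SOURCE -D_USE_LIBPORT -I/usr/include/port"
    fi
}

echo "Interix environment: $(is_interix)"
```

Passing the system name as an argument keeps the logic testable on a
machine that is not Interix; the real macro has no such indirection
because it asks the compiler directly.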


 george.

Jeff Squyres wrote:

What's Interix?
