Are there API functions or data structures that can be used to determine if the 1-to-many model is supported on the system?

More specifically: can you have your configure.m4 script check whether the current system a) supports SCTP, and b) if so, whether it supports 1-to-many? This kind of checking would theoretically allow running on Solaris, but automatically default to the 1-to-1 mode (if your BTL supports that).

This also falls in line with the autoconf mantra: test for the desired behavior, not the desired platform (because the list of supported platforms may change over time). :-)
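
As a rough illustration of testing for behavior rather than platform, here is a minimal C sketch of the kind of probe a configure test could compile and run. It relies only on the standard socket() call; the function names sctp_socket_works and probe_sctp_mode are hypothetical and not part of Open MPI. It assumes the usual SCTP sockets convention that SOCK_SEQPACKET selects the one-to-many style and SOCK_STREAM the one-to-one style. Note that it only checks socket creation: a stack could still reject a later call such as sctp_sendmsg on a one-to-many socket, so a complete test would have to exercise that as well.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <unistd.h>

#ifndef IPPROTO_SCTP
#define IPPROTO_SCTP 132  /* IANA-assigned protocol number for SCTP */
#endif

/* Hypothetical helper: returns 1 if an SCTP socket of the given
 * type can be created on this system, 0 otherwise. */
int sctp_socket_works(int sock_type)
{
    int fd = socket(AF_INET, sock_type, IPPROTO_SCTP);
    if (fd < 0) {
        return 0;  /* no SCTP, or this socket style is unsupported */
    }
    close(fd);
    return 1;
}

/* Hypothetical helper: 2 = one-to-many sockets can be created,
 * 1 = only one-to-one sockets can be created, 0 = no SCTP sockets. */
int probe_sctp_mode(void)
{
    if (sctp_socket_works(SOCK_SEQPACKET)) {
        return 2;
    }
    if (sctp_socket_works(SOCK_STREAM)) {
        return 1;
    }
    return 0;
}
```

A configure check could use a result like this to decide whether to build the component at all, and whether to default to 1-to-1 mode.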


On Nov 14, 2007, at 1:17 PM, Brad Penoff wrote:

On Nov 14, 2007 5:11 AM, Terry Dontje <terry.don...@sun.com> wrote:

Brad Penoff wrote:
On Nov 12, 2007 3:26 AM, Jeff Squyres <jsquy...@cisco.com> wrote:

I have no objections to bringing this into the trunk, but I agree that
an .ompi_ignore is probably a good idea at first.


I'll try to cook up a commit soon then!


One question that I'd like to have answered is how OMPI decides
whether to use the SCTP BTL or not.  If there are SCTP stacks
available by default in Linux and OS X -- but their performance may be
sub-optimal and/or buggy, we may want to have the SCTP BTL only
activated if the user explicitly asks for it.  Open MPI is very
concerned with "out of the box" behavior -- we need to ensure that
"mpirun a.out" will "just work" on all of our supported platforms.


Just to make a few things explicit...

Things would only work out of the box on FreeBSD, and there the stack
is very good.

We have less experience with the Linux stack but hope the availability of an SCTP BTL will help encourage its use by us and others. Now it
is a module by default (loaded with "modprobe sctp") but the actual
SCTP sockets extension API needs to be downloaded and installed
separately.  The so-called lksctp-tools can be obtained here:
http://sourceforge.net/project/showfiles.php?group_id=26529

The OS X stack does not come by default but instead is a kernel extension:
http://sctp.fh-muenster.de/sctp-nke.html
I haven't yet started this testing but intend to soon.  As of now
though, the supplied configure.m4 does not try to even build the
component on Mac OS X.

So in my opinion, things in the configure scripts should be fine the
way they are since only the FreeBSD stack (which we have confidence in)
will try to work out of the box; the others require the user to
install things.


Greetings,

I am gathering from the text above you haven't tried your BTL on Solaris
at all.

The short answer to that is correct, we haven't tried the Open MPI
SCTP BTL yet on Solaris.  In fact, the configure.m4 file checks the
$host value and only tries to build if it's on Linux or a BSD variant.
Mac OS X uses the same code as BSD but I have only just got my hands
on a machine so even it hasn't been tested yet; Solaris remains on the
TODO list.

However, there's a slightly longer answer...

After a series of emails with the Sun SCTP people
(sctp-questi...@sun.com but mostly Kacheong Poon) a year ago, I
learned SCTP support is within Solaris 10 by default.  In general,
SCTP supports its own socket API, in addition to the standard Berkeley
sockets API; the SCTP-specific sockets API unlocks some of SCTP's
newer features (e.g., multistreaming).  We make use of this
SCTP-specific sockets API.

The Solaris stack (as of a year ago) made certain assumptions about
the SCTP-specific sockets API.  I'm just looking back on those emails
now to refresh my memory... it looks like on the Solaris stack as of
Nov 2006, it did not allow the use of one-to-many sockets (the current
default in our BTL) together with the sctp_sendmsg call.  They
mentioned an alternative but we didn't have the time to explore it.
I'm not sure if this has changed on the Solaris stack within the past
year... I never got the time to revisit this.

In the past, we had mostly used the one-to-many socket (with our LAM
and MPICH2 versions).  One unique thing about this Open MPI SCTP BTL
is that there is also a choice to make use of (the more TCP-like)
one-to-one socket style.  The socket style used by the SCTP BTL is
adjustable with the MCA parameter btl_sctp_if_11 (if set to 1, it uses
1-1 sockets; by default it is 0 and uses 1-many).  I've never used
one-to-one sockets on the Solaris stack, but it may have a better
chance of working (also one-to-many may work now; I haven't kept
up-to-date).
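
To make the two socket styles concrete, here is a small C sketch of how a flag like btl_sctp_if_11 could map onto the socket type passed to socket(). The function name sctp_socket_type_for is hypothetical, and treating any nonzero value the same as 1 is an assumption; the actual BTL code may handle this differently.

```c
#include <sys/types.h>
#include <sys/socket.h>

/* Hypothetical helper: if_11 mirrors the btl_sctp_if_11 value.
 * Nonzero selects the TCP-like one-to-one style (SOCK_STREAM);
 * zero selects the one-to-many style (SOCK_SEQPACKET). */
int sctp_socket_type_for(int if_11)
{
    return if_11 ? SOCK_STREAM : SOCK_SEQPACKET;
}
```

The returned value would then be used as the second argument of socket(AF_INET, type, IPPROTO_SCTP).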

We also noticed that on Solaris we had to do some things a little
differently with iovecs because the struct msghdr (used by sendmsg) had
no msg_control field; to get around this, we had to pack the iovec's
contents into a buffer and send that buffer instead of using the iovec
directly.
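
The packing step could look roughly like the following C sketch. The function name pack_iovec is hypothetical and this is not the BTL's actual code; it only shows the general idea of copying every iovec segment into one contiguous buffer, which could then be handed to a send call that takes a single buffer.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>
#include <sys/uio.h>

/* Hypothetical helper: copies all iovcnt segments into one newly
 * allocated buffer and stores the total length in *total_out.
 * Returns NULL if there is nothing to copy or allocation fails.
 * The caller is responsible for calling free() on the result. */
char *pack_iovec(const struct iovec *iov, int iovcnt, size_t *total_out)
{
    size_t total = 0;
    for (int i = 0; i < iovcnt; i++) {
        total += iov[i].iov_len;
    }
    *total_out = total;
    if (total == 0) {
        return NULL;
    }

    char *buf = malloc(total);
    if (buf == NULL) {
        return NULL;
    }

    size_t offset = 0;
    for (int i = 0; i < iovcnt; i++) {
        if (iov[i].iov_len > 0) {
            memcpy(buf + offset, iov[i].iov_base, iov[i].iov_len);
            offset += iov[i].iov_len;
        }
    }
    return buf;
}
```

The extra copy costs some bandwidth on large messages, which is the trade-off for not depending on the scatter/gather path.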

Anyway, hope this fully answers your questions.  In general, it'd be
nice if we have the time/assistance to add in Solaris support
eventually.

brad


--td

_______________________________________________
devel mailing list
de...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/devel




--
Jeff Squyres
Cisco Systems
