Re: [OMPI devel] IPv6 support in OpenMPI?

2006-03-31 Thread Adrian Knoth
On Fri, Mar 31, 2006 at 10:44:11AM +0200, Christian Kauhaus wrote: > Hello *, Hi. > University of Jena (Germany). Our work group is digging into how to > connect several clusters on a campus. I think I'm also a member of this workgroup, though I am not working at University of Jena, but studyi

Re: [OMPI devel] IPv6 support in OpenMPI?

2006-03-31 Thread Adrian Knoth
On Fri, Mar 31, 2006 at 09:07:39AM -0500, Brian Barrett wrote: > > I have a first quick and dirty patch, replacing AF_INET by AF_INET6, > > the sockaddr_in structs and so on. > Is there a way to do this to better support both IPv4 and IPv6? I think so, too. There are probably two different ways t

Re: [OMPI devel] IPv6 support in OpenMPI?

2006-03-31 Thread Adrian Knoth
On Fri, Mar 31, 2006 at 09:36:31AM -0500, Jeff Squyres (jsquyres) wrote: > I have no personal experience with IPv6, but one thought that strikes me > is that the components might be able to figure out what to do by looking > at/parsing either the hostnames or the results that come back from > reso

Re: [OMPI devel] IPv6 support in OpenMPI?

2006-03-31 Thread Adrian Knoth
On Fri, Mar 31, 2006 at 05:21:42PM +0200, Ralf Wildenhues wrote: > > Perhaps it's a good idea to port any internal structure to > > IPv6, as it is able to represent the whole v4 namespace. > > One can always determine whether it is a real v6 or only > > a mapped v4 address (the common ::: pref
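A minimal sketch of the check described in this message, written in Python rather than Open MPI's C: an IPv4-mapped address lives in the well-known ::ffff:0:0/96 range, so it can be told apart from a native IPv6 address after parsing. The function name classify_address is hypothetical and not taken from the Open MPI sources; the sample addresses are documentation-range values.

```python
import ipaddress


def classify_address(text):
    """Return 'v4', 'mapped-v4' or 'v6' for a literal IP address string."""
    ip = ipaddress.ip_address(text)  # raises ValueError for non-addresses
    if ip.version == 4:
        return "v4"
    # ipv4_mapped is an IPv4Address for ::ffff:a.b.c.d and None otherwise
    return "mapped-v4" if ip.ipv4_mapped is not None else "v6"


if __name__ == "__main__":
    for sample in ("192.0.2.1", "::ffff:192.0.2.1", "2001:db8::1"):
        print(sample, "->", classify_address(sample))
```

Storing every address in IPv6 form and running a check like this only where the distinction matters is one way to read the proposal; whether it suits the code base is exactly what the thread goes on to discuss.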

Re: [OMPI devel] IPv6 support in OpenMPI?

2006-03-31 Thread Adrian Knoth
On Fri, Mar 31, 2006 at 05:55:28PM +0200, Ralf Wildenhues wrote: > Have not: > HP-UX 11.00 HPUX 11iv2 has, for the early HPUX-11 versions there is TOUR (Transport Optional Upgrade Release) -- mail: a...@thur.de http://adi.thur.de PGP: v2-key via keyserver Schlecht: Du kannst deine

Re: [OMPI devel] IPv6 support in OpenMPI?

2006-03-31 Thread Adrian Knoth
On Fri, Mar 31, 2006 at 11:06:55AM -0800, Brooks Davis wrote: > > One little problem here is that it is possible to disable the > > IPv6-mapped IPv4 addresses at least under Linux and some BSD variants. > > For Linux, have a look at sys.net.ipv6.bindv6only. Some authors even > More specifically,
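Besides the system-wide setting mentioned here, the same behaviour can be requested per socket with the IPV6_V6ONLY option. The following is a sketch under the assumption that the platform exposes that option; it is written in Python for brevity, and the helper name make_v6only_socket is hypothetical.

```python
import socket


def make_v6only_socket():
    """Create a TCP/IPv6 socket that will not handle IPv4-mapped traffic.

    Returns None when IPv6 or the IPV6_V6ONLY option is unavailable.
    """
    if not socket.has_ipv6 or not hasattr(socket, "IPV6_V6ONLY"):
        return None
    try:
        sock = socket.socket(socket.AF_INET6, socket.SOCK_STREAM)
    except OSError:
        return None
    try:
        # 1 = IPv6 only; 0 would allow IPv4 peers as ::ffff:a.b.c.d
        sock.setsockopt(socket.IPPROTO_IPV6, socket.IPV6_V6ONLY, 1)
    except OSError:
        sock.close()
        return None
    return sock
```

Setting the option explicitly makes a program independent of the administrator's default, which is why code that wants predictable behaviour usually does not rely on the sysctl alone.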

[OMPI devel] How to test OpenMPI?

2006-05-02 Thread Adrian Knoth
Hi, as already mentioned some weeks ago, we plan to provide IPv6 support for OpenMPI. Before touching the code, we'd like to have a test environment to ensure we don't break anything. There is a test/ directory, but the tests inside seem to be very basic, no network testing or anything running lon

[OMPI devel] Building ompi occasionally touches the source files

2006-07-17 Thread Adrian Knoth
Hi, I have a bunch of boxes used to test and compile OMPI (we're talking about the openmpi-1.1 release). Two of them are Debian sarge (current stable), two are Debian testing (i386+amd64) and one is Debian unstable (amd64) The source is shared via svn, so it's for sure all are using the same cod

Re: [OMPI devel] Building ompi occasionally touches the source files

2006-07-18 Thread Adrian Knoth
On Tue, Jul 18, 2006 at 12:34:21PM +0200, Christian Kauhaus wrote: > >b) fails to complete (see attachment), the errors are all > > related to lex. > What are the flex versions used on these systems? On Debian stable it is > flex 2.5.31 and on my Gentoo box it is flex 2.5.33, both giving

Re: [OMPI devel] Building ompi occasionally touches the source files

2006-07-20 Thread Adrian Knoth
On Mon, Jul 17, 2006 at 10:05:05PM +0200, Adrian Knoth wrote: Hi, > The source is shared via svn, so it's for sure all are using the > same code. > 2. If compiling inside my directory layout, the build > > a) changes the following two files in trunk/src/ > > a

Re: [OMPI devel] OpenMPI not conforming with the C90 spec?

2006-08-19 Thread Adrian Knoth
On Thu, Aug 17, 2006 at 11:48:44PM +0100, Jonathan Underwood wrote: > Hi, Hi! > Compiling a file with the gcc options -Wall and -pedantic gives the > following warning: > mpi.h:147: warning: ISO C90 does not support 'long long' > Is this intentional, or is this a bug? If you do not insist on us

[OMPI devel] A few notes on IPv6 status

2006-08-19 Thread Adrian Knoth
Hi, as mentioned earlier this year, I'm now working on IPv6 support for OpenMPI. The main design goals are: - do not break existing IPv4 code - compile on SUSv2 (without new socket API) - do not use mapped addresses - test the new code on many systems The porting of OPAL is more or

Re: [OMPI devel] A few notes on IPv6 status

2006-08-21 Thread Adrian Knoth
On Sat, Aug 19, 2006 at 11:07:26PM +0200, Adrian Knoth wrote: > Hi, Hi! > Do you agree with a resulting URL like tcp://[2001:6f8::1]:port or > do you think it should be tcp6://? I've changed this to tcp6://, because orte/mca/oob/tcp/oob_tcp.c contains the following lines: /
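The brackets in the proposed URL are what keep the colons of the IPv6 address apart from the colon before the port. A small sketch of parsing both forms with Python's standard library follows; the function name parse_contact_uri and the port number 4321 are made up for illustration and do not come from oob_tcp.c.

```python
from urllib.parse import urlsplit


def parse_contact_uri(uri):
    """Split 'tcp://host:port' or 'tcp6://[v6addr]:port' into its parts."""
    parts = urlsplit(uri)
    if parts.scheme not in ("tcp", "tcp6"):
        raise ValueError("unexpected scheme: %r" % parts.scheme)
    # hostname strips the square brackets around an IPv6 literal
    return parts.scheme, parts.hostname, parts.port


if __name__ == "__main__":
    print(parse_contact_uri("tcp6://[2001:6f8::1]:4321"))
    print(parse_contact_uri("tcp://192.0.2.7:4321"))
```

With a separate tcp6:// scheme the reader of the URI knows the address family before looking at the host part, which may be the property the existing parsing code relies on.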

[OMPI devel] First IPv6 communication with ORTE

2006-08-24 Thread Adrian Knoth
Hi, I'm glad to announce the first IPv6 launch of orted: tcp6 0960 2001:638:906:2:20:43810 2001:638:906:2::1:43421 ESTABLISHED 18368/orted Unit testing discovered the relevant bugs. They're now fixed and it's actually working. Who'd ever guess this? ;) I'm going to prepare some deve

[OMPI devel] [IPv6] new component oob/tcp6

2006-09-01 Thread Adrian Knoth
Hi, yesterday I felt impelled to create a new ORTE oob component: tcp6. I was able to either compile the library with IPv4 or IPv6 support, but not with both (so to say: two different ompi installations or at least two different DSO versions). As far as I can see, many functions use mca_oob_tcp_

Re: [OMPI devel] [IPv6] new component oob/tcp6

2006-09-01 Thread Adrian Knoth
On Fri, Sep 01, 2006 at 07:01:25AM -0600, Ralph Castain wrote: > > Do you agree to go on with two oob components, tcp and tcp6? > Yes, I think that's the right approach It's a deal. ;) > I think this can be supported nicely in the framework system. All we > have to do is set the IPv6 component's

Re: [OMPI devel] [IPv6] new component oob/tcp6

2006-09-06 Thread Adrian Knoth
On Wed, Sep 06, 2006 at 05:44:23PM +0200, Christian Kauhaus wrote: > Our current plan is to look into the hostfile and see if there are > > (1a) just IPv4 addresses > (1b) IPv4 addresses and hostnames for which 'A' queries can be resolved > (2a) just IPv6 addresses > (2b) IPv6 addresses and host
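A first step towards the hostfile inspection described here could be to sort entries into literal IPv4 addresses, literal IPv6 addresses, and hostnames that still need a DNS lookup. This is only a sketch: the names entry_kind and summarize_hostfile are hypothetical, the sample entries are invented, and the name resolution itself (the 'A' queries) is deliberately left out.

```python
import ipaddress


def entry_kind(entry):
    """Classify one hostfile entry as 'ipv4', 'ipv6' or 'hostname'."""
    try:
        ip = ipaddress.ip_address(entry)
    except ValueError:
        return "hostname"  # not a literal address, would need resolving
    return "ipv4" if ip.version == 4 else "ipv6"


def summarize_hostfile(entries):
    """Return the set of entry kinds present in the hostfile."""
    return {entry_kind(e) for e in entries}


if __name__ == "__main__":
    print(summarize_hostfile(["192.0.2.1", "node1.example.org", "2001:db8::2"]))
```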

Re: [OMPI devel] [IPv6] new component oob/tcp6

2006-09-07 Thread Adrian Knoth
On Thu, Sep 07, 2006 at 11:46:28AM -0400, Jeff Squyres wrote: > > On Fri, Sep 01, 2006 at 07:01:25AM -0600, Ralph Castain wrote: > > > >>> Do you agree to go on with two oob components, tcp and tcp6? > >> Yes, I think that's the right approach > > > > It's a deal. ;) > Actually, I would disagree

Re: [OMPI devel] [IPv6] new component oob/tcp6

2006-09-07 Thread Adrian Knoth
On Thu, Sep 07, 2006 at 07:51:28PM +0200, Adrian Knoth wrote: > No problem, just two hours ago, Christian and I decided to drop > the idea of oob/tcp6 and go on with only one oob-tcp-component. > It shouldn't be that hard and I'll try it tonight or tomorrow. Looks quite pro

[OMPI devel] [IPv6] ORTE layer working

2006-09-12 Thread Adrian Knoth
Hi, I'm glad to announce a first working version of IPv4+IPv6 orte. It contains: - IPv6 interface discovery on Linux - a single orte/mca/oob/tcp component - a single module (no multiple instances) - two listening sockets - two connecting sockets The listening sockets always stay

Re: [OMPI devel] [IPv6] ORTE layer working

2006-09-22 Thread Adrian Knoth
On Tue, Sep 12, 2006 at 05:44:49PM +0200, Adrian Knoth wrote: > I'm glad to announce a first working version of IPv4+IPv6 orte. > > It contains: >- IPv6 interface discovery on Linux >- a single orte/mca/oob/tcp component >- a single module (no multiple instance

[OMPI devel] IPv6 in btl/tcp

2006-10-11 Thread Adrian Knoth
Hi, this mail starts like all the others before ;): I'm glad to announce a first working version of btl/tcp with both, IPv4 and IPv6 support. adi@ipc654:~/ompi/trunk/test$ ruby ringtest.rb Loaded suite ringtest Started 0: sending message (0) to 1 1: got message (1) from 0, sending to 2 2: got m

Re: [OMPI devel] IPv6 in btl/tcp

2006-10-16 Thread Adrian Knoth
On Wed, Oct 11, 2006 at 11:28:13PM +0200, Adrian Knoth wrote: > The ringtest also works fine in plain IPv4 environments and > mixed environments within the same cluster. It fails on > mixed multi-cluster setups and heterogeneous OSs, but I'm > going to fix these issues on Satur

Re: [OMPI devel] IPv6 in btl/tcp

2006-10-17 Thread Adrian Knoth
On Mon, Oct 16, 2006 at 07:22:12PM -0600, Brian Barrett wrote: > I just committed some code in the TCP OOB component to deal with > packing / unpacking sockaddr_in structures for cases where there is > different heterogeneity / padding. I think it's going to require > some work to make it I

[OMPI devel] New oob/tcp?

2006-10-25 Thread Adrian Knoth
Hi, I've seen a new oob/tcp component in the v1.2 branch (copied from the trunk). Of course, it doesn't merge with my IPv6 patch, so I'm currently using the old oob/tcp in my branch. Is this new component considered stable, and thus worth porting the IPv6 patch to? -- mail: a...@thur.de

Re: [OMPI devel] New oob/tcp?

2006-10-25 Thread Adrian Knoth
On Wed, Oct 25, 2006 at 06:27:47AM -0600, Ralph H Castain wrote: > I don't see any new component, Adrian. There have been a few updates to the > existing component, some of which might cause conflicts with the merge, but > those shouldn't be too hard to resolve. Ok, I just saw something with "cre

Re: [OMPI devel] New oob/tcp?

2006-10-25 Thread Adrian Knoth
On Wed, Oct 25, 2006 at 02:48:33PM +0200, Adrian Knoth wrote: > > I don't see any new component, Adrian. There have been a few updates to the > > existing component, some of which might cause conflicts with the merge, but > > those shouldn't be too hard to resolve. >

[OMPI devel] IPv6 code uploaded to svn

2006-10-25 Thread Adrian Knoth
Hi, I've uploaded my current IPv6 code to /tmp/adi-ipv6/. The checkin was split to ease the review. What has changed? OPAL: (changeset 12308) The OPAL layer can now detect IPv6 addresses on Linux and Solaris. The functions in if.c were rewritten to handle the new address struct

[OMPI devel] MPI between amd64 and x86

2006-11-01 Thread Adrian Knoth
Hi, I'm currently testing the new IPv6 code in a lot of different setups. It's doing fine with Linux and Solaris, both on x86. There are also no problems between multiple amd64s, but I wasn't able to communicate between x86 and amd64. The oob connection is up, but the BTL hangs. gdb (remote) sho

Re: [OMPI devel] MPI between amd64 and x86

2006-11-04 Thread Adrian Knoth
On Sat, Nov 04, 2006 at 02:07:58PM +0530, Nysal Jan wrote: > >come from the BTL headers where the fields do not have the same > >alignment inside. The original question was asked by Nysal Jan on an > >email with the subject "SEGV in EM64T <--> PPC64 communication" on > >Oct. 11 2006. Unfortunately

[OMPI devel] valgrind messages important?

2006-11-12 Thread Adrian Knoth
Hi, I'm currently tracing a segfault in mpi_init which is caused by ompi/runtime/ompi_mpi_init.c:569 ret = MCA_PML_CALL(add_procs(procs, nprocs)); free(procs); In most cases, no segfault occurs and everything works fine, but with some special combinations of machines, I can trigger the b

Re: [OMPI devel] Cross-Cluster OpenMPI

2006-11-19 Thread Adrian Knoth
On Sun, Nov 19, 2006 at 02:35:27AM -0500, Resat Umit Payli wrote: > Hi; Hi! > I am interested in using OpenMPI cross-cluster runs on the Grid > environments. Though it's not Grid, "our" IPv6 code is intended to be run on multi-clusters. (if you're only looking for using all of your machine

[OMPI devel] IPv6 up and working

2006-11-24 Thread Adrian Knoth
Hi, last week I rewrote my btl-tcp component to improve several aspects, mainly to avoid oversubscription of interfaces. I now have: - the MCA parameter btl_tcp_disable_family={4|6} to force the use of a specific address family at runtime - a working include/exclude list for interfaces
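To illustrate what a family switch such as btl_tcp_disable_family={4|6} amounts to, the following sketch filters a list of candidate addresses by IP version. It is a Python illustration only: the function name usable_addresses and the sample addresses are hypothetical, and the real component works on interface data in C.

```python
import ipaddress


def usable_addresses(addresses, disable_family=None):
    """Drop all addresses of the disabled family (4, 6, or None for neither)."""
    if disable_family not in (None, 4, 6):
        raise ValueError("disable_family must be 4, 6 or None")
    return [a for a in addresses
            if ipaddress.ip_address(a).version != disable_family]


if __name__ == "__main__":
    candidates = ["192.0.2.1", "2001:db8::1"]
    print(usable_addresses(candidates, disable_family=6))
```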

Re: [OMPI devel] Major revision to the RML/OOB

2006-12-05 Thread Adrian Knoth
On Mon, Dec 04, 2006 at 06:26:26AM -0700, Ralph Castain wrote: > Hello all Hi! > With some luck and (hopefully) not too many conflicting priorities, Jeff > and I may complete this work by Christmas [..] > As always, feel free to comment and/or make suggestions! You wrote a lot about oob, socket

Re: [OMPI devel] Major revision to the RML/OOB

2006-12-06 Thread Adrian Knoth
On Wed, Dec 06, 2006 at 07:07:42AM -0700, Ralph H Castain wrote: > The concern is that we want to leave open the possibility of putting this > revision into 1.2 since it will have a major performance impact on both > startup time and the max cluster size we can support. The IP6 code is > scheduled

Re: [OMPI devel] Major revision to the RML/OOB

2006-12-08 Thread Adrian Knoth
On Thu, Dec 07, 2006 at 11:12:23AM -0500, Jeff Squyres wrote: Hi, > > I therefore suggest to move the OPAL changes into the trunk, > > also the small hostfile code (lex code for IPv6) and the btl code. > Can you describe the changes in opal that were made for IPv6? These changes are limited to t

[OMPI devel] NFS race condition in romio

2007-01-08 Thread Adrian Knoth
Hi, we're facing an NFS race condition if File_Open is called for a nonexistent file: #include <mpi.h> int main(int argc, char *argv[]) { MPI::Init(argc, argv); MPI::File _outputFile; double dummy = 42; _outputFile = MPI::File::Open(MPI::COMM_WORLD, "foo", MPI_MOD

Re: [OMPI devel] NFS race condition in romio

2007-01-08 Thread Adrian Knoth
On Mon, Jan 08, 2007 at 11:49:32PM +0100, Adrian Knoth wrote: > The attached patch fixes this problem, but perhaps there is New patch, I've missed the non-NFS case. -- Cluster and Metacomputing Working Group Friedrich-Schiller-Universität Jena, Germany private: http://adi.thur.de Ind

Re: [OMPI devel] NFS race condition in romio

2007-01-09 Thread Adrian Knoth
On Tue, Jan 09, 2007 at 12:03:38AM +0100, Adrian Knoth wrote: > > The attached patch fixes this problem, but perhaps there is > New patch, I've missed the non-NFS case. This patch was wrong, too (containing a double free segfault). Don't code when dog-tired... ;) I'v

Re: [OMPI devel] SOS!! Run-time error

2007-04-15 Thread Adrian Knoth
On Sun, Apr 15, 2007 at 01:40:01PM -0400, chaitali dherange wrote: > Hi, Hi! > I have downloaded the developer version of source code by downloading a > nightly Subversion snapshot tarball. And have installed the openmpi. Things get much clearer when you compile Open MPI with --enable-

Re: [OMPI devel] SOS... help needed :(

2007-04-16 Thread Adrian Knoth
On Sun, Apr 15, 2007 at 10:25:06PM -0400, chaitali dherange wrote: > Hi, Hi! > giving more priority to the MPI calls over the non MPI ones. > static I mean.. we know that our clusters use Infiniband for MPI ... > so all the non MPI communication can be assumed to be TCP > communication using th

Re: [OMPI devel] replace 'atoi' with 'strtol'

2007-04-18 Thread Adrian Knoth
On Wed, Apr 18, 2007 at 01:16:54PM -0400, George Bosilca wrote: > That's right, long and int have the same size on Windows 32 and 64 > bits (always 32 bits). However, they are considered as being > different types (!!!). How about (u)int32_t? When I was an Ada programmer, subtypes with the ap

[OMPI devel] sockaddr* vs. sockaddr_storage*

2007-04-29 Thread Adrian Knoth
Hi, especially bosilca (George?) r14544 broke the IPv6 support (see Ticket #1008). I've committed a quick patch, but I guess we (George and me?) will have to look closer in order to provide the desired functionality. There's another question concerning r14544: why did you change sockaddr_storage*

Re: [OMPI devel] sockaddr* vs. sockaddr_storage*

2007-04-29 Thread Adrian Knoth
On Sun, Apr 29, 2007 at 10:18:01AM -0400, George Bosilca wrote: > I have to ask you to remove r14549 quickly as it brings back the trunk > to the stage it was before r14544 (only random support for multiple I'll have a look how to accomplish both: IPv6 and a reverted r14549. > BTL). It's not

Re: [OMPI devel] sockaddr* vs. sockaddr_storage*

2007-04-29 Thread Adrian Knoth
On Sun, Apr 29, 2007 at 06:07:03PM +0200, Adrian Knoth wrote: > > I have to ask you to remove r14549 quickly as it brings back the trunk > > to the stage it was before r14544 (only random support for multiple > I'll have a look how to accomplish both: IPv6 and a reverted

Re: [OMPI devel] sockaddr* vs. sockaddr_storage*

2007-05-01 Thread Adrian Knoth
On Tue, May 01, 2007 at 07:39:07AM -0700, Jeff Squyres wrote: > > (b) that > > IPv6 was correctly operating...which were the two issues in this > > discussion. > We currently do not have any IPv6 setup in our MPI testing equipment We automatically check every trunk commit against our IPv6 tes

Re: [OMPI devel] Add a bug fix to 1.2.x version

2007-05-02 Thread Adrian Knoth
On Wed, May 02, 2007 at 02:07:17PM +0300, Sharon Melamed wrote: > Hi, Hi! > Change set 14463 - [1]https://svn.open-mpi.org/trac/ompi/changeset/14463. > I would like to integrate this change to version 1.2.x. I guess you're looking for https://svn.open-mpi.org/trac/ompi/wiki/SubmittingChange

Re: [OMPI devel] Fwd: [Open MPI] #1101: MPI_ALLOC_MEM with 0 size must be valid

2007-07-24 Thread Adrian Knoth
On Tue, Jul 24, 2007 at 08:41:27AM -0600, Brian Barrett wrote: > > man malloc tells me this: > > "If size was equal to 0, either NULL or a pointer suitable to be > > passed to free() > > is returned". So may be we should just return NULL and be done with > > it? > > Which is also what POSIX s

Re: [OMPI devel] [Pkg-openmpi-maintainers] Bug#433142: openmpi: FTBFS on GNU/kFreeBSD

2007-07-24 Thread Adrian Knoth
On Sat, Jul 14, 2007 at 03:55:12PM -0500, Dirk Eddelbuettel wrote: > | the current version fails to build on GNU/kFreeBSD. > | > | It needs small fixups for munmap hackery and stacktrace. > | It also needs to exclude linux specific build-depends. > | Please find attached patch with that. > > Tha

Re: [OMPI devel] [u...@hermann-uwe.de: [Pkg-openmpi-maintainers] Bug#435581: openmpi-bin: Segfault on Debian GNU/kFreeBSD]

2007-08-02 Thread Adrian Knoth
On Thu, Aug 02, 2007 at 02:31:30AM +, Dirk Eddelbuettel wrote: > Dear Open MPI developers, Hi! > We (as in the Debian maintainer for Open MPI) got this bug report from > Uwe who sees mpi apps segfault on Debian systems with the FreeBSD > kernel. > Any input would be greatly appreciated! Uw

Re: [OMPI devel] [u...@hermann-uwe.de: [Pkg-openmpi-maintainers] Bug#435581: openmpi-bin: Segfault on Debian GNU/kFreeBSD]

2007-08-13 Thread Adrian Knoth
On Thu, Aug 02, 2007 at 10:51:13AM +0200, Adrian Knoth wrote: > > We (as in the Debian maintainer for Open MPI) got this bug report from > > Uwe who sees mpi apps segfault on Debian systems with the FreeBSD > > kernel. > > Any input would be greatly appreciated

Re: [OMPI devel] [u...@hermann-uwe.de: [Pkg-openmpi-maintainers] Bug#435581: openmpi-bin: Segfault on Debian GNU/kFreeBSD]

2007-08-13 Thread Adrian Knoth
On Mon, Aug 13, 2007 at 04:26:31PM -0500, Dirk Eddelbuettel wrote: > > I'll now compile the 1.2.3 release tarball and see if I can reproduce The 1.2.3 release also works fine: adi@debian:~$ ./ompi123/bin/mpirun -np 2 ring 0: sending message (0) to 1 0: sent message 1: waiting for message 1: got

Re: [OMPI devel] [Pkg-openmpi-maintainers] Bug#435581: [u...@hermann-uwe.de: Bug#435581: openmpi-bin: Segfault on Debian GNU/kFreeBSD]

2007-08-17 Thread Adrian Knoth
On Fri, Aug 17, 2007 at 02:11:02AM +0200, Uwe Hermann wrote: > > | The 1.2.3 release also works fine: > I think Adrian used a tarball, not the Debian package? > I'll try a local, manual install too, maybe the bug is Debian-related only? I've tried both: the tarball works fine, the Debian package

Re: [OMPI devel] [Pkg-openmpi-maintainers] Bug#435581: [u...@hermann-uwe.de: Bug#435581: openmpi-bin: Segfault on Debian GNU/kFreeBSD]

2007-08-17 Thread Adrian Knoth
On Fri, Aug 17, 2007 at 09:25:05AM +0200, Adrian Knoth wrote: > I've tried both: the tarball works fine, the Debian package > segfaults. I suspect it's the threading support, so someone (Uwe?) could > try to remove it from debian/rules. Ok, --enable-progress-threads and -

Re: [OMPI devel] [Pkg-openmpi-maintainers] Bug#435581: [u...@hermann-uwe.de: Bug#435581: openmpi-bin: Segfault on Debian GNU/kFreeBSD]

2007-08-17 Thread Adrian Knoth
On Fri, Aug 17, 2007 at 08:26:50AM -0400, Jeff Squyres wrote: > > Ok, --enable-progress-threads and --enable-mpi-threads cause the > > segfaults. If you compile without, everything works. > > > I'll now try if it's mpi-threads or the progress-threads, and also > > check > > the upcoming v1.2.4.

Re: [OMPI devel] Small manual page patches from Debian package

2007-09-28 Thread Adrian Knoth
On Thu, Sep 27, 2007 at 09:18:39PM -0500, Dirk Eddelbuettel wrote: > Dear Open MPI developers, Hi! > The Debian (source) package for Open MPI still carries a few tiny patches > that we thought we had submitted to you, but then maybe we got that mixed up > with some new manual pages I sent in on

Re: [OMPI devel] [OMPI svn-full] svn:open-mpi r16691

2007-11-08 Thread Adrian Knoth
On Thu, Nov 08, 2007 at 07:51:28AM -0500, Jeff Squyres wrote: [r16691] > Whoa; I'm not sure we want to apply this. Me neither. > All ROMIO patches *must* be coordinated with the ROMIO maintainers. Upstream? That's the upstream patch. Jiri Polach has extracted the fix for this problem. Updat

Re: [OMPI devel] [OMPI svn-full] svn:open-mpi r16691

2007-11-08 Thread Adrian Knoth
On Thu, Nov 08, 2007 at 08:02:09AM -0500, Jeff Squyres wrote: > >> All ROMIO patches *must* be coordinated with the ROMIO maintainers. > > Upstream? That's the upstream patch. > That was extracted from ROMIO itself? Which release? From Jiri: The patch was extracted from a ROMIO sources that c

[OMPI devel] IPv4 mapped IPv6 addresses

2007-12-14 Thread Adrian Knoth
Hi! The current BTL/TCP and OOB/TCP code contains separate sockets for IPv4 and IPv6. Though it has never been a problem for me, this might cause an out-of-FDs-error in large clusters. (IIRC, rhc has already pointed out this issue) A possible way to reduce FD consumption would be the use of IPv4
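The file-descriptor saving discussed here comes from letting one AF_INET6 socket serve both families, with IPv4 peers showing up as mapped addresses. A sketch of that idea using Python's standard library follows; the helper name open_listener is hypothetical, and the fallback to a plain IPv4 listener is an assumption of this example, not a description of the BTL/TCP or OOB/TCP code.

```python
import socket


def open_listener(port=0):
    """Open one listening socket, dual-stack if the platform allows it.

    Returns (socket, mode) where mode is 'dual-stack' or 'ipv4-only'.
    """
    if socket.has_dualstack_ipv6():
        try:
            # One descriptor accepts IPv6 peers and IPv4 peers (as ::ffff:a.b.c.d)
            srv = socket.create_server(("", port), family=socket.AF_INET6,
                                       dualstack_ipv6=True)
            return srv, "dual-stack"
        except OSError:
            pass  # fall back to IPv4 below
    srv = socket.create_server(("", port), family=socket.AF_INET)
    return srv, "ipv4-only"


if __name__ == "__main__":
    listener, mode = open_listener()
    print(mode, listener.getsockname()[1])
    listener.close()
```

The trade-off is that this depends on mapped addresses being enabled on the host, which earlier messages in this archive point out is not guaranteed.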

Re: [OMPI devel] Minor patch for !IPV6_V6ONLY

2008-01-01 Thread Adrian Knoth
On Mon, Dec 31, 2007 at 08:05:38PM -0800, Paul H. Hargrove wrote: > I just tried today to build the OMPI trunk on an old RH8 box and found > that for > OPAL_WANT_IPV6 && !defined(IPV6_V6ONLY) > the file oob_tcp.c fails to compile due to unbalanced braces. > > Swapping an #endif with a closing

Re: [OMPI devel] btl tcp port to xensocket

2008-01-09 Thread Adrian Knoth
On Tue, Jan 08, 2008 at 10:51:45PM -0800, Muhammad Atif wrote: > I am planning to port tcp component to xensocket, which is a fast > interdomain communication mechanism for guest domains in Xen. I may Just to get things right: You first partition your SMP/Multicore system with Xen, and then want

Re: [OMPI devel] btl tcp port to xensocket

2008-01-17 Thread Adrian Knoth
On Tue, Jan 15, 2008 at 04:07:02PM -0800, Muhammad Atif wrote: > Just for reference, I am trying to port btl/tcp to xensockets. Now if > i want to do modex send/recv , to my understanding, mca_btl_tcp_addr_t > is used (ref code/function is mca_btl_tcp_component_exchange). For > xensockets, I need

Re: [OMPI devel] Trunk borked

2008-01-28 Thread Adrian Knoth
On Mon, Jan 28, 2008 at 07:26:56AM -0700, Ralph H Castain wrote: > We seem to have a problem on the trunk this morning. I am building on a There are more errors: /tmp/ompi/src/ompi/contrib/vt/vt/vtlib/vt_iowrap.c: In function `fsetpos': /tmp/ompi/src/ompi/contrib/vt/vt/vtlib/vt_iowrap.c:850: err

Re: [OMPI devel] [OMPI svn] svn:open-mpi r17307

2008-01-30 Thread Adrian Knoth
On Tue, Jan 29, 2008 at 07:37:42PM -0500, George Bosilca wrote: > The previous code was correct. Each IP address corresponds to a > specific endpoint, and therefore to a specific BTL. This enables us to > have multiple TCP BTLs at the same time, and allows the OB1 PML to > stripe the data over a

Re: [OMPI devel] [OMPI svn] svn:open-mpi r17307

2008-01-30 Thread Adrian Knoth
On Wed, Jan 30, 2008 at 09:20:45AM -0500, Tim Mattox wrote: > > As mentioned earlier: it's very common to have multiple addresses per > > interface, and it's the kernel who assigns the source address, so > > there's nothing one could say about an incoming connection. Only that it > > could be any

Re: [OMPI devel] [OMPI svn] svn:open-mpi r17307

2008-01-30 Thread Adrian Knoth
On Wed, Jan 30, 2008 at 12:05:50PM -0500, George Bosilca wrote: > What is the real issue behind this whole discussion? Hanging connections. See https://svn.open-mpi.org/trac/ompi/ticket/1206 The multi-address peer tries to connect, but btl_tcp_proc_accept denies due to not matching addresses

Re: [OMPI devel] [OMPI svn] svn:open-mpi r17307

2008-01-30 Thread Adrian Knoth
On Wed, Jan 30, 2008 at 03:38:00PM +0100, Bogdan Costescu wrote: > The result is that, with the default Linux kernel settings, there is > no way to tell which path a connection will take in a multi-rail TCP/IP > setup. Even more, when the ARP cache expires and a new ARP request is > made, the a

Re: [OMPI devel] [OMPI svn] svn:open-mpi r17307

2008-01-31 Thread Adrian Knoth
On Wed, Jan 30, 2008 at 06:48:54PM +0100, Adrian Knoth wrote: > > What is the real issue behind this whole discussion? > Hanging connections. > I'll have a look at it tomorrow. To everybody who's interested in BTL-TCP, especially George and (to a minor degree) rhc: I'

[OMPI devel] New address selection for btl-tcp (was Re: [OMPI svn] svn:open-mpi r17307)

2008-02-12 Thread Adrian Knoth
On Fri, Feb 01, 2008 at 11:40:20AM -0500, Tim Prins wrote: > Adrian, Hi! Sorry for the late reply and thanks for your testing. > 1. There are some warnings when compiling: I've fixed these issues. > 2. If I exclude all my tcp interfaces, the connection fails properly, > but I do get a malloc

Re: [OMPI devel] New address selection for btl-tcp (was Re: [OMPI svn] svn:open-mpi r17307)

2008-02-22 Thread Adrian Knoth
On Fri, Feb 15, 2008 at 09:02:10AM -0500, Tim Prins wrote: > >> 3. If the exclude list does not contain 'lo', or the include list > >> contains 'lo', the job hangs when using multiple nodes: > > That's weird. Loopback interfaces should automatically be excluded right > > from the beginning. See o

[OMPI devel] Logo as a vector graphic

2008-03-13 Thread Adrian Knoth
Hi! Next week, I'll have a talk at ICNS'08: http://www.iaria.org/conferences2008/ICNS08.html (Is anybody around?) I'd like to show the Open MPI logo on one of my slides, but I cannot find a vectorized version (svg, eps, whatever) or at least a high-res bitmap. Does such a file exist and i

Re: [OMPI devel] Logo as a vector graphic

2008-03-13 Thread Adrian Knoth
On Thu, Mar 13, 2008 at 08:07:18AM -0500, Jeff Squyres wrote: > Try this one. Thanks, that's beautiful. I'll send you the slides once they are ready, the logo really fits well ;) > We usually snip off the words at the bottom. I also did so. How do you crop the image? I used pdfcrop which is pa

Re: [OMPI devel] Logo as a vector graphic

2008-03-13 Thread Adrian Knoth
On Thu, Mar 13, 2008 at 06:06:12PM +0100, Andreas Schäfer wrote: > > Heh. I usually use the png or jpg version and just crop there. :-) > As this seems to be of public interest, please find attached a vector > version of the logo without text. (-8 Now things are getting difficult... why is my v

Re: [OMPI devel] Logo as a vector graphic

2008-03-29 Thread Adrian Knoth
On Thu, Mar 13, 2008 at 02:35:41PM +0100, Adrian Knoth wrote: > > We usually snip off the words at the bottom. > I also did so. How do you crop the image? I used pdfcrop which is part > of the tetex distribution, but I guess there are better PS editors out > for Linux/Unix. I

Re: [OMPI devel] --disable-ipv6 broken on trunk

2008-04-02 Thread Adrian Knoth
On Wed, Apr 02, 2008 at 06:36:02AM -0400, Josh Hursey wrote: > It seems that builds configured with '--disable-ipv6' are broken on > the trunk. I suspect r18055 for this break since the tarball from two > --- > oob_tcp.c: In function `mca_oob_tcp_fini': > oob_tcp.c:1364

[OMPI devel] Change in btl/tcp

2008-04-16 Thread Adrian Knoth
Hi! As of r18169, I've changed the acceptance rules for incoming BTL-TCP connections. The old code would have denied a connection in case of non-matching addresses (comparison between source address and expected source address). Unfortunately, you cannot always say which source address an incomi

Re: [OMPI devel] Change in btl/tcp

2008-04-18 Thread Adrian Knoth
On Fri, Apr 18, 2008 at 08:04:17AM -0400, Tim Prins wrote: > Hi Adrian, Hi! > After this change, I am getting a lot of errors of the form: > [sif2][[12854,1],9][btl_tcp_frag.c:216:mca_btl_tcp_frag_recv] > mca_btl_tcp_frag_recv: readv failed: Connection reset by > peer (104) > > See for instanc

Re: [OMPI devel] Change in btl/tcp

2008-04-18 Thread Adrian Knoth
On Fri, Apr 18, 2008 at 01:00:40PM -0400, Josh Hursey wrote: > The trick is to force Open MPI to use only tcp,self and nothing else. > Did you try adding this (-mca btl tcp,self) to the runtime parameter > set? Sure. Even with 64 processes, I cannot trigger this behaviour. Neither on Linux no

Re: [OMPI devel] Change in btl/tcp

2008-04-21 Thread Adrian Knoth
On Mon, Apr 21, 2008 at 09:04:28AM -0400, Josh Hursey wrote: > Adrian, Hi! > Has there been any progress on this bug? If you still cannot reproduce > it, if you send either Tim Prins or I a debugging patch we can run > with it. Or we can try to arrange access to one of our machines for you.

Re: [OMPI devel] multiple GigE interfaces...

2008-06-23 Thread Adrian Knoth
On Wed, Jun 18, 2008 at 05:13:28PM -0700, Muhammad Atif wrote: > Hi again... I was on a break from Xensocket stuff This time some > general questions... Hi. > question. What if I have multiple Ethernet cards (say 5) on two of my > quad core machines. The IP addresses (and the subnets of c

Re: [OMPI devel] Funny warning message

2008-07-28 Thread Adrian Knoth
On Mon, Jul 28, 2008 at 05:14:29PM +0300, Lenny Verkhovsky wrote: > -advisable to configure rd_win smaller then (rd_num - rd_low), but currently > +advisable to configure rd_win bigger then (rd_num - rd_low), but currently ^ a -- Cluster and Metacomputi

Re: [OMPI devel] TCP BTL routability (was: ticket #972)

2008-07-29 Thread Adrian Knoth
On Tue, Jul 29, 2008 at 03:25:00PM -0400, Jeff Squyres wrote: > For reference, the FAQ entry is here: > > http://www.open-mpi.org/faq/?category=tcp#tcp-routability > > It looks like we now *always* assume that two TCP peers are routable. As long as they share the same address family (IPv

Re: [OMPI devel] Additional excluded tcp inteface

2008-11-07 Thread Adrian Knoth
On Fri, Nov 07, 2008 at 09:49:43AM -0500, Rolf Vandevaart wrote: > I do not think anyone will have a problem with this, but just thought I > would mention that I am planning on adding an additional interface to > the excluded list for the tcp btl. I want to add "sppp" to the list. > This is an