Re: [OMPI devel] Fwd: + ummunotify-userspace-support-for-mmu-notifications-v2.patch added to -mm tree

2010-05-11 Thread Roland Dreier
 > Woo hoo!  It looks like ummunotify went into the -mm kernel tree yesterday.

Yeah, I was wondering what Andrew's plans are for this.  Nothing has
changed so I'm not sure whether it gets merged all the way to Linus this
time either.

 - R.
-- 
Roland Dreier <rola...@cisco.com> || For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/index.html


Re: [OMPI devel] -display-map and mpi_spawn

2008-09-16 Thread Roland Dreier
 > thanks, applied

oops, replied to the wrong message ;)


Re: [OMPI devel] -display-map and mpi_spawn

2008-09-16 Thread Roland Dreier
thanks, applied


Re: [OMPI devel] ibcm private header file

2008-07-22 Thread Roland Dreier
 > So I think using byteorder.h is not a good idea (note the warning).
 > How about just having two #defines, picking the Right one based on
 > WORDS_BIGENDIAN?

On Linux, the public  header might have what you need.
For that matter  has htonll() defined.
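
For reference, the "two #defines picked by WORDS_BIGENDIAN" idea from the
quoted text could look roughly like the sketch below.  It assumes
WORDS_BIGENDIAN is supplied by configure (autoconf's AC_C_BIGENDIAN defines it
on big-endian hosts); the SKETCH_* macros and sketch_* functions are
hypothetical names for illustration, not anything taken from Open MPI or
libibverbs.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Unconditional 64-bit byte swap (hypothetical helper name). */
static inline uint64_t sketch_bswap64(uint64_t x)
{
    /* Swap adjacent bytes, then adjacent 16-bit words, then the 32-bit halves. */
    x = ((x & 0x00ff00ff00ff00ffULL) << 8)  | ((x >> 8)  & 0x00ff00ff00ff00ffULL);
    x = ((x & 0x0000ffff0000ffffULL) << 16) | ((x >> 16) & 0x0000ffff0000ffffULL);
    return (x << 32) | (x >> 32);
}

/* Assumption: WORDS_BIGENDIAN comes from configure (AC_C_BIGENDIAN). */
#ifdef WORDS_BIGENDIAN
#define SKETCH_HTONLL(x) ((uint64_t)(x))
#define SKETCH_NTOHLL(x) ((uint64_t)(x))
#else
#define SKETCH_HTONLL(x) sketch_bswap64((uint64_t)(x))
#define SKETCH_NTOHLL(x) sketch_bswap64((uint64_t)(x))
#endif

/* Store a host-order value into a wire buffer in network byte order. */
void sketch_pack_u64(unsigned char *buf, uint64_t host_value)
{
    uint64_t wire = SKETCH_HTONLL(host_value);

    assert(buf != NULL);
    memcpy(buf, &wire, sizeof wire);
}

/* Read a network-order value from a wire buffer back into host order. */
uint64_t sketch_unpack_u64(const unsigned char *buf)
{
    uint64_t wire;

    assert(buf != NULL);
    memcpy(&wire, buf, sizeof wire);
    return SKETCH_NTOHLL(wire);
}
```

The point of doing it this way is that the choice is made once at configure
time and no kernel-private header is needed.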

 - R.


Re: [OMPI devel] RFC: Linuxes shipping libibverbs

2008-05-29 Thread Roland Dreier
 > Is it possible that the /sys/class/infiniband directory exists and is
 > empty? In which cases?

Yes: run "modprobe ib_core" on a system with no hardware drivers loaded (or
no RDMA hardware installed).
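
A caller that wants to tell "class directory missing" apart from "class
directory present but empty" could check along the lines of the sketch below.
It uses only POSIX opendir()/readdir(); the function and enum names are
hypothetical and the directory path is passed in by the caller, so nothing
here is libibverbs API.

```c
#include <assert.h>
#include <dirent.h>
#include <string.h>

/* Hypothetical result codes for the probe below. */
enum sketch_rdma_state {
    SKETCH_NO_CLASS_DIR,    /* directory missing: core module not loaded */
    SKETCH_NO_DEVICES,      /* directory exists but is empty */
    SKETCH_DEVICES_PRESENT  /* at least one entry found */
};

/* Count entries in `path`, skipping "." and "..".
 * Returns -1 if the directory cannot be opened (e.g. it does not exist). */
int sketch_count_dir_entries(const char *path)
{
    DIR *dir;
    struct dirent *entry;
    int count = 0;

    assert(path != NULL);
    dir = opendir(path);
    if (dir == NULL)
        return -1;

    while ((entry = readdir(dir)) != NULL) {
        if (strcmp(entry->d_name, ".") == 0 ||
            strcmp(entry->d_name, "..") == 0)
            continue;
        count++;
    }
    closedir(dir);
    return count;
}

/* Classify a sysfs class directory such as /sys/class/infiniband. */
enum sketch_rdma_state sketch_probe(const char *class_dir)
{
    int n = sketch_count_dir_entries(class_dir);

    if (n < 0)
        return SKETCH_NO_CLASS_DIR;
    return n == 0 ? SKETCH_NO_DEVICES : SKETCH_DEVICES_PRESENT;
}
```

Calling sketch_probe("/sys/class/infiniband") after "modprobe ib_core" on a
machine with no RDMA hardware would be expected to hit the empty-directory
case; neither of the first two results should be treated as a fatal error.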


Re: [OMPI devel] Notes from mem hooks call today

2008-05-28 Thread Roland Dreier
 > I think Patrick's point is that it's not too much more expensive to do the 
 > syscall on Linux vs just doing the cache lookup, particularly in the 
 > context of a long message.  And it means that upper layer protocols like 
 > MPI don't have to deal with caches (and since MPI implementors hate 
 > registration caches only slightly less than we hate MPI_CANCEL, that will 
 > make us happy).

Stick in a separate library then?

I don't think we want the complexity in the kernel -- I personally would
argue against merging it upstream; and given that the userspace solution
is actually faster, it becomes pretty hard to justify.


Re: [OMPI devel] Notes from mem hooks call today

2008-05-28 Thread Roland Dreier
 >- gleb asks: don't we want to avoid the system call when possible?
 >- patrick: a single syscall can be/is cheaper than a reg cache
 >  lookup in user space

This doesn't really make sense -- syscall + cache lookup in kernel is
"obviously" more expensive than cache lookup in userspace with no
context switch (I don't see any tricks the kernel can do that make the
cache lookup cheaper there).

However the solution I proposed a long time ago (when Pete Wyckoff
originally did his work on having the kernel track this -- and as a side
note, it's not clear to me whether MMU notifiers really help what Pete
did) is for userspace to provide a pointer to a flag when registering
memory with the kernel, and then the kernel can mark the flag if the
mapping changes -- ie keep the userspace cache but have the kernel
manage invalidation "perfectly" without any malloc hooks.
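
A minimal userspace-only sketch of that flag scheme follows.  Everything in
it is hypothetical: there is no real kernel call here, and the "stale" field
merely stands in for the flag whose address userspace would hand to the
kernel at registration time.  The lookup never makes a syscall; it only reads
the flag.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_CACHE_SLOTS 8

struct sketch_reg {
    uintptr_t base;      /* start of the registered range */
    size_t len;          /* length of the registered range */
    int used;            /* nonzero when this slot holds a registration */
    volatile int stale;  /* assumption: the kernel sets this to 1 when the
                          * mapping behind the range changes */
};

struct sketch_cache {
    struct sketch_reg slot[SKETCH_CACHE_SLOTS];
};

void sketch_cache_init(struct sketch_cache *c)
{
    size_t i;

    assert(c != NULL);
    for (i = 0; i < SKETCH_CACHE_SLOTS; i++) {
        c->slot[i].used = 0;
        c->slot[i].stale = 0;
    }
}

/* Record a registration.  A real implementation would register the memory
 * with the kernel here and pass it the address of r->stale. */
struct sketch_reg *sketch_cache_insert(struct sketch_cache *c,
                                       void *base, size_t len)
{
    size_t i;

    assert(c != NULL);
    for (i = 0; i < SKETCH_CACHE_SLOTS; i++) {
        struct sketch_reg *r = &c->slot[i];

        if (!r->used) {
            r->base = (uintptr_t)base;
            r->len = len;
            r->stale = 0;
            r->used = 1;
            return r;
        }
    }
    return NULL; /* cache full */
}

/* Pure userspace lookup: drop entries whose flag was marked, and return a
 * registration that fully covers [addr, addr + len), or NULL. */
struct sketch_reg *sketch_cache_lookup(struct sketch_cache *c,
                                       void *addr, size_t len)
{
    uintptr_t start = (uintptr_t)addr;
    size_t i;

    assert(c != NULL);
    for (i = 0; i < SKETCH_CACHE_SLOTS; i++) {
        struct sketch_reg *r = &c->slot[i];

        if (!r->used)
            continue;
        if (r->stale) {          /* mapping changed: invalidate the entry */
            r->used = 0;
            continue;
        }
        if (start >= r->base && len <= r->len &&
            start - r->base <= r->len - len)
            return r;
    }
    return NULL;
}
```

On a NULL result the caller would register the range again; the cache stays
in userspace and no malloc hooks are involved.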

 - R.


Re: [OMPI devel] RFC: Linuxes shipping libibverbs

2008-05-23 Thread Roland Dreier
 > Either that or udev is not configured properly.

Debian has a correct udev configuration, modulo

http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=449081

 > ib_core/mthca/mlx4 should be loaded automatically by hotplug if HW is
 > present. No need for any additional configuration.

Yes (although only mlx4_core and not mlx4_ib will be loaded based on PCI
IDs), but nothing loads ib_uverbs automatically, and systems that have
no RDMA hardware will obviously not have any RDMA drivers autoloaded.

 - R.


Re: [OMPI devel] RFC: Linuxes shipping libibverbs

2008-05-23 Thread Roland Dreier
 > OFED is one distribution of the OpenFabrics software.  It can be  
 > bundled up and packaged differently, too.  I suspect that Debian does  
 > not include OFED directly, because OFED is pretty heavily dependent  
 > upon RPM.  So the OpenFabrics kernel bits must be there somewhere  
 > (libibverbs would be useless, otherwise); it would be nice to  
 > understand how they are activated: either manually or automatically.

"OpenFabrics kernel bits" doesn't really make sense.  Debian just ships
a Linux kernel, which has InfiniBand/RDMA drivers.

Debian doesn't load the ib_uverbs module by default, nor should it,
since the vast majority of users don't have RDMA hardware.  So
libibverbs and Open MPI should act sanely when no kernel drivers are
loaded, /sys/class/infiniband_verbs doesn't exist, etc.

There is already a Debian bug open about this for libibverbs:
http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=418014

I've been meaning to work on this but sadly I have not been able to put
much time into it.

 - R.


Re: [OMPI devel] OMPI Mercurial read-only mirror

2008-05-04 Thread Roland Dreier
 > >  > Can I make a /tmp branch from the hg read-only branch that is not tied 
 > >  > to the svn /tmp branches.

 > > Why do you want to do that?
 > >
 > > Mercurial is a fully distributed system, so you could just start
 > > committing to one of your local copies of the repository, and I can't
 > > see anything missing that a /tmp branch would give you.

 > Same reason you do an SVN tmp branch.  So others (outside of my 
 > employer's WAN) can actually clone the branch and try it out before you 
 > push it back to the repository.

Mercurial is a fully distributed system.  So instead of thinking of a /tmp
branch, you should think of publishing your repository, which has your
commits in it.  As I understand it, open-mpi.org is not set up for
publishing other repositories yet, but it is quite easy to set up a
mercurial server; there are also several places that will host one for
you: http://www.selenic.com/mercurial/wiki/index.cgi/MercurialHosting

 - R.


Re: [OMPI devel] OMPI Mercurial read-only mirror

2008-05-02 Thread Roland Dreier
 > Can I make a /tmp branch from the hg read-only branch that is not tied 
 > to the svn /tmp branches.

Why do you want to do that?

Mercurial is a fully distributed system, so you could just start
committing to one of your local copies of the repository, and I can't
see anything missing that a /tmp branch would give you.

 - R.


Re: [OMPI devel] Affect of compression on modex and launch messages

2008-04-04 Thread Roland Dreier
 > Based on some discussion on this list, I integrated a zlib-based compression
 > ability into ORTE. Since the launch message sent to the orteds and the modex
 > between the application procs are the only places where messages of any size
 > are sent, I only implemented compression for those two exchanges.
 > 
 > I have found virtually no benefit to the compression. Essentially, the
 > overhead consumed in compression/decompressing the messages pretty much
 > balances out any transmission time differences. However, I could only test
 > this for 64 nodes, 8ppn, so perhaps there is some benefit at larger sizes.

A faster compression library might change the balance... eg LZO
(http://www.oberhumer.com/opensource/lzo) might be worth a look although
I'm not an expert on all the compression libraries that are out there.

 - R.


Re: [OMPI devel] Switching away from SVN?

2008-03-24 Thread Roland Dreier
LWN.net has an interesting article about how Emacs chose a new version
control system: 

They were back in the CVS stone ages, but their main contenders were
the same big three of distributed VCSs: git, hg and bzr.  The article
pulls out a couple of very good quotes from their discussion.  The one
that caught my eye was from Richard Stallman:

We already know the most important thing about what we will find from
a careful study of git, mercurial and Bzr. We will find that each has
its advantages and disadvantages -- but none of them conclusive. Each
will be preferred by some people, but any one of them would work out
well enough. 

 - R.


Re: [OMPI devel] Switching away from SVN?

2008-03-20 Thread Roland Dreier
 has some interesting
info about svn->git conversions (and svn vs. next-gen distributed
systems in general).

Also, out of curiosity I tried doing

git-svn clone --stdlayout http://svn.open-mpi.org/svn/ompi/

and it seemed to work fine (git-svn is part of the main git
distribution).  The only obvious thing missing is that you would
probably want to set up an author file for a real conversion, so that
you get real names instead of just "jsquyres".  It took a while to
run, mostly because it has to grab each svn changeset one by one.

The interesting thing is that a checkout of the current ompi tree
seems to be about 37 MB, while the .git directory of my repository,
which has the entire history of all branches of the svn repository plus
1.6 MB of svn metadata, is 36 MB.  And git can do fun stuff like

git diff v1.1..v1.2

in half a second (it generates a 274858 line diff).  It can generate
the full 116320 line (11164 commit) log of the trunk in 0.3 seconds.

Jeff, if you want to see the repository, it is in

/data/home/roland/ompi.git

Feel free to make it available however you want (it's your data of course).

 - R.


Re: [OMPI devel] replace 'atoi' with 'strtol'

2007-04-18 Thread Roland Dreier
 > I see... so the right way to right this is really:

err... "right way to WRITE this"

 - R.


Re: [OMPI devel] replace 'atoi' with 'strtol'

2007-04-18 Thread Roland Dreier
 > How about (u)int32_t? When I was an Ada programmer, subtypes with the
 > appropriate range were always encouraged (i.e.: define the semantical
 > range and let the compiler/runtime library warn you on range
 > violations (the well-known "CONSTRAINT_ERROR"))

It's OK to use a type with a fixed size when there is some real reason
that you know exactly how many bits you need, but in my opinion it's
much better to use plain int whenever possible.  Otherwise you end up
in a mess when different functions have parameters of different widths
for no good reason, and you're also taking away the compiler's choice
to use the most efficient size of integer.

 - R.


Re: [OMPI devel] replace 'atoi' with 'strtol'

2007-04-18 Thread Roland Dreier
 > I.e., it returns a long.  Although some compilers might do the right  
 > thing, conversions should be explicitly shown.

Isn't the behavior guaranteed by the C standard?  And I can't even
imagine a way for a compiler to get

intvar = strtol(foo, NULL, 10);

wrong, and it seems even more implausible that such a bug would be
cured just by adding a cast to int.

Maybe you could get the same effect by leaving the cast out and
wearing a magnetized titanium bracelet while writing the code?
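
To make the conversion point above concrete, here is a small sketch of an
atoi() replacement built on strtol().  The function name is hypothetical.
The assignment from long to int needs no cast: C converts on assignment
either way, and a cast would not change the result.  What actually matters is
checking the range first, since converting an out-of-range value is
implementation-defined with or without the cast.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Parse a base-10 integer from `text` into `*out`.
 * Returns 0 on success, -1 on empty input, trailing junk, or overflow. */
int sketch_parse_int(const char *text, int *out)
{
    char *end;
    long value;

    assert(text != NULL && out != NULL);

    errno = 0;
    value = strtol(text, &end, 10);

    if (end == text || *end != '\0')
        return -1;              /* no digits at all, or trailing junk */
    if (errno == ERANGE || value < INT_MIN || value > INT_MAX)
        return -1;              /* does not fit in an int */

    *out = value;               /* implicit long -> int; no cast needed */
    return 0;
}
```

Unlike atoi(), this reports bad input instead of silently returning 0.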

 - R.


Re: [OMPI devel] replace 'atoi' with 'strtol'

2007-04-18 Thread Roland Dreier
 > With the (int) cast, I'm ok with it now.  :-)

What's the point of the cast to int?

 - R.


[O-MPI devel] [PATCH] Update Open MPI for new libibverbs API

2005-09-26 Thread Roland Dreier
[It's somewhat annoying to have to subscribe to de...@open-mpi.org
just to be able to send patches, but oh well...]


This patch updates Open MPI for the new ibv_create_cq() API.

Signed-off-by: Roland Dreier <rola...@cisco.com>

--- ompi/mca/btl/openib/btl_openib.c	(revision 7507)
+++ ompi/mca/btl/openib/btl_openib.c	(working copy)
@@ -656,7 +656,8 @@ int mca_btl_openib_module_init(mca_btl_o
 }

 /* Create the low and high priority queue pairs */ 
-openib_btl->ib_cq_low = ibv_create_cq(ctx, mca_btl_openib_component.ib_cq_size, NULL);
+openib_btl->ib_cq_low = ibv_create_cq(ctx, mca_btl_openib_component.ib_cq_size,
+ NULL, NULL, 0);

 if(NULL == openib_btl->ib_cq_low) {
 BTL_ERROR(("error creating low priority cq for %s errno says %s\n",
@@ -665,7 +666,8 @@ int mca_btl_openib_module_init(mca_btl_o
 return OMPI_ERROR;
 }

-openib_btl->ib_cq_high = ibv_create_cq(ctx, mca_btl_openib_component.ib_cq_size, NULL);
+openib_btl->ib_cq_high = ibv_create_cq(ctx, mca_btl_openib_component.ib_cq_size,
+  NULL, NULL, 0);

 if(NULL == openib_btl->ib_cq_high) {
 BTL_ERROR(("error creating high priority cq for %s errno says %s\n",