From: devel-boun...@open-mpi.org [devel-boun...@open-mpi.org] on behalf of
Christopher Samuel [sam...@unimelb.edu.au]
Sent: Thursday, March 01, 2012 7:58 PM
To: de...@open-mpi.org
Subject: Re: [OMPI devel] [OMPI svn] svn:open-mpi r26077 (fwd)
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1
On 02/03/12 02:56, Nathan Hjelm wrote:
> Found a pretty nasty frag leak (and a minor one) in ob1 (see
> commit below). If this fix addresses some hangs we are seeing on
> infiniband LANL might want a 1.4.6 rolled (or a faster rollout for
> 1.6.0).
Wh
I can confirm that neither leak is causing my IMB hang. Unless there is another
frag leak somewhere (I haven't found one), the lockup was simply due to running
out of registered memory. So I see no need to push for a 1.4.6 unless a BTL
other than ugni hits the bug.
Setting an rcache limit doesn'
Good catch!!! That's indeed quite a nasty bug.
If it fixes the IB issues it justifies a 1.4.6 release.
Thanks,
george.
On Mar 1, 2012, at 10:56 , Nathan Hjelm wrote:
> Found a pretty nasty frag leak (and a minor one) in ob1 (see commit below).
> If this fix addresses some hangs we are se
On Thu, 1 Mar 2012, Jeffrey Squyres wrote:
...or in 1.5.5.
Well, we want a "stable" release to deploy on the affected cluster.
How soon will you be able to tell if it fixes some hangs?
I will know in a couple of hours. Tested the fix in 1.4.5 and it appears to
eliminate my IMB hang! I stil
Hopefully by the end of the day - Nathan is testing now.
Sam
On Mar 1, 2012, at 11:36 AM, Jeffrey Squyres wrote:
> ...or in 1.5.5.
>
> How soon will you be able to tell if it fixes some hangs?
>
>
> On Mar 1, 2012, at 10:56 AM, Nathan Hjelm wrote:
>
>> Found a pretty nasty frag leak (and a
...or in 1.5.5.
How soon will you be able to tell if it fixes some hangs?
On Mar 1, 2012, at 10:56 AM, Nathan Hjelm wrote:
> Found a pretty nasty frag leak (and a minor one) in ob1 (see commit below).
> If this fix addresses some hangs we are seeing on infiniband LANL might want
> a 1.4.6 r