> On Nov 6, 2014, at 1:39 PM, Nathan Hjelm wrote:
>
> On Thu, Nov 06, 2014 at 04:29:44PM -0500, Joshua Ladd wrote:
>> On Thursday, November 6, 2014, Nathan Hjelm wrote:
>>
>> On Thu, Nov 06, 2014 at 04:06:23PM -0500, Joshua Ladd wrote:
>>> Nathan,
>>> Has this bug always been present
On Thu, Nov 06, 2014 at 04:06:23PM -0500, Joshua Ladd wrote:
> Nathan,
> Has this bug always been present in OpenIB or is this a recent addition?
> If this is a regression, I would also be inclined to say that this is a
The bug is as old as the message coalescing feature in the openib btl.
Nathan,
Has this bug always been present in OpenIB or is this a recent addition? If
this is a regression, I would also be inclined to say that this is a blocker
for 1.8.4. This is a SIGNIFICANT bug. Both Howard and I were quite
surprised that all the while this code has been in use at LANL
in produc
Thanks, Nathan. After a bit more investigation yesterday, we reached the same
conclusion: this is a longstanding bug in the OpenIB BTL, and we only started
triggering the broken flow after some recent changes to the default max_lmc
parameter. Let us know if you need anything from our end.
Jos
I see the problem. The openib btl does not properly handle the following
call sequence (this is an openib btl bug IMHO):
    btl_sendi (..., &descriptor);
    btl_free (..., descriptor);
The bug is in the message coalescing code and it looks like extra logic
needs to be added to the openib btl's btl_fre
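
To make the sequence above concrete, here is a toy model in C. It is not the
openib btl code: every type and function name (toy_frag_t, toy_desc_t,
toy_sendi, toy_free, toy_sendi_then_free) is hypothetical, and the assumption
that a coalesced descriptor keeps a link to a parent fragment is purely
illustrative. The sketch only shows the general shape of the problem: when a
sendi-style call hands back a descriptor that was coalesced onto another
fragment, the free path has to undo that attachment before releasing it.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy model only: these types and functions are hypothetical, not Open MPI's. */
typedef struct toy_frag {
    int attached;               /* coalesced descriptors still attached */
} toy_frag_t;

typedef struct toy_desc {
    toy_frag_t *parent;         /* non-NULL when coalesced onto a fragment */
} toy_desc_t;

/* Hypothetical sendi: the data is not sent immediately, so a descriptor
 * coalesced onto `frag` is handed back to the caller. Returns 0 on success,
 * -1 if the descriptor could not be allocated. */
static int toy_sendi(toy_frag_t *frag, toy_desc_t **descriptor)
{
    toy_desc_t *d = malloc(sizeof(*d));
    if (d == NULL) {
        *descriptor = NULL;
        return -1;
    }
    d->parent = frag;
    frag->attached++;
    *descriptor = d;
    return 0;
}

/* Hypothetical free with the "extra logic": a coalesced descriptor is
 * detached from its parent before its memory is released. Without the
 * detach step the parent would keep counting a descriptor that is gone. */
static void toy_free(toy_desc_t *d)
{
    if (d == NULL) {
        return;
    }
    if (d->parent != NULL) {
        assert(d->parent->attached > 0);
        d->parent->attached--;
    }
    free(d);
}

/* Runs the sendi-then-free sequence and returns how many descriptors the
 * fragment still believes are attached (0 means the state is consistent,
 * -1 means allocation failed). */
static int toy_sendi_then_free(void)
{
    toy_frag_t frag = { 0 };
    toy_desc_t *descriptor = NULL;

    if (toy_sendi(&frag, &descriptor) != 0) {
        return -1;
    }
    toy_free(descriptor);
    return frag.attached;
}
```

In this toy model, dropping the detach step in toy_free leaves the fragment's
count at 1 after the descriptor is gone. That stale state is only an
illustration of how a free path that ignores coalescing can go wrong, not a
description of the actual openib btl failure.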
Can you please let me know when you fix this? I intend to release 1.8.4 by the
end of the week. Since Mellanox is the only member with IB, you folks have been
maintaining this BTL.
> On Nov 3, 2014, at 6:26 AM, Alina Sklarevich wrote:
>
> Hi,
>
> On 1.8.4rc1 we observe the following asser
Hi,
On 1.8.4rc1 we observe the following assertion failure in the osu_mbw_mr test
when using the openib BTL.
When compiled in production mode (i.e., without --enable-debug), the test simply
hangs.
When using either the tcp BTL or the cm PML, the benchmark completes
without error.
The command line to reproduce th