On Wed, Jan 21, 2015 at 7:19 PM, Roland Dreier <[email protected]> wrote:
> On Tue, Jan 20, 2015 at 8:16 AM, Erez Shitrit <[email protected]> wrote:

>> After trying your V4 patch series, I can tell that first, the endless
>> scheduling of the mcast task is indeed over, but still, the multicast
>> functionality in ipoib is unstable.

> Is this worse than 3.18?  (Have you tested that?)

Roland, Doug,

To be fully clear here, by "this" we're talking about seven patches
whose complexity and volume, I think, go way beyond what fits a
post-rc5 timeline:

  IB/ipoib: Fix failed multicast joins/sends
  IB/ipoib: Add a helper to restart the multicast task
  IB/ipoib: make delayed tasks not hold up everything
  IB/ipoib: Handle -ENETRESET properly in our callback
  IB/ipoib: don't restart our thread on ENETRESET
  IB/ipoib: remove unneeded locks
  IB/ipoib: fix race between mcast_dev_flush and mcast_join

 drivers/infiniband/ulp/ipoib/ipoib.h           |   1 +
 drivers/infiniband/ulp/ipoib/ipoib_multicast.c | 204 +++++++++++++++----------
 2 files changed, 121 insertions(+), 84 deletions(-)

Doug, I understand your claim and frustration that with 3.18 and such
(older kernels) your ifdown/up loop manages to break the driver, but
fixing the driver such that this test works while at the same time
practically breaking IPv6 and IPv4 multicast introduces a deep
regression vs. 3.18 - which as you wrote here, would be wrong to fix
with such a further big change.

Are you really sure that reverting the offending patch 016d9fb25cd9
"IPoIB: fix MCAST_FLAG_BUSY usage", and maybe some dependent hunks
from later patches of that series, isn't possible?

If reverting is indeed not possible, I would suggest that we either
revive the review on the fix we sent [1] or drop the whole 3.19-rc1
changes. I vote for the former.

[1] http://marc.info/?l=linux-rdma&m=142064313123254&w=2

> Because Doug's changes fixed some bad, easy-to-reproduce issues.  On
> the other hand we don't want to introduce new regressions to fix the
> old issues.

See above, we did introduce regressions.

> I think we only have a few days to decide whether to revert back to
> 3.18 code, or push forward with these fixes.

Or.