On Wed, 2015-01-14 at 20:47 +0200, Erez Shitrit wrote:
> On 1/14/2015 6:09 PM, Doug Ledford wrote:
> > On Wed, 2015-01-14 at 18:02 +0200, Erez Shitrit wrote:
> >> Hi Doug,
> >>
> >> Perhaps I am missing something here, but ping6 still doesn't work for me
> >> in many cases.
> >>
> >> I think the reason is that your origin patch does the following:
> >> in function ipoib_mcast_join_task
> >>         if (test_bit(IPOIB_MCAST_FLAG_SENDONLY, &mcast->flags))
> >>                 ipoib_mcast_sendonly_join(mcast);
> >>         else
> >>                 ipoib_mcast_join(dev, mcast, 1);
> >>         return;
> >> The flow for sendonly_join doesn't include handling the mc_task, so only
> >> the first mc in the list (if it is sendonly mcg) will be sent, and no
> >> more mcg's that are in the ipoib mc list are going to be sent. (see how
> >> it is in ipoib_mcast_join flow)
> > Yes, I know what you are talking about.  However, my patches did not add
> > this bug, it was present in the original code.  Please check a plain
> > v3.18 kernel, which does not have my patches, and you will see that
> > ipoib_mcast_sendonly_join_complete also fails to restart the mcast join
> > thread there as well.
> Agree.
> but in 3.18 there was no call from mc_task to sendonly_join, just to the
> full-member join, so no need at that point to handle the task. (the call
> for sendonly-join was by demand whenever new packet to mcg was sent by
> the kernel)
> only in 3.19 the sendonly join was called explicitly from the mc_task.

I just sent a patch set that fixes this.

> >
> >> I can demonstrate it with the log of ipoib:
> >> I am trying to ping6 fe80::202:c903:9f:3b0a via ib0
> >>
> >> The log is:
> >> ib0: restarting multicast task
> >> ib0: setting up send only multicast group for
> >> ff12:601b:ffff:0000:0000:0000:0000:0016
> >> ib0: adding multicast entry for mgid
> >> ff12:601b:ffff:0000:0000:0001:ff43:3bf1
> >> ib0: no multicast record for ff12:601b:ffff:0000:0000:0000:0000:0016,
> >> starting sendonly join
> >> ib0: join completion for ff12:601b:ffff:0000:0000:0000:0000:0001 (status 0)
> >> ib0: MGID ff12:601b:ffff:0000:0000:0000:0000:0001 AV ffff88081afb5f40,
> >> LID 0xc015, SL 0
> >> ib0: join completion for ff12:401b:ffff:0000:0000:0000:0000:0001 (status 0)
> >> ib0: MGID ff12:401b:ffff:0000:0000:0000:0000:0001 AV ffff88081e1c42c0,
> >> LID 0xc014, SL 0
> >> ib0: sendonly multicast join failed for
> >> ff12:601b:ffff:0000:0000:0000:0000:0016, status -22
> >> ib0: no multicast record for ff12:601b:ffff:0000:0000:0000:0000:0016,
> >> starting sendonly join
> >> ib0: sendonly multicast join failed for
> >> ff12:601b:ffff:0000:0000:0000:0000:0016, status -22
> >> ib0: no multicast record for ff12:601b:ffff:0000:0000:0000:0000:0016,
> >> starting sendonly join
> >> ib0: sendonly multicast join failed for
> >> ff12:601b:ffff:0000:0000:0000:0000:0016, status -22
> >> ib0: setting up send only multicast group for
> >> ff12:601b:ffff:0000:0000:0000:0000:0002
> >> ib0: no multicast record for ff12:601b:ffff:0000:0000:0000:0000:0016,
> >> starting sendonly join
> >> ib0: sendonly multicast join failed for
> >> ff12:601b:ffff:0000:0000:0000:0000:0016, status -22
> >> ib0: setting up send only multicast group for
> >> ff12:601b:ffff:0000:0000:0001:ff9f:3b0a
> >> >>>>>> here you can see that the ipv6 address is added and queued
> >> to the list
> >> ib0: no multicast record for ff12:601b:ffff:0000:0000:0000:0000:0016,
> >> starting sendonly join
> >> ib0: sendonly multicast join failed for
> >> ff12:601b:ffff:0000:0000:0000:0000:0016, status -22
> >> >>>>>> the ipv6 mcg will not be sent because it is after some other
> >> sendonly, and no one in that flow re-queue the mc_task again.
> > This is a problem with the design of the original mcast task thread.
> > I'm looking at a fix now.  Currently the design only allows one join to
> > be outstanding at a time.  Is there a reason for that that I'm not aware
> > of?  Some historical context that I don't know about?
> IMHO, the reason for only one mc on the air at a time was to make our
> life easier, otherwise there are locks to take/manage, races between few
> responses, etc. also, the multicast module in the core keeps all the
> requests in serialize mode.
> perhaps, you can use the relevant code from the full-member join in the
> sendonly join in order to handle the mc_task, or to return the call to
> send-only to the mcast_send instead of the mc_task.

I reworked things a bit, but yes, the send only task now does the right
thing.  Please review the latest patchset I posted.  It's working just
fine for me here.

-- 
Doug Ledford <[email protected]>
              GPG KeyID: 0E572FDD
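
[Illustration, not part of the original message.] The stall described in this thread can be sketched as a small stand-alone C simulation. This is not the ipoib driver code: `struct group`, `join_task_run`, `join_complete`, `simulate`, and the three-entry list are all hypothetical names chosen for the example, and every join is assumed to succeed (unlike the status -22 failures in the log above). The only behaviour it models is that one join is outstanding at a time and that the join task runs again only if a completion handler re-queues it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for one entry in the multicast list. */
struct group {
    bool sendonly;  /* send-only join vs. full-member join */
    bool joined;    /* set once the join completion has run */
};

/* One run of the join task: pick the first group that is not joined yet.
 * Returns its index, or -1 when nothing is left to join. */
static int join_task_run(const struct group *g, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (!g[i].joined)
            return (int)i;
    return -1;
}

/* Completion handler. Returns true if it re-queues the join task.
 * In this sketch the full-member path always re-queues; the send-only
 * path re-queues only when requeue_after_sendonly is set. */
static bool join_complete(struct group *grp, bool requeue_after_sendonly)
{
    grp->joined = true;
    return grp->sendonly ? requeue_after_sendonly : true;
}

/* Drive the task until nobody re-queues it; return how many groups
 * were joined. */
size_t simulate(bool requeue_after_sendonly)
{
    /* Assumed list order: a send-only group sits first. */
    struct group list[] = {
        { .sendonly = true,  .joined = false },
        { .sendonly = false, .joined = false },
        { .sendonly = true,  .joined = false },
    };
    const size_t n = sizeof(list) / sizeof(list[0]);
    size_t joined = 0;
    bool task_queued = true;

    while (task_queued) {
        task_queued = false;               /* the task runs once per queueing */
        int i = join_task_run(list, n);
        if (i < 0)
            break;                         /* every group is joined */
        assert((size_t)i < n);
        task_queued = join_complete(&list[i], requeue_after_sendonly);
        joined++;
    }
    return joined;
}
```

With `simulate(false)` the loop stops after the first send-only group, which mirrors the complaint that later entries in the list are never processed. With `simulate(true)` all three groups are joined. Whether the posted patch set re-queues from the completion handler or restructures the task some other way is not stated in this message.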
