Re: dst_ifdown breaks infiniband?

2007-03-20 Thread Michael S. Tsirkin
Quoting Alexey Kuznetsov: Subject: Re: dst_ifdown breaks infiniband? Hello! This might work. Could you post a patch to better show what you mean to do? Here it is. ->neigh_destructor() is killed (not used), replaced with ->neigh_cleanup(), which is called when

Re: dst_ifdown breaks infiniband?

2007-03-20 Thread David Miller
From: Michael S. Tsirkin Date: Tue, 20 Mar 2007 18:02:17 +0200 David, Alexey, what do you think about this patch? Is it right? Could this patch be considered for 2.6.21? Acked-by: Michael S. Tsirkin I plan to apply it and merge.

Re: [ofa-general] Re: dst_ifdown breaks infiniband?

2007-03-19 Thread Alexey Kuznetsov
Hello! Well I don't think the loopback device is currently but as soon as we get network namespace support we will have multiple loopback devices and they will get unregistered when we remove the network namespace. There is no logical difference. At the moment when namespace is gone there is

Re: [ofa-general] Re: dst_ifdown breaks infiniband?

2007-03-19 Thread Alexey Kuznetsov
Hello! Does this look sane (untested)? It does not, unfortunately. Instead of regular crash in infiniband you will get numerous random NULL pointer dereferences both due to dst->neighbour and due to dst->dev. Alexey

Re: [ofa-general] Re: dst_ifdown breaks infiniband?

2007-03-19 Thread Michael S. Tsirkin
Quoting Alexey Kuznetsov: Subject: Re: [ofa-general] Re: dst_ifdown breaks infiniband? Does this look sane (untested)? It does not, unfortunately. Instead of regular crash in infiniband you will get numerous random NULL pointer dereferences both due to dst->neighbour

Re: [ofa-general] Re: dst_ifdown breaks infiniband?

2007-03-19 Thread Alexey Kuznetsov
Hello! I think the thing to do is to just leave the loopback references in place, try to unregister the per-namespace loopback device, and that will safely wait for all the references to go away. Yes, it is exactly how it works in openvz. All the sockets are killed, queues are cleared, nobody

Re: dst_ifdown breaks infiniband?

2007-03-19 Thread Michael S. Tsirkin
Any simpler ideas? Well, if infiniband destructor really needs to take that lock... no. Right now I do not see. OK, this is actually not hard to fix - for infiniband, we can just look at neighbour->dev->type or compare neighbour->dev and neighbour->parms->dev - if they are different, device is

Re: dst_ifdown breaks infiniband?

2007-03-19 Thread Michael S. Tsirkin
Quoting Michael S. Tsirkin: Subject: Re: dst_ifdown breaks infiniband? Any simpler ideas? Well, if infiniband destructor really needs to take that lock... no. Right now I do not see. OK, this is actually not hard to fix - for infiniband, we can just look

Re: dst_ifdown breaks infiniband?

2007-03-19 Thread Alexey Kuznetsov
Hello! If a device driver sets neigh_destructor in neigh_params, this could get called after the device has been unregistered and the driver module removed. It is the same problem: if dst->neighbour holds neighbour, it should not hold device. parms->dev is not supposed to be used after

Re: dst_ifdown breaks infiniband?

2007-03-19 Thread Michael S. Tsirkin
Quoting Alexey Kuznetsov: Subject: Re: dst_ifdown breaks infiniband? Hello! If a device driver sets neigh_destructor in neigh_params, this could get called after the device has been unregistered and the driver module removed. It is the same problem: if dst

Re: dst_ifdown breaks infiniband?

2007-03-19 Thread Michael S. Tsirkin
Quoting Alexey Kuznetsov: Subject: Re: dst_ifdown breaks infiniband? Hello! If a device driver sets neigh_destructor in neigh_params, this could get called after the device has been unregistered and the driver module removed. It is the same problem: if dst

Re: dst_ifdown breaks infiniband?

2007-03-19 Thread Alexey Kuznetsov
Hello! infiniband sets parm->neigh_destructor, and I search for a way to prevent this destructor from being called after the module has been unloaded. Ideas? It must be called in any case to update/release internal ipoib structures. The idea is to move call of parm->neigh_destructor from

Re: [ofa-general] Re: dst_ifdown breaks infiniband?

2007-03-19 Thread Eric W. Biederman
David Miller writes: I think the thing to do is to just leave the loopback references in place, try to unregister the per-namespace loopback device, and that will safely wait for all the references to go away. Right. The only thing I have found that needs to be changed so

Re: dst_ifdown breaks infiniband?

2007-03-19 Thread Michael S. Tsirkin
Quoting Alexey Kuznetsov: Subject: Re: dst_ifdown breaks infiniband? Hello! infiniband sets parm->neigh_destructor, and I search for a way to prevent this destructor from being called after the module has been unloaded. Ideas? It must be called in any case

Re: dst_ifdown breaks infiniband?

2007-03-19 Thread Alexey Kuznetsov
Hello! This might work. Could you post a patch to better show what you mean to do? Here it is. ->neigh_destructor() is killed (not used), replaced with ->neigh_cleanup(), which is called when neighbor entry goes to dead state. At this point everything is still valid: neigh->dev, neigh->parms etc.
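
The ordering described in this message can be illustrated with a small userspace model. This is only a sketch under stated assumptions: every name below (model_dev, model_neigh, model_neigh_kill, model_neigh_release, model_driver_cleanup, model_scenario) is hypothetical and is not the kernel's API or the posted patch. It only shows the idea that a driver hook run at the moment the entry goes dead still sees a valid device, while the final reference drop, which may happen after the device is gone, runs no driver code. Clearing the device pointer after the hook is a modelling choice to make late use visible, not a claim about what the kernel does.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace model; these are NOT the kernel's structures. */
struct model_dev {
    int registered;              /* 1 while the device (and its driver) exist */
};

struct model_neigh {
    struct model_dev *dev;       /* only safe to use while the entry is alive */
    int refcnt;                  /* references held by the cache and by dsts */
    int dead;                    /* set once the entry has been killed */
    int driver_state_released;   /* set by the driver's cleanup hook */
    void (*cleanup)(struct model_neigh *n);
};

/* Driver hook: it is only ever called while n->dev is still valid. */
static void model_driver_cleanup(struct model_neigh *n)
{
    assert(n->dev != NULL && n->dev->registered);
    n->driver_state_released = 1;
}

/* Mark the entry dead and run the hook exactly once, before the
 * device pointer is dropped (a choice made by this model). */
static void model_neigh_kill(struct model_neigh *n)
{
    if (n->dead)
        return;
    n->dead = 1;
    if (n->cleanup != NULL)
        n->cleanup(n);
    n->dev = NULL;
}

/* Drop one reference; returns 1 when the last reference is gone.
 * Nothing driver-specific runs here. */
static int model_neigh_release(struct model_neigh *n)
{
    n->refcnt--;
    return n->refcnt == 0;
}

/* Scenario: a dst still holds the entry when the device goes away. */
int model_scenario(void)
{
    struct model_dev dev = { 1 };
    struct model_neigh n = { &dev, 2, 0, 0, model_driver_cleanup };
    int freed;

    model_neigh_kill(&n);            /* hook runs, device still registered */
    model_neigh_release(&n);         /* cache drops its reference */
    dev.registered = 0;              /* device unregistered, module may go */
    freed = model_neigh_release(&n); /* late dst drops the last reference */

    return freed && n.driver_state_released && n.dev == NULL;
}
```

In this model, model_scenario() returns nonzero only if the hook ran before the device was unregistered and the last reference drop happened afterwards without touching the device.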

dst_ifdown breaks infiniband?

2007-03-18 Thread Michael S. Tsirkin
Alexey, Roland, In debugging kernel lockup that occurs with IP over InfiniBand in 2.6.21-rc4: ( https://bugs.openfabrics.org/show_bug.cgi?id=402 ) I noticed the following code in dst_ifdown: /* Dirty hack. We did it in 2.2 (in __dst_free), * we have _very_ good reasons not to repeat * this

Re: dst_ifdown breaks infiniband?

2007-03-18 Thread Alexey Kuznetsov
Hello! This is not new code, and should have triggered long time ago, so I am not sure how come we are triggering this only now, but somehow this did not lead to crashes in 2.6.20 I see. I guess this was plain luck. Why is neighbour->dev changed here? It holds reference to device and

Re: dst_ifdown breaks infiniband?

2007-03-18 Thread Michael S. Tsirkin
Quoting Alexey Kuznetsov: Subject: Re: dst_ifdown breaks infiniband? Hello! This is not new code, and should have triggered long time ago, so I am not sure how come we are triggering this only now, but somehow this did not lead to crashes in 2.6.20 I see. I guess

Re: dst_ifdown breaks infiniband?

2007-03-18 Thread Michael S. Tsirkin
Quoting Alexey Kuznetsov: Subject: Re: dst_ifdown breaks infiniband? Hello! This is not new code, and should have triggered long time ago, so I am not sure how come we are triggering this only now, but somehow this did not lead to crashes in 2.6.20 I see. I guess

Re: dst_ifdown breaks infiniband?

2007-03-18 Thread Alexey Kuznetsov
Hello! Hmm. Something I don't understand: does the code in question not run on *each* device unregister? It does. Why do I only see this under stress? You should have some referenced destination entries to trigger bad path. This should happen not only under stress. F.e. just try to ssh to

Re: dst_ifdown breaks infiniband?

2007-03-18 Thread Alexey Kuznetsov
Hello! It should be cleared and we should be sure it will not be destroyed before quiescent state. I'm confused. Didn't you say dst_ifdown is called after quiescent state? Quiescent state should happen after dst->neighbour is invalidated. And this implies that all the users of

Re: dst_ifdown breaks infiniband?

2007-03-18 Thread Michael S. Tsirkin
Quoting Alexey Kuznetsov: Subject: Re: dst_ifdown breaks infiniband? Hello! Hmm. Something I don't understand: does the code in question not run on *each* device unregister? It does. Why do I only see this under stress? You should have some referenced

Re: dst_ifdown breaks infiniband?

2007-03-18 Thread Michael S. Tsirkin
Why is neighbour->dev changed here? It holds reference to device and prevents its destruction. If dst is held somewhere, we cannot destroy the device and deadlock while unregister. BTW, can this ever happen for the loopback device itself? Is it ever unregistered? -- MST

Re: dst_ifdown breaks infiniband?

2007-03-18 Thread Michael S. Tsirkin
It should be cleared and we should be sure it will not be destroyed before quiescent state. I'm confused. Didn't you say dst_ifdown is called after quiescent state? Quiescent state should happen after dst->neighbour is invalidated. And this implies that all the users of dst->neighbour

Re: dst_ifdown breaks infiniband?

2007-03-18 Thread Michael S. Tsirkin
Quoting Alexey Kuznetsov: Subject: Re: dst_ifdown breaks infiniband? Can dst->neighbour be changed to point to NULL instead, and the neighbour released? It should be cleared and we should be sure it will not be destroyed before quiescent state. Seems, this is the only

Re: dst_ifdown breaks infiniband?

2007-03-18 Thread Michael S. Tsirkin
Quoting Michael S. Tsirkin: Subject: Re: dst_ifdown breaks infiniband? Quoting Alexey Kuznetsov: Subject: Re: dst_ifdown breaks infiniband? Can dst->neighbour be changed to point to NULL instead, and the neighbour released? It should be cleared

Re: dst_ifdown breaks infiniband?

2007-03-18 Thread Michael S. Tsirkin
Quoting Michael S. Tsirkin: Subject: Re: dst_ifdown breaks infiniband? Quoting Michael S. Tsirkin: Subject: Re: dst_ifdown breaks infiniband? Quoting Alexey Kuznetsov: Subject: Re: dst_ifdown breaks infiniband? Can dst

Re: [ofa-general] Re: dst_ifdown breaks infiniband?

2007-03-18 Thread Michael S. Tsirkin
Quoting Eric W. Biederman ebiederman@lnxi.com: Subject: Re: [ofa-general] Re: dst_ifdown breaks infiniband? Michael S. Tsirkin writes: Why is neighbour->dev changed here? It holds reference to device and prevents its destruction. If dst is held somewhere, we

Re: [ofa-general] Re: dst_ifdown breaks infiniband?

2007-03-18 Thread Michael S. Tsirkin
Quoting Michael S. Tsirkin: Subject: Re: [ofa-general] Re: dst_ifdown breaks infiniband? Quoting Eric W. Biederman ebiederman@lnxi.com: Subject: Re: [ofa-general] Re: dst_ifdown breaks infiniband? Michael S. Tsirkin writes: Why is neighbour

Re: [ofa-general] Re: dst_ifdown breaks infiniband?

2007-03-18 Thread David Miller
From: Michael S. Tsirkin Date: Mon, 19 Mar 2007 00:42:34 +0200 Quoting Michael S. Tsirkin: Subject: Re: [ofa-general] Re: dst_ifdown breaks infiniband? Quoting Eric W. Biederman ebiederman@lnxi.com: Subject: Re: [ofa-general] Re: dst_ifdown breaks

Re: [ofa-general] Re: dst_ifdown breaks infiniband?

2007-03-18 Thread Eric W. Biederman
Michael S. Tsirkin writes: Why is neighbour->dev changed here? It holds reference to device and prevents its destruction. If dst is held somewhere, we cannot destroy the device and deadlock while unregister. BTW, can this ever happen for the loopback device itself? Is

Re: [ofa-general] Re: dst_ifdown breaks infiniband?

2007-03-18 Thread Michael S. Tsirkin
Quoting David Miller: Subject: Re: [ofa-general] Re: dst_ifdown breaks infiniband? From: Michael S. Tsirkin Date: Mon, 19 Mar 2007 00:42:34 +0200 Quoting Michael S. Tsirkin: Subject: Re: [ofa-general] Re: dst_ifdown breaks

Re: dst_ifdown breaks infiniband?

2007-03-18 Thread Michael S. Tsirkin
Quoting Michael S. Tsirkin: Subject: Re: dst_ifdown breaks infiniband? Quoting Michael S. Tsirkin: Subject: Re: dst_ifdown breaks infiniband? Quoting Michael S. Tsirkin: Subject: Re: dst_ifdown breaks infiniband? Quoting

Re: [ofa-general] Re: dst_ifdown breaks infiniband?

2007-03-18 Thread Eric W. Biederman
David Miller writes: From: Michael S. Tsirkin Date: Mon, 19 Mar 2007 00:42:34 +0200 Hmm. Then the code moving dst->dev to point to the loopback device will have to be fixed too. I'll post a patch a bit later. Does this look sane (untested)?

Re: [ofa-general] Re: dst_ifdown breaks infiniband?

2007-03-18 Thread David Miller
From: ebiederman@lnxi.com (Eric W. Biederman) Date: Sun, 18 Mar 2007 23:30:39 -0600 Sure. In the network namespace case I think the careful ordering of the shutdown handles that case. Even with per network namespace lo unregistered it still existed until the network namespace actually