Quoting Alexey Kuznetsov [EMAIL PROTECTED]:
Subject: Re: dst_ifdown breaks infiniband?
Hello!
This might work. Could you post a patch to better show what you mean to do?
Here it is.
->neigh_destructor() is killed (not used), replaced with ->neigh_cleanup(),
which is called when
From: Michael S. Tsirkin [EMAIL PROTECTED]
Date: Tue, 20 Mar 2007 18:02:17 +0200
David, Alexey, what do you think about this patch? Is it right?
Could this patch be considered for 2.6.21?
Acked-by: Michael S. Tsirkin [EMAIL PROTECTED]
I plan to apply it and merge.
Hello!
Well I don't think the loopback device is currently but as soon
as we get network namespace support we will have multiple loopback
devices and they will get unregistered when we remove the network
namespace.
There is no logical difference. At the moment when namespace is gone
there is
Hello!
Does this look sane (untested)?
It does not, unfortunately.
Instead of regular crash in infiniband you will get numerous
random NULL pointer dereferences both due to dst->neighbour
and due to dst->dev.
Alexey
Quoting Alexey Kuznetsov [EMAIL PROTECTED]:
Subject: Re: [ofa-general] Re: dst_ifdown breaks infiniband?
Does this look sane (untested)?
It does not, unfortunately.
Instead of regular crash in infiniband you will get numerous
random NULL pointer dereferences both due to dst->neighbour
Hello!
I think the thing to do is to just leave the loopback references
in place, try to unregister the per-namespace loopback device,
and that will safely wait for all the references to go away.
Yes, it is exactly how it works in openvz. All the sockets are killed,
queues are cleared, nobody
Any simpler ideas?
Well, if infiniband destructor really needs to take that lock... no.
Right now I do not see.
OK, this is actually not hard to fix - for infiniband, we can just look at
neighbour->dev->type or compare neighbour->dev and
neighbour->parms->dev - if they are different, device is
Quoting Michael S. Tsirkin [EMAIL PROTECTED]:
Subject: Re: dst_ifdown breaks infiniband?
Any simpler ideas?
Well, if infiniband destructor really needs to take that lock... no.
Right now I do not see.
OK, this is actually not hard to fix - for infiniband, we can just look
Hello!
If a device driver sets neigh_destructor in neigh_params, this could
get called after the device has been unregistered and the driver module
removed.
It is the same problem: if dst->neighbour holds neighbour, it should
not hold device. parms->dev is not supposed to be used after
Quoting Alexey Kuznetsov [EMAIL PROTECTED]:
Subject: Re: dst_ifdown breaks infiniband?
Hello!
If a device driver sets neigh_destructor in neigh_params, this could
get called after the device has been unregistered and the driver module
removed.
It is the same problem: if dst
Quoting Alexey Kuznetsov [EMAIL PROTECTED]:
Subject: Re: dst_ifdown breaks infiniband?
Hello!
If a device driver sets neigh_destructor in neigh_params, this could
get called after the device has been unregistered and the driver module
removed.
It is the same problem: if dst
Hello!
infiniband sets parm->neigh_destructor, and I search for a way to prevent
this destructor from being called after the module has been unloaded.
Ideas?
It must be called in any case to update/release internal ipoib structures.
The idea is to move call of parm->neigh_destructor from
David Miller [EMAIL PROTECTED] writes:
I think the thing to do is to just leave the loopback references
in place, try to unregister the per-namespace loopback device,
and that will safely wait for all the references to go away.
Right. The only thing I have found that needs to be changed so
Quoting Alexey Kuznetsov [EMAIL PROTECTED]:
Subject: Re: dst_ifdown breaks infiniband?
Hello!
infiniband sets parm->neigh_destructor, and I search for a way to prevent
this destructor from being called after the module has been unloaded.
Ideas?
It must be called in any case
Hello!
This might work. Could you post a patch to better show what you mean to do?
Here it is.
->neigh_destructor() is killed (not used), replaced with ->neigh_cleanup(),
which is called when neighbor entry goes to dead state. At this point
everything is still valid: neigh->dev, neigh->parms etc.
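
To illustrate the ordering described above, here is a rough userspace sketch, not the actual patch. It assumes a per-parms cleanup hook that runs exactly once, at the moment the entry is marked dead, while the device is still registered. All struct and function names carrying a _model suffix, and the demo function, are hypothetical and are not kernel API.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins for the kernel objects discussed here. */
struct net_device_model {
    bool registered;
};

struct neighbour_model;

struct neigh_parms_model {
    struct net_device_model *dev;
    /* Driver hook, called when the entry goes dead (assumed semantics). */
    void (*neigh_cleanup)(struct neighbour_model *n);
};

struct neighbour_model {
    struct net_device_model *dev;
    struct neigh_parms_model *parms;
    bool dead;
    int cleanup_calls;          /* how many times the hook ran */
    bool dev_valid_at_cleanup;  /* was the device still registered then? */
};

/* Example driver hook: records what it could still see when it ran. */
static void driver_cleanup_model(struct neighbour_model *n)
{
    n->cleanup_calls++;
    n->dev_valid_at_cleanup = (n->dev != NULL) && n->dev->registered;
}

/* Mark the entry dead and run the hook once, while dev and parms are valid. */
static void neigh_mark_dead_model(struct neighbour_model *n)
{
    if (n->dead)
        return;
    n->dead = true;
    if (n->parms != NULL && n->parms->neigh_cleanup != NULL)
        n->parms->neigh_cleanup(n);
}

/* Returns 1 if the hook ran exactly once and saw a registered device. */
static int demo_cleanup_sees_valid_device(void)
{
    struct net_device_model dev = { .registered = true };
    struct neigh_parms_model parms = {
        .dev = &dev,
        .neigh_cleanup = driver_cleanup_model,
    };
    struct neighbour_model n = { .dev = &dev, .parms = &parms };

    neigh_mark_dead_model(&n);  /* hook runs here */
    neigh_mark_dead_model(&n);  /* second call is a no-op */
    dev.registered = false;     /* unregister happens only afterwards */

    return n.cleanup_calls == 1 && n.dev_valid_at_cleanup;
}
```

The point of the sketch is only the ordering: the driver gets its callback before the device goes away, so nothing driver-specific is left to run when the last reference to the entry is finally dropped.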
Alexey, Roland,
In debugging kernel lockup that occurs with IP over InfiniBand in 2.6.21-rc4:
( https://bugs.openfabrics.org/show_bug.cgi?id=402 )
I noticed the following code in dst_ifdown:
/* Dirty hack. We did it in 2.2 (in __dst_free),
* we have _very_ good reasons not to repeat
* this
Hello!
This is not new code, and should have triggered long time ago,
so I am not sure how come we are triggering this only now,
but somehow this did not lead to crashes in 2.6.20
I see. I guess this was plain luck.
Why is neighbour->dev changed here?
It holds reference to device and
Quoting Alexey Kuznetsov [EMAIL PROTECTED]:
Subject: Re: dst_ifdown breaks infiniband?
Hello!
This is not new code, and should have triggered long time ago,
so I am not sure how come we are triggering this only now,
but somehow this did not lead to crashes in 2.6.20
I see. I guess
Quoting Alexey Kuznetsov [EMAIL PROTECTED]:
Subject: Re: dst_ifdown breaks infiniband?
Hello!
This is not new code, and should have triggered long time ago,
so I am not sure how come we are triggering this only now,
but somehow this did not lead to crashes in 2.6.20
I see. I guess
Hello!
Hmm. Something I don't understand: does the code
in question not run on *each* device unregister?
It does.
Why do I only see this under stress?
You should have some referenced destination entries to trigger bad path.
This should happen not only under stress.
F.e. just try to ssh to
Hello!
It should be cleared and we should be sure it will not be destroyed
before quiescent state.
I'm confused. didn't you say dst_ifdown is called after quiescent state?
Quiescent state should happen after dst->neighbour is invalidated.
And this implies that all the users of
Quoting Alexey Kuznetsov [EMAIL PROTECTED]:
Subject: Re: dst_ifdown breaks infiniband?
Hello!
Hmm. Something I don't understand: does the code
in question not run on *each* device unregister?
It does.
Why do I only see this under stress?
You should have some referenced
Why is neighbour->dev changed here?
It holds reference to device and prevents its destruction.
If dst is held somewhere, we cannot destroy the device and deadlock
while unregister.
BTW, can this ever happen for the loopback device itself?
Is it ever unregistered?
--
MST
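
A rough userspace sketch of the reference swap discussed above, under the assumption that unregister can only complete once the device refcount drops to zero. The _model types and helpers are hypothetical illustrations, not the kernel's dst_ifdown() code.

```c
#include <stddef.h>

/* Hypothetical stand-ins: a device with a refcount, a dst pointing at it. */
struct dev_model {
    int refcnt;
};

struct dst_model {
    struct dev_model *dev;
};

static void dev_hold_model(struct dev_model *d) { d->refcnt++; }
static void dev_put_model(struct dev_model *d)  { d->refcnt--; }

/* Assumption: unregister has to wait until nobody references the device. */
static int can_unregister_model(const struct dev_model *d)
{
    return d->refcnt == 0;
}

/*
 * When a device goes down, repoint a still-referenced dst at loopback so
 * that the dst no longer pins the device being removed.
 */
static void dst_ifdown_model(struct dst_model *dst,
                             struct dev_model *going_down,
                             struct dev_model *loopback)
{
    if (dst->dev != going_down)
        return;
    dev_hold_model(loopback);   /* take the new reference first */
    dst->dev = loopback;
    dev_put_model(going_down);  /* then drop the old one */
}

/* Returns 1 if the swap unblocks unregister and moves the ref to loopback. */
static int demo_swap_releases_device(void)
{
    struct dev_model lo = { 0 };
    struct dev_model ib0 = { 0 };
    struct dst_model dst = { .dev = &ib0 };
    int blocked_before;

    dev_hold_model(&ib0);       /* the dst holds a reference to ib0 */
    blocked_before = !can_unregister_model(&ib0);

    dst_ifdown_model(&dst, &ib0, &lo);

    return blocked_before && can_unregister_model(&ib0) &&
           dst.dev == &lo && lo.refcnt == 1;
}
```

The same swap applied to neighbour->dev is what the thread is questioning: it releases the device, but a driver destructor that runs later then sees loopback instead of its own device.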
It should be cleared and we should be sure it will not be destroyed
before quiescent state.
I'm confused. didn't you say dst_ifdown is called after quiescent state?
Quiescent state should happen after dst->neighbour is invalidated.
And this implies that all the users of dst->neighbour
Quoting Alexey Kuznetsov [EMAIL PROTECTED]:
Subject: Re: dst_ifdown breaks infiniband?
Can dst->neighbour be changed to point to NULL instead, and the neighbour
released?
It should be cleared and we should be sure it will not be destroyed
before quiescent state.
Seems, this is the only
Quoting Michael S. Tsirkin [EMAIL PROTECTED]:
Subject: Re: dst_ifdown breaks infiniband?
Quoting Alexey Kuznetsov [EMAIL PROTECTED]:
Subject: Re: dst_ifdown breaks infiniband?
Can dst->neighbour be changed to point to NULL instead, and the neighbour
released?
It should be cleared
Quoting Michael S. Tsirkin [EMAIL PROTECTED]:
Subject: Re: dst_ifdown breaks infiniband?
Quoting Michael S. Tsirkin [EMAIL PROTECTED]:
Subject: Re: dst_ifdown breaks infiniband?
Quoting Alexey Kuznetsov [EMAIL PROTECTED]:
Subject: Re: dst_ifdown breaks infiniband?
Can dst
Quoting Eric W. Biederman ebiederman@lnxi.com:
Subject: Re: [ofa-general] Re: dst_ifdown breaks infiniband?
Michael S. Tsirkin [EMAIL PROTECTED] writes:
Why is neighbour->dev changed here?
It holds reference to device and prevents its destruction.
If dst is held somewhere, we
Quoting Michael S. Tsirkin [EMAIL PROTECTED]:
Subject: Re: [ofa-general] Re: dst_ifdown breaks infiniband?
Quoting Eric W. Biederman ebiederman@lnxi.com:
Subject: Re: [ofa-general] Re: dst_ifdown breaks infiniband?
Michael S. Tsirkin [EMAIL PROTECTED] writes:
Why is neighbour
From: Michael S. Tsirkin [EMAIL PROTECTED]
Date: Mon, 19 Mar 2007 00:42:34 +0200
Quoting Michael S. Tsirkin [EMAIL PROTECTED]:
Subject: Re: [ofa-general] Re: dst_ifdown breaks infiniband?
Quoting Eric W. Biederman ebiederman@lnxi.com:
Subject: Re: [ofa-general] Re: dst_ifdown breaks
Michael S. Tsirkin [EMAIL PROTECTED] writes:
Why is neighbour->dev changed here?
It holds reference to device and prevents its destruction.
If dst is held somewhere, we cannot destroy the device and deadlock
while unregister.
BTW, can this ever happen for the loopback device itself?
Is
Quoting David Miller [EMAIL PROTECTED]:
Subject: Re: [ofa-general] Re: dst_ifdown breaks infiniband?
From: Michael S. Tsirkin [EMAIL PROTECTED]
Date: Mon, 19 Mar 2007 00:42:34 +0200
Quoting Michael S. Tsirkin [EMAIL PROTECTED]:
Subject: Re: [ofa-general] Re: dst_ifdown breaks
Quoting Michael S. Tsirkin [EMAIL PROTECTED]:
Subject: Re: dst_ifdown breaks infiniband?
Quoting Michael S. Tsirkin [EMAIL PROTECTED]:
Subject: Re: dst_ifdown breaks infiniband?
Quoting Michael S. Tsirkin [EMAIL PROTECTED]:
Subject: Re: dst_ifdown breaks infiniband?
Quoting
David Miller [EMAIL PROTECTED] writes:
From: Michael S. Tsirkin [EMAIL PROTECTED]
Date: Mon, 19 Mar 2007 00:42:34 +0200
Hmm. Then the code moving dst->dev to point to the loopback
device will have to be fixed too. I'll post a patch a bit later.
Does this look sane (untested)?
From: ebiederman@lnxi.com (Eric W. Biederman)
Date: Sun, 18 Mar 2007 23:30:39 -0600
Sure. In the network namespace case I think the careful ordering of the
shutdown handles that case. Even with per network namespace lo
unregistered it still existed until the network namespace actually