Re: [PATCH RFD] alternative kobject release wait mechanism

2007-04-26 Thread Cornelia Huck
On Thu, 26 Apr 2007 10:58:45 -0400 (EDT),
Alan Stern <[EMAIL PROTECTED]> wrote:

> > > > The remove() method must also unset driver_data.
> > > 
> > > It doesn't really have to.  The driver core could do it.
> > 
> > I think it is more consistent if the driver takes care of the fields
> > specifically designed for its usage.
> 
> Yes.  However if the driver forgets to clear the field it shouldn't cause
> a warning.  After all, there won't be any harm; the next driver to bind 
> to the device will just overwrite the driver_data anyway.

Agreed, but it is still a good practice and should be recommended.

> > Yes. Especially since the "gone"-field may be contained in that
> > embedding structure if the subsystem controls it.
> 
> No, no!  The "gone" flag must be in the private data structure.  If it 
> were in a container of the device structure, then it could be overwritten 
> when a different driver binds to the device.

Argl, thinko again. You're right, of course.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH RFD] alternative kobject release wait mechanism

2007-04-26 Thread Alan Stern
On Thu, 26 Apr 2007, Cornelia Huck wrote:

> On Wed, 25 Apr 2007 16:13:12 -0400 (EDT),
> Alan Stern <[EMAIL PROTECTED]> wrote:
> 
> > > Documentation/driver-model/lifetime-rules.txt?
> > 
> > When (if) such a file is added, it should contain more than just these few 
> > paragraphs.
> 
> This file may be a good idea in general. I'll see if I can come up with
> something useful.

Good.

> > > The remove() method must also unset driver_data.
> > 
> > It doesn't really have to.  The driver core could do it.
> 
> I think it is more consistent if the driver takes care of the fields
> specifically designed for its usage.

Yes.  However if the driver forgets to clear the field it shouldn't cause
a warning.  After all, there won't be any harm; the next driver to bind 
to the device will just overwrite the driver_data anyway.

> > If every driver follows this rule, we may not need the kobject->owner 
> > stuff.  Although it wouldn't hurt to keep it for safety's sake.
> 
> I'm still concerned about that problem. It is that there is simply no
> guarantee that everybody released their reference before the module's
> exit function returned (the driver itself can and must assure that it
> drops its last reference before it finishes exit, but it just can't
> control who else holds a reference). I'm much in favour of adding a bit
> of code than to risk oopses about which the driver can't do anything
> itself (plus, the race exists now, but the immediate disconnect will
> involve auditing all drivers).

I agree.

> > > > The driver must restrict itself to reading (not
> > > > writing!) the fields in the device structure.  The
> > > > only exception is that the driver may lock/unlock
> > > > dev->sem.
> > > 
> > > And it may decrease the reference count, of course :)
> > 
> > :-)  Actually this should be a little more general, since the device will 
> > generally be embedded in a larger structure created by the subsystem.  The 
> > driver should also be able to acquire and release locks in that larger 
> > structure.
> 
> Yes. Especially since the "gone"-field may be contained in that
> embedding structure if the subsystem controls it.

No, no!  The "gone" flag must be in the private data structure.  If it 
were in a container of the device structure, then it could be overwritten 
when a different driver binds to the device.

Alan Stern

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH RFD] alternative kobject release wait mechanism

2007-04-26 Thread Cornelia Huck
On Wed, 25 Apr 2007 16:13:12 -0400 (EDT),
Alan Stern <[EMAIL PROTECTED]> wrote:

> > Documentation/driver-model/lifetime-rules.txt?
> 
> When (if) such a file is added, it should contain more than just these few 
> paragraphs.

This file may be a good idea in general. I'll see if I can come up with
something useful.

> > The remove() method must also unset driver_data.
> 
> It doesn't really have to.  The driver core could do it.

I think it is more consistent if the driver takes care of the fields
specifically designed for its usage.

> It should never need to make that transition.  The only routines in the
> driver which get passed a pointer to the device (as opposed to a pointer
> to the private data) should be the methods called by the driver core or
> sysfs.  Neither will make any calls to the driver after remove() has
> returned, unless the driver does something wrong.

OK, thinko on my side.

> > Uhm. This would imply that a driver may never create a device itself.
> 
> A trivial omission on my part -- "The driver must also retain a module 
> reference to the owner of the device (unless the driver is itself the 
> device's owner)."
> 
> > However, the kobject->owner and module refcounting stuff should remove
> > this restriction.
> 
> If every driver follows this rule, we may not need the kobject->owner 
> stuff.  Although it wouldn't hurt to keep it for safety's sake.

I'm still concerned about that problem. It is that there is simply no
guarantee that everybody released their reference before the module's
exit function returned (the driver itself can and must assure that it
drops its last reference before it finishes exit, but it just can't
control who else holds a reference). I'm much in favour of adding a bit
of code than to risk oopses about which the driver can't do anything
itself (plus, the race exists now, but the immediate disconnect will
involve auditing all drivers).

> > >   The driver must restrict itself to reading (not
> > >   writing!) the fields in the device structure.  The
> > >   only exception is that the driver may lock/unlock
> > >   dev->sem.
> > 
> > And it may decrease the reference count, of course :)
> 
> :-)  Actually this should be a little more general, since the device will 
> generally be embedded in a larger structure created by the subsystem.  The 
> driver should also be able to acquire and release locks in that larger 
> structure.

Yes. Especially since the "gone"-field may be contained in that
embedding structure if the subsystem controls it.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH RFD] alternative kobject release wait mechanism

2007-04-26 Thread Cornelia Huck
On Wed, 25 Apr 2007 16:13:12 -0400 (EDT),
Alan Stern [EMAIL PROTECTED] wrote:

  Documentation/driver-model/lifetime-rules.txt?
 
 When (if) such a file is added, it should contain more than just these few 
 paragraphs.

This file may be a good idea in general. I'll see if I can come up with
something useful.

  The remove() method must also unset driver_data.
 
 It doesn't really have to.  The driver core could do it.

I think it is more consistent if the driver takes care of the fields
specifically designed for its usage.

 It should never need to make that transition.  The only routines in the
 driver which get passed a pointer to the device (as opposed to a pointer
 to the private data) should be the methods called by the driver core or
 sysfs.  Neither will make any calls to the driver after remove() has
 returned, unless the driver does something wrong.

OK, thinko on my side.

  Uhm. This would imply that a driver may never create a device itself.
 
 A trivial omission on my part -- The driver must also retain a module 
 reference to the owner of the device (unless the driver is itself the 
 device's owner).
 
  However, the kobject-owner and module refcounting stuff should remove
  this restriction.
 
 If every driver follows this rule, we may not need the kobject-owner 
 stuff.  Although it wouldn't hurt to keep it for safety's sake.

I'm still concerned about that problem. It is that there is simply no
guarantee that everybody released their reference before the module's
exit function returned (the driver itself can and must assure that it
drops its last reference before it finishes exit, but it just can't
control who else holds a reference). I'm much in favour of adding a bit
of code than to risk oopses about which the driver can't do anything
itself (plus, the race exists now, but the immediate disconnect will
involve auditing all drivers).

 The driver must restrict itself to reading (not
 writing!) the fields in the device structure.  The
 only exception is that the driver may lock/unlock
 dev-sem.
  
  And it may decrease the reference count, of course :)
 
 :-)  Actually this should be a little more general, since the device will 
 generally be embedded in a larger structure created by the subsystem.  The 
 driver should also be able to acquire and release locks in that larger 
 structure.

Yes. Especially since the gone-field may be contained in that
embedding structure if the subsystem controls it.
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH RFD] alternative kobject release wait mechanism

2007-04-26 Thread Alan Stern
On Thu, 26 Apr 2007, Cornelia Huck wrote:

 On Wed, 25 Apr 2007 16:13:12 -0400 (EDT),
 Alan Stern [EMAIL PROTECTED] wrote:
 
   Documentation/driver-model/lifetime-rules.txt?
  
  When (if) such a file is added, it should contain more than just these few 
  paragraphs.
 
 This file may be a good idea in general. I'll see if I can come up with
 something useful.

Good.

   The remove() method must also unset driver_data.
  
  It doesn't really have to.  The driver core could do it.
 
 I think it is more consistent if the driver takes care of the fields
 specifically designed for its usage.

Yes.  However if the driver forgets to clear the field it shouldn't cause
a warning.  After all, there won't be any harm; the next driver to bind 
to the device will just overwrite the driver_data anyway.

  If every driver follows this rule, we may not need the kobject-owner 
  stuff.  Although it wouldn't hurt to keep it for safety's sake.
 
 I'm still concerned about that problem. It is that there is simply no
 guarantee that everybody released their reference before the module's
 exit function returned (the driver itself can and must assure that it
 drops its last reference before it finishes exit, but it just can't
 control who else holds a reference). I'm much in favour of adding a bit
 of code than to risk oopses about which the driver can't do anything
 itself (plus, the race exists now, but the immediate disconnect will
 involve auditing all drivers).

I agree.

The driver must restrict itself to reading (not
writing!) the fields in the device structure.  The
only exception is that the driver may lock/unlock
dev-sem.
   
   And it may decrease the reference count, of course :)
  
  :-)  Actually this should be a little more general, since the device will 
  generally be embedded in a larger structure created by the subsystem.  The 
  driver should also be able to acquire and release locks in that larger 
  structure.
 
 Yes. Especially since the gone-field may be contained in that
 embedding structure if the subsystem controls it.

No, no!  The gone flag must be in the private data structure.  If it 
were in a container of the device structure, then it could be overwritten 
when a different driver binds to the device.

Alan Stern

-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH RFD] alternative kobject release wait mechanism

2007-04-26 Thread Cornelia Huck
On Thu, 26 Apr 2007 10:58:45 -0400 (EDT),
Alan Stern [EMAIL PROTECTED] wrote:

The remove() method must also unset driver_data.
   
   It doesn't really have to.  The driver core could do it.
  
  I think it is more consistent if the driver takes care of the fields
  specifically designed for its usage.
 
 Yes.  However if the driver forgets to clear the field it shouldn't cause
 a warning.  After all, there won't be any harm; the next driver to bind 
 to the device will just overwrite the driver_data anyway.

Agreed, but it is still a good practice and should be recommended.

  Yes. Especially since the gone-field may be contained in that
  embedding structure if the subsystem controls it.
 
 No, no!  The gone flag must be in the private data structure.  If it 
 were in a container of the device structure, then it could be overwritten 
 when a different driver binds to the device.

Argl, thinko again. You're right, of course.
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH RFD] alternative kobject release wait mechanism

2007-04-25 Thread Alan Stern
On Wed, 25 Apr 2007, Cornelia Huck wrote:

> On Tue, 24 Apr 2007 15:38:12 -0400 (EDT),
> Alan Stern <[EMAIL PROTECTED]> wrote:
> 
> > We ought to make it explicitly clear that _all_ subsystems should behave
> > this way.  Maybe it isn't necessary to go as far as having device_del()
> > call itself recursively; doing that would open up lots of possible races.  
> > But I think it would be a good idea to add a WARN_ON in device_del, right
> > after the call to bus_remove_device(), that would be triggered if the
> > device still had any children.
> 
> If we decide that this should be policy, WARN_ON may be the least
> invasive option.

Yes.  I have changed my mind regarding your earlier proposal; I now think
it's best to make it "mandatory" for remove() to unregister all children.  
"Mandatory" in the sense that not doing so will provoke a big fat WARN_ON
complaint.

Having device_del() call itself recursively is not a good idea.  It would
destroy that guarantee that no one but the creator of a device will
unregister it.

> Should it be a possible option to move children to a different parent,
> so that the requirement wouldn't be "unregister all children", but "no
> children remain after remove() returns"?

That might confuse udev, but otherwise it could be okay.

> > It would also be good to document (but where?) some lifetime rules for 
> > device drivers.
> 
> Documentation/driver-model/lifetime-rules.txt?

When (if) such a file is added, it should contain more than just these few 
paragraphs.

> > Something like this:
> > 
> > When a driver's remove() method returns, the driver must no
> > longer try to use the device it was just unbound from.  The
> > device may be physically gone, or a different driver may be
> > bound to it.  Most importantly, remove() should unregister
> > all child devices created by the driver.
> 
> s/should/must/, if we agree on the policy above.

Yes.

> The remove() method must also unset driver_data.

It doesn't really have to.  The driver core could do it.

> > To accomplish all this safely, the driver should allocate a
> > private data structure containing at least a "gone" flag and 
> > a mutex or spinlock for synchronization.  Each time the driver
> > needs to use the device, it should first lock the mutex or 
> > spinlock and check the "gone" flag.
> 
> How should a driver make the device -> private data transition if it
> may no longer have private data attached to the device?

It should never need to make that transition.  The only routines in the
driver which get passed a pointer to the device (as opposed to a pointer
to the private data) should be the methods called by the driver core or
sysfs.  Neither will make any calls to the driver after remove() has
returned, unless the driver does something wrong.

> > Ideally remove() should release all of the driver's references
> > to the device, in accord with the "Immediate Detach" principle.
> > However it is acceptable for the driver to retain a reference,
> > provided it meets the following conditions:
> > 
> > The reference must be dropped in a timely manner,
> > such as when the release() methods for all child
> > devices have run.
> > 
> > The driver must also retain a module reference to
> > the owner of the device.  In practice this means the
> > driver must contain static code references to the
> > subsystem which created the device, since struct
> > device doesn't have an "owner" field.
> 
> Uhm. This would imply that a driver may never create a device itself.

A trivial omission on my part -- "The driver must also retain a module 
reference to the owner of the device (unless the driver is itself the 
device's owner)."

> However, the kobject->owner and module refcounting stuff should remove
> this restriction.

If every driver follows this rule, we may not need the kobject->owner 
stuff.  Although it wouldn't hurt to keep it for safety's sake.

> > The driver must restrict itself to reading (not
> > writing!) the fields in the device structure.  The
> > only exception is that the driver may lock/unlock
> > dev->sem.
> 
> And it may decrease the reference count, of course :)

:-)  Actually this should be a little more general, since the device will 
generally be embedded in a larger structure created by the subsystem.  The 
driver should also be able to acquire and release locks in that larger 
structure.


On a related topic, I complained before about the problem with char device
unregistration.  Below is an example patch showing how the USB mechanism
can be made safe.  It should be easy to understand, even for people who
aren't familiar with how the USB core works.  Basically it amounts to
replacing a spinlock with an rwsem, which can then be held throughout an
entire call to an f_op->open() method.

Without 

Re: [PATCH RFD] alternative kobject release wait mechanism

2007-04-25 Thread Cornelia Huck
On Tue, 24 Apr 2007 15:38:12 -0400 (EDT),
Alan Stern <[EMAIL PROTECTED]> wrote:

> We ought to make it explicitly clear that _all_ subsystems should behave
> this way.  Maybe it isn't necessary to go as far as having device_del()
> call itself recursively; doing that would open up lots of possible races.  
> But I think it would be a good idea to add a WARN_ON in device_del, right
> after the call to bus_remove_device(), that would be triggered if the
> device still had any children.

If we decide that this should be policy, WARN_ON may be the least
invasive option.

Should it be a possible option to move children to a different parent,
so that the requirement wouldn't be "unregister all children", but "no
children remain after remove() returns"?

> It would also be good to document (but where?) some lifetime rules for 
> device drivers.

Documentation/driver-model/lifetime-rules.txt?

> Something like this:
> 
>   When a driver's remove() method returns, the driver must no
>   longer try to use the device it was just unbound from.  The
>   device may be physically gone, or a different driver may be
>   bound to it.  Most importantly, remove() should unregister
>   all child devices created by the driver.

s/should/must/, if we agree on the policy above.

The remove() method must also unset driver_data.

>   To accomplish all this safely, the driver should allocate a
>   private data structure containing at least a "gone" flag and 
>   a mutex or spinlock for synchronization.  Each time the driver
>   needs to use the device, it should first lock the mutex or 
>   spinlock and check the "gone" flag.

How should a driver make the device -> private data transition if it
may no longer have private data attached to the device?

>   Ideally remove() should release all of the driver's references
>   to the device, in accord with the "Immediate Detach" principle.
>   However it is acceptable for the driver to retain a reference,
>   provided it meets the following conditions:
> 
>   The reference must be dropped in a timely manner,
>   such as when the release() methods for all child
>   devices have run.
> 
>   The driver must also retain a module reference to
>   the owner of the device.  In practice this means the
>   driver must contain static code references to the
>   subsystem which created the device, since struct
>   device doesn't have an "owner" field.

Uhm. This would imply that a driver may never create a device itself.
However, the kobject->owner and module refcounting stuff should remove
this restriction.

>   The driver must restrict itself to reading (not
>   writing!) the fields in the device structure.  The
>   only exception is that the driver may lock/unlock
>   dev->sem.

And it may decrease the reference count, of course :)

> How does that sound?

Yes, we should have some documentation like that.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH RFD] alternative kobject release wait mechanism

2007-04-25 Thread Cornelia Huck
On Tue, 24 Apr 2007 15:38:12 -0400 (EDT),
Alan Stern [EMAIL PROTECTED] wrote:

 We ought to make it explicitly clear that _all_ subsystems should behave
 this way.  Maybe it isn't necessary to go as far as having device_del()
 call itself recursively; doing that would open up lots of possible races.  
 But I think it would be a good idea to add a WARN_ON in device_del, right
 after the call to bus_remove_device(), that would be triggered if the
 device still had any children.

If we decide that this should be policy, WARN_ON may be the least
invasive option.

Should it be a possible option to move children to a different parent,
so that the requirement wouldn't be unregister all children, but no
children remain after remove() returns?

 It would also be good to document (but where?) some lifetime rules for 
 device drivers.

Documentation/driver-model/lifetime-rules.txt?

 Something like this:
 
   When a driver's remove() method returns, the driver must no
   longer try to use the device it was just unbound from.  The
   device may be physically gone, or a different driver may be
   bound to it.  Most importantly, remove() should unregister
   all child devices created by the driver.

s/should/must/, if we agree on the policy above.

The remove() method must also unset driver_data.

   To accomplish all this safely, the driver should allocate a
   private data structure containing at least a gone flag and 
   a mutex or spinlock for synchronization.  Each time the driver
   needs to use the device, it should first lock the mutex or 
   spinlock and check the gone flag.

How should a driver make the device - private data transition if it
may no longer have private data attached to the device?

   Ideally remove() should release all of the driver's references
   to the device, in accord with the Immediate Detach principle.
   However it is acceptable for the driver to retain a reference,
   provided it meets the following conditions:
 
   The reference must be dropped in a timely manner,
   such as when the release() methods for all child
   devices have run.
 
   The driver must also retain a module reference to
   the owner of the device.  In practice this means the
   driver must contain static code references to the
   subsystem which created the device, since struct
   device doesn't have an owner field.

Uhm. This would imply that a driver may never create a device itself.
However, the kobject-owner and module refcounting stuff should remove
this restriction.

   The driver must restrict itself to reading (not
   writing!) the fields in the device structure.  The
   only exception is that the driver may lock/unlock
   dev-sem.

And it may decrease the reference count, of course :)

 How does that sound?

Yes, we should have some documentation like that.
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH RFD] alternative kobject release wait mechanism

2007-04-25 Thread Alan Stern
On Wed, 25 Apr 2007, Cornelia Huck wrote:

 On Tue, 24 Apr 2007 15:38:12 -0400 (EDT),
 Alan Stern [EMAIL PROTECTED] wrote:
 
  We ought to make it explicitly clear that _all_ subsystems should behave
  this way.  Maybe it isn't necessary to go as far as having device_del()
  call itself recursively; doing that would open up lots of possible races.  
  But I think it would be a good idea to add a WARN_ON in device_del, right
  after the call to bus_remove_device(), that would be triggered if the
  device still had any children.
 
 If we decide that this should be policy, WARN_ON may be the least
 invasive option.

Yes.  I have changed my mind regarding your earlier proposal; I now think
it's best to make it mandatory for remove() to unregister all children.  
Mandatory in the sense that not doing so will provoke a big fat WARN_ON
complaint.

Having device_del() call itself recursively is not a good idea.  It would
destroy that guarantee that no one but the creator of a device will
unregister it.

 Should it be a possible option to move children to a different parent,
 so that the requirement wouldn't be unregister all children, but no
 children remain after remove() returns?

That might confuse udev, but otherwise it could be okay.

  It would also be good to document (but where?) some lifetime rules for 
  device drivers.
 
 Documentation/driver-model/lifetime-rules.txt?

When (if) such a file is added, it should contain more than just these few 
paragraphs.

  Something like this:
  
  When a driver's remove() method returns, the driver must no
  longer try to use the device it was just unbound from.  The
  device may be physically gone, or a different driver may be
  bound to it.  Most importantly, remove() should unregister
  all child devices created by the driver.
 
 s/should/must/, if we agree on the policy above.

Yes.

 The remove() method must also unset driver_data.

It doesn't really have to.  The driver core could do it.

  To accomplish all this safely, the driver should allocate a
  private data structure containing at least a gone flag and 
  a mutex or spinlock for synchronization.  Each time the driver
  needs to use the device, it should first lock the mutex or 
  spinlock and check the gone flag.
 
 How should a driver make the device - private data transition if it
 may no longer have private data attached to the device?

It should never need to make that transition.  The only routines in the
driver which get passed a pointer to the device (as opposed to a pointer
to the private data) should be the methods called by the driver core or
sysfs.  Neither will make any calls to the driver after remove() has
returned, unless the driver does something wrong.

  Ideally remove() should release all of the driver's references
  to the device, in accord with the Immediate Detach principle.
  However it is acceptable for the driver to retain a reference,
  provided it meets the following conditions:
  
  The reference must be dropped in a timely manner,
  such as when the release() methods for all child
  devices have run.
  
  The driver must also retain a module reference to
  the owner of the device.  In practice this means the
  driver must contain static code references to the
  subsystem which created the device, since struct
  device doesn't have an owner field.
 
 Uhm. This would imply that a driver may never create a device itself.

A trivial omission on my part -- The driver must also retain a module 
reference to the owner of the device (unless the driver is itself the 
device's owner).

 However, the kobject-owner and module refcounting stuff should remove
 this restriction.

If every driver follows this rule, we may not need the kobject-owner 
stuff.  Although it wouldn't hurt to keep it for safety's sake.

  The driver must restrict itself to reading (not
  writing!) the fields in the device structure.  The
  only exception is that the driver may lock/unlock
  dev-sem.
 
 And it may decrease the reference count, of course :)

:-)  Actually this should be a little more general, since the device will 
generally be embedded in a larger structure created by the subsystem.  The 
driver should also be able to acquire and release locks in that larger 
structure.


On a related topic, I complained before about the problem with char device
unregistration.  Below is an example patch showing how the USB mechanism
can be made safe.  It should be easy to understand, even for people who
aren't familiar with how the USB core works.  Basically it amounts to
replacing a spinlock with an rwsem, which can then be held throughout an
entire call to an f_op-open() method.

Without something like this, we face a potential race:

Driver  User
--

Re: [PATCH RFD] alternative kobject release wait mechanism

2007-04-24 Thread Alan Stern
On Sun, 22 Apr 2007, Greg KH wrote:

> > Looking some more, kobject_get_path() is used for kobject renaming,
> > uevent handling, and a little bit in the input core.  None of these things 
> > should try to access a kobject after it has been del()ed.  After all, it's 
> > no longer present in the filesystem so it doesn't _have_ a path.
> 
> But we _have_ to have a full path at that time to tell userspace what
> just went away.  That is the main reason we enforce this (there were
> tons of issues with scsi devices and this in the past which is what
> caused us to enforce this.)

The SCSI subsystem has undergone many, many changes since 2.6.0.  In 
particular, it has implemented a full-fledged device-state model, complete 
with spinlock-protected state transitions.

I'll have to check with James Bottomley, but I bet that SCSI now always 
unregisters all the children of a device before unregistering the device 
itself.


For everyone:

We ought to make it explicitly clear that _all_ subsystems should behave
this way.  Maybe it isn't necessary to go as far as having device_del()
call itself recursively; doing that would open up lots of possible races.  
But I think it would be a good idea to add a WARN_ON in device_del, right
after the call to bus_remove_device(), that would be triggered if the
device still had any children.

It would also be good to document (but where?) some lifetime rules for 
device drivers.  Something like this:

When a driver's remove() method returns, the driver must no
longer try to use the device it was just unbound from.  The
device may be physically gone, or a different driver may be
bound to it.  Most importantly, remove() should unregister
all child devices created by the driver.

To accomplish all this safely, the driver should allocate a
private data structure containing at least a "gone" flag and 
a mutex or spinlock for synchronization.  Each time the driver
needs to use the device, it should first lock the mutex or 
spinlock and check the "gone" flag.

Ideally remove() should release all of the driver's references
to the device, in accord with the "Immediate Detach" principle.
However it is acceptable for the driver to retain a reference,
provided it meets the following conditions:

The reference must be dropped in a timely manner,
such as when the release() methods for all child
devices have run.

The driver must also retain a module reference to
the owner of the device.  In practice this means the
driver must contain static code references to the
subsystem which created the device, since struct
device doesn't have an "owner" field.

The driver must restrict itself to reading (not
writing!) the fields in the device structure.  The
only exception is that the driver may lock/unlock
dev->sem.

How does that sound?

Alan Stern

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH RFD] alternative kobject release wait mechanism

2007-04-24 Thread Alan Stern
On Sun, 22 Apr 2007, Greg KH wrote:

  Looking some more, kobject_get_path() is used for kobject renaming,
  uevent handling, and a little bit in the input core.  None of these things 
  should try to access a kobject after it has been del()ed.  After all, it's 
  no longer present in the filesystem so it doesn't _have_ a path.
 
 But we _have_ to have a full path at that time to tell userspace what
 just went away.  That is the main reason we enforce this (there were
 tons of issues with scsi devices and this in the past which is what
 caused us to enforce this.)

The SCSI subsystem has undergone many, many changes since 2.6.0.  In 
particular, it has implemented a full-fledged device-state model, complete 
with spinlock-protected state transitions.

I'll have to check with James Bottomley, but I bet that SCSI now always 
unregisters all the children of a device before unregistering the device 
itself.


For everyone:

We ought to make it explicitly clear that _all_ subsystems should behave
this way.  Maybe it isn't necessary to go as far as having device_del()
call itself recursively; doing that would open up lots of possible races.  
But I think it would be a good idea to add a WARN_ON in device_del, right
after the call to bus_remove_device(), that would be triggered if the
device still had any children.

It would also be good to document (but where?) some lifetime rules for 
device drivers.  Something like this:

When a driver's remove() method returns, the driver must no
longer try to use the device it was just unbound from.  The
device may be physically gone, or a different driver may be
bound to it.  Most importantly, remove() should unregister
all child devices created by the driver.

To accomplish all this safely, the driver should allocate a
private data structure containing at least a gone flag and 
a mutex or spinlock for synchronization.  Each time the driver
needs to use the device, it should first lock the mutex or 
spinlock and check the gone flag.

Ideally remove() should release all of the driver's references
to the device, in accord with the Immediate Detach principle.
However it is acceptable for the driver to retain a reference,
provided it meets the following conditions:

The reference must be dropped in a timely manner,
such as when the release() methods for all child
devices have run.

The driver must also retain a module reference to
the owner of the device.  In practice this means the
driver must contain static code references to the
subsystem which created the device, since struct
device doesn't have an owner field.

The driver must restrict itself to reading (not
writing!) the fields in the device structure.  The
only exception is that the driver may lock/unlock
dev-sem.

How does that sound?

Alan Stern

-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH RFD] alternative kobject release wait mechanism

2007-04-23 Thread Alan Stern
On Mon, 23 Apr 2007, Cornelia Huck wrote:

> On Sun, 22 Apr 2007 10:40:51 -0700,
> Greg KH <[EMAIL PROTECTED]> wrote:
> 
> > > Looking some more, kobject_get_path() is used for kobject renaming,
> > > uevent handling, and a little bit in the input core.  None of these 
> > > things 
> > > should try to access a kobject after it has been del()ed.  After all, 
> > > it's 
> > > no longer present in the filesystem so it doesn't _have_ a path.
> > 
> > But we _have_ to have a full path at that time to tell userspace what
> > just went away.  That is the main reason we enforce this (there were
> > tons of issues with scsi devices and this in the past which is what
> > caused us to enforce this.)
> 
> What we need to ensure is that the parent device is kept at least until
> all children, grandchildren and so on are done with their uevent needs.
> This would imply it needed to stay as long as those children,
> grandchildren, ... are still registered. Would it be save to suggest
> that a ->remove callback would always need to unregister the children?
> Then putting the parent reference at the end of kobject_del() (which is
> after kobject_uevent() in kobject_unregister()) should be safe.

Yes, this is the weak spot.

For some reason I had assumed that it was illegal to unregister a device
while it had registered children (just as it is illegal to rmdir a
non-empty directory).  If it isn't illegal, then perhaps we should arrange
things so that device_del() will recursively call itself for all the
device's children.

> Question: What now?
> 
> 1. Make it mandatory that all children must be unregistered when
> device_del() returns.
> 
> 2. Don't demand an empty directory in sysfs_drop_dentry().

I'm in favor of (1).  But instead of making it mandatory, simply force it
to be true by having device_del() call itself for all remaining children.

For this to be safe, we also have to allow device_del() to be called 
multiple times (since the device's owner might not be aware that the core 
had already unregistered it).  That's no problem; just add:

if (!device_is_registered(dev))
return;

to the start of the routine.

Alan Stern

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH RFD] alternative kobject release wait mechanism

2007-04-23 Thread Cornelia Huck
On Sun, 22 Apr 2007 10:40:51 -0700,
Greg KH <[EMAIL PROTECTED]> wrote:

> > Looking some more, kobject_get_path() is used for kobject renaming,
> > uevent handling, and a little bit in the input core.  None of these things 
> > should try to access a kobject after it has been del()ed.  After all, it's 
> > no longer present in the filesystem so it doesn't _have_ a path.
> 
> But we _have_ to have a full path at that time to tell userspace what
> just went away.  That is the main reason we enforce this (there were
> tons of issues with scsi devices and this in the past which is what
> caused us to enforce this.)

What we need to ensure is that the parent device is kept at least until
all children, grandchildren and so on are done with their uevent needs.
This would imply it needed to stay as long as those children,
grandchildren, ... are still registered. Would it be save to suggest
that a ->remove callback would always need to unregister the children?
Then putting the parent reference at the end of kobject_del() (which is
after kobject_uevent() in kobject_unregister()) should be safe.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH RFD] alternative kobject release wait mechanism

2007-04-23 Thread Greg KH
On Mon, Apr 23, 2007 at 03:40:21PM +0900, Tejun Heo wrote:
> Hello, Dmitry.
> 
> Dmitry Torokhov wrote:
> > Isn't think a good thing? By decoupling the 2 layers we insulate them
> > from changes in each other. This allows bug subsystems to concentrate
> > on topics that important to them instead of worying about refcounting
> > objects that are not directly interesting for the subsystem in
> > question.
> 
> I think the best thing would be make struct device's lifetime rules
> simple enough such that it doesn't really matter to driver subsystems
> and drivers can just do what they wanna do.

I agree.

> Also, separate struct device from the actual implementation has problem
> in that struct device is widely used to refer to the device by many
> layers drivers register devices to.  Basically, you'll have to implement
> immediate-disconnect between struct device and the actual
> implementation.  So, it just shifts the problem from struct device to
> the place between struct device and actual implementation and I think
> struct device itself is better place to deal with that than somewhere
> inbetween it and driver private data.

I also agree.

> > Now for smaller subsystems it may make sense to embed stuct devices
> > into subsystem objects and manage it all together. In fact input
> > system does this but I think it is much simlpier than SCSI or IDE.
> 
> Well, both SCSI and IDE heavily depend on struct device acting as 'base
> class'.  It's all over the place and almost a basic assumption about the
> driver model.

And that's how it should be.

And your sysfs patches now make it a lot easier than before, and I can't
thank you enough for doing that work.

thanks,

greg k-h
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH RFD] alternative kobject release wait mechanism

2007-04-23 Thread Tejun Heo
Hello, Dmitry.

Dmitry Torokhov wrote:
> Isn't think a good thing? By decoupling the 2 layers we insulate them
> from changes in each other. This allows bug subsystems to concentrate
> on topics that important to them instead of worying about refcounting
> objects that are not directly interesting for the subsystem in
> question.

I think the best thing would be make struct device's lifetime rules
simple enough such that it doesn't really matter to driver subsystems
and drivers can just do what they wanna do.

Also, separate struct device from the actual implementation has problem
in that struct device is widely used to refer to the device by many
layers drivers register devices to.  Basically, you'll have to implement
immediate-disconnect between struct device and the actual
implementation.  So, it just shifts the problem from struct device to
the place between struct device and actual implementation and I think
struct device itself is better place to deal with that than somewhere
inbetween it and driver private data.

> Now for smaller subsystems it may make sense to embed stuct devices
> into subsystem objects and manage it all together. In fact input
> system does this but I think it is much simlpier than SCSI or IDE.

Well, both SCSI and IDE heavily depend on struct device acting as 'base
class'.  It's all over the place and almost a basic assumption about the
driver model.

-- 
tejun
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH RFD] alternative kobject release wait mechanism

2007-04-23 Thread Tejun Heo
Hello, Dmitry.

Dmitry Torokhov wrote:
 Isn't think a good thing? By decoupling the 2 layers we insulate them
 from changes in each other. This allows bug subsystems to concentrate
 on topics that important to them instead of worying about refcounting
 objects that are not directly interesting for the subsystem in
 question.

I think the best thing would be make struct device's lifetime rules
simple enough such that it doesn't really matter to driver subsystems
and drivers can just do what they wanna do.

Also, separate struct device from the actual implementation has problem
in that struct device is widely used to refer to the device by many
layers drivers register devices to.  Basically, you'll have to implement
immediate-disconnect between struct device and the actual
implementation.  So, it just shifts the problem from struct device to
the place between struct device and actual implementation and I think
struct device itself is better place to deal with that than somewhere
inbetween it and driver private data.

 Now for smaller subsystems it may make sense to embed stuct devices
 into subsystem objects and manage it all together. In fact input
 system does this but I think it is much simlpier than SCSI or IDE.

Well, both SCSI and IDE heavily depend on struct device acting as 'base
class'.  It's all over the place and almost a basic assumption about the
driver model.

-- 
tejun
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH RFD] alternative kobject release wait mechanism

2007-04-23 Thread Greg KH
On Mon, Apr 23, 2007 at 03:40:21PM +0900, Tejun Heo wrote:
 Hello, Dmitry.
 
 Dmitry Torokhov wrote:
  Isn't think a good thing? By decoupling the 2 layers we insulate them
  from changes in each other. This allows bug subsystems to concentrate
  on topics that important to them instead of worying about refcounting
  objects that are not directly interesting for the subsystem in
  question.
 
 I think the best thing would be make struct device's lifetime rules
 simple enough such that it doesn't really matter to driver subsystems
 and drivers can just do what they wanna do.

I agree.

 Also, separate struct device from the actual implementation has problem
 in that struct device is widely used to refer to the device by many
 layers drivers register devices to.  Basically, you'll have to implement
 immediate-disconnect between struct device and the actual
 implementation.  So, it just shifts the problem from struct device to
 the place between struct device and actual implementation and I think
 struct device itself is better place to deal with that than somewhere
 inbetween it and driver private data.

I also agree.

  Now for smaller subsystems it may make sense to embed stuct devices
  into subsystem objects and manage it all together. In fact input
  system does this but I think it is much simlpier than SCSI or IDE.
 
 Well, both SCSI and IDE heavily depend on struct device acting as 'base
 class'.  It's all over the place and almost a basic assumption about the
 driver model.

And that's how it should be.

And your sysfs patches now make it a lot easier than before, and I can't
thank you enough for doing that work.

thanks,

greg k-h
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH RFD] alternative kobject release wait mechanism

2007-04-23 Thread Cornelia Huck
On Sun, 22 Apr 2007 10:40:51 -0700,
Greg KH [EMAIL PROTECTED] wrote:

  Looking some more, kobject_get_path() is used for kobject renaming,
  uevent handling, and a little bit in the input core.  None of these things 
  should try to access a kobject after it has been del()ed.  After all, it's 
  no longer present in the filesystem so it doesn't _have_ a path.
 
 But we _have_ to have a full path at that time to tell userspace what
 just went away.  That is the main reason we enforce this (there were
 tons of issues with scsi devices and this in the past which is what
 caused us to enforce this.)

What we need to ensure is that the parent device is kept at least until
all children, grandchildren and so on are done with their uevent needs.
This would imply it needed to stay as long as those children,
grandchildren, ... are still registered. Would it be save to suggest
that a -remove callback would always need to unregister the children?
Then putting the parent reference at the end of kobject_del() (which is
after kobject_uevent() in kobject_unregister()) should be safe.
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH RFD] alternative kobject release wait mechanism

2007-04-23 Thread Alan Stern
On Mon, 23 Apr 2007, Cornelia Huck wrote:

 On Sun, 22 Apr 2007 10:40:51 -0700,
 Greg KH [EMAIL PROTECTED] wrote:
 
   Looking some more, kobject_get_path() is used for kobject renaming,
   uevent handling, and a little bit in the input core.  None of these 
   things 
   should try to access a kobject after it has been del()ed.  After all, 
   it's 
   no longer present in the filesystem so it doesn't _have_ a path.
  
  But we _have_ to have a full path at that time to tell userspace what
  just went away.  That is the main reason we enforce this (there were
  tons of issues with scsi devices and this in the past which is what
  caused us to enforce this.)
 
 What we need to ensure is that the parent device is kept at least until
 all children, grandchildren and so on are done with their uevent needs.
 This would imply it needed to stay as long as those children,
 grandchildren, ... are still registered. Would it be save to suggest
 that a -remove callback would always need to unregister the children?
 Then putting the parent reference at the end of kobject_del() (which is
 after kobject_uevent() in kobject_unregister()) should be safe.

Yes, this is the weak spot.

For some reason I had assumed that it was illegal to unregister a device
while it had registered children (just as it is illegal to rmdir a
non-empty directory).  If it isn't illegal, then perhaps we should arrange
things so that device_del() will recursively call itself for all the
device's children.

 Question: What now?
 
 1. Make it mandatory that all children must be unregistered when
 device_del() returns.
 
 2. Don't demand an empty directory in sysfs_drop_dentry().

I'm in favor of (1).  But instead of making it mandatory, simply force it
to be true by having device_del() call itself for all remaining children.

For this to be safe, we also have to allow device_del() to be called 
multiple times (since the device's owner might not be aware that the core 
had already unregistered it).  That's no problem; just add:

if (!device_is_registered(dev))
return;

to the start of the routine.

Alan Stern

-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH RFD] alternative kobject release wait mechanism

2007-04-22 Thread Greg KH
On Sat, Apr 21, 2007 at 05:36:51PM -0400, Alan Stern wrote:
> On Fri, 20 Apr 2007, Greg KH wrote:
> 
> > > Greg, do you know of anything in particular that depends on a kobjects 
> > > not 
> > > being released before their children are released?
> > 
> > Yes, the whole driver model :)
> 
> But anything in particular?  Looking through the source code, I see 
> kobj->parent gets used mainly by kobject_get_path() and not by much else.

Which is essencial for when a kobject is removed from the system and we
need to tell userspace which kobject it was.

> Looking some more, kobject_get_path() is used for kobject renaming,
> uevent handling, and a little bit in the input core.  None of these things 
> should try to access a kobject after it has been del()ed.  After all, it's 
> no longer present in the filesystem so it doesn't _have_ a path.

But we _have_ to have a full path at that time to tell userspace what
just went away.  That is the main reason we enforce this (there were
tons of issues with scsi devices and this in the past which is what
caused us to enforce this.)

thanks,

greg k-h
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH RFD] alternative kobject release wait mechanism

2007-04-22 Thread Greg KH
On Sat, Apr 21, 2007 at 05:36:51PM -0400, Alan Stern wrote:
 On Fri, 20 Apr 2007, Greg KH wrote:
 
   Greg, do you know of anything in particular that depends on a kobjects 
   not 
   being released before their children are released?
  
  Yes, the whole driver model :)
 
 But anything in particular?  Looking through the source code, I see 
 kobj-parent gets used mainly by kobject_get_path() and not by much else.

Which is essencial for when a kobject is removed from the system and we
need to tell userspace which kobject it was.

 Looking some more, kobject_get_path() is used for kobject renaming,
 uevent handling, and a little bit in the input core.  None of these things 
 should try to access a kobject after it has been del()ed.  After all, it's 
 no longer present in the filesystem so it doesn't _have_ a path.

But we _have_ to have a full path at that time to tell userspace what
just went away.  That is the main reason we enforce this (there were
tons of issues with scsi devices and this in the past which is what
caused us to enforce this.)

thanks,

greg k-h
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH RFD] alternative kobject release wait mechanism

2007-04-21 Thread Alan Stern
On Fri, 20 Apr 2007, Greg KH wrote:

> > Greg, do you know of anything in particular that depends on a kobjects not 
> > being released before their children are released?
> 
> Yes, the whole driver model :)

But anything in particular?  Looking through the source code, I see 
kobj->parent gets used mainly by kobject_get_path() and not by much else.

Looking some more, kobject_get_path() is used for kobject renaming,
uevent handling, and a little bit in the input core.  None of these things 
should try to access a kobject after it has been del()ed.  After all, it's 
no longer present in the filesystem so it doesn't _have_ a path.

So I don't see any immediate problem.  A quick boot with my patch applied,
during which I installed and removed various modules and hot-pluggable
devices, didn't cause anything strange to happen.

> When adding a new device, we always grab a reference to the parent
> device so it can not go away before we do.
> 
> Look at the last kobject_put(parent); in kobject_cleanup() which ensures
> this.

Yes, I know.  The question is: What (if anything) is wrong with the parent
going away first?  So long as the parent remains present while the child
is _registered_, who will care if the parent is deallocated after the
child is unregistered but before it is released?

And if there is any code which does care, don't you think we should be
able to change it so that it doesn't?


> Ick, no, I think this used to be the way things worked, but bad things
> would end up happening, so we fixed it up to be the way things are
> today.  Read the comments for the changelog for this file for details.
> 
> Specifically, look at commit 10921a8f1305b8ec97794941db78b825db5839bc
> in the history.git repo which is almost exactly what you are proposing
> to be reverted...

Yes, it is.  I had a little trouble finding it; the search facility in the
gitweb system at git.kernel.org doesn't seem to work right.  Who should
I complain to about that?

Anyway, the patch itself is available at

http://marc.info/?l=linux-kernel=107116644617624=2

Here's what the changelog comment says:

It fixes a kobject bug where the parent could be deleted before the
child object, causing all sorts of badness later when we clean up the
child object.  It's been acked by Pat.

Not terribly explicit.  As far as I can tell, cleaning up the child object
doesn't do much except to kfree() a few items and call the ktype's
release() routine.  For a struct device, the release() routine merely
calls dev->release(), or dev->type->release(), or
dev->class->dev_release() as the case may be.  None of these should try to
access the device's parent unless they made special arrangements to
acquire a reference to it beforehand.  Which is what we're trying to
eliminate -- that's what immediate detach means.

The change was made back in December 2003, for 2.6.0-test11.  Since then
the driver-model core and its users have evolved an awful lot.  Perhaps
reverting it now won't hurt anything.

Alan Stern

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH RFD] alternative kobject release wait mechanism

2007-04-21 Thread Alan Stern
On Fri, 20 Apr 2007, Dmitry Torokhov wrote:

> On 4/19/07, Alan Stern <[EMAIL PROTECTED]> wrote:
> >
> > Among the worst offenders are character devices.  None of the subsystem
> > providers offering char device registration performs immediate detach --
> > they are a lot like sysfs used to be.  (In fact, they probably _can't_
> > provide it since read() or write() calls can block indefinitely in the
> > lower-level driver; the subsystem may have no way to force them to
> > fail-fast.)
> 
> 
> That shoudl be doable - the read/write routines should check if device
> is still alive and return immideately when device is dead. When
> disconnecting device just wake up all readers and writers and they
> should eventually fall off.

Read and write don't matter so much; they can be handled easily enough in
the driver.  The real problem is the race between open() and unregister().  
We rely on the low-level drivers to make them mutually exclusive, but it
really should be done by the subsystem.  More precisely, the subsystem
should guarantee that all outstanding calls to open() have completed
before unregister() returns.

> Hmm, I guess am starting to think that using refcounting everywhere is
> not a good idea. We are trying to have "immediate disconnect" behavior
> and refcounting is an antithesis of it. Refcounting works well when it
> is contained - register grabs reference; unregister puts it back; but
> there is no passing references down between the layers. When device is
> being removed it needs to signal downstream that it is gone and should
> not be accessed anymore. And we need to do that anyway because if
> device is really gone but its users ignore it they will get endless
> stream of errors when trying to access it.

"Should not be accessed" is correct, but there's nothing wrong in 
principle with acquiring a lock in the device structure, provided the 
driver does nothing else.  And provided that the driver pins the subsystem 
module, so the device's release() routine will be called _before_ the 
subsystem can be unloaded.

Alan Stern

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH RFD] alternative kobject release wait mechanism

2007-04-21 Thread Alan Stern
On Fri, 20 Apr 2007, Dmitry Torokhov wrote:

> On 4/20/07, Alan Stern <[EMAIL PROTECTED]> wrote:
> >
> > Dmitry, in thinking things over some more I realized there's going to be a
> > problem with the autosuspend support in USB.  It has to do with the way a
> > driver needs to prevent (or block) suspends from occurring while it is
> > actively using a device.
> >
> > To understand this, you need to know that USB adds a pm_mutex to the
> > device structure.  It gets used for synchronizing all suspend- or
> > resume-related activities.  In particular, the driver's suspend() method
> > is called with usbdev->pm_mutex held.
> >
> > The next idea is that a driver needs to synchronize its remove() method
> > with other methods such as read() -- we mustn't allow read() to try and
> > refer to the device after the driver has been unbound.  Let's say the
> > driver has unbind_mutex embedded in its private data structure for this
> > purpose.
> >
> > Now consider what read() has to do.  It needs to block suspends from
> > occurring while it runs, and it needs to do an autoresume if the device
> > was already suspended.  So read() will look like this:
> >
> >mutex_lock(>unbind_mutex);
> >if (private->gone) {
> >ret = -ENODEV;
> >goto done;
> >}
> >if (private->suspended)
> >autoresume(private->usbdev);
> >...
> >  done:
> >mutex_unlock(>unbind_mutex);
> >return ret;
> >
> > Meanwhile, the suspend() method needs to block while read() is running.
> > So it will look like this:
> >
> >private->suspended = 1;
> >mutex_lock(>unbind_mutex);
> >/* Now the driver has quiesced */
> >mutex_unlock(>unbind_mutex);
> >
> > Here's the problem:  The autoresume() call inside read() tries to acquire
> > usbdev->pm_mutex while holding private->unbind_mutex.  The suspend()
> > method does the reverse, a locking-order violation.
> >
> > So far I haven't figured out any way to make this work.  Do you have a
> > suggestion?  (Don't say read() should hold pm_mutex as well as
> > unbind_mutex; that's no good -- although the reason is rather obscure.)
> >
> 
> First of all I think that I would merge pm and unbind mutex into one -
> you also need to synchronize resume and suspend with bind and unbind
> so you pretty much need to always acquire both of them.

The USB core already insures that suspend and resume are mutually 
exclusive with bind and unbind, so that part doesn't matter.

Besides, there's a more important reason for not merging the pm and unbind
mutexes: The pm mutex lies in the usb_device structure but the unbind
mutex lies in the private data structure.

It has to be this way.  The pm mutex is used by the USB core, which
doesn't know anything about the private data.  Conversely, the driver has
to acquire the unbind mutex at times when it doesn't know whether or not
it has already been unbound, so that mutex must go in the private data.  
Otherwise the driver might try to acquire the mutex -- thereby touching
the usb_device -- after it was unbound, violating immediate detachment.

> Second (and I admit I have not followed USB autoresume discussions, my
> usb-devle backlog is over 5000 messages ;( ) is I am not sure why
> woudl a read block autoresume. I can see write doing that but passive
> reads should not affect device state.

I was just using read() as an example.  And it wouldn't block autoresume;
it would _do_ an autoresume, thereby preventing autosuspend.

Another possible arrangement would be to have open() do an autoresume
(preventing autosuspend during the entire time the device file is held
open) and have close() re-enable autosuspend.  But the principle remains
the same.

I don't know, perhaps adding another mutex would work.  I need to think 
about it some more.

There's always the easy solution: Don't do a complete immediate detach.  
The driver could keep a reference to the device structure for an extended
period, provided it always checks private->gone whenever it locks
usbdev->sem.  This would mean that the USB core might not be able to
release usbdev until the lower-level driver was unloaded.  But that's
okay: Since the lower-level driver contains code references to routines in
the USB core, the core would be pinned anyway.

This example does show that implementing immediate detach comes with a
price.  It makes things more complicated than they would be otherwise.

Alan Stern

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH RFD] alternative kobject release wait mechanism

2007-04-21 Thread Alan Stern
On Fri, 20 Apr 2007, Dmitry Torokhov wrote:

 On 4/20/07, Alan Stern [EMAIL PROTECTED] wrote:
 
  Dmitry, in thinking things over some more I realized there's going to be a
  problem with the autosuspend support in USB.  It has to do with the way a
  driver needs to prevent (or block) suspends from occurring while it is
  actively using a device.
 
  To understand this, you need to know that USB adds a pm_mutex to the
  device structure.  It gets used for synchronizing all suspend- or
  resume-related activities.  In particular, the driver's suspend() method
  is called with usbdev-pm_mutex held.
 
  The next idea is that a driver needs to synchronize its remove() method
  with other methods such as read() -- we mustn't allow read() to try and
  refer to the device after the driver has been unbound.  Let's say the
  driver has unbind_mutex embedded in its private data structure for this
  purpose.
 
  Now consider what read() has to do.  It needs to block suspends from
  occurring while it runs, and it needs to do an autoresume if the device
  was already suspended.  So read() will look like this:
 
 mutex_lock(private-unbind_mutex);
 if (private-gone) {
 ret = -ENODEV;
 goto done;
 }
 if (private-suspended)
 autoresume(private-usbdev);
 ...
   done:
 mutex_unlock(private-unbind_mutex);
 return ret;
 
  Meanwhile, the suspend() method needs to block while read() is running.
  So it will look like this:
 
 private-suspended = 1;
 mutex_lock(private-unbind_mutex);
 /* Now the driver has quiesced */
 mutex_unlock(private-unbind_mutex);
 
  Here's the problem:  The autoresume() call inside read() tries to acquire
  usbdev-pm_mutex while holding private-unbind_mutex.  The suspend()
  method does the reverse, a locking-order violation.
 
  So far I haven't figured out any way to make this work.  Do you have a
  suggestion?  (Don't say read() should hold pm_mutex as well as
  unbind_mutex; that's no good -- although the reason is rather obscure.)
 
 
 First of all I think that I would merge pm and unbind mutex into one -
 you also need to synchronize resume and suspend with bind and unbind
 so you pretty much need to always acquire both of them.

The USB core already insures that suspend and resume are mutually 
exclusive with bind and unbind, so that part doesn't matter.

Besides, there's a more important reason for not merging the pm and unbind
mutexes: The pm mutex lies in the usb_device structure but the unbind
mutex lies in the private data structure.

It has to be this way.  The pm mutex is used by the USB core, which
doesn't know anything about the private data.  Conversely, the driver has
to acquire the unbind mutex at times when it doesn't know whether or not
it has already been unbound, so that mutex must go in the private data.  
Otherwise the driver might try to acquire the mutex -- thereby touching
the usb_device -- after it was unbound, violating immediate detachment.

 Second (and I admit I have not followed USB autoresume discussions, my
 usb-devle backlog is over 5000 messages ;( ) is I am not sure why
 woudl a read block autoresume. I can see write doing that but passive
 reads should not affect device state.

I was just using read() as an example.  And it wouldn't block autoresume;
it would _do_ an autoresume, thereby preventing autosuspend.

Another possible arrangement would be to have open() do an autoresume
(preventing autosuspend during the entire time the device file is held
open) and have close() re-enable autosuspend.  But the principle remains
the same.

I don't know, perhaps adding another mutex would work.  I need to think 
about it some more.

There's always the easy solution: Don't do a complete immediate detach.  
The driver could keep a reference to the device structure for an extended
period, provided it always checks private-gone whenever it locks
usbdev-sem.  This would mean that the USB core might not be able to
release usbdev until the lower-level driver was unloaded.  But that's
okay: Since the lower-level driver contains code references to routines in
the USB core, the core would be pinned anyway.

This example does show that implementing immediate detach comes with a
price.  It makes things more complicated than they would be otherwise.

Alan Stern

-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH RFD] alternative kobject release wait mechanism

2007-04-21 Thread Alan Stern
On Fri, 20 Apr 2007, Dmitry Torokhov wrote:

 On 4/19/07, Alan Stern [EMAIL PROTECTED] wrote:
 
  Among the worst offenders are character devices.  None of the subsystem
  providers offering char device registration performs immediate detach --
  they are a lot like sysfs used to be.  (In fact, they probably _can't_
  provide it since read() or write() calls can block indefinitely in the
  lower-level driver; the subsystem may have no way to force them to
  fail-fast.)
 
 
 That shoudl be doable - the read/write routines should check if device
 is still alive and return immideately when device is dead. When
 disconnecting device just wake up all readers and writers and they
 should eventually fall off.

Read and write don't matter so much; they can be handled easily enough in
the driver.  The real problem is the race between open() and unregister().  
We rely on the low-level drivers to make them mutually exclusive, but it
really should be done by the subsystem.  More precisely, the subsystem
should guarantee that all outstanding calls to open() have completed
before unregister() returns.

 Hmm, I guess am starting to think that using refcounting everywhere is
 not a good idea. We are trying to have immediate disconnect behavior
 and refcounting is an antithesis of it. Refcounting works well when it
 is contained - register grabs reference; unregister puts it back; but
 there is no passing references down between the layers. When device is
 being removed it needs to signal downstream that it is gone and should
 not be accessed anymore. And we need to do that anyway because if
 device is really gone but its users ignore it they will get endless
 stream of errors when trying to access it.

Should not be accessed is correct, but there's nothing wrong in 
principle with acquiring a lock in the device structure, provided the 
driver does nothing else.  And provided that the driver pins the subsystem 
module, so the device's release() routine will be called _before_ the 
subsystem can be unloaded.

Alan Stern

-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH RFD] alternative kobject release wait mechanism

2007-04-21 Thread Alan Stern
On Fri, 20 Apr 2007, Greg KH wrote:

  Greg, do you know of anything in particular that depends on a kobjects not 
  being released before their children are released?
 
 Yes, the whole driver model :)

But anything in particular?  Looking through the source code, I see 
kobj-parent gets used mainly by kobject_get_path() and not by much else.

Looking some more, kobject_get_path() is used for kobject renaming,
uevent handling, and a little bit in the input core.  None of these things 
should try to access a kobject after it has been del()ed.  After all, it's 
no longer present in the filesystem so it doesn't _have_ a path.

So I don't see any immediate problem.  A quick boot with my patch applied,
during which I installed and removed various modules and hot-pluggable
devices, didn't cause anything strange to happen.

 When adding a new device, we always grab a reference to the parent
 device so it can not go away before we do.
 
 Look at the last kobject_put(parent); in kobject_cleanup() which ensures
 this.

Yes, I know.  The question is: What (if anything) is wrong with the parent
going away first?  So long as the parent remains present while the child
is _registered_, who will care if the parent is deallocated after the
child is unregistered but before it is released?

And if there is any code which does care, don't you think we should be
able to change it so that it doesn't?


 Ick, no, I think this used to be the way things worked, but bad things
 would end up happening, so we fixed it up to be the way things are
 today.  Read the comments for the changelog for this file for details.
 
 Specifically, look at commit 10921a8f1305b8ec97794941db78b825db5839bc
 in the history.git repo which is almost exactly what you are proposing
 to be reverted...

Yes, it is.  I had a little trouble finding it; the search facility in the
gitweb system at git.kernel.org doesn't seem to work right.  Who should
I complain to about that?

Anyway, the patch itself is available at

http://marc.info/?l=linux-kernelm=107116644617624w=2

Here's what the changelog comment says:

It fixes a kobject bug where the parent could be deleted before the
child object, causing all sorts of badness later when we clean up the
child object.  It's been acked by Pat.

Not terribly explicit.  As far as I can tell, cleaning up the child object
doesn't do much except to kfree() a few items and call the ktype's
release() routine.  For a struct device, the release() routine merely
calls dev-release(), or dev-type-release(), or
dev-class-dev_release() as the case may be.  None of these should try to
access the device's parent unless they made special arrangements to
acquire a reference to it beforehand.  Which is what we're trying to
eliminate -- that's what immediate detach means.

The change was made back in December 2003, for 2.6.0-test11.  Since then
the driver-model core and its users have evolved an awful lot.  Perhaps
reverting it now won't hurt anything.

Alan Stern

-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH RFD] alternative kobject release wait mechanism

2007-04-20 Thread Greg KH
On Fri, Apr 20, 2007 at 11:40:39AM -0400, Alan Stern wrote:
> Here's a patch to do what I mentioned earlier.  Not tested -- it may 
> expose some existing bugs.  It may even break something, but I'm not aware 
> of anything that depends on it explicitly.
> 
> Greg, do you know of anything in particular that depends on a kobjects not 
> being released before their children are released?

Yes, the whole driver model :)

When adding a new device, we always grab a reference to the parent
device so it can not go away before we do.

Look at the last kobject_put(parent); in kobject_cleanup() which ensures
this.

> Index: usb-2.6/lib/kobject.c
> ===
> --- usb-2.6.orig/lib/kobject.c
> +++ usb-2.6/lib/kobject.c
> @@ -192,12 +192,15 @@ void kobject_init(struct kobject * kobj)
>  
>  static void unlink(struct kobject * kobj)
>  {
> + struct kobject *parent = kobj->parent;
> +
>   if (kobj->kset) {
>   spin_lock(>kset->list_lock);
>   list_del_init(>entry);
>   spin_unlock(>kset->list_lock);
>   }
>   kobject_put(kobj);
> + kobject_put(parent);
>  }
>  
>  /**
> @@ -241,7 +244,6 @@ int kobject_shadow_add(struct kobject * 
>   if (error) {
>   /* unlink does the kobject_put() for us */
>   unlink(kobj);
> - kobject_put(parent);
>  
>   /* be noisy on error issues */
>   if (error == -EEXIST)
> @@ -489,7 +491,6 @@ void kobject_cleanup(struct kobject * ko
>  {
>   struct kobj_type * t = get_ktype(kobj);
>   struct kset * s = kobj->kset;
> - struct kobject * parent = kobj->parent;
>  
>   pr_debug("kobject %s: cleaning up\n",kobject_name(kobj));
>   if (kobj->k_name != kobj->name)
> @@ -505,7 +506,6 @@ void kobject_cleanup(struct kobject * ko
>  
>   if (s)
>   kset_put(s);
> - kobject_put(parent);
>  }

Ick, no, I think this used to be the way things worked, but bad things
would end up happening, so we fixed it up to be the way things are
today.  Read the comments for the changelog for this file for details.

Specifically, look at commit 10921a8f1305b8ec97794941db78b825db5839bc
in the history.git repo which is almost exactly what you are proposing
to be reverted...

thanks,

greg k-h
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH RFD] alternative kobject release wait mechanism

2007-04-20 Thread Dmitry Torokhov

On 4/20/07, Tejun Heo <[EMAIL PROTECTED]> wrote:

Hello, Dmitry.

Dmitry Torokhov wrote:
>> Many drivers (at least all the SCSI/IDE ones) consider struct device as
>> the base class of the devices those drivers implement.  I don't think we
>> can just consider those drivers to be wrong.
>
> I am not saying they are wrong I am just saying that driver core is
> not where most work is done. Every subsystem has its own locking rules
> and lifetime rules and they are what is important. Whether subsystem
> uses embedding or referencing of struct devices is implementation
> detail.
>
> What is the goal of driver core? I thought the main goal was to have
> an uniform interface for device power management and presentation of
> device tree to userspace. It has nothing to do with main purposes of
> every individual subsystem - making some set of devices/services work.

I think we're running in circle here.

1. The driver's lifetime rules matters but currently the driver model
imposes reference counted model to each struct device.

2. You can decouple struct device completely from actual driver
implementation.

I agree with you that #2 is possible but just don't think it's the right
thing to do.  By making struct device independent from driver
implementation doesn't help solving lifetime problems in drivers.  It
just insulates driver model from those.



Isn't think a good thing? By decoupling the 2 layers we insulate them
from changes in each other. This allows bug subsystems to concentrate
on topics that important to them instead of worying about refcounting
objects that are not directly interesting for the subsystem in
question.

Now for smaller subsystems it may make sense to embed stuct devices
into subsystem objects and manage it all together. In fact input
system does this but I think it is much simlpier than SCSI or IDE.

--
Dmitry
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH RFD] alternative kobject release wait mechanism

2007-04-20 Thread Tejun Heo
Hello, Dmitry.

Dmitry Torokhov wrote:
>> Many drivers (at least all the SCSI/IDE ones) consider struct device as
>> the base class of the devices those drivers implement.  I don't think we
>> can just consider those drivers to be wrong.
> 
> I am not saying they are wrong I am just saying that driver core is
> not where most work is done. Every subsystem has its own locking rules
> and lifetime rules and they are what is important. Whether subsystem
> uses embedding or referencing of struct devices is implementation
> detail.
> 
> What is the goal of driver core? I thought the main goal was to have
> an uniform interface for device power management and presentation of
> device tree to userspace. It has nothing to do with main purposes of
> every individual subsystem - making some set of devices/services work.

I think we're running in circle here.

1. The driver's lifetime rules matters but currently the driver model
imposes reference counted model to each struct device.

2. You can decouple struct device completely from actual driver
implementation.

I agree with you that #2 is possible but just don't think it's the right
thing to do.  By making struct device independent from driver
implementation doesn't help solving lifetime problems in drivers.  It
just insulates driver model from those.

-- 
tejun
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH RFD] alternative kobject release wait mechanism

2007-04-20 Thread Dmitry Torokhov

On 4/19/07, Alan Stern <[EMAIL PROTECTED]> wrote:


Among the worst offenders are character devices.  None of the subsystem
providers offering char device registration performs immediate detach --
they are a lot like sysfs used to be.  (In fact, they probably _can't_
provide it since read() or write() calls can block indefinitely in the
lower-level driver; the subsystem may have no way to force them to
fail-fast.)



That shoudl be doable - the read/write routines should check if device
is still alive and return immideately when device is dead. When
disconnecting device just wake up all readers and writers and they
should eventually fall off.

Hmm, I guess am starting to think that using refcounting everywhere is
not a good idea. We are trying to have "immediate disconnect" behavior
and refcounting is an antithesis of it. Refcounting works well when it
is contained - register grabs reference; unregister puts it back; but
there is no passing references down between the layers. When device is
being removed it needs to signal downstream that it is gone and should
not be accessed anymore. And we need to do that anyway because if
device is really gone but its users ignore it they will get endless
stream of errors when trying to access it.

--
Dmitry
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH RFD] alternative kobject release wait mechanism

2007-04-20 Thread Dmitry Torokhov

Hi Tejun,

On 4/20/07, Tejun Heo <[EMAIL PROTECTED]> wrote:

Hello, Dmitry.

Dmitry Torokhov wrote:
> On 4/19/07, Cornelia Huck <[EMAIL PROTECTED]> wrote:
>> On Thu, 19 Apr 2007 09:13:43 -0400,
>> "Dmitry Torokhov" <[EMAIL PROTECTED]> wrote:
>>
>> > Because they are managed by 2 different entities. the struct device
>> > objects are managed by device core and driver-specific objects are
>> > managed by their respective driver.
>>
>> Not sure if I understand you here. My view of this was always that the
>> embedding object was kind of an extended device and that the relevant
>> driver/subsystem managed it through the driver core infrastructure.
>>
>
> I am not sure if I agree with this point of view. Driver (or
> subsystem) provides an instance of struct device for the rest of the
> system to iteract uniformly with (suspend/resume/tree
> visualization/etc) i.e. struct device implement an interface for
> subsystems. However most of the system use their own mechanisms to
> manage their devices. They can rely on the driver core to a certain
> degree but driver core is mostly a carries out helper functions, not
> the meat.

Many drivers (at least all the SCSI/IDE ones) consider struct device as
the base class of the devices those drivers implement.  I don't think we
can just consider those drivers to be wrong.


I am not saying they are wrong I am just saying that driver core is
not where most work is done. Every subsystem has its own locking rules
and lifetime rules and they are what is important. Whether subsystem
uses embedding or referencing of struct devices is implementation
detail.

What is the goal of driver core? I thought the main goal was to have
an uniform interface for device power management and presentation of
device tree to userspace. It has nothing to do with main purposes of
every individual subsystem - making some set of devices/services work.

--
Dmitry
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH RFD] alternative kobject release wait mechanism

2007-04-20 Thread Dmitry Torokhov

On 4/20/07, Alan Stern <[EMAIL PROTECTED]> wrote:


Dmitry, in thinking things over some more I realized there's going to be a
problem with the autosuspend support in USB.  It has to do with the way a
driver needs to prevent (or block) suspends from occurring while it is
actively using a device.

To understand this, you need to know that USB adds a pm_mutex to the
device structure.  It gets used for synchronizing all suspend- or
resume-related activities.  In particular, the driver's suspend() method
is called with usbdev->pm_mutex held.

The next idea is that a driver needs to synchronize its remove() method
with other methods such as read() -- we mustn't allow read() to try and
refer to the device after the driver has been unbound.  Let's say the
driver has unbind_mutex embedded in its private data structure for this
purpose.

Now consider what read() has to do.  It needs to block suspends from
occurring while it runs, and it needs to do an autoresume if the device
was already suspended.  So read() will look like this:

   mutex_lock(>unbind_mutex);
   if (private->gone) {
   ret = -ENODEV;
   goto done;
   }
   if (private->suspended)
   autoresume(private->usbdev);
   ...
 done:
   mutex_unlock(>unbind_mutex);
   return ret;

Meanwhile, the suspend() method needs to block while read() is running.
So it will look like this:

   private->suspended = 1;
   mutex_lock(>unbind_mutex);
   /* Now the driver has quiesced */
   mutex_unlock(>unbind_mutex);

Here's the problem:  The autoresume() call inside read() tries to acquire
usbdev->pm_mutex while holding private->unbind_mutex.  The suspend()
method does the reverse, a locking-order violation.

So far I haven't figured out any way to make this work.  Do you have a
suggestion?  (Don't say read() should hold pm_mutex as well as
unbind_mutex; that's no good -- although the reason is rather obscure.)



First of all I think that I would merge pm and unbind mutex into one -
you also need to synchronize resume and suspend with bind and unbind
so you pretty much need to always acquire both of them.

Second (and I admit I have not followed USB autoresume discussions, my
usb-devle backlog is over 5000 messages ;( ) is I am not sure why
woudl a read block autoresume. I can see write doing that but passive
reads should not affect device state.

--
Dmitry
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH RFD] alternative kobject release wait mechanism

2007-04-20 Thread Alan Stern
Here's a patch to do what I mentioned earlier.  Not tested -- it may 
expose some existing bugs.  It may even break something, but I'm not aware 
of anything that depends on it explicitly.

Greg, do you know of anything in particular that depends on a kobjects not 
being released before their children are released?

This is meant to be used without any of the other changes Cornelia and I 
have posted.  Although it might not hurt to have them all present...

Alan Stern



Index: usb-2.6/lib/kobject.c
===
--- usb-2.6.orig/lib/kobject.c
+++ usb-2.6/lib/kobject.c
@@ -192,12 +192,15 @@ void kobject_init(struct kobject * kobj)
 
 static void unlink(struct kobject * kobj)
 {
+   struct kobject *parent = kobj->parent;
+
if (kobj->kset) {
spin_lock(>kset->list_lock);
list_del_init(>entry);
spin_unlock(>kset->list_lock);
}
kobject_put(kobj);
+   kobject_put(parent);
 }
 
 /**
@@ -241,7 +244,6 @@ int kobject_shadow_add(struct kobject * 
if (error) {
/* unlink does the kobject_put() for us */
unlink(kobj);
-   kobject_put(parent);
 
/* be noisy on error issues */
if (error == -EEXIST)
@@ -489,7 +491,6 @@ void kobject_cleanup(struct kobject * ko
 {
struct kobj_type * t = get_ktype(kobj);
struct kset * s = kobj->kset;
-   struct kobject * parent = kobj->parent;
 
pr_debug("kobject %s: cleaning up\n",kobject_name(kobj));
if (kobj->k_name != kobj->name)
@@ -505,7 +506,6 @@ void kobject_cleanup(struct kobject * ko
 
if (s)
kset_put(s);
-   kobject_put(parent);
 }
 
 static void kobject_release(struct kref *kref)

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH RFD] alternative kobject release wait mechanism

2007-04-20 Thread Alan Stern
On Fri, 20 Apr 2007, Tejun Heo wrote:

> Hello, Alan.
> 
> Alan Stern wrote:
> > This doesn't solve a related problem: a subsystem wants to register
> > devices and to provide a set of mutually-exclusive services to the
> > devices' drivers.  The mutual exclusion has to be provided by a mutex or
> > something similar, and the drivers need a way to unbind even while waiting
> > to acquire the mutex.
> 
> I don't really follow why the drivers need a way to unbind even while
> waiting to acquire the mutex.  Care to enlighten me?

Forget I mentioned it.  It isn't a problem if the subsystem uses its own
mutex to provide the mutual exclusion.  Things become difficult only
if the subsystem tries to use dev->sem.

> > I thought of something else that could also be done: There should be a way
> > to cancel an outstanding workqueue request.  At the moment all you can do
> > is call flush_workqueue(), which will hang if you are already executing in
> > a workqueue routine.  You should be able to delete a particular entry from
> > the workqueue (or wait for it to complete if it has already started
> > running).  This could be implemented right away.
> 
> It all depends on how a particular subsystem is shaped but having such
> thing would definitely help.

I saw something recently suggesting such a thing is already present in
Andrew Morton's queue.  Great minds think alike...  :-)


> Yeah, exactly.  My argument is that that impedance matching between
> lifetime rules must happen at some place and it's better if we can do in
> the higher layer where we can afford more effort and thus complexity.
> We're currently pushing that down to each drivers and not too many are
> getting it right.  I think it's just unrealistic to expect every and
> each driver subsystems to get it right, so some overhead at higher layer
> is acceptable and we can definitely afford much more optimization at
> higher layer.

Agreed.


You know, things may not be quite as bad as I had thought.  I was under 
the impression that the driver core did put_device(dev->parent) during 
device_release().  But that's not the case; the call is made during 
device_del().

This makes things very different.  Failure to drop references to a device
during remove() won't cause any lingering references to the device's
parent.  The effects of one badly-behaved driver won't propagate
indefinitely far up the device tree.

On the other hand, although devices behave this way, kobjects do not.  The 
call to kobject_put(kobj->parent) is in kobject_cleanup(), not in 
kobject_del() or unlink().  So this still needs to be fixed.  It may be 
related to the sysfs <-> kobject link you have been trying to break.


Dmitry, in thinking things over some more I realized there's going to be a
problem with the autosuspend support in USB.  It has to do with the way a
driver needs to prevent (or block) suspends from occurring while it is 
actively using a device.

To understand this, you need to know that USB adds a pm_mutex to the
device structure.  It gets used for synchronizing all suspend- or
resume-related activities.  In particular, the driver's suspend() method
is called with usbdev->pm_mutex held.

The next idea is that a driver needs to synchronize its remove() method
with other methods such as read() -- we mustn't allow read() to try and
refer to the device after the driver has been unbound.  Let's say the
driver has unbind_mutex embedded in its private data structure for this 
purpose.

Now consider what read() has to do.  It needs to block suspends from 
occurring while it runs, and it needs to do an autoresume if the device 
was already suspended.  So read() will look like this:

mutex_lock(>unbind_mutex);
if (private->gone) {
ret = -ENODEV;
goto done;
}
if (private->suspended)
autoresume(private->usbdev);
...
 done:
mutex_unlock(>unbind_mutex);
return ret;

Meanwhile, the suspend() method needs to block while read() is running.  
So it will look like this:

private->suspended = 1;
mutex_lock(>unbind_mutex);
/* Now the driver has quiesced */
mutex_unlock(>unbind_mutex);

Here's the problem:  The autoresume() call inside read() tries to acquire
usbdev->pm_mutex while holding private->unbind_mutex.  The suspend()  
method does the reverse, a locking-order violation.

So far I haven't figured out any way to make this work.  Do you have a
suggestion?  (Don't say read() should hold pm_mutex as well as
unbind_mutex; that's no good -- although the reason is rather obscure.)

Alan Stern

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH RFD] alternative kobject release wait mechanism

2007-04-20 Thread Cornelia Huck
On Fri, 20 Apr 2007 14:27:06 +0900,
Tejun Heo <[EMAIL PROTECTED]> wrote:

> Hello, Alan.
> 
> Alan Stern wrote:
> > This doesn't solve a related problem: a subsystem wants to register
> > devices and to provide a set of mutually-exclusive services to the
> > devices' drivers.  The mutual exclusion has to be provided by a mutex or
> > something similar, and the drivers need a way to unbind even while waiting
> > to acquire the mutex.
> 
> I don't really follow why the drivers need a way to unbind even while
> waiting to acquire the mutex.  Care to enlighten me?

I guess when the driver is just being ripped out or an unbind has been
triggered or similar.

> Yeah, exactly.  My argument is that that impedance matching between
> lifetime rules must happen at some place and it's better if we can do in
> the higher layer where we can afford more effort and thus complexity.
> We're currently pushing that down to each drivers and not too many are
> getting it right.  I think it's just unrealistic to expect every and
> each driver subsystems to get it right, so some overhead at higher layer
> is acceptable and we can definitely afford much more optimization at
> higher layer.

I agree. It isn't easy to get this right, so just we should just solve
that once for all drivers.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH RFD] alternative kobject release wait mechanism

2007-04-20 Thread Cornelia Huck
On Thu, 19 Apr 2007 16:49:11 +0200,
Cornelia Huck <[EMAIL PROTECTED]> wrote:

> On Thu, 19 Apr 2007 10:20:51 -0400 (EDT),
> Alan Stern <[EMAIL PROTECTED]> wrote:
> 
> > The patch below, applied on top of Cornelia's changes plus the 
> > kobject_init() patch I posted earlier, actually seems to work.  And it 
> > prevents an oops which I was able to trigger without it.
> 
> Looks nice at first glance. I'll try to test it a bit.

OK, I've tested this a bit (with a silly dummy driver and zfcp with
hacked-back-in unload support) and I haven't encountered any problems
so far. So I second the seems-to-work statement :)
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH RFD] alternative kobject release wait mechanism

2007-04-20 Thread Tejun Heo
Hello, Dmitry.

Dmitry Torokhov wrote:
> On 4/19/07, Cornelia Huck <[EMAIL PROTECTED]> wrote:
>> On Thu, 19 Apr 2007 09:13:43 -0400,
>> "Dmitry Torokhov" <[EMAIL PROTECTED]> wrote:
>>
>> > Because they are managed by 2 different entities. the struct device
>> > objects are managed by device core and driver-specific objects are
>> > managed by their respective driver.
>>
>> Not sure if I understand you here. My view of this was always that the
>> embedding object was kind of an extended device and that the relevant
>> driver/subsystem managed it through the driver core infrastructure.
>>
> 
> I am not sure if I agree with this point of view. Driver (or
> subsystem) provides an instance of struct device for the rest of the
> system to iteract uniformly with (suspend/resume/tree
> visualization/etc) i.e. struct device implement an interface for
> subsystems. However most of the system use their own mechanisms to
> manage their devices. They can rely on the driver core to a certain
> degree but driver core is mostly a carries out helper functions, not
> the meat.

Many drivers (at least all the SCSI/IDE ones) consider struct device as
the base class of the devices those drivers implement.  I don't think we
can just consider those drivers to be wrong.

-- 
tejun
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH RFD] alternative kobject release wait mechanism

2007-04-20 Thread Tejun Heo
Hello, Dmitry.

Dmitry Torokhov wrote:
 On 4/19/07, Cornelia Huck [EMAIL PROTECTED] wrote:
 On Thu, 19 Apr 2007 09:13:43 -0400,
 Dmitry Torokhov [EMAIL PROTECTED] wrote:

  Because they are managed by 2 different entities. the struct device
  objects are managed by device core and driver-specific objects are
  managed by their respective driver.

 Not sure if I understand you here. My view of this was always that the
 embedding object was kind of an extended device and that the relevant
 driver/subsystem managed it through the driver core infrastructure.

 
 I am not sure if I agree with this point of view. Driver (or
 subsystem) provides an instance of struct device for the rest of the
 system to iteract uniformly with (suspend/resume/tree
 visualization/etc) i.e. struct device implement an interface for
 subsystems. However most of the system use their own mechanisms to
 manage their devices. They can rely on the driver core to a certain
 degree but driver core is mostly a carries out helper functions, not
 the meat.

Many drivers (at least all the SCSI/IDE ones) consider struct device as
the base class of the devices those drivers implement.  I don't think we
can just consider those drivers to be wrong.

-- 
tejun
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH RFD] alternative kobject release wait mechanism

2007-04-20 Thread Cornelia Huck
On Thu, 19 Apr 2007 16:49:11 +0200,
Cornelia Huck [EMAIL PROTECTED] wrote:

 On Thu, 19 Apr 2007 10:20:51 -0400 (EDT),
 Alan Stern [EMAIL PROTECTED] wrote:
 
  The patch below, applied on top of Cornelia's changes plus the 
  kobject_init() patch I posted earlier, actually seems to work.  And it 
  prevents an oops which I was able to trigger without it.
 
 Looks nice at first glance. I'll try to test it a bit.

OK, I've tested this a bit (with a silly dummy driver and zfcp with
hacked-back-in unload support) and I haven't encountered any problems
so far. So I second the seems-to-work statement :)
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH RFD] alternative kobject release wait mechanism

2007-04-20 Thread Cornelia Huck
On Fri, 20 Apr 2007 14:27:06 +0900,
Tejun Heo [EMAIL PROTECTED] wrote:

 Hello, Alan.
 
 Alan Stern wrote:
  This doesn't solve a related problem: a subsystem wants to register
  devices and to provide a set of mutually-exclusive services to the
  devices' drivers.  The mutual exclusion has to be provided by a mutex or
  something similar, and the drivers need a way to unbind even while waiting
  to acquire the mutex.
 
 I don't really follow why the drivers need a way to unbind even while
 waiting to acquire the mutex.  Care to enlighten me?

I guess when the driver is just being ripped out or an unbind has been
triggered or similar.

 Yeah, exactly.  My argument is that that impedance matching between
 lifetime rules must happen at some place and it's better if we can do in
 the higher layer where we can afford more effort and thus complexity.
 We're currently pushing that down to each drivers and not too many are
 getting it right.  I think it's just unrealistic to expect every and
 each driver subsystems to get it right, so some overhead at higher layer
 is acceptable and we can definitely afford much more optimization at
 higher layer.

I agree. It isn't easy to get this right, so just we should just solve
that once for all drivers.
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH RFD] alternative kobject release wait mechanism

2007-04-20 Thread Alan Stern
Here's a patch to do what I mentioned earlier.  Not tested -- it may 
expose some existing bugs.  It may even break something, but I'm not aware 
of anything that depends on it explicitly.

Greg, do you know of anything in particular that depends on a kobjects not 
being released before their children are released?

This is meant to be used without any of the other changes Cornelia and I 
have posted.  Although it might not hurt to have them all present...

Alan Stern



Index: usb-2.6/lib/kobject.c
===
--- usb-2.6.orig/lib/kobject.c
+++ usb-2.6/lib/kobject.c
@@ -192,12 +192,15 @@ void kobject_init(struct kobject * kobj)
 
 static void unlink(struct kobject * kobj)
 {
+   struct kobject *parent = kobj-parent;
+
if (kobj-kset) {
spin_lock(kobj-kset-list_lock);
list_del_init(kobj-entry);
spin_unlock(kobj-kset-list_lock);
}
kobject_put(kobj);
+   kobject_put(parent);
 }
 
 /**
@@ -241,7 +244,6 @@ int kobject_shadow_add(struct kobject * 
if (error) {
/* unlink does the kobject_put() for us */
unlink(kobj);
-   kobject_put(parent);
 
/* be noisy on error issues */
if (error == -EEXIST)
@@ -489,7 +491,6 @@ void kobject_cleanup(struct kobject * ko
 {
struct kobj_type * t = get_ktype(kobj);
struct kset * s = kobj-kset;
-   struct kobject * parent = kobj-parent;
 
pr_debug(kobject %s: cleaning up\n,kobject_name(kobj));
if (kobj-k_name != kobj-name)
@@ -505,7 +506,6 @@ void kobject_cleanup(struct kobject * ko
 
if (s)
kset_put(s);
-   kobject_put(parent);
 }
 
 static void kobject_release(struct kref *kref)

-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH RFD] alternative kobject release wait mechanism

2007-04-20 Thread Alan Stern
On Fri, 20 Apr 2007, Tejun Heo wrote:

 Hello, Alan.
 
 Alan Stern wrote:
  This doesn't solve a related problem: a subsystem wants to register
  devices and to provide a set of mutually-exclusive services to the
  devices' drivers.  The mutual exclusion has to be provided by a mutex or
  something similar, and the drivers need a way to unbind even while waiting
  to acquire the mutex.
 
 I don't really follow why the drivers need a way to unbind even while
 waiting to acquire the mutex.  Care to enlighten me?

Forget I mentioned it.  It isn't a problem if the subsystem uses its own
mutex to provide the mutual exclusion.  Things become difficult only
if the subsystem tries to use dev-sem.

  I thought of something else that could also be done: There should be a way
  to cancel an outstanding workqueue request.  At the moment all you can do
  is call flush_workqueue(), which will hang if you are already executing in
  a workqueue routine.  You should be able to delete a particular entry from
  the workqueue (or wait for it to complete if it has already started
  running).  This could be implemented right away.
 
 It all depends on how a particular subsystem is shaped but having such
 thing would definitely help.

I saw something recently suggesting such a thing is already present in
Andrew Morton's queue.  Great minds think alike...  :-)


 Yeah, exactly.  My argument is that that impedance matching between
 lifetime rules must happen at some place and it's better if we can do in
 the higher layer where we can afford more effort and thus complexity.
 We're currently pushing that down to each drivers and not too many are
 getting it right.  I think it's just unrealistic to expect every and
 each driver subsystems to get it right, so some overhead at higher layer
 is acceptable and we can definitely afford much more optimization at
 higher layer.

Agreed.


You know, things may not be quite as bad as I had thought.  I was under 
the impression that the driver core did put_device(dev-parent) during 
device_release().  But that's not the case; the call is made during 
device_del().

This makes things very different.  Failure to drop references to a device
during remove() won't cause any lingering references to the device's
parent.  The effects of one badly-behaved driver won't propagate
indefinitely far up the device tree.

On the other hand, although devices behave this way, kobjects do not.  The 
call to kobject_put(kobj-parent) is in kobject_cleanup(), not in 
kobject_del() or unlink().  So this still needs to be fixed.  It may be 
related to the sysfs - kobject link you have been trying to break.


Dmitry, in thinking things over some more I realized there's going to be a
problem with the autosuspend support in USB.  It has to do with the way a
driver needs to prevent (or block) suspends from occurring while it is 
actively using a device.

To understand this, you need to know that USB adds a pm_mutex to the
device structure.  It gets used for synchronizing all suspend- or
resume-related activities.  In particular, the driver's suspend() method
is called with usbdev-pm_mutex held.

The next idea is that a driver needs to synchronize its remove() method
with other methods such as read() -- we mustn't allow read() to try and
refer to the device after the driver has been unbound.  Let's say the
driver has unbind_mutex embedded in its private data structure for this 
purpose.

Now consider what read() has to do.  It needs to block suspends from 
occurring while it runs, and it needs to do an autoresume if the device 
was already suspended.  So read() will look like this:

mutex_lock(private-unbind_mutex);
if (private-gone) {
ret = -ENODEV;
goto done;
}
if (private-suspended)
autoresume(private-usbdev);
...
 done:
mutex_unlock(private-unbind_mutex);
return ret;

Meanwhile, the suspend() method needs to block while read() is running.  
So it will look like this:

private-suspended = 1;
mutex_lock(private-unbind_mutex);
/* Now the driver has quiesced */
mutex_unlock(private-unbind_mutex);

Here's the problem:  The autoresume() call inside read() tries to acquire
usbdev-pm_mutex while holding private-unbind_mutex.  The suspend()  
method does the reverse, a locking-order violation.

So far I haven't figured out any way to make this work.  Do you have a
suggestion?  (Don't say read() should hold pm_mutex as well as
unbind_mutex; that's no good -- although the reason is rather obscure.)

Alan Stern

-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH RFD] alternative kobject release wait mechanism

2007-04-20 Thread Dmitry Torokhov

On 4/20/07, Alan Stern [EMAIL PROTECTED] wrote:


Dmitry, in thinking things over some more I realized there's going to be a
problem with the autosuspend support in USB.  It has to do with the way a
driver needs to prevent (or block) suspends from occurring while it is
actively using a device.

To understand this, you need to know that USB adds a pm_mutex to the
device structure.  It gets used for synchronizing all suspend- or
resume-related activities.  In particular, the driver's suspend() method
is called with usbdev-pm_mutex held.

The next idea is that a driver needs to synchronize its remove() method
with other methods such as read() -- we mustn't allow read() to try and
refer to the device after the driver has been unbound.  Let's say the
driver has unbind_mutex embedded in its private data structure for this
purpose.

Now consider what read() has to do.  It needs to block suspends from
occurring while it runs, and it needs to do an autoresume if the device
was already suspended.  So read() will look like this:

   mutex_lock(private-unbind_mutex);
   if (private-gone) {
   ret = -ENODEV;
   goto done;
   }
   if (private-suspended)
   autoresume(private-usbdev);
   ...
 done:
   mutex_unlock(private-unbind_mutex);
   return ret;

Meanwhile, the suspend() method needs to block while read() is running.
So it will look like this:

   private-suspended = 1;
   mutex_lock(private-unbind_mutex);
   /* Now the driver has quiesced */
   mutex_unlock(private-unbind_mutex);

Here's the problem:  The autoresume() call inside read() tries to acquire
usbdev-pm_mutex while holding private-unbind_mutex.  The suspend()
method does the reverse, a locking-order violation.

So far I haven't figured out any way to make this work.  Do you have a
suggestion?  (Don't say read() should hold pm_mutex as well as
unbind_mutex; that's no good -- although the reason is rather obscure.)



First of all I think that I would merge pm and unbind mutex into one -
you also need to synchronize resume and suspend with bind and unbind
so you pretty much need to always acquire both of them.

Second (and I admit I have not followed USB autoresume discussions, my
usb-devle backlog is over 5000 messages ;( ) is I am not sure why
woudl a read block autoresume. I can see write doing that but passive
reads should not affect device state.

--
Dmitry
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH RFD] alternative kobject release wait mechanism

2007-04-20 Thread Dmitry Torokhov

On 4/19/07, Alan Stern [EMAIL PROTECTED] wrote:


Among the worst offenders are character devices.  None of the subsystem
providers offering char device registration performs immediate detach --
they are a lot like sysfs used to be.  (In fact, they probably _can't_
provide it since read() or write() calls can block indefinitely in the
lower-level driver; the subsystem may have no way to force them to
fail-fast.)



That shoudl be doable - the read/write routines should check if device
is still alive and return immideately when device is dead. When
disconnecting device just wake up all readers and writers and they
should eventually fall off.

Hmm, I guess am starting to think that using refcounting everywhere is
not a good idea. We are trying to have immediate disconnect behavior
and refcounting is an antithesis of it. Refcounting works well when it
is contained - register grabs reference; unregister puts it back; but
there is no passing references down between the layers. When device is
being removed it needs to signal downstream that it is gone and should
not be accessed anymore. And we need to do that anyway because if
device is really gone but its users ignore it they will get endless
stream of errors when trying to access it.

--
Dmitry
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH RFD] alternative kobject release wait mechanism

2007-04-20 Thread Dmitry Torokhov

Hi Tejun,

On 4/20/07, Tejun Heo [EMAIL PROTECTED] wrote:

Hello, Dmitry.

Dmitry Torokhov wrote:
 On 4/19/07, Cornelia Huck [EMAIL PROTECTED] wrote:
 On Thu, 19 Apr 2007 09:13:43 -0400,
 Dmitry Torokhov [EMAIL PROTECTED] wrote:

  Because they are managed by 2 different entities. the struct device
  objects are managed by device core and driver-specific objects are
  managed by their respective driver.

 Not sure if I understand you here. My view of this was always that the
 embedding object was kind of an extended device and that the relevant
 driver/subsystem managed it through the driver core infrastructure.


 I am not sure if I agree with this point of view. Driver (or
 subsystem) provides an instance of struct device for the rest of the
 system to iteract uniformly with (suspend/resume/tree
 visualization/etc) i.e. struct device implement an interface for
 subsystems. However most of the system use their own mechanisms to
 manage their devices. They can rely on the driver core to a certain
 degree but driver core is mostly a carries out helper functions, not
 the meat.

Many drivers (at least all the SCSI/IDE ones) consider struct device as
the base class of the devices those drivers implement.  I don't think we
can just consider those drivers to be wrong.


I am not saying they are wrong I am just saying that driver core is
not where most work is done. Every subsystem has its own locking rules
and lifetime rules and they are what is important. Whether subsystem
uses embedding or referencing of struct devices is implementation
detail.

What is the goal of driver core? I thought the main goal was to have
an uniform interface for device power management and presentation of
device tree to userspace. It has nothing to do with main purposes of
every individual subsystem - making some set of devices/services work.

--
Dmitry
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH RFD] alternative kobject release wait mechanism

2007-04-20 Thread Tejun Heo
Hello, Dmitry.

Dmitry Torokhov wrote:
 Many drivers (at least all the SCSI/IDE ones) consider struct device as
 the base class of the devices those drivers implement.  I don't think we
 can just consider those drivers to be wrong.
 
 I am not saying they are wrong I am just saying that driver core is
 not where most work is done. Every subsystem has its own locking rules
 and lifetime rules and they are what is important. Whether subsystem
 uses embedding or referencing of struct devices is implementation
 detail.
 
 What is the goal of driver core? I thought the main goal was to have
 an uniform interface for device power management and presentation of
 device tree to userspace. It has nothing to do with main purposes of
 every individual subsystem - making some set of devices/services work.

I think we're running in circle here.

1. The driver's lifetime rules matters but currently the driver model
imposes reference counted model to each struct device.

2. You can decouple struct device completely from actual driver
implementation.

I agree with you that #2 is possible but just don't think it's the right
thing to do.  By making struct device independent from driver
implementation doesn't help solving lifetime problems in drivers.  It
just insulates driver model from those.

-- 
tejun
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH RFD] alternative kobject release wait mechanism

2007-04-20 Thread Dmitry Torokhov

On 4/20/07, Tejun Heo [EMAIL PROTECTED] wrote:

Hello, Dmitry.

Dmitry Torokhov wrote:
 Many drivers (at least all the SCSI/IDE ones) consider struct device as
 the base class of the devices those drivers implement.  I don't think we
 can just consider those drivers to be wrong.

 I am not saying they are wrong I am just saying that driver core is
 not where most work is done. Every subsystem has its own locking rules
 and lifetime rules and they are what is important. Whether subsystem
 uses embedding or referencing of struct devices is implementation
 detail.

 What is the goal of driver core? I thought the main goal was to have
 an uniform interface for device power management and presentation of
 device tree to userspace. It has nothing to do with main purposes of
 every individual subsystem - making some set of devices/services work.

I think we're running in circle here.

1. The driver's lifetime rules matters but currently the driver model
imposes reference counted model to each struct device.

2. You can decouple struct device completely from actual driver
implementation.

I agree with you that #2 is possible but just don't think it's the right
thing to do.  By making struct device independent from driver
implementation doesn't help solving lifetime problems in drivers.  It
just insulates driver model from those.



Isn't think a good thing? By decoupling the 2 layers we insulate them
from changes in each other. This allows bug subsystems to concentrate
on topics that important to them instead of worying about refcounting
objects that are not directly interesting for the subsystem in
question.

Now for smaller subsystems it may make sense to embed stuct devices
into subsystem objects and manage it all together. In fact input
system does this but I think it is much simlpier than SCSI or IDE.

--
Dmitry
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH RFD] alternative kobject release wait mechanism

2007-04-20 Thread Greg KH
On Fri, Apr 20, 2007 at 11:40:39AM -0400, Alan Stern wrote:
 Here's a patch to do what I mentioned earlier.  Not tested -- it may 
 expose some existing bugs.  It may even break something, but I'm not aware 
 of anything that depends on it explicitly.
 
 Greg, do you know of anything in particular that depends on a kobjects not 
 being released before their children are released?

Yes, the whole driver model :)

When adding a new device, we always grab a reference to the parent
device so it can not go away before we do.

Look at the last kobject_put(parent); in kobject_cleanup() which ensures
this.

 Index: usb-2.6/lib/kobject.c
 ===
 --- usb-2.6.orig/lib/kobject.c
 +++ usb-2.6/lib/kobject.c
 @@ -192,12 +192,15 @@ void kobject_init(struct kobject * kobj)
  
  static void unlink(struct kobject * kobj)
  {
 + struct kobject *parent = kobj-parent;
 +
   if (kobj-kset) {
   spin_lock(kobj-kset-list_lock);
   list_del_init(kobj-entry);
   spin_unlock(kobj-kset-list_lock);
   }
   kobject_put(kobj);
 + kobject_put(parent);
  }
  
  /**
 @@ -241,7 +244,6 @@ int kobject_shadow_add(struct kobject * 
   if (error) {
   /* unlink does the kobject_put() for us */
   unlink(kobj);
 - kobject_put(parent);
  
   /* be noisy on error issues */
   if (error == -EEXIST)
 @@ -489,7 +491,6 @@ void kobject_cleanup(struct kobject * ko
  {
   struct kobj_type * t = get_ktype(kobj);
   struct kset * s = kobj-kset;
 - struct kobject * parent = kobj-parent;
  
   pr_debug(kobject %s: cleaning up\n,kobject_name(kobj));
   if (kobj-k_name != kobj-name)
 @@ -505,7 +506,6 @@ void kobject_cleanup(struct kobject * ko
  
   if (s)
   kset_put(s);
 - kobject_put(parent);
  }

Ick, no, I think this used to be the way things worked, but bad things
would end up happening, so we fixed it up to be the way things are
today.  Read the comments for the changelog for this file for details.

Specifically, look at commit 10921a8f1305b8ec97794941db78b825db5839bc
in the history.git repo which is almost exactly what you are proposing
to be reverted...

thanks,

greg k-h
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH RFD] alternative kobject release wait mechanism

2007-04-19 Thread Tejun Heo
Hello, Alan.

Alan Stern wrote:
> This doesn't solve a related problem: a subsystem wants to register
> devices and to provide a set of mutually-exclusive services to the
> devices' drivers.  The mutual exclusion has to be provided by a mutex or
> something similar, and the drivers need a way to unbind even while waiting
> to acquire the mutex.

I don't really follow why the drivers need a way to unbind even while
waiting to acquire the mutex.  Care to enlighten me?

> The obvious answer is to introduce a different sort of synchronization 
> primitive: a mutex (or semaphore or rwsem) which can be invalidated.
> 
> The semantics would be straightforward.  When mutex_invalidate() is
> called, it marks the mutex so that all future lock attempts will fail with
> -ENODEV.  It also wakes up all threads that are blocked trying to lock the
> mutex and causes them to fail with the same error.  Once all that is done
> mutex_invalidate() returns.  In particular, it doesn't wait for the
> current lock to be released -- in fact, you would call it while holding
> the lock.
> 
> This would solve a lot of your problems.  But it would also mean making 
> extensive changes to the kernel.  For one thing, mutex_lock() would return 
> int instead of void, and you would want to mark it __must_check.  Every 
> place where a mutex is locked, the code would have to be changed to add an 
> error pathway.  That's the sort of thing I was talking about when I said 
> it was going to be a tremendous job.

I think we both agree that's not a good idea.  :-)

> I thought of something else that could also be done: There should be a way
> to cancel an outstanding workqueue request.  At the moment all you can do
> is call flush_workqueue(), which will hang if you are already executing in
> a workqueue routine.  You should be able to delete a particular entry from
> the workqueue (or wait for it to complete if it has already started
> running).  This could be implemented right away.

It all depends on how a particular subsystem is shaped but having such
thing would definitely help.

> More problems with immediate detach -- it would have to apply to char
> devices.  When a char device is unregistered you can't force user
> processes to close their open file handles.  Instead something like your
> change to sysfs is needed -- wait for outstanding calls to complete and
> fail any future calls.  This means that registering a device will use up
> more than just a pointer in a table of minor device numbers.  Each entry
> would require at least an rwsem, and device I/O would be slowed down by
> the need to get a read-lock each time before entering the device driver.
> 
> The same idea applies to block devices, although here the problems center 
> more around the block core and request queues.

Yeah, exactly.  My argument is that that impedance matching between
lifetime rules must happen at some place and it's better if we can do in
the higher layer where we can afford more effort and thus complexity.
We're currently pushing that down to each drivers and not too many are
getting it right.  I think it's just unrealistic to expect every and
each driver subsystems to get it right, so some overhead at higher layer
is acceptable and we can definitely afford much more optimization at
higher layer.

Thanks.

-- 
tejun
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH RFD] alternative kobject release wait mechanism

2007-04-19 Thread Alan Stern
On Thu, 19 Apr 2007, Dmitry Torokhov wrote:

> On 4/19/07, Alan Stern <[EMAIL PROTECTED]> wrote:
> > On Wed, 18 Apr 2007, Dmitry Torokhov wrote:
> >
> > > I am still do not understand why this is needed. Would it not be
> > > simplier just to use a reference to struct device instead of embedding
> > > it in a larger structure if their lifetimes are different and one does
> > > not have a subsystem that takes care of releasing logic.
> > >
> > >
> > > Pretty much drivers have 2 options:
> > >
> > > struct my_device {
> > > void *private_data;
> > > struct device dev;
> > > };
> >
> > Actually people use dev_[gs]et_drvdata() instead of a separate
> > private_data pointer.  That way there's no need for the my_device
> > container.
> >
> 
> No, drvdata belongs to driver that is bound to a device. We are
> talking about private data created and managed by driver that provides
> device.

Oh, sorry, I misunderstood.

> Again, if we are talking about driver bound to a device then devices
> ->release() is irrelevant. If we are talking about driver that created
> device then driver's ->remove() is irrelevant.

The problem is that device structures carry references to their parents.  
So even if a driver does use this option and is able to give up its own 
references immediately, it can't make any guarantees about drivers of 
child devices.  If every driver followed your scheme we'd be okay; until 
then any ancestor of a device with a non-immediate-detach driver would not 
be able to do immediate detach.

There's also another problem.  Even though all the references to
my_dev->dev may be gone, there will probably still be outstanding
references to private_data structure.  The driver's exit() routine will 
have to wait until all those references are gone.  How can it do that 
safely?  Presumably the private_data will include its own refcount and 
some release routine will run when the count drops to 0.  That routine 
will be part of the driver's module -- so how can it enable itself to be 
unloaded?


> > > With current sysfs orphaning attributes upon removal request there is
> > > no issue of accessing driver-private data through references obtained
> > > via ether embedded or referenced dev structure so everything is fine.
> >
> > Not so.  There are other pathways besides sysfs which can cause a driver
> > to access its data structures.
> >
> 
> Which ones? These needs to be identified and treated with "immediate
> disconnect" that you advocated earlier. Once active users of device's
> services are gone you can zap it.

Among the worst offenders are character devices.  None of the subsystem
providers offering char device registration performs immediate detach --
they are a lot like sysfs used to be.  (In fact, they probably _can't_
provide it since read() or write() calls can block indefinitely in the
lower-level driver; the subsystem may have no way to force them to
fail-fast.)  So every lower-level driver which registers a char device
needs to provide its own synchronization and immediate detach, and at the
moment I doubt very many of them do.  And of those which do, a large
percentage rely on the BKL for synchronization!

Here's another awkward possibility.  Suppose a workqueue routine tries to 
unregister a device.  Suppose the device's driver needs to do something in 
a process context, so it has a work item pending in the same workqueue.  
The driver's remove() method would need to flush the workqueue (in order 
to perform immediate detach), but doing so would cause a deadlock.

Alan Stern

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH RFD] alternative kobject release wait mechanism

2007-04-19 Thread Dmitry Torokhov

On 4/19/07, Alan Stern <[EMAIL PROTECTED]> wrote:

On Wed, 18 Apr 2007, Dmitry Torokhov wrote:

> I am still do not understand why this is needed. Would it not be
> simplier just to use a reference to struct device instead of embedding
> it in a larger structure if their lifetimes are different and one does
> not have a subsystem that takes care of releasing logic.
>
>
> Pretty much drivers have 2 options:
>
> struct my_device {
> void *private_data;
> struct device dev;
> };

Actually people use dev_[gs]et_drvdata() instead of a separate
private_data pointer.  That way there's no need for the my_device
container.



No, drvdata belongs to driver that is bound to a device. We are
talking about private data created and managed by driver that provides
device.


> In this case ->release must live in a subsystem code; individual
> drivers kfree(my_dev->private) and do any additional cleanup after
> calling device_unregister(_dev->dev);

That doesn't sound right.  Generally the call to device_unregister() and
the release method live in the same module.  Maybe you meant to say
individual drivers kfree(my_dev->private_data) and do any additional
cleanup in their remove() routine.


Again, if we are talking about driver bound to a device then devices
->release() is irrelevant. If we are talking about driver that created
device then driver's ->remove() is irrelevant.



This approach seems dangerous.  Suppose there's mutex embedded in
my_dev->private_data, and suppose some other thread is blocked waiting on
that mutex when remove() is called.  That other thread will then oops when
my_dev->private_data is deallocated.


What other thread? I suppose it is module-local thread or
subsystem-local thread. Let's that particular subsystem take care of
it's own data and use its own ->release() when it is ready. What I
mean is there is no need to perform clean-up at once; every layer can
do its own cleanup.



> Second option:
>
> struct my_device {
> type member1;
> type member2;
>
>struct device *dev;
> };
>
> dev is coming from _device_create(). Driver core takes care of
> releasing dev structure; driver does cleanup of my_device.

Lots of drivers create devices dynamically without using device_create().

More to the point, how does the driver clean up my_device?  It probably
has a reference count somewhere in my_device, especially if my_device is
shared with other threads or other drivers.  We then face exactly the same
problem: What happens if the driver's module is unloaded before all the
references to my_device are gone?



This is up to subsystem to ensure that it does not access dead devices.


> With current sysfs orphaning attributes upon removal request there is
> no issue of accessing driver-private data through references obtained
> via ether embedded or referenced dev structure so everything is fine.

Not so.  There are other pathways besides sysfs which can cause a driver
to access its data structures.



Which ones? These needs to be identified and treated with "immediate
disconnect" that you advocated earlier. Once active users of device's
services are gone you can zap it.

--
Dmitry
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH RFD] alternative kobject release wait mechanism

2007-04-19 Thread Alan Stern
On Wed, 18 Apr 2007, Dmitry Torokhov wrote:

> I am still do not understand why this is needed. Would it not be
> simplier just to use a reference to struct device instead of embedding
> it in a larger structure if their lifetimes are different and one does
> not have a subsystem that takes care of releasing logic.
> 
> 
> Pretty much drivers have 2 options:
> 
> struct my_device {
> void *private_data;
> struct device dev;
> };

Actually people use dev_[gs]et_drvdata() instead of a separate
private_data pointer.  That way there's no need for the my_device
container.

> In this case ->release must live in a subsystem code; individual
> drivers kfree(my_dev->private) and do any additional cleanup after
> calling device_unregister(_dev->dev);

That doesn't sound right.  Generally the call to device_unregister() and
the release method live in the same module.  Maybe you meant to say
individual drivers kfree(my_dev->private_data) and do any additional
cleanup in their remove() routine.

This approach seems dangerous.  Suppose there's mutex embedded in 
my_dev->private_data, and suppose some other thread is blocked waiting on 
that mutex when remove() is called.  That other thread will then oops when 
my_dev->private_data is deallocated.

> Second option:
> 
> struct my_device {
> type member1;
> type member2;
> 
>struct device *dev;
> };
> 
> dev is coming from _device_create(). Driver core takes care of
> releasing dev structure; driver does cleanup of my_device.

Lots of drivers create devices dynamically without using device_create().

More to the point, how does the driver clean up my_device?  It probably 
has a reference count somewhere in my_device, especially if my_device is 
shared with other threads or other drivers.  We then face exactly the same 
problem: What happens if the driver's module is unloaded before all the 
references to my_device are gone?

> With current sysfs orphaning attributes upon removal request there is
> no issue of accessing driver-private data through references obtained
> via ether embedded or referenced dev structure so everything is fine.

Not so.  There are other pathways besides sysfs which can cause a driver
to access its data structures.

Alan Stern

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH RFD] alternative kobject release wait mechanism

2007-04-19 Thread Cornelia Huck
On Thu, 19 Apr 2007 10:20:51 -0400 (EDT),
Alan Stern <[EMAIL PROTECTED]> wrote:

> The patch below, applied on top of Cornelia's changes plus the 
> kobject_init() patch I posted earlier, actually seems to work.  And it 
> prevents an oops which I was able to trigger without it.

Looks nice at first glance. I'll try to test it a bit.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH RFD] alternative kobject release wait mechanism

2007-04-19 Thread Alan Stern
On Thu, 19 Apr 2007, Cornelia Huck wrote:

> On Thu, 19 Apr 2007 01:06:25 +0900,
> Tejun Heo <[EMAIL PROTECTED]> wrote:
> 
> > Heh heh, I'm amazed it actually works.  Agreed that the list walking
> > isn't pretty but adding completion to each kobject still feels like too
> > much of overhead just for release waiting.  Any better ideas?
> 
> Not really. But I'm more in favour on adding a new field than on adding
> complicated code. (Although, the whole device infrastructure wouldn't
> fare bad with some diet...)

The patch below, applied on top of Cornelia's changes plus the 
kobject_init() patch I posted earlier, actually seems to work.  And it 
prevents an oops which I was able to trigger without it.

Alan Stern



Index: usb-2.6/drivers/base/core.c
===
--- usb-2.6.orig/drivers/base/core.c
+++ usb-2.6/drivers/base/core.c
@@ -479,8 +479,9 @@ static void klist_children_put(struct kl
 
 
 /**
- * device_initialize - init device structure.
+ * device_initialize_owner - init device structure.
  * @dev:   device.
+ * @owner: module owing the release routine.
  *
  * This prepares the device for use by other layers,
  * including adding it to the device hierarchy.
@@ -489,10 +490,10 @@ static void klist_children_put(struct kl
  * may use @dev's fields (e.g. the refcount).
  */
 
-void device_initialize(struct device *dev)
+void device_initialize_owner(struct device *dev, struct module *owner)
 {
kobj_set_kset_s(dev, devices_subsys);
-   kobject_init(>kobj);
+   kobject_init_owner(>kobj, owner);
klist_init(>klist_children, klist_children_get,
   klist_children_put);
INIT_LIST_HEAD(>dma_pools);
@@ -756,8 +757,9 @@ int device_add(struct device *dev)
 
 
 /**
- * device_register - register a device with the system.
+ * device_register_owner - register a device with the system.
  * @dev:   pointer to the device structure
+ * @owner: module owning the release routine
  *
  * This happens in two clean steps - initialize the device
  * and add it to the system. The two steps can be called
@@ -767,9 +769,9 @@ int device_add(struct device *dev)
  * before it is added to the hierarchy.
  */
 
-int device_register(struct device *dev)
+int device_register_owner(struct device *dev, struct module *owner)
 {
-   device_initialize(dev);
+   device_initialize_owner(dev, owner);
return device_add(dev);
 }
 
@@ -1013,9 +1015,9 @@ int __init devices_init(void)
 EXPORT_SYMBOL_GPL(device_for_each_child);
 EXPORT_SYMBOL_GPL(device_find_child);
 
-EXPORT_SYMBOL_GPL(device_initialize);
+EXPORT_SYMBOL_GPL(device_initialize_owner);
 EXPORT_SYMBOL_GPL(device_add);
-EXPORT_SYMBOL_GPL(device_register);
+EXPORT_SYMBOL_GPL(device_register_owner);
 
 EXPORT_SYMBOL_GPL(device_del);
 EXPORT_SYMBOL_GPL(device_unregister);
Index: usb-2.6/include/linux/device.h
===
--- usb-2.6.orig/include/linux/device.h
+++ usb-2.6/include/linux/device.h
@@ -509,9 +509,12 @@ void driver_init(void);
 /*
  * High level routines for use by the bus drivers
  */
-extern int __must_check device_register(struct device * dev);
+extern int __must_check device_register_owner(struct device * dev,
+   struct module *owner);
+#define device_register(dev)   device_register_owner(dev, THIS_MODULE)
 extern void device_unregister(struct device * dev);
-extern void device_initialize(struct device * dev);
+extern void device_initialize_owner(struct device * dev, struct module *owner);
+#define device_initialize(dev) device_initialize_owner(dev, THIS_MODULE)
 extern int __must_check device_add(struct device * dev);
 extern void device_del(struct device * dev);
 extern int device_for_each_child(struct device *, void *,
Index: usb-2.6/drivers/base/platform.c
===
--- usb-2.6.orig/drivers/base/platform.c
+++ usb-2.6/drivers/base/platform.c
@@ -106,13 +106,15 @@ EXPORT_SYMBOL_GPL(platform_get_irq_bynam
  * platform_add_devices - add a numbers of platform devices
  * @devs: array of platform devices to add
  * @num: number of platform devices in array
+ * @owner: module owning the devices in @devs
  */
-int platform_add_devices(struct platform_device **devs, int num)
+int platform_add_devices_owner(struct platform_device **devs, int num,
+   struct module *owner)
 {
int i, ret = 0;
 
for (i = 0; i < num; i++) {
-   ret = platform_device_register(devs[i]);
+   ret = platform_device_register_owner(devs[i], owner);
if (ret) {
while (--i >= 0)
platform_device_unregister(devs[i]);
@@ -122,7 +124,7 @@ int platform_add_devices(struct platform
 
return ret;
 }
-EXPORT_SYMBOL_GPL(platform_add_devices);

Re: [PATCH RFD] alternative kobject release wait mechanism

2007-04-19 Thread Dmitry Torokhov

On 4/19/07, Cornelia Huck <[EMAIL PROTECTED]> wrote:

On Thu, 19 Apr 2007 09:13:43 -0400,
"Dmitry Torokhov" <[EMAIL PROTECTED]> wrote:

> Because they are managed by 2 different entities. the struct device
> objects are managed by device core and driver-specific objects are
> managed by their respective driver.

Not sure if I understand you here. My view of this was always that the
embedding object was kind of an extended device and that the relevant
driver/subsystem managed it through the driver core infrastructure.



I am not sure if I agree with this point of view. Driver (or
subsystem) provides an instance of struct device for the rest of the
system to iteract uniformly with (suspend/resume/tree
visualization/etc) i.e. struct device implement an interface for
subsystems. However most of the system use their own mechanisms to
manage their devices. They can rely on the driver core to a certain
degree but driver core is mostly a carries out helper functions, not
the meat.


>
> > > Pretty much drivers have 2 options:
> > >
> > > struct my_device {
> > > void *private_data;
> > > struct device dev;
> > > };
> > >
> > > In this case ->release must live in a subsystem code; individual
> > > drivers kfree(my_dev->private) and do any additional cleanup after
> > > calling device_unregister(_dev->dev);
> >
> > They must do this in the ->remove callback.
>
> Why? If the driver truly stops hardware then any driver-specific data
> is not needed. With sysfs severing access to removed attributes there
> is no need to gave "global release", cleanup can be done in stages.

I think I meant the same thing :) Freeing the data in the ->release
callback is obviously too late. Freeing it in the ->remove callback
means that the device is no longer really used (and can't be looked up
any more); only some further refrences may linger (and those are of no
consequence with the sysfs disconnect).



Ah, right, I confused ->remove() with ->release() in your post. Sorry
about that.

--
Dmitry
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH RFD] alternative kobject release wait mechanism

2007-04-19 Thread Cornelia Huck
On Thu, 19 Apr 2007 09:13:43 -0400,
"Dmitry Torokhov" <[EMAIL PROTECTED]> wrote:

> Because they are managed by 2 different entities. the struct device
> objects are managed by device core and driver-specific objects are
> managed by their respective driver.

Not sure if I understand you here. My view of this was always that the
embedding object was kind of an extended device and that the relevant
driver/subsystem managed it through the driver core infrastructure.

> 
> > > Pretty much drivers have 2 options:
> > >
> > > struct my_device {
> > > void *private_data;
> > > struct device dev;
> > > };
> > >
> > > In this case ->release must live in a subsystem code; individual
> > > drivers kfree(my_dev->private) and do any additional cleanup after
> > > calling device_unregister(_dev->dev);
> >
> > They must do this in the ->remove callback.
> 
> Why? If the driver truly stops hardware then any driver-specific data
> is not needed. With sysfs severing access to removed attributes there
> is no need to gave "global release", cleanup can be done in stages.

I think I meant the same thing :) Freeing the data in the ->release
callback is obviously too late. Freeing it in the ->remove callback
means that the device is no longer really used (and can't be looked up
any more); only some further refrences may linger (and those are of no
consequence with the sysfs disconnect).
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH RFD] alternative kobject release wait mechanism

2007-04-19 Thread Cornelia Huck
On Thu, 19 Apr 2007 01:06:25 +0900,
Tejun Heo <[EMAIL PROTECTED]> wrote:

> Heh heh, I'm amazed it actually works.  Agreed that the list walking
> isn't pretty but adding completion to each kobject still feels like too
> much of overhead just for release waiting.  Any better ideas?

Not really. But I'm more in favour on adding a new field than on adding
complicated code. (Although, the whole device infrastructure wouldn't
fare bad with some diet...)
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH RFD] alternative kobject release wait mechanism

2007-04-19 Thread Dmitry Torokhov

On 4/19/07, Cornelia Huck <[EMAIL PROTECTED]> wrote:

On Wed, 18 Apr 2007 12:41:36 -0400,
"Dmitry Torokhov" <[EMAIL PROTECTED]> wrote:

> I am still do not understand why this is needed. Would it not be
> simplier just to use a reference to struct device instead of embedding
> it in a larger structure if their lifetimes are different and one does
> not have a subsystem that takes care of releasing logic.

Why are their lifetimes different? Usually, if I hold on to the device,
I also want to be able to use the structure that embeds the device.



Because they are managed by 2 different entities. the struct device
objects are managed by device core and driver-specific objects are
managed by their respective driver.


> Pretty much drivers have 2 options:
>
> struct my_device {
> void *private_data;
> struct device dev;
> };
>
> In this case ->release must live in a subsystem code; individual
> drivers kfree(my_dev->private) and do any additional cleanup after
> calling device_unregister(_dev->dev);

They must do this in the ->remove callback.


Why? If the driver truly stops hardware then any driver-specific data
is not needed. With sysfs severing access to removed attributes there
is no need to gave "global release", cleanup can be done in stages.



>
> Second option:
>
> struct my_device {
> type member1;
> type member2;
>
>struct device *dev;
> };
>
> dev is coming from _device_create(). Driver core takes care of
> releasing dev structure; driver does cleanup of my_device.

device_create() would need to not expect a class then, or it's not
universally usable. Also, the driver would need a method to get back
from the device to my_device. We're practically back at the first
option again, only that now the ->release function is sitting in the
driver core instead of the subsystem.



To a degree.

--
Dmitry
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH RFD] alternative kobject release wait mechanism

2007-04-19 Thread Cornelia Huck
On Wed, 18 Apr 2007 12:41:36 -0400,
"Dmitry Torokhov" <[EMAIL PROTECTED]> wrote:

> I am still do not understand why this is needed. Would it not be
> simplier just to use a reference to struct device instead of embedding
> it in a larger structure if their lifetimes are different and one does
> not have a subsystem that takes care of releasing logic.

Why are their lifetimes different? Usually, if I hold on to the device,
I also want to be able to use the structure that embeds the device.
 
> Pretty much drivers have 2 options:
> 
> struct my_device {
> void *private_data;
> struct device dev;
> };
> 
> In this case ->release must live in a subsystem code; individual
> drivers kfree(my_dev->private) and do any additional cleanup after
> calling device_unregister(_dev->dev);

They must do this in the ->remove callback.

> 
> Second option:
> 
> struct my_device {
> type member1;
> type member2;
> 
>struct device *dev;
> };
> 
> dev is coming from _device_create(). Driver core takes care of
> releasing dev structure; driver does cleanup of my_device.

device_create() would need to not expect a class then, or it's not
universally usable. Also, the driver would need a method to get back
from the device to my_device. We're practically back at the first
option again, only that now the ->release function is sitting in the
driver core instead of the subsystem.

> With current sysfs orphaning attributes upon removal request there is
> no issue of accessing driver-private data through references obtained
> via ether embedded or referenced dev structure so everything is fine.

Discoupling of sysfs and kobject lifetime rules definetly eliminates a
lot of issues.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH RFD] alternative kobject release wait mechanism

2007-04-19 Thread Cornelia Huck
On Wed, 18 Apr 2007 12:41:36 -0400,
Dmitry Torokhov [EMAIL PROTECTED] wrote:

 I am still do not understand why this is needed. Would it not be
 simplier just to use a reference to struct device instead of embedding
 it in a larger structure if their lifetimes are different and one does
 not have a subsystem that takes care of releasing logic.

Why are their lifetimes different? Usually, if I hold on to the device,
I also want to be able to use the structure that embeds the device.
 
 Pretty much drivers have 2 options:
 
 struct my_device {
 void *private_data;
 struct device dev;
 };
 
 In this case -release must live in a subsystem code; individual
 drivers kfree(my_dev-private) and do any additional cleanup after
 calling device_unregister(my_dev-dev);

They must do this in the -remove callback.

 
 Second option:
 
 struct my_device {
 type member1;
 type member2;
 
struct device *dev;
 };
 
 dev is coming from _device_create(). Driver core takes care of
 releasing dev structure; driver does cleanup of my_device.

device_create() would need to not expect a class then, or it's not
universally usable. Also, the driver would need a method to get back
from the device to my_device. We're practically back at the first
option again, only that now the -release function is sitting in the
driver core instead of the subsystem.

 With current sysfs orphaning attributes upon removal request there is
 no issue of accessing driver-private data through references obtained
 via ether embedded or referenced dev structure so everything is fine.

Discoupling of sysfs and kobject lifetime rules definetly eliminates a
lot of issues.
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH RFD] alternative kobject release wait mechanism

2007-04-19 Thread Dmitry Torokhov

On 4/19/07, Cornelia Huck [EMAIL PROTECTED] wrote:

On Wed, 18 Apr 2007 12:41:36 -0400,
Dmitry Torokhov [EMAIL PROTECTED] wrote:

 I am still do not understand why this is needed. Would it not be
 simplier just to use a reference to struct device instead of embedding
 it in a larger structure if their lifetimes are different and one does
 not have a subsystem that takes care of releasing logic.

Why are their lifetimes different? Usually, if I hold on to the device,
I also want to be able to use the structure that embeds the device.



Because they are managed by 2 different entities. the struct device
objects are managed by device core and driver-specific objects are
managed by their respective driver.


 Pretty much drivers have 2 options:

 struct my_device {
 void *private_data;
 struct device dev;
 };

 In this case -release must live in a subsystem code; individual
 drivers kfree(my_dev-private) and do any additional cleanup after
 calling device_unregister(my_dev-dev);

They must do this in the -remove callback.


Why? If the driver truly stops hardware then any driver-specific data
is not needed. With sysfs severing access to removed attributes there
is no need to gave global release, cleanup can be done in stages.




 Second option:

 struct my_device {
 type member1;
 type member2;

struct device *dev;
 };

 dev is coming from _device_create(). Driver core takes care of
 releasing dev structure; driver does cleanup of my_device.

device_create() would need to not expect a class then, or it's not
universally usable. Also, the driver would need a method to get back
from the device to my_device. We're practically back at the first
option again, only that now the -release function is sitting in the
driver core instead of the subsystem.



To a degree.

--
Dmitry
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH RFD] alternative kobject release wait mechanism

2007-04-19 Thread Cornelia Huck
On Thu, 19 Apr 2007 01:06:25 +0900,
Tejun Heo [EMAIL PROTECTED] wrote:

 Heh heh, I'm amazed it actually works.  Agreed that the list walking
 isn't pretty but adding completion to each kobject still feels like too
 much of overhead just for release waiting.  Any better ideas?

Not really. But I'm more in favour on adding a new field than on adding
complicated code. (Although, the whole device infrastructure wouldn't
fare bad with some diet...)
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH RFD] alternative kobject release wait mechanism

2007-04-19 Thread Cornelia Huck
On Thu, 19 Apr 2007 09:13:43 -0400,
Dmitry Torokhov [EMAIL PROTECTED] wrote:

 Because they are managed by 2 different entities. the struct device
 objects are managed by device core and driver-specific objects are
 managed by their respective driver.

Not sure if I understand you here. My view of this was always that the
embedding object was kind of an extended device and that the relevant
driver/subsystem managed it through the driver core infrastructure.

 
   Pretty much drivers have 2 options:
  
   struct my_device {
   void *private_data;
   struct device dev;
   };
  
   In this case -release must live in a subsystem code; individual
   drivers kfree(my_dev-private) and do any additional cleanup after
   calling device_unregister(my_dev-dev);
 
  They must do this in the -remove callback.
 
 Why? If the driver truly stops hardware then any driver-specific data
 is not needed. With sysfs severing access to removed attributes there
 is no need to gave global release, cleanup can be done in stages.

I think I meant the same thing :) Freeing the data in the -release
callback is obviously too late. Freeing it in the -remove callback
means that the device is no longer really used (and can't be looked up
any more); only some further refrences may linger (and those are of no
consequence with the sysfs disconnect).
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH RFD] alternative kobject release wait mechanism

2007-04-19 Thread Alan Stern
On Thu, 19 Apr 2007, Cornelia Huck wrote:

 On Thu, 19 Apr 2007 01:06:25 +0900,
 Tejun Heo [EMAIL PROTECTED] wrote:
 
  Heh heh, I'm amazed it actually works.  Agreed that the list walking
  isn't pretty but adding completion to each kobject still feels like too
  much of overhead just for release waiting.  Any better ideas?
 
 Not really. But I'm more in favour on adding a new field than on adding
 complicated code. (Although, the whole device infrastructure wouldn't
 fare bad with some diet...)

The patch below, applied on top of Cornelia's changes plus the 
kobject_init() patch I posted earlier, actually seems to work.  And it 
prevents an oops which I was able to trigger without it.

Alan Stern



Index: usb-2.6/drivers/base/core.c
===
--- usb-2.6.orig/drivers/base/core.c
+++ usb-2.6/drivers/base/core.c
@@ -479,8 +479,9 @@ static void klist_children_put(struct kl
 
 
 /**
- * device_initialize - init device structure.
+ * device_initialize_owner - init device structure.
  * @dev:   device.
+ * @owner: module owing the release routine.
  *
  * This prepares the device for use by other layers,
  * including adding it to the device hierarchy.
@@ -489,10 +490,10 @@ static void klist_children_put(struct kl
  * may use @dev's fields (e.g. the refcount).
  */
 
-void device_initialize(struct device *dev)
+void device_initialize_owner(struct device *dev, struct module *owner)
 {
kobj_set_kset_s(dev, devices_subsys);
-   kobject_init(dev-kobj);
+   kobject_init_owner(dev-kobj, owner);
klist_init(dev-klist_children, klist_children_get,
   klist_children_put);
INIT_LIST_HEAD(dev-dma_pools);
@@ -756,8 +757,9 @@ int device_add(struct device *dev)
 
 
 /**
- * device_register - register a device with the system.
+ * device_register_owner - register a device with the system.
  * @dev:   pointer to the device structure
+ * @owner: module owning the release routine
  *
  * This happens in two clean steps - initialize the device
  * and add it to the system. The two steps can be called
@@ -767,9 +769,9 @@ int device_add(struct device *dev)
  * before it is added to the hierarchy.
  */
 
-int device_register(struct device *dev)
+int device_register_owner(struct device *dev, struct module *owner)
 {
-   device_initialize(dev);
+   device_initialize_owner(dev, owner);
return device_add(dev);
 }
 
@@ -1013,9 +1015,9 @@ int __init devices_init(void)
 EXPORT_SYMBOL_GPL(device_for_each_child);
 EXPORT_SYMBOL_GPL(device_find_child);
 
-EXPORT_SYMBOL_GPL(device_initialize);
+EXPORT_SYMBOL_GPL(device_initialize_owner);
 EXPORT_SYMBOL_GPL(device_add);
-EXPORT_SYMBOL_GPL(device_register);
+EXPORT_SYMBOL_GPL(device_register_owner);
 
 EXPORT_SYMBOL_GPL(device_del);
 EXPORT_SYMBOL_GPL(device_unregister);
Index: usb-2.6/include/linux/device.h
===
--- usb-2.6.orig/include/linux/device.h
+++ usb-2.6/include/linux/device.h
@@ -509,9 +509,12 @@ void driver_init(void);
 /*
  * High level routines for use by the bus drivers
  */
-extern int __must_check device_register(struct device * dev);
+extern int __must_check device_register_owner(struct device * dev,
+   struct module *owner);
+#define device_register(dev)   device_register_owner(dev, THIS_MODULE)
 extern void device_unregister(struct device * dev);
-extern void device_initialize(struct device * dev);
+extern void device_initialize_owner(struct device * dev, struct module *owner);
+#define device_initialize(dev) device_initialize_owner(dev, THIS_MODULE)
 extern int __must_check device_add(struct device * dev);
 extern void device_del(struct device * dev);
 extern int device_for_each_child(struct device *, void *,
Index: usb-2.6/drivers/base/platform.c
===
--- usb-2.6.orig/drivers/base/platform.c
+++ usb-2.6/drivers/base/platform.c
@@ -106,13 +106,15 @@ EXPORT_SYMBOL_GPL(platform_get_irq_bynam
  * platform_add_devices - add a numbers of platform devices
  * @devs: array of platform devices to add
  * @num: number of platform devices in array
+ * @owner: module owning the devices in @devs
  */
-int platform_add_devices(struct platform_device **devs, int num)
+int platform_add_devices_owner(struct platform_device **devs, int num,
+   struct module *owner)
 {
int i, ret = 0;
 
for (i = 0; i  num; i++) {
-   ret = platform_device_register(devs[i]);
+   ret = platform_device_register_owner(devs[i], owner);
if (ret) {
while (--i = 0)
platform_device_unregister(devs[i]);
@@ -122,7 +124,7 @@ int platform_add_devices(struct platform
 
return ret;
 }
-EXPORT_SYMBOL_GPL(platform_add_devices);

Re: [PATCH RFD] alternative kobject release wait mechanism

2007-04-19 Thread Dmitry Torokhov

On 4/19/07, Cornelia Huck [EMAIL PROTECTED] wrote:

On Thu, 19 Apr 2007 09:13:43 -0400,
Dmitry Torokhov [EMAIL PROTECTED] wrote:

 Because they are managed by 2 different entities. the struct device
 objects are managed by device core and driver-specific objects are
 managed by their respective driver.

Not sure if I understand you here. My view of this was always that the
embedding object was kind of an extended device and that the relevant
driver/subsystem managed it through the driver core infrastructure.



I am not sure if I agree with this point of view. Driver (or
subsystem) provides an instance of struct device for the rest of the
system to iteract uniformly with (suspend/resume/tree
visualization/etc) i.e. struct device implement an interface for
subsystems. However most of the system use their own mechanisms to
manage their devices. They can rely on the driver core to a certain
degree but driver core is mostly a carries out helper functions, not
the meat.



   Pretty much drivers have 2 options:
  
   struct my_device {
   void *private_data;
   struct device dev;
   };
  
   In this case -release must live in a subsystem code; individual
   drivers kfree(my_dev-private) and do any additional cleanup after
   calling device_unregister(my_dev-dev);
 
  They must do this in the -remove callback.

 Why? If the driver truly stops hardware then any driver-specific data
 is not needed. With sysfs severing access to removed attributes there
 is no need to gave global release, cleanup can be done in stages.

I think I meant the same thing :) Freeing the data in the -release
callback is obviously too late. Freeing it in the -remove callback
means that the device is no longer really used (and can't be looked up
any more); only some further refrences may linger (and those are of no
consequence with the sysfs disconnect).



Ah, right, I confused -remove() with -release() in your post. Sorry
about that.

--
Dmitry
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH RFD] alternative kobject release wait mechanism

2007-04-19 Thread Cornelia Huck
On Thu, 19 Apr 2007 10:20:51 -0400 (EDT),
Alan Stern [EMAIL PROTECTED] wrote:

 The patch below, applied on top of Cornelia's changes plus the 
 kobject_init() patch I posted earlier, actually seems to work.  And it 
 prevents an oops which I was able to trigger without it.

Looks nice at first glance. I'll try to test it a bit.
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH RFD] alternative kobject release wait mechanism

2007-04-19 Thread Alan Stern
On Wed, 18 Apr 2007, Dmitry Torokhov wrote:

 I am still do not understand why this is needed. Would it not be
 simplier just to use a reference to struct device instead of embedding
 it in a larger structure if their lifetimes are different and one does
 not have a subsystem that takes care of releasing logic.
 
 
 Pretty much drivers have 2 options:
 
 struct my_device {
 void *private_data;
 struct device dev;
 };

Actually people use dev_[gs]et_drvdata() instead of a separate
private_data pointer.  That way there's no need for the my_device
container.

 In this case -release must live in a subsystem code; individual
 drivers kfree(my_dev-private) and do any additional cleanup after
 calling device_unregister(my_dev-dev);

That doesn't sound right.  Generally the call to device_unregister() and
the release method live in the same module.  Maybe you meant to say
individual drivers kfree(my_dev-private_data) and do any additional
cleanup in their remove() routine.

This approach seems dangerous.  Suppose there's mutex embedded in 
my_dev-private_data, and suppose some other thread is blocked waiting on 
that mutex when remove() is called.  That other thread will then oops when 
my_dev-private_data is deallocated.

 Second option:
 
 struct my_device {
 type member1;
 type member2;
 
struct device *dev;
 };
 
 dev is coming from _device_create(). Driver core takes care of
 releasing dev structure; driver does cleanup of my_device.

Lots of drivers create devices dynamically without using device_create().

More to the point, how does the driver clean up my_device?  It probably 
has a reference count somewhere in my_device, especially if my_device is 
shared with other threads or other drivers.  We then face exactly the same 
problem: What happens if the driver's module is unloaded before all the 
references to my_device are gone?

 With current sysfs orphaning attributes upon removal request there is
 no issue of accessing driver-private data through references obtained
 via ether embedded or referenced dev structure so everything is fine.

Not so.  There are other pathways besides sysfs which can cause a driver
to access its data structures.

Alan Stern

-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH RFD] alternative kobject release wait mechanism

2007-04-19 Thread Dmitry Torokhov

On 4/19/07, Alan Stern [EMAIL PROTECTED] wrote:

On Wed, 18 Apr 2007, Dmitry Torokhov wrote:

 I am still do not understand why this is needed. Would it not be
 simplier just to use a reference to struct device instead of embedding
 it in a larger structure if their lifetimes are different and one does
 not have a subsystem that takes care of releasing logic.


 Pretty much drivers have 2 options:

 struct my_device {
 void *private_data;
 struct device dev;
 };

Actually people use dev_[gs]et_drvdata() instead of a separate
private_data pointer.  That way there's no need for the my_device
container.



No, drvdata belongs to driver that is bound to a device. We are
talking about private data created and managed by driver that provides
device.


 In this case -release must live in a subsystem code; individual
 drivers kfree(my_dev-private) and do any additional cleanup after
 calling device_unregister(my_dev-dev);

That doesn't sound right.  Generally the call to device_unregister() and
the release method live in the same module.  Maybe you meant to say
individual drivers kfree(my_dev-private_data) and do any additional
cleanup in their remove() routine.


Again, if we are talking about driver bound to a device then devices
-release() is irrelevant. If we are talking about driver that created
device then driver's -remove() is irrelevant.



This approach seems dangerous.  Suppose there's mutex embedded in
my_dev-private_data, and suppose some other thread is blocked waiting on
that mutex when remove() is called.  That other thread will then oops when
my_dev-private_data is deallocated.


What other thread? I suppose it is module-local thread or
subsystem-local thread. Let's that particular subsystem take care of
it's own data and use its own -release() when it is ready. What I
mean is there is no need to perform clean-up at once; every layer can
do its own cleanup.



 Second option:

 struct my_device {
 type member1;
 type member2;

struct device *dev;
 };

 dev is coming from _device_create(). Driver core takes care of
 releasing dev structure; driver does cleanup of my_device.

Lots of drivers create devices dynamically without using device_create().

More to the point, how does the driver clean up my_device?  It probably
has a reference count somewhere in my_device, especially if my_device is
shared with other threads or other drivers.  We then face exactly the same
problem: What happens if the driver's module is unloaded before all the
references to my_device are gone?



This is up to subsystem to ensure that it does not access dead devices.


 With current sysfs orphaning attributes upon removal request there is
 no issue of accessing driver-private data through references obtained
 via ether embedded or referenced dev structure so everything is fine.

Not so.  There are other pathways besides sysfs which can cause a driver
to access its data structures.



Which ones? These needs to be identified and treated with immediate
disconnect that you advocated earlier. Once active users of device's
services are gone you can zap it.

--
Dmitry
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH RFD] alternative kobject release wait mechanism

2007-04-19 Thread Alan Stern
On Thu, 19 Apr 2007, Dmitry Torokhov wrote:

 On 4/19/07, Alan Stern [EMAIL PROTECTED] wrote:
  On Wed, 18 Apr 2007, Dmitry Torokhov wrote:
 
   I am still do not understand why this is needed. Would it not be
   simplier just to use a reference to struct device instead of embedding
   it in a larger structure if their lifetimes are different and one does
   not have a subsystem that takes care of releasing logic.
  
  
   Pretty much drivers have 2 options:
  
   struct my_device {
   void *private_data;
   struct device dev;
   };
 
  Actually people use dev_[gs]et_drvdata() instead of a separate
  private_data pointer.  That way there's no need for the my_device
  container.
 
 
 No, drvdata belongs to driver that is bound to a device. We are
 talking about private data created and managed by driver that provides
 device.

Oh, sorry, I misunderstood.

 Again, if we are talking about driver bound to a device then devices
 -release() is irrelevant. If we are talking about driver that created
 device then driver's -remove() is irrelevant.

The problem is that device structures carry references to their parents.  
So even if a driver does use this option and is able to give up its own 
references immediately, it can't make any guarantees about drivers of 
child devices.  If every driver followed your scheme we'd be okay; until 
then any ancestor of a device with a non-immediate-detach driver would not 
be able to do immediate detach.

There's also another problem.  Even though all the references to
my_dev-dev may be gone, there will probably still be outstanding
references to private_data structure.  The driver's exit() routine will 
have to wait until all those references are gone.  How can it do that 
safely?  Presumably the private_data will include its own refcount and 
some release routine will run when the count drops to 0.  That routine 
will be part of the driver's module -- so how can it enable itself to be 
unloaded?


   With current sysfs orphaning attributes upon removal request there is
   no issue of accessing driver-private data through references obtained
   via ether embedded or referenced dev structure so everything is fine.
 
  Not so.  There are other pathways besides sysfs which can cause a driver
  to access its data structures.
 
 
 Which ones? These needs to be identified and treated with immediate
 disconnect that you advocated earlier. Once active users of device's
 services are gone you can zap it.

Among the worst offenders are character devices.  None of the subsystem
providers offering char device registration performs immediate detach --
they are a lot like sysfs used to be.  (In fact, they probably _can't_
provide it since read() or write() calls can block indefinitely in the
lower-level driver; the subsystem may have no way to force them to
fail-fast.)  So every lower-level driver which registers a char device
needs to provide its own synchronization and immediate detach, and at the
moment I doubt very many of them do.  And of those which do, a large
percentage rely on the BKL for synchronization!

Here's another awkward possibility.  Suppose a workqueue routine tries to 
unregister a device.  Suppose the device's driver needs to do something in 
a process context, so it has a work item pending in the same workqueue.  
The driver's remove() method would need to flush the workqueue (in order 
to perform immediate detach), but doing so would cause a deadlock.

Alan Stern

-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH RFD] alternative kobject release wait mechanism

2007-04-19 Thread Tejun Heo
Hello, Alan.

Alan Stern wrote:
 This doesn't solve a related problem: a subsystem wants to register
 devices and to provide a set of mutually-exclusive services to the
 devices' drivers.  The mutual exclusion has to be provided by a mutex or
 something similar, and the drivers need a way to unbind even while waiting
 to acquire the mutex.

I don't really follow why the drivers need a way to unbind even while
waiting to acquire the mutex.  Care to enlighten me?

 The obvious answer is to introduce a different sort of synchronization 
 primitive: a mutex (or semaphore or rwsem) which can be invalidated.
 
 The semantics would be straightforward.  When mutex_invalidate() is
 called, it marks the mutex so that all future lock attempts will fail with
 -ENODEV.  It also wakes up all threads that are blocked trying to lock the
 mutex and causes them to fail with the same error.  Once all that is done
 mutex_invalidate() returns.  In particular, it doesn't wait for the
 current lock to be released -- in fact, you would call it while holding
 the lock.
 
 This would solve a lot of your problems.  But it would also mean making 
 extensive changes to the kernel.  For one thing, mutex_lock() would return 
 int instead of void, and you would want to mark it __must_check.  Every 
 place where a mutex is locked, the code would have to be changed to add an 
 error pathway.  That's the sort of thing I was talking about when I said 
 it was going to be a tremendous job.

I think we both agree that's not a good idea.  :-)

 I thought of something else that could also be done: There should be a way
 to cancel an outstanding workqueue request.  At the moment all you can do
 is call flush_workqueue(), which will hang if you are already executing in
 a workqueue routine.  You should be able to delete a particular entry from
 the workqueue (or wait for it to complete if it has already started
 running).  This could be implemented right away.

It all depends on how a particular subsystem is shaped but having such
thing would definitely help.

 More problems with immediate detach -- it would have to apply to char
 devices.  When a char device is unregistered you can't force user
 processes to close their open file handles.  Instead something like your
 change to sysfs is needed -- wait for outstanding calls to complete and
 fail any future calls.  This means that registering a device will use up
 more than just a pointer in a table of minor device numbers.  Each entry
 would require at least an rwsem, and device I/O would be slowed down by
 the need to get a read-lock each time before entering the device driver.
 
 The same idea applies to block devices, although here the problems center 
 more around the block core and request queues.

Yeah, exactly.  My argument is that that impedance matching between
lifetime rules must happen at some place and it's better if we can do in
the higher layer where we can afford more effort and thus complexity.
We're currently pushing that down to each drivers and not too many are
getting it right.  I think it's just unrealistic to expect every and
each driver subsystems to get it right, so some overhead at higher layer
is acceptable and we can definitely afford much more optimization at
higher layer.

Thanks.

-- 
tejun
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH RFD] alternative kobject release wait mechanism

2007-04-18 Thread Alan Stern
On Thu, 19 Apr 2007, Tejun Heo wrote:

> More afterthoughts.  If a mutex is used to protect access against
> removal.  There is no reason to hold reference to it.
> 
> kernel_thread()
> {
>   /* wanna dereference my_obj */
>   mutex_lock();
>   verify my_obj is there and use it if so.
>   mutex_unlock();
> }
> 
> remove()
> {
>   mutex_lock();
>   kill_it();
>   mutex_unlock();
> }
> 
> I probably have over simplified it but using both mutex and reference
> counts doesn't make much sense.  IOW, you get an active reference when
> you grab the mutex excluding its removal and verified it's still there.
> 
> There probably are other reasons why things are done that way and we can
>  and probably will have to resort to mixed solutions in foreseeable
> future but I don't think there is any inherent problem in applying
> immediate-disconnect in the described situation.
> 
> Feel free to scream at me if I'm getting it totally wrong.  :-)

This doesn't solve a related problem: a subsystem wants to register
devices and to provide a set of mutually-exclusive services to the
devices' drivers.  The mutual exclusion has to be provided by a mutex or
something similar, and the drivers need a way to unbind even while waiting
to acquire the mutex.

The obvious answer is to introduce a different sort of synchronization 
primitive: a mutex (or semaphore or rwsem) which can be invalidated.

The semantics would be straightforward.  When mutex_invalidate() is
called, it marks the mutex so that all future lock attempts will fail with
-ENODEV.  It also wakes up all threads that are blocked trying to lock the
mutex and causes them to fail with the same error.  Once all that is done
mutex_invalidate() returns.  In particular, it doesn't wait for the
current lock to be released -- in fact, you would call it while holding
the lock.

This would solve a lot of your problems.  But it would also mean making 
extensive changes to the kernel.  For one thing, mutex_lock() would return 
int instead of void, and you would want to mark it __must_check.  Every 
place where a mutex is locked, the code would have to be changed to add an 
error pathway.  That's the sort of thing I was talking about when I said 
it was going to be a tremendous job.


I thought of something else that could also be done: There should be a way
to cancel an outstanding workqueue request.  At the moment all you can do
is call flush_workqueue(), which will hang if you are already executing in
a workqueue routine.  You should be able to delete a particular entry from
the workqueue (or wait for it to complete if it has already started
running).  This could be implemented right away.


More problems with immediate detach -- it would have to apply to char
devices.  When a char device is unregistered you can't force user
processes to close their open file handles.  Instead something like your
change to sysfs is needed -- wait for outstanding calls to complete and
fail any future calls.  This means that registering a device will use up
more than just a pointer in a table of minor device numbers.  Each entry
would require at least an rwsem, and device I/O would be slowed down by
the need to get a read-lock each time before entering the device driver.

The same idea applies to block devices, although here the problems center 
more around the block core and request queues.

Alan Stern

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH RFD] alternative kobject release wait mechanism

2007-04-18 Thread Dmitry Torokhov

On 4/18/07, Tejun Heo <[EMAIL PROTECTED]> wrote:

Hello,

Alan Stern wrote:
> On Wed, 18 Apr 2007, Tejun Heo wrote:
>
>> Hello, all.
>>
>> Agreed with the problem but I'm not very enthusiastic for adding
>> kobj->owner.  How about the following?  exit() routines will have to
>> do device_unregister_wait() instead of device_unregister().  On return
>> from it, it's guaranteed that all references to it are dropped and
>> ->release is finished.  The caller is responsible for avoiding
>> deadlock, of course.
>
> There's a problem with this approach.
>
> Many drivers, especially those for hot-pluggable buses, register and
> unregister devices dynamically.  These events can occur in time-critical
> situations, where the driver cannot afford to wait for all the references
> to be dropped when unregistering a device.  It's okay to wait in a module
> exit routine, but to make things work the routine would have to wait for
> references to _all_ unregistered objects to go away, not just the
> references for the objects it unregisters at exit time.
>
> So let's see what changes are needed to make the approach workable.  We
> will have to maintain a count of objects whose release methods haven't
> been called yet.  The count has to be incremented every time an object is
> unregistered (or registered, it doesn't matter which) and decremented
> _after_ the release method returns -- meaning somewhere in the driver
> core.  When the count goes to zero, the exit routine is then allowed to
> terminate.
>
> Hmmm, this is beginning to sound like a module-wide refcount which serves
> to block mod->exit().  In fact, it sounds almost identical to what
> Cornelia wrote, except that the refcount refers only to devices rather
> than arbitrary kobjects (and except that the blockage just before
> mod->exit returns instead of just after).  You can see where I'm
> leading...

The goal of immediate-disconnect is to remove such lingering reference
counts so that device_unregister() or driver detach puts the last
reference count.

You tell a higher layer that a device is going away, on return from the
function, that layer isn't gonna access the device anymore.  ie. On
return from the unregistration function, the upper layer shouldn't have
any reference to the device.  If you unregister from all layers a device
is registered to, you are left with only 1 reference which you put with
device_unregister().  After all are converted, reference count doesn't
mean much.  struct device isn't a reference counted object anymore.

I don't think this is gonna be too difficult to do.  I think I can
convert block layer and IDE/SCSI drivers without too much problem.
Dunno much about other layers tho.


I am still do not understand why this is needed. Would it not be
simplier just to use a reference to struct device instead of embedding
it in a larger structure if their lifetimes are different and one does
not have a subsystem that takes care of releasing logic.


Pretty much drivers have 2 options:

struct my_device {
   void *private_data;
   struct device dev;
};

In this case ->release must live in a subsystem code; individual
drivers kfree(my_dev->private) and do any additional cleanup after
calling device_unregister(_dev->dev);

Second option:

struct my_device {
   type member1;
   type member2;

  struct device *dev;
};

dev is coming from _device_create(). Driver core takes care of
releasing dev structure; driver does cleanup of my_device.

With current sysfs orphaning attributes upon removal request there is
no issue of accessing driver-private data through references obtained
via ether embedded or referenced dev structure so everything is fine.

--
Dmitry
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH RFD] alternative kobject release wait mechanism

2007-04-18 Thread Tejun Heo
Hello,

Alan Stern wrote:
> On Thu, 19 Apr 2007, Tejun Heo wrote:
> 
>> The goal of immediate-disconnect is to remove such lingering reference
>> counts so that device_unregister() or driver detach puts the last
>> reference count.
> 
> Yes, I understand.  If you had immediate-disconnect then you wouldn't need 
> device_unregister_wait().  In fact, you wouldn't need any reference counts 
> at all.  It would be guaranteed that when the unregister call returned, 
> all references would be gone.
> 
>> You tell a higher layer that a device is going away, on return from the
>> function, that layer isn't gonna access the device anymore.
> 
> No, no.  You tell somebody (it might be a higher layer, it might be a 
> lower layer, or it might be a same-height layer -- doesn't matter) that a 
> device is going away.

Yeap, right.  I higher, lower, same, whatever.  I was using the term as
drivers usually register to upper layers.

> On return from the function, that layer isn't going 
> to access the device any more, _nor_ will anyone else who has obtained a 
> reference from that layer.  This last clause is very important.

Agreed.  That layer is responsible for managing lingering objects and
telling its users that the device is a zombie now.

>> I don't think this is gonna be too difficult to do.  I think I can
>> convert block layer and IDE/SCSI drivers without too much problem.
>> Dunno much about other layers tho.
> 
> You have to convert more than layers (or core subsystems).  You also have 
> to audit and convert drivers.  It will be tremendously difficult to do.

I definitely can be mis-assessing the problem.  I'll first give a shot
at the block/SCSI layer.  How about that?

> You did misunderstand.  Here's what I was talking about:
> 
> Driver A:
> -
>   unregister_device(dev);
> 
>   /* inside the driver core */
>   down(>sem);
>   if (dev->driver)
>   dev->driver->remove(dev);
>   up(>sem);
>   device_put(dev);/* or device_put_wait */
> 
> 
> Driver B:
> -
> void remove(struct device *dev)
> {
>   struct my_device *mydev = dev_get_drvdata(dev);
> 
>   mydev->gone = 1;
>   kref_put(>kref, my_device_release);
> }
> 
> 
> Driver B's kernel thread:
> -
>   kref_get(>kref);
>   down(>dev.sem);
>   if (mydev->gone)
>   goto finished;
>   ...
>  finished:
>   up(>dev.sem);
>   kref_put(>kref, my_device_release);
> 
> Consider what happens if the kernel thread blocks on its down() while the
> remove() method is running.  It will be impossible for Driver B to
> eliminate the reference to dev held by mydev and by the down() routine.
> 
> In short, Driver B _can't_ provide an immediate detach.  Not unless 
> someone figures out a way to cancel a blocked down().  And do the same 
> thing for other blocking primitives.

Ah.. I see.  You're right in that driver B cannot wait for disconnect in
its remove routine in the above code but using a separate mutex to
protect ->gone should do the trick, so I don't think the above case is a
big problem.  It's a pretty specific case which is easy to spot and update.

Thanks.

-- 
tejun
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH RFD] alternative kobject release wait mechanism

2007-04-18 Thread Alan Stern
On Thu, 19 Apr 2007, Tejun Heo wrote:

> The goal of immediate-disconnect is to remove such lingering reference
> counts so that device_unregister() or driver detach puts the last
> reference count.

Yes, I understand.  If you had immediate-disconnect then you wouldn't need 
device_unregister_wait().  In fact, you wouldn't need any reference counts 
at all.  It would be guaranteed that when the unregister call returned, 
all references would be gone.

> You tell a higher layer that a device is going away, on return from the
> function, that layer isn't gonna access the device anymore.

No, no.  You tell somebody (it might be a higher layer, it might be a 
lower layer, or it might be a same-height layer -- doesn't matter) that a 
device is going away.  On return from the function, that layer isn't going 
to access the device any more, _nor_ will anyone else who has obtained a 
reference from that layer.  This last clause is very important.

> I don't think this is gonna be too difficult to do.  I think I can
> convert block layer and IDE/SCSI drivers without too much problem.
> Dunno much about other layers tho.

You have to convert more than layers (or core subsystems).  You also have 
to audit and convert drivers.  It will be tremendously difficult to do.

> Dunno if I understood the problem right but can't we do the following?
> 
> remove()
> {
>   acquire sem;
>   device_del();
>   release sem;
>   device_put_wait();
> }

You did misunderstand.  Here's what I was talking about:

Driver A:
-
unregister_device(dev);

/* inside the driver core */
down(>sem);
if (dev->driver)
dev->driver->remove(dev);
up(>sem);
device_put(dev);/* or device_put_wait */


Driver B:
-
void remove(struct device *dev)
{
struct my_device *mydev = dev_get_drvdata(dev);

mydev->gone = 1;
kref_put(>kref, my_device_release);
}


Driver B's kernel thread:
-
kref_get(>kref);
down(>dev.sem);
if (mydev->gone)
goto finished;
...
 finished:
up(>dev.sem);
kref_put(>kref, my_device_release);

Consider what happens if the kernel thread blocks on its down() while the
remove() method is running.  It will be impossible for Driver B to
eliminate the reference to dev held by mydev and by the down() routine.

In short, Driver B _can't_ provide an immediate detach.  Not unless 
someone figures out a way to cancel a blocked down().  And do the same 
thing for other blocking primitives.

Alan Stern

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH RFD] alternative kobject release wait mechanism

2007-04-18 Thread Tejun Heo
Cornelia Huck wrote:
> On Wed, 18 Apr 2007 03:41:10 +0900,
> Tejun Heo <[EMAIL PROTECTED]> wrote:
> 
> 
> 
> OK, I hit a bit on the code. Once I saved a reference to the completion
> in kobject_cleanup, it seemed to survive a load/unload testloop for a
> module registering a device. However, I still dislike this "list of
> waiters" approach, it looks too complex...

Heh heh, I'm amazed it actually works.  Agreed that the list walking
isn't pretty but adding completion to each kobject still feels like too
much of overhead just for release waiting.  Any better ideas?

Thanks.

-- 
tejun
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH RFD] alternative kobject release wait mechanism

2007-04-18 Thread Tejun Heo
Tejun Heo wrote:
>> Incidentally, Tejun, I'm all in favor of a immediate-detach driver model 
>> approach.  Unfortunately it's impossible to realize fully, although we 
>> could come much closer than we are now.
>>
>> Here's an example where immediate-detach cannot be implemented.  A driver 
>> binds to a device and uses that device is a kernel thread.  The thread 
>> carries out certain operations which require it to hold the device 
>> semaphore (because, for example, they need to be mutually exclusive with 
>> unbind).
>>
>> The driver's remove() method is called with the semaphore held.  If the
>> thread tries to lock the semaphore at the same time and blocks, there is
>> no way at all for the remove() method to force the thread to drop its
>> reference.
>>
>> This isn't merely a theoretical example.  The USB hub driver works in
>> exactly this way.
> 
> Dunno if I understood the problem right but can't we do the following?
> 
> remove()
> {
>   acquire sem;
>   device_del();
>   release sem;
>   device_put_wait();
> }

More afterthoughts.  If a mutex is used to protect access against
removal.  There is no reason to hold reference to it.

kernel_thread()
{
/* wanna dereference my_obj */
mutex_lock();
verify my_obj is there and use it if so.
mutex_unlock();
}

remove()
{
mutex_lock();
kill_it();
mutex_unlock();
}

I probably have over simplified it but using both mutex and reference
counts doesn't make much sense.  IOW, you get an active reference when
you grab the mutex excluding its removal and verified it's still there.

There probably are other reasons why things are done that way and we can
 and probably will have to resort to mixed solutions in foreseeable
future but I don't think there is any inherent problem in applying
immediate-disconnect in the described situation.

Feel free to scream at me if I'm getting it totally wrong.  :-)

Thanks.

-- 
tejun
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH RFD] alternative kobject release wait mechanism

2007-04-18 Thread Tejun Heo
Hello,

Alan Stern wrote:
> On Wed, 18 Apr 2007, Tejun Heo wrote:
> 
>> Hello, all.
>>
>> Agreed with the problem but I'm not very enthusiastic for adding
>> kobj->owner.  How about the following?  exit() routines will have to
>> do device_unregister_wait() instead of device_unregister().  On return
>> from it, it's guaranteed that all references to it are dropped and
>> ->release is finished.  The caller is responsible for avoiding
>> deadlock, of course.
> 
> There's a problem with this approach.
> 
> Many drivers, especially those for hot-pluggable buses, register and
> unregister devices dynamically.  These events can occur in time-critical
> situations, where the driver cannot afford to wait for all the references
> to be dropped when unregistering a device.  It's okay to wait in a module
> exit routine, but to make things work the routine would have to wait for
> references to _all_ unregistered objects to go away, not just the
> references for the objects it unregisters at exit time.
> 
> So let's see what changes are needed to make the approach workable.  We
> will have to maintain a count of objects whose release methods haven't
> been called yet.  The count has to be incremented every time an object is
> unregistered (or registered, it doesn't matter which) and decremented
> _after_ the release method returns -- meaning somewhere in the driver
> core.  When the count goes to zero, the exit routine is then allowed to
> terminate.
> 
> Hmmm, this is beginning to sound like a module-wide refcount which serves
> to block mod->exit().  In fact, it sounds almost identical to what
> Cornelia wrote, except that the refcount refers only to devices rather
> than arbitrary kobjects (and except that the blockage just before
> mod->exit returns instead of just after).  You can see where I'm
> leading...

The goal of immediate-disconnect is to remove such lingering reference
counts so that device_unregister() or driver detach puts the last
reference count.

You tell a higher layer that a device is going away, on return from the
function, that layer isn't gonna access the device anymore.  ie. On
return from the unregistration function, the upper layer shouldn't have
any reference to the device.  If you unregister from all layers a device
is registered to, you are left with only 1 reference which you put with
device_unregister().  After all are converted, reference count doesn't
mean much.  struct device isn't a reference counted object anymore.

I don't think this is gonna be too difficult to do.  I think I can
convert block layer and IDE/SCSI drivers without too much problem.
Dunno much about other layers tho.

> Incidentally, Tejun, I'm all in favor of a immediate-detach driver model 
> approach.  Unfortunately it's impossible to realize fully, although we 
> could come much closer than we are now.
> 
> Here's an example where immediate-detach cannot be implemented.  A driver 
> binds to a device and uses that device is a kernel thread.  The thread 
> carries out certain operations which require it to hold the device 
> semaphore (because, for example, they need to be mutually exclusive with 
> unbind).
> 
> The driver's remove() method is called with the semaphore held.  If the
> thread tries to lock the semaphore at the same time and blocks, there is
> no way at all for the remove() method to force the thread to drop its
> reference.
> 
> This isn't merely a theoretical example.  The USB hub driver works in
> exactly this way.

Dunno if I understood the problem right but can't we do the following?

remove()
{
acquire sem;
device_del();
release sem;
device_put_wait();
}

-- 
tejun
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH RFD] alternative kobject release wait mechanism

2007-04-18 Thread Cornelia Huck
On Wed, 18 Apr 2007 10:53:55 -0400 (EDT),
Alan Stern <[EMAIL PROTECTED]> wrote:

> Many drivers, especially those for hot-pluggable buses, register and
> unregister devices dynamically.  These events can occur in time-critical
> situations, where the driver cannot afford to wait for all the references
> to be dropped when unregistering a device.  It's okay to wait in a module
> exit routine, but to make things work the routine would have to wait for
> references to _all_ unregistered objects to go away, not just the
> references for the objects it unregisters at exit time.

Uh, yes, didn't think of that. I'd even say that's the majority of
devices allocated in a driver.

> BTW, Cornelia, there's a small problem with your patch set.  You have
> kobject_init() try to grab a reference to kobj->owner, but it's quite
> possible for kobj->owner to contain uninitialized garbage since the rest
> of the kernel is still unaware of its existence.  There's a patch below to
> fix things up.  I've got a second patch which makes device_initialize()  
> pass an owner to the embedded kobject, but I want to try it out a little
> before posting it.  :-)

I was relying on the object being zeroed after allocation (and the
caller of kobject_initialize already setting ->owner), but passing an
owner via the initializing routine sounds saner. Thanks for looking at
that.


> Index: usb-2.6/include/linux/kobject.h
> ===
> --- usb-2.6.orig/include/linux/kobject.h
> +++ usb-2.6/include/linux/kobject.h
> @@ -71,7 +71,8 @@ static inline const char * kobject_name(
>   return kobj->k_name;
>  }
> 
> -extern void kobject_init(struct kobject *);
> +extern void kobject_init_owner(struct kobject *, struct module *owner);
> +#define kobject_init(kobj)   kobject_init_owner(kobj, THIS_MODULE)
>  extern void kobject_cleanup(struct kobject *);
> 
>  extern int __must_check kobject_add(struct kobject *);
> @@ -84,7 +85,9 @@ extern int __must_check kobject_shadow_r
>   const char *new_name);
>  extern int __must_check kobject_move(struct kobject *, struct kobject *);
> 
> -extern int __must_check kobject_register(struct kobject *);
> +extern int __must_check kobject_register_owner(struct kobject *,
> + struct module *owner);
> +#define kobject_register(kobj)   kobject_register_owner(kobj, 
> THIS_MODULE)
>  extern void kobject_unregister(struct kobject *);
> 
>  extern struct kobject * kobject_get(struct kobject *);
> Index: usb-2.6/lib/kobject.c
> ===
> --- usb-2.6.orig/lib/kobject.c
> +++ usb-2.6/lib/kobject.c
> @@ -164,10 +164,11 @@ static void verify_dynamic_kobject_alloc
>  #endif
> 
>  /**
> - *   kobject_init - initialize object.
> + *   kobject_init_onwer - initialize object.
>   *   @kobj:  object in question.
> + *   @owner: module owning @kobj.
>   */
> -void kobject_init(struct kobject * kobj)
> +void kobject_init_owner(struct kobject * kobj, struct module *owner)
>  {
>   if (!kobj)
>   return;
> @@ -178,6 +179,7 @@ void kobject_init(struct kobject * kobj)
>   init_waitqueue_head(>poll);
>   kobj->kset = kset_get(kobj->kset);
>   /* Attempt to grab reference of owning module's kobject. */
> + kobj->owner = owner;
>   mod_kobject_get(kobj->owner);
>  }
> 
> @@ -272,15 +274,16 @@ int kobject_add(struct kobject * kobj)
> 
> 
>  /**
> - *   kobject_register - initialize and add an object.
> + *   kobject_register_owner - initialize and add an object.
>   *   @kobj:  object in question.
> + *   @owner: module owning @kobj.
>   */
> 
> -int kobject_register(struct kobject * kobj)
> +int kobject_register_owner(struct kobject * kobj, struct module *owner)
>  {
>   int error = -EINVAL;
>   if (kobj) {
> - kobject_init(kobj);
> + kobject_init_owner(kobj, owner);
>   error = kobject_add(kobj);
>   if (!error)
>   kobject_uevent(kobj, KOBJ_ADD);
> @@ -750,8 +753,8 @@ void subsys_remove_file(struct subsystem
>  }
>  #endif  /*  0  */
> 
> -EXPORT_SYMBOL(kobject_init);
> -EXPORT_SYMBOL(kobject_register);
> +EXPORT_SYMBOL(kobject_init_owner);
> +EXPORT_SYMBOL(kobject_register_owner);
>  EXPORT_SYMBOL(kobject_unregister);
>  EXPORT_SYMBOL(kobject_get);
>  EXPORT_SYMBOL(kobject_put);
> 

Hm, this means we always have an owner (until we explicitely call
kobject_{init,register}_owner(kobj, NULL)) - which may be unneeded in
some cases. But better to err on the safe side, I suppose.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH RFD] alternative kobject release wait mechanism

2007-04-18 Thread Cornelia Huck
On Wed, 18 Apr 2007 03:41:10 +0900,
Tejun Heo <[EMAIL PROTECTED]> wrote:



OK, I hit a bit on the code. Once I saved a reference to the completion
in kobject_cleanup, it seemed to survive a load/unload testloop for a
module registering a device. However, I still dislike this "list of
waiters" approach, it looks too complex...

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH RFD] alternative kobject release wait mechanism

2007-04-18 Thread Alan Stern
On Wed, 18 Apr 2007, Tejun Heo wrote:

> Hello, all.
> 
> Agreed with the problem but I'm not very enthusiastic for adding
> kobj->owner.  How about the following?  exit() routines will have to
> do device_unregister_wait() instead of device_unregister().  On return
> from it, it's guaranteed that all references to it are dropped and
> ->release is finished.  The caller is responsible for avoiding
> deadlock, of course.

There's a problem with this approach.

Many drivers, especially those for hot-pluggable buses, register and
unregister devices dynamically.  These events can occur in time-critical
situations, where the driver cannot afford to wait for all the references
to be dropped when unregistering a device.  It's okay to wait in a module
exit routine, but to make things work the routine would have to wait for
references to _all_ unregistered objects to go away, not just the
references for the objects it unregisters at exit time.

So let's see what changes are needed to make the approach workable.  We
will have to maintain a count of objects whose release methods haven't
been called yet.  The count has to be incremented every time an object is
unregistered (or registered, it doesn't matter which) and decremented
_after_ the release method returns -- meaning somewhere in the driver
core.  When the count goes to zero, the exit routine is then allowed to
terminate.

Hmmm, this is beginning to sound like a module-wide refcount which serves
to block mod->exit().  In fact, it sounds almost identical to what
Cornelia wrote, except that the refcount refers only to devices rather
than arbitrary kobjects (and except that the blockage just before
mod->exit returns instead of just after).  You can see where I'm
leading...

BTW, Cornelia, there's a small problem with your patch set.  You have
kobject_init() try to grab a reference to kobj->owner, but it's quite
possible for kobj->owner to contain uninitialized garbage since the rest
of the kernel is still unaware of its existence.  There's a patch below to
fix things up.  I've got a second patch which makes device_initialize()  
pass an owner to the embedded kobject, but I want to try it out a little
before posting it.  :-)


Incidentally, Tejun, I'm all in favor of a immediate-detach driver model 
approach.  Unfortunately it's impossible to realize fully, although we 
could come much closer than we are now.

Here's an example where immediate-detach cannot be implemented.  A driver 
binds to a device and uses that device is a kernel thread.  The thread 
carries out certain operations which require it to hold the device 
semaphore (because, for example, they need to be mutually exclusive with 
unbind).

The driver's remove() method is called with the semaphore held.  If the
thread tries to lock the semaphore at the same time and blocks, there is
no way at all for the remove() method to force the thread to drop its
reference.

This isn't merely a theoretical example.  The USB hub driver works in
exactly this way.

Alan Stern



Index: usb-2.6/include/linux/kobject.h
===
--- usb-2.6.orig/include/linux/kobject.h
+++ usb-2.6/include/linux/kobject.h
@@ -71,7 +71,8 @@ static inline const char * kobject_name(
return kobj->k_name;
 }
 
-extern void kobject_init(struct kobject *);
+extern void kobject_init_owner(struct kobject *, struct module *owner);
+#define kobject_init(kobj) kobject_init_owner(kobj, THIS_MODULE)
 extern void kobject_cleanup(struct kobject *);
 
 extern int __must_check kobject_add(struct kobject *);
@@ -84,7 +85,9 @@ extern int __must_check kobject_shadow_r
const char *new_name);
 extern int __must_check kobject_move(struct kobject *, struct kobject *);
 
-extern int __must_check kobject_register(struct kobject *);
+extern int __must_check kobject_register_owner(struct kobject *,
+   struct module *owner);
+#define kobject_register(kobj) kobject_register_owner(kobj, THIS_MODULE)
 extern void kobject_unregister(struct kobject *);
 
 extern struct kobject * kobject_get(struct kobject *);
Index: usb-2.6/lib/kobject.c
===
--- usb-2.6.orig/lib/kobject.c
+++ usb-2.6/lib/kobject.c
@@ -164,10 +164,11 @@ static void verify_dynamic_kobject_alloc
 #endif
 
 /**
- * kobject_init - initialize object.
+ * kobject_init_onwer - initialize object.
  * @kobj:  object in question.
+ * @owner: module owning @kobj.
  */
-void kobject_init(struct kobject * kobj)
+void kobject_init_owner(struct kobject * kobj, struct module *owner)
 {
if (!kobj)
return;
@@ -178,6 +179,7 @@ void kobject_init(struct kobject * kobj)
init_waitqueue_head(>poll);
kobj->kset = kset_get(kobj->kset);
/* Attempt to grab reference of owning module's kobject. */
+   kobj->owner = owner;
mod_kobject_get(kobj->owner);
 }
 
@@ -272,15 

Re: [PATCH RFD] alternative kobject release wait mechanism

2007-04-18 Thread Tejun Heo

Cornelia Huck wrote:

On Wed, 18 Apr 2007 17:46:09 +0900,
Tejun Heo <[EMAIL PROTECTED]> wrote:


It's debatable but I think things will be safer this way.  If we wait by
default, we are forced to check that all references are dropped and will
have a stack dump indicating which object is causing problem when
something goes wrong, which is better than silent object leaking and/or
jumping to non-existent address way later.


I agree that oopsing is bad. However, lingering references are not
always coding errors. What if it will just take long for a reference to
be given up? You'd have a hanging device_unregister(), with no
particular gain.


It's more like future plan than immediately applicable.  I think most 
high-level driver related interfaces can be converted as sysfs was 
converted such that they disconnect immediately from the device - 
resolving conflicts between higher layer using reference counts and 
device driver layer which expects immediate disconnect is responsibility 
of those interfaces - just as sysfs does it.


If you have lingering reference to struct device after driver is 
detached, you're already screwed.  If there's outstanding reference to 
it from the previous driver, how are you gonna load the next one? 
You're gonna have to wait somewhere for all the references to go away. 
Actually, your patch series is doing exactly this during module 
unloading.  Problem is that you'll need to do the same thing before 
attaching the next driver for the same device.


Immediate-disconnect from all higher interface for device drivers is my 
goal for driver model as I wrote in the RFD about lifetime rules.  I 
think it's doable and should result in easier model to get right, but I 
might be missing something big time, so please point out if you can spot 
holes or don't agree.



I personally think all driver interface should be made this way such
that completion of unregister function guarantees no further access to
the object or module.  IMHO, it's more intuitive and easier to force
correctness.


If we really did this, we should also provide a non-waiting alternative.


For transitional purpose, sure.  In the long term, I think it's better 
if we can do without it.


Thanks.

--
tejun
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH RFD] alternative kobject release wait mechanism

2007-04-18 Thread Cornelia Huck
On Wed, 18 Apr 2007 17:46:09 +0900,
Tejun Heo <[EMAIL PROTECTED]> wrote:

> It's debatable but I think things will be safer this way.  If we wait by
> default, we are forced to check that all references are dropped and will
> have a stack dump indicating which object is causing problem when
> something goes wrong, which is better than silent object leaking and/or
> jumping to non-existent address way later.

I agree that oopsing is bad. However, lingering references are not
always coding errors. What if it will just take long for a reference to
be given up? You'd have a hanging device_unregister(), with no
particular gain.

> 
> I personally think all driver interface should be made this way such
> that completion of unregister function guarantees no further access to
> the object or module.  IMHO, it's more intuitive and easier to force
> correctness.

If we really did this, we should also provide a non-waiting alternative.

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH RFD] alternative kobject release wait mechanism

2007-04-18 Thread Tejun Heo
Cornelia Huck wrote:
> On Wed, 18 Apr 2007 03:49:01 +0900,
> Tejun Heo <[EMAIL PROTECTED]> wrote:
> 
>> Oh, one more thing, with proper code audit, we can just make
>> device_unregister() do the waiting instead of adding separate
>> device_unregister_wait().  I think that will be a good step toward
>> immediate-detach driver model.
> 
> Do we really want that? If the release function doesn't sit in the
> module, or if the device_unregister() is done for other reasons in a
> non-module, we don't care about lingering references. There's no need
> to wait (and possibly lock up) there.

It's debatable but I think things will be safer this way.  If we wait by
default, we are forced to check that all references are dropped and will
have a stack dump indicating which object is causing problem when
something goes wrong, which is better than silent object leaking and/or
jumping to non-existent address way later.

I personally think all driver interface should be made this way such
that completion of unregister function guarantees no further access to
the object or module.  IMHO, it's more intuitive and easier to force
correctness.

I'm not sure about kobject_put() but device_unregister() waiting for
release doesn't sound bad to me.

-- 
tejun
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH RFD] alternative kobject release wait mechanism

2007-04-18 Thread Tejun Heo
Cornelia Huck wrote:
> On Wed, 18 Apr 2007 03:41:10 +0900,
> Tejun Heo <[EMAIL PROTECTED]> wrote:
> 
>> Hello, all.
>>
>> Agreed with the problem but I'm not very enthusiastic for adding
>> kobj->owner.  How about the following?  exit() routines will have to
>> do device_unregister_wait() instead of device_unregister().  On return
>> from it, it's guaranteed that all references to it are dropped and
>> ->release is finished.
> 
> Sounds interesting. Kind of like the completion approach, but with the
> dangerous bits outside the module.
> 
>> The caller is responsible for avoiding
>> deadlock, of course.
> 
> I think that wording is a bit too strong. Of course, if the device is
> unregistered in the exit routine, the module must make sure it gave up
> all references it itself obtained. However, it doesn't have any control
> about who obtained a reference during the object's lifetime. If an
> outsider holds on to a reference, we'll lock up until this reference
> has been given up (but I guess this is what we want).

Right, the better wording would be "the caller is responsible for not
causing deadlocks by dropping all references it owns on entry".

>> +void kobject_put_wait(struct kobject * kobj)
>> +{
>> +struct kobj_wait kwait;
>> +unsigned long flags;
>> +
>> +if (!kobj)
>> +return;
>> +
>> +BUG_ON(!list_empty(>entry));
>> +
>> +init_completion();
>> +kwait.kobj = kobj;
>> +
>> +spin_lock_irqsave(_wait_lock, flags);
>> +list_add(, _wait_list);
>> +spin_unlock_irqrestore(_wait_lock, flags);
>> +
>> +kobject_put(kobj);
>> +
>> +if (!wait_for_completion_timeout(, 30 * HZ)) {
>> +printk(KERN_WARNING "kobject_put_wait: kobject %p is still "
>> +   "alive after 30s, possible reference count bug\n", kobj);
>> +dump_stack();
>> +wait_for_completion();
>> +}
>> +}
>>
> 
> Couldn't this waiting be made simpler?
> - add completion to kobject in kobject_unregister_wait()
> - call kobject_put(), then wait_for_completion()
> - in kobject_cleanup(), save completion from kobject, call release
> function, then complete() on saved completion

I was trying to avoid adding a completion to all kobjects.

Thanks.

-- 
tejun
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH RFD] alternative kobject release wait mechanism

2007-04-18 Thread Cornelia Huck
On Wed, 18 Apr 2007 03:49:01 +0900,
Tejun Heo <[EMAIL PROTECTED]> wrote:

> Oh, one more thing, with proper code audit, we can just make
> device_unregister() do the waiting instead of adding separate
> device_unregister_wait().  I think that will be a good step toward
> immediate-detach driver model.

Do we really want that? If the release function doesn't sit in the
module, or if the device_unregister() is done for other reasons in a
non-module, we don't care about lingering references. There's no need
to wait (and possibly lock up) there.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH RFD] alternative kobject release wait mechanism

2007-04-18 Thread Cornelia Huck
On Wed, 18 Apr 2007 03:41:10 +0900,
Tejun Heo <[EMAIL PROTECTED]> wrote:

> Hello, all.
> 
> Agreed with the problem but I'm not very enthusiastic for adding
> kobj->owner.  How about the following?  exit() routines will have to
> do device_unregister_wait() instead of device_unregister().  On return
> from it, it's guaranteed that all references to it are dropped and
> ->release is finished.

Sounds interesting. Kind of like the completion approach, but with the
dangerous bits outside the module.

> The caller is responsible for avoiding
> deadlock, of course.

I think that wording is a bit too strong. Of course, if the device is
unregistered in the exit routine, the module must make sure it gave up
all references it itself obtained. However, it doesn't have any control
about who obtained a reference during the object's lifetime. If an
outsider holds on to a reference, we'll lock up until this reference
has been given up (but I guess this is what we want).

> 
> The code is only compile-tested and is more of proof-of-concept than
> working code.
> 
> DO NOT APPLY.



> diff --git a/lib/kobject.c b/lib/kobject.c
> index 057921c..22e5148 100644
> --- a/lib/kobject.c
> +++ b/lib/kobject.c
> @@ -16,6 +16,15 @@
>  #include 
>  #include 
> 
> +struct kobj_wait {
> + struct list_headlist;
> + struct kobject  *kobj;
> + struct completion   cmpl;
> +};
> +
> +static spinlock_t kobj_wait_lock = SPIN_LOCK_UNLOCKED;
> +static LIST_HEAD(kobj_wait_list);
> +
>  /**
>   *   populate_dir - populate directory with attributes.
>   *   @kobj:  object we're working on.
> @@ -425,6 +434,23 @@ void kobject_unregister(struct kobject * kobj)
>  }
> 
>  /**
> + *   kobject_unregister_wait - unregister, put and wait
> + *   @kobj:  object going away
> + *
> + *   Remove @kobj from hierarchy, decrement refcount and wait for
> + *   it to die.
> + */
> +void kobject_unregister_wait(struct kobject * kobj)
> +{
> + if (!kobj)
> + return;
> + pr_debug("kobject %s: unregistering\n",kobject_name(kobj));
> + kobject_uevent(kobj, KOBJ_REMOVE);
> + kobject_del(kobj);
> + kobject_put_wait(kobj);
> +}
> +
> +/**
>   *   kobject_get - increment refcount for object.
>   *   @kobj:  object.
>   */
> @@ -446,8 +472,22 @@ void kobject_cleanup(struct kobject * kobj)
>   struct kobj_type * t = get_ktype(kobj);
>   struct kset * s = kobj->kset;
>   struct kobject * parent = kobj->parent;
> + struct kobj_wait *kwait;
> + unsigned long flags;
> 
>   pr_debug("kobject %s: cleaning up\n",kobject_name(kobj));
> +
> + /* is somebody waiting for @kobj to die? */
> + spin_lock_irqsave(_wait_lock, flags);
> + list_for_each_entry(kwait, _wait_list, list) {
> + if (kwait->kobj == kobj) {
> + list_del_init(>list);
> + break;
> + }
> + }
> + spin_unlock_irqrestore(_wait_lock, flags);
> +
> + /* clean up */
>   if (kobj->k_name != kobj->name)
>   kfree(kobj->k_name);
>   kobj->k_name = NULL;
> @@ -456,6 +496,10 @@ void kobject_cleanup(struct kobject * kobj)
>   if (s)
>   kset_put(s);
>   kobject_put(parent);
> +
> + /* notify waiter */
> + if (kwait)
> + complete(>cmpl);
>  }
> 
>  static void kobject_release(struct kref *kref)
> @@ -475,6 +519,42 @@ void kobject_put(struct kobject * kobj)
>   kref_put(>kref, kobject_release);
>  }
> 
> +/**
> + *   kobject_put_wait - put and wait for all references to go away
> + *   @kobj:  object
> + *
> + *   This function is used by @kobj owner to kill it - it puts a
> + *   reference to @kobj and waits till all the references are gone
> + *   and ->release is finished.  @kobj must have been deleted and
> + *   the caller is responsible for guaranteeing that all references
> + *   will be dropped in foreseeable future.
> + */
> +void kobject_put_wait(struct kobject * kobj)
> +{
> + struct kobj_wait kwait;
> + unsigned long flags;
> +
> + if (!kobj)
> + return;
> +
> + BUG_ON(!list_empty(>entry));
> +
> + init_completion();
> + kwait.kobj = kobj;
> +
> + spin_lock_irqsave(_wait_lock, flags);
> + list_add(, _wait_list);
> + spin_unlock_irqrestore(_wait_lock, flags);
> +
> + kobject_put(kobj);
> +
> + if (!wait_for_completion_timeout(, 30 * HZ)) {
> + printk(KERN_WARNING "kobject_put_wait: kobject %p is still "
> +"alive after 30s, possible reference count bug\n", kobj);
> + dump_stack();
> + wait_for_completion();
> + }
> +}
> 

Couldn't this waiting be made simpler?
- add completion to kobject in kobject_unregister_wait()
- call kobject_put(), then wait_for_completion()
- in kobject_cleanup(), save completion from kobject, call release
function, then complete() on saved completion

I'll toy with this a bit.
-
To unsubscribe from this list: send the line "unsubscribe 

Re: [PATCH RFD] alternative kobject release wait mechanism

2007-04-18 Thread Cornelia Huck
On Wed, 18 Apr 2007 03:41:10 +0900,
Tejun Heo [EMAIL PROTECTED] wrote:

 Hello, all.
 
 Agreed with the problem but I'm not very enthusiastic for adding
 kobj-owner.  How about the following?  exit() routines will have to
 do device_unregister_wait() instead of device_unregister().  On return
 from it, it's guaranteed that all references to it are dropped and
 -release is finished.

Sounds interesting. Kind of like the completion approach, but with the
dangerous bits outside the module.

 The caller is responsible for avoiding
 deadlock, of course.

I think that wording is a bit too strong. Of course, if the device is
unregistered in the exit routine, the module must make sure it gave up
all references it itself obtained. However, it doesn't have any control
about who obtained a reference during the object's lifetime. If an
outsider holds on to a reference, we'll lock up until this reference
has been given up (but I guess this is what we want).

 
 The code is only compile-tested and is more of proof-of-concept than
 working code.
 
 DO NOT APPLY.

snip

 diff --git a/lib/kobject.c b/lib/kobject.c
 index 057921c..22e5148 100644
 --- a/lib/kobject.c
 +++ b/lib/kobject.c
 @@ -16,6 +16,15 @@
  #include linux/stat.h
  #include linux/slab.h
 
 +struct kobj_wait {
 + struct list_headlist;
 + struct kobject  *kobj;
 + struct completion   cmpl;
 +};
 +
 +static spinlock_t kobj_wait_lock = SPIN_LOCK_UNLOCKED;
 +static LIST_HEAD(kobj_wait_list);
 +
  /**
   *   populate_dir - populate directory with attributes.
   *   @kobj:  object we're working on.
 @@ -425,6 +434,23 @@ void kobject_unregister(struct kobject * kobj)
  }
 
  /**
 + *   kobject_unregister_wait - unregister, put and wait
 + *   @kobj:  object going away
 + *
 + *   Remove @kobj from hierarchy, decrement refcount and wait for
 + *   it to die.
 + */
 +void kobject_unregister_wait(struct kobject * kobj)
 +{
 + if (!kobj)
 + return;
 + pr_debug(kobject %s: unregistering\n,kobject_name(kobj));
 + kobject_uevent(kobj, KOBJ_REMOVE);
 + kobject_del(kobj);
 + kobject_put_wait(kobj);
 +}
 +
 +/**
   *   kobject_get - increment refcount for object.
   *   @kobj:  object.
   */
 @@ -446,8 +472,22 @@ void kobject_cleanup(struct kobject * kobj)
   struct kobj_type * t = get_ktype(kobj);
   struct kset * s = kobj-kset;
   struct kobject * parent = kobj-parent;
 + struct kobj_wait *kwait;
 + unsigned long flags;
 
   pr_debug(kobject %s: cleaning up\n,kobject_name(kobj));
 +
 + /* is somebody waiting for @kobj to die? */
 + spin_lock_irqsave(kobj_wait_lock, flags);
 + list_for_each_entry(kwait, kobj_wait_list, list) {
 + if (kwait-kobj == kobj) {
 + list_del_init(kwait-list);
 + break;
 + }
 + }
 + spin_unlock_irqrestore(kobj_wait_lock, flags);
 +
 + /* clean up */
   if (kobj-k_name != kobj-name)
   kfree(kobj-k_name);
   kobj-k_name = NULL;
 @@ -456,6 +496,10 @@ void kobject_cleanup(struct kobject * kobj)
   if (s)
   kset_put(s);
   kobject_put(parent);
 +
 + /* notify waiter */
 + if (kwait)
 + complete(kwait-cmpl);
  }
 
  static void kobject_release(struct kref *kref)
 @@ -475,6 +519,42 @@ void kobject_put(struct kobject * kobj)
   kref_put(kobj-kref, kobject_release);
  }
 
 +/**
 + *   kobject_put_wait - put and wait for all references to go away
 + *   @kobj:  object
 + *
 + *   This function is used by @kobj owner to kill it - it puts a
 + *   reference to @kobj and waits till all the references are gone
 + *   and -release is finished.  @kobj must have been deleted and
 + *   the caller is responsible for guaranteeing that all references
 + *   will be dropped in foreseeable future.
 + */
 +void kobject_put_wait(struct kobject * kobj)
 +{
 + struct kobj_wait kwait;
 + unsigned long flags;
 +
 + if (!kobj)
 + return;
 +
 + BUG_ON(!list_empty(kobj-entry));
 +
 + init_completion(kwait.cmpl);
 + kwait.kobj = kobj;
 +
 + spin_lock_irqsave(kobj_wait_lock, flags);
 + list_add(kwait.list, kobj_wait_list);
 + spin_unlock_irqrestore(kobj_wait_lock, flags);
 +
 + kobject_put(kobj);
 +
 + if (!wait_for_completion_timeout(kwait.cmpl, 30 * HZ)) {
 + printk(KERN_WARNING kobject_put_wait: kobject %p is still 
 +alive after 30s, possible reference count bug\n, kobj);
 + dump_stack();
 + wait_for_completion(kwait.cmpl);
 + }
 +}
 

Couldn't this waiting be made simpler?
- add completion to kobject in kobject_unregister_wait()
- call kobject_put(), then wait_for_completion()
- in kobject_cleanup(), save completion from kobject, call release
function, then complete() on saved completion

I'll toy with this a bit.
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to 

Re: [PATCH RFD] alternative kobject release wait mechanism

2007-04-18 Thread Cornelia Huck
On Wed, 18 Apr 2007 03:49:01 +0900,
Tejun Heo [EMAIL PROTECTED] wrote:

 Oh, one more thing, with proper code audit, we can just make
 device_unregister() do the waiting instead of adding separate
 device_unregister_wait().  I think that will be a good step toward
 immediate-detach driver model.

Do we really want that? If the release function doesn't sit in the
module, or if the device_unregister() is done for other reasons in a
non-module, we don't care about lingering references. There's no need
to wait (and possibly lock up) there.
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH RFD] alternative kobject release wait mechanism

2007-04-18 Thread Tejun Heo
Cornelia Huck wrote:
 On Wed, 18 Apr 2007 03:41:10 +0900,
 Tejun Heo [EMAIL PROTECTED] wrote:
 
 Hello, all.

 Agreed with the problem but I'm not very enthusiastic for adding
 kobj-owner.  How about the following?  exit() routines will have to
 do device_unregister_wait() instead of device_unregister().  On return
 from it, it's guaranteed that all references to it are dropped and
 -release is finished.
 
 Sounds interesting. Kind of like the completion approach, but with the
 dangerous bits outside the module.
 
 The caller is responsible for avoiding
 deadlock, of course.
 
 I think that wording is a bit too strong. Of course, if the device is
 unregistered in the exit routine, the module must make sure it gave up
 all references it itself obtained. However, it doesn't have any control
 about who obtained a reference during the object's lifetime. If an
 outsider holds on to a reference, we'll lock up until this reference
 has been given up (but I guess this is what we want).

Right, the better wording would be the caller is responsible for not
causing deadlocks by dropping all references it owns on entry.

 +void kobject_put_wait(struct kobject * kobj)
 +{
 +struct kobj_wait kwait;
 +unsigned long flags;
 +
 +if (!kobj)
 +return;
 +
 +BUG_ON(!list_empty(kobj-entry));
 +
 +init_completion(kwait.cmpl);
 +kwait.kobj = kobj;
 +
 +spin_lock_irqsave(kobj_wait_lock, flags);
 +list_add(kwait.list, kobj_wait_list);
 +spin_unlock_irqrestore(kobj_wait_lock, flags);
 +
 +kobject_put(kobj);
 +
 +if (!wait_for_completion_timeout(kwait.cmpl, 30 * HZ)) {
 +printk(KERN_WARNING kobject_put_wait: kobject %p is still 
 +   alive after 30s, possible reference count bug\n, kobj);
 +dump_stack();
 +wait_for_completion(kwait.cmpl);
 +}
 +}

 
 Couldn't this waiting be made simpler?
 - add completion to kobject in kobject_unregister_wait()
 - call kobject_put(), then wait_for_completion()
 - in kobject_cleanup(), save completion from kobject, call release
 function, then complete() on saved completion

I was trying to avoid adding a completion to all kobjects.

Thanks.

-- 
tejun
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH RFD] alternative kobject release wait mechanism

2007-04-18 Thread Tejun Heo
Cornelia Huck wrote:
 On Wed, 18 Apr 2007 03:49:01 +0900,
 Tejun Heo [EMAIL PROTECTED] wrote:
 
 Oh, one more thing, with proper code audit, we can just make
 device_unregister() do the waiting instead of adding separate
 device_unregister_wait().  I think that will be a good step toward
 immediate-detach driver model.
 
 Do we really want that? If the release function doesn't sit in the
 module, or if the device_unregister() is done for other reasons in a
 non-module, we don't care about lingering references. There's no need
 to wait (and possibly lock up) there.

It's debatable but I think things will be safer this way.  If we wait by
default, we are forced to check that all references are dropped and will
have a stack dump indicating which object is causing problem when
something goes wrong, which is better than silent object leaking and/or
jumping to non-existent address way later.

I personally think all driver interface should be made this way such
that completion of unregister function guarantees no further access to
the object or module.  IMHO, it's more intuitive and easier to force
correctness.

I'm not sure about kobject_put() but device_unregister() waiting for
release doesn't sound bad to me.

-- 
tejun
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH RFD] alternative kobject release wait mechanism

2007-04-18 Thread Cornelia Huck
On Wed, 18 Apr 2007 17:46:09 +0900,
Tejun Heo [EMAIL PROTECTED] wrote:

 It's debatable but I think things will be safer this way.  If we wait by
 default, we are forced to check that all references are dropped and will
 have a stack dump indicating which object is causing problem when
 something goes wrong, which is better than silent object leaking and/or
 jumping to non-existent address way later.

I agree that oopsing is bad. However, lingering references are not
always coding errors. What if it will just take long for a reference to
be given up? You'd have a hanging device_unregister(), with no
particular gain.

 
 I personally think all driver interface should be made this way such
 that completion of unregister function guarantees no further access to
 the object or module.  IMHO, it's more intuitive and easier to force
 correctness.

If we really did this, we should also provide a non-waiting alternative.

-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH RFD] alternative kobject release wait mechanism

2007-04-18 Thread Tejun Heo

Cornelia Huck wrote:

On Wed, 18 Apr 2007 17:46:09 +0900,
Tejun Heo [EMAIL PROTECTED] wrote:


It's debatable but I think things will be safer this way.  If we wait by
default, we are forced to check that all references are dropped and will
have a stack dump indicating which object is causing problem when
something goes wrong, which is better than silent object leaking and/or
jumping to non-existent address way later.


I agree that oopsing is bad. However, lingering references are not
always coding errors. What if it will just take long for a reference to
be given up? You'd have a hanging device_unregister(), with no
particular gain.


It's more like future plan than immediately applicable.  I think most 
high-level driver related interfaces can be converted as sysfs was 
converted such that they disconnect immediately from the device - 
resolving conflicts between higher layer using reference counts and 
device driver layer which expects immediate disconnect is responsibility 
of those interfaces - just as sysfs does it.


If you have lingering reference to struct device after driver is 
detached, you're already screwed.  If there's outstanding reference to 
it from the previous driver, how are you gonna load the next one? 
You're gonna have to wait somewhere for all the references to go away. 
Actually, your patch series is doing exactly this during module 
unloading.  Problem is that you'll need to do the same thing before 
attaching the next driver for the same device.


Immediate-disconnect from all higher interface for device drivers is my 
goal for driver model as I wrote in the RFD about lifetime rules.  I 
think it's doable and should result in easier model to get right, but I 
might be missing something big time, so please point out if you can spot 
holes or don't agree.



I personally think all driver interface should be made this way such
that completion of unregister function guarantees no further access to
the object or module.  IMHO, it's more intuitive and easier to force
correctness.


If we really did this, we should also provide a non-waiting alternative.


For transitional purpose, sure.  In the long term, I think it's better 
if we can do without it.


Thanks.

--
tejun
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH RFD] alternative kobject release wait mechanism

2007-04-18 Thread Alan Stern
On Wed, 18 Apr 2007, Tejun Heo wrote:

 Hello, all.
 
 Agreed with the problem but I'm not very enthusiastic for adding
 kobj-owner.  How about the following?  exit() routines will have to
 do device_unregister_wait() instead of device_unregister().  On return
 from it, it's guaranteed that all references to it are dropped and
 -release is finished.  The caller is responsible for avoiding
 deadlock, of course.

There's a problem with this approach.

Many drivers, especially those for hot-pluggable buses, register and
unregister devices dynamically.  These events can occur in time-critical
situations, where the driver cannot afford to wait for all the references
to be dropped when unregistering a device.  It's okay to wait in a module
exit routine, but to make things work the routine would have to wait for
references to _all_ unregistered objects to go away, not just the
references for the objects it unregisters at exit time.

So let's see what changes are needed to make the approach workable.  We
will have to maintain a count of objects whose release methods haven't
been called yet.  The count has to be incremented every time an object is
unregistered (or registered, it doesn't matter which) and decremented
_after_ the release method returns -- meaning somewhere in the driver
core.  When the count goes to zero, the exit routine is then allowed to
terminate.

Hmmm, this is beginning to sound like a module-wide refcount which serves
to block mod-exit().  In fact, it sounds almost identical to what
Cornelia wrote, except that the refcount refers only to devices rather
than arbitrary kobjects (and except that the blockage just before
mod-exit returns instead of just after).  You can see where I'm
leading...

BTW, Cornelia, there's a small problem with your patch set.  You have
kobject_init() try to grab a reference to kobj-owner, but it's quite
possible for kobj-owner to contain uninitialized garbage since the rest
of the kernel is still unaware of its existence.  There's a patch below to
fix things up.  I've got a second patch which makes device_initialize()  
pass an owner to the embedded kobject, but I want to try it out a little
before posting it.  :-)


Incidentally, Tejun, I'm all in favor of a immediate-detach driver model 
approach.  Unfortunately it's impossible to realize fully, although we 
could come much closer than we are now.

Here's an example where immediate-detach cannot be implemented.  A driver 
binds to a device and uses that device is a kernel thread.  The thread 
carries out certain operations which require it to hold the device 
semaphore (because, for example, they need to be mutually exclusive with 
unbind).

The driver's remove() method is called with the semaphore held.  If the
thread tries to lock the semaphore at the same time and blocks, there is
no way at all for the remove() method to force the thread to drop its
reference.

This isn't merely a theoretical example.  The USB hub driver works in
exactly this way.

Alan Stern



Index: usb-2.6/include/linux/kobject.h
===
--- usb-2.6.orig/include/linux/kobject.h
+++ usb-2.6/include/linux/kobject.h
@@ -71,7 +71,8 @@ static inline const char * kobject_name(
return kobj-k_name;
 }
 
-extern void kobject_init(struct kobject *);
+extern void kobject_init_owner(struct kobject *, struct module *owner);
+#define kobject_init(kobj) kobject_init_owner(kobj, THIS_MODULE)
 extern void kobject_cleanup(struct kobject *);
 
 extern int __must_check kobject_add(struct kobject *);
@@ -84,7 +85,9 @@ extern int __must_check kobject_shadow_r
const char *new_name);
 extern int __must_check kobject_move(struct kobject *, struct kobject *);
 
-extern int __must_check kobject_register(struct kobject *);
+extern int __must_check kobject_register_owner(struct kobject *,
+   struct module *owner);
+#define kobject_register(kobj) kobject_register_owner(kobj, THIS_MODULE)
 extern void kobject_unregister(struct kobject *);
 
 extern struct kobject * kobject_get(struct kobject *);
Index: usb-2.6/lib/kobject.c
===
--- usb-2.6.orig/lib/kobject.c
+++ usb-2.6/lib/kobject.c
@@ -164,10 +164,11 @@ static void verify_dynamic_kobject_alloc
 #endif
 
 /**
- * kobject_init - initialize object.
+ * kobject_init_onwer - initialize object.
  * @kobj:  object in question.
+ * @owner: module owning @kobj.
  */
-void kobject_init(struct kobject * kobj)
+void kobject_init_owner(struct kobject * kobj, struct module *owner)
 {
if (!kobj)
return;
@@ -178,6 +179,7 @@ void kobject_init(struct kobject * kobj)
init_waitqueue_head(kobj-poll);
kobj-kset = kset_get(kobj-kset);
/* Attempt to grab reference of owning module's kobject. */
+   kobj-owner = owner;
mod_kobject_get(kobj-owner);
 }
 
@@ -272,15 +274,16 @@ int 

Re: [PATCH RFD] alternative kobject release wait mechanism

2007-04-18 Thread Cornelia Huck
On Wed, 18 Apr 2007 03:41:10 +0900,
Tejun Heo [EMAIL PROTECTED] wrote:

patch

OK, I hit a bit on the code. Once I saved a reference to the completion
in kobject_cleanup, it seemed to survive a load/unload testloop for a
module registering a device. However, I still dislike this list of
waiters approach, it looks too complex...

-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH RFD] alternative kobject release wait mechanism

2007-04-18 Thread Cornelia Huck
On Wed, 18 Apr 2007 10:53:55 -0400 (EDT),
Alan Stern [EMAIL PROTECTED] wrote:

 Many drivers, especially those for hot-pluggable buses, register and
 unregister devices dynamically.  These events can occur in time-critical
 situations, where the driver cannot afford to wait for all the references
 to be dropped when unregistering a device.  It's okay to wait in a module
 exit routine, but to make things work the routine would have to wait for
 references to _all_ unregistered objects to go away, not just the
 references for the objects it unregisters at exit time.

Uh, yes, didn't think of that. I'd even say that's the majority of
devices allocated in a driver.

 BTW, Cornelia, there's a small problem with your patch set.  You have
 kobject_init() try to grab a reference to kobj-owner, but it's quite
 possible for kobj-owner to contain uninitialized garbage since the rest
 of the kernel is still unaware of its existence.  There's a patch below to
 fix things up.  I've got a second patch which makes device_initialize()  
 pass an owner to the embedded kobject, but I want to try it out a little
 before posting it.  :-)

I was relying on the object being zeroed after allocation (and the
caller of kobject_initialize already setting -owner), but passing an
owner via the initializing routine sounds saner. Thanks for looking at
that.


 Index: usb-2.6/include/linux/kobject.h
 ===
 --- usb-2.6.orig/include/linux/kobject.h
 +++ usb-2.6/include/linux/kobject.h
 @@ -71,7 +71,8 @@ static inline const char * kobject_name(
   return kobj-k_name;
  }
 
 -extern void kobject_init(struct kobject *);
 +extern void kobject_init_owner(struct kobject *, struct module *owner);
 +#define kobject_init(kobj)   kobject_init_owner(kobj, THIS_MODULE)
  extern void kobject_cleanup(struct kobject *);
 
  extern int __must_check kobject_add(struct kobject *);
 @@ -84,7 +85,9 @@ extern int __must_check kobject_shadow_r
   const char *new_name);
  extern int __must_check kobject_move(struct kobject *, struct kobject *);
 
 -extern int __must_check kobject_register(struct kobject *);
 +extern int __must_check kobject_register_owner(struct kobject *,
 + struct module *owner);
 +#define kobject_register(kobj)   kobject_register_owner(kobj, 
 THIS_MODULE)
  extern void kobject_unregister(struct kobject *);
 
  extern struct kobject * kobject_get(struct kobject *);
 Index: usb-2.6/lib/kobject.c
 ===
 --- usb-2.6.orig/lib/kobject.c
 +++ usb-2.6/lib/kobject.c
 @@ -164,10 +164,11 @@ static void verify_dynamic_kobject_alloc
  #endif
 
  /**
 - *   kobject_init - initialize object.
 + *   kobject_init_onwer - initialize object.
   *   @kobj:  object in question.
 + *   @owner: module owning @kobj.
   */
 -void kobject_init(struct kobject * kobj)
 +void kobject_init_owner(struct kobject * kobj, struct module *owner)
  {
   if (!kobj)
   return;
 @@ -178,6 +179,7 @@ void kobject_init(struct kobject * kobj)
   init_waitqueue_head(kobj-poll);
   kobj-kset = kset_get(kobj-kset);
   /* Attempt to grab reference of owning module's kobject. */
 + kobj-owner = owner;
   mod_kobject_get(kobj-owner);
  }
 
 @@ -272,15 +274,16 @@ int kobject_add(struct kobject * kobj)
 
 
  /**
 - *   kobject_register - initialize and add an object.
 + *   kobject_register_owner - initialize and add an object.
   *   @kobj:  object in question.
 + *   @owner: module owning @kobj.
   */
 
 -int kobject_register(struct kobject * kobj)
 +int kobject_register_owner(struct kobject * kobj, struct module *owner)
  {
   int error = -EINVAL;
   if (kobj) {
 - kobject_init(kobj);
 + kobject_init_owner(kobj, owner);
   error = kobject_add(kobj);
   if (!error)
   kobject_uevent(kobj, KOBJ_ADD);
 @@ -750,8 +753,8 @@ void subsys_remove_file(struct subsystem
  }
  #endif  /*  0  */
 
 -EXPORT_SYMBOL(kobject_init);
 -EXPORT_SYMBOL(kobject_register);
 +EXPORT_SYMBOL(kobject_init_owner);
 +EXPORT_SYMBOL(kobject_register_owner);
  EXPORT_SYMBOL(kobject_unregister);
  EXPORT_SYMBOL(kobject_get);
  EXPORT_SYMBOL(kobject_put);
 

Hm, this means we always have an owner (until we explicitely call
kobject_{init,register}_owner(kobj, NULL)) - which may be unneeded in
some cases. But better to err on the safe side, I suppose.
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH RFD] alternative kobject release wait mechanism

2007-04-18 Thread Tejun Heo
Hello,

Alan Stern wrote:
 On Wed, 18 Apr 2007, Tejun Heo wrote:
 
 Hello, all.

 Agreed with the problem but I'm not very enthusiastic for adding
 kobj-owner.  How about the following?  exit() routines will have to
 do device_unregister_wait() instead of device_unregister().  On return
 from it, it's guaranteed that all references to it are dropped and
 -release is finished.  The caller is responsible for avoiding
 deadlock, of course.
 
 There's a problem with this approach.
 
 Many drivers, especially those for hot-pluggable buses, register and
 unregister devices dynamically.  These events can occur in time-critical
 situations, where the driver cannot afford to wait for all the references
 to be dropped when unregistering a device.  It's okay to wait in a module
 exit routine, but to make things work the routine would have to wait for
 references to _all_ unregistered objects to go away, not just the
 references for the objects it unregisters at exit time.
 
 So let's see what changes are needed to make the approach workable.  We
 will have to maintain a count of objects whose release methods haven't
 been called yet.  The count has to be incremented every time an object is
 unregistered (or registered, it doesn't matter which) and decremented
 _after_ the release method returns -- meaning somewhere in the driver
 core.  When the count goes to zero, the exit routine is then allowed to
 terminate.
 
 Hmmm, this is beginning to sound like a module-wide refcount which serves
 to block mod-exit().  In fact, it sounds almost identical to what
 Cornelia wrote, except that the refcount refers only to devices rather
 than arbitrary kobjects (and except that the blockage just before
 mod-exit returns instead of just after).  You can see where I'm
 leading...

The goal of immediate-disconnect is to remove such lingering reference
counts so that device_unregister() or driver detach puts the last
reference count.

You tell a higher layer that a device is going away, on return from the
function, that layer isn't gonna access the device anymore.  ie. On
return from the unregistration function, the upper layer shouldn't have
any reference to the device.  If you unregister from all layers a device
is registered to, you are left with only 1 reference which you put with
device_unregister().  After all are converted, reference count doesn't
mean much.  struct device isn't a reference counted object anymore.

I don't think this is gonna be too difficult to do.  I think I can
convert block layer and IDE/SCSI drivers without too much problem.
Dunno much about other layers tho.

 Incidentally, Tejun, I'm all in favor of a immediate-detach driver model 
 approach.  Unfortunately it's impossible to realize fully, although we 
 could come much closer than we are now.
 
 Here's an example where immediate-detach cannot be implemented.  A driver 
 binds to a device and uses that device is a kernel thread.  The thread 
 carries out certain operations which require it to hold the device 
 semaphore (because, for example, they need to be mutually exclusive with 
 unbind).
 
 The driver's remove() method is called with the semaphore held.  If the
 thread tries to lock the semaphore at the same time and blocks, there is
 no way at all for the remove() method to force the thread to drop its
 reference.
 
 This isn't merely a theoretical example.  The USB hub driver works in
 exactly this way.

Dunno if I understood the problem right but can't we do the following?

remove()
{
acquire sem;
device_del();
release sem;
device_put_wait();
}

-- 
tejun
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH RFD] alternative kobject release wait mechanism

2007-04-18 Thread Tejun Heo
Tejun Heo wrote:
 Incidentally, Tejun, I'm all in favor of a immediate-detach driver model 
 approach.  Unfortunately it's impossible to realize fully, although we 
 could come much closer than we are now.

 Here's an example where immediate-detach cannot be implemented.  A driver 
 binds to a device and uses that device is a kernel thread.  The thread 
 carries out certain operations which require it to hold the device 
 semaphore (because, for example, they need to be mutually exclusive with 
 unbind).

 The driver's remove() method is called with the semaphore held.  If the
 thread tries to lock the semaphore at the same time and blocks, there is
 no way at all for the remove() method to force the thread to drop its
 reference.

 This isn't merely a theoretical example.  The USB hub driver works in
 exactly this way.
 
 Dunno if I understood the problem right but can't we do the following?
 
 remove()
 {
   acquire sem;
   device_del();
   release sem;
   device_put_wait();
 }

More afterthoughts.  If a mutex is used to protect access against
removal.  There is no reason to hold reference to it.

kernel_thread()
{
/* wanna dereference my_obj */
mutex_lock();
verify my_obj is there and use it if so.
mutex_unlock();
}

remove()
{
mutex_lock();
kill_it();
mutex_unlock();
}

I probably have over simplified it but using both mutex and reference
counts doesn't make much sense.  IOW, you get an active reference when
you grab the mutex excluding its removal and verified it's still there.

There probably are other reasons why things are done that way and we can
 and probably will have to resort to mixed solutions in foreseeable
future but I don't think there is any inherent problem in applying
immediate-disconnect in the described situation.

Feel free to scream at me if I'm getting it totally wrong.  :-)

Thanks.

-- 
tejun
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH RFD] alternative kobject release wait mechanism

2007-04-18 Thread Tejun Heo
Cornelia Huck wrote:
 On Wed, 18 Apr 2007 03:41:10 +0900,
 Tejun Heo [EMAIL PROTECTED] wrote:
 
 patch
 
 OK, I hit a bit on the code. Once I saved a reference to the completion
 in kobject_cleanup, it seemed to survive a load/unload testloop for a
 module registering a device. However, I still dislike this list of
 waiters approach, it looks too complex...

Heh heh, I'm amazed it actually works.  Agreed that the list walking
isn't pretty but adding completion to each kobject still feels like too
much of overhead just for release waiting.  Any better ideas?

Thanks.

-- 
tejun
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


  1   2   >