[linux-usb-devel] Re: bug 2400

2004-04-09 Thread James Bottomley
On Thu, 2004-04-08 at 18:06, Greg KH wrote: Why not just use a kref for this? That's what they were created for? :) Heh, I was waiting for you to say that. Took you five days to notice, though I'll convert both sr and sd to use krefs. James

[linux-usb-devel] Re: bug 2400

2004-04-08 Thread Alan Stern
On Mon, 5 Apr 2004, Patrick Mansfield wrote: Much as I hate it, simply adding a lock_kernel in sd_remove would solve the problem, since do_open uses lock_kernel. On Fri, 2 Apr 2004, Mike Anderson wrote: I believe as indicated above that all cross subsystem registrations need a release / put

[linux-usb-devel] Re: bug 2400

2004-04-08 Thread Alan Stern
On Thu, 8 Apr 2004, Matt Gulick wrote: OK, Silly question or maybe not. When writing drivers for MacOS ( 7-9 X) and Windose (98 - XP) and when I architected the USB 2.0 stack at Adaptec for 98SE, ME 2k, we solved this issue with a simple heart beat task. Every so often (1-3 seconds)

[linux-usb-devel] Re: bug 2400

2004-04-08 Thread Matt Gulick
There is implicitly _a_ notification, because when the child device is released it drops its reference to the parent (I still think this would be better if it were done differently, but never mind). However that doesn't do the intermediate driver any good, because the driver doesn't own

[linux-usb-devel] Re: bug 2400

2004-04-08 Thread Matt Gulick
On Thu, 2004-04-08 at 13:33, Alan Stern wrote: On Thu, 8 Apr 2004, Matt Gulick wrote: OK, Silly question or maybe not. When writing drivers for MacOS ( 7-9 X) and Windose (98 - XP) and when I architected the USB 2.0 stack at Adaptec for 98SE, ME 2k, we solved this issue with a

[linux-usb-devel] Re: bug 2400

2004-04-08 Thread Greg KH
On Mon, Apr 05, 2004 at 04:07:12PM -0500, James Bottomley wrote: On Sun, 2004-04-04 at 23:33, Patrick Mansfield wrote: How is unregister_cdrom(cd-cdi) called if the device is not open? OK, the attached does everything correctly (as I should have done at first) by using a kobject to hold the

Re: [linux-usb-devel] Re: bug 2400

2004-04-07 Thread Oliver . Neukum
On Tue, 6 Apr 2004, James Bottomley wrote: On Tue, 2004-04-06 at 01:52, Oliver Neukum wrote: Pure refcounting can never protect you against races with freeing objects. The counters themselves must be protected. Try as you might you need locks for that and rules on how this locks are to be

Re: [linux-usb-devel] Re: bug 2400

2004-04-06 Thread Oliver Neukum
The point is that all the disconnection event does is inform a subsystem that the entity represented by the other object has gone. What it chooses to do with the information is up to it. It could set it's own connected object to degraded, never use the other object again and drop the

[linux-usb-devel] Re: bug 2400

2004-04-06 Thread Jens Axboe
On Mon, Apr 05 2004, James Bottomley wrote: On Mon, 2004-04-05 at 09:03, Jens Axboe wrote: Doesn't quite work, how about something like this as a work-around? I'm not sure we want to be introducing yet another refcount. What was the problem? Same as in this thread, cdrom gets

[linux-usb-devel] Re: bug 2400

2004-04-06 Thread Jens Axboe
On Mon, Apr 05 2004, James Bottomley wrote: On Sun, 2004-04-04 at 23:33, Patrick Mansfield wrote: How is unregister_cdrom(cd-cdi) called if the device is not open? OK, the attached does everything correctly (as I should have done at first) by using a kobject to hold the compound references.

[linux-usb-devel] Re: bug 2400

2004-04-06 Thread James Bottomley
On Tue, 2004-04-06 at 04:22, Jens Axboe wrote: Really? It doesn't even compile :-) Heh, I really must learn that I have to copy the file from the test machine to the email machine *before* attaching it. You got a stale copy of an older incarnation, I think. The attached (hopefully) is what I

[linux-usb-devel] Re: bug 2400

2004-04-06 Thread Alan Stern
On 5 Apr 2004, James Bottomley wrote: On Sun, 2004-04-04 at 22:17, Alan Stern wrote: Of course it would! That's exactly what this thread is about: bugs caused by improper handling of open/disconnect races. So you're quietly shifting your ground away from this assertion: On Sat,

Re: [linux-usb-devel] Re: bug 2400

2004-04-06 Thread James Bottomley
On Tue, 2004-04-06 at 01:52, Oliver Neukum wrote: Pure refcounting can never protect you against races with freeing objects. The counters themselves must be protected. Try as you might you need locks for that and rules on how this locks are to be used. Which part of On Mon, 2004-04-05 at

[linux-usb-devel] Re: bug 2400

2004-04-06 Thread Jens Axboe
On Tue, Apr 06 2004, James Bottomley wrote: On Tue, 2004-04-06 at 04:22, Jens Axboe wrote: Really? It doesn't even compile :-) Heh, I really must learn that I have to copy the file from the test machine to the email machine *before* attaching it. You got a stale copy of an older

[linux-usb-devel] Re: bug 2400

2004-04-06 Thread James Bottomley
On Tue, 2004-04-06 at 09:04, Jens Axboe wrote: Only change is the err - error, correct? Not quite; there was a cleanup goto issue as well (don't want to goto the release if you couldn't get the object). The attached (hopefully) is what I compiled and tested with. The test, incidentally, is

[linux-usb-devel] Re: bug 2400

2004-04-06 Thread Alan Stern
On Mon, 5 Apr 2004, Patrick Mansfield wrote: Yes. If we are in middle opening an sd, and are anywhere between the kobj_lookup having completed call to up_read(domain-sem) [called via do_open calling get_gendisk] and sd_open calling scsi_device_get(), and separately an sd_remove call

Re: [linux-usb-devel] Re: bug 2400

2004-04-06 Thread David Brownell
James Bottomley wrote: that's intra subsystem synchronisation, not inter subsystem I'd say that device_driver.remove() calls from the bus management code to a given device driver are inter-subsystem... Look, the rules are very simple: - Every subsystem gives out refcounted objects. - Every

Re: [linux-usb-devel] Re: bug 2400

2004-04-06 Thread James Bottomley
On Tue, 2004-04-06 at 10:10, David Brownell wrote: James Bottomley wrote: that's intra subsystem synchronisation, not inter subsystem I'd say that device_driver.remove() calls from the bus management code to a given device driver are inter-subsystem... so using a subsystem provided API

Re: [linux-usb-devel] Re: bug 2400

2004-04-06 Thread Alan Stern
On 6 Apr 2004, James Bottomley wrote: Supposing we did no hotplug at all and just relied on the fact that a driver responds DID_NO_CONNECT to a missing device. We have a mounted fs on a CD on SCSI on USB which gets disconnected. No event is generated because no hotplug. Now every read of

Re: [linux-usb-devel] Re: bug 2400

2004-04-06 Thread James Bottomley
On Tue, 2004-04-06 at 11:55, Alan Stern wrote: In principle, yes. However... The kernel still has to process requests correctly while the stack is partially deconstructed. It also has to protect against userland helpers trying to pull things apart in the wrong order. This I agree with

[linux-usb-devel] Re: bug 2400

2004-04-06 Thread Mike Anderson
PS, I am traveling today so future comments will be delayed a bit. Alan Stern [EMAIL PROTECTED] wrote: All right, let's look at sd.c. I'll show you that _it_ doesn't obey the object lifetime rules. In sd_open we see this code (lightly edited): static int sd_open(struct inode

Re: [linux-usb-devel] Re: bug 2400

2004-04-05 Thread Oliver Neukum
Open process: Disconnect process: Get minor number from inode Lookup USB interface using minor number Get device pointer from the interface's private data and check it's not NULL This is what's wrong. Here you should get a

[linux-usb-devel] Re: bug 2400

2004-04-05 Thread Jens Axboe
On Fri, Apr 02 2004, James Bottomley wrote: On Thu, 2004-04-01 at 18:32, James Bottomley wrote: Now, the questions are, whose issue is this and how do we fix it? I can see that a driver needs early notification of unplugs so it can deny all access to a gone device. On the other hand, for

[linux-usb-devel] Re: bug 2400

2004-04-05 Thread James Bottomley
On Sun, 2004-04-04 at 23:33, Patrick Mansfield wrote: How is unregister_cdrom(cd-cdi) called if the device is not open? Yes, I need to pull the same kobject trick that sd does for the same reason. James --- This SF.Net email is sponsored

[linux-usb-devel] Re: bug 2400

2004-04-05 Thread James Bottomley
On Mon, 2004-04-05 at 09:03, Jens Axboe wrote: Doesn't quite work, how about something like this as a work-around? I'm not sure we want to be introducing yet another refcount. What was the problem? James --- This SF.Net email is sponsored

[linux-usb-devel] Re: bug 2400

2004-04-05 Thread James Bottomley
On Sun, 2004-04-04 at 23:33, Patrick Mansfield wrote: How is unregister_cdrom(cd-cdi) called if the device is not open? OK, the attached does everything correctly (as I should have done at first) by using a kobject to hold the compound references. I've stressed this one nicely. Unfortunately,

[linux-usb-devel] Re: bug 2400

2004-04-05 Thread James Bottomley
On Sun, 2004-04-04 at 22:17, Alan Stern wrote: On 4 Apr 2004, James Bottomley wrote: On Sun, 2004-04-04 at 12:46, Alan Stern wrote: Ah, you have left out the third, bad alternative: open succeeds, user gets an fd that points to a deallocated device. More details below... No, that

Re: [linux-usb-devel] Re: bug 2400

2004-04-05 Thread James Bottomley
On Sun, 2004-04-04 at 22:54, David Brownell wrote: Which directly follows from what I said ... USB propagates that knowledge in carefully defined ways. Other layers can do the same, although clearly state associated with open file descriptors needs to use a slightly different strategy. And

[linux-usb-devel] Re: bug 2400

2004-04-05 Thread Patrick Mansfield
On Sun, Apr 04, 2004 at 11:17:55PM -0400, Alan Stern wrote: As it turns out, the block layer guarantees that when sd_open runs the bd_disk pointer will be valid. It does this by following the pattern I mentioned in an earlier message -- drivers/base/map.c uses a subsystem-wide semaphore,

Re: [linux-usb-devel] Re: bug 2400

2004-04-05 Thread David Brownell
James Bottomley wrote: On Sun, 2004-04-04 at 22:54, David Brownell wrote: Which directly follows from what I said ... USB propagates that knowledge in carefully defined ways. Other layers can do the same, although clearly state associated with open file descriptors needs to use a slightly

Re: [linux-usb-devel] Re: bug 2400

2004-04-05 Thread James Bottomley
On Mon, 2004-04-05 at 18:23, David Brownell wrote: Irrelevant is pointlessly strong, even for what I'm guessing you really mean to be saying. No, it's the very point. If we have to enforce ordering on the event propagation across subsystems, that can't be done without inter subsystem

[linux-usb-devel] Re: bug 2400

2004-04-04 Thread James Bottomley
On Sat, 2004-04-03 at 20:40, Alan Stern wrote: Without having looked recently in any detail at the specific code in sd.c or sr.c, I want to comment in general terms on the nature of the open/disconnect race. It's a generic problem that affects every driver whose devices can be opened

[linux-usb-devel] Re: bug 2400

2004-04-04 Thread Alan Stern
On 4 Apr 2004, James Bottomley wrote: OK, your problem definition is that there's a race, which I agree with, I just don't agree that it's a problem. Disconnections are fundamentally asynchronous events (a device may be disconnected by the user at any stage regardless of what any kernel

[linux-usb-devel] Re: bug 2400

2004-04-04 Thread James Bottomley
On Sun, 2004-04-04 at 12:46, Alan Stern wrote: Ah, you have left out the third, bad alternative: open succeeds, user gets an fd that points to a deallocated device. More details below... No, that would be a bug. I'm looking for evidence that such a bug exists in sd. Let's consider a simple

Re: [linux-usb-devel] Re: bug 2400

2004-04-04 Thread David Brownell
James Bottomley wrote: Let me illustrate: the user may disconnect the device then open it. If they open it before even the USB subsystem gets notified of the disconnection then all the elaborate synchronisation in the world isn't going to be able to prevent that (the device was gone when they

Re: [linux-usb-devel] Re: bug 2400

2004-04-04 Thread James Bottomley
On Sun, 2004-04-04 at 14:16, David Brownell wrote: You're assuming that synchronization is there to establish a single global notion of state. Clearly that's impossible; also undesirable. Thank you. The synchronization is actually there to let the device gone state spread cleanly through

[linux-usb-devel] Re: bug 2400

2004-04-04 Thread Alan Stern
On 4 Apr 2004, James Bottomley wrote: On Sun, 2004-04-04 at 12:46, Alan Stern wrote: Ah, you have left out the third, bad alternative: open succeeds, user gets an fd that points to a deallocated device. More details below... No, that would be a bug. Of course it would! That's exactly

Re: [linux-usb-devel] Re: bug 2400

2004-04-04 Thread David Brownell
James Bottomley wrote: So you dispute this assertion in the email you quoted above: Since we cannot solve that race, there's no reason to try to solve the some parts of the kernel know but others don't part of the race. On what basis? This, I think, is the core of the differences between On

[linux-usb-devel] Re: bug 2400

2004-04-04 Thread Patrick Mansfield
On Fri, Apr 02, 2004 at 06:36:55PM -0500, James Bottomley wrote: = drivers/scsi/sr.c 1.103 vs edited = --- 1.103/drivers/scsi/sr.c Fri Apr 2 11:30:44 2004 +++ edited/drivers/scsi/sr.c Fri Apr 2 17:29:06 2004 @@ -424,8 +424,19 @@ static int sr_block_release(struct inode

[linux-usb-devel] Re: bug 2400

2004-04-02 Thread Mike Anderson
James Bottomley [EMAIL PROTECTED] wrote: Well, I know why this happens, but I'm not entirely clear how to fix it. The problem comes because the cdrom open and close take and release references to the SCSI generic device (as they're supposed to). However, Upper level Drivers like sr are

[linux-usb-devel] Re: bug 2400

2004-04-02 Thread James Bottomley
On Fri, 2004-04-02 at 03:43, Mike Anderson wrote: I have looked at the sd issue off and on due to the previous open race reported by Alan Stern. While the window can be narrowed inside SCSI you need help for the calling subsystem to know when it is OK to cleanup and your routine will not be

[linux-usb-devel] Re: bug 2400

2004-04-02 Thread Mike Anderson
James Bottomley [EMAIL PROTECTED] wrote: Recently I have not been spending the proper time looking at this, but last look it appeared that we needed to add a release / put method call to the gendisk disk_release routine. The release function or object to do the put on would need to be set

[linux-usb-devel] Re: bug 2400

2004-04-02 Thread James Bottomley
On Fri, 2004-04-02 at 11:45, Mike Anderson wrote: Maybe some clarification here as I am unsure if we both think there needs to be a notification (a put call) from outside SCSI. We have release functions available on most objects in SCSI now. The issue is that when we register (add_disk,

[linux-usb-devel] Re: bug 2400

2004-04-02 Thread Mike Anderson
James Bottomley [EMAIL PROTECTED] wrote: On Fri, 2004-04-02 at 11:45, Mike Anderson wrote: Maybe some clarification here as I am unsure if we both think there needs to be a notification (a put call) from outside SCSI. We have release functions available on most objects in SCSI now. The

[linux-usb-devel] Re: bug 2400

2004-04-02 Thread James Bottomley
On Fri, 2004-04-02 at 12:44, Mike Anderson wrote: Where does the last put come from? How do you close the open race or know when the final put_disk has been called? SCSI cannot do this alone as we have created and registered an object in another subsystem (alloc_disk and add_disk) and we have

[linux-usb-devel] Re: bug 2400

2004-04-02 Thread James Bottomley
On Thu, 2004-04-01 at 18:32, James Bottomley wrote: Now, the questions are, whose issue is this and how do we fix it? I can see that a driver needs early notification of unplugs so it can deny all access to a gone device. On the other hand, for a user land open where we still have to hold

[linux-usb-devel] Re: bug 2400

2004-04-02 Thread Mike Anderson
James Bottomley [EMAIL PROTECTED] wrote: = drivers/scsi/sr.c 1.103 vs edited = --- 1.103/drivers/scsi/sr.c Fri Apr 2 11:30:44 2004 +++ edited/drivers/scsi/sr.c Fri Apr 2 17:29:06 2004 @@ -424,8 +424,19 @@ static int sr_block_release(struct inode *inode, struct file *file) {

[linux-usb-devel] Re: bug 2400

2004-04-02 Thread James Bottomley
On Fri, 2004-04-02 at 19:11, Mike Anderson wrote: It looks like cdrom_release always returns 0? Is there some other patch outstanding that changes this. Yes, not that I know of. However, while it apparently has a return code indicating success or failure, we should respect it [unless Jens

[linux-usb-devel] Re: bug 2400

2004-04-02 Thread James Bottomley
On Fri, 2004-04-02 at 18:40, Mike Anderson wrote: Greg stopped by and after talking this over I think I see why sd is racing in its current form. The race happens when sd_remove and do_open race. Even though I do not like adding a lock_kernel it would appear adding on to sd_remove would

[linux-usb-devel] Re: bug 2400

2004-04-01 Thread Matt Gulick
On Thu, 2004-04-01 at 15:15, Andrew Morton wrote: Apparently there is some controversy over whether this: http://bugme.osdl.org/show_bug.cgi?id=2400 is a usb-storage problem or a SCSI problem. Can someone please shed some light, propose a way to get it fixed? Thanks. - To unsubscribe

[linux-usb-devel] Re: bug 2400

2004-04-01 Thread Andrew Morton
Matt Gulick [EMAIL PROTECTED] wrote: Remember, SCSI is not hot plugable. oopsing the kernel is not a particularly friendly way of telling this to our users. If we have to bandaid over this, say, by leaking some memory and emitting some rude printk's then OK. Allowing the machine to kill

[linux-usb-devel] Re: bug 2400

2004-04-01 Thread James Bottomley
On Thu, 2004-04-01 at 16:52, Matt Gulick wrote: Where you will get into trouble is that SCSI is not considered to be a 'Dynamic' bus. Devices do not normally come and go, while USB and 1394 devices are 'Hot Plugable'. It isn't? considering all the refcounting work that's gone into the SCSI

[linux-usb-devel] Re: bug 2400

2004-04-01 Thread Matt Gulick
On Thu, 2004-04-01 at 16:08, Andrew Morton wrote: Matt Gulick [EMAIL PROTECTED] wrote: Remember, SCSI is not hot plugable. oopsing the kernel is not a particularly friendly way of telling this to our users. If we have to bandaid over this, say, by leaking some memory and emitting

[linux-usb-devel] Re: bug 2400

2004-04-01 Thread Matt Gulick
On Thu, 2004-04-01 at 16:40, James Bottomley wrote: On Thu, 2004-04-01 at 16:52, Matt Gulick wrote: Where you will get into trouble is that SCSI is not considered to be a 'Dynamic' bus. Devices do not normally come and go, while USB and 1394 devices are 'Hot Plugable'. It isn't?

[linux-usb-devel] Re: bug 2400

2004-04-01 Thread Matthew Dharm
On Thu, Apr 01, 2004 at 01:15:02PM -0800, Andrew Morton wrote: Apparently there is some controversy over whether this: http://bugme.osdl.org/show_bug.cgi?id=2400 is a usb-storage problem or a SCSI problem. Can someone please shed some light, propose a way to get it fixed? There's

[linux-usb-devel] Re: bug 2400

2004-04-01 Thread James Bottomley
On Thu, 2004-04-01 at 16:15, Andrew Morton wrote: Apparently there is some controversy over whether this: http://bugme.osdl.org/show_bug.cgi?id=2400 is a usb-storage problem or a SCSI problem. Well, I know why this happens, but I'm not entirely clear how to fix it. The problem comes because

[linux-usb-devel] Re: bug 2400

2004-04-01 Thread Steven Dake
On Thu, 2004-04-01 at 16:32, James Bottomley wrote: On Thu, 2004-04-01 at 16:15, Andrew Morton wrote: Apparently there is some controversy over whether this: http://bugme.osdl.org/show_bug.cgi?id=2400 is a usb-storage problem or a SCSI problem. Well, I know why this happens, but I'm not