Re: Current qc_defer implementation may lead to infinite recursion

2008-02-18 Thread Elias Oltmanns
Hi Tejun,

Tejun Heo [EMAIL PROTECTED] wrote:
> Elias Oltmanns wrote:
>> Tejun Heo [EMAIL PROTECTED] wrote:
>>> Elias Oltmanns wrote:
>>>> This proves that piix_qc_defer() has declined the same command 100
>>>> times in succession. However, this will only happen if the status of
>>>> all the commands enqueued for one port hasn't changed in the
>>>> meantime. This suggests to me that the threads scheduled for command
>>>> execution and completion aren't served for some reason. Any ideas?
>>> Blocked counts of 1 will cause busy looping because when blk_run_queue()
>>> returns because it's recursing too deep, it schedules unplug work right
>>> away, so it will easily loop 100 times.  Max blocked counts should be
>>> adjusted to two (needs some testing before actually submitting the
>>> change).  But that still shouldn't cause any lock up.  What happens if
>>> you remove the 100 times limit?  Does the machine hang on IO?
>>
>> Yes, it does. In fact, I had already verified that before sending the
>> previous email.
>
> Hmmm it's supposed not to lock up although it can cause busy wait.
> I'll test it tomorrow.

Have you had a chance to test yet? What is to be done about it?

Regards,

Elias


Re: Current qc_defer implementation may lead to infinite recursion

2008-02-12 Thread Elias Oltmanns
Tejun Heo [EMAIL PROTECTED] wrote:
> Elias Oltmanns wrote:
>> +static int piix_qc_defer(struct ata_queued_cmd *qc)
>> +{
>> +	static struct ata_port *ap = NULL;
>> +	struct ata_queued_cmd qcmd[ATA_MAX_QUEUE];
>
> missing static?

Oh well, I must have been too tired already yesterday. There are a few
more things I got wrong last time. Please see the new patch at the end
of this email.

This time I applied the patch to 2.6.24.1 and ran

# cat large-file > /dev/null &
# tail -f /var/log/kern.log

and aborted once the output of dump_stack() had occurred. This proves
that piix_qc_defer() has declined the same command 100 times in
succession. However, this will only happen if the status of all the
commands enqueued for one port hasn't changed in the meantime. This
suggests to me that the threads scheduled for command execution and
completion aren't served for some reason. Any ideas?

The footprint of piix_qc_defer() in my log files looks like this, by
the way:

Feb 12 08:53:42 denkblock kernel: piix_qc_defer(): saved current state
Feb 12 08:53:42 denkblock kernel: Pid: 31, comm: kblockd/0 Not tainted 2.6.24.1-dbg-1 #1
Feb 12 08:53:42 denkblock kernel:  [<e0042116>] piix_qc_defer+0xac/0xeb [ata_piix]
Feb 12 08:53:42 denkblock kernel:  [<e008c4e4>] ata_scsi_translate+0xd6/0x10a [libata]
Feb 12 08:53:42 denkblock kernel:  [<e008af20>] ata_build_rw_tf+0x175/0x242 [libata]
Feb 12 08:53:42 denkblock kernel:  [<e006734e>] scsi_done+0x0/0x16 [scsi_mod]
Feb 12 08:53:42 denkblock kernel:  [<e008eb8f>] ata_scsi_queuecmd+0x17f/0x184 [libata]
Feb 12 08:53:42 denkblock kernel:  [<e008c20a>] ata_scsi_rw_xlat+0x0/0x1de [libata]
Feb 12 08:53:42 denkblock kernel:  [<e0067aa8>] scsi_dispatch_cmd+0x197/0x20b [scsi_mod]
Feb 12 08:53:42 denkblock kernel:  [<e006cb33>] scsi_request_fn+0x256/0x2d7 [scsi_mod]
Feb 12 08:53:42 denkblock kernel:  [<e0042136>] piix_qc_defer+0xcc/0xeb [ata_piix]
Feb 12 08:53:42 denkblock kernel:  [<c01961d3>] blk_run_queue+0x2a/0x4b
Feb 12 08:53:42 denkblock kernel:  [<e006bea2>] scsi_queue_insert+0x84/0x8e [scsi_mod]
Feb 12 08:53:42 denkblock kernel:  [<e006a70d>] scsi_delete_timer+0xf/0x50 [scsi_mod]
Feb 12 08:53:42 denkblock kernel:  [<e0067adc>] scsi_dispatch_cmd+0x1cb/0x20b [scsi_mod]
Feb 12 08:53:42 denkblock kernel:  [<e006cb33>] scsi_request_fn+0x256/0x2d7 [scsi_mod]
Feb 12 08:53:42 denkblock kernel:  [<c0195289>] blk_remove_plug+0x4e/0x5a
Feb 12 08:53:42 denkblock kernel:  [<c01934c2>] blk_unplug_work+0x0/0xc
Feb 12 08:53:42 denkblock kernel:  [<c01952b2>] __generic_unplug_device+0x1d/0x1f
Feb 12 08:53:42 denkblock kernel:  [<c0195c07>] generic_unplug_device+0x6/0x8
Feb 12 08:53:42 denkblock kernel:  [<c01934cd>] blk_unplug_work+0xb/0xc
Feb 12 08:53:42 denkblock kernel:  [<c0122943>] run_workqueue+0x6b/0xdf
Feb 12 08:53:42 denkblock kernel:  [<c024dcd2>] schedule+0x1f3/0x20d
Feb 12 08:53:42 denkblock kernel:  [<c0122f37>] worker_thread+0x0/0xbd
Feb 12 08:53:42 denkblock kernel:  [<c0122fe9>] worker_thread+0xb2/0xbd
Feb 12 08:53:42 denkblock kernel:  [<c0125429>] autoremove_wake_function+0x0/0x35
Feb 12 08:53:42 denkblock kernel:  [<c01252d2>] kthread+0x36/0x5c
Feb 12 08:53:42 denkblock kernel:  [<c012529c>] kthread+0x0/0x5c
Feb 12 08:53:42 denkblock kernel:  [<c0104733>] kernel_thread_helper+0x7/0x10
Feb 12 08:53:42 denkblock kernel:  =======================

Regards,

Elias

---

 drivers/ata/ata_piix.c |   42 ++
 1 files changed, 42 insertions(+), 0 deletions(-)

diff --git a/drivers/ata/ata_piix.c b/drivers/ata/ata_piix.c
index b406b39..c987462 100644
--- a/drivers/ata/ata_piix.c
+++ b/drivers/ata/ata_piix.c
@@ -167,6 +167,7 @@ static void piix_set_dmamode(struct ata_
 static void ich_set_dmamode(struct ata_port *ap, struct ata_device *adev);
 static int ich_pata_cable_detect(struct ata_port *ap);
 static u8 piix_vmw_bmdma_status(struct ata_port *ap);
+static int piix_qc_defer(struct ata_queued_cmd *qc);
 #ifdef CONFIG_PM
 static int piix_pci_device_suspend(struct pci_dev *pdev, pm_message_t mesg);
 static int piix_pci_device_resume(struct pci_dev *pdev);
@@ -311,6 +312,7 @@ static const struct ata_port_operations 
 	.bmdma_start		= ata_bmdma_start,
 	.bmdma_stop		= ata_bmdma_stop,
 	.bmdma_status		= ata_bmdma_status,
+	.qc_defer		= piix_qc_defer,
 	.qc_prep		= ata_qc_prep,
 	.qc_issue		= ata_qc_issue_prot,
 	.data_xfer		= ata_data_xfer,
@@ -343,6 +345,7 @@ static const struct ata_port_operations 
 	.bmdma_start		= ata_bmdma_start,
 	.bmdma_stop		= ata_bmdma_stop,
 	.bmdma_status		= ata_bmdma_status,
+	.qc_defer		= piix_qc_defer,
 	.qc_prep		= ata_qc_prep,
 	.qc_issue		= ata_qc_issue_prot,
 	.data_xfer		= ata_data_xfer,
@@ -371,6 +374,7 @@ static const struct ata_port_operations 
 	.bmdma_start		= ata_bmdma_start,
 	.bmdma_stop		= ata_bmdma_stop,
 	.bmdma_status		= ata_bmdma_status,
+	.qc_defer		= piix_qc_defer,
 	.qc_prep		= ata_qc_prep,
 	.qc_issue		= ata_qc_issue_prot,
 	.data_xfer		= ata_data_xfer,
@@ -402,6 +406,7 @@ static const struct ata_port_operations 
 	.bmdma_start		= 

Re: Current qc_defer implementation may lead to infinite recursion

2008-02-12 Thread Tejun Heo
Elias Oltmanns wrote:
> Tejun Heo [EMAIL PROTECTED] wrote:
>> Elias Oltmanns wrote:
>>> +static int piix_qc_defer(struct ata_queued_cmd *qc)
>>> +{
>>> +	static struct ata_port *ap = NULL;
>>> +	struct ata_queued_cmd qcmd[ATA_MAX_QUEUE];
>> missing static?
>
> Oh well, I must have been too tired already yesterday. There are a few
> more things I got wrong last time. Please see the new patch at the end
> of this email.
>
> This time I applied the patch to 2.6.24.1 and ran
>
> # cat large-file > /dev/null &
> # tail -f /var/log/kern.log
>
> and aborted once the output of dump_stack() had occurred. This proves
> that piix_qc_defer() has declined the same command 100 times in
> succession. However, this will only happen if the status of all the
> commands enqueued for one port hasn't changed in the meantime. This
> suggests to me that the threads scheduled for command execution and
> completion aren't served for some reason. Any ideas?

Blocked counts of 1 will cause busy looping because when blk_run_queue()
returns because it's recursing too deep, it schedules unplug work right
away, so it will easily loop 100 times.  Max blocked counts should be
adjusted to two (needs some testing before actually submitting the
change).  But that still shouldn't cause any lock up.  What happens if
you remove the 100 times limit?  Does the machine hang on IO?
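
The midlayer logic in question, paraphrased from 2.6.24's scsi_lib.c
(a sketch from memory, not the verbatim kernel code), shows why a
blocked count of 1 busy-loops while 2 would throttle:

	/* Sketch of scsi_dev_queue_ready(): with max_device_blocked == 1,
	 * the first retry decrements device_blocked straight back to 0 and
	 * queue processing restarts immediately; with 2, the first retry
	 * plugs the queue and waits for a completion or the unplug timer.
	 */
	static inline int scsi_dev_queue_ready(struct request_queue *q,
					       struct scsi_device *sdev)
	{
		if (sdev->device_busy == 0 && sdev->device_blocked) {
			/* unblock after device_blocked iterates to zero */
			if (--sdev->device_blocked != 0) {
				blk_plug_device(q);
				return 0;
			}
		}
		return 1;
	}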

Thanks.


-- 
tejun


Re: Current qc_defer implementation may lead to infinite recursion

2008-02-12 Thread Elias Oltmanns
Tejun Heo [EMAIL PROTECTED] wrote:
> Elias Oltmanns wrote:
>> This proves that piix_qc_defer() has declined the same command 100
>> times in succession. However, this will only happen if the status of
>> all the commands enqueued for one port hasn't changed in the
>> meantime. This suggests to me that the threads scheduled for command
>> execution and completion aren't served for some reason. Any ideas?
>
> Blocked counts of 1 will cause busy looping because when blk_run_queue()
> returns because it's recursing too deep, it schedules unplug work right
> away, so it will easily loop 100 times.  Max blocked counts should be
> adjusted to two (needs some testing before actually submitting the
> change).  But that still shouldn't cause any lock up.  What happens if
> you remove the 100 times limit?  Does the machine hang on IO?

Yes, it does. In fact, I had already verified that before sending the
previous email.

Regards,

Elias


Re: Current qc_defer implementation may lead to infinite recursion

2008-02-12 Thread Tejun Heo
Elias Oltmanns wrote:
> Tejun Heo [EMAIL PROTECTED] wrote:
>> Elias Oltmanns wrote:
>>> This proves that piix_qc_defer() has declined the same command 100
>>> times in succession. However, this will only happen if the status of
>>> all the commands enqueued for one port hasn't changed in the
>>> meantime. This suggests to me that the threads scheduled for command
>>> execution and completion aren't served for some reason. Any ideas?
>> Blocked counts of 1 will cause busy looping because when blk_run_queue()
>> returns because it's recursing too deep, it schedules unplug work right
>> away, so it will easily loop 100 times.  Max blocked counts should be
>> adjusted to two (needs some testing before actually submitting the
>> change).  But that still shouldn't cause any lock up.  What happens if
>> you remove the 100 times limit?  Does the machine hang on IO?
>
> Yes, it does. In fact, I had already verified that before sending the
> previous email.

Hmmm it's supposed not to lock up although it can cause busy wait.
I'll test it tomorrow.

Thanks.

-- 
tejun


Re: Current qc_defer implementation may lead to infinite recursion

2008-02-11 Thread Elias Oltmanns
Tejun Heo [EMAIL PROTECTED] wrote:
> Hello,
>
> Elias Oltmanns wrote:
>>> Hmmm... The reason why max_host_blocked and max_device_blocked are set
>>> to 1 is to let libata re-consider status after each command completion
>>> as blocked status can be rather complex w/ PMP.  I haven't really
>>> followed the code yet but you're saying that blocked count of 2 should
>>> be used for that behavior, right?
>>
>> Not quite. On an SMP system the current implementation will probably do
>> exactly what you had in mind. In particular, setting max_device_blocked
>> and max_host_blocked to 1 seems to be the right thing to do in this
>> case.

[...]
> I don't think such infinite recursions can happen because
> blk_run_queue() doesn't allow recursing more than once.

Quite right. Sorry for missing that. However, the problem remains
although I'm not too sure anymore whether libata is to blame. See below.


>>> Another strange thing is that there hasn't been any such lock up /
>>> infinite recursion report till now although the ->qc_defer mechanism has
>>> been used widely for some time now.  Can you reproduce the problem w/o
>>> the disk shock protection?
>>
>> No, unfortunately, I'm unable to reproduce this without the patch on my
>> machine. This is for purely technical reasons though because I'm using
>> ata_piix. Running a vanilla kernel, I'd expect everything to work just
>> fine except for one case: A non-SMP system using a driver that provides
>> the ->qc_defer() callback. Currently, the ->qc_defer() callback is the
>> only thing that can possibly send a non-zero return value to the scsi
>> midlayer. Once it does, however, the driver will only get a chance to
>> complete some qcs before ->qc_defer() is called again provided that
>> multithreading is supported.
>>
>> So, what I'm saying is this: If the low level driver doesn't provide a
>> ->qc_defer() callback, there is no (obvious) reason why
>> max_device_blocked and max_host_blocked should be set to 1 since libata
>> won't gain anything by it. However, it is not a bug either, even though
>> James considers it suboptimal and I will have to think about a solution
>> for my patch. On the other hand, once a driver defines the ->qc_defer()
>> callback, we really have a bug because things will go wrong once
>> ->qc_defer() returns non-zero on a uniprocessor. So, in this case
>> max_device_blocked and max_host_blocked should be set to 1 on an SMP
>> system and *have to* be bigger than 1 otherwise.
>
> Well, we need to support ->qc_defer on UP systems too.  I'm still very
> skeptical that the scenario you described can happen.  The mechanism is
> just used too widely for such a problem to go unnoticed.  For example,
> deferring is exercised when FLUSH is issued to a driver which has
> in-flight NCQ commands.  If you have a barrier enabled FS mounted on an
> NCQ enabled hard drive, such events will happen regularly but we haven't
> heard of any similar lock up or stack overflow till now.
>
> Feel free to add ->qc_defer to ata_piix and let it defer commands (say
> every 20th or any other criteria you see fit for testing)

An excellent idea, although it took me some time to figure out what kind
of a qc_defer replacement would best fit my needs. Indeed, I can
reproduce the system freeze with a suitably crafted qc_defer callback;
the patch I've used to get it working is below. The intention is to
simulate the following scenario:

1. libata has accepted some commands from the midlayer already and has
   stored them on the private queue of some ata_port.
2. Eventually, qc_defer declines a command for whatever reason.
3. Even though an infinite recursion does not occur in the way I had
   suggested, qc_defer will decline this command again the next time
   because the general situation hasn't changed. In particular, all the
   commands that had already been enqueued for that port at the last
   time will still be there and will not have been processed or even
   changed in any way.

Hopefully, I didn't miss the crucial point this time. Please note that
you have to strip down the piix_qc_defer() function in this patch in
order to get the scenario described above: just get rid of the two
if-blocks whose condition contains the expression

	PIIX_QC_DEFER_THRESHOLD + 100

in some way. Leaving them in will prevent the system from freezing
entirely but still leads to BUG: messages in dmesg. At least that's what
happens in my case. These messages suggest to me that libata may not be
the culprit after all. However, I have no experience with interpreting
them properly, so I'd appreciate it if you had a look at them after you
have verified that my qc_defer function does indeed constitute a valid
test case. I'll append some of the messages I get after the patch.
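
For clarity, the stripped-down variant I mean would look roughly like
this (a reconstruction for illustration, with the printk()/msleep()
debugging dropped as well; not the exact code):

	static int piix_qc_defer(struct ata_queued_cmd *qc)
	{
		/* Remember the first port we see and a snapshot of its
		 * queued commands; keep deferring for that port as long
		 * as the snapshot hasn't changed.
		 */
		static struct ata_port *ap = NULL;
		static struct ata_queued_cmd qcmd[ATA_MAX_QUEUE];
		static int count = 0;
	#define PIIX_QC_DEFER_THRESHOLD 5000

		if (!ap)
			ap = qc->ap;
		else if (ap != qc->ap)
			return 0;

		count++;
		if (count < PIIX_QC_DEFER_THRESHOLD)
			return 0;
		else if (count == PIIX_QC_DEFER_THRESHOLD) {
			int i;
			if (ap->qc_allocated) {
				for (i = 0; i < ATA_MAX_QUEUE; i++)
					qcmd[i] = ap->qcmd[i];
			} else
				count--;
		} else if (memcmp(qcmd, ap->qcmd, sizeof(qcmd[0]) * ATA_MAX_QUEUE))
			return ATA_DEFER_LINK;
		else
			count = 0;
		return 0;
	}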

Thanks for your help,

Elias


---
diff --git a/drivers/ata/ata_piix.c b/drivers/ata/ata_piix.c
index b406b39..f4dede7 100644
--- a/drivers/ata/ata_piix.c
+++ b/drivers/ata/ata_piix.c
@@ -167,6 +167,7 @@ static void piix_set_dmamode(struct ata_
 static void ich_set_dmamode(struct ata_port *ap, struct ata_device *adev);
 static int 

Re: Current qc_defer implementation may lead to infinite recursion

2008-02-11 Thread Tejun Heo
Elias Oltmanns wrote:
> +static int piix_qc_defer(struct ata_queued_cmd *qc)
> +{
> +	static struct ata_port *ap = NULL;
> +	struct ata_queued_cmd qcmd[ATA_MAX_QUEUE];

missing static?

> +	static int count = 0;
> +#define PIIX_QC_DEFER_THRESHOLD 5000
> +
> +	if (!ap)
> +		ap = qc->ap;
> +	else if (ap != qc->ap)
> +		return 0;
> +
> +	if (count > PIIX_QC_DEFER_THRESHOLD + 100)
> +		return 0;
> +
> +	count++;
> +	if (count < PIIX_QC_DEFER_THRESHOLD)
> +		return 0;
> +	else if (count == PIIX_QC_DEFER_THRESHOLD) {
> +		int i;
> +		if (ap->qc_allocated) {
> +			for (i = 0; i < ATA_MAX_QUEUE; i++)
> +				qcmd[i] = ap->qcmd[i];
> +			printk(KERN_DEBUG "piix_qc_defer(): saved current state\n");
> +			msleep(5000);
> +		} else
> +			count--;
> +	} else if (count == PIIX_QC_DEFER_THRESHOLD + 100) {
> +		dump_stack();
> +		count++;
> +	} else if (memcmp(qcmd, ap->qcmd, sizeof(qcmd[0]) * ATA_MAX_QUEUE))

memcmp() will always mismatch.  qcmd contains garbage.

> +		return ATA_DEFER_LINK;
> +	else
> +		count = 0;
> +	return 0;
> +}
> +
>  static const struct piix_map_db ich5_map_db = {
>  	.mask = 0x7,
>  	.port_enable = 0x3,
> ---
>
>
> Feb 11 20:33:05 denkblock kernel: piix_qc_defer(): saved current state
> Feb 11 20:33:11 denkblock kernel: BUG: scheduling while atomic: swapper/0/0x0100

You can't call msleep(5000) there.

Thanks.

-- 
tejun


Re: Current qc_defer implementation may lead to infinite recursion

2008-02-11 Thread Tejun Heo
Hello,

Elias Oltmanns wrote:
>> Hmmm... The reason why max_host_blocked and max_device_blocked are set
>> to 1 is to let libata re-consider status after each command completion
>> as blocked status can be rather complex w/ PMP.  I haven't really
>> followed the code yet but you're saying that blocked count of 2 should
>> be used for that behavior, right?
>
> Not quite. On an SMP system the current implementation will probably do
> exactly what you had in mind. In particular, setting max_device_blocked
> and max_host_blocked to 1 seems to be the right thing to do in this
> case.

I currently can't test the code, but SMP or not doesn't make much
difference if your analysis is correct.  You're saying that the code will
infinitely recurse if qc_defer returns non-zero but on SMP the recursion
will be terminated by another CPU finishing a pending request, right?  I
don't think things will pan out that way.  The recursing thread will
have much more than enough time to overflow the stack before any IO
event comes to stop the recursion.

I don't think such infinite recursions can happen because
blk_run_queue() doesn't allow recursing more than once.
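
For reference, the guard in blk_run_queue() looks roughly like this
(paraphrased from memory of the 2.6.24 block layer, not verbatim):

	void blk_run_queue(struct request_queue *q)
	{
		unsigned long flags;

		spin_lock_irqsave(q->queue_lock, flags);
		blk_remove_plug(q);

		/* Only recurse once to avoid overrunning the stack; on the
		 * second entry, plug the device and let kblockd restart
		 * queue processing from the unplug work instead.
		 */
		if (!elv_queue_empty(q)) {
			if (!test_and_set_bit(QUEUE_FLAG_REENTER, &q->queue_flags)) {
				q->request_fn(q);
				clear_bit(QUEUE_FLAG_REENTER, &q->queue_flags);
			} else {
				blk_plug_device(q);
				kblockd_schedule_work(&q->unplug_work);
			}
		}

		spin_unlock_irqrestore(q->queue_lock, flags);
	}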

>> Another strange thing is that there hasn't been any such lock up /
>> infinite recursion report till now although the ->qc_defer mechanism has
>> been used widely for some time now.  Can you reproduce the problem w/o
>> the disk shock protection?
>
> No, unfortunately, I'm unable to reproduce this without the patch on my
> machine. This is for purely technical reasons though because I'm using
> ata_piix. Running a vanilla kernel, I'd expect everything to work just
> fine except for one case: A non-SMP system using a driver that provides
> the ->qc_defer() callback. Currently, the ->qc_defer() callback is the
> only thing that can possibly send a non-zero return value to the scsi
> midlayer. Once it does, however, the driver will only get a chance to
> complete some qcs before ->qc_defer() is called again provided that
> multithreading is supported.
>
> So, what I'm saying is this: If the low level driver doesn't provide a
> ->qc_defer() callback, there is no (obvious) reason why
> max_device_blocked and max_host_blocked should be set to 1 since libata
> won't gain anything by it. However, it is not a bug either, even though
> James considers it suboptimal and I will have to think about a solution
> for my patch. On the other hand, once a driver defines the ->qc_defer()
> callback, we really have a bug because things will go wrong once
> ->qc_defer() returns non-zero on a uniprocessor. So, in this case
> max_device_blocked and max_host_blocked should be set to 1 on an SMP
> system and *have to* be bigger than 1 otherwise.

Well, we need to support ->qc_defer on UP systems too.  I'm still very
skeptical that the scenario you described can happen.  The mechanism is
just used too widely for such a problem to go unnoticed.  For example,
deferring is exercised when FLUSH is issued to a driver which has
in-flight NCQ commands.  If you have a barrier enabled FS mounted on an
NCQ enabled hard drive, such events will happen regularly but we haven't
heard of any similar lock up or stack overflow till now.

Feel free to add ->qc_defer to ata_piix and let it defer commands (say
every 20th or any other criteria you see fit for testing) but I think
the worst that can happen is a somewhat inefficient deferring sequence
like the following.

1. ->qc_defer returns DEVICE BUSY
2. device_blocked set to 1
3. scsi_queue_insert() ends up calling blk_run_queue()
4. blk_run_queue() recurses, device_blocked cleared and queue processing
   starts again.
5. ->qc_defer returns DEVICE BUSY again
6. device_blocked set to 1
7. scsi_queue_insert() ends up calling blk_run_queue()
8. blk_run_queue() detects it's recursing the second time and bails.

At this point, although it went through an unnecessary loop of queue
processing, everything is well.
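
For testing, such a defer hook could be as simple as this sketch
(untested, purely to exercise the midlayer's requeue path):

	/* Toy ->qc_defer for experimentation only: defer every 20th
	 * command.  Not meant for submission.
	 */
	static int piix_test_qc_defer(struct ata_queued_cmd *qc)
	{
		static unsigned int count;

		if (++count % 20 == 0)
			return ATA_DEFER_LINK;
		return 0;
	}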

Thanks.

-- 
tejun


Re: Current qc_defer implementation may lead to infinite recursion

2008-02-10 Thread Elias Oltmanns
Tejun Heo [EMAIL PROTECTED] wrote:
> Elias Oltmanns wrote:
>> Hi Tejun,
>>
>> due to your commit 31cc23b34913bc173680bdc87af79e551bf8cc0d libata now
>> sets max_host_blocked and max_device_blocked to 1 for all devices it
>> manages. Under certain conditions this may lead to system lockups due to
>> infinite recursion as I have explained to James on the scsi list (kept
>> you cc-ed). James told me that it was the business of libata to make
>> sure that such a recursion cannot happen.
>>
>> In my discussion with James I imprudently claimed that this was easy to
>> fix in libata. However, after giving the matter some thought, I'm not at
>> all sure as to what exactly should be done about it. The easy bit is
>> that max_host_blocked and max_device_blocked should be left alone as
>> long as the low level driver does not provide the ->qc_defer() callback.
>> But even if the driver has defined this callback, ata_std_qc_defer() for
>> one will not prevent this recursion on a uniprocessor, whereas things
>> might work out well on an SMP system due to the lock fiddling in the
>> scsi midlayer.
>>
>> As a conclusion, the current implementation makes it imperative to leave
>> max_host_blocked and max_device_blocked alone on a uniprocessor system.
>> For SMP systems the current implementation might just be fine but even
>> there it might just as well be a good idea to make the adjustment
>> depending on ->qc_defer != NULL.
>
> Hmmm... The reason why max_host_blocked and max_device_blocked are set
> to 1 is to let libata re-consider status after each command completion
> as blocked status can be rather complex w/ PMP.  I haven't really
> followed the code yet but you're saying that blocked count of 2 should
> be used for that behavior, right?

Not quite. On an SMP system the current implementation will probably do
exactly what you had in mind. In particular, setting max_device_blocked
and max_host_blocked to 1 seems to be the right thing to do in this
case.


> Another strange thing is that there hasn't been any such lock up /
> infinite recursion report till now although the ->qc_defer mechanism has
> been used widely for some time now.  Can you reproduce the problem w/o
> the disk shock protection?

No, unfortunately, I'm unable to reproduce this without the patch on my
machine. This is for purely technical reasons though because I'm using
ata_piix. Running a vanilla kernel, I'd expect everything to work just
fine except for one case: A non-SMP system using a driver that provides
the ->qc_defer() callback. Currently, the ->qc_defer() callback is the
only thing that can possibly send a non-zero return value to the scsi
midlayer. Once it does, however, the driver will only get a chance to
complete some qcs before ->qc_defer() is called again provided that
multithreading is supported.

So, what I'm saying is this: If the low level driver doesn't provide a
->qc_defer() callback, there is no (obvious) reason why
max_device_blocked and max_host_blocked should be set to 1 since libata
won't gain anything by it. However, it is not a bug either, even though
James considers it suboptimal and I will have to think about a solution
for my patch. On the other hand, once a driver defines the ->qc_defer()
callback, we really have a bug because things will go wrong once
->qc_defer() returns non-zero on a uniprocessor. So, in this case
max_device_blocked and max_host_blocked should be set to 1 on an SMP
system and *have to* be bigger than 1 otherwise.
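
In code, the conditional adjustment I have in mind would be something
along these lines (an untested sketch; where exactly libata sets these
values today may differ, so take the hook name and placement as
illustration only):

	static void ata_adjust_blocked_counts(struct ata_port *ap,
					      struct scsi_device *sdev)
	{
		/* Only lower the blocked counts when the LLDD actually
		 * supplies a ->qc_defer() callback, and keep them above 1
		 * on uniprocessor configurations.
		 */
		if (!ap->ops->qc_defer)
			return;	/* leave the midlayer defaults alone */

	#ifdef CONFIG_SMP
		sdev->max_device_blocked = 1;
		sdev->host->max_host_blocked = 1;
	#else
		sdev->max_device_blocked = 2;
		sdev->host->max_host_blocked = 2;
	#endif
	}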

Regards,

Elias


Re: Current qc_defer implementation may lead to infinite recursion

2008-02-10 Thread Tejun Heo
Elias Oltmanns wrote:
> Hi Tejun,
>
> due to your commit 31cc23b34913bc173680bdc87af79e551bf8cc0d libata now
> sets max_host_blocked and max_device_blocked to 1 for all devices it
> manages. Under certain conditions this may lead to system lockups due to
> infinite recursion as I have explained to James on the scsi list (kept
> you cc-ed). James told me that it was the business of libata to make
> sure that such a recursion cannot happen.
>
> In my discussion with James I imprudently claimed that this was easy to
> fix in libata. However, after giving the matter some thought, I'm not at
> all sure as to what exactly should be done about it. The easy bit is
> that max_host_blocked and max_device_blocked should be left alone as
> long as the low level driver does not provide the ->qc_defer() callback.
> But even if the driver has defined this callback, ata_std_qc_defer() for
> one will not prevent this recursion on a uniprocessor, whereas things
> might work out well on an SMP system due to the lock fiddling in the
> scsi midlayer.
>
> As a conclusion, the current implementation makes it imperative to leave
> max_host_blocked and max_device_blocked alone on a uniprocessor system.
> For SMP systems the current implementation might just be fine but even
> there it might just as well be a good idea to make the adjustment
> depending on ->qc_defer != NULL.

Hmmm... The reason why max_host_blocked and max_device_blocked are set
to 1 is to let libata re-consider status after each command completion
as blocked status can be rather complex w/ PMP.  I haven't really
followed the code yet but you're saying that blocked count of 2 should
be used for that behavior, right?

Another strange thing is that there hasn't been any such lock up /
infinite recursion report till now although the ->qc_defer mechanism has
been used widely for some time now.  Can you reproduce the problem w/o
the disk shock protection?

Thanks.

-- 
tejun