On 4/27/14, 10:38 AM, Cale, Yonatan wrote:
Hi,
Our sim module is above the scsi layer (not between the iscsi&scsi layers), so 
I think this already rules out this guess.

What we do is something like this:
-Send scsi command
-If we didn't get a response after X seconds,
--Abort the command (perhaps many times, if the abort fails)

So, do we add some prints somewhere new?


I should have written that *you* have to add some printks in the scsi/block layer :) As iscsi maintainer I am happy to help all vendors with iscsi-related issues, as you have seen in this thread, but I work for Fusion-io on their FC/SRP/iSCSI target, ION, so I do not have time to debug all kernel layers for a multi-billion dollar company like EMC :)

If I hit this problem with our product, I would look over the scsi scan code, since those are the commands we see timing out, and check how it handles timeout failures for REPORT LUNS and INQUIRY commands.

The problem is probably this: the scsi layer tried to send a REPORT LUNS, which timed out because your target did not respond for whatever reason. The scsi layer then treated the failure as if the target did not support REPORT LUNS, rather than as a plain timeout, and dropped down to a sequential scan as a result. So all those INQUIRYs in the logs are not retries; they are the scsi layer checking whether a LU is behind lun0, lun1, lun2, ... lun(N = MAX_UNSIGNED_INT).
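
To make that guess concrete, here is a rough user-space model of it. This is not the code in the kernel's scsi scan path; the enum, the function name model_inquiry_count and its parameters are all made up for illustration, and the key assumption (a timed out REPORT LUNS is handled the same way as an unsupported one) is exactly the thing that would need to be confirmed in the real code.

```c
#include <stdint.h>

/* Hypothetical outcomes of a REPORT LUNS attempt (not kernel types). */
enum report_luns_result {
    REPORT_LUNS_OK,          /* target returned a LUN list */
    REPORT_LUNS_UNSUPPORTED, /* target rejected the command */
    REPORT_LUNS_TIMED_OUT    /* target never answered */
};

/*
 * Number of INQUIRY probes the modelled scan sends after lun0.
 *
 * reported_luns: entries in the REPORT LUNS reply (used only on success).
 * max_lun:       the host's LUN limit (used only for the fallback).
 *
 * Assumption being illustrated: a timeout is not distinguished from
 * "REPORT LUNS not supported", so both fall back to a sequential scan.
 */
static uint64_t model_inquiry_count(enum report_luns_result r,
                                    uint64_t reported_luns,
                                    uint64_t max_lun)
{
    if (r == REPORT_LUNS_OK)
        return reported_luns;

    /* Fallback: probe lun1 .. lun(max_lun - 1), one INQUIRY each. */
    return max_lun > 0 ? max_lun - 1 : 0;
}
```

With a large LUN limit and a target that does not answer, the fallback branch alone would account for a very long run of INQUIRYs in the logs without any of them being a retry.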

If that is not the problem, I would add debug code to scsi_request_fn/scsi_dispatch_cmd and scsi_done/scsi_softirq_done/scsi_decide_disposition/scsi_finish_command/scsi_io_completion to see why those INQUIRYs are retried when they should be failed.


I'd like to say again, that this bug happens with one version of VNX but not 
with another version. Do you think that might give us a hint?


Yes. I would guess your other VNX versions reply to the scsi scan related IO, so we do not hit this problem, where the scsi scan IO timed out and the IO is now either endlessly retried or we drop down to a sequential scan. Again, if I worked for EMC, I would compare the logs for the different versions to see what behavior changed.

Hope this helps. If you have even the slightest hunch it is an iscsi code problem, come back and bug me, because I really do not care what vendor you are from when fixing iscsi bugs.




-----Original Message-----
From: Mike Christie [mailto:micha...@cs.wisc.edu]
Sent: Thursday, April 24, 2014 10:04 PM
To: Cale, Yonatan
Cc: open-iscsi@googlegroups.com; myselfandfr...@gmail.com
Subject: Re: Target reboot -> iscsiadm rescan Stuck

On 04/22/2014 04:10 AM, Cale, Yonatan wrote:
-----Original Message-----
From: Mike Christie [mailto:micha...@cs.wisc.edu]
Sent: Tuesday, April 22, 2014 12:38 AM
To: Cale, Yonatan
Cc: open-iscsi@googlegroups.com; myselfandfr...@gmail.com
Subject: Re: Target reboot -> iscsiadm rescan Stuck

Do you have some module that is hooking into the scsi layer or iscsi modules? Just 
wondering what the "sim_try_to_abort_cmd" call is. Where are you hooking in?
"sim" is our module that handles iscsi data-path. We hook for
notifications in order to know if we should cancel a command


Hey, does your sim module that handles the data path just monitor, or does it 
handle the error codes that the iscsi modules return? The problem is that the 
iscsi layer is trying to fail a scsi scan related command, but whatever layer 
is above it (I thought it was just the scsi layer, like normal, in my other 
response) just keeps retrying it. Does your module do anything to IO failed with

#define DID_TRANSPORT_FAILFAST  0x0f /* Transport class fastfailed the io */

from the queuecommand path? Is it the one forcing the retry? That would explain 
why we do not see anything from the scsi scan layer debug printks.
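
For reference, a minimal sketch of how a layer above the iscsi modules could recognize that status before deciding to resubmit a command. The helper names result_host_byte and transport_fastfailed are hypothetical and are not taken from the sim module or from the kernel; the only facts relied on are the DID_TRANSPORT_FAILFAST value quoted above and that the host byte sits in bits 16-23 of the SCSI result word.

```c
#include <stdbool.h>
#include <stdint.h>

/* Value quoted above from the kernel's SCSI headers. */
#define DID_TRANSPORT_FAILFAST 0x0f

/* The host byte occupies bits 16-23 of the SCSI result word. */
static inline uint8_t result_host_byte(uint32_t result)
{
    return (uint8_t)((result >> 16) & 0xff);
}

/*
 * Hypothetical check for an upper layer: true when the transport class
 * fast-failed the IO, i.e. the command should be completed with an
 * error rather than resubmitted.
 */
static inline bool transport_fastfailed(uint32_t result)
{
    return result_host_byte(result) == DID_TRANSPORT_FAILFAST;
}
```

If a module resubmits commands for which a check like transport_fastfailed() is true, that module, and not the scsi scan code, would be the one forcing the retries.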

If not, then it is the scsi or block layer and we will have to add some printks 
in there.


