Mike Christie wrote:
[ .. ]
> 
> I am not sure if we should be needing this if the target is operating
> within the RFC (there is one exception but I am not sure if you are
> hitting it).
> 
> In 3.2.2.1, I saw this:
> 
> An iSCSI target MUST be able to handle at least one immediate task
> management command and one immediate non-task-management iSCSI command
> per connection at any time.
> 
> 3.2.2.1 also has:
> 
> The target MUST silently ignore any non-immediate command outside of
> this range or non-immediate duplicates within the range. The CmdSN
> carried by immediate commands may lie outside the ExpCmdSN to MaxCmdSN
> range.
> 
> 
> I took this to mean even when the window is closed we can send a nop as
> immediate. What do you guys think?
> 
Totally beside the point.

We're not sending NOPs outside the CmdSN range, we're sending
_data_ PDUs outside the CmdSN range. Just make it a printk in
the above patch and start hitting the target hard.
You'll see an amazing number of messages ...

There is _nothing_ in the code which checks if the data PDU
we're about to send has a CmdSN within the target window.
And then we're hitting the quoted text and the target drops the
PDU, leading to a nice I/O stall.
Which is the I/O stall I've been fighting for _months_.
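
For illustration, the range test that is missing on the transmit path could look roughly like the sketch below. This is not the libiscsi code: sna_lte() and cmdsn_in_window() are hypothetical names chosen for the example. The comparison uses 32-bit serial number arithmetic so that it still works when CmdSN wraps around.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Serial-number "less than or equal": true when a <= b modulo 2^32,
 * assuming the two values are less than 2^31 apart.
 * (Hypothetical helper name, not taken from the kernel sources.)
 */
static bool sna_lte(uint32_t a, uint32_t b)
{
    return (uint32_t)(b - a) < 0x80000000u;
}

/*
 * Hypothetical check: true when cmdsn lies inside the window
 * [exp_cmdsn, max_cmdsn] advertised by the target.
 */
static bool cmdsn_in_window(uint32_t cmdsn, uint32_t exp_cmdsn,
                            uint32_t max_cmdsn)
{
    return sna_lte(exp_cmdsn, cmdsn) && sna_lte(cmdsn, max_cmdsn);
}
```

With an assumed ExpCmdSN of 5300, cmdsn_in_window(5319, 5300, 5318) is false while cmdsn_in_window(5318, 5300, 5318) is true. A non-immediate PDU in the first case is exactly what the quoted RFC text tells the target to ignore silently.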

> The initiator will only send 1 TMF as immediate per session at a time
> and it will only send one nop as a ping marked as immediate at a time.
> The only exception to use sending more than one non tmf immediate cmd is
> if the target sends us a nop-in we could have sent two nop-outs marked
> as immediate (for the nop-out in response to the target's nop-in,
> 10.18.1 says we have to set the I bit).
> 
> If we send too many nops marked as immediate we should be getting a
> reject pdu right? If so then I think we just need the attached patch
> which adds some code to handle rejected immediate pdus. The patch is
> made over scsi-rc-fixes and is only compile tested.
> 
> 
> Are you only seeing this with the one target? Could we confirm with them
> that they will accept one non tmf immediate command?
> 
> 
> If I am reading the RFC wrong, then for your patch, we want to move the
> check to below the check_mgmt label because iscsi_data_xmit can send
> multiple pdus. You probably just want to move it to
> iscsi_prep_mgmt_task(). Also I think we want to dequeue a nop as a ping
> so it does not timeout while the cmd window is closed (the problem would
> be is if the window was closed and then the connection goes bad - we
> would not be able to catch that).
> 

Yes, you are probably correct in that we'd need to move it into
the individual queue loops to be able to transmit as many PDUs as
possible.
With the original patch we run the risk of hitting the same
error once enough PDUs are queued.

I'll update the patch.
> 
> 
>>> +
>>
>> Looks good, From the time queuecommand did the check
>> (iscsi_check_cmdsn_window_closed)
>> a management command came in without checking and stuffed an entry
>> into the task queue.
>> Good catch.
>>
No, wrong. The check in queuecommand is by no means relevant
to the actual window.
We're checking the target window at the time queuecommand is run,
but we're _generating_ the CmdSN only much later after we've
dequeued the command.
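
A toy model of that gap, purely as a sketch: the struct and the toy_* functions below are made up for this example and only mimic the ordering described above (window check at queue time, CmdSN assignment at transmit time). They are not the libiscsi data structures.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified session state. */
struct toy_session {
    uint32_t cmdsn;     /* next CmdSN to assign; bumped only at transmit time */
    uint32_t max_cmdsn; /* last MaxCmdSN seen from the target */
    unsigned queued;    /* commands sitting in the transmit queue */
};

/* True when sn is beyond max in 32-bit serial number arithmetic. */
static bool toy_beyond_max(uint32_t sn, uint32_t max)
{
    return (uint32_t)(max - sn) >= 0x80000000u;
}

/* Queue-time check: looks only at the not-yet-advanced session CmdSN. */
static bool toy_queuecommand(struct toy_session *s)
{
    if (toy_beyond_max(s->cmdsn, s->max_cmdsn))
        return false;   /* window looks closed, refuse the command */
    s->queued++;        /* no CmdSN is assigned here */
    return true;
}

/*
 * Transmit-time path: assigns CmdSNs while draining the queue and
 * returns how many PDUs ended up outside the target window.
 */
static unsigned toy_xmit_all(struct toy_session *s)
{
    unsigned outside = 0;

    while (s->queued > 0) {
        uint32_t sn = s->cmdsn++;

        if (toy_beyond_max(sn, s->max_cmdsn))
            outside++;
        s->queued--;
    }
    return outside;
}
```

Starting from cmdsn 5300 and max_cmdsn 5303 (made-up numbers), ten calls to toy_queuecommand() all succeed because the session CmdSN never moves while commands are only being queued. toy_xmit_all() then stamps 5300 through 5309, and six of those lie beyond MaxCmdSN.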

And it's quite feasible to flood the xmitqueue with more commands
than can be transmitted, so the CmdSN window won't be updated for
a long time. In addition we're not incrementing the 
This allows us to stuff quite a few commands in the
xmitqueue. You can easily check this by e.g. doing a journal replay.
That puts out quite a lot of I/O in a short time:

Jul 22 12:29:46 esk kernel: [ 2164.102874]  connection1:0: cmd target window closed, cmd 5319 max 5318
Jul 22 12:29:46 esk kernel: [ 2164.138952]  connection1:0: cmd target window closed, cmd 5339 max 5338
Jul 22 12:29:46 esk kernel: [ 2164.177920]  connection1:0: cmd target window closed, cmd 5362 max 5361
Jul 22 12:29:46 esk kernel: [ 2164.213620]  connection1:0: cmd target window closed, cmd 5382 max 5381
Jul 22 12:29:46 esk kernel: [ 2164.251724]  connection1:0: cmd target window closed, cmd 5402 max 5401
Jul 22 12:29:46 esk kernel: [ 2154.265954]  connection2:0: cmd target window closed, cmd 5283 max 5282
Jul 22 12:29:46 esk kernel: [ 2164.298269]  connection1:0: cmd target window closed, cmd 5445 max 5444
Jul 22 12:29:46 esk kernel: [ 2164.298380]  connection1:0: cmd target window closed, cmd 5446 max 5445
Jul 22 12:29:46 esk kernel: [ 2164.298757]  connection1:0: cmd target window closed, cmd 5447 max 5446
Jul 22 12:29:46 esk kernel: [ 2164.299374]  connection1:0: cmd target window closed, cmd 5448 max 5447
Jul 22 12:29:46 esk kernel: [ 2164.299971]  connection1:0: cmd target window closed, cmd 5449 max 5448
Jul 22 12:29:46 esk kernel: [ 2164.300717]  connection1:0: cmd target window closed, cmd 5450 max 5449
Jul 22 12:29:46 esk kernel: [ 2164.301455]  connection1:0: cmd target window closed, cmd 5451 max 5450
Jul 22 12:29:46 esk kernel: [ 2164.302187]  connection1:0: cmd target window closed, cmd 5452 max 5451
Jul 22 12:29:46 esk kernel: [ 2164.302934]  connection1:0: cmd target window closed, cmd 5453 max 5452
Jul 22 12:29:46 esk kernel: [ 2164.303672]  connection1:0: cmd target window closed, cmd 5454 max 5453
Jul 22 12:29:46 esk kernel: [ 2164.304407]  connection1:0: cmd target window closed, cmd 5455 max 5454
Jul 22 12:29:46 esk kernel: [ 2164.305125]  connection1:0: cmd target window closed, cmd 5456 max 5455

(That check sits just after iscsi_prep_scsi_cmd_pdu() in iscsi_data_xmit().)
So as you can see, we're normally just off by one, so by the time the
PDU reaches the target there is a fair chance the target has already
processed some old ones and updated its (internal) CmdSN window.
But if not -> boom.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                   zSeries & Storage
h...@suse.de                          +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: Markus Rex, HRB 16746 (AG Nürnberg)
