Erez Zilber wrote:
> 
> I enabled open-iscsi logging + added some printk calls when the abort
> handler returns.
> Here's the log. I see that iscsi_eh_cmd_timed_out gets called, but
> there's no abort.

> May 17 11:00:06 kpc36 kernel:  session1: iscsi_eh_cmd_timed_out scsi
> cmd ffff8101e30efe40 timedout
> May 17 11:00:06 kpc36 kernel:  session1: iscsi_eh_cmd_timed_out return
> timer reset

As you can see in iscsi_eh_cmd_timed_out, if the session is down then 
there is no point in letting the scsi eh run, since we have to relogin 
and restart the commands anyway. So we return reset timer, which 
prevents the scsi eh from running.

There is also code in there to check if we are in the middle of 
checking the connection. If we are, then we ask for some more time for 
the command, and that also prevents the scsi eh from running. This looks 
like it can be a problem, because we would get a response to our nop, 
which would update the last_recv field. If no progress was being made 
on the scsi command we would still ask to reset the timer, and we could 
end up in that loop forever, since the scsi layer does not cap the number 
of times you can reset the timer. I will send a patch to fix that.
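
To make the loop concrete, here is a minimal user-space sketch of the 
decision described above. It is a model only, not the libiscsi source: 
the enum, struct conn_model, cmd_timed_out() and resets_while_pinging() 
are hypothetical names, and it assumes a nop is outstanding each time 
the command timer fires.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical verdicts, modelled on "reset timer" vs. "let the scsi eh run". */
enum eh_verdict {
	EH_RESET_TIMER,		/* give the command more time, scsi eh does not run */
	EH_NOT_HANDLED		/* let the scsi eh run (abort etc.) */
};

/* Hypothetical connection state; only what the decision needs. */
struct conn_model {
	bool session_up;	/* false while we are relogging in */
	bool ping_in_flight;	/* a nop was sent and the connection is being checked */
};

static enum eh_verdict cmd_timed_out(const struct conn_model *c)
{
	assert(c != NULL);

	/* Session down: commands get restarted after relogin anyway. */
	if (!c->session_up)
		return EH_RESET_TIMER;

	/* Connection check in progress: ask for more time. */
	if (c->ping_in_flight)
		return EH_RESET_TIMER;

	return EH_NOT_HANDLED;
}

/*
 * Assumption: the target keeps answering nops but never makes progress
 * on the command, so every timeout finds a connection check in progress.
 * Nothing here looks at the command itself, so nothing ends the loop.
 */
static unsigned int resets_while_pinging(unsigned int timeouts)
{
	struct conn_model c = { .session_up = true, .ping_in_flight = true };
	unsigned int resets = 0;

	for (unsigned int i = 0; i < timeouts; i++)
		if (cmd_timed_out(&c) == EH_RESET_TIMER)
			resets++;

	return resets;
}
```

In this model resets_while_pinging(n) returns n for any n, which is the 
uncapped loop: the verdict depends only on connection state, never on 
whether the timed out command moved any data.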


However, that probably will not fix your problem.


For your specific setup, it looks like we hit 
iscsi_eh_cmd_timed_out and reset the scsi command timer because we are in 
the middle of checking the connection with the nop/ping, but then 
the nop/ping does not return in time and so we drop the session:

   connection1:0: ping timeout of 5 secs
expired, recv timeout 5, last rx 4526718494, last ping 4526723494, now
4526728494
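
The three timestamps in that message can be checked by hand. The sketch 
below does the subtraction, assuming the values are jiffies and that 
HZ is 1000 on this box (an assumption, but one that matches the 5 sec 
values printed in the same line). The struct and function names are 
made up for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Timestamps copied from the "ping timeout" log line above. */
struct ping_log {
	uint64_t last_rx;	/* last PDU received */
	uint64_t last_ping;	/* nop sent */
	uint64_t now;		/* ping timer expired */
};

static const struct ping_log logged = {
	.last_rx   = 4526718494ULL,
	.last_ping = 4526723494ULL,
	.now       = 4526728494ULL,
};

/* Convert a tick count to whole seconds; hz is an assumed tick rate. */
static uint64_t ticks_to_secs(uint64_t ticks, uint64_t hz)
{
	assert(hz > 0);
	return ticks / hz;
}

/* How long the connection was silent before the nop went out. */
static uint64_t idle_before_ping_secs(const struct ping_log *l, uint64_t hz)
{
	return ticks_to_secs(l->last_ping - l->last_rx, hz);
}

/* How long we waited for the nop reply before giving up. */
static uint64_t wait_for_reply_secs(const struct ping_log *l, uint64_t hz)
{
	return ticks_to_secs(l->now - l->last_ping, hz);
}
```

Both differences are 5000 ticks. With HZ at 1000 that is 5 secs of 
silence before the nop was sent (the recv timeout) and another 5 secs 
without a reply (the ping timeout), so nothing was received from the 
target for 10 secs in total.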

That is why on the target you see it cleaning up commands. On the 
initiator you can see us cleaning up:

May 17 11:00:07 kpc36 kernel:  session1: iscsi_start_session_recovery
blocking session
May 17 11:00:07 kpc36 kernel:  session1: fail_scsi_tasks failing sc
ffff8101e30efe40 itt 0x13 state 3

And then later in the logs you will see us start the commands again when 
we are logged in again.


So you probably need to continue replying to nops when the r2t is 
dropped. I will fix it on the initiator side to detect if we are not 
getting IO for a specific command and then let the scsi eh run.
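
One way such a per-command check could look, as a rough user-space 
sketch and not the actual patch: remember when the command last moved 
data and when it last timed out, and only ask for more time if there 
was progress in between. struct task_model, its fields and 
task_timed_out() are hypothetical names for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum eh_verdict {
	EH_RESET_TIMER,		/* give the command more time */
	EH_NOT_HANDLED		/* let the scsi eh run */
};

/* Hypothetical per-command bookkeeping, in timer ticks. */
struct task_model {
	unsigned long last_progress;	/* last PDU sent or received for this command */
	unsigned long last_timeout;	/* last time the command timer fired */
};

/* Wrap-safe "a is later than b", same idea as the kernel's time_after(). */
static bool tick_after(unsigned long a, unsigned long b)
{
	return (long)(b - a) < 0;
}

static enum eh_verdict task_timed_out(struct task_model *t, unsigned long now)
{
	enum eh_verdict v;

	assert(t != NULL);

	/* Progress since the previous timeout earns one more timer period. */
	if (tick_after(t->last_progress, t->last_timeout))
		v = EH_RESET_TIMER;
	else
		v = EH_NOT_HANDLED;

	t->last_timeout = now;
	return v;
}
```

With this, a nop reply on the connection does not help a stalled 
command: if the command itself has moved no data since the previous 
timeout, the second timeout returns EH_NOT_HANDLED and the scsi eh 
gets to run.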
