On 07/01/2011 08:57 AM, Seth Simons wrote:
> Unfortunately there's very little technical documentation. This is one
> of HP's 'low end' SANs. I'll see if HP support has anything to say
> about it.
>
> I'm not really having luck with be2iscsi, now that i've managed to get
> it to login to the target, I'm getting the following kernel panic
> after around 15-20 minutes of the machine being up and logged in
> (copied from the console). Any ideas? I am using multipath.
>
The hung task warnings are saying that some IO has taken longer than the
hung task timeout value, which looks like it is 2 minutes for you. Are you
doing any type of port down/up test?

> Debian GNU/Linux 6.0 thm-vmutil01 ttyS0
>

Is there any line before this?

> thm-vmutil01 login: [ 2642.519605] connection1:0: Could not send nopout

The iscsi layer sends an iscsi ping (a nop iscsi packet) every
noop_out_interval seconds (node.conn[0].timeo.noop_out_interval in
iscsid.conf) if there is no scsi traffic. It also sends one when the target
sends us a nop (the target is pinging us) and we are replying to it. That
message indicates that the iscsi layer could not send a nop. A failure could
happen if the target sends us a lot of nops and we cannot allocate memory to
reply, if the driver cannot allocate memory, or if the session was not
logged in.

> [ 2662.491488] (beiscsi_process_cq():1953):CQ Error 13, reset CID 0x0...

Here we see the target drop the connection.

> [ 2662.547658] connection1:0: detected conn error (1011)

This is the iscsi layer logging that.

So what could have happened is that the target sent us a ping, we could not
allocate resources to reply, and so the target dropped the connection.

Or the log messages got logged out of order (the sending of the nop and the
handling of the target drop could happen on different processors). In that
case the target dropped the connection first, that set the session to the
not logged in state, and that caused the "Could not send nopout" failure.

> [ 2683.577311] device-mapper: multipath: Failing path 8:0.

Multipath sends a test command every so often. It just figured out that the
path that was affected by the target dropping the connection is down.

> [ 2791.052787] connection2:0: Could not send nopout
> [ 2811.027636] (beiscsi_process_cq():1953):CQ Error 13, reset CID 0x40...
> [ 2811.084076] connection2:0: detected conn error (1011)

The same thing happens to session2.

> [ 2833.072003] sd 0:0:0:0: timing out command, waited 180s

Here it means the command has been running for at least 180 secs.
The scsi layer is now going to fail it.

> [ 2833.120683] sd 0:0:0:0: [sda] Unhandled error code
> [ 2833.162344] sd 0:0:0:0: [sda] Result: hostbyte=DID_IMM_RETRY

Here is the strange thing. When the session goes down the iscsi layer will
temporarily requeue the IO with the code DID_IMM_RETRY. At the same time the
iscsi layer will set the devices/paths into the blocked state (see
/sys/block/sdX/device/state). So we should not be seeing that "waited 180s"
error.

What should happen is that the iscsi layer will block the devices/paths, and
IO will be queued. Then if we can log back in we will start IO again, or if
we cannot log in within node.session.timeo.replacement_timeout seconds, the
iscsi layer will unblock the devices/paths and fail IO upwards to the
block/multipath/FS layers.

So with the default setting of the replacement_timeout (120 secs) you should
be seeing the message

    session recovery timed out after X secs

before you see the hung task message below.

Is this easy to replicate? There is just too much going wrong here. If it
happens again, can you do

    cat /sys/block/sdX/device/state

and tell me the values, and run

    iscsiadm -m session -P 3

and send all the output?

Would you also be able to run a patch that will add some extra debugging to
the driver and iscsi layer?

I will try to contact HP and get access to a box like this. Jay is leaving
on vacation so I do not think he will be able to help for a couple of days.
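
If it is easier than running cat on each disk by hand, a small script along
these lines could collect the state of every sd device in one pass. This is
only a sketch: the function name read_device_states and its sys_block
parameter are made up for illustration, and it assumes the usual
/sys/block/sdX/device/state layout mentioned above. It does not replace the
iscsiadm -m session -P 3 output, which still has to be captured separately.

```python
import glob
import os


def read_device_states(sys_block="/sys/block"):
    """Return {device_name: state} for every sd* entry under sys_block.

    Hypothetical helper for illustration; sys_block is overridable so the
    function can be pointed at a test directory.
    """
    states = {}
    pattern = os.path.join(sys_block, "sd*", "device", "state")
    for state_path in sorted(glob.glob(pattern)):
        # .../sdX/device/state -> sdX
        dev = os.path.basename(os.path.dirname(os.path.dirname(state_path)))
        try:
            with open(state_path) as fh:
                states[dev] = fh.read().strip()
        except OSError as exc:
            # Record the failure instead of aborting the whole collection.
            states[dev] = "unreadable: %s" % exc
    return states


if __name__ == "__main__":
    for dev, state in read_device_states().items():
        print("%s: %s" % (dev, state))
```

Running it while the problem is occurring, and again after the sessions
recover or time out, would show whether the paths ever went into the blocked
state.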