John Forte wrote:

> What you are trying to do should be possible so no, it is not expected
> behavior.

This is being done today with Shared QFS configured with multiple
iSCSI initiators, all accessing a single iSCSI Target and a single
QFS metadata server.

> The errno (146) is indicating connection refused which would
> seem to imply that there is no listening socket at the time. With that
> said, I did just try this on a build 70b  initiator and target and it
> worked fine for me. Based on the below message, it looks like the
> initiator is Solaris 10 based, is that correct?

We are seeing a problem with the iSCSI Target resulting in the
messages below: an undefined state is reached in the T10 State
Machine, the assertion fails, and the daemon process restarts after
writing a process core file. This drops all iSCSI Initiator
connections, and eventually a re-login sequence happens once the
iSCSI Target restarts. Depending on what processing state the
Solaris 10 iSCSI Initiator is in during this restart, a TCP/IP
socket can get stuck returning errno 146 (ECONNREFUSED). This
doesn't happen with the Nevada iSCSI Initiator, but I'm not sure
whether the fix was in the iSCSI Target or in the Solaris TCP/IP
stack for Nevada, since I have not been able to root-cause the
failure.

libc.so.1`_lwp_kill+8(6, ffffffffffffffec, ffffffff7dc00000, 0, 0, 5)
libc.so.1`abort+0x10c(1, 4960, 6, 0, aafe4, 4800)
libc.so.1`_assert+0x70(10002aad0, 10002aad8, 253, 7000, aac98,  
ffffffff7dc00000)
t10_cmd_state_machine+0x444(100261090, 2, 19c, 100000, 8, 100137000)
t10_cmd_shoot_event+0x1c(100261160, 2, 0, 4, 100261090, 100261140)
spc_mselect_data+0xa0(100261090, 0, 0, 10013cac0, 10013cac4, 0)
trans_rqst_dataout+0xc0(100261090, 10013cac0, e, 0, 0, 10013cad7)
spc_mselect+0x7c(100261090, 0, 0, e, 10, 10013c4e0)
lu_runner+0x148(100261260, 0, 6, 200, 100137f70, 100139e60)
libc.so.1`_lwp_start(0, 0, 0, 0, 0, 0)

The work-around on Solaris 10 is to disable and then re-enable the
iSCSI Initiator discovery method; of course, a final fix is needed
in the iSCSI Target's T10 State Machine.
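A rough sketch of that disable/enable cycle, assuming the initiator
is using SendTargets discovery (if you use static or iSNS discovery,
substitute the corresponding iscsiadm option):

```shell
# Hypothetical sketch of the Solaris 10 work-around, assuming
# SendTargets discovery is the method in use.

# Temporarily disable SendTargets discovery; this tears down the
# initiator sessions, including the one stuck on errno 146.
iscsiadm modify discovery --sendtargets disable

# Re-enable it so the initiator performs a fresh login to the target.
iscsiadm modify discovery --sendtargets enable

# Confirm the sessions have come back up.
iscsiadm list target
```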

Also, the process core files can fill up the Solaris root device, so
if the iSCSI Target restarts are occurring at a high rate, please
take a look at coreadm and move the core files to a mounted
filesystem other than root ("/").
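For example, redirecting global core dumps might look roughly like
this; /var/cores is just an assumed mount point on a non-root
filesystem, so adjust it to your layout:

```shell
# Hypothetical coreadm sketch: keep target daemon cores off "/".
# /var/cores is an example path; pick a filesystem with free space.

# Create a directory on a non-root filesystem for core files.
mkdir -p /var/cores

# Set the global core file pattern (%f = executable name, %p = pid),
# so repeated crashes produce distinct, identifiable files.
coreadm -g /var/cores/core.%f.%p

# Enable global core dumps and log each dump via syslog.
coreadm -e global -e log
```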

Jim

> What release level is
> the target? I'm not saying that release level compatibility is the  
> issue
> but the error condition seems a bit puzzling. Have you looked at the
> logs on the target?
>
> - John
>
> Markus Halter wrote:
>> I am trying to share a single iscsi target between multiple  
>> initiators. As long as I have only one initiator using the  
>> target, there is no problem. As soon as I enable the initiator  
>> on the second node I get messages like the following:
>>
>> Nov 25 19:27:51 s0003 iscsi: NOTICE: iscsi connection(27) unable  
>> to connect to target iqn.1986-03.com.sun:02:87ffcc6a-61ff- 
>> cc07-8356-8eeaa5e1ab20 (errno:146)
>>
>> I get these messages on both initiator nodes. To me it looks like  
>> both initiator nodes race for only one connection on the target side.
>>
>> Is this the expected behavior? Is it not possible to share  
>> targets between multiple initiators? Am I doing something wrong?
>>
>> Any hints appreciated
>> Markus
>>
>>
>> This message posted from opensolaris.org
>> _______________________________________________
>> storage-discuss mailing list
>> [email protected]
>> http://mail.opensolaris.org/mailman/listinfo/storage-discuss
>>
>>
>
>

Jim Dunham
Storage Platform Software Group

Sun Microsystems, Inc.
1617 Southwood Drive
Nashua, NH 03063


