On Jan 18, 2007, at 3:28 PM, Peter Tribble wrote:

Next test: Create a second zvol on my snv_54 target machine, and share it with iscsi.

I add it to the static-config on my S10U3 initiator machine. I can see it with format just
fine.

I then use it to create a zfs pool, and the client reports:

Jan 18 22:01:07 foobar iscsi: [ID 867852 kern.warning] WARNING: iscsi connection(5) protocol error - received an unsupported opcode: 0x41

Needless to say, something very strange is occurring here. The target uses various #defines while creating response PDUs, and 0x41 is not one of them. So the question is how you are getting a packet with such a value. Corruption of the data across the wire? Possible, but highly unlikely unless you've got a really bad network setup.
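
To make the opcode point concrete, here is a minimal sketch of how byte 0 of an iSCSI PDU header can be decoded. It assumes the opcode values and the immediate ("I") bit layout from RFC 3720; it is not the Solaris initiator or target code, and the function names are made up for illustration. Whether the initiator logs the raw byte or a masked value is also an assumption. Under that reading, 0x41 would be opcode 0x01 with the immediate bit set, which is something an initiator sends, not a target.

```python
# Opcode values as defined in RFC 3720. Illustration only: the names
# describe_first_byte and is_valid_from_target are hypothetical and do
# not come from the Solaris sources.
IMMEDIATE_BIT = 0x40   # "I" bit in byte 0 of the Basic Header Segment
OPCODE_MASK = 0x3F     # the low six bits carry the opcode

TARGET_OPCODES = {
    0x20: "NOP-In",
    0x21: "SCSI Response",
    0x22: "Task Management Function Response",
    0x23: "Login Response",
    0x24: "Text Response",
    0x25: "SCSI Data-In",
    0x26: "Logout Response",
    0x31: "Ready To Transfer (R2T)",
    0x32: "Asynchronous Message",
    0x3F: "Reject",
}

INITIATOR_OPCODES = {
    0x00: "NOP-Out",
    0x01: "SCSI Command",
    0x02: "Task Management Function Request",
    0x03: "Login Request",
    0x04: "Text Request",
    0x05: "SCSI Data-Out",
    0x06: "Logout Request",
    0x10: "SNACK Request",
}


def describe_first_byte(byte0):
    """Split byte 0 of a PDU header into (immediate flag, opcode, description)."""
    immediate = bool(byte0 & IMMEDIATE_BIT)
    opcode = byte0 & OPCODE_MASK
    if opcode in TARGET_OPCODES:
        name = "target PDU: " + TARGET_OPCODES[opcode]
    elif opcode in INITIATOR_OPCODES:
        name = "initiator PDU: " + INITIATOR_OPCODES[opcode]
    else:
        name = "not defined by RFC 3720"
    return immediate, opcode, name


def is_valid_from_target(byte0):
    """True only if the raw byte is one a target may legitimately send."""
    # Target-to-initiator PDUs do not use the immediate bit, so the raw
    # byte is compared directly against the target opcode table.
    return byte0 in TARGET_OPCODES


if __name__ == "__main__":
    print(describe_first_byte(0x41))
    print(is_valid_from_target(0x41))
```

If this reading is right, the initiator saw a byte that looks like a request rather than a response, which would fit either a corrupted header or a loss of PDU framing on the connection. That is speculation, not a diagnosis.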

Jan 18 22:01:09 foobar scsi_vhci: [ID 734749 kern.warning] WARNING: vhci_scsi_reset 0x1
Jan 18 22:01:09 foobar iscsi: [ID 286457 kern.notice] NOTICE: iscsi connection(10) unable to connect to target iqn.1986-03.com.sun:02:4f0bfc3c-008e-41a9-e995-d91cdddff311 (errno:146)
Jan 18 22:01:09 foobar iscsi: [ID 286457 kern.notice] NOTICE: iscsi connection(5) unable to connect to target iqn.1986-03.com.sun:02:7c606b5f-cc7e-ca47-8759-94d344519af1 (errno:146)

(The first connection is the new target, the second one is the old one.)

And I have another core from iscsitgtd on the server.

[EMAIL PROTECTED] mdb - core.iscsi2
Loading modules: [ libavl.so.1 libc.so.1 libnvpair.so.1 libuutil.so.1 ld.so.1 ]
> $c
libc.so.1`_lwp_kill+8(6, ffffffffffffffec, ffffffff7dc00000, 0, 0, 5)
libc.so.1`abort+0x10c(1, 4960, 6, 0, aafe4, 4800)
libc.so.1`_assert+0x70(10002aad0, 10002aad8, 253, 7000, aac98, ffffffff7dc00000)
t10_cmd_state_machine+0x444(100261090, 2, 19c, 100000, 8, 100137000)
t10_cmd_shoot_event+0x1c(100261160, 2, 0, 4, 100261090, 100261140)
spc_mselect_data+0xa0(100261090, 0, 0, 10013cac0, 10013cac4, 0)
trans_rqst_dataout+0xc0(100261090, 10013cac0, e, 0, 0, 10013cad7)
spc_mselect+0x7c(100261090, 0, 0, e, 10, 10013c4e0)
lu_runner+0x148(100261260, 0, 6, 200, 100137f70, 100139e60)
libc.so.1`_lwp_start(0, 0, 0, 0, 0, 0)


Well, the code (t10_cmd_state_machine) has detected an illegal state transition. The next question is why. Considering the bogus data that the initiator received, it's possible that the target received something just as unlikely, causing it to attempt something equally weird.
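
For anyone unfamiliar with why an unexpected event ends in abort(), here is a rough sketch of a table-driven command state machine that refuses unknown transitions. The states, events, and function name are entirely hypothetical and are not taken from t10_cmd_state_machine; the only thing borrowed from the stack trace is the pattern of a failed assertion followed by abort.

```python
# Hypothetical states and events, chosen only to illustrate the pattern.
# The real daemon defines its own set.
ALLOWED_TRANSITIONS = {
    ("FREE", "CMD_RECEIVED"): "IN_PROGRESS",
    ("IN_PROGRESS", "NEED_DATA_OUT"): "WAIT_DATA_OUT",
    ("WAIT_DATA_OUT", "DATA_OUT_ARRIVED"): "IN_PROGRESS",
    ("IN_PROGRESS", "DONE"): "FREE",
}


def next_state(state, event):
    """Return the state that follows `state` on `event`, or fail loudly."""
    key = (state, event)
    if key not in ALLOWED_TRANSITIONS:
        # Plays the role of the failed assert in the trace: a pair the
        # table does not list is treated as a programming or protocol error.
        raise AssertionError("illegal transition: %s + %s" % (state, event))
    return ALLOWED_TRANSITIONS[key]


if __name__ == "__main__":
    state = next_state("FREE", "CMD_RECEIVED")
    state = next_state(state, "NEED_DATA_OUT")
    print(state)
```

In a design like this, a single out-of-sequence event is enough to take the process down, which is why bad input on the wire and a daemon core can show up together.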

I know this doesn't help much, but this behavior is too strange for words.

Would you feel comfortable shipping me the core and binary? Remember that since the daemon core could contain data blocks from your backing store, I might see something you would wish to keep private.

Thanks to the magic of SMF, iscsitgtd restarted, and everything seems to
be operating normally now.

--
-Peter Tribble
http://www.petertribble.co.uk/ - http://ptribble.blogspot.com/
_______________________________________________
storage-discuss mailing list
[email protected]
http://mail.opensolaris.org/mailman/listinfo/storage-discuss

----
Rick McNeal

"If ignorance is bliss, this lesson would appear to be a deliberate attempt on your part to deprive me of happiness, the pursuit of which is my unalienable right according to the Declaration of Independence. I therefore assert my patriotic prerogative not to know this material. I'll be out on the playground." -- Calvin

