On Jan 18, 2007, at 4:29 PM, Peter Tribble wrote:
> On 1/18/07, Rick McNeal <[EMAIL PROTECTED]> wrote:
>>> Jan 18 22:01:07 foobar iscsi: [ID 867852 kern.warning] WARNING: iscsi connection(5) protocol error - received an unsupported opcode:0x41
>>
>> Needless to say something very strange is occurring here. The target uses various #defines while creating response PDUs and 0x41 is not part of those #defines. So how you are getting a packet with such a value is the question. Corruption of the data across the wire? Possible, but highly unlikely unless you've got a real bad network setup.
>
> I have seen network corruption here, but only the once.
>
> I was discussing issues around this last night with some of the folks at the London OpenSolaris User Group meeting. OK, so you've got checksums on the zpool supplying the target, and checksums locally in zfs, but what protects the packets as they move across the wire?

The iSCSI protocol has the ability to add checksums to the header and/or data portions. You can enable the digests on the initiator with the following commands:

    iscsiadm modify target-param --headerdigest CRC32 <target_name>
    iscsiadm modify target-param --datadigest CRC32 <target_name>

Just so that you're aware, the Solaris Target automatically accepts whatever the initiator wants to use for digests (enabled or not). The way the protocol is written, both target and initiator have to agree to use digests before they are sent. So, it's possible that if you were to use another vendor's iSCSI Target you might need to enable digests there as well. I took the approach that if the administrator sets up the initiator to use digests they know what they want; they shouldn't have to log into the target and make a change there as well.
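
For anyone wondering what the digests actually buy you on the wire, here is a rough Python sketch; it is not the Solaris initiator or target code. It assumes the digest is CRC32C (the Castagnoli polynomial that RFC 3720 specifies) and that negotiation follows the usual list rule, where the responder picks the first offered value it supports. The function names and the example offer list are made up for illustration.

```python
# Illustrative sketch only; names here are hypothetical, not Solaris code.

CRC32C_POLY_REFLECTED = 0x82F63B78  # Castagnoli polynomial, bit-reflected


def crc32c(data: bytes) -> int:
    """Bitwise CRC32C over a byte string (slow, but easy to follow)."""
    crc = 0xFFFFFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            # Shift out one bit; fold in the polynomial when that bit was set.
            crc = (crc >> 1) ^ (CRC32C_POLY_REFLECTED if crc & 1 else 0)
    return crc ^ 0xFFFFFFFF


def negotiate_digest(offered, supported):
    """Pick the first offered digest value that the responder supports."""
    for value in offered:
        if value in supported:
            return value
    # Assumption: with no common value the login negotiation simply fails.
    raise ValueError("no common digest value")


if __name__ == "__main__":
    payload = b"123456789"
    digest = crc32c(payload)
    # Flip one bit, as a corrupted packet might, and compare digests.
    corrupted = bytes([payload[0] ^ 0x40]) + payload[1:]
    print("digest:", hex(digest))
    print("corruption detected:", crc32c(corrupted) != digest)
    # A target that accepts either setting goes along with the initiator.
    print("agreed:", negotiate_digest(["CRC32C", "None"], {"CRC32C", "None"}))
```

With digests negotiated, the receiver recomputes the value over the header or data segment and compares it with the one carried in the PDU, so a flipped bit in flight shows up as a digest mismatch instead of being handed up the stack as a bogus opcode.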

>>> And I have another core from iscsitgtd on the server.
>>>
>>> [EMAIL PROTECTED] mdb - core.iscsi2
>>> Loading modules: [ libavl.so.1 libc.so.1 libnvpair.so.1 libuutil.so.1 ld.so.1 ]
>>> > $c
>>> libc.so.1`_lwp_kill+8(6, ffffffffffffffec, ffffffff7dc00000, 0, 0, 5)
>>> libc.so.1`abort+0x10c(1, 4960, 6, 0, aafe4, 4800)
>>> libc.so.1`_assert+0x70(10002aad0, 10002aad8, 253, 7000, aac98, ffffffff7dc00000)
>>> t10_cmd_state_machine+0x444(100261090, 2, 19c, 100000, 8, 100137000)
>>> t10_cmd_shoot_event+0x1c(100261160, 2, 0, 4, 100261090, 100261140)
>>> spc_mselect_data+0xa0(100261090, 0, 0, 10013cac0, 10013cac4, 0)
>>> trans_rqst_dataout+0xc0(100261090, 10013cac0, e, 0, 0, 10013cad7)
>>> spc_mselect+0x7c(100261090, 0, 0, e, 10, 10013c4e0)
>>> lu_runner+0x148(100261260, 0, 6, 200, 100137f70, 100139e60)
>>> libc.so.1`_lwp_start(0, 0, 0, 0, 0, 0)
>>
>> Well, the code (t10_cmd_state_machine) has detected an illegal state transition. The next question is why. Considering the bogus data that the initiator received it's possible that the target has received something just as unlikely causing it to attempt to do something equally weird.
>>
>> I know this doesn't help much, but this behavior is too strange for words. ;-)
>>
>> Would you feel comfortable shipping me the core and binary? Remember that since the daemon core could contain data blocks from your backing store I might see something you would wish to keep private.
>
> I'll send it once I get into work in the morning.

Okay.

> --
> -Peter Tribble
> http://www.petertribble.co.uk/ - http://ptribble.blogspot.com/

----
Rick McNeal

"If ignorance is bliss, this lesson would appear to be a deliberate attempt on your part to deprive me of happiness, the pursuit of which is my unalienable right according to the Declaration of Independence. I therefore assert my patriotic prerogative not to know this material. I'll be out on the playground." -- Calvin

_______________________________________________
storage-discuss mailing list
[email protected]
http://mail.opensolaris.org/mailman/listinfo/storage-discuss
