On 06/23/2014 06:31 PM, Hans van Kranenburg wrote:
On 06/23/2014 01:30 AM, Hans van Kranenburg wrote:

If there's no obvious way to trigger the same error in the
test environment, I think I'm going to propose triggering it again
while having the test physical server attached to the production luns.
From the past occurrence, I know that the only thing that breaks is
the storage connection on the physical server that executes the UNMAP.
It's still not the most reassuring choice, but a kind of calculated
risk.

If that's possible, I can take a couple of tcpdumps of the iSCSI traffic
and blktrace dumps to capture what's going on and post them here. Doing so
will prove whether the SCSI error was actually being sent by the NetApp
device or not.

And that's what I just did, together with a colleague of mine. On one
lun, the NetApp box accepts unmap, on another lun it throws up with
Incompatible Medium Installed. All other iSCSI connections from other
physical servers to the same production lun are not impacted, only the
connection to this server.
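
To compare what the kernel believes about discard support on the working and the failing lun, something like the following could be run on the server. This is only a rough sketch: the helper name `discard_settings` is made up for illustration, and it assumes the standard `queue/discard_granularity` and `queue/discard_max_bytes` files under `/sys/block/<device>/`. It reports what the initiator side has configured, not what the NetApp side will actually accept.

```python
from pathlib import Path


def discard_settings(device, sysfs_root="/sys"):
    """Return the discard-related queue limits the kernel exposes for a block device.

    `device` is a kernel block device name such as "sdl". The helper itself
    is hypothetical; only the sysfs file names are standard.
    """
    queue = Path(sysfs_root) / "block" / device / "queue"
    settings = {}
    for name in ("discard_granularity", "discard_max_bytes"):
        path = queue / name
        # A missing file is reported as None instead of raising an error.
        settings[name] = int(path.read_text().strip()) if path.exists() else None
    return settings


# "sdl" is the path device that shows up in the kernel log of this report.
print(discard_settings("sdl"))
```

A `discard_max_bytes` of 0 would mean the kernel does not send discards to that device at all.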

[...]

For netapp-linux-community folks: the previous mail is still in the moderation queue. You can also read it in the debian bug report, including interesting tcpdump attachments with iscsi traffic while the errors occur: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=740701#87

Ok, for the sake of completeness, I installed the 3.14 kernel from wheezy-backports (linux-image-3.14-0.bpo.1-amd64 3.14.7-1~bpo70+1) and reran the test, which gives exactly the same results. Doing unmap on the test lun succeeds; doing unmap on the other lun results in the same behaviour and the same errors, in slightly different formatting than when using the 3.2 kernel:

[...]
Jun 23 23:29:51 jolteon kernel: [  678.219033] sd 9:0:0:0: [sdl] Unhandled sense code
Jun 23 23:29:51 jolteon kernel: [  678.219142] sd 9:0:0:0: [sdl]
Jun 23 23:29:51 jolteon kernel: [  678.219234] Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
Jun 23 23:29:51 jolteon kernel: [  678.219331] sd 9:0:0:0: [sdl]
Jun 23 23:29:51 jolteon kernel: [  678.219423] Sense Key : Medium Error [current]
Jun 23 23:29:51 jolteon kernel: [  678.219653] sd 9:0:0:0: [sdl]
Jun 23 23:29:51 jolteon kernel: [  678.219753] Add. Sense: Incompatible medium installed
Jun 23 23:29:51 jolteon kernel: [  678.219926] sd 9:0:0:0: [sdl] CDB:
Jun 23 23:29:51 jolteon kernel: [  678.220019] Unmap/Read sub-channel: 42 00 00 00 00 00 00 00 18 00
Jun 23 23:29:51 jolteon kernel: [  678.220946] device-mapper: multipath: Failing path 8:176.
[...]
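
For readers who want to check the CDB bytes in the log above, here is a minimal sketch that decodes them, assuming the 10-byte UNMAP layout from SBC-3 (opcode 0x42, ANCHOR bit in byte 1, group number in byte 6, big-endian parameter list length in bytes 7-8, control in byte 9). The function name `decode_unmap_cdb` is made up for illustration. The kernel prints "Unmap/Read sub-channel" because opcode 0x42 is shared between UNMAP on block devices and READ SUB-CHANNEL on multimedia devices.

```python
def decode_unmap_cdb(hex_bytes):
    """Decode a 10-byte UNMAP CDB given as a hex string, as printed in the kernel log."""
    cdb = bytes.fromhex(hex_bytes)  # whitespace between byte pairs is ignored
    if len(cdb) != 10 or cdb[0] != 0x42:
        raise ValueError("not a 10-byte CDB with opcode 0x42")
    return {
        "opcode": cdb[0],
        "anchor": bool(cdb[1] & 0x01),
        "group_number": cdb[6] & 0x1F,
        "parameter_list_length": int.from_bytes(cdb[7:9], "big"),
        "control": cdb[9],
    }


fields = decode_unmap_cdb("42 00 00 00 00 00 00 00 18 00")
print(fields["parameter_list_length"])  # prints 24
```

A parameter list length of 24 bytes corresponds to the 8-byte UNMAP parameter list header plus a single 16-byte block descriptor, so the failing command asks to unmap just one LBA range.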

By the way, the first message on this debian bug report, from Bill MacAllister, already listed the output of a very recent linux kernel when using the test case 'mkfs on jessie'.

That concludes the discussion about older or newer linux kernels. The real problem here is the NetApp, which returns the SCSI errors when UNMAP commands are issued to it.

Questions left:
- Is it desirable for the linux kernel multipathing to fail an I/O operation instead of retrying it when it receives the combination of a medium error and the additional sense code 'incompatible medium installed'?
- Now I'm left with my broken NetApp, and I'd like to start using UNMAP on it... Any comments from NetApp people reading this? There must be some reason why this is happening only on this specific lun, and not on the test lun or on several of the other NetApp filers we use.

--
Hans van Kranenburg - System / Network Engineer
T +31 (0)10 2760434 | [email protected] | www.mendix.com


