Hi,
On 06/22/2014 10:19 AM, Martin George wrote:
> So firstly, the question arises why your kernel marked all paths as
> failed when you hit this error. This actually resembles the old Linux
> behavior where for a device error such as a MEDIUM ERROR, it gets
> retried on all paths available to the LUN, all of which result in the
> same error, and hence all paths get marked as failed. This was addressed
> with the upstream patch at
> http://git.kernel.org/cgit/linux/kernel/git/stable/linux-stable.git/commit/?id=63583cca745f440167bf27877182dc13e19d4bcf,
> where more fine-grained error handling is now available.
Yes, it retries on all paths. The kernel version (3.2.57) which is used
in my case already includes the changes mentioned above.
> With this,
> device errors such as MEDIUM ERROR are no longer retried since it treats
> such errors as permanent errors. That makes me suspect your kernel is
> missing some of the key patches from the upstream kernel related to
> this error handling. And given that UNMAP has also been a
> relatively new feature which underwent several upstream revisions to get
> to the current stable state, it would be prudent for you to check if
> your kernel is up-to-date with its SCSI & UNMAP handling.
Currently I'm not able to reproduce the error I see in production (i.e.
getting this iSCSI response) in a very similar test setup, re-created
using the same hardware and software that is failing on me, which is a
bit confusing. :||

So, even worse, I'm not yet convinced that the actual problem is a Linux
kernel problem. Why is my NetApp filer sending a MEDIUM ERROR
"Incompatible medium installed" to me anyway in the other case?
The latest kernel code only prevents (afaics) the retry in a small
subset of cases, which does not include an ASC of 0x30 INCOMPATIBLE
MEDIUM INSTALLED.
    case MEDIUM_ERROR:
        if (sshdr.asc == 0x11 || /* UNRECOVERED READ ERR */
            sshdr.asc == 0x13 || /* AMNF DATA FIELD */
            sshdr.asc == 0x14) { /* RECORD NOT FOUND */
            set_host_byte(scmd, DID_MEDIUM_ERROR);
            return SUCCESS;
        }
        return NEEDS_RETRY;
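To make the fall-through concrete, here is a minimal user-space sketch that mirrors only the MEDIUM_ERROR branch quoted above. The enum and the function name medium_error_disposition are made up for illustration and are not kernel identifiers; the real code lives in the SCSI error handling path and does more than this.

```c
#include <stdint.h>

/* Hypothetical stand-ins for the kernel's return values (illustration only). */
enum disposition {
    DISP_NO_RETRY,    /* models: set_host_byte(DID_MEDIUM_ERROR); return SUCCESS */
    DISP_NEEDS_RETRY  /* models: return NEEDS_RETRY */
};

/*
 * Mirrors the quoted MEDIUM_ERROR branch: only three additional sense
 * codes are treated as permanent, everything else is retried.
 */
enum disposition medium_error_disposition(uint8_t asc)
{
    if (asc == 0x11 ||  /* UNRECOVERED READ ERR */
        asc == 0x13 ||  /* AMNF DATA FIELD */
        asc == 0x14)    /* RECORD NOT FOUND */
        return DISP_NO_RETRY;

    return DISP_NEEDS_RETRY;
}
```

With this model, medium_error_disposition(0x30) returns DISP_NEEDS_RETRY, which matches the observation that INCOMPATIBLE MEDIUM INSTALLED is still retried.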
> That said, it is indeed strange that you hit a MEDIUM ERROR in the first
> place, when using UNMAP. As described above, that's a device error. So
> does this fail even for other commands such as a regular write (you
> could try this with dd) or even a simple TUR command (like say using
> sg_turs -v /dev/mpathX)?
# sg_turs -v /dev/mapper/mpath_scylla0
test unit ready cdb: 00 00 00 00 00 00
The UNMAP is the only command that causes the failure. As long as I do
not cause an UNMAP to be sent (for example by running mkfs.ext4 without
-E nodiscard, running mkfs.btrfs without disabling discard, or issuing
an fstrim command), this multipathed LVM on iSCSI handles millions of
iSCSI read and write ops every day in production just fine. If an UNMAP
is sent, all iSCSI storage on the physical server hangs, as seen before.
Today I played around a bit in my test environment (where the failure
does not occur yet), also tcpdumping the iSCSI traffic, viewing it
afterwards using wireshark, and reading about the SCSI specs. That's a
very interesting way to learn more about what I'm talking about here. :-)
If I can't find an obvious way to trigger the same error in the test
environment, I think I'm going to propose triggering it again while
having the test physical server attached to the production LUNs.
From the past occurrence, I know that the only thing that breaks is
the storage connection on the physical server that executes the UNMAP.
It's still not the most reassuring choice, but it is a calculated risk.
If that's possible, I can capture the iSCSI traffic with tcpdump, along
with blktrace dumps, to record what's going on and post them here. Doing
so will show whether the SCSI error is actually being sent by the NetApp
device or not.
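For reading the captured sense data by hand, a small decoder along these lines could help. It is only a sketch: struct sense_info, decode_sense and is_incompatible_medium are names I made up, and it assumes you have already extracted the raw SCSI sense bytes from the capture. It handles the two standard layouts: fixed format (response code 0x70/0x71) and descriptor format (0x72/0x73).

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical container for the three fields of interest. */
struct sense_info {
    uint8_t key;   /* sense key, e.g. 0x03 = MEDIUM ERROR */
    uint8_t asc;   /* additional sense code */
    uint8_t ascq;  /* additional sense code qualifier */
};

/* Returns 0 on success, -1 if the buffer is too short or the format is unknown. */
int decode_sense(const uint8_t *buf, size_t len, struct sense_info *out)
{
    if (buf == NULL || out == NULL || len < 1)
        return -1;

    uint8_t code = buf[0] & 0x7f;  /* strip the VALID bit */

    if (code == 0x70 || code == 0x71) {  /* fixed format */
        if (len < 14)
            return -1;
        out->key = buf[2] & 0x0f;
        out->asc = buf[12];
        out->ascq = buf[13];
        return 0;
    }
    if (code == 0x72 || code == 0x73) {  /* descriptor format */
        if (len < 4)
            return -1;
        out->key = buf[1] & 0x0f;
        out->asc = buf[2];
        out->ascq = buf[3];
        return 0;
    }
    return -1;
}

/* True for MEDIUM ERROR with ASC 0x30 (INCOMPATIBLE MEDIUM INSTALLED). */
int is_incompatible_medium(const struct sense_info *s)
{
    return s->key == 0x03 && s->asc == 0x30;
}
```

If the decoded bytes from the production capture give key 0x03 and ASC 0x30, the error really did come over the wire from the target rather than being generated locally.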
--
Hans van Kranenburg - System / Network Engineer
T +31 (0)10 2760434 | [email protected] | www.mendix.com