>The kernel of current Linux distributions does not support
>multiple paths to a DASD device at all.
>A workaround is to spread the data over multiple devices
>using LVM or MD in striping mode. Using as many devices
>as there are paths available (or a multiple of that number)
>should fit best.
>This problem is already addressed in the current (experimental)
>2.4.17 code.


Hi Carsten,

I work on the Enterprise Volume Management System (EVMS) project. You can
look us up on SourceForge. We support LVM and MD volumes in addition to
EVMS volumes, compatibility volumes, etc. I don't understand how any
flavor of RAID is a workaround for not supporting multipath. This is not
... I repeat ... not ... a performance question at all. So, let me
explain a bit further ... you undoubtedly know all this ... A Linux block
device driver reports disks to the kernel (2.4) by calling
register_blkdev(), I think. If the block device driver is incapable of
recognizing alternate paths to the same device, either by SCSI ID or
ch-cu-dev addressing or whatever, then later on you're going to find
multiple instances of the same disk in the gendisk list. Then our EVMS
logical device manager walks the gendisk list and thinks it is finding
unique disks ... but it truly isn't. It hands the logical disks up the
feature stack so that disk segments, regions, containers ... volumes can be
discovered ... only it's now handing up multiple instances of the same
disk. I actually work on the user-mode configuration tools and honestly
don't know much about the kernel side of things. However, my engine plugins
(I write features like drive linking and segment managers like mdos or the
390 segmgr) need to be able to run discovery paths just like the kernel,
and life suddenly gets tougher for me when I might see multiple instances
of the same physical disk.

You say that 2.4.17 supports multipath. I built a 2.4.17 EVMS kernel to
test this out on ... I don't think my 390 has multiple channel paths
configured ... at least I don't think so ... and so I tried the following:

shutdown -h now
cp link * 208 209 mw
ipl 200 clear

And now I have dasdj showing up on my machine. This isn't a very good
test case, but it does seem to simulate what can happen when the same
physical device appears more than once in the gendisk list. Currently, I
look for disks with the same volume ID. So dasdi and dasdj are both
going to report a volid of 0x0208 (in EBCDIC), and I inspect them a bit
more closely ... by writing a test pattern to the volid field on one of
the disks and then reading the volume label on the second disk to see if
the test pattern appears in its volid field. If it does ... then I am
looking at the same disk through different gendisk entries ... restore the
volid on the disk ... remove the second instance of the disk ... continue
along the volume discovery path. OK, this seems to allow me to recognize
multiple entries ... prevent anyone from trying to allocate a data extent
on the second entry ... even do failover to the second entry if the first
starts producing I/O errors ... but I can't test for channel busy and start
the I/O on a non-busy chpid to get improved performance. Plus ... I now
have a paranoid segment manager on the 390 (a plugin that manages a disk
data extent) that is worried about multipath devices when it should only be
concerned with managing segments.
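
The volid probe described above can be sketched roughly as follows. This is
only a user-space simulation in Python, not EVMS code: the Disk class, the
function names, and the VOLID_OFFSET and VOLID_LEN values are assumptions
made up for illustration, and a shared bytearray stands in for one physical
disk that is reachable through two gendisk entries.

```python
import os

VOLID_OFFSET = 4  # assumed position of the volid inside the label block
VOLID_LEN = 6     # assumed length of the volid field


class Disk:
    """Hypothetical stand-in for one gendisk entry.

    Two entries that share the same 'media' object model two paths
    to the same physical disk.
    """

    def __init__(self, name, media):
        self.name = name
        self.media = media

    def read_volid(self):
        return bytes(self.media[VOLID_OFFSET:VOLID_OFFSET + VOLID_LEN])

    def write_volid(self, value):
        if len(value) != VOLID_LEN:
            raise ValueError("volid must be exactly VOLID_LEN bytes")
        self.media[VOLID_OFFSET:VOLID_OFFSET + VOLID_LEN] = value


def same_physical_disk(a, b):
    """Return True if entries a and b appear to address the same disk."""
    saved = a.read_volid()
    if saved != b.read_volid():
        return False  # different volids: no need to probe
    pattern = os.urandom(VOLID_LEN)
    while pattern == saved:  # make sure the test pattern really differs
        pattern = os.urandom(VOLID_LEN)
    try:
        a.write_volid(pattern)
        # If the pattern shows up through b, both entries hit the same media.
        return b.read_volid() == pattern
    finally:
        a.write_volid(saved)  # always restore the original volid


def unique_disks(disks):
    """Keep the first entry seen for each physical disk, drop later aliases."""
    kept = []
    for disk in disks:
        if not any(same_physical_disk(k, disk) for k in kept):
            kept.append(disk)
    return kept
```

In this sketch, two Disk objects built on the same bytearray are reported as
one disk, while a disk that merely carries the same volid on separate media
is kept. The probe writes to the label, so a real implementation would also
have to worry about other readers seeing the test pattern while it is in
place.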

So ... I'd -LOVE- to see multipath support on s390 that would give me the
comfort of knowing I won't ever see multiple instances of the same physical
disk being reported and also ... the added benefit of improved performance
by utilizing non-busy channel paths. But my test case seems to show that
on a 2.4.17 kernel I will still see multipath disks showing up in the
gendisk list multiple times. Is this just a bad test case?

Also, I think your suggested configuration was to make the number of
stripes equal to the number of channel paths to the device. Would it not be
more desirable to have each physical volume on a separate control unit with
multiple channel paths to each control unit? This would seem to lessen the
chance of getting channel busy back when trying to read/write a stripe to
its device.


Kind regards ... and thanks!

Don Mulvey
IBM Austin
Linux Technology Center
(512) 838-1787
