> The kernel of current Linux-Distributions does not support
> multiple pathes to a dasd device at all.
> A workaround is to spread the data over multiple devices
> using LVM or MD in striping mode. Using the same amount
> of devices like the amount of pathes available (or a multiple
> of it) should fit best.
> This problem is already addressed in the current (experimental)
> 2.4.17 code.

Hi Carsten,

I work on the Enterprise Volume Management System (EVMS) project. You can look us up on SourceForge. We support LVM and MD volumes in addition to EVMS volumes, compatibility volumes, etc.

I don't understand how any flavor of RAID is a workaround for not supporting multipath. This is not ... I repeat ... not ... a performance question at all. So, let me explain a bit further ... you undoubtedly know all this ...

A Linux block device driver reports disks to the kernel (2.4) by calling register_blkdev(), I think. If the block device driver is incapable of recognizing alternate paths to the same device, either by SCSI id or ch-cu-dev addressing or whatever, then later on you're going to find multiple instances of the same disk in the gendisk list. Then our EVMS logical device manager walks the gendisk list and thinks it is finding unique disks ... but it truly isn't. It hands the logical disks up the feature stack so that disk segments, regions, containers ... volumes can be discovered ... only it's now handing up multiple instances of the same disk.

I actually work on the user-mode configuration tools and honestly don't know much about the kernel side of things. However, my engine plugins (I write features like drive linking and segment managers like mdos or the 390 segmgr) need to be able to run discovery paths just like the kernel, and life suddenly gets tougher for me when I might see multiple instances of the same physical disk.

You say that 2.4.17 supports multipath. I built a 2.4.17 EVMS kernel to test this out on ... I don't think my 390 has multiple channel paths configured ... at least I don't think so ... and so I tried the following:

    shutdown -h now
    cp link * 208 209 mw
    ipl 200 clear

And now I have dasdj showing up on my machine. This isn't a very good test case, but it does seem to simulate what can happen when the same physical device appears more than once in the gendisk list.

Currently, I look for disks with the same volume id.
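
A minimal sketch of that first pass, in Python rather than the actual EVMS engine code: group the discovered disks by volume id and keep only the ids that show up more than once. The function name `group_by_volid` and the sample disk list are hypothetical and made up for illustration; only dasdi, dasdj and the volid 0x0208 come from the test above.

```python
from collections import defaultdict


def group_by_volid(disks):
    """Return {volid: [disk names]} for every volid seen more than once.

    `disks` is an iterable of (name, volid) pairs, standing in for what a
    walk over the gendisk list would hand to user space (an assumption).
    """
    groups = defaultdict(list)
    for name, volid in disks:
        groups[volid].append(name)
    # Only volids reported by two or more entries are suspicious.
    return {volid: names for volid, names in groups.items() if len(names) > 1}


# Hypothetical discovery result; dasda is an invented unrelated disk.
disks = [("dasda", "0x0200"), ("dasdi", "0x0208"), ("dasdj", "0x0208")]
suspects = group_by_volid(disks)
```

A matching volid only makes two entries candidates. Two distinct disks can carry the same label, so each group still needs a closer look before an entry is dropped.
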
So, dasdi and dasdj are both going to report a volid of 0x0208 (in EBCDIC), and so I inspect them a bit closer ... by writing a test pattern to the volid field on one of the disks and then reading the volume label on the second disk to see if the test pattern appears in the volid field. If it does ... then I am looking at the same disk through different gendisk entries ... restore the volid on the disk ... remove the second instance of the disk ... continue along the volume discovery path.

Ok, this seems to allow me to recognize multiple entries ... prevent anyone from trying to allocate a data extent on the second entry ... even do failover to the second entry if the first starts producing i/o errors ... but I can't test for channel busy and start the i/o on a non-busy chpid to get improved performance. Plus ... I now have a paranoid segment manager on the 390 (a plugin that manages a disk data extent) that is worried about multipath devices when it should only be concerned about managing segments.

So ... I'd -LOVE- to see multipath support on s390 that would give me the comfort of knowing I won't ever see multiple instances of the same physical disk being reported, and also ... the added benefit of improved performance by utilizing non-busy channel paths. But my test case seems to show that on a 2.4.17 kernel I will still see multipath disks showing up in the gendisk list multiple times. Is this just a bad test case?

Also, I think your suggested configuration was to have the number of stripes equal the number of channel paths to the device. Would it not be more desirable to have each physical volume on a separate control unit, with multiple channel paths to each control unit? This would seem to lessen the chance of getting channel busy back when trying to read/write a stripe to its device.

Mit Freundlichem ... and Thanks!

Don Mulvey
IBM Austin
Linux Technology Center
(512) 838-1787
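
P.S. For anyone who wants to try the write-and-read-back probe described above, here is a rough sketch in Python. It is not the EVMS plugin code. The names `same_physical_disk` and `demo`, the field offset and length, and the use of plain files plus a symlink to stand in for two gendisk entries are all assumptions made so the example runs anywhere. On real block devices, caching may hide the write from the second entry, which this sketch ignores.

```python
import os
import tempfile

VOLID_OFFSET = 4  # hypothetical position of the volid field in the image
VOLID_LEN = 6     # hypothetical length of the volid field


def same_physical_disk(path_a, path_b, offset=VOLID_OFFSET, length=VOLID_LEN):
    """Write a test pattern via path_a and check whether path_b sees it."""
    fd_a = os.open(path_a, os.O_RDWR)
    fd_b = os.open(path_b, os.O_RDONLY)
    try:
        saved = os.pread(fd_a, length, offset)
        # Inverting every byte guarantees the pattern differs from the volid.
        pattern = bytes(b ^ 0xFF for b in saved)
        os.pwrite(fd_a, pattern, offset)
        os.fsync(fd_a)
        try:
            seen = os.pread(fd_b, length, offset)
        finally:
            # Always restore the original volid, even if the read fails.
            os.pwrite(fd_a, saved, offset)
            os.fsync(fd_a)
        return seen == pattern
    finally:
        os.close(fd_a)
        os.close(fd_b)


def demo():
    """Return (alias detected, distinct disk rejected, volid restored)."""
    with tempfile.TemporaryDirectory() as tmp:
        image = b"VOL1" + b"0x0208" + bytes(64)  # made-up label layout
        dasdi = os.path.join(tmp, "dasdi")
        dasdk = os.path.join(tmp, "dasdk")  # different disk, same volid
        for path in (dasdi, dasdk):
            with open(path, "wb") as f:
                f.write(image)
        dasdj = os.path.join(tmp, "dasdj")
        os.symlink(dasdi, dasdj)  # second "path" to the same disk
        alias = same_physical_disk(dasdi, dasdj)
        distinct = same_physical_disk(dasdi, dasdk)
        with open(dasdi, "rb") as f:
            restored = f.read() == image
        return alias, distinct, restored


if __name__ == "__main__":
    print(demo())
```

The restore sits in a `finally` block because a probe that dies between the write and the restore would leave a corrupted volid on the disk, which is exactly the kind of risk that makes me wish the driver handled this instead.
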
