> On Jun 10, 2010, at 5:53 AM, "Caspar Smit" <[email protected]> wrote:
>
>>> On 2010-06-08 16:03, Caspar Smit wrote:
>>>> I also noticed that for the OCF scripts to work you have to start
>>>> iscsi-target via the init script at boot using an EMPTY ietd.conf,
>>>> because for the scripts to work /usr/sbin/ietd has to be running;
>>>> otherwise the scripts give a "Connection refused" error and all hell
>>>> breaks loose in pacemaker saying that the scripts/monitors are not
>>>> installed etc. Maybe this can be incorporated in an updated version
>>>> of the scripts, so that it starts ietd if it is not running at all.
>>>
>>> Nope, won't do, but the upstream version of these RAs do this
>>> checking more appropriately.
>>
>> Ok, I tried the latest upstream iSCSI* RAs from
>> http://hg.linux-ha.org/agents and now the connection is lost (State ->
>> Inactive) in MS iSCSI initiator EVERY TIME when I do a failover (before
>> it worked sometimes, now it never works). The scripts themselves seem
>> more stable now though, I didn't have any crashing ietd anymore.
>>
>> Here are some (Windows 2003 Server Standard) event logs. (I use the
>> latest iSCSI initiator btw, 2.08). I first thought it had something to
>> do with MPIO because every time I connected a target/lun it showed up
>> as a multipath device although I didn't select MPIO and don't have
>> multiple paths atm. Now I completely uninstalled MPIO support to be
>> sure and the situation didn't change.
>>
>> It starts with event id 20 from iScsiPrt:
>>
>> Connection to the target was lost. The initiator will attempt to
>> retry the connection.
>>
>> then immediately event id 10 from iScsiPrt:
>>
>> Login request failed. The login response packet is given in the dump
>> data.
>>
>> then event id 12 from PlugPlayManager:
>>
>> The device 'IET VIRTUAL-DISK SCSI Disk Device'
>> (SCSI\Disk&Ven_IET_____&Prod_VIRTUAL-DISK____&Rev_0___\1&1843ccbc&0&000000)
>> disappeared from the system without first being prepared for removal.
>>
>> Finally event 57 from Ftdisk:
>>
>> The system failed to flush data to the transaction log. Corruption
>> may occur.
>>
>> The only thing I changed was ditching portblock (Like Ross suggested).
>
> And to verify this is with 1.4.20.1?

Yes, 1.4.20.1

> When tearing down ietd on one side, what method does it use? 'ietadm
> --op delete' or 'killall ietd'? It should use killall.

It uses --op delete. Won't 'killall ietd' remove ALL targets on one side?
The reason I want to use the RAs is that they handle a single target
with its LUNs individually, so I can have 4 targets with 1 LUN each and
spread them across the cluster: 2 on one side and 2 on the other. Then
if one target needs loads of reading speed I can fail over 1 target and
spread them 1 - 3. See what I mean? So I can't killall ietd, because
that would stop both targets on one side.

Don't get me wrong, the failover IS WORKING. Only the connection
reinstatement to the MS iSCSI Initiator isn't working when using the RAs.

Btw, when using the iscsi-target init script the reinstatement works
fine, and I see that the stop command inside the init script uses
--op delete too. So I don't really see why I should be using killall.

I will try open-iscsi tomorrow and see if open-iscsi can reinstate the
connection when using the RAs, so I can pinpoint the MS iSCSI initiator
as the problem.

>> Then I tried enabling portblock again and now the connection
>> reinstatement failover works like before, only from node01 -> node02
>> and NOT the other way around.
>>
>> Conclusion: I think that during a failover without portblock the
>> initiator tries to reconnect too soon (when the target is still on
>> the old node but not up) and then gets an immediate failure on the
>> initiator side, resulting in a disconnection (State -> Inactive).
>>
>> Any ideas how to solve this?
>
> Change 'ietadm --op delete' to 'killall ietd'.
>
> Have pacemaker check /proc on startup to make sure kernel module is
> loaded and start/nostart ietd depending on whether it is primary or
> secondary, kill ietd on fail-over. The kernel module will take care of
> clearing its config on ietd termination, so you can leave it loaded.
>
> -Ross
>
_______________________________________________________
Linux-HA-Dev: [email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha-dev
Home Page: http://linux-ha.org/
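
For illustration only, here is a rough sketch of the /proc check and the
start/nostart decision Ross describes in the quoted text. It is not
taken from the RAs or from the init script. Everything named here is an
assumption: the module name "iscsi_trgt" as it would appear in
/proc/modules, the helper names, and the action strings are all
hypothetical, and what to do when the module is missing is not specified
in the thread, so the sketch only reports it.

```python
from pathlib import Path

# Assumption: the IET kernel module is listed as "iscsi_trgt" in /proc/modules.
IET_MODULE = "iscsi_trgt"


def read_proc_modules(path="/proc/modules"):
    """Return the text of /proc/modules, or an empty string if it is absent."""
    p = Path(path)
    return p.read_text() if p.exists() else ""


def module_loaded(name, modules_text):
    """True if `name` is the first field of any non-empty line.

    Only the first field is compared, so a module that merely lists
    `name` as a dependency does not count as `name` being loaded.
    """
    return any(
        line.split()[0] == name
        for line in modules_text.splitlines()
        if line.strip()
    )


def plan_action(role, module_present, ietd_running):
    """Hypothetical decision table for the start/nostart idea.

    role: "primary" or "secondary" (labels assumed for this sketch).
    Returns an action string; nothing is executed here.
    """
    if not module_present:
        return "module-missing"  # the thread does not say what to do here
    if role == "primary":
        return "noop" if ietd_running else "start-ietd"
    # Secondary node: ietd should not be serving targets.
    return "kill-ietd" if ietd_running else "noop"


if __name__ == "__main__":
    present = module_loaded(IET_MODULE, read_proc_modules())
    print(plan_action("primary", present, ietd_running=False))
```

Note that this follows the "one ietd per node, killed on fail-over"
model from Ross's suggestion. It does not cover the per-target layout
(2 targets on each side) described earlier in this message, where
killing ietd would stop every target on that node.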
