> On Jun 10, 2010, at 5:53 AM, "Caspar Smit" <[email protected]> wrote:
>
>>> On 2010-06-08 16:03, Caspar Smit wrote:
>>>> I also noticed that for the OCF scripts to work you have to start
>>>> iscsi-target via the init script at boot using an EMPTY ietd.conf,
>>>> because /usr/sbin/ietd has to be running. Otherwise the scripts give
>>>> a "Connection refused" error and all hell breaks loose in pacemaker,
>>>> which reports that the scripts/monitors are not installed etc. Maybe
>>>> this can be incorporated in an updated version of the scripts, so
>>>> that they start ietd if it is not running at all.
>>>
>>> Nope, won't do, but the upstream versions of these RAs do this
>>> checking more appropriately.
>>
>> Ok, I tried the latest upstream iSCSI* RAs from
>> http://hg.linux-ha.org/agents and now the connection is lost (State ->
>> Inactive) in MS iSCSI initiator EVERY TIME I do a failover (before it
>> worked sometimes, now it never works). The scripts themselves seem
>> more stable now though; I didn't have any crashing ietd anymore.
>>
>> Here are some (Windows 2003 Server Standard) event logs. (I use the
>> latest iSCSI initiator btw, 2.08.) I first thought it had something to
>> do with MPIO, because every time I connected a target/lun it showed up
>> as a multipath device although I didn't select MPIO and don't have
>> multiple paths atm. Now I completely uninstalled MPIO support to be
>> sure and the situation didn't change.
>>
>> It starts with event id 20 from iScsiPrt:
>>
>> Connection to the target was lost. The initiator will attempt to
>> retry the connection.
>>
>> then immediately event id 10 from iScsiPrt:
>>
>> Login request failed. The login response packet is given in the dump
>> data.
>>
>> then event id 12 from PlugPlayManager:
>>
>> The device 'IET      VIRTUAL-DISK     SCSI Disk Device'
>> (SCSI\Disk&Ven_IET_____&Prod_VIRTUAL-DISK____&Rev_0___\1&1843ccbc&0&000000)
>> disappeared from the system without first being prepared for removal.
>>
>> Finally event 57 from Ftdisk:
>>
>> The system failed to flush data to the transaction log. Corruption may
>> occur.
>>
>> The only thing I changed was ditching portblock (Like Ross suggested).
>
> And to verify this is with 1.4.20.1?

Yes, 1.4.20.1

>
> When tearing down ietd on one side, what method does it use?
> 'ietadm --op delete' or 'killall ietd'? It should use killall.

It uses --op delete.
Won't 'killall ietd' remove ALL targets on one side?
The reason I want to use the RAs is that they handle a single target
with its luns individually, so I can have 4 targets with 1 lun each and
have them spread across the cluster: 2 on one side and 2 on the other.
Then if one target needs loads of reading speed I can fail over 1 target
and spread them 1 - 3. See what I mean? So I can't killall ietd, because
that would stop both targets on one side.

Don't get me wrong, the failover IS WORKING. Only the connection
re-instatement to MS iSCSI Initiator isn't working when using the RAs.

Btw, when using the iscsi-target init script the re-instatement works
fine, and I see that the stop command inside the init script uses
--op delete too. So I don't really see why I should be using killall.

I will try open-iscsi tomorrow and see if open-iscsi can re-instate the
connection when using the RAs, so I can pinpoint whether MS iSCSI
initiator is the problem.
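
To make the difference in scope between the two teardown methods
concrete, here is a rough sketch. It only builds the command lines and
does not run anything. The helper names (teardown_commands,
surviving_targets) are made up for illustration and are not part of the
RAs; the only real commands are 'ietadm --op delete --tid=N' and
'killall ietd'.

```python
from typing import List


def teardown_commands(tids_on_node: List[int], tid_to_move: int,
                      per_target: bool = True) -> List[List[str]]:
    """Return the command lines a node would run to give up one target.

    Hypothetical helper for illustration only: it builds argument lists
    and never executes them.
    """
    if tid_to_move not in tids_on_node:
        raise ValueError(f"tid {tid_to_move} is not active on this node")
    if per_target:
        # Per-target teardown: only the target being moved is removed.
        return [["ietadm", "--op", "delete", f"--tid={tid_to_move}"]]
    # Daemon-wide teardown: stops ietd, and with it every target here.
    return [["killall", "ietd"]]


def surviving_targets(tids_on_node: List[int], tid_to_move: int,
                      per_target: bool = True) -> List[int]:
    """Targets this node still serves after the teardown."""
    if per_target:
        return [t for t in tids_on_node if t != tid_to_move]
    return []
```

With targets 1 and 2 on a node, moving target 1 with per_target=True
leaves target 2 in place, while per_target=False leaves nothing, which
is why killall does not fit a setup that spreads targets over both
nodes.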

>
>> Then I tried enabling portblock again and now the connection
>> reinstatement failover works like before, only from node01 -> node02
>> and NOT the other way around.
>>
>> Conclusion: I think that during a failover without portblock the
>> initiator tries to reconnect too soon (when the target is still on
>> the old node but not up) and gets an immediate failure, resulting in
>> a disconnection (State -> Inactive).
>>
>> Any ideas how to solve this?
>
> Change 'ietadm --op delete' to 'killall ietd'.
>
> Have pacemaker check /proc on startup to make sure the kernel module is
> loaded and start/nostart ietd depending on whether it is primary or
> secondary, and kill ietd on fail-over. The kernel module will take care
> of clearing its config on ietd termination, so you can leave it loaded.
>
> -Ross
>
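
The startup check Ross describes above could be sketched roughly like
this. It is only an illustration of the decision, not code from the RAs.
The function name ietd_action is made up, and the /proc/net/iet path is
an assumption about where the IET kernel module exposes its state; adjust
it for your build.

```python
import os


def ietd_action(is_primary: bool, proc_dir: str = "/proc/net/iet") -> str:
    """Decide what to do with ietd at startup.

    proc_dir is an ASSUMPTION: the directory under /proc that exists
    once the IET kernel module is loaded.
    """
    if not os.path.isdir(proc_dir):
        # Without the kernel module there is nothing for ietd to drive.
        return "error: kernel module not loaded"
    # Only the primary runs ietd; the secondary leaves it stopped.
    return "start" if is_primary else "nostart"
```

On fail-over the old primary would then kill ietd and the new primary
would start it, leaving the kernel module loaded on both nodes.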

_______________________________________________________
Linux-HA-Dev: [email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha-dev
Home Page: http://linux-ha.org/
