On Mon, May 3, 2010 at 4:00 PM, Sam Tran <[email protected]> wrote:
> On Mon, May 3, 2010 at 3:36 PM, Lars Ellenberg
> <[email protected]> wrote:
>> On Mon, May 03, 2010 at 02:42:00PM -0400, Sam Tran wrote:
>>> Hi Lars,
>>>
>>> Here is the content of the single state file for the LDAPS TCP connection:
>>>
>>> [...@info-ldap-015 ~]# cat /tmp/tickle/192.168.8.171
>>> 192.168.8.171:636 192.168.240.178:32913
>>>
>>> I tried to run the tickle_tcp manually:
>>>
>>> [...@info-ldap-015 ~]# cat /tmp/tickle/192.168.8.171 | /usr/lib64/heartbeat/tickle_tcp -n 3
>>>
>>> It did send three packets to the LDAP slave. But it didn't break the
>>> existing TCP connection between the VIP and the slave.
>>
>> It is not supposed to do that.
>> By sending an invalid ACK, it is supposed to tickle the TCP stack of the
>> client into sending a valid ack to the now failed over IP, which then is
>> supposed to trigger a valid RST from the new server, because it does not
>> know anything about that connection.
>>
>> As long as the endpoint is still there, the tcp session is supposed to
>> be robust against such "attacks".
>>
>> The point of this "attack" is that the endpoint _changed_, and thus the
>> tcp session is no longer valid, even though the client has not noticed
>> that yet.
>> The trick is to talk the client into sending _anything_ at all, so a
>> suitable RST can be sent to cause the client to notice, and reconnect.
>
> Thanks for the reminder.
>
>
>>
>>> I have attached
>>> the output of the packet capture in text format.
>>>
>>> Thanks,
>>> Sam
>>
>>> No. Time     Source        Destination     Protocol Info
>>> 1   0.000000 192.168.8.171 192.168.240.178 TCP      [TCP Window Update] ldaps > 32913 [ACK] Seq=1 Ack=1 Win=1234 Len=0
>>
>> Right. So tickle acks are sent out.
>> All is well ;-)
>
> Well it didn't seem to send anything when using crm.
> I manually triggered this by running the command on the shell.
>
I can confirm that with the tickle ACK feature configured through crm, the
failover node does not send any tickle ACKs.
I added some debugging to the run_tickle_tcp() function and found that
$OCF_RESKEY_tickle_dir is empty when the function runs. If I set that
variable by hand inside the function, the tickle ACKs are sent out. So I
don't understand why the variable is empty when the crm configuration
seems to be correct:

primitive portblock_unblock ocf:heartbeat:portblock \
    params protocol="tcp" ip="192.168.8.171" portno="636" action="unblock" \
    op monitor interval="10" timeout="10" depth="0"
    tickle_dir="/tmp/tickle" sync_script="/usr/sbin/csync2 -xvr"
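
A minimal sketch of the kind of check described above, for anyone who
wants to reproduce it. The function name check_tickle_dir is hypothetical
and is not part of the shipped portblock agent. It only assumes that the
cluster passes resource parameters to the agent as OCF_RESKEY_<name>
environment variables:

```shell
# Hypothetical debugging helper; not part of the shipped portblock agent.
# Reports whether the tickle_dir parameter reached the agent's environment.
check_tickle_dir() {
    # An empty or unset variable means the parameter never arrived.
    if [ -z "${OCF_RESKEY_tickle_dir:-}" ]; then
        echo "tickle_dir is empty: parameter was not passed to the agent"
        return 1
    fi
    # A non-empty value still has to point at an existing directory.
    if [ ! -d "$OCF_RESKEY_tickle_dir" ]; then
        echo "tickle_dir=$OCF_RESKEY_tickle_dir is not a directory"
        return 1
    fi
    echo "tickle_dir=$OCF_RESKEY_tickle_dir"
}
```

Called at the top of run_tickle_tcp(), this would show whether the value
is missing from the environment or merely points at the wrong place.
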
Any ideas?
Thanks,
Sam
_______________________________________________________
Linux-HA-Dev: [email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha-dev
Home Page: http://linux-ha.org/