On Tue, Aug 10, 2010 at 2:28 PM, David Lang wrote:
> On Tue, 10 Aug 2010, Igor Chudov wrote:
>
>> Dimitri, you are right.
>>
>> In any case the name change did nothing.
>
> did it eliminate the error from the log? does the log say anything else after
> that point?

It eliminated the error from the log, but the log says the same things.

What it says, in a nutshell, is that both nodes think the other one holds
the resources ("other_holds_resources: 1").

I cannot imagine that this is really an unsolvable problem. I think we
are missing something simple.

Aug 10 14:47:18 pfs-srv3 heartbeat: [1200]: info: Link pfs-srv4:eth1 up.
Aug 10 14:47:18 pfs-srv3 heartbeat: [1200]: info: Status update for node pfs-srv4: status up
Aug 10 14:47:18 pfs-srv3 heartbeat: [1200]: info: Managed write_hostcachedata process 1273 exited with return code 0.
Aug 10 14:47:18 pfs-srv3 harc[1272]: [1279]: info: Running /etc/ha.d//rc.d/status status
Aug 10 14:47:18 pfs-srv3 heartbeat: [1200]: info: Managed status process 1272 exited with return code 0.
Aug 10 14:47:19 pfs-srv3 heartbeat: [1200]: info: Comm_now_up(): updating status to active
Aug 10 14:47:19 pfs-srv3 heartbeat: [1200]: info: Local status now set to: 'active'
Aug 10 14:47:19 pfs-srv3 heartbeat: [1200]: info: Managed write_hostcachedata process 1284 exited with return code 0.
Aug 10 14:47:19 pfs-srv3 heartbeat: [1200]: info: Status update for node pfs-srv4: status active
Aug 10 14:47:19 pfs-srv3 heartbeat: [1200]: info: AnnounceTakeover(local 0, foreign 1, reason 'HB_R_BOTHSTARTING' (0))
Aug 10 14:47:19 pfs-srv3 heartbeat: [1200]: info: STATE 1 => 3
Aug 10 14:47:19 pfs-srv3 heartbeat: [1200]: info: STATE 3 => 2
Aug 10 14:47:19 pfs-srv3 heartbeat: [1200]: info: Managed write_delcachedata process 1285 exited with return code 0.
Aug 10 14:47:19 pfs-srv3 harc[1286]: [1292]: info: Running /etc/ha.d//rc.d/status status
Aug 10 14:47:19 pfs-srv3 heartbeat: [1200]: info: Managed status process 1286 exited with return code 0.
Aug 10 14:47:30 pfs-srv3 heartbeat: [1200]: info: remote resource transition completed.
Aug 10 14:47:30 pfs-srv3 heartbeat: [1200]: info: STATE 2 => 3
Aug 10 14:47:30 pfs-srv3 heartbeat: [1200]: info: other_holds_resources: 1
Aug 10 14:47:30 pfs-srv3 heartbeat: [1200]: info: remote resource transition completed.
Aug 10 14:47:30 pfs-srv3 heartbeat: [1200]: info: AnnounceTakeover(local 1, foreign 1, reason 'T_RESOURCES(us)' (0))
Aug 10 14:47:30 pfs-srv3 heartbeat: [1200]: info: Initial resource acquisition complete (T_RESOURCES(us))
Aug 10 14:47:30 pfs-srv3 heartbeat: [1200]: info: AnnounceTakeover(local 1, foreign 1, reason 'T_RESOURCES(them)' (1))
Aug 10 14:47:30 pfs-srv3 heartbeat: [1200]: info: STATE 3 => 4
Aug 10 14:47:30 pfs-srv3 heartbeat: [1297]: info: 1 local resources from [/usr/share/heartbeat/ResourceManager listkeys pfs-srv3]
Aug 10 14:47:30 pfs-srv3 heartbeat: [1297]: info: Local Resource acquisition completed.
Aug 10 14:47:30 pfs-srv3 heartbeat: [1297]: info: FIFO message [type resource] written rc=81
Aug 10 14:47:30 pfs-srv3 heartbeat: [1200]: info: other_holds_resources: 1
Aug 10 14:47:30 pfs-srv3 heartbeat: [1200]: info: other_holds_resources: 1
Aug 10 14:47:30 pfs-srv3 heartbeat: [1200]: info: AnnounceTakeover(local 1, foreign 1, reason 'T_RESOURCES(us)' (1))
Aug 10 14:47:30 pfs-srv3 heartbeat: [1200]: info: Managed req_our_resources(ask) process 1297 exited with return code 0.
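
To compare what the two nodes believe, it may help to boil a log like the
one above down to just the takeover-related entries. Below is a minimal
Python sketch of that idea. It is not part of heartbeat; the function names
(unwrap, summarize) and the regular expressions are my own assumptions,
written against the message formats visible in this excerpt only. It also
tolerates entries that a mail client has wrapped onto a second line.

```python
import re

# Assumed syslog prefix, as seen in the excerpt: "Aug 10 14:47:30 pfs-srv3 ..."
ENTRY_START = re.compile(r"^[A-Z][a-z]{2} +\d+ \d{2}:\d{2}:\d{2} ")
# Message formats copied from the excerpt; other heartbeat versions may differ.
TAKEOVER = re.compile(
    r"AnnounceTakeover\(local (\d+), foreign (\d+), reason '([^']*)' \((\d+)\)\)"
)
OTHER_HOLDS = re.compile(r"other_holds_resources: (\d+)")


def unwrap(text):
    """Rejoin log entries that were wrapped onto a continuation line."""
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if ENTRY_START.match(line) or not entries:
            entries.append(line)          # a new entry starts with a timestamp
        else:
            entries[-1] += " " + line     # anything else continues the previous entry
    return entries


def summarize(text):
    """Return (timestamp, kind, details) tuples for takeover-related entries."""
    events = []
    for entry in unwrap(text):
        stamp = " ".join(entry.split()[:3])
        match = TAKEOVER.search(entry)
        if match:
            details = {
                "local": int(match.group(1)),
                "foreign": int(match.group(2)),
                "reason": match.group(3),
            }
            events.append((stamp, "takeover", details))
            continue
        match = OTHER_HOLDS.search(entry)
        if match:
            events.append((stamp, "other_holds_resources", int(match.group(1))))
    return events
```

Running summarize() over the same time window from the logs of both
pfs-srv3 and pfs-srv4 and reading the two lists side by side should show
whether each node really reports other_holds_resources: 1 at the same moment.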

> David Lang
>
>> They still refuse to take over when rebooted simultaneously.
>>
>> The symptoms are the same as usual.
>>
>> I am thinking, should I perhaps add a "sleep 100" statement to
>> /etc/init.d/heartbeat on one of the boxes?
>>
>> i
>>
>> On Tue, Aug 10, 2010 at 2:05 PM, Dimitri Maziuk wrote:
>>> On Tuesday 10 August 2010 13:14, Igor Chudov wrote:
>>>>
>>>> Haresources refers to "drbddisk", however, the resource in
>>>> /usr/lib/ocf/resource.d/heartbeat is called "drbd".
>>>
>>> Heartbeat 2.1.4 on CentOS 5 comes with /etc/ha.d/resource.d/drbddisk. Looks
>>> like the docs you read don't match the version you have.
>>>
>>> Dima
>>> --
>>> Dimitri Maziuk
>>> Programmer/sysadmin
>>> BioMagResBank, UW-Madison -- http://www.bmrb.wisc.edu
>>> _______________________________________________
>>> Linux-HA mailing list
>>> [email protected]
>>> http://lists.linux-ha.org/mailman/listinfo/linux-ha
>>> See also: http://linux-ha.org/ReportingProblems
>>>