Current cluster configuration (box1 started only):
box1 ~ # crm_mon -1


============
Last updated: Wed Oct  7 10:19:57 2009
Stack: openais
Current DC: box1 - partition WITHOUT quorum
Version: 1.0.5-05c8b63cbca7ce95182bb41881b3c5677f20bd5c
3 Nodes configured, 3 expected votes
3 Resources configured.
============

Online: [ box1 ]
OFFLINE: [ box2 fc12-node1 ]

Master/Slave Set: ms_drbd
        Masters: [ box1 ]
        Stopped: [ drbd:1 ]
Clone Set: cl_pingd
        Started: [ box1 ]
        Stopped: [ pingd:1 pingd:2 ]
Resource Group: mysql_service_group
    fs_r0       (ocf::heartbeat:Filesystem):    Started box1
    ip_mysql    (ocf::heartbeat:IPaddr2):       Started box1
    mysql       (ocf::heartbeat:mysql): Started box1

Node not in the cluster: farm
farm ~ # cibadmin -$
cibadmin 1.0.5 for OpenAIS (Build: 05c8b63cbca7ce95182bb41881b3c5677f20bd5c)

Written by Andrew Beekhof


I ran the following on box1:
box1 ~ # export HA_VALGRIND_ENABLED=cib
box1 ~ # export VALGRIND_OPTS="--log-file=/tmp/pacemaker-%p.valgrind --leak-check=full --show-reachable=yes --trace-children=no --num-callers=25"
box1 ~ # aisexec

Then on farm:
farm ~ # CIB_server=box1.cluster CIB_port=1234 CIB_user=hacluster cibadmin -Q
Password:
cibadmin: Connection to box1.cluster:1234 failed:
Signon to CIB failed:
Init failed, could not perform requested operations
farm ~ # CIB_server=box1.cluster CIB_port=12345 CIB_user=hacluster CIB_encrypted=false cibadmin -Q
Password:
cibadmin: Connection to box1.cluster:12345 failed:
Signon to CIB failed:
Init failed, could not perform requested operations
farm ~ #
Although these queries returned nothing, the cluster is operational and crm_mon on box1 works fine.
Each attempt produced the following in the logs on box1:
Oct 7 11:43:01 box1 cib: [3458]: info: log_data_element: cib_remote_listen: Login: <cib_command op="authenticate" user="hacluster" password="*****" hidden="password" />
Oct 7 11:43:01 box1 cib: [3458]: ERROR: cib_remote_listen: User is not a member of the required group
So it seems something is wrong with the hacluster user's group membership:
box1 pc # id hacluster
uid=65(hacluster) gid=65(haclient) groups=65(haclient)
box1 pc # getent passwd hacluster
hacluster:x:65:65:added by portage for cluster-glue:/var/lib/heartbeat:/sbin/nologin
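
The id output above shows haclient only as the primary group of hacluster. One possibility worth ruling out (an assumption, not something the logs confirm) is that the check looks only at the explicit member list of the group entry, which does not include users who hold the group solely as their primary gid. A minimal sketch of that comparison follows; the helper name listed_as_member is hypothetical and not part of Pacemaker:

```shell
# Hypothetical helper: succeed if user $1 appears in the explicit member list
# (4th colon-separated field) of the group(5)-style entry passed as $2.
listed_as_member() {
    members=$(printf '%s\n' "$2" | cut -d: -f4)
    case ",$members," in
        *",$1,"*) return 0 ;;   # user is named in the member list
        *)        return 1 ;;   # user is absent (or the list is empty)
    esac
}
```

It could be run against the real entry with: listed_as_member hacluster "$(getent group haclient)" && echo listed || echo "not listed"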


However, an attempt to connect to the plain port without CIB_encrypted=false causes the cluster to malfunction:
Oct 7 11:50:18 box1 cib: [3458]: ERROR: crm_xml_err: XML Error: Entity: line 1: parser error : Start tag expected, '<' not found
Oct  7 11:50:18 box1 cib: [3458]: ERROR: crm_xml_err: XML Error:
Oct  7 11:50:18 box1 cib: [3458]: ERROR: crm_xml_err: XML Error: ^
Oct 7 11:50:18 box1 cib: [3458]: WARN: string2xml: Parsing failed (domain=1, level=3, code=4): Start tag expected, '<' not found
Oct 7 11:50:18 box1 cib: [3458]: ERROR: string2xml: Couldn't parse 3 chars:
Oct 7 11:50:18 box1 cib: [3458]: ERROR: cib_recv_remote_msg: Couldn't parse: ''


I am attaching files that may be useful.



On Oct 06, 2009, at 21:35, Andrew Beekhof wrote:

On Mon, Sep 28, 2009 at 5:38 PM, Alexander Bodnarashik
<[email protected]> wrote:
Thanks for the fix :)
I've checked out
http://hg.clusterlabs.org/pacemaker/stable-1.0/rev/05c8b63cbca7
Now ports are open.

I've encountered other problem though.
I have 2 boxes in cluster - box1 and box2. Third box, not in cluster, is
named farm.
All of them are running Gentoo.

All have the same version of pacemaker?

Cluster stack - openais-1.0.1

Trying to issue cibadmin -Q from farm:
1234 - plain port

CIB_server=box1.cluster CIB_port=1234 cibadmin -Q

So a couple of things here (that you couldn't possibly be expected to
know, sorry, I forgot to mention them at the time)...

You need to set CIB_user to the user that the remote node runs the CIB
as (e.g. hacluster)
For plaintext connections, you need to set CIB_encrypted=false

Actually that first one needs to be the default (since non-root
daemons can only do PAM authentication for the user they're running
as).
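
The two settings above can be sketched as a small wrapper. This is only an illustration: the function name remote_cib_query and its "plain" argument are hypothetical, the user is fixed to hacluster as in the example above, and the host and port values are whatever the remote CIB was configured with.

```shell
# Hypothetical wrapper (not part of Pacemaker): query a remote CIB using the
# environment variables discussed above.
#   $1 = host, $2 = port, $3 = "plain" for the plaintext port (default: tls)
remote_cib_query() {
    host=$1
    port=$2
    mode=${3:-tls}

    if [ "$mode" = "plain" ]; then
        # Plaintext port: CIB_encrypted=false has to be set explicitly.
        CIB_server="$host" CIB_port="$port" CIB_user=hacluster \
            CIB_encrypted=false cibadmin -Q
    else
        # Encrypted port: CIB_encrypted is left unset.
        CIB_server="$host" CIB_port="$port" CIB_user=hacluster cibadmin -Q
    fi
}
```

With the plain port from this thread the call would be: remote_cib_query box1.cluster 1234 plain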



After that I can run neither crm_mon (it prints "Attempting connection to the cluster...") nor cibadmin -Q on box1; the latter waits for a while and
then prints the following:

 Signon to CIB failed: reply failed
Init failed, could not perform requested operations

That's very disturbing.
Can you try running the CIB under valgrind to see if it reports
anything of interest?

export HA_VALGRIND_ENABLED=cib
export VALGRIND_OPTS="--log-file=/tmp/pacemaker-%p.valgrind
--leak-check=full --show-reachable=yes --trace-children=no
--num-callers=25"

install valgrind
then start the cluster
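
Once the cluster has been started with these settings, valgrind writes one log per process, with %p in the --log-file pattern replaced by the PID. A rough sketch for skimming those logs follows; the helper name valgrind_summaries is hypothetical, and it assumes a grep that supports -H (GNU grep does):

```shell
# Hypothetical helper: print the "ERROR SUMMARY" line of every valgrind log
# matching the --log-file pattern used above (default directory: /tmp).
valgrind_summaries() {
    dir=${1:-/tmp}
    for log in "$dir"/pacemaker-*.valgrind; do
        [ -f "$log" ] || continue   # the pattern matched nothing
        # The summary is only written when the traced process exits.
        grep -H "ERROR SUMMARY" "$log" || echo "$log: no summary yet"
    done
}
```

Running valgrind_summaries with no argument checks /tmp; a log without a summary line usually means the traced process is still running.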

Attachment: cibadmin-q.txt.gz
Description: GNU Zip compressed data

Attachment: corosync.conf.gz
Description: GNU Zip compressed data

Attachment: corosync.log.gz
Description: GNU Zip compressed data

Attachment: messages.gz
Description: GNU Zip compressed data

Attachment: pacemaker-3458.valgrind.gz
Description: GNU Zip compressed data

Attachment: ps-ax.txt.gz
Description: GNU Zip compressed data

_______________________________________________
Linux-HA mailing list
[email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems