Hi,

On Mon, Oct 12, 2009 at 01:25:31PM +0300, Stratos Zolotas wrote:
> On Mon, Oct 12, 2009 at 10:16 AM, Andrew Beekhof <[email protected]> wrote:
>
> > On Mon, Oct 12, 2009 at 8:34 AM, Stratos Zolotas <[email protected]> wrote:
> > > Hello to the list.
> > >
> > > Please excuse my ignorance, because it is the first time I try to build a
> > > cluster.
> > >
> > > I'm trying to build a 2 node Active/Passive cluster with
> > > DRBD+Pacemaker+Openais.
> > >
> > > I'm at the very beginning and I'm trying to achieve initial communication
> > > between the nodes.
> > >
> > > I'm getting the following:
> > >
> > > crm_mon[12156]: 2009/10/12_09:22:36 ERROR: unpack_resources: No STONITH
> > > resources have been defined
> > > crm_mon[12156]: 2009/10/12_09:22:36 ERROR: unpack_resources: Either
> > > configure some or disable STONITH with the stonith-enabled option
> > > crm_mon[12156]: 2009/10/12_09:22:36 ERROR: unpack_resources: NOTE:
> > > Clusters with shared data need STONITH to ensure data integrity
> > >
> > >
> > > ============
> > > Last updated: Mon Oct 12 09:22:36 2009
> > > Current DC: NONE
> > > 0 Nodes configured, unknown expected votes
> > > 0 Resources configured.
> > > ============
> >
> > You might just need to wait a bit longer.
> > But it's hard to say without seeing all the logs. If you attach them
> > (compressed) we'll be able to help further.
> >
> > > I don't care about the first three messages, because I haven't configured
> > > anything yet, but it seems that I don't have communication between the
> > > nodes. There is no firewall and the communication is on a dedicated
> > > LAN.
> > >
> > > My openais.conf (identical for the two systems) is:
> > >
> > > alpha:/etc/ais # crm_mon --one-shot -V
> > > # Please read the openais.conf.5 manual page
> > >
> > > aisexec {
> > >         # Run as root - this is necessary to be able to manage resources with Pacemaker
> > >         user:   root
> > >         group:  root
> > > }
> > >
> > > service {
> > >         # Load the Pacemaker Cluster Resource Manager
> > >         ver:       0
> > >         name:      pacemaker
> > >         use_mgmtd: yes
> > >         use_logd:  yes
> > > }
> > >
> > > totem {
> > >         version: 2
> > >
> > >         # How long before declaring a token lost (ms)
> > >         token: 5000
> > >
> > >         # How many token retransmits before forming a new configuration
> > >         token_retransmits_before_loss_const: 10
> > >
> > >         # How long to wait for join messages in the membership protocol (ms)
> > >         join: 1000
> > >
> > >         # How long to wait for consensus to be achieved before starting a new round of membership conf$
> > >         consensus: 2500
> > >
> > >         # Turn off the virtual synchrony filter
> > >         vsftype: none
> > >
> > >         # Number of messages that may be sent by one processor on receipt of the token
> > >         max_messages: 20
> > >
> > >         # Stagger sending the node join messages by 1..send_join ms
> > >         send_join: 45
> > >
> > >         # Limit generated nodeids to 31-bits (positive signed integers)
> > >         clear_node_high_bit: yes
> > >
> > >         # Disable encryption
> > >         secauth: off
> > >
> > >         # How many threads to use for encryption/decryption
> > >         threads: 0
> > >
> > >         # Optionally assign a fixed node id (integer)
> > >         # nodeid: 1234
> > >
> > >         interface {
> > >                 ringnumber: 0
> > >
> > >                 # The following values need to be set based on your environment
> > >                 bindnetaddr: 192.168.67.0
> > >                 mcastaddr: 226.94.1.1
> > >                 mcastport: 5405
> > >         }
> > >
> > > logging {
> > >         debug: off
> > >         fileline: off
> > >         to_syslog: yes
> > >         to_stderr: off
> > >         syslog_facility: daemon
> > >         timestamp: on
> > > }
> > >
> > > amf {
> > >         mode: disabled
> > > }
> > >
> > >
> > > The first node is on 192.168.67.10 and the second on 192.168.67.11.
> > >
> > > Am I missing something?
> > >
> > > Thank you in advance and please forgive my lack of knowledge.
> > >
> > > Stratos.
>
> Thank you for your immediate response.
>
> I think that something is wrong because I have been waiting for at least
> 2-3 hours for the nodes to appear.

That seems to be a bit excessive :)

> Please find the logs for the first machine (/var/log/messages) attached to
> the message. If the logs from the second node are needed please ask me to
> send them, but i think that the problem is common for both nodes.
>
> I'm sending only the logs after the last run of openais (rcopenais start on
> Opensuse 11.1)

There was a segfault in crmd/plumbing:

Oct 12 09:13:04 alpha kernel: crmd[11007]: segfault at 18 ip 00007f40ea896eee sp 00007fff0336a960 error 4 in libplumb.so.2.0.0[7f40ea87a000+30000]

You should capture the backtrace with gdb or use hb_report.
Hopefully there's a core file. There won't be much of a cluster
without crmd. Otherwise, openais seems to function fine.

Thanks,

Dejan

> Thank you again.
>
>
> Stratos
>
>
>
> --
> Kernel IT Solutions Ltd
> http://www.kernelit.gr
>
> Cyclades Wireless Network
> http://www.cywn.gr
> _______________________________________________
> Linux-HA mailing list
> [email protected]
> http://lists.linux-ha.org/mailman/listinfo/linux-ha
> See also: http://linux-ha.org/ReportingProblems
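
For the gdb route suggested in the reply, a rough sketch follows. It first works out where inside libplumb.so.2.0.0 the fault happened, using only the two addresses from the kernel message, and then asks gdb for a backtrace if a core file is available. The crmd binary path and the core file location are assumptions (placeholders, not taken from this thread) and will probably need adjusting for the actual installation.

```shell
#!/bin/sh
# Values copied from the kernel segfault message quoted in the reply.
ip=0x00007f40ea896eee    # faulting instruction pointer
base=0x7f40ea87a000      # load address of libplumb.so.2.0.0

# Offset of the faulting instruction inside the library.
offset=$(printf '0x%x' $(( $ip - $base )))
echo "offset inside libplumb.so.2.0.0: $offset"

# ASSUMPTION: both paths are hypothetical defaults; override them via the
# environment to match where crmd and its core file really live.
CRMD_BIN=${CRMD_BIN:-/usr/lib64/heartbeat/crmd}
CORE_FILE=${CORE_FILE:-/var/lib/heartbeat/cores/hacluster/core}

# Only call gdb when it is installed and the files are actually there.
if command -v gdb >/dev/null 2>&1 && [ -x "$CRMD_BIN" ] && [ -r "$CORE_FILE" ]; then
    # -batch exits after running the commands; 'bt full' prints the
    # backtrace including local variables.
    gdb -batch -ex 'bt full' "$CRMD_BIN" "$CORE_FILE"
else
    echo "gdb, crmd binary or core file not found; skipping backtrace"
fi
```

With the addresses above the offset comes out as 0x1ceee, which lies within the 0x30000-byte mapping reported by the kernel. If debug symbols for the library are installed, that offset can be given to a tool such as addr2line (with `-e` pointing at the library file) to find the source line, but a full backtrace from the core file is usually more useful to attach to a report.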
