On 07/08/2013, at 5:42 PM, Thomas Glanzmann <[email protected]> wrote:
> Hello Andrew,
>
>> I can try and fix that if you re-run with -x and paste the output.
>
> (apache-03) [~] crm_report -l /var/adm/syslog/2013/08/05 -f "2013-08-04
> 18:30:00" -t "2013-08-04 19:15" -x
> + shift
> + true
> + [ ! -z ]
> + break
> + [ x != x ]
> + [ x1375633800 != x ]
> + masterlog=
> + [ -z ]
> + log WARNING: The tarball produced by this program may contain
> + printf %-10s WARNING: The tarball produced by this program may contain\n
> apache-03:
> apache-03: WARNING: The tarball produced by this program may contain
> + log sensitive information such as passwords.
> + printf %-10s sensitive information such as passwords.\n apache-03:
> apache-03: sensitive information such as passwords.
> + log
> + printf %-10s \n apache-03:
> apache-03:
> + log We will attempt to remove such information if you use the
> + printf %-10s We will attempt to remove such information if you use the\n
> apache-03:
> apache-03: We will attempt to remove such information if you use the
> + log -p option. For example: -p "pass.*" -p "user.*"
> + printf %-10s -p option. For example: -p "pass.*" -p "user.*"\n apache-03:
> apache-03: -p option. For example: -p "pass.*" -p "user.*"
> + log
> + printf %-10s \n apache-03:
> apache-03:
> + log However, doing this may reduce the ability for the recipients
> + printf %-10s However, doing this may reduce the ability for the
> recipients\n apache-03:
> apache-03: However, doing this may reduce the ability for the recipients
> + log to diagnose issues and generally provide assistance.
> + printf %-10s to diagnose issues and generally provide assistance.\n
> apache-03:
> apache-03: to diagnose issues and generally provide assistance.
> + log
> + printf %-10s \n apache-03:
> apache-03:
> + log IT IS YOUR RESPONSIBILITY TO PROTECT SENSITIVE DATA FROM EXPOSURE
> + printf %-10s IT IS YOUR RESPONSIBILITY TO PROTECT SENSITIVE DATA FROM
> EXPOSURE\n apache-03:
> apache-03: IT IS YOUR RESPONSIBILITY TO PROTECT SENSITIVE DATA FROM EXPOSURE
> + log
> + printf %-10s \n apache-03:
> apache-03:
> + [ -z ]
> + getnodes any
> + [ -z any ]
> + cluster=any
> + [ -z ]
> + HA_STATE_DIR=/var/lib/heartbeat
> + find_cluster_cf any
> + warning Unknown cluster type: any
> + log WARN: Unknown cluster type: any
> + printf %-10s WARN: Unknown cluster type: any\n apache-03:
> apache-03: WARN: Unknown cluster type: any
> + cluster_cf=
> + ps -ef
> + egrep -qs [c]ib
> + debug Querying CIB for nodes
> + [ 0 -gt 0 ]
> + cibadmin -Ql -o nodes
> + awk
> /type="normal"/ {
> for( i=1; i<=NF; i++ )
> if( $i~/^uname=/ ) {
> sub("uname=.","",$i);
> sub("\".*","",$i);
> print $i;
> next;
> }
> }
>
> + tr \n
> + nodes=apache-03 apache-04
> + log Calculated node list: apache-03 apache-04
> + printf %-10s Calculated node list: apache-03 apache-04 \n apache-03:
> apache-03: Calculated node list: apache-03 apache-04
> + [ -z apache-03 apache-04 ]
> + echo apache-03 apache-04
> + grep -qs apache-03
> + debug We are a cluster node
> + [ 0 -gt 0 ]
> + [ -z 1375636500 ]
> + date +%a-%d-%b-%Y
> + label=pcmk-Wed-07-Aug-2013
> + time2str 1375633800
> + perl -e use POSIX; print strftime('%x %X',localtime(1375633800));
> + time2str 1375636500
> + perl -e use POSIX; print strftime('%x %X',localtime(1375636500));
> + log Collecting data from apache-03 apache-04 (08/04/13 18:30:00 to
> 08/04/13 19:15:00)
> + printf %-10s Collecting data from apache-03 apache-04 (08/04/13 18:30:00
> to 08/04/13 19:15:00)\n apache-03:
> apache-03: Collecting data from apache-03 apache-04 (08/04/13 18:30:00 to
> 08/04/13 19:15:00)
> + collect_data pcmk-Wed-07-Aug-2013 1375633800 1375636500
> + label=pcmk-Wed-07-Aug-2013
> + expr 1375633800 - 10
> + start=1375633790
> + expr 1375636500 + 10
> + end=1375636510
> + masterlog=
> + [ x != x ]
> + l_base=/home/tg/pcmk-Wed-07-Aug-2013
> + r_base=pcmk-Wed-07-Aug-2013
> + [ -e /home/tg/pcmk-Wed-07-Aug-2013 ]
> + mkdir -p /home/tg/pcmk-Wed-07-Aug-2013
> + [ x != x ]
> + cat
> + [ apache-03 = apache-03 ]
> + cat
> + cat /home/tg/pcmk-Wed-07-Aug-2013/.env /usr/share/pacemaker/report.common
> /usr/share/pacemaker/report.collector
> + bash /home/tg/pcmk-Wed-07-Aug-2013/collector
> apache-03: ERROR: Could not determine the location of your cluster logs, try
> specifying --logfile /some/path
> + cat
> + [ apache-03 = apache-04 ]
> + cat /home/tg/pcmk-Wed-07-Aug-2013/.env /usr/share/pacemaker/report.common
> /usr/share/pacemaker/report.collector
> + ssh+ -l root -T apache-04 -- mkdir -p pcmk-Wed-07-Aug-2013; cat >
> pcmk-Wed-07-Aug-2013/collector; bash pcmk-Wed-07-Aug-2013/collectorcd
> /home/tg/pcmk-Wed-07-Aug-2013
> + tar xf -
> apache-04: ERROR: Could not determine the location of your cluster logs, try
> specifying --logfile /some/path
> tar: This does not look like a tar archive
> tar: Exiting with failure status due to previous errors
> + analyze /home/tg/pcmk-Wed-07-Aug-2013
> + flist=hostcache members.txt cib.xml crm_mon.txt logd.cf sysinfo.txt
> + printf Diff hostcache...
> + ls /home/tg/pcmk-Wed-07-Aug-2013/*/hostcache
> + echo no /home/tg/pcmk-Wed-07-Aug-2013/*/hostcache :/
> + continue
> + printf Diff members.txt...
> + ls /home/tg/pcmk-Wed-07-Aug-2013/*/members.txt
> + echo no /home/tg/pcmk-Wed-07-Aug-2013/*/members.txt :/
> + continue
> + printf Diff cib.xml...
> + ls /home/tg/pcmk-Wed-07-Aug-2013/*/cib.xml
> + echo no /home/tg/pcmk-Wed-07-Aug-2013/*/cib.xml :/
> + continue
> + printf Diff crm_mon.txt...
> + ls /home/tg/pcmk-Wed-07-Aug-2013/*/crm_mon.txt
> + echo no /home/tg/pcmk-Wed-07-Aug-2013/*/crm_mon.txt :/
> + continue
> + printf Diff logd.cf...
> + ls /home/tg/pcmk-Wed-07-Aug-2013/*/logd.cf
> + echo no /home/tg/pcmk-Wed-07-Aug-2013/*/logd.cf :/
> + continue
> + printf Diff sysinfo.txt...
> + ls /home/tg/pcmk-Wed-07-Aug-2013/*/sysinfo.txt
> + echo no /home/tg/pcmk-Wed-07-Aug-2013/*/sysinfo.txt :/
> + continue
> + [ -f /home/tg/pcmk-Wed-07-Aug-2013/cluster-log.txt ]
> + cat /home/tg/pcmk-Wed-07-Aug-2013/apache-03/analysis.txt
> cat: /home/tg/pcmk-Wed-07-Aug-2013/apache-03/analysis.txt: No such file or
> directory
> + [ -s /home/tg/pcmk-Wed-07-Aug-2013/apache-03/events.txt ]
> + [ -s /home/tg/pcmk-Wed-07-Aug-2013/cluster-log.txt ]
> + cat /home/tg/pcmk-Wed-07-Aug-2013/apache-04/analysis.txt
> cat: /home/tg/pcmk-Wed-07-Aug-2013/apache-04/analysis.txt: No such file or
> directory
> + [ -s /home/tg/pcmk-Wed-07-Aug-2013/apache-04/events.txt ]
> + [ -s /home/tg/pcmk-Wed-07-Aug-2013/cluster-log.txt ]
> + log
> + printf %-10s \n apache-03:
> apache-03:
> + [ 1 = 1 ]
> + shrink /home/tg/pcmk-Wed-07-Aug-2013
> + olddir=/home/tg
> + dirname /home/tg/pcmk-Wed-07-Aug-2013
> + dir=/home/tg
> + basename /home/tg/pcmk-Wed-07-Aug-2013
> + base=pcmk-Wed-07-Aug-2013
> + target=/home/tg/pcmk-Wed-07-Aug-2013.tar
> + tar_options=cf
> + pickfirst bzip2 gzip false
> + which bzip2
> + echo bzip2
> + return 0
> + variant=bzip2
> + tar_options=jcf
> + target=/home/tg/pcmk-Wed-07-Aug-2013.tar.bz2
> + [ -e /home/tg/pcmk-Wed-07-Aug-2013.tar.bz2 ]
> + cd /home/tg
> + tar jcf /home/tg/pcmk-Wed-07-Aug-2013.tar.bz2 pcmk-Wed-07-Aug-2013
> + cd /home/tg
> + echo /home/tg/pcmk-Wed-07-Aug-2013.tar.bz2
> + fname=/home/tg/pcmk-Wed-07-Aug-2013.tar.bz2
> + rm -rf /home/tg/pcmk-Wed-07-Aug-2013
> + log Collected results are available in /home/tg/pcmk-Wed-07-Aug-2013.tar.bz2
> + printf %-10s Collected results are available in
> /home/tg/pcmk-Wed-07-Aug-2013.tar.bz2\n apache-03:
> apache-03: Collected results are available in
> /home/tg/pcmk-Wed-07-Aug-2013.tar.bz2
> + log
> + printf %-10s \n apache-03:
> apache-03:
> + log Please create a bug entry at
> + printf %-10s Please create a bug entry at\n apache-03:
> apache-03: Please create a bug entry at
> + log
> http://developerbugs.linux-foundation.org/enter_bug.cgi?product=Pacemaker
> + printf %-10s
> http://developerbugs.linux-foundation.org/enter_bug.cgi?product=Pacemaker\n
> apache-03:
> apache-03:
> http://developerbugs.linux-foundation.org/enter_bug.cgi?product=Pacemaker
> + log Include a description of your problem and attach this tarball
> + printf %-10s Include a description of your problem and attach this
> tarball\n apache-03:
> apache-03: Include a description of your problem and attach this tarball
> + log
> + printf %-10s \n apache-03:
> apache-03:
> + log Thank you for taking time to create this report.
> + printf %-10s Thank you for taking time to create this report.\n apache-03:
> apache-03: Thank you for taking time to create this report.
> + log
> + printf %-10s \n apache-03:
It really helps to read the output of the commands you're running:
Did you not see these messages the first time?
apache-03: WARN: Unknown cluster type: any
apache-03: ERROR: Could not determine the location of your cluster logs, try
specifying --logfile /some/path
apache-04: ERROR: Could not determine the location of your cluster logs, try
specifying --logfile /some/path
Try adding -H and --logfile {somevalue} next time.
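
As a rough sketch, the re-run could look like the command printed below. It reuses the time window and log path from the original invocation. That this path is the file both nodes actually log to is an assumption, as is the reading of -H as "treat this as a heartbeat cluster". The helper name report_cmd is made up for illustration and only prints the command, it does not run it.

```shell
#!/bin/sh
# Values copied from the original crm_report call quoted above.
# ASSUMPTION: LOGFILE is the file the cluster really logs to on both nodes.
LOGFILE="/var/adm/syslog/2013/08/05"
FROM="2013-08-04 18:30:00"
TO="2013-08-04 19:15"

# report_cmd (hypothetical helper): print the suggested command line.
# -H is added so the cluster type is no longer reported as "any", and
# --logfile is spelled out as the error message recommends.
report_cmd() {
    printf 'crm_report -H --logfile %s -f "%s" -t "%s"\n' \
        "$LOGFILE" "$FROM" "$TO"
}

report_cmd
```

Add -x again only if the ERROR lines still show up.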
>
> Resulting file is here:
> https://thomas.glanzmann.de/tmp/pcmk-Wed-07-Aug-2013.tar.bz2
>
>> I can't do anything with the core file I'm afraid. I don't run debian
>> at all, let alone that particular version with the same binaries,
>> libraries and symbols as you. Without those, the core file is
>> meaningless (which is why crm_report generates backtraces).
>
> I see, I also think that Debian does not package the debug symbols so
> that the core files are really useless. Please point me to the right
> packages if I'm wrong.
I have no experience with Debian.
>
>> That shouldn't have resulted in a crash.
>
> It does. I also tried to reproduce it on a 32-bit system: it still
> rebooted both nodes at the same time, but did not lose the config, and
> this time crm just reported an error and did not dump core.
>
>> I would _really_ recommend upgrading to something a little more
>> recent. And it might be time to get off heartbeat while you're at it.
>
> Just to be absolutely sure: I should upgrade to the most recent pacemaker
> release and use corosync as the communication layer?
An updated pacemaker is the important part.
Whether you switch to corosync too is up to you.
Pacemaker+heartbeat is by far the least tested combination.
>
> I tried corosync a few years back and was annoyed because back then it
> could not handle more than two heartbeat links between the nodes.
> However, I see that it now can, and at the moment I don't need more anyway.
>
> Does anyone have Debian packages that can be used in production, or
> should I package it myself?
Best to poke the Debian maintainers.
>
> Does someone have a howto guide on using the peer outdater with corosync?
I'm sure LINBIT has one somewhere.
>
> One last question about maintenance mode: I want to use maintenance mode
> to change the configuration without affecting production. See that the
> monitors take the system out of maintenance mode and then try the
> failover. I have already verified that the resource agents work
> correctly. Is that a valid use of maintenance mode, or should I always
> test my setup on a lab system and only then put it into production?
Do you mean "See that the monitors _work, then_ take the system out of
maintenance mode..."?
If so, then yes.
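
The sequence confirmed above can be sketched as a short checklist script. This is only an illustration under assumptions: it uses the crm shell and crm_mon, the function names run and maintenance_change are made up, and it defaults to a dry run that prints each step instead of touching the cluster. On a live cluster the steps are better run by hand, so the status output can be inspected before maintenance mode is switched off.

```shell
#!/bin/sh
# DRY_RUN=1 (default): print the commands. DRY_RUN=0: execute them.
DRY_RUN="${DRY_RUN:-1}"

# run (hypothetical helper): echo or execute a single command.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

maintenance_change() {
    # 1. Stop the cluster from acting on resources.
    run crm configure property maintenance-mode=true
    # 2. Make the configuration change (opens an editor).
    run crm configure edit
    # 3. One-shot status snapshot; check it before continuing.
    run crm_mon -1
    # 4. Hand control back to the cluster, then test the failover.
    run crm configure property maintenance-mode=false
}

maintenance_change
```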
>
> Cheers,
> Thomas
> _______________________________________________
> Linux-HA mailing list
> [email protected]
> http://lists.linux-ha.org/mailman/listinfo/linux-ha
> See also: http://linux-ha.org/ReportingProblems