Jason,
On Thu, Jun 21, 2018 at 10:47 AM Jason Gauthier <jagauth...@gmail.com> wrote:
On Thu, Jun 21, 2018 at 9:49 AM Jan Pokorný <jpoko...@redhat.com> wrote:
On 21/06/18 07:05 -0400, Jason Gauthier wrote:
On Thu, Jun 21, 2018 at 5:11 AM Christine Caulfield <ccaul...@redhat.com> wrote:
On 19/06/18 18:47, Jason Gauthier wrote:
Attached!
That's very odd. I can see communication with the server and corosync in
there (so it's doing something) but no logging at all. When I start
qdevice on my systems it logs loads of messages even if it doesn't
manage to contact the server. Do you have any logging entries in
corosync.conf that might be stopping it?
I haven't checked the corosync logs for any entries before, but I just
did. There isn't anything logged.
What about syslog entries (may boil down to /var/log/messages,
journald log, or whatever sink is configured)?
I took a look, since both you and Chrissie mentioned that.
There aren't any new entries added to any of the /var/log files.
# corosync-qdevice -f -d
# date
Thu Jun 21 10:36:06 EDT 2018
# ls -lt|head
total 152072
-rw-r----- 1 root adm 68018 Jun 21 10:34 auth.log
-rw-rw-r-- 1 root utmp 18704352 Jun 21 10:34 lastlog
-rw-rw-r-- 1 root utmp 107136 Jun 21 10:34 wtmp
-rw-r----- 1 root adm 248444 Jun 21 10:34 daemon.log
-rw-r----- 1 root adm 160899 Jun 21 10:34 syslog
-rw-r----- 1 root adm 1119856 Jun 21 09:46 kern.log
I did look through daemon, messages, and syslog just to be sure.
Where did the binary come from? Did you build it yourself or is it from
a package? I wonder if it's got corrupted or is a bad version. Possibly
linked against a 'dodgy' libqb - there have been some things going on
there that could cause logging to go missing in some circumstances.
Honza (the qdevice expert) is away at the moment, so I'm guessing a bit
here anyway!
Corosync-qdevice is using same config as corosync, so to get messages on
stderr, please configure
logging.to_stderr: on
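For reference, in corosync.conf itself that key lives inside the logging section. A minimal fragment, assuming no other logging options need to change (any existing entries in your logging section stay as they are):

```
logging {
    to_stderr: on
}
```

With that in place, running corosync-qdevice -f -d in the foreground should print its messages to the terminal.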
Hmm. Interesting. I installed the debian package. When it didn't
work, I grabbed the source from github. They both act the same way,
but if there is an underlying library issue then that will continue to
be a problem.
It doesn't say much:
/usr/lib/x86_64-linux-gnu/libqb.so.0.18.1
You are likely using libqb v1.0.1.
Correct. I didn't even think to look at the output of dpkg -l for the
package version.
Debian 9 also packages binutils-2.28
The ability to figure out the proper package version is one of the most
basic skills needed to provide useful diagnostics about issues with
distro-provided packages.
With Debian, the proper incantation seems to be
dpkg -s libqb-dev | grep -i version
or
apt list libqb-dev
(or substitute libqb0 for libqb-dev).
As Chrissie mentioned, things can go wrong if you build with the ld
linker from binutils 2.29+ while this old libqb is in the mix, so if the
issues persist and logging still seems to be missing, try recompiling
with binutils downgraded to a version below that breakage point.
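For what it's worth, the comparison against that 2.29 threshold can be scripted. This is only a sketch: the helper name version_ge is made up for illustration, it assumes GNU sort with -V is available, and it parses the first line of "ld --version", whose exact format may differ between distributions.

```shell
# version_ge A B: succeed if version A >= version B.
# Hypothetical helper; relies on GNU sort's -V (version sort).
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n1)" = "$2" ]
}

# Pull the version number out of the first line of `ld --version`.
# Stays empty if ld is not installed or the line has an unexpected format.
ld_ver=$(ld --version 2>/dev/null | head -n1 \
    | grep -oE '[0-9]+\.[0-9]+(\.[0-9]+)?' | head -n1 || true)

if [ -z "$ld_ver" ]; then
    echo "could not determine the ld version"
elif version_ge "$ld_ver" 2.29; then
    echo "ld $ld_ver is at or above 2.29: logging breakage with old libqb is possible"
else
    echo "ld $ld_ver is below 2.29"
fi
```

On a stock Debian 9 system with binutils 2.28 this should take the last branch, which would suggest the linker is not the culprit for the packaged binary.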
Since the system already has a lower numbered binutils (2.28) I wonder
if I should attempt to build a newer version of the libqb library.
As Chrissie mentioned, I will open a bug with Debian in the interim.
But I don't believe I will see a resolution to that any time soon. :)
I was finally able to look at this problem again, and found that qnetd
is giving me some messaging, but I don't know what to do with it.
Jun 29 16:34:35 debug New client connected
Jun 29 16:34:35 debug cluster name = zeta
Jun 29 16:34:35 debug tls started = 1
Jun 29 16:34:35 debug tls peer certificate verified = 1
Jun 29 16:34:35 debug node_id = 1084772368
Jun 29 16:34:35 debug pointer = 0x563afd609d70
Jun 29 16:34:35 debug addr_str = ::ffff:192.168.80.16:38010
Jun 29 16:34:35 debug ring id = (40a85010.89ec)
Jun 29 16:34:35 debug cluster dump:
Jun 29 16:34:35 debug client = ::ffff:192.168.80.16:38010,
node_id = 1084772368
Jun 29 16:34:35 debug Client ::ffff:192.168.80.16:38010 (cluster
zeta, node_id 1084772368) sent initial node list.
Jun 29 16:34:35 debug msg seq num 4
Jun 29 16:34:35 debug node list:
Jun 29 16:34:35 error ffsplit: Received empty config node list for
client ::ffff:192.168.80.16:38010
Yes, this is interesting. Could you please share your config?
Jun 29 16:34:35 error Algorithm returned error code. Sending error reply.
Jun 29 16:34:35 debug Client ::ffff:192.168.80.16:38010 (cluster
zeta, node_id 1084772368) sent membership node list.
Jun 29 16:34:35 debug msg seq num 5
Jun 29 16:34:35 debug ring id = (40a85010.89ec)
Jun 29 16:34:35 debug node list:
Jun 29 16:34:35 debug node_id = 1084772368, data_center_id = 0,
node_state = not set
Jun 29 16:34:35 debug node_id = 1084772369, data_center_id = 0,
node_state = not set
Jun 29 16:34:35 debug Algorithm result vote is Ask later
Jun 29 16:34:35 debug Client ::ffff:192.168.80.16:38010 (cluster
zeta, node_id 1084772368) sent quorum node list.
Jun 29 16:34:35 debug msg seq num 6
Jun 29 16:34:35 debug quorate = 1
Jun 29 16:34:35 debug node list:
Jun 29 16:34:35 debug node_id = 1084772368, data_center_id = 0,
node_state = member
Jun 29 16:34:35 debug node_id = 1084772369, data_center_id = 0,
node_state = member
It looks like "config node list" is empty, but the other lists are
not. I'm not sure where it's getting that node list from. For fun, I
added
nodelist {
node {
alpha: 192.168.80.16
}
node {
beta: 192.168.80.17
}
}
}
This is not what a nodelist looks like. It should look like:
nodelist {
node {
ring0_addr: 192.168.80.16
nodeid: 1
}
node {
ring0_addr: 192.168.80.17
nodeid: 2
}
}
But it's really weird that corosync-qdevice started without a proper
nodelist (it shouldn't).
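A quick sanity check for this kind of mistake can be sketched in shell. The function name check_nodelist is hypothetical, and the check is deliberately crude: it only counts "node {" blocks and ring0_addr lines in the given file and does not parse the config the way corosync does.

```shell
# check_nodelist FILE: print how many "node {" blocks and ring0_addr
# entries FILE contains; succeed only if there is at least one node
# block and every node block has a matching ring0_addr line.
# Hypothetical helper, a rough text check rather than a real parser.
check_nodelist() {
    awk '
        /^[ \t]*node[ \t]*[{]/     { nodes++ }
        /^[ \t]*ring0_addr[ \t]*:/ { addrs++ }
        END {
            printf "nodes=%d ring0_addr=%d\n", nodes, addrs
            ok = (nodes > 0 && nodes == addrs)
            exit(ok ? 0 : 1)
        }
    ' "$1"
}

# Example usage (the path is the usual default, adjust as needed):
#   check_nodelist /etc/corosync/corosync.conf || echo "nodelist looks incomplete"
```

Run against the nodelist with the alpha/beta keys, this would report two node blocks and zero ring0_addr entries and return failure. What corosync actually loaded can also be inspected with corosync-cmapctl, filtering its output for nodelist keys.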
Honza
to corosync.conf, and restarted both nodes. But that didn't help.
_______________________________________________
Users mailing list: Users@clusterlabs.org
https://lists.clusterlabs.org/mailman/listinfo/users
Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org