Jason,

On Thu, Jun 21, 2018 at 10:47 AM Jason Gauthier <jagauth...@gmail.com> wrote:

On Thu, Jun 21, 2018 at 9:49 AM Jan Pokorný <jpoko...@redhat.com> wrote:

On 21/06/18 07:05 -0400, Jason Gauthier wrote:
On Thu, Jun 21, 2018 at 5:11 AM Christine Caulfield <ccaul...@redhat.com> wrote:
On 19/06/18 18:47, Jason Gauthier wrote:
Attached!

That's very odd. I can see communication with the server and corosync in
there (so it's doing something) but no logging at all. When I start
qdevice on my systems it logs loads of messages even if it doesn't
manage to contact the server. Do you have any logging entries in
corosync.conf that might be stopping it?

I haven't checked the corosync logs for any entries before, but I just
did.  There isn't anything logged.

What about syslog entries (may boil down to /var/log/messages,
journald log, or whatever sink is configured)?

I took a look, since both you and Chrissie mentioned that.

There aren't any new entries added to any of the /var/log files.

# corosync-qdevice -f -d
# date
Thu Jun 21 10:36:06 EDT 2018

# ls -lt|head
total 152072
-rw-r----- 1 root        adm          68018 Jun 21 10:34 auth.log
-rw-rw-r-- 1 root        utmp      18704352 Jun 21 10:34 lastlog
-rw-rw-r-- 1 root        utmp        107136 Jun 21 10:34 wtmp
-rw-r----- 1 root        adm         248444 Jun 21 10:34 daemon.log
-rw-r----- 1 root        adm         160899 Jun 21 10:34 syslog
-rw-r----- 1 root        adm        1119856 Jun 21 09:46 kern.log

I did look through daemon, messages, and syslog just to be sure.

Where did the binary come from? Did you build it yourself or is it from
a package? I wonder if it's got corrupted or is a bad version. Possibly
linked against a 'dodgy' libqb - there have been some things going on
there that could cause logging to go missing in some circumstances.

Honza (the qdevice expert) is away at the moment, so I'm guessing a bit
here anyway!

corosync-qdevice uses the same config as corosync, so to get messages on stderr, please configure

logging.to_stderr: on
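
As a rough sketch, assuming the option goes in the logging section of /etc/corosync/corosync.conf, the setting could be written as shown here. The debug line is an extra assumption for more verbose output and was not requested above; corosync.conf(5) documents yes and no as the values for to_stderr.

```
logging {
        to_stderr: yes
        # optional, assumed here only to get more detail
        debug: on
}
```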



Hmm. Interesting.  I installed the Debian package.  When it didn't
work, I grabbed the source from GitHub.  They both act the same way,
but if there is an underlying library issue then that will continue to
be a problem.

It doesn't say much:
/usr/lib/x86_64-linux-gnu/libqb.so.0.18.1

You are likely using libqb v1.0.1.

Correct. I didn't even think to look at the output of dpkg -l for the
package version.
Debian 9 also packages binutils-2.28

The ability to figure out the proper package version is one of the most
basic skills for providing useful diagnostics about issues with
distro-provided packages.

With Debian, the proper incantation seems to be

   dpkg -s libqb-dev | grep -i version

or

   apt list libqb-dev

(or substitute libqb0 for libqb-dev).

As Chrissie mentioned, there is some fishiness possible if you happen
to build with the ld linker from binutils 2.29+ with this old
libqb in the mix, so if the issues persist and logging seems to be
missing, try recompiling with a binutils package downgraded below
that breakage point.

Since the system already has a lower numbered binutils (2.28) I wonder
if I should attempt to build a newer version of the libqb library.

As Chrissie mentioned, I will open a bug with Debian in the interim.
But I don't believe I will see a resolution to that any time soon. :)

I was finally able to look at this problem again, and found that qnetd
is giving me some messaging, but I don't know what to do with it.

Jun 29 16:34:35 debug   New client connected
Jun 29 16:34:35 debug     cluster name = zeta
Jun 29 16:34:35 debug     tls started = 1
Jun 29 16:34:35 debug     tls peer certificate verified = 1
Jun 29 16:34:35 debug     node_id = 1084772368
Jun 29 16:34:35 debug     pointer = 0x563afd609d70
Jun 29 16:34:35 debug     addr_str = ::ffff:192.168.80.16:38010
Jun 29 16:34:35 debug     ring id = (40a85010.89ec)
Jun 29 16:34:35 debug     cluster dump:
Jun 29 16:34:35 debug       client = ::ffff:192.168.80.16:38010, node_id = 1084772368
Jun 29 16:34:35 debug   Client ::ffff:192.168.80.16:38010 (cluster zeta, node_id 1084772368) sent initial node list.
Jun 29 16:34:35 debug     msg seq num 4
Jun 29 16:34:35 debug     node list:
Jun 29 16:34:35 error   ffsplit: Received empty config node list for client ::ffff:192.168.80.16:38010

Yes, this is interesting. Could you please share your config?

Jun 29 16:34:35 error   Algorithm returned error code. Sending error reply.
Jun 29 16:34:35 debug   Client ::ffff:192.168.80.16:38010 (cluster zeta, node_id 1084772368) sent membership node list.
Jun 29 16:34:35 debug     msg seq num 5
Jun 29 16:34:35 debug     ring id = (40a85010.89ec)
Jun 29 16:34:35 debug     node list:
Jun 29 16:34:35 debug       node_id = 1084772368, data_center_id = 0, node_state = not set
Jun 29 16:34:35 debug       node_id = 1084772369, data_center_id = 0, node_state = not set
Jun 29 16:34:35 debug   Algorithm result vote is Ask later
Jun 29 16:34:35 debug   Client ::ffff:192.168.80.16:38010 (cluster zeta, node_id 1084772368) sent quorum node list.
Jun 29 16:34:35 debug     msg seq num 6
Jun 29 16:34:35 debug     quorate = 1
Jun 29 16:34:35 debug     node list:
Jun 29 16:34:35 debug       node_id = 1084772368, data_center_id = 0, node_state = member
Jun 29 16:34:35 debug       node_id = 1084772369, data_center_id = 0, node_state = member

It looks like "config node list" is empty, but the other lists are
not.  I'm not sure where it's getting that node list from.  For fun, I
added
nodelist {
     node {
        alpha: 192.168.80.16
      }
     node {
        beta: 192.168.80.17
     }
   }
}

This is not what a nodelist looks like. It should look like:
nodelist {
        node {
                ring0_addr: 192.168.80.16
                nodeid: 1
        }
        node {
                ring0_addr: 192.168.80.17
                nodeid: 2
        }
}

But it's really weird that corosync-qdevice started without a proper nodelist (it shouldn't).
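
For context, here is a minimal sketch of how such a nodelist might sit next to a quorum device section in corosync.conf. The cluster name, node addresses, and the ffsplit algorithm come from the log and the nodelist above. The qnetd host name is a placeholder, and the remaining values are assumptions, not the poster's actual settings.

```
totem {
        version: 2
        cluster_name: zeta
}

nodelist {
        node {
                ring0_addr: 192.168.80.16
                nodeid: 1
        }
        node {
                ring0_addr: 192.168.80.17
                nodeid: 2
        }
}

quorum {
        provider: corosync_votequorum
        device {
                model: net
                votes: 1
                net {
                        # placeholder: host running corosync-qnetd
                        host: qnetd.example.org
                        algorithm: ffsplit
                        tls: on
                }
        }
}
```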

Honza

to corosync.conf, and restarted both nodes. But that didn't help.
_______________________________________________
Users mailing list: Users@clusterlabs.org
https://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org

