Liviu;

First: what version of Ceph are you running?
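
You can check what the cluster is running with:

        ceph versions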

Second: I don't see a cluster network option in your configuration file.

At least for us, running Nautilus, there are no underscores (_) in the option
names, so our configuration files look like this:

[global]
        auth cluster required = cephx
        public network = <NETWORK>/<MASK>
        cluster network = <NETWORK>/<MASK>
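
You can also ask a daemon what it actually parsed, via its admin socket on
the node where it runs, e.g.:

        ceph daemon mon.nbs-vp-01 config get public_network
        ceph daemon osd.0 config get cluster_network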

Thank you,

Dominic L. Hilsbos, MBA 
Director - Information Technology 
Perform Air International Inc.
dhils...@performair.com 
www.PerformAir.com


-----Original Message-----
From: Liviu Sas [mailto:droop...@gmail.com] 
Sent: Sunday, March 22, 2020 5:03 PM
To: ceph-users@ceph.io
Subject: [ceph-users] ceph ignoring cluster/public_network when initiating TCP connections

Hello,

While testing our ceph cluster setup, I noticed a possible issue with the
cluster/public network configuration being ignored for TCP session
initiation.

It looks like the daemons (mon/mgr/mds/osd) are all listening on the right IP
address but are initiating TCP sessions from the wrong interface.
Would it be possible to force the ceph daemons to use the cluster/public IP
addresses when initiating new TCP connections, instead of letting the kernel
choose?
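
If something along the lines of the ms_bind_before_connect option (present in
recent releases as far as we can tell, and off by default) covers this, then a
global setting like the sketch below is what we are after:

[global]
         # unverified assumption: bind to the configured public/cluster
         # address before initiating outbound connections
         ms_bind_before_connect = true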

Some details below:

We set everything up to use our "10.2.1.0/24" network:
10.2.1.x (x = node number: 1, 2, 3)
But we can see TCP sessions being initiated from the "10.2.0.0/24" network.

So the daemons are listening on the right IP addresses:
root@nbs-vp-01:~# lsof -nP -i | grep ceph | grep LISTEN
ceph-mds  1541648  ceph  16u  IPv4  8169344  0t0  TCP 10.2.1.1:6800 (LISTEN)
ceph-mds  1541648  ceph  17u  IPv4  8169346  0t0  TCP 10.2.1.1:6801 (LISTEN)
ceph-mgr  1541654  ceph  25u  IPv4  8163039  0t0  TCP 10.2.1.1:6810 (LISTEN)
ceph-mgr  1541654  ceph  27u  IPv4  8163051  0t0  TCP 10.2.1.1:6811 (LISTEN)
ceph-mon  1541703  ceph  27u  IPv4  8170914  0t0  TCP 10.2.1.1:3300 (LISTEN)
ceph-mon  1541703  ceph  28u  IPv4  8170915  0t0  TCP 10.2.1.1:6789 (LISTEN)
ceph-osd  1541711  ceph  16u  IPv4  8169353  0t0  TCP 10.2.1.1:6802 (LISTEN)
ceph-osd  1541711  ceph  17u  IPv4  8169357  0t0  TCP 10.2.1.1:6803 (LISTEN)
ceph-osd  1541711  ceph  18u  IPv4  8169362  0t0  TCP 10.2.1.1:6804 (LISTEN)
ceph-osd  1541711  ceph  19u  IPv4  8169368  0t0  TCP 10.2.1.1:6805 (LISTEN)
ceph-osd  1541711  ceph  20u  IPv4  8169375  0t0  TCP 10.2.1.1:6806 (LISTEN)
ceph-osd  1541711  ceph  21u  IPv4  8169383  0t0  TCP 10.2.1.1:6807 (LISTEN)
ceph-osd  1541711  ceph  22u  IPv4  8169392  0t0  TCP 10.2.1.1:6808 (LISTEN)
ceph-osd  1541711  ceph  23u  IPv4  8169402  0t0  TCP 10.2.1.1:6809 (LISTEN)
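
For what it's worth, the same listeners can be cross-checked with ss:

        # TCP listeners, numeric addresses, with owning processes
        ss -tlnp | grep ceph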

But sessions to the other nodes are initiated from the wrong source address:

root@nbs-vp-01:~# lsof -nP -i | grep ceph | grep 10.2.1.2
ceph-mds  1541648  ceph  28u  IPv4  8279520  0t0  TCP 10.2.0.2:44180->10.2.1.2:6800 (ESTABLISHED)
ceph-mgr  1541654  ceph  41u  IPv4  8289842  0t0  TCP 10.2.0.2:44146->10.2.1.2:6800 (ESTABLISHED)
ceph-mon  1541703  ceph  40u  IPv4  8174827  0t0  TCP 10.2.0.2:40864->10.2.1.2:3300 (ESTABLISHED)
ceph-osd  1541711  ceph  65u  IPv4  8171035  0t0  TCP 10.2.0.2:58716->10.2.1.2:6804 (ESTABLISHED)
ceph-osd  1541711  ceph  66u  IPv4  8172960  0t0  TCP 10.2.0.2:54586->10.2.1.2:6806 (ESTABLISHED)
root@nbs-vp-01:~# lsof -nP -i | grep ceph | grep 10.2.1.3
ceph-mds  1541648  ceph  30u  IPv4  8292421  0t0  TCP 10.2.0.2:45710->10.2.1.3:6802 (ESTABLISHED)
ceph-mon  1541703  ceph  46u  IPv4  8173025  0t0  TCP 10.2.0.2:40164->10.2.1.3:3300 (ESTABLISHED)
ceph-osd  1541711  ceph  67u  IPv4  8173043  0t0  TCP 10.2.0.2:56920->10.2.1.3:6804 (ESTABLISHED)
ceph-osd  1541711  ceph  68u  IPv4  8171063  0t0  TCP 10.2.0.2:41952->10.2.1.3:6806 (ESTABLISHED)
ceph-osd  1541711  ceph  69u  IPv4  8178891  0t0  TCP 10.2.0.2:57890->10.2.1.3:6808 (ESTABLISHED)
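
The kernel's default source pick for these destinations can be confirmed with
"ip route get", and pinned as a stopgap with a src hint on the route (the
interface name below is hypothetical):

        # show which source address the kernel selects for node 2
        ip route get 10.2.1.2

        # stopgap: pin the preferred source address for the cluster
        # subnet (adjust the interface name to yours)
        ip route change 10.2.1.0/24 dev eth1 src 10.2.1.1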


See our cluster config below:

[global]
         auth_client_required = cephx
         auth_cluster_required = cephx
         auth_service_required = cephx
         cluster_network = 10.2.1.0/24
         fsid = 0f19b6ff-0432-4c3f-b0cb-730e8302dc2c
         mon_allow_pool_delete = true
         mon_host = 10.2.1.1 10.2.1.2 10.2.1.3
         osd_pool_default_min_size = 2
         osd_pool_default_size = 3
         public_network = 10.2.1.0/24

[client]
         keyring = /etc/pve/priv/$cluster.$name.keyring

[mds]
         keyring = /var/lib/ceph/mds/ceph-$id/keyring

[mds.nbs-vp-01]
         host = nbs-vp-01
         mds_standby_for_name = pve

[mds.nbs-vp-03]
         host = nbs-vp-03
         mds_standby_for_name = pve

[osd.0]
        public addr = 10.2.1.1
        cluster addr = 10.2.1.1

[osd.1]
        public addr = 10.2.1.2
        cluster addr = 10.2.1.2

[osd.2]
        public addr = 10.2.1.3
        cluster addr = 10.2.1.3

[mgr.nbs-vp-01]
        public addr = 10.2.1.1

[mgr.nbs-vp-02]
        public addr = 10.2.1.2

[mgr.nbs-vp-03]
        public addr = 10.2.1.3

[mon.nbs-vp-01]
        public addr = 10.2.1.1

[mon.nbs-vp-02]
        public addr = 10.2.1.2

[mon.nbs-vp-03]
        public addr = 10.2.1.3

Cheers,
Liviu
_______________________________________________
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io