Now this goes against what I thought I had learned about CephFS. You should be able to read and write to/from all OSDs, so how can it be limited to only a single OSD?
On Sat, Oct 6, 2018 at 4:30 AM Christopher Blum <[email protected]> wrote:
> I wouldn't recommend you pursue this any further, but if this is the only
> client that would reside on the same VM as the OSD, one thing you could try
> is to decrease the primary affinity to 0 [1] for the local OSD.
> That way that single OSD would never become a primary OSD ;)
>
> Disclaimer: This is more like a hack.
>
> [1] https://ceph.com/geen-categorie/ceph-primary-affinity/
>
> On Fri, Oct 5, 2018 at 10:23 PM Gregory Farnum <[email protected]> wrote:
>
>> On Fri, Oct 5, 2018 at 3:13 AM Marc Roos <[email protected]> wrote:
>>
>>> I guess then this waiting "quietly" should be looked at again, I am
>>> seeing a load of 10 on this vm.
>>>
>>> [@~]# uptime
>>>  11:51:58 up 4 days, 1:35, 1 user, load average: 10.00, 10.01, 10.05
>>>
>>> [@~]# uname -a
>>> Linux smb 3.10.0-862.11.6.el7.x86_64 #1 SMP Tue Aug 14 21:49:04 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
>>>
>>> [@~]# cat /etc/redhat-release
>>> CentOS Linux release 7.5.1804 (Core)
>>>
>>> [@~]# dmesg
>>> [348948.927734] libceph: osd23 192.168.10.114:6810 socket closed (con state CONNECTING)
>>> [348957.120090] libceph: osd27 192.168.10.114:6802 socket closed (con state CONNECTING)
>>> [349010.370171] libceph: osd26 192.168.10.114:6806 socket closed (con state CONNECTING)
>>> [349114.822301] libceph: osd24 192.168.10.114:6804 socket closed (con state CONNECTING)
>>> [349141.447330] libceph: osd29 192.168.10.114:6812 socket closed (con state CONNECTING)
>>> [349278.668658] libceph: osd25 192.168.10.114:6800 socket closed (con state CONNECTING)
>>> [349440.467038] libceph: osd28 192.168.10.114:6808 socket closed (con state CONNECTING)
>>> [349465.043957] libceph: osd23 192.168.10.114:6810 socket closed (con state CONNECTING)
>>> [349473.236400] libceph: osd27 192.168.10.114:6802 socket closed (con state CONNECTING)
>>> [349526.486408] libceph: osd26 192.168.10.114:6806 socket closed (con state CONNECTING)
>>> [349630.938498] libceph: osd24 192.168.10.114:6804 socket closed (con state CONNECTING)
>>> [349657.563561] libceph: osd29 192.168.10.114:6812 socket closed (con state CONNECTING)
>>> [349794.784936] libceph: osd25 192.168.10.114:6800 socket closed (con state CONNECTING)
>>> [349956.583300] libceph: osd28 192.168.10.114:6808 socket closed (con state CONNECTING)
>>> [349981.160225] libceph: osd23 192.168.10.114:6810 socket closed (con state CONNECTING)
>>> [349989.352510] libceph: osd27 192.168.10.114:6802 socket closed (con state CONNECTING)
>>
>> Looks like in this case the client is spinning trying to establish the
>> network connections it expects to be available. There's not really much
>> else it can do — we expect and require full routing. The monitors are
>> telling the clients that the OSDs are up and available, and it is doing
>> data IO that requires them. So it tries to establish a connection, sees
>> the network fail, and tries again.
>>
>> Unfortunately the restricted-network use case you're playing with here is
>> just not supported by Ceph.
>> -Greg
>>
>>> ..
>>> ..
>>> ..
>>> -----Original Message-----
>>> From: John Spray [mailto:[email protected]]
>>> Sent: Thursday, September 27, 2018 11:43
>>> To: Marc Roos
>>> Cc: [email protected]
>>> Subject: Re: [ceph-users] Cannot write to cephfs if some osd's are not
>>> available on the client network
>>>
>>> On Thu, Sep 27, 2018 at 10:16 AM Marc Roos <[email protected]> wrote:
>>> >
>>> > I have a test cluster, and on an osd node I put a vm. The vm is using
>>> > a macvtap on the client network interface of the osd node, making
>>> > access to the local osd's impossible.
>>> >
>>> > The vm of course reports that it cannot access the local osd's. What
>>> > I am getting is:
>>> >
>>> > - I cannot reboot this vm normally, I need to reset it.
>>>
>>> When linux tries to shut down cleanly, part of that is flushing buffers
>>> from any mounted filesystem back to disk. If you have a network
>>> filesystem mounted, and the network is unavailable, that can cause the
>>> process to block. You can try forcibly unmounting before rebooting.
>>>
>>> > - The vm is reporting a very high load.
>>>
>>> The CPU load part is surprising -- in general Ceph clients should wait
>>> quietly when blocked, rather than spinning.
>>>
>>> > I guess this should not be happening, no? Because it should choose
>>> > another available osd of the 3x replicated pool and just write the
>>> > data to that one?
>>>
>>> No -- writes always go through the primary OSD for the PG being written
>>> to. If an OSD goes down, then another OSD will become the primary. In
>>> your case, the primary OSD is not going down, it's just being cut off
>>> from the client by the network, so the writes are blocking indefinitely.
>>>
>>> John
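For anyone who wants to try the primary-affinity hack Christopher describes above, it is a one-liner (osd.23 below is just an example id for the local OSD; on older releases the monitors may also need mon_osd_allow_primary_affinity = true before they accept it):

    # never select this OSD as primary; it still holds replicas, which
    # the primaries on other hosts write to over the cluster network
    ceph osd primary-affinity osd.23 0

    # verify: the PRI-AFF column in the tree output should now show 0
    ceph osd tree

Since clients only ever talk to primaries (as John explained), this sidesteps the unreachable local OSD entirely.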
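And on John's suggestion to forcibly unmount before rebooting, something like this usually gets a hung kernel CephFS mount out of the way (/mnt/cephfs and the pool/object names are just placeholders, substitute your own):

    # force-unmount the hung CephFS mount so a clean reboot doesn't block;
    # fall back to a lazy unmount if the forced one hangs as well
    umount -f /mnt/cephfs || umount -l /mnt/cephfs

    # to see which OSD is primary for the PG a given object maps to: the
    # p<id> shown with the acting set is the primary the client must reach
    ceph osd map <poolname> <objectname>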
