Re: [ceph-users] slow request and unresponsive kvm guests after upgrading ceph cluster and os, please help debugging
Quoting Paul Emmerich (paul.emmer...@croit.io):
> We've also seen some problems with FileStore on newer kernels; 4.9 is
> the last kernel that worked reliably with FileStore in my experience.
>
> But I haven't seen problems with BlueStore related to the kernel
> version (well, except for that scrub bug, but my work-around for that
> is in all release versions).

What scrub bug are you talking about?

Gr. Stefan

-- 
| BIT BV  https://www.bit.nl/       Kamer van Koophandel 09090351
| GPG: 0xD14839C6                   +31 318 648 688 / i...@bit.nl
Re: [ceph-users] slow request and unresponsive kvm guests after upgrading ceph cluster and os, please help debugging
Quoting Jelle de Jong (jelledej...@powercraft.nl):
> question 2: what systemd target can I use to run a service after all
> ceph-osds are loaded? I tried ceph.target and ceph-osd.target; both do
> not work reliably.

ceph-osd.target works for us (every time). Have you enabled all the
individual OSD services, i.e. ceph-osd@0.service?

> question 3: should I still try to upgrade to bluestore or pray to the
> system gods that my performance is back after many many hours of
> troubleshooting?

I would suggest the first; the second is optional ;-). Especially
because you have a separate NVMe device you can use for WAL / DB. It
has advantages over filestore ...

> I made a few changes I am going to just list them for other people
> that are suffering from slow performance after upgrading their Ceph
> and/or OS.
>
> Disk utilization is back around 10%, no more 80-100% ... and rados
> bench is stable again.
>
> apt-get install irqbalance nftables

^^ Are these some of those changes? Do you need those packages in order
to unload / blacklist them? I don't get what your fixes are, or what
the problem was. Firewall issues? What Ceph version did you upgrade to?

Gr. Stefan

-- 
| BIT BV  https://www.bit.nl/       Kamer van Koophandel 09090351
| GPG: 0xD14839C6                   +31 318 648 688 / i...@bit.nl
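A minimal sketch of the above, assuming OSD ids 0-3 and a hypothetical
unit name (adjust both to your setup): enable every per-OSD service,
then order your own service after ceph-osd.target.

  systemctl enable ceph-osd@0.service ceph-osd@1.service \
                   ceph-osd@2.service ceph-osd@3.service

  # cat /etc/systemd/system/post-osd-tuning.service  (name is illustrative)
  [Unit]
  Description=Run tuning after all local OSDs are started
  After=ceph-osd.target
  Wants=ceph-osd.target

  [Service]
  Type=oneshot
  ExecStart=/usr/local/bin/post-osd-tuning.sh

  [Install]
  WantedBy=multi-user.target

With After=ceph-osd.target the unit is only started once the target is
reached, which in turn depends on the enabled per-OSD services.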
Re: [ceph-users] slow request and unresponsive kvm guests after upgrading ceph cluster and os, please help debugging
Hello everybody,

I think I fixed the issues after weeks of looking.

question 1: does anyone know how to prevent iptables, nftables or
conntrack from being loaded in the first place? Adding them to
/etc/modprobe.d/blacklist.local.conf does not seem to work. What is
recommended?

question 2: what systemd target can I use to run a service after all
ceph-osds are loaded? I tried ceph.target and ceph-osd.target; both do
not work reliably.

question 3: should I still try to upgrade to bluestore or pray to the
system gods that my performance is back after many many hours of
troubleshooting?

I made a few changes I am going to just list them for other people that
are suffering from slow performance after upgrading their Ceph and/or
OS.

Disk utilization is back around 10%, no more 80-100% ... and rados
bench is stable again.

apt-get install irqbalance nftables

# cat /etc/ceph/ceph.conf
[global]
fsid = 5f8d3724-1a51-4895-9b3e-5eb90ea49782
mon_initial_members = ceph01, ceph02, ceph03
mon_host = 192.168.35.11,192.168.35.12,192.168.35.13
auth_cluster_required = cephx
auth_service_required = cephx
auth_client_required = cephx
filestore_xattr_use_omap = true
osd pool default size = 3
public network = 192.168.35.0/28
cluster network = 192.168.35.0/28
osd pool default min size = 2
osd scrub begin hour = 23
osd scrub end hour = 6
# default osd recovery max active = 3
osd recovery max active = 1
#setuser match path = /var/lib/ceph/$type/$cluster-$id
debug_default = 0
debug_lockdep = 0/0
debug_context = 0/0
debug_crush = 0/0
debug_buffer = 0/0
debug_timer = 0/0
debug_filer = 0/0
debug_objecter = 0/0
debug_rados = 0/0
debug_rbd = 0/0
debug_journaler = 0/0
debug_objectcacher = 0/0
debug_client = 0/0
debug_osd = 0/0
debug_optracker = 0/0
debug_objclass = 0/0
debug_filestore = 0/0
debug_journal = 0/0
debug_ms = 0/0
debug_monc = 0/0
debug_tp = 0/0
filestore_op_threads = 8
filestore_max_inline_xattr_size = 254
filestore_max_inline_xattrs = 6
filestore_queue_max_ops = 500
filestore_queue_committing_max_ops = 5000
filestore_merge_threshold = 40
filestore_split_multiple = 10
journal_max_write_entries = 1000
journal_queue_max_ops = 3000
journal_max_write_bytes = 1048576000
osd_mkfs_options_xfs = -f -I size=2048
osd_mount_options_xfs = noatime,largeio,nobarrier,inode64,allocsize=8M
osd_op_threads = 32
osd_journal_size = 1
filestore_queue_max_bytes = 1048576000
filestore_queue_committing_max_bytes = 1048576000
journal_queue_max_bytes = 1048576000
filestore_max_sync_interval = 10
filestore_journal_parallel = true

[client]
rbd cache = true
#rbd cache max dirty = 0

# cat /etc/sysctl.d/30-nic-10gbit.conf
net.ipv4.tcp_rmem = 1000 1000 1000
net.ipv4.tcp_wmem = 1000 1000 1000
net.ipv4.tcp_mem = 1000 1000 1000
net.core.rmem_default = 524287
net.core.wmem_default = 524287
net.core.rmem_max = 524287
net.core.wmem_max = 524287
net.core.netdev_max_backlog = 30

Unload all forms of filtering; blacklisting alone does not work, the
modules keep getting loaded! I guess they are auto loaded by the
kernel.
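One possible answer to question 1, as a sketch: a "blacklist" line only
stops alias-based autoloading; a request for the module by its exact
name (which is how the netfilter modules get pulled in) still succeeds.
An "install ... /bin/false" override refuses those requests as well.
File name below is illustrative and this is untested on these nodes:

  # cat /etc/modprobe.d/disable-filtering.conf  (hypothetical file)
  install ip_tables /bin/false
  install iptable_filter /bin/false
  install ip6_tables /bin/false
  install ip6table_filter /bin/false
  install nf_tables /bin/false

  update-initramfs -u -k all

The plain blacklist entries that were actually applied here: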
echo "blacklist ip_tables" | tee --append /etc/modprobe.d/blacklist.local.conf echo "blacklist iptable_filter" | tee --append /etc/modprobe.d/blacklist.local.conf echo "blacklist ip6_tables" | tee --append /etc/modprobe.d/blacklist.local.conf echo "blacklist ip6table_filter" | tee --append /etc/modprobe.d/blacklist.local.conf echo "blacklist nf_tables" | tee --append /etc/modprobe.d/blacklist.local.conf echo "blacklist nf6_tables" | tee --append /etc/modprobe.d/blacklist.local.conf depmod -a update-initramfs -u -k all -v root@ceph02:~# cat /etc/rc.local #!/bin/bash -e # # rc.local # # This script is executed at the end of each multiuser runlevel. # Make sure that the script will "exit 0" on success or any other # value on error. # # In order to enable or disable this script just change the execution # bits. # # By default this script does nothing. for i in {a..e}; doecho 512 > /sys/block/sd$i/queue/read_ahead_kb; done for i in {a..d}; dohdparm -q -B 255 -q -W0 /dev/sd$i; done echo 'on' > '/sys/bus/pci/devices/:00:01.0/power/control' echo 'on' > '/sys/bus/pci/devices/:00:03.0/power/control' echo 'on' > '/sys/bus/pci/devices/:00:01.0/power/control' cpupower frequency-set --governor performance modprobe -r iptable_filter ip_tables ip6table_filter ip6_tables nf_tables_ipv6 nf_tables_ipv4 nf_tables_bridge nf_tables array=($(pidof ceph-osd)) taskset -cp 0-5 $(echo ${array[0]}) taskset -cp 12-17 $(echo ${array[1]}) taskset -cp 6-11 $(echo ${array[2]}) taskset -cp 18-23 $(echo ${array[3]}) exit 0 Please also save the pastebin from my OP there is a lot of benchmark and test notes in there. root@ceph02:~# rados bench -p scbench 10 write --no-cleanup hints = 1 Maintaining 16 concurrent writes of 4194304 bytes to objects of size 4194304 for up to 10 seconds or 0 objects Object prefix: benchmark_data_ceph02_396172 sec Cur ops started finished avg MB/s cur MB/s last lat(s) avg
Re: [ceph-users] slow request and unresponsive kvm guests after upgrading ceph cluster and os, please help debugging
We've also seen some problems with FileStore on newer kernels; 4.9 is
the last kernel that worked reliably with FileStore in my experience.

But I haven't seen problems with BlueStore related to the kernel
version (well, except for that scrub bug, but my work-around for that
is in all release versions).

Paul

-- 
Paul Emmerich

Looking for help with your Ceph cluster? Contact us at https://croit.io

croit GmbH
Freseniusstr. 31h
81247 München
www.croit.io
Tel: +49 89 1896585 90

On Mon, Jan 6, 2020 at 8:44 PM Jelle de Jong wrote:
> Hello everybody,
>
> I have issues with very slow requests on a simple three node cluster
> here: four WDC enterprise disks and an Intel Optane NVMe journal on
> identical high memory nodes, with 10GB networking.
>
> It was working all good with Ceph Hammer on Debian Wheezy, but I
> wanted to upgrade to a supported version and test out bluestore as
> well. So I upgraded to Luminous on Debian Stretch and used ceph-volume
> to create bluestore osds, and everything went downhill from there.
>
> I went back to filestore on all nodes but I still have slow requests
> and I can not pinpoint a good reason. I tried to debug and gathered
> information to look at:
>
> https://paste.debian.net/hidden/acc5d204/
>
> First I thought it was the balancing that was making things slow, then
> I thought it might be the LVM layer, so I recreated the nodes without
> LVM by switching from ceph-volume to ceph-disk: no difference, still
> slow requests. Then I changed back from bluestore to filestore, but
> still a very slow cluster. Then I thought it was a CPU scheduling
> issue and downgraded the 5.x kernel, and CPU performance is full speed
> again. I thought maybe there was something weird with an osd and took
> them out one by one, but slow requests are still showing up and client
> performance from vms is really poor.
>
> I just feel a burst of small requests keeps blocking for a while and
> then recovers again.
>
> Many thanks for helping out looking at the URL.
>
> If there are options which I should tune for a hdd with nvme journal
> setup please share.
>
> Jelle
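For the quoted question about tuning a hdd-plus-NVMe setup: if the move
to bluestore is retried, as suggested earlier in the thread, the NVMe
device can carry the RocksDB/WAL instead of a filestore journal. A
sketch with placeholder device names (adjust to the actual disks):

  ceph-volume lvm create --bluestore --data /dev/sdb \
      --block.db /dev/nvme0n1p1

With only --block.db given, the WAL is placed on the same device as the
DB, so a separate --block.wal is not needed for this layout.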