Re: [ceph-users] Applications slow in VMs running RBD disks
Hi Eliza,

On Wed, 21 Aug 2019 at 09:30, Eliza wrote:
> Hi
>
> on 2019/8/21 20:25, Gesiel Galvão Bernardes wrote:
> > I'm using Qemu/KVM (OpenNebula) with Ceph/RBD to run VMs, and I'm
> > having problems with slowness in applications that often are not
> > consuming much CPU or RAM. The problem mostly affects Windows.
> > Apparently the issue is that applications typically load many small
> > files (e.g. DLLs), and these take a long time to load, causing the
> > slowness.
>
> Did you check/test your network connection?
> Do you have a fast network setup?

I have a bond of two 10 GbE interfaces, with little use.

> regards.
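A quick way to rule the network in or out, along the lines Eliza suggests, is to measure latency and throughput between the hypervisor and an OSD node directly (the hostname below is a placeholder; iperf3 must be running as a server on the far end):

    # round-trip latency from the hypervisor to an OSD node
    ping -c 10 osd-node1
    # sustained throughput; run "iperf3 -s" on osd-node1 first
    iperf3 -c osd-node1 -t 10

Small-file workloads are usually latency-bound rather than bandwidth-bound, so the ping numbers matter more here than the bond's total capacity.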
[ceph-users] Applications slow in VMs running RBD disks
Hi,

I'm using Qemu/KVM (OpenNebula) with Ceph/RBD to run VMs, and I'm having problems with slowness in applications that often are not consuming much CPU or RAM. The problem mostly affects Windows. Apparently the issue is that applications typically load many small files (e.g. DLLs), and these take a long time to load, causing the slowness.

I'm using 8 TB disks with 3x replication (I've tried erasure coding and 2x as well), and I tried with and without an SSD cache, but the problem persists. Using the same disks over NFS, the applications run fine.

I've already tried changing the RBD object size (from 4 MB down to 128 KB), using the qemu writeback cache, configuring virtio-scsi queues, and using the virtio (virtio-blk) driver, and none of these brought an effective improvement.

Has anyone had a similar problem and/or any idea how to solve it? Or an idea of where I should look?

Thanks in advance,
Gesiel
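For reference, the tuning steps mentioned above correspond roughly to commands and settings like these (pool and image names are placeholders):

    # create an RBD image with a 128 KB object size instead of the 4 MB default
    rbd create mypool/vm_test --size 102400 --object-size 128K

    # enable client-side writeback caching (ceph.conf on the hypervisor):
    #   [client]
    #   rbd cache = true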
[ceph-users] Response time of the "rbd ls" command
Hi,

I recently noticed that in two of my pools the command "rbd ls" takes several minutes to return. These pools have between 100 and 120 images each. Where should I look to find the cause of this slowness? The cluster is apparently fine, without any warnings.

Thank you very much in advance.
Gesiel
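A few things worth timing and checking (pool name is a placeholder). The plain listing only reads the pool's rbd_directory object, while "rbd ls -l" opens every image header, so a large gap between the two points at per-image metadata reads:

    time rbd ls mypool
    time rbd ls -l mypool
    # find which OSD serves the directory object, in case that OSD is the slow one
    ceph osd map mypool rbd_directory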
Re: [ceph-users] Online disk resize with Qemu/KVM and Ceph
Bingo! I changed the disk to scsi and the adapter to virtio and it is working perfectly. Thank you Marc!

Regards,
Gesiel

On Fri, 15 Feb 2019 at 10:21, Marc Roos wrote:
>
> Use a scsi disk and the virtio adapter? I think that is also
> recommended for use with ceph rbd.
>
>
> -----Original Message-----
> From: Gesiel Galvão Bernardes [mailto:gesiel.bernar...@gmail.com]
> Sent: 15 February 2019 13:16
> To: Marc Roos
> Cc: ceph-users
> Subject: Re: [ceph-users] Online disk resize with Qemu/KVM and Ceph
>
> Hi Marc,
>
> I tried this and the problem continues :-(
>
>
> On Fri, 15 Feb 2019 at 10:04, Marc Roos wrote:
>
> And then in the windows vm
> cmd
> diskpart
> Rescan
>
> Linux vm
> echo 1 > /sys/class/scsi_device/2\:0\:0\:0/device/rescan (sda)
> echo 1 > /sys/class/scsi_device/2\:0\:3\:0/device/rescan (sdd)
>
> I have this too, and have to do this too:
>
> virsh qemu-monitor-command vps-test2 --hmp "info block"
> virsh qemu-monitor-command vps-test2 --hmp "block_resize
> drive-scsi0-0-0-0 12G"
>
>
> -----Original Message-----
> From: Gesiel Galvão Bernardes [mailto:gesiel.bernar...@gmail.com]
> Sent: 15 February 2019 12:59
> To: ceph-users@lists.ceph.com
> Subject: [ceph-users] Online disk resize with Qemu/KVM and Ceph
>
> Hi,
>
> I'm building an environment for VMs with qemu/kvm and Ceph using RBD,
> and I have the following problem: the guest VM does not recognize a
> disk resize (increase). The scenario is:
>
> Host:
> CentOS 7.6
> Libvirt 4.5
> Ceph 13.2.4
>
> I follow these steps to increase a disk (e.g. from 10 GB to 20 GB):
>
> # rbd resize --size 20480 mypool/vm_test
> # virsh blockresize --domain vm_test --path vda --size 20G
>
> But after these steps the disk in the VM keeps its original size; to
> apply the change, a VM reboot is necessary.
> If I use a local datastore instead of Ceph, the VM recognizes the new
> size immediately.
>
> Does anyone else see this? Is this expected?
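For anyone finding this thread later, the fix corresponds roughly to a libvirt disk definition like the sketch below (monitor host, auth secret and image name are placeholders, and the exact XML may differ by libvirt version):

    <controller type='scsi' model='virtio-scsi'/>
    <disk type='network' device='disk'>
      <driver name='qemu' type='raw' cache='writeback'/>
      <auth username='libvirt'>
        <secret type='ceph' uuid='00000000-0000-0000-0000-000000000000'/>
      </auth>
      <source protocol='rbd' name='mypool/vm_test'>
        <host name='mon1.example.com' port='6789'/>
      </source>
      <target dev='sda' bus='scsi'/>
    </disk>

With bus='scsi' on a virtio-scsi controller the guest sees the disk as sda, which is the combination that made online resize work in this thread.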
Re: [ceph-users] Online disk resize with Qemu/KVM and Ceph
Hi Marc,

I tried this and the problem continues :-(

On Fri, 15 Feb 2019 at 10:04, Marc Roos wrote:
>
>
> And then in the windows vm
> cmd
> diskpart
> Rescan
>
> Linux vm
> echo 1 > /sys/class/scsi_device/2\:0\:0\:0/device/rescan (sda)
> echo 1 > /sys/class/scsi_device/2\:0\:3\:0/device/rescan (sdd)
>
> I have this too, and have to do this too:
>
> virsh qemu-monitor-command vps-test2 --hmp "info block"
> virsh qemu-monitor-command vps-test2 --hmp "block_resize
> drive-scsi0-0-0-0 12G"
>
>
> -----Original Message-----
> From: Gesiel Galvão Bernardes [mailto:gesiel.bernar...@gmail.com]
> Sent: 15 February 2019 12:59
> To: ceph-users@lists.ceph.com
> Subject: [ceph-users] Online disk resize with Qemu/KVM and Ceph
>
> Hi,
>
> I'm building an environment for VMs with qemu/kvm and Ceph using RBD,
> and I have the following problem: the guest VM does not recognize a
> disk resize (increase). The scenario is:
>
> Host:
> CentOS 7.6
> Libvirt 4.5
> Ceph 13.2.4
>
> I follow these steps to increase a disk (e.g. from 10 GB to 20 GB):
>
> # rbd resize --size 20480 mypool/vm_test
> # virsh blockresize --domain vm_test --path vda --size 20G
>
> But after these steps the disk in the VM keeps its original size; to
> apply the change, a VM reboot is necessary.
> If I use a local datastore instead of Ceph, the VM recognizes the new
> size immediately.
>
> Does anyone else see this? Is this expected?
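A variant of the guest-side rescan above that covers every SCSI device at once (a sketch; run as root inside a Linux guest on a virtio-scsi bus):

    # ask the kernel to re-read the capacity of each SCSI device
    for dev in /sys/class/scsi_device/*/device/rescan; do
        echo 1 > "$dev"
    done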
[ceph-users] Online disk resize with Qemu/KVM and Ceph
Hi,

I'm building an environment for VMs with qemu/kvm and Ceph using RBD, and I have the following problem: the guest VM does not recognize a disk resize (increase). The scenario is:

Host:
CentOS 7.6
Libvirt 4.5
Ceph 13.2.4

I follow these steps to increase a disk (e.g. from 10 GB to 20 GB):

# rbd resize --size 20480 mypool/vm_test
# virsh blockresize --domain vm_test --path vda --size 20G

But after these steps the disk in the VM keeps its original size; to apply the change, a VM reboot is necessary.
If I use a local datastore instead of Ceph, the VM recognizes the new size immediately.

Does anyone else see this? Is this expected?
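Before looking inside the guest, it can help to confirm whether qemu itself has picked up the new size (domain name as in the example above; the output format varies with the qemu version):

    # show qemu's view of the attached block devices
    virsh qemu-monitor-command vm_test --hmp "info block"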
[ceph-users] Use SSDs for metadata or for a pool cache?
Hello,

I am building a new cluster with 4 hosts, each with the following configuration:

128 GB RAM
12 SATA HDDs, 8 TB, 7.2k rpm
2 SSDs, 240 GB
2x 10 Gb network

I will use the cluster to store RBD images of VMs, and I plan to use 2x replication if it does not get too slow. My question is: using bluestore (the default since Luminous, right?), should I use the SSDs as a "cache pool" or use them to store the bluestore metadata? Or could I use one SSD for metadata and the other for a "cache pool"?

Thank you in advance for your opinions.
Gesiel
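If the SSDs go to bluestore metadata, the OSDs would be created along these lines (device paths are placeholders; each SSD would typically be partitioned so it can hold the DB volumes of several HDD OSDs):

    # one HDD for data, one SSD partition for the RocksDB/WAL metadata
    ceph-volume lvm create --bluestore --data /dev/sdb --block.db /dev/sdm1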
Re: [ceph-users] [Ceph-community] Pool broke after pg_num increase
Hi,

The pool is back up and running. I took these actions:

- Increased the max PGs per OSD (ceph tell mon.* injectargs '--mon_max_pg_per_osd=400'). But the pool was still frozen. (I already had OSDs with 251 PGs, so I'm not sure this was my problem.)
- Restarted all daemons, including the OSDs. On one specific host, when I restarted an OSD daemon it took very long, and after that I saw the pool start to rebuild.

I don't have a firm conclusion about what happened, but at least it's working. I will read the logs, now more calmly, to understand exactly what happened.

Thank you all for your help.
Gesiel

On Fri, 9 Nov 2018 at 03:37, Ashley Merrick wrote:
> Are you sure the down OSD didn't happen to have any data required for
> the rebalance to complete? How long had the now-removed OSD been out?
> Since before or after you increased the PG count?
>
> If you do "ceph health detail" and then pick a stuck PG, what does
> "ceph pg PG query" output?
>
> Has your ceph -s output changed at all since the last paste?
>
> On Fri, Nov 9, 2018 at 12:08 AM Gesiel Galvão Bernardes
> <gesiel.bernar...@gmail.com> wrote:
>
>> On Thu, 8 Nov 2018 at 10:00, Joao Eduardo Luis wrote:
>>
>>> Hello Gesiel,
>>>
>>> Welcome to Ceph!
>>>
>>> In the future, you may want to address the ceph-users list
>>> (`ceph-users@lists.ceph.com`) for this sort of issue.
>>
>> Thank you, I will do that.
>>
>>> On 11/08/2018 11:18 AM, Gesiel Galvão Bernardes wrote:
>>> > Hi everyone,
>>> >
>>> > I am a beginner with Ceph. I increased pg_num on a pool, and after
>>> > the cluster rebalanced I increased pgp_num (a confession: I had
>>> > not read the complete documentation about this operation :-( ).
>>> > After this my cluster broke and everything stopped. The cluster
>>> > does not rebalance, and my impression is that everything is
>>> > stopped.
>>> >
>>> > Below is my "ceph -s". Can anyone help me?
>>>
>>> You have two OSDs down. Depending on how your data is mapped, your
>>> pgs may be waiting for those to come back up before they finish
>>> being cleaned up.
>>
>> After removing the down OSDs, it tried to rebalance, but is "frozen"
>> again, in this status:
>>
>>   cluster:
>>     id:     ab5dcb0c-480d-419c-bcb8-013cbcce5c4d
>>     health: HEALTH_WARN
>>             12840/988707 objects misplaced (1.299%)
>>             Reduced data availability: 358 pgs inactive, 325 pgs peering
>>
>>   services:
>>     mon: 3 daemons, quorum cmonitor,thanos,cmonitor2
>>     mgr: thanos(active), standbys: cmonitor
>>     osd: 17 osds: 17 up, 17 in; 221 remapped pgs
>>
>>   data:
>>     pools:   1 pools, 1024 pgs
>>     objects: 329.6 k objects, 1.3 TiB
>>     usage:   3.8 TiB used, 7.4 TiB / 11 TiB avail
>>     pgs:     1.660% pgs unknown
>>              33.301% pgs not active
>>              12840/988707 objects misplaced (1.299%)
>>              666 active+clean
>>              188 remapped+peering
>>              137 peering
>>              17  unknown
>>              16  activating+remapped
>>
>> Any other idea?
>>
>> Gesiel
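For anyone hitting a similar stall, the diagnostic sequence Ashley describes looks roughly like this (the PG id below is a placeholder):

    # list which PGs are stuck and why
    ceph health detail
    # query one stuck PG for its peering state and what it is blocked on
    ceph pg 1.2f query
    # raise the per-OSD PG limit if the mons refuse to activate new PGs
    ceph tell mon.* injectargs '--mon_max_pg_per_osd=400'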
[ceph-users] [Ceph-community] Pool broke after pg_num increase
On Thu, 8 Nov 2018 at 10:00, Joao Eduardo Luis wrote:
> Hello Gesiel,
>
> Welcome to Ceph!
>
> In the future, you may want to address the ceph-users list
> (`ceph-users@lists.ceph.com`) for this sort of issue.

Thank you, I will do that.

> On 11/08/2018 11:18 AM, Gesiel Galvão Bernardes wrote:
> > Hi everyone,
> >
> > I am a beginner with Ceph. I increased pg_num on a pool, and after
> > the cluster rebalanced I increased pgp_num (a confession: I had not
> > read the complete documentation about this operation :-( ). After
> > this my cluster broke and everything stopped. The cluster does not
> > rebalance, and my impression is that everything is stopped.
> >
> > Below is my "ceph -s". Can anyone help me?
>
> You have two OSDs down. Depending on how your data is mapped, your pgs
> may be waiting for those to come back up before they finish being
> cleaned up.

After removing the down OSDs, it tried to rebalance, but is "frozen" again, in this status:

  cluster:
    id:     ab5dcb0c-480d-419c-bcb8-013cbcce5c4d
    health: HEALTH_WARN
            12840/988707 objects misplaced (1.299%)
            Reduced data availability: 358 pgs inactive, 325 pgs peering

  services:
    mon: 3 daemons, quorum cmonitor,thanos,cmonitor2
    mgr: thanos(active), standbys: cmonitor
    osd: 17 osds: 17 up, 17 in; 221 remapped pgs

  data:
    pools:   1 pools, 1024 pgs
    objects: 329.6 k objects, 1.3 TiB
    usage:   3.8 TiB used, 7.4 TiB / 11 TiB avail
    pgs:     1.660% pgs unknown
             33.301% pgs not active
             12840/988707 objects misplaced (1.299%)
             666 active+clean
             188 remapped+peering
             137 peering
             17  unknown
             16  activating+remapped

Any other idea?

Gesiel
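For context, this is the kind of operation that triggered the problem. pg_num and pgp_num need to be raised together, and ideally in small steps, so data placement can follow the new PG count (pool name is a placeholder):

    # raise the PG count, then align the placement count with it
    ceph osd pool set mypool pg_num 1024
    ceph osd pool set mypool pgp_num 1024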