[ceph-users] FS Reclaims storage too slow

2018-06-25 Thread Zhang Qiang
Hi, Is it normal that I deleted files from the CephFS and Ceph still hadn't deleted the backing objects a day later? Only after I restarted the MDS daemon did it start to release the storage space. I noticed the doc (http://docs.ceph.com/docs/mimic/dev/delayed-delete/) says the file is marked as deleted on the
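For reference, deletion is deferred to the MDS stray/purge machinery, so stray counts can be watched via the MDS admin socket (a sketch only; <id> is the MDS daemon name on your cluster and counter names can vary by release):

    # on the active MDS host
    ceph daemon mds.<id> perf dump mds_cache | grep -i stray
    # if num_strays keeps shrinking, deletion is progressing in the background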

Re: [ceph-users] Uneven data distribution with even pg distribution after rebalancing

2018-06-25 Thread David Turner
If you look at ceph pg dump, you'll see the size Ceph believes each PG is. From your ceph df, your PGs for the rbd_pool will be almost zero. So if you have an OSD with 6 of those PGs and another with none of them, but both OSDs have the same number of PGs overall... The OSD with none of them will
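As a sketch of what to compare (the pool name is from this thread, the OSD id is a placeholder), per-PG byte counts can be listed directly:

    ceph pg ls-by-pool rbd_pool    # BYTES column should be near zero for these PGs
    ceph pg ls-by-osd osd.12       # see which pools the PGs on a given OSD belong to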

Re: [ceph-users] Uneven data distribution with even pg distribution after rebalancing

2018-06-25 Thread shadow_lin
Hi David, I am sure most (if not all) data are in one pool; rbd_pool is only for omap for EC rbd. ceph df:
GLOBAL:
    SIZE    AVAIL     RAW USED    %RAW USED
    427T    100555G   329T        77.03
POOLS:
    NAME    ID    USED    %USED    MAX AVAIL    OBJECTS

Re: [ceph-users] Uneven data distribution with even pg distribution after rebalancing

2018-06-25 Thread David Turner
You have 2 different pools. PGs in each pool are going to be a different size. It's like saying 12x + 13y should equal 2x + 23y because they each have 25 X's and Y's. Having equal PG counts on each osd is only balanced if you have a single pool or have a case where all PGs are identical in size.

Re: [ceph-users] Uneven data distribution with even pg distribution after rebalancing

2018-06-25 Thread shadow_lin
Hi David, I'm afraid I can't run the command you provided now, because I tried removing another OSD on that host to see if it would make the data distribution even, and it did. The PG counts of my pools are powers of 2. Below is from my notes before I removed another OSD: pool

[ceph-users] Increase queue_depth in KVM

2018-06-25 Thread Damian Dabrowski
Hello, When I mount an rbd image with -o queue_depth=1024 I see a big improvement, mostly on writes (random writes go from 3k IOPS at the default queue_depth to 24k IOPS at queue_depth=1024). But is there any way to attach an rbd disk to a KVM instance with a custom queue_depth? I can't find any
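One possible workaround, sketched only (pool, image and domain names are placeholders, not a tested recipe): map the image with the kernel client, where queue_depth is honoured, and hand the resulting block device to libvirt:

    rbd map rbd/vm-disk-1 -o queue_depth=1024                          # krbd map option
    virsh attach-disk myvm /dev/rbd0 vdb --targetbus virtio --persistent

Note this uses krbd instead of librbd, so librbd-side features such as the client cache no longer apply.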

Re: [ceph-users] Ceph 12.2.5 - FAILED assert(0 == "put on missing extent (nothing before)")

2018-06-25 Thread Dyweni - Ceph-Users
Hi, Is there any information you'd like to grab off this OSD? Anything I can provide to help you troubleshoot this? I ask because, if not, I'm going to reformat/rebuild this OSD (unless there is a faster way to repair this issue). Thanks, Dyweni On 2018-06-25 07:30, Dyweni -

[ceph-users] Proxmox with EMC VNXe 3200

2018-06-25 Thread Eneko Lacunza
Hi all, We're planning the migration of a VMware 5.5 cluster backed by an EMC VNXe 3200 storage appliance to Proxmox. The VNXe has about 3 years of warranty left and half the disks unprovisioned, so the current plan is to use the same VNXe for Proxmox storage. After the warranty expires we'll

Re: [ceph-users] Recovery after datacenter outage

2018-06-25 Thread Brett Niver
+Paul On Mon, Jun 25, 2018 at 5:14 AM, Christian Zunker wrote: > Hi Jason, > > your guesses were correct. Thank you for your support. > > Just in case, someone else stumbles upon this thread, some more links: > http://lists.ceph.com/pipermail/ceph-users-ceph.com/2017-September/020722.html >

[ceph-users] Ceph 12.2.5 - FAILED assert(0 == "put on missing extent (nothing before)")

2018-06-25 Thread Dyweni - Ceph-Users
Good morning, After removing roughly 20-some rbd snapshots, one of my OSDs has begun flapping. ERROR 1:
2018-06-25 06:46:39.132257 a0ce2700 -1 osd.8 pg_epoch: 44738 pg[4.e8( v 44721'485588 (44697'484015,44721'485588] local-lis/les=44593/44595 n=2972 ec=9422/9422 lis/c

[ceph-users] Intel SSD DC P3520 PCIe for OSD 1480 TBW good idea?

2018-06-25 Thread Jelle de Jong
Hello everybody, I am thinking about building a production three-node Ceph cluster with 3x 1.2TB Intel SSD DC P3520 PCIe storage devices, 10.8TB in total (7.2TB, 66%, for production). I am not planning on a journal on a separate SSD; I assume there is no advantage to this when using PCIe storage? Network

Re: [ceph-users] radosgw failover help

2018-06-25 Thread Burkhard Linke
Hi, On 06/20/2018 07:20 PM, David Turner wrote: We originally used pacemaker to move a VIP between our RGWs, but ultimately decided to go with an LB in front of them. With an LB you can utilize both RGWs while they're up, but the LB will shy away from either if they're down until the check
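As a sketch of the kind of health check an LB would run against each RGW (hostname is a placeholder; 7480 is the default civetweb port):

    curl -s -o /dev/null -w '%{http_code}\n' http://rgw1.example.com:7480/
    # expect 200 while the RGW is up; the LB drops the backend while the check fails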

Re: [ceph-users] PG status is "active+undersized+degraded"

2018-06-25 Thread Burkhard Linke
Hi, On 06/22/2018 08:06 AM, dave.c...@dell.com wrote: I saw this statement at this link ( http://docs.ceph.com/docs/master/rados/operations/crush-map/ ); is that the reason which leads to the warning? "This, combined with the default CRUSH failure domain, ensures that replicas or
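To check whether the CRUSH rule's failure domain can actually be satisfied, something like the following helps (a sketch; rule and pool names are placeholders):

    ceph osd tree                                # how many hosts/racks exist under the root
    ceph osd crush rule dump replicated_rule     # look at the 'chooseleaf' step and its type
    ceph osd pool get mypool size                # replica count vs. available failure domains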

[ceph-users] Balancer: change from crush-compat to upmap

2018-06-25 Thread Caspar Smit
Hi All, I've been using the balancer module in crush-compat mode for quite a while now and want to switch to upmap mode since all my clients are now luminous (v12.2.5). I've reweighted the compat weight-set back to as close to the original CRUSH weights as possible using 'ceph osd crush reweight-compat'
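For reference, the switch itself looks roughly like this (a sketch only; verify client features first, since upmap requires luminous-or-newer clients):

    ceph features                                    # confirm all clients report luminous
    ceph osd set-require-min-compat-client luminous
    ceph balancer off
    ceph osd crush weight-set rm-compat              # drop the compat weight-set
    ceph balancer mode upmap
    ceph balancer on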

[ceph-users] Move Ceph-Cluster to another Datacenter

2018-06-25 Thread Mehmet
Hey Ceph people, I need advice on how to move a Ceph cluster from one datacenter to another without any downtime :)
DC 1:
- 3 dedicated MON servers (also running MGR on these servers)
- 4 dedicated OSD servers (3x 12 OSDs, 1x 23 OSDs)
- 3 Proxmox nodes with a connection to our Ceph storage (not managed from

Re: [ceph-users] fixing unrepairable inconsistent PG

2018-06-25 Thread Andrei Mikhailovsky
Hi Brad, here is the output:
--
root@arh-ibstorage1-ib:/home/andrei# ceph --debug_ms 5 --debug_auth 20 pg 18.2 query
2018-06-25 10:59:12.100302 7fe23eaa1700 2 Event(0x7fe2400e0140 nevent=5000 time_id=1).set_owner idx=0 owner=140609690670848
2018-06-25 10:59:12.100398 7fe23e2a0700
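For anyone following along, the usual commands for digging into an inconsistent PG look roughly like this (PG id 18.2 taken from the thread):

    rados list-inconsistent-obj 18.2 --format=json-pretty   # which object/shard is bad
    ceph pg deep-scrub 18.2
    ceph pg repair 18.2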

Re: [ceph-users] Help! Luminous 12.2.5 CephFS - MDS crashed and now won't start (failing at MDCache::add_inode)

2018-06-25 Thread Linh Vu
So my colleague Sean Crosby and I were looking through the logs (with debug mds = 10) and found some references just before the crash to an inode number. We converted it from hex to decimal and got something like 109953*5*627776 (last few digits not necessarily correct). We set one digit up, i.e. to
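The conversion step itself is just (a sketch; the inode value below is a made-up placeholder, not the one from the log):

    printf '%d\n' 0x10000abc123    # bash: prints the inode number in decimal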

Re: [ceph-users] Recovery after datacenter outage

2018-06-25 Thread Christian Zunker
Hi Jason, your guesses were correct. Thank you for your support. Just in case someone else stumbles upon this thread, some more links: http://lists.ceph.com/pipermail/ceph-users-ceph.com/2017-September/020722.html

[ceph-users] Help! Luminous 12.2.5 CephFS - MDS crashed and now won't start (failing at MDCache::add_inode)

2018-06-25 Thread Linh Vu
Hi all, We have a Luminous 12.2.5 cluster running just CephFS, with 1 active and 1 standby MDS. The active MDS crashed and now won't start again, hitting this same error:
###
0> 2018-06-25 16:11:21.136203 7f01c2749700 -1

Re: [ceph-users] unfound blocks IO or gives IO error?

2018-06-25 Thread Dan van der Ster
On Fri, Jun 22, 2018 at 10:44 PM Gregory Farnum wrote: > > On Fri, Jun 22, 2018 at 6:22 AM Sergey Malinin wrote: >> >> From >> http://docs.ceph.com/docs/mimic/rados/troubleshooting/troubleshooting-pg/ : >> >> "Now 1 knows that these objects exist, but there is no live ceph-osd who has >> a
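The commands from that troubleshooting page for inspecting and, as a last resort, resolving unfound objects are roughly (the PG id 2.4 is the placeholder used in the docs):

    ceph health detail | grep unfound
    ceph pg 2.4 list_missing
    # last resort, loses data for those objects:
    ceph pg 2.4 mark_unfound_lost revert   # or 'delete'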