Re: [ceph-users] osds with different disk sizes may killing performance

2018-04-11 Thread 宗友 姚
Currently, this can only be done by hand. Maybe we need some scripts to handle this automatically. I don't know if https://github.com/ceph/ceph/tree/master/src/pybind/mgr/balancer can handle this. From: Konstantin Shalygin Sent:
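For reference, a minimal sketch of enabling the mgr balancer module on a Luminous or newer cluster; note that it balances PG placement, so whether it also evens out per-drive I/O load in a mixed-size cluster would still need to be verified:

    ceph mgr module enable balancer   # load the module if it is not already running
    ceph balancer mode crush-compat   # or "upmap" when every client is Luminous or newer
    ceph balancer on                  # start moving PGs toward an even distribution
    ceph balancer status              # check mode and progress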

Re: [ceph-users] osds with different disk sizes may killing performance

2018-04-11 Thread Konstantin Shalygin
On 04/12/2018 10:58 AM, 宗友 姚 wrote: Yes, according to the CRUSH algorithm, large drives are given a high weight; this is expected. By default, CRUSH gives no consideration to each drive's performance, which may leave the performance distribution unbalanced, and the OSD with the highest I/O utilization may

Re: [ceph-users] osds with different disk sizes may killing performance

2018-04-11 Thread 宗友 姚
Yes, according to the CRUSH algorithm, large drives are given a high weight; this is expected. By default, CRUSH gives no consideration to each drive's performance, which may leave the performance distribution unbalanced, and the OSD with the highest I/O utilization may slow down the whole cluster.

Re: [ceph-users] osds with different disk sizes may killing performance

2018-04-11 Thread Konstantin Shalygin
After digging into our internal system stats, we find the newly added disks' I/O utilization is about two times that of the old ones. This is obvious and expected. Your 8TB drives are weighted double against the 4TB ones and do *double* CRUSH work in comparison. k
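A sketch of how one could inspect and partially compensate for this by hand; the OSD id and weight below are hypothetical, lowering the CRUSH weight trades usable capacity for lower load, and primary affinity only shifts read traffic:

    ceph osd df tree                      # CRUSH weight, size and %USE per OSD
    ceph osd crush reweight osd.12 5.0    # give the big drive less data than its size suggests
    ceph osd primary-affinity osd.12 0.5  # serve fewer primary reads from the busy drive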

Re: [ceph-users] rbd-nbd not resizing even after kernel tweaks

2018-04-11 Thread Alex Gorbachev
On Wed, Apr 11, 2018 at 2:13 PM, Jason Dillaman wrote: > I've tested the patch on both 4.14.0 and 4.16.0 and it appears to > function correctly for me. parted can see the newly added free-space > after resizing the RBD image and our stress tests once again pass >

[ceph-users] osds with different disk sizes may killing performance

2018-04-11 Thread 宗友 姚
Hi, for anybody who may be interested, here I share the process of locating the reason for a Ceph cluster performance slowdown in our environment. Internally, we have a cluster with a capacity of 1.1PB, 800TB used, and about 500TB of raw user data. Each day, 3TB of data is uploaded and the oldest 3TB of data

Re: [ceph-users] Cluster degraded after Ceph Upgrade 12.2.1 => 12.2.2

2018-04-11 Thread Patrick Donnelly
Hello Ronny, On Wed, Apr 11, 2018 at 10:25 AM, Ronny Aasen wrote: > mds: restart the MDSs one at a time. You will notice the standby MDS taking > over for the MDS that was restarted. Do both. No longer recommended. See:
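Presumably the truncated link points at the documented multi-MDS upgrade procedure, which reduces the filesystem to a single active MDS before daemons are restarted; a minimal sketch under that assumption, with "cephfs" as a hypothetical filesystem name:

    ceph status                   # confirm which MDS ranks are active
    ceph fs set cephfs max_mds 1  # keep only one active MDS during the upgrade
    # upgrade/restart the remaining active MDS, then the standbys,
    # and raise max_mds again afterwards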

Re: [ceph-users] cephfs snapshot format upgrade

2018-04-11 Thread Gregory Farnum
On Tue, Apr 10, 2018 at 8:50 PM, Yan, Zheng wrote: > On Wed, Apr 11, 2018 at 3:34 AM, Gregory Farnum wrote: >> On Tue, Apr 10, 2018 at 5:54 AM, John Spray wrote: >>> On Tue, Apr 10, 2018 at 1:44 PM, Yan, Zheng wrote:

Re: [ceph-users] Ceph luminous - troubleshooting performance issues overall DSK 100%, busy 1%

2018-04-11 Thread Steven Vacaroaia
Hello again, I have reinstalled the cluster and noticed that, with 2 servers, it is working as expected; adding the 3rd one tanks performance IRRESPECTIVE of which server is the 3rd one. I have tested it with only 1 OSD per server in order to eliminate any balancing issues. This seems to indicate an
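One way to narrow this down is to benchmark the OSDs individually and compare their latencies, which takes CRUSH and balancing out of the picture; the OSD ids below are hypothetical:

    ceph osd perf          # commit/apply latency per OSD as seen by the cluster
    ceph tell osd.0 bench  # local write benchmark on an OSD of an original server
    ceph tell osd.2 bench  # same on an OSD of the newly added third server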

Re: [ceph-users] rbd-nbd not resizing even after kernel tweaks

2018-04-11 Thread Jason Dillaman
I've tested the patch on both 4.14.0 and 4.16.0 and it appears to function correctly for me. parted can see the newly added free-space after resizing the RBD image and our stress tests once again pass successfully. Do you have any additional details on the issues you are seeing? On Wed, Apr 11,
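For anyone wanting to reproduce the test, a sketch of the resize-and-recheck sequence, assuming a hypothetical image rbd/testimg mapped via rbd-nbd to /dev/nbd0:

    rbd resize --size 20G rbd/testimg   # grow the image
    blockdev --getsize64 /dev/nbd0      # should report the new size once the kernel sees it
    partprobe /dev/nbd0                 # ask the kernel to re-read the partition table
    parted /dev/nbd0 print free         # the added space should appear as free space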

Re: [ceph-users] Fwd: Separate --block.wal --block.db bluestore not working as expected.

2018-04-11 Thread Gary Verhulp
On 4/7/18 4:21 PM, Alfredo Deza wrote: On Sat, Apr 7, 2018 at 11:59 AM, Gary Verhulp wrote: I’m trying to create bluestore OSDs with separate --block.wal --block.db devices on a write-intensive SSD. I’ve split the SSD (/dev/sda) into two partitions, sda1 and sda2
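A sketch of the ceph-volume call for the layout Gary describes; the SSD partitions are taken from the message, while the data device /dev/sdb is hypothetical and sizing is up to the deployment:

    ceph-volume lvm create --bluestore --data /dev/sdb \
        --block.wal /dev/sda1 --block.db /dev/sda2
    ceph-volume lvm list    # verify which devices the wal/db actually landed on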

Re: [ceph-users] Cluster degraded after Ceph Upgrade 12.2.1 => 12.2.2

2018-04-11 Thread Ronny Aasen
Ceph upgrades are usually not a problem: Ceph has to be upgraded in the right order. Normally, when each service is on its own machine, this is not difficult, but when you have mon, mgr, osd, mds, and clients on the same host you have to do it a bit carefully. I tend to have a terminal open
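A sketch of the usual order and sanity checks when mon, mgr, osd and mds share a host; the noout flag keeps data from rebalancing while OSDs are down:

    ceph osd set noout     # avoid rebalancing during the restarts
    # restart mons first, then mgrs, then OSDs, then MDS, then remount clients
    ceph versions          # confirm every daemon reports the same release afterwards
    ceph osd unset noout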

Re: [ceph-users] rbd-nbd not resizing even after kernel tweaks

2018-04-11 Thread Jason Dillaman
I'll give it a try locally and see if I can figure it out. Note that this commit [1] also dropped the call to "bd_set_size" within "nbd_size_update", which seems suspicious to me at initial glance. [1]

Re: [ceph-users] Purged a pool, buckets remain

2018-04-11 Thread Robert Stanford
Ok. How do I fix what's been broken? How do I "rebuild my index"? Thanks On Wed, Apr 11, 2018 at 1:49 AM, Robin H. Johnson wrote: > On Tue, Apr 10, 2018 at 10:06:57PM -0500, Robert Stanford wrote: > > I used this command to purge my rgw data: > > > > rados purge
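For reference, the usual index repair path is radosgw-admin's bucket check; after purging the data pool directly with rados it is doubtful much can be recovered, so treat this purely as a sketch (the bucket name is hypothetical):

    radosgw-admin bucket check --bucket=mybucket                        # report inconsistencies
    radosgw-admin bucket check --bucket=mybucket --fix --check-objects  # attempt to rebuild the index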

Re: [ceph-users] Cluster degraded after Ceph Upgrade 12.2.1 => 12.2.2

2018-04-11 Thread Ranjan Ghosh
Ah, never mind, we've solved it. It was a firewall issue. The only thing that's weird is that it became an issue immediately after an update. Perhaps it has something to do with monitor nodes shifting around. Well, thanks again for your quick support; it's much appreciated. BR

Re: [ceph-users] rbd-nbd not resizing even after kernel tweaks

2018-04-11 Thread Alex Gorbachev
> On Wed, Apr 11, 2018 at 10:27 AM, Alex Gorbachev > wrote: >> On Wed, Apr 11, 2018 at 2:43 AM, Mykola Golub >> wrote: >>> On Tue, Apr 10, 2018 at 11:14:58PM -0400, Alex Gorbachev wrote: >>> So Josef fixed the one issue that enables e.g.

Re: [ceph-users] Cluster degraded after Ceph Upgrade 12.2.1 => 12.2.2

2018-04-11 Thread Ranjan Ghosh
Thank you for your answer. Do you have any specifics on which thread you're talking about? I would be very interested to read about a success story, because I fear that if I update the other node the whole cluster will come down. On 11.04.2018 at 10:47, Marc Roos wrote: I think you have to

[ceph-users] "ceph-fuse" / "mount -t fuse.ceph" do not report a failed mount on exit (Pacemaker OCF "Filesystem" resource)

2018-04-11 Thread Nicolas Huillard
Hi all, I use Pacemaker and the "Filesystem" resource agent to mount/unmount my CephFS. Depending on timing, the MDS may only become reachable a few dozen seconds after the mount command, but the failure is not reported through the exit code. Examples using mount.fuse.ceph or ceph-fuse (no MDS running
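A sketch of the kind of extra check a wrapper script (or a patched resource agent) could run after the mount command returns, since the exit code alone cannot be trusted here; /mnt/cephfs is a hypothetical mount point:

    ceph-fuse /mnt/cephfs
    mountpoint -q /mnt/cephfs && stat -f /mnt/cephfs >/dev/null 2>&1 \
        || { echo "cephfs mount not usable"; exit 1; }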

Re: [ceph-users] rbd-nbd not resizing even after kernel tweaks

2018-04-11 Thread Jason Dillaman
Do you have a preliminary patch that we can test against? On Wed, Apr 11, 2018 at 10:27 AM, Alex Gorbachev wrote: > On Wed, Apr 11, 2018 at 2:43 AM, Mykola Golub wrote: >> On Tue, Apr 10, 2018 at 11:14:58PM -0400, Alex Gorbachev wrote: >> >>>

Re: [ceph-users] rbd-nbd not resizing even after kernel tweaks

2018-04-11 Thread Alex Gorbachev
On Wed, Apr 11, 2018 at 2:43 AM, Mykola Golub wrote: > On Tue, Apr 10, 2018 at 11:14:58PM -0400, Alex Gorbachev wrote: > >> So Josef fixed the one issue that enables e.g. lsblk and sysfs size to >> reflect the correct size on change. However, partprobe and parted >> still

Re: [ceph-users] Cluster degraded after Ceph Upgrade 12.2.1 => 12.2.2

2018-04-11 Thread Marc Roos
I think you have to update all OSDs, mons, etc. I can remember running into a similar issue. You should be able to find more about this in the mailing list archive. -Original Message- From: Ranjan Ghosh [mailto:gh...@pw6.de] Sent: Wednesday, April 11, 2018 16:02 To: ceph-users Subject:

[ceph-users] Cluster degraded after Ceph Upgrade 12.2.1 => 12.2.2

2018-04-11 Thread Ranjan Ghosh
Hi all, we have a two-node cluster (with a third "monitoring-only" node). Over the last months, everything ran *perfectly* smoothly. Today, I did an Ubuntu "apt-get upgrade" on one of the two servers. Among others, the ceph packages were upgraded from 12.2.1 to 12.2.2. A minor release update,

Re: [ceph-users] Ceph luminous - troubleshooting performance issues overall DSK 100%, busy 1%

2018-04-11 Thread Steven Vacaroaia
[root@osd01 ~]# ceph osd pool ls detail -f json-pretty
[
    {
        "pool_name": "rbd",
        "flags": 1,
        "flags_names": "hashpspool",
        "type": 1,
        "size": 2,
        "min_size": 1,
        "crush_rule": 0,
        "object_hash": 2,
        "pg_num": 128,

Re: [ceph-users] Ceph luminous - troubleshooting performance issues overall DSK 100%, busy 1%

2018-04-11 Thread Konstantin Shalygin
On 04/11/2018 07:48 PM, Steven Vacaroaia wrote: Thanks for the suggestion but, unfortunately, having the same number of OSDs did not solve the issue. Here it is with 2 OSDs per server, 3 servers (identical servers and OSD configuration). ceph osd pool ls detail ceph osd crush rule dump k

Re: [ceph-users] Ceph luminous - troubleshooting performance issues overall DSK 100%, busy 1%

2018-04-11 Thread Steven Vacaroaia
Thanks for the suggestion but, unfortunately, having the same number of OSDs did not solve the issue. Here it is with 2 OSDs per server, 3 servers (identical servers and OSD configuration):
[root@osd01 ~]# ceph osd tree
ID CLASS WEIGHT  TYPE NAME    STATUS REWEIGHT PRI-AFF
-1       4.02173 root default

Re: [ceph-users] ceph-fuse CPU and Memory usage vs CephFS kclient

2018-04-11 Thread Yan, Zheng
On Wed, Apr 11, 2018 at 2:30 PM, Wido den Hollander wrote: > > > On 04/10/2018 09:45 PM, Gregory Farnum wrote: >> On Tue, Apr 10, 2018 at 12:36 PM, Wido den Hollander wrote: >>> >>> >>> On 04/10/2018 09:22 PM, Gregory Farnum wrote: On Tue, Apr 10, 2018 at 6:32

Re: [ceph-users] radosgw: can't delete bucket

2018-04-11 Thread Micha Krause
Hi, I finally managed to delete the bucket. I wrote a script that reads the omap keys from the bucket index and deletes every key without a matching object in the data pool. Not sure if this has any negative repercussions, but after the script deleted thousands of keys from the index, I was
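A rough sketch of the idea Micha describes, with hypothetical pool names and a <marker> placeholder for the bucket's marker id; the real mapping from an index omap key to the data-pool object name is more involved than shown, so this is illustrative only:

    rados -p default.rgw.buckets.index listomapkeys ".dir.<marker>" | while read -r key; do
        if ! rados -p default.rgw.buckets.data stat "<marker>_${key}" >/dev/null 2>&1; then
            rados -p default.rgw.buckets.index rmomapkey ".dir.<marker>" "$key"
        fi
    done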

Re: [ceph-users] Purged a pool, buckets remain

2018-04-11 Thread Robin H. Johnson
On Tue, Apr 10, 2018 at 10:06:57PM -0500, Robert Stanford wrote: > I used this command to purge my rgw data: > > rados purge default.rgw.buckets.data --yes-i-really-really-mean-it > > Now, when I list the buckets with s3cmd, I still see the buckets (s3cmd ls > shows a listing of them.) When

Re: [ceph-users] rbd-nbd not resizing even after kernel tweaks

2018-04-11 Thread Mykola Golub
On Tue, Apr 10, 2018 at 11:14:58PM -0400, Alex Gorbachev wrote: > So Josef fixed the one issue that enables e.g. lsblk and sysfs size to > reflect the correct size on change. However, partprobe and parted > still do not detect the change; complete unmap and remap of the rbd-nbd > device and remount

Re: [ceph-users] ceph-fuse CPU and Memory usage vs CephFS kclient

2018-04-11 Thread Wido den Hollander
On 04/10/2018 09:45 PM, Gregory Farnum wrote: > On Tue, Apr 10, 2018 at 12:36 PM, Wido den Hollander wrote: >> >> >> On 04/10/2018 09:22 PM, Gregory Farnum wrote: >>> On Tue, Apr 10, 2018 at 6:32 AM Wido den Hollander >> > wrote: >>> >>>

Re: [ceph-users] Purged a pool, buckets remain

2018-04-11 Thread Konstantin Shalygin
Now, when I list the buckets with s3cmd, I still see the buckets (s3cmd ls shows a listing of them). When I try to delete one (s3cmd rb) I get this: ERROR: S3 error: 404 (NoSuchKey) Because you dropped all your data, but all your buckets are still indexed. You shouldn't work with S3 like
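For completeness, the supported way to remove a bucket together with its objects goes through radosgw-admin rather than rados; a sketch with a hypothetical bucket name:

    radosgw-admin bucket rm --bucket=mybucket --purge-objects
    radosgw-admin bucket list    # the bucket should now be gone from the index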