[ceph-users] Is this a deadlock?

2017-01-03 Thread 许雪寒
Hi, everyone. Recently, in one of our online Ceph clusters, one OSD committed suicide after experiencing some network connectivity problems, and the OSD log is as follows: -173> 2017-01-03 23:42:19.145490 7f5021bbc700 0 -- 10.205.49.55:6802/1778451 >> 10.205.49.174:6803/1499671

Re: [ceph-users] Ceph per-user stats?

2017-01-03 Thread Shinobu Kinjo
On Wed, Jan 4, 2017 at 4:33 PM, Henrik Korkuc wrote: > On 17-01-04 03:16, Gregory Farnum wrote: >> >> On Fri, Dec 23, 2016 at 12:04 AM, Henrik Korkuc wrote: >>> >>> Hello, >>> >>> I wondered if Ceph can emit stats (via perf counters, statsd or in some >>> other

Re: [ceph-users] Ceph per-user stats?

2017-01-03 Thread Henrik Korkuc
On 17-01-04 03:16, Gregory Farnum wrote: On Fri, Dec 23, 2016 at 12:04 AM, Henrik Korkuc wrote: Hello, I wondered if Ceph can emit stats (via perf counters, statsd or in some other way) IO and bandwidth stats per Ceph user? I was unable to find such stats. I know that we can

Re: [ceph-users] Ceph pg active+clean+inconsistent

2017-01-03 Thread Shinobu Kinjo
Would you run: # ceph pg debug unfound_objects_exist On Wed, Jan 4, 2017 at 5:31 AM, Andras Pataki wrote: > Here is the output of ceph pg query for one of the active+clean+inconsistent > PGs: > > { > "state": "active+clean+inconsistent", > "snap_trimq":

Re: [ceph-users] ceph program uses lots of memory

2017-01-03 Thread Bryan Henderson
I did some investigation and tracked the high usage down to librados. I don't think Python has anything to do with it. I also noticed that the memory usage was really unpredictable. Sometimes I could do a whole 'ceph -s' with only 256M; most of the time I couldn't, but the program crashed in

Re: [ceph-users] ceph program uses lots of memory

2017-01-03 Thread Gregory Farnum
On Thu, Dec 29, 2016 at 2:31 PM, Bryan Henderson wrote: > Does anyone know why the 'ceph' program uses so much memory? If I run it with > an address space rlimit of less than 300M, it usually dies with messages about > not being able to allocate memory. > > I'm curious

Re: [ceph-users] What is replay_version used for?

2017-01-03 Thread Gregory Farnum
On Mon, Dec 26, 2016 at 2:08 AM, xxhdx1985126 wrote: > > According to the following comment, It seems that only when using btrfs which > would make FileStore to do "parallel journaling" will the replay_version be > meaningful. Is that right? > > // look at

Re: [ceph-users] osd' balancing question

2017-01-03 Thread Christian Balzer
Hello, On Tue, 3 Jan 2017 15:47:09 +0200 Yair Magnezi wrote: > Hello > > 1) Does the re-weight / load balancing take place only within the > same node? Not in general, but it could certainly happen if the change is small and involves only a single PG, for example. > 2) I'm raising

Re: [ceph-users] When Zero isn't 0 (Crush weight mysteries)

2017-01-03 Thread Christian Balzer
Hello, On Tue, 3 Jan 2017 16:52:16 -0800 Gregory Farnum wrote: > On Wed, Dec 21, 2016 at 2:33 AM, Wido den Hollander wrote: > > > >> On 21 December 2016 at 2:39, Christian Balzer wrote: > >> > >> > >> > >> Hello, > >> > >> I just (manually) added 1 OSD each to

Re: [ceph-users] Ceph per-user stats?

2017-01-03 Thread Gregory Farnum
On Fri, Dec 23, 2016 at 12:04 AM, Henrik Korkuc wrote: > Hello, > > I wondered if Ceph can emit stats (via perf counters, statsd or in some > other way) IO and bandwidth stats per Ceph user? I was unable to find such > stats. I know that we can get at least some of these stats

Re: [ceph-users] documentation

2017-01-03 Thread Shinobu Kinjo
The description of ``--pool=data`` is fine but just confuses users. http://docs.ceph.com/docs/jewel/start/quick-ceph-deploy/ should be synced with https://github.com/ceph/ceph/blob/master/doc/start/quick-ceph-deploy.rst I would recommend that you refer to ``quick-ceph-deploy.rst`` because docs in git

Re: [ceph-users] When Zero isn't 0 (Crush weight mysteries)

2017-01-03 Thread Gregory Farnum
On Wed, Dec 21, 2016 at 2:33 AM, Wido den Hollander wrote: > >> On 21 December 2016 at 2:39, Christian Balzer wrote: >> >> >> >> Hello, >> >> I just (manually) added 1 OSD each to my 2 cache-tier nodes. >> The plan was/is to actually do the data-migration at the

Re: [ceph-users] Unbalanced OSD's

2017-01-03 Thread Brian Andrus
On Mon, Jan 2, 2017 at 4:25 AM, Jens Dueholm Christensen wrote: > On Friday, December 30, 2016 07:05 PM Brian Andrus wrote: > > > We have a set it and forget it cronjob setup once an hour to keep things > a bit more balanced. > > > > 1 * * * * /bin/bash

Re: [ceph-users] Ceph pg active+clean+inconsistent

2017-01-03 Thread Andras Pataki
The attributes for one of the inconsistent objects for the following scrub error: 2016-12-20 11:58:25.825830 7f3e1a4b1700 -1 log_channel(cluster) log [ERR] : deep-scrub 6.92c 6:34932257:::1000187bbb5.0009:head on disk size (0) does not match object info size (3014656) adjusted for ondisk

Re: [ceph-users] Ceph pg active+clean+inconsistent

2017-01-03 Thread Andras Pataki
Here is the output of ceph pg query for one of the active+clean+inconsistent PGs: { "state": "active+clean+inconsistent", "snap_trimq": "[]", "epoch": 342982, "up": [ 319, 90, 51 ], "acting": [ 319, 90, 51 ],

[ceph-users] Estimate Max IOPS of Cluster

2017-01-03 Thread John Petrini
Hello, Does anyone have a reasonably accurate way to determine the max IOPS of a Ceph cluster? Thank You, ___ John Petrini ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

Re: [ceph-users] Why is file extents size observed by "rbd diff" much larger than observed by "du" the object file on the OSD's machine?

2017-01-03 Thread Jason Dillaman
The backing objects are most likely sparse and the diff extents just represent the maximum offset written within each backing object. Calculating the size of an image composed over potentially hundreds of thousands of backing objects is not a cheap operation, so it's best to consider this (and
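
The gap between a "maximum offset written" figure and what "du" reports can be reproduced with an ordinary sparse file. The following is only an illustrative sketch in Python, not anything taken from rbd or the OSD code: the helper name sparse_demo, the 4 MiB object size and the 4 KiB chunk are assumptions chosen for the example, and it relies on a POSIX filesystem that supports sparse files and exposes st_blocks.

```python
import os
import tempfile


def sparse_demo(object_size=4 * 1024 * 1024, chunk=4096):
    """Write one small chunk at the very end of an otherwise empty file.

    Returns (apparent_size, allocated_bytes). The 4 MiB size and 4 KiB
    chunk are illustrative assumptions, not values read from a cluster.
    """
    with tempfile.NamedTemporaryFile(delete=False) as f:
        path = f.name
        # Skip over the unwritten region; no data blocks are needed for it
        # on filesystems that support sparse files.
        f.seek(object_size - chunk)
        f.write(b"x" * chunk)
    try:
        st = os.stat(path)
        apparent = st.st_size          # highest offset written
        allocated = st.st_blocks * 512  # st_blocks counts 512-byte units
    finally:
        os.remove(path)
    return apparent, allocated


if __name__ == "__main__":
    apparent, allocated = sparse_demo()
    print("apparent size  :", apparent)
    print("allocated bytes:", allocated)
```

On a filesystem with sparse-file support, the apparent size is the full object size while the allocated figure is typically far smaller, which is the same kind of discrepancy described above between extent lengths and "du" output.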

Re: [ceph-users] performance with/without dmcrypt OSD

2017-01-03 Thread Adrien Gillard
There have been discussions on the subject on the mailing list before [1], which concur with Nick's experience as long as you use AES-XTS. [1] http://lists.ceph.com/pipermail/ceph-users-ceph.com/2016-March/008444.html On Tue, Jan 3, 2017 at 2:30 PM, Nick Fisk wrote: > > > > > *From:*

Re: [ceph-users] osd' balancing question

2017-01-03 Thread Yair Magnezi
Hello 1) Does the re-weight / load balancing take place only within the same node? 2) I'm raising the target OSD weight but nothing is happening. I expect to see some data movement but there is none; only when decreasing the weight can I see backfilling taking place. Is this

Re: [ceph-users] performance with/without dmcrypt OSD

2017-01-03 Thread Nick Fisk
From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Kent Borg Sent: 03 January 2017 12:47 To: M Ranga Swami Reddy Cc: ceph-users Subject: Re: [ceph-users] performance with/without dmcrypt OSD On 01/03/2017 06:42 AM,

Re: [ceph-users] osd' balancing question

2017-01-03 Thread Yair Magnezi
Hello Christian. Sorry for my mistake; it's Infernalis we're running (9.2.1). Our tree looks like this --> root@ecprdbcph01-opens:~# ceph osd tree ID WEIGHT TYPE NAME UP/DOWN REWEIGHT PRIMARY-AFFINITY -1 51.26370 root default -2 8.49396 host ecprdbcph03-opens 3

Re: [ceph-users] performance with/without dmcrypt OSD

2017-01-03 Thread Kent Borg
On 01/03/2017 06:42 AM, M Ranga Swami Reddy wrote: On Tue, Jan 3, 2017 at 6:17 AM, Kent Borg wrote: Assuming I am understanding the question... If there isn't too big a performance hit, it makes disk disposal (we expect disks to die,

Re: [ceph-users] Ceph all-possible configuration options

2017-01-03 Thread Wido den Hollander
> On 3 January 2017 at 13:05, Rajib Hossen wrote: > > > Hello, > I am exploring Ceph and installed a mini cluster with 1 mon and 3 OSD nodes (3 > OSD daemons per node). For that I wrote a ceph.conf file with only the needed > configuration options (see below) > >

[ceph-users] Ceph all-possible configuration options

2017-01-03 Thread Rajib Hossen
Hello, I am exploring Ceph and installed a mini cluster with 1 mon and 3 OSD nodes (3 OSD daemons per node). For that I wrote a ceph.conf file with only the needed configuration options (see below) fsid = mon initial members = host mon host = ip public network = cluster network = auth cluster required =

Re: [ceph-users] rsync mirror download.ceph.com - broken file on rsync server

2017-01-03 Thread Björn Lässig
On Thu, 2015-10-29 at 08:48 -0600, Ken Dreyer wrote: > On Wed, Oct 28, 2015 at 7:54 PM, Matt Taylor > wrote: > > I still see rsync errors due to permissions on the remote side: > > > > Thanks for the heads' up; I bet another upload rsync process got > interrupted there. > >

[ceph-users] ceph performance question

2017-01-03 Thread M Ranga Swami Reddy
Hello, I have a Ceph cluster in which 25% of the OSDs (50 of the 200 OSDs in the cluster) are more than 80% full. Does this (25% of OSDs filled above 80%) cause Ceph cluster slowness (slow write operations)? Any hints would help. Thanks, Swami ___

Re: [ceph-users] Why is there no data backup mechanism in the rados layer?

2017-01-03 Thread Christian Balzer
Hello, On Tue, 3 Jan 2017 11:16:27 + 许雪寒 wrote: > Hi, everyone. > > I’m researching the online backup mechanisms of Ceph, like rbd mirroring and > multi-site, and I’m a little confused. Why is there no data backup mechanism > in the rados layer? Wouldn’t this save the trouble of implementing

Re: [ceph-users] 10Gbit switch advice for small ceph cluster upgrade

2017-01-03 Thread Björn Lässig
On Fri, 2016-12-16 at 09:52 +0100, Robert Sander wrote: > On 15.12.2016 16:49, Bjoern Laessig wrote: > > > What does your Cluster do? Where is your data. What happens now? > > You could configure the interfaces between the nodes as pointopoint > links and run OSPF on them. The cluster nodes then

Re: [ceph-users] docs.ceph.com down?

2017-01-03 Thread Peter Maloney
Here I made a mirror of the docs. http://thinkofaname.tk/ceph-doc/ notes -I have disabled use of http://ayni.ceph.com/public/js/ceph.js, so this will run fast when ceph.com is down -build date 2017-01-03 -code is from https://github.com/ceph commit eb60e5fd3efa30837b8a5d771e6e5c63c1d264ca If

[ceph-users] Why is there no data backup mechanism in the rados layer?

2017-01-03 Thread 许雪寒
Hi, everyone. I’m researching the online backup mechanisms of Ceph, like rbd mirroring and multi-site, and I’m a little confused. Why is there no data backup mechanism in the rados layer? Wouldn’t this save the trouble of implementing a backup system for every higher-level feature of Ceph, like rbd

[ceph-users] osd' balancing question

2017-01-03 Thread Yair Magnezi
Hello cephers, We're running firefly ( 9.2.1 ). I'm trying to rebalance our cluster's OSDs, and for some reason it looks like the rebalance is going the wrong way: What I'm trying to do is reduce the load on osd-14 ( ceph osd crush reweight osd.14 0.75 ), but what I see is that the

Re: [ceph-users] Migrate cephfs metadata to SSD in running cluster

2017-01-03 Thread Wido den Hollander
> On 3 January 2017 at 2:49, Mike Miller wrote: > > > will metadata on SSD improve latency significantly? > No, as I said in my previous e-mail, recent benchmarks showed that storing CephFS metadata on SSD does not improve performance. It still might be good to do

[ceph-users] RBD Cache & Multi Attached Volumes

2017-01-03 Thread Lazuardi Nasution
Hi, For use with OpenStack Cinder multi-attached volumes, is it possible to disable RBD Cache for specific multi-attached volumes only? Single-attached volumes still need RBD Cache enabled for better performance. If I disable RBD Cache in /etc/ceph/ceph.conf, is

Re: [ceph-users] problem accessing docs.ceph.com

2017-01-03 Thread Shinobu Kinjo
Yeah, DreamHost seems to have an internal issue, which is not good for us. Sorry for that. On Tue, Jan 3, 2017 at 5:41 PM, Rajib Hossen wrote: > Hello, I can't browse docs.ceph.com for the last 2-3 days. Google says it takes > too long to load. I also

[ceph-users] problem accessing docs.ceph.com

2017-01-03 Thread Rajib Hossen
Hello, I haven't been able to browse docs.ceph.com for the last 2-3 days. Google says it takes too long to load. I also couldn't ping the website. I also checked http://www.downforeveryoneorjustme.com/docs.ceph.com and it says the site is down from other locations as well. Is there a problem with the server? Thanks.