[ceph-users] Luminous can't seem to provision more than 32 OSDs per server

2017-10-18 Thread Sean Sullivan
I am trying to install Ceph luminous (ceph version 12.2.1) on 4 ubuntu 16.04 servers each with 74 disks, 60 of which are HGST 7200rpm sas drives:: HGST HUS724040AL sdbv sas root@kg15-2:~# lsblk --output MODEL,KNAME,TRAN | grep HGST | wc -l 60 I am trying to deploy them all with :: a line like th
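A minimal sketch of the kind of per-drive deployment loop being described, assuming Luminous-era ceph-deploy (1.5.x) with the HOST:DISK syntax; the host name and device selection are taken from the snippet and are otherwise hypothetical:

    # count the HGST data drives on the node
    lsblk --output MODEL,KNAME,TRAN | grep HGST | wc -l
    # create one OSD per HGST drive (hedged: ceph-deploy 1.5.x colon syntax)
    for dev in $(lsblk -dn -o KNAME,MODEL | awk '/HGST/ {print $1}'); do
        ceph-deploy osd create kg15-2:/dev/$dev
    done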

Re: [ceph-users] Thick provisioning

2017-10-18 Thread Adrian Saul
I concur - at the moment we need to manually sum the RBD images to see how much we have "provisioned" vs. what ceph df shows. In our case we had a rapid run of provisioning new LUNs, but it took a while before usage started to catch up with what was provisioned as data was migrated in. Cep
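For reference, a hedged sketch of that manual comparison (the pool name "rbd" is an assumption):

    # per-image provisioned vs. actually used space in one pool
    rbd du -p rbd
    # cluster-wide raw capacity and per-pool usage to compare against
    ceph df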

Re: [ceph-users] OSD crashed while reparing inconsistent PG luminous

2017-10-18 Thread Gregory Farnum
I updated the ticket with some findings. It appears that osd.93 has that snapshot object in its missing set that gets sent to osd.78, and then osd.69 claims to have the object. Can you upload debug logs of those OSDs that go along with this log? (Or just generate a new set of them together.) -Greg
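A sketch of one way to capture such logs, using the OSD ids from the thread; raise the levels only for the reproduction and revert afterwards:

    # raise logging on the OSDs in question (revert afterwards)
    ceph tell osd.93 injectargs '--debug_osd 20 --debug_ms 1'
    ceph tell osd.78 injectargs '--debug_osd 20 --debug_ms 1'
    ceph tell osd.69 injectargs '--debug_osd 20 --debug_ms 1'
    # after reproducing the issue, upload the logs for the developers
    ceph-post-file /var/log/ceph/ceph-osd.93.log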

Re: [ceph-users] [Jewel] Crash Osd with void Hit_set_trim

2017-10-18 Thread Brad Hubbard
On Wed, Oct 18, 2017 at 11:16 PM, pascal.pu...@pci-conseil.net wrote: > hello, > > For 2 weeks, I have sometimes been losing OSDs. > Here is the trace: > > 0> 2017-10-18 05:16:40.873511 7f7c1e497700 -1 osd/ReplicatedPG.cc: In > function '*void ReplicatedPG::hit_set_trim(*ReplicatedPG::OpContextUPtr&, > unsig

Re: [ceph-users] Osd FAILED assert(p.same_interval_since)

2017-10-18 Thread Gregory Farnum
I've created a ticket http://tracker.ceph.com/issues/21833 Hopefully we can work this out. On Mon, Oct 16, 2017 at 6:03 PM Dejan Lesjak wrote: > > > On 17. okt. 2017, at 00:59, Gregory Farnum wrote: > > > > On Mon, Oct 16, 2017 at 3:49 PM Dejan Lesjak > wrote: > > > > > On 17. okt. 2017, at 0

Re: [ceph-users] How to increase the size of requests written to a ceph image

2017-10-18 Thread Maged Mokhtar
Just run the same 32-threaded rados test as you did before, and this time run atop while the test is running, looking at %busy for the CPUs/disks. It should give an idea of whether there is a bottleneck in them. On 2017-10-18 21:35, Russell Glaue wrote: > I cannot run the write test reviewed at the > ceph-ho
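A hedged sketch of that test, assuming a pool named "rbd" and a 60-second run (rados bench defaults to 4 MiB objects):

    # on a client node: 32-threaded writes for 60 seconds
    rados bench -p rbd 60 write -t 32
    # in parallel on each OSD node: watch %busy for CPUs and disks
    atop 2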

Re: [ceph-users] ceph inconsistent pg missing ec object

2017-10-18 Thread Denes Dolhay
Hi All, The linked document is for filestore, which in your case is correct as I understand it, but I wonder if a similar document exists for bluestore? Thanks, Denes. On 10/18/2017 02:56 PM, Stijn De Weirdt wrote: hi all, we have a ceph 10.2.7 cluster with an 8+3 EC pool. in that pool, the

Re: [ceph-users] ceph inconsistent pg missing ec object

2017-10-18 Thread Gregory Farnum
It would help if you can provide the exact output of "ceph -s", "pg query", and any other relevant data. You shouldn't need to do manual repair of erasure-coded pools, since it has checksums and can tell which bits are bad. Following that article may not have done you any good (though I wouldn't ex
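A hedged sketch of the data being asked for; the PG id 1.2f3 below is only a placeholder for the inconsistent PG in question:

    ceph -s
    ceph health detail
    # replace 1.2f3 with the PG reported inconsistent by health detail
    ceph pg 1.2f3 query
    # Jewel and later can also report which shards/checksums are bad
    rados list-inconsistent-obj 1.2f3 --format=json-pretty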

Re: [ceph-users] Thick provisioning

2017-10-18 Thread sinan
Hi all, Thanks for the replies. The main reason I was looking for a thin/thick provisioning setting is that I want to be sure the provisioned space cannot exceed the cluster capacity. With thin provisioning there is a risk that more space is provisioned than the cluster capacity. When
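A hedged sketch of the manual check this implies, summing provisioned image sizes and comparing them with ceph df; the pool name is an assumption, and it relies on the "size" field of rbd info's JSON output:

    #!/bin/sh
    # sum the provisioned size of every image in a pool (hypothetical pool "rbd")
    POOL=rbd
    TOTAL=0
    for IMG in $(rbd ls -p "$POOL"); do
        SIZE=$(rbd info "$POOL/$IMG" --format json | \
               python -c 'import json,sys; print(json.load(sys.stdin)["size"])')
        TOTAL=$((TOTAL + SIZE))
    done
    echo "provisioned bytes in $POOL: $TOTAL"
    # compare against the capacity reported by:
    ceph df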

Re: [ceph-users] How to increase the size of requests written to a ceph image

2017-10-18 Thread Russell Glaue
I cannot run the write test reviewed at the ceph-how-to-test-if-your-ssd-is-suitable-as-a-journal-device blog post. The tests write directly to the raw disk device. Reading an infile (created with urandom) on one SSD and writing the outfile to another OSD yields about 17MB/s. But isn't this write speed li

Re: [ceph-users] How to increase the size of requests written to a ceph image

2017-10-18 Thread Maged Mokhtar
Measuring resource load as outlined earlier will show whether the drives are performing well or not. Also, how many OSDs do you have? On 2017-10-18 19:26, Russell Glaue wrote: > The SSD drives are Crucial M500 > A Ceph user did some benchmarks and found it had good performance > https://forum.prox

Re: [ceph-users] cephfs ceph-fuse performance

2017-10-18 Thread Patrick Donnelly
Hello Ashley, On Wed, Oct 18, 2017 at 12:45 AM, Ashley Merrick wrote: > 1/ Is there any options or optimizations that anyone has used or can suggest > to increase ceph-fuse performance? You may try playing with the sizes of reads/writes. Another alternative is to use libcephfs directly to avoid
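As a starting point for experimenting with read/write sizes, a hedged sketch of Ceph client-side options; the values are illustrative, not a recommendation from the thread:

    # /etc/ceph/ceph.conf on the client
    [client]
        # 4 MiB readahead for ceph-fuse reads (illustrative value)
        client_readahead_max_bytes = 4194304
        # client object cache size; 200 MiB is the default
        client_oc_size = 209715200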

Re: [ceph-users] How to increase the size of requests written to a ceph image

2017-10-18 Thread Russell Glaue
36 OSDs. Each of the 4 storage servers has 9 1TB SSD drives, each drive as one OSD (no RAID) == 36 OSDs. Each drive is one LVM group with two volumes - one volume for the OSD, one volume for the journal. Each OSD is formatted with xfs. On Wed, Oct 18, 2017 at 1:33 PM, Maged Mokhtar wrote: > measuring r

Re: [ceph-users] auth error with ceph-deploy on jewel to luminous upgrade

2017-10-18 Thread Gary Molenkamp
Sorry to reply to my own question, but I noticed that the cephx key for client.bootstrap-mgr was inconsistent with the key in /var/lib/ceph/bootstrap-mgr/ceph.keyring. I deleted the entry in ceph: ceph auth del client.bootstrap-mgr reran the ceph-deploy gather keys: ceph-deploy gathe
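For the archive, the sequence described above sketched end-to-end; the monitor and manager host names are placeholders:

    # remove the stale bootstrap-mgr key, then regenerate/fetch it and redeploy
    ceph auth del client.bootstrap-mgr
    ceph-deploy gatherkeys MON_HOST
    ceph-deploy mgr create MGR_HOST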

Re: [ceph-users] How to increase the size of requests written to a ceph image

2017-10-18 Thread Russell Glaue
The SSD drives are Crucial M500. A Ceph user did some benchmarks and found it had good performance: https://forum.proxmox.com/threads/ceph-bad-performance-in-qemu-guests.21551/ However, a user comment from 3 years ago on the blog post you linked to says to avoid the Crucial M500. Yet, this performan

Re: [ceph-users] How to increase the size of requests written to a ceph image

2017-10-18 Thread Maged Mokhtar
Check out the following link: some SSDs perform badly in Ceph due to sync writes to the journal https://www.sebastien-han.fr/blog/2014/10/10/ceph-how-to-test-if-your-ssd-is-suitable-as-a-journal-device/ Another thing that can help is to re-run the rados 32-thread test as stress and view resource usage usi
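The test from that blog post is a single-job, synchronous, direct 4k write; a hedged reproduction follows (the device name is hypothetical, and note that it writes to the raw device and is destructive):

    # DESTRUCTIVE: writes directly to the raw device
    fio --filename=/dev/sdX --direct=1 --sync=1 --rw=write --bs=4k \
        --numjobs=1 --iodepth=1 --runtime=60 --time_based \
        --group_reporting --name=journal-test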

Re: [ceph-users] OSD crashed while reparing inconsistent PG luminous

2017-10-18 Thread Mart van Santen
Dear all, We are still struggling with this issue. By now, one OSD crashes all the time (a different one than yesterday), but now on a different assert. Namely this one: #0  0x75464428 in __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:54 #1  0x7546602a in _
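A hedged sketch of turning such a crash into a fully symbolized backtrace for the tracker; package names vary by distro and the core path is a placeholder:

    # install debug symbols (Ubuntu-style package name), then inspect the core dump
    apt-get install ceph-osd-dbg
    gdb /usr/bin/ceph-osd /path/to/core
    (gdb) thread apply all bt full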

Re: [ceph-users] How to increase the size of requests written to a ceph image

2017-10-18 Thread Russell Glaue
In my previous post, in one of my points I was wondering if the request size would increase if I enabled jumbo frames; currently they are disabled. @jdillama: The qemu settings for both of these two guest machines, with RAID/LVM and Ceph/rbd images, are the same. I am not thinking that changing the qem
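If jumbo frames are tested, a hedged sketch of enabling and verifying them; the interface name and peer address are hypothetical, and every host and switch port on the path must allow MTU 9000:

    # enable jumbo frames on the cluster-network interface
    ip link set dev eth0 mtu 9000
    # verify: 8972 = 9000 - 20 (IP header) - 8 (ICMP header), fragmentation forbidden
    ping -M do -s 8972 -c 3 10.0.0.2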

[ceph-users] auth error with ceph-deploy on jewel to luminous upgrade

2017-10-18 Thread Gary molenkamp
I'm running into a permission error when attempting to use ceph-deploy to create an mgr on a recently upgraded jewel->luminous ceph cluster. I've attempted to track down the permission, but so far no success. I'm doing this on a dev environment so I can replicate: Start with a sample jewel

Re: [ceph-users] How to increase the size of requests written to a ceph image

2017-10-18 Thread Maged Mokhtar
First a general comment: local RAID will be faster than Ceph for a single-threaded (queue depth=1) IO test. A single-threaded Ceph client will at best see the same speed as a single disk for reads, and for writes it will be 4-6 times slower than a single disk. Not to mention that the latency of local disks will be much better.
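One way to see the queue-depth effect directly is fio's rbd engine; a hedged sketch, with pool and image names hypothetical:

    # queue depth 1: dominated by per-operation latency
    fio --ioengine=rbd --clientname=admin --pool=rbd --rbdname=testimg \
        --rw=write --bs=4k --iodepth=1 --numjobs=1 --runtime=60 --time_based --name=qd1
    # queue depth 32: aggregate throughput across many OSDs
    fio --ioengine=rbd --clientname=admin --pool=rbd --rbdname=testimg \
        --rw=write --bs=4k --iodepth=32 --numjobs=1 --runtime=60 --time_based --name=qd32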

[ceph-users] [Jewel] Crash Osd with void Hit_set_trim

2017-10-18 Thread pascal.pu...@pci-conseil.net
hello, For 2 weeks, I have sometimes been losing OSDs. Here is the trace:     0> 2017-10-18 05:16:40.873511 7f7c1e497700 -1 osd/ReplicatedPG.cc: In function '*void ReplicatedPG::hit_set_trim(*ReplicatedPG::OpContextUPtr&, unsigned int)' thread 7f7c1e497700 time 2017-10-18 05:16:40.869962 osd/ReplicatedPG.c
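hit_set_trim operates on cache-tier hit sets, so a first hedged check is the hit-set configuration of the cache pool; the pool name below is hypothetical:

    # inspect the hit-set settings of the cache-tier pool
    ceph osd pool get cachepool hit_set_type
    ceph osd pool get cachepool hit_set_count
    ceph osd pool get cachepool hit_set_period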

Re: [ceph-users] High mem with Luminous/Bluestore

2017-10-18 Thread Sage Weil
On Wed, 18 Oct 2017, Wido den Hollander wrote: > > Op 18 oktober 2017 om 13:48 schreef Hans van den Bogert > > : > > > > > > Indeed it shows ssd in the OSD's metadata. > > > > "bluestore_bdev_type": "ssd", > > > > > > Then I misunderstood the role of the device class in CRUSH, I expected

[ceph-users] ceph inconsistent pg missing ec object

2017-10-18 Thread Stijn De Weirdt
hi all, we have a ceph 10.2.7 cluster with a 8+3 EC pool. in that pool, there is a pg in inconsistent state. we followed http://ceph.com/geen-categorie/ceph-manually-repair-object/, however, we are unable to solve our issue. from the primary osd logs, the reported pg had a missing object. we fo

Re: [ceph-users] High mem with Luminous/Bluestore

2017-10-18 Thread Wido den Hollander
> Op 18 oktober 2017 om 13:48 schreef Hans van den Bogert > : > > > Indeed it shows ssd in the OSD's metadata. > > "bluestore_bdev_type": "ssd", > > > Then I misunderstood the role of the device class in CRUSH, I expected the > OSD would actually set its settings according to the CRUSH d

[ceph-users] Slow requests

2017-10-18 Thread Ольга Ухина
Hi! I have a problem with ceph luminous 12.2.1. It was upgraded from kraken, but I'm not sure whether the problem already existed in kraken. I have slow requests on different OSDs at random times (for example at night), but I don't see any problems with the disks or CPU at the time of the issue; there is a possibility of net
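A hedged sketch of where to start digging when slow requests appear; the OSD id is a placeholder:

    # which OSDs are currently reporting slow requests
    ceph health detail
    # on the OSD node: the slowest recent operations and where they spent their time
    ceph daemon osd.12 dump_historic_ops
    ceph daemon osd.12 ops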

Re: [ceph-users] High mem with Luminous/Bluestore

2017-10-18 Thread Mark Nelson
Memory usage is still quite high here even with a large onode cache! Are you using erasure coding? I recently was able to reproduce a bug in bluestore causing excessive memory usage during large writes with EC, but have not tracked down exactly what's going on yet. Mark On 10/18/2017 06:48 A

Re: [ceph-users] High mem with Luminous/Bluestore

2017-10-18 Thread Hans van den Bogert
Indeed it shows ssd in the OSD's metadata. "bluestore_bdev_type": "ssd", Then I misunderstood the role of the device class in CRUSH; I expected the OSD would actually pick its settings according to the CRUSH device class. I'll try to force the OSDs to behave like HDDs and monitor the memory
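A hedged sketch of the checks and the Luminous-era cache knob involved; the OSD id and the 1 GiB value are illustrative, and cache-size changes generally need an OSD restart:

    # how the OSD classified its block device
    ceph osd metadata 0 | grep bluestore_bdev
    # what cache size it is currently using for an ssd-classified device
    ceph daemon osd.0 config get bluestore_cache_size_ssd
    # e.g. in ceph.conf [osd], then restart the OSDs:
    #   bluestore_cache_size_ssd = 1073741824   (1 GiB instead of the 3 GiB default)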

Re: [ceph-users] [MONITOR SEGFAULT] Luminous cluster stuck when adding monitor

2017-10-18 Thread Nico Schottelius
Hey Joao, thanks for the pointer! Do you have a timeline for the release of v12.2.2? Best, Nico -- Modern, affordable, Swiss Virtual Machines. Visit www.datacenterlight.ch ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/li

Re: [ceph-users] Help with full osd and RGW not responsive

2017-10-18 Thread Webert de Souza Lima
Hi Bryan. I hope that solved it for you. Another thing you can do in situations like this is to set the full_ratio higher so you can work on the problem. Always set it back to a safe value after the issue is solved. *ceph pg set_full_ratio 0.98* Regards, Webert Lima DevOps Engineer at MAV Te
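For reference, the command differs by release; a hedged sketch, with the caveat above still standing (return it to a safe value afterwards):

    # pre-Luminous clusters:
    ceph pg set_full_ratio 0.98
    # Luminous (12.x) and later:
    ceph osd set-full-ratio 0.98
    # when the cleanup is done, go back to something safe, e.g. the 0.95 default
    ceph osd set-full-ratio 0.95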

Re: [ceph-users] OSD crashed while reparing inconsistent PG luminous

2017-10-18 Thread Ana Aviles
Hello, We created a BUG #21827 . Also updated the log file of the OSD with debug 20. Reference is 6e4dba6f-2c15-4920-b591-fe380bbca200 Thanks, Ana On 18/10/17 00:46, Mart van Santen wrote: > > > Hi Greg, > > (I'm a colleague of Ana), Thank you for your reply > > > On 10/17/2017 11:57 PM, Gregory

Re: [ceph-users] Thick provisioning

2017-10-18 Thread Samuel Soulard
I can only speak for some environments, but sometimes you want to make sure that a cluster cannot fill up before you can add more capacity. Some organizations are unable to purchase new capacity rapidly, and by making sure you cannot provision more than your current capacity, you can't run into problems

Re: [ceph-users] High mem with Luminous/Bluestore

2017-10-18 Thread Wido den Hollander
> Op 18 oktober 2017 om 11:41 schreef Hans van den Bogert > : > > > Hi All, > > I've converted 2 nodes with 4 HDD/OSDs each from Filestore to Bluestore. I > expected somewhat higher memory usage/RSS values, however I see, imo, a > huge memory usage for all OSDs on both nodes. > > Small snippe

Re: [ceph-users] [MONITOR SEGFAULT] Luminous cluster stuck when adding monitor

2017-10-18 Thread Joao Eduardo Luis
On 10/18/2017 10:38 AM, Nico Schottelius wrote: Hello Joao, thanks for coming back! I copied the log of the crashing monitor to http://www.nico.schottelius.org/cephmonlog-2017-10-08-v2.xz The monitor is crashing as part of bug #21300 http://tracker.ceph.com/issues/21300 And the fix is c

Re: [ceph-users] OSD are marked as down after jewel -> luminous upgrade

2017-10-18 Thread Daniel Carrasco
Finally I've disabled the mon_osd_report_timeout option and it seems to work fine. Greetings! 2017-10-17 19:02 GMT+02:00 Daniel Carrasco : > Thanks!! > > I'll take a look later. > > Anyway, all my Ceph daemons are on the same version on all nodes (I've > upgraded the whole cluster). > > Cheers!! > >

Re: [ceph-users] [MONITOR SEGFAULT] Luminous cluster stuck when adding monitor

2017-10-18 Thread Joao Eduardo Luis
On 10/18/2017 10:38 AM, Nico Schottelius wrote: Hello Joao, thanks for coming back! I copied the log of the crashing monitor to http://www.nico.schottelius.org/cephmonlog-2017-10-08-v2.xz Can I somehow get access to the logs of the other monitors, without restarting them? If you mean incre
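Raising monitor log levels without a restart can be sketched as follows; the monitor id is a placeholder, and the levels should be reverted afterwards:

    # via the cluster, for a reachable monitor
    ceph tell mon.b injectargs '--debug_mon 10 --debug_ms 1'
    # or locally via the admin socket on the monitor host
    ceph daemon mon.b config set debug_mon 10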

[ceph-users] High mem with Luminous/Bluestore

2017-10-18 Thread Hans van den Bogert
Hi All, I've converted 2 nodes with 4 HDD/OSDs each from Filestore to Bluestore. I expected somewhat higher memory usage/RSS values, however I see, imo, huge memory usage for all OSDs on both nodes. Small snippet from `top`: PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ C

Re: [ceph-users] [MONITOR SEGFAULT] Luminous cluster stuck when adding monitor

2017-10-18 Thread Nico Schottelius
Hello Joao, thanks for coming back! I copied the log of the crashing monitor to http://www.nico.schottelius.org/cephmonlog-2017-10-08-v2.xz Can I somehow get access to the logs of the other monitors, without restarting them? I would like to not stop them, as currently we are running with 2/3

Re: [ceph-users] [MONITOR SEGFAULT] Luminous cluster stuck when adding monitor

2017-10-18 Thread Joao Eduardo Luis
Hi Nico, I'm sorry I forgot about your issue. Crazy few weeks. I checked the log you initially sent to the list, but it only contains the log from one of the monitors, and it's from the one synchronizing. This monitor is not stuck however - synchronizing is progressing, albeit slowly. Can y

[ceph-users] cephfs ceph-fuse performance

2017-10-18 Thread Ashley Merrick
Hello, I have been trying cephfs on the latest 12.x release. Performance under cephfs mounted via the kernel client seems to be as expected, maxing out the underlying storage / resources using kernel version 4.13.4. However, when it comes to mounting cephfs via ceph-fuse, I'm looking at performance of 5-10% for

Re: [ceph-users] Bluestore OSD_DATA, WAL & DB

2017-10-18 Thread Marco Baldini - H.S. Amiata
Hi, I'm about to change some SATA SSD disks to NVMe disks, and for Ceph I too would like to know how to assign space. I have 3 1TB SATA OSDs, so I'll split the NVMe disks into 3 partitions of equal size. I'm not going to assign a separate WAL partition because, if the docs are right, the WAL is a
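A hedged sketch of how a BlueStore OSD with its DB on an NVMe partition is typically created with Luminous-era ceph-volume; device paths and partition layout are hypothetical:

    # one SATA data device, DB on a pre-made NVMe partition
    ceph-volume lvm create --bluestore --data /dev/sdb --block.db /dev/nvme0n1p1
    # the WAL lives inside block.db unless --block.wal points at an even faster device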