Re: [ceph-users] How to improve single thread sequential reads?

2015-08-18 Thread Wido den Hollander
On 18-08-15 12:25, Benedikt Fraunhofer wrote: Hi Nick, did you do anything fancy to get to ~90MB/s in the first place? I'm stuck at ~30MB/s reading cold data. Single-threaded writes are quite speedy, around 600MB/s. radosgw for cold data is around 90MB/s, which is imho limited by

[ceph-users] RE: RE: tcmalloc use a lot of CPU

2015-08-18 Thread Межов Игорь Александрович
Hi! How many nodes? How many SSDs/OSDs? 2 nodes, each: - 1xE5-2670, 128G, - 2x146G SAS 10krpm - system + MON root - 10x600G SAS 10krpm + 7x900G SAS 10krpm single drive RAID0 on lsi2208 - 2x400G SSD Intel DC S3700 on C602 - for separate SSD pool - 2x200G SSD Intel DC S3700 on SATA3 - for ceph

Re: [ceph-users] How to improve single thread sequential reads?

2015-08-18 Thread Jan Schermer
I'm not sure if I missed that, but are you testing in a VM backed by an RBD device, or using the device directly? I don't see how blk-mq would help if it's not a VM; it just passes the request to the underlying block device, and in the case of RBD there is no real block device from the host

Re: [ceph-users] How to improve single thread sequential reads?

2015-08-18 Thread Nick Fisk
-Original Message- From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Jan Schermer Sent: 18 August 2015 11:50 To: Benedikt Fraunhofer given.to.lists.ceph-users.ceph.com.toasta@traced.net Cc: ceph-users@lists.ceph.com; Nick Fisk n...@fisk.me.uk Subject:

Re: [ceph-users] Repair inconsistent pgs..

2015-08-18 Thread Abhishek L
Voloshanenko Igor writes: Hi Irek, Please read carefully ))) Your proposal was the first thing I tried to do... That's why I asked for help... ( 2015-08-18 8:34 GMT+03:00 Irek Fasikhov malm...@gmail.com: Hi, Igor. You need to repair the PG. for i in `ceph pg dump| grep inconsistent | grep -v

Re: [ceph-users] Memory-Usage

2015-08-18 Thread Gregory Farnum
On Mon, Aug 17, 2015 at 8:21 PM, Patrik Plank pat...@plank.me wrote: Hi, I have a Ceph cluster with three nodes and 32 OSDs. The three nodes have 16GB memory but only 5GB is in use. Nodes are Dell PowerEdge R510. my ceph.conf: [global] mon_initial_members = ceph01 mon_host =

Re: [ceph-users] any recommendation of using EnhanceIO?

2015-08-18 Thread Jan Schermer
I already evaluated EnhanceIO in combination with CentOS 6 (and backported 3.10 and 4.0 kernel-lt if I remember correctly). It worked fine during benchmarks and stress tests, but once we ran DB2 on it, it panicked within minutes and took all the data with it (almost literally - files that weren't

Re: [ceph-users] any recommendation of using EnhanceIO?

2015-08-18 Thread Nick Fisk
-Original Message- From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Jan Schermer Sent: 18 August 2015 10:01 To: Alex Gorbachev a...@iss-integration.com Cc: Dominik Zalewski dzalew...@optlink.net; ceph-users@lists.ceph.com Subject: Re: [ceph-users] any

Re: [ceph-users] How to improve single thread sequential reads?

2015-08-18 Thread Benedikt Fraunhofer
Hi Nick, did you do anything fancy to get to ~90MB/s in the first place? I'm stuck at ~30MB/s reading cold data. Single-threaded writes are quite speedy, around 600MB/s. radosgw for cold data is around 90MB/s, which is imho limited by the speed of a single disk. Data already present on the

Re: [ceph-users] Ceph File System ACL Support

2015-08-18 Thread Gregory Farnum
On Mon, Aug 17, 2015 at 4:12 AM, Yan, Zheng uker...@gmail.com wrote: On Mon, Aug 17, 2015 at 9:38 AM, Eric Eastman eric.east...@keepertech.com wrote: Hi, I need to verify in Ceph v9.0.2 if the kernel version of Ceph file system supports ACLs and the libcephfs file system interface does not.

Re: [ceph-users] Repair inconsistent pgs..

2015-08-18 Thread Voloshanenko Igor
No. This will not help ((( I tried to find the data, but it looks like it either exists with the same timestamp on all OSDs or is missing on all OSDs ... So I need advice on what to do... On Tuesday, 18 August 2015, Abhishek L wrote: Voloshanenko Igor writes: Hi Irek, Please read carefully ))) Your

Re: [ceph-users] tcmalloc use a lot of CPU

2015-08-18 Thread Alexandre DERUMIER
Hi Mark, Yep! At least from what I've seen so far, jemalloc is still a little faster for 4k random writes even compared to tcmalloc with the patch + 128MB thread cache. Should have some data soon (mostly just a reproduction of Sandisk and Intel's work). I definitely switch to jemalloc from

[ceph-users] RE: Question

2015-08-18 Thread Межов Игорь Александрович
Hi! You can run MONs on the same hosts, though it is not recommended. The MON daemons themselves are not resource hungry - 1-2 cores and 2-4 GB RAM are enough in most small installs. But there are some pitfalls: - MONs use LevelDB as a backing store, and widely use direct writes to ensure DB consistency. So,

[ceph-users] Fwd: Repair inconsistent pgs..

2015-08-18 Thread Voloshanenko Igor
-- Forwarded message - From: *Voloshanenko Igor* igor.voloshane...@gmail.com Date: Tuesday, 18 August 2015 Subject: Repair inconsistent pgs.. To: Irek Fasikhov malm...@gmail.com Some additional information (Tnx Irek for questions!) Pool values: root@test:~# ceph osd pool

Re: [ceph-users] How repair 2 invalids pgs

2015-08-18 Thread Pierre BLONDEAU
On 14/08/2015 15:48, Pierre BLONDEAU wrote: Hi, Yesterday, I removed 5 OSDs out of 15 from my cluster ( machine migration ). When I stopped the processes, I hadn't verified that all the PGs were in active state. I removed the 5 OSDs from the cluster ( ceph osd out osd.9 ; ceph osd crush rm

Re: [ceph-users] Repair inconsistent pgs..

2015-08-18 Thread Voloshanenko Igor
No. This will not help ((( I tried to find the data, but it looks like it either exists with the same timestamp on all OSDs or is missing on all OSDs ... So I need advice on what to do... On Tuesday, 18 August 2015, Abhishek L wrote: Voloshanenko Igor writes: Hi Irek, Please read carefully ))) Your

Re: [ceph-users] Stuck creating pg

2015-08-18 Thread Bart Vanbrabant
1) No errors at all. At loglevel 20 the osd does not say anything about the missing placement group 2) I tried that. Several times actually, also for the secondary osd's, but it does not work. gr, Bart On Tue, Aug 18, 2015 at 4:28 AM minchen minche...@outlook.com wrote: osd.19 is blocked by

[ceph-users] radosgw-agent keeps syncing most active bucket - ignoring others

2015-08-18 Thread Sam Wouters
Hi, from the radosgw-agent docs and some posts on this list, I understood that the max-entries argument was there to prevent a very active bucket from keeping the other buckets from being synced. In our agent logs however we saw a lot of bucket instance bla has 1000 entries after bla, and the

Re: [ceph-users] any recommendation of using EnhanceIO?

2015-08-18 Thread Nick Fisk
-Original Message- From: Emmanuel Florac [mailto:eflo...@intellique.com] Sent: 18 August 2015 12:26 To: Nick Fisk n...@fisk.me.uk Cc: 'Jan Schermer' j...@schermer.cz; 'Alex Gorbachev' ag@iss-integration.com; 'Dominik Zalewski' dzalew...@optlink.net; ceph-us...@lists.ceph.com

Re: [ceph-users] any recommendation of using EnhanceIO?

2015-08-18 Thread Nick Fisk
Just to chime in, I gave dmcache a limited test but its lack of proper writeback cache ruled it out for me. It only performs write back caching on blocks already on the SSD, whereas I need something that works like a Battery backed raid controller caching all writes. It's amazing the 100x

Re: [ceph-users] any recommendation of using EnhanceIO?

2015-08-18 Thread Emmanuel Florac
On Tue, 18 Aug 2015 10:12:59 +0100 Nick Fisk n...@fisk.me.uk wrote: Bcache should be the obvious choice if you are in control of the environment. At least you can cry on LKML's shoulder when you lose data :-) Please note, it looks like the main (only?) dev of Bcache has started

[ceph-users] Rename Ceph cluster

2015-08-18 Thread Vasiliy Angapov
Hi, Does anyone know what steps should be taken to rename a Ceph cluster? Btw, is it even possible without data loss? Background: I have a cluster named ceph-prod integrated with OpenStack, however I found out that the default cluster name ceph is very much hardcoded into OpenStack so I decided

Re: [ceph-users] Rename Ceph cluster

2015-08-18 Thread Jan Schermer
I think it's pretty clear: http://ceph.com/docs/master/install/manual-deployment/ For example, when you run multiple clusters in a federated architecture, the cluster name (e.g., us-west, us-east) identifies the cluster for the current CLI session. Note: To identify the cluster name on the

Re: [ceph-users] Repair inconsistent pgs..

2015-08-18 Thread Gregory Farnum
From a quick peek it looks like some of the OSDs are missing clones of objects. I'm not sure how that could happen and I'd expect the pg repair to handle that but if it's not there's probably something wrong; what version of Ceph are you running? Sam, is this something you've seen, a new bug, or

Re: [ceph-users] Rename Ceph cluster

2015-08-18 Thread Wido den Hollander
On 18-08-15 14:13, Erik McCormick wrote: I've got a custom named cluster integrated with Openstack (Juno) and didn't run into any hard-coded name issues that I can recall. Where are you seeing that? As to the name change itself, I think it's really just a label applying to a configuration

Re: [ceph-users] any recommendation of using EnhanceIO?

2015-08-18 Thread Mark Nelson
Hi Jan, Out of curiosity did you ever try dm-cache? I've been meaning to give it a spin but haven't had the spare cycles. Mark On 08/18/2015 04:00 AM, Jan Schermer wrote: I already evaluated EnhanceIO in combination with CentOS 6 (and backported 3.10 and 4.0 kernel-lt if I remember

Re: [ceph-users] radosgw-agent keeps syncing most active bucket - ignoring others

2015-08-18 Thread Sam Wouters
Hmm, looks like intended behaviour: SNIP CommitDate: Mon Mar 3 06:08:42 2014 -0800 worker: process all bucket instance log entries at once Currently if there are more than max_entries in a single bucket instance's log, only max_entries of those will be processed, and the bucket

Re: [ceph-users] any recommendation of using EnhanceIO?

2015-08-18 Thread Stefan Priebe - Profihost AG
We've been using an extra caching layer for ceph since the beginning for our older ceph deployments. All new deployments go with full SSDs. I've tested so far: - EnhanceIO - Flashcache - Bcache - dm-cache - dm-writeboost The best working solution was and is bcache except for its buggy code. The

Re: [ceph-users] How to improve single thread sequential reads?

2015-08-18 Thread Jan Schermer
Reply in text On 18 Aug 2015, at 12:59, Nick Fisk n...@fisk.me.uk wrote: -Original Message- From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Jan Schermer Sent: 18 August 2015 11:50 To: Benedikt Fraunhofer given.to.lists.ceph-

Re: [ceph-users] How to improve single thread sequential reads?

2015-08-18 Thread Nick Fisk
-Original Message- From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Jan Schermer Sent: 18 August 2015 12:41 To: Nick Fisk n...@fisk.me.uk Cc: ceph-users@lists.ceph.com Subject: Re: [ceph-users] How to improve single thread sequential reads? Reply in

Re: [ceph-users] How to improve single thread sequential reads?

2015-08-18 Thread Jan Schermer
On 18 Aug 2015, at 13:58, Nick Fisk n...@fisk.me.uk wrote: -Original Message- From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Jan Schermer Sent: 18 August 2015 12:41 To: Nick Fisk n...@fisk.me.uk Cc: ceph-users@lists.ceph.com Subject: Re:

Re: [ceph-users] Rename Ceph cluster

2015-08-18 Thread Erik McCormick
I've got a custom named cluster integrated with Openstack (Juno) and didn't run into any hard-coded name issues that I can recall. Where are you seeing that? As to the name change itself, I think it's really just a label applying to a configuration set. The name doesn't actually appear *in* the

Re: [ceph-users] any recommendation of using EnhanceIO?

2015-08-18 Thread Jan Schermer
Yes, writeback mode. I didn't try anything else. Jan On 18 Aug 2015, at 18:30, Alex Gorbachev a...@iss-integration.com wrote: HI Jan, On Tue, Aug 18, 2015 at 5:00 AM, Jan Schermer j...@schermer.cz wrote: I already evaluated EnhanceIO in combination with CentOS 6 (and backported 3.10

Re: [ceph-users] any recommendation of using EnhanceIO?

2015-08-18 Thread Nick Fisk
-Original Message- From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Jan Schermer Sent: 18 August 2015 17:13 To: Nick Fisk n...@fisk.me.uk Cc: ceph-users@lists.ceph.com Subject: Re: [ceph-users] any recommendation of using EnhanceIO? On 18 Aug 2015, at

Re: [ceph-users] any recommendation of using EnhanceIO?

2015-08-18 Thread Alex Gorbachev
IE, should we be focusing on IOPS? Latency? Finding a way to avoid journal overhead for large writes? Are there specific use cases where we should specifically be focusing attention? general iscsi? S3? databases directly on RBD? etc. There's tons of different areas that we can work on

Re: [ceph-users] any recommendation of using EnhanceIO?

2015-08-18 Thread Mark Nelson
On 08/18/2015 11:52 AM, Nick Fisk wrote: snip Here's kind of how I see the field right now: 1) Cache at the client level. Likely fastest but obvious issues like above. RAID1 might be an option at increased cost. Lack of barriers in some implementations scary. Agreed. 2) Cache below

Re: [ceph-users] any recommendation of using EnhanceIO?

2015-08-18 Thread Samuel Just
1. We've kicked this around a bit. What kind of failure semantics would you be comfortable with here (that is, what would be reasonable behavior if the client side cache fails)? 2. We've got a branch which should merge soon (tomorrow probably) which actually does allow writes to be proxied, so

[ceph-users] ceph-osd suddenly dies and no longer can be started

2015-08-18 Thread Евгений Д .
Hello. I have a small Ceph cluster running 9 OSDs, using XFS on separate disks and dedicated partitions on the system disk for journals. After creation it worked fine for a while. Then suddenly one of the OSDs stopped and didn't start. I had to recreate it. Recovery started. After a few days of recovery

Re: [ceph-users] any recommendation of using EnhanceIO?

2015-08-18 Thread Nick Fisk
-Original Message- From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Mark Nelson Sent: 18 August 2015 18:51 To: Nick Fisk n...@fisk.me.uk; 'Jan Schermer' j...@schermer.cz Cc: ceph-users@lists.ceph.com Subject: Re: [ceph-users] any recommendation of using

Re: [ceph-users] any recommendation of using EnhanceIO?

2015-08-18 Thread Christian Balzer
On Tue, 18 Aug 2015 20:48:26 +0100 Nick Fisk wrote: [mega snip] 4. Disk based OSD with SSD Journal performance As I touched on above earlier, I would expect a disk based OSD with SSD journal to have similar performance to a pure SSD OSD when dealing with sequential small IO's. Currently the

Re: [ceph-users] any recommendation of using EnhanceIO?

2015-08-18 Thread Christian Balzer
On Tue, 18 Aug 2015 12:50:38 -0500 Mark Nelson wrote: [snap] Probably the big question is what are the pain points? The most common answer we get when asking folks what applications they run on top of Ceph is everything!. This is wonderful, but not helpful when trying to figure out what

[ceph-users] [Cache-tier] librbd: error finding source object: (2) No such file or directory

2015-08-18 Thread Ta Ba Tuan
Hi everyone, I have been using the cache tier on a data pool. After a long time, a lot of rbd images are no longer displayed in rbd -p data ls, although those images still show up through the rbd info and rados ls commands. rbd -p data info volume-008ae4f7-3464-40c0-80b0-51140d8b95a8 rbd image

Re: [ceph-users] any recommendation of using EnhanceIO?

2015-08-18 Thread Nick Fisk
Hi Sam, -Original Message- From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Samuel Just Sent: 18 August 2015 21:38 To: Nick Fisk n...@fisk.me.uk Cc: ceph-users@lists.ceph.com Subject: Re: [ceph-users] any recommendation of using EnhanceIO? 1. We've

Re: [ceph-users] any recommendation of using EnhanceIO?

2015-08-18 Thread Campbell, Bill
Hey Stefan, Are you using your Ceph cluster for virtualization storage? Is dm-writeboost configured on the OSD nodes themselves? - Original Message - From: Stefan Priebe - Profihost AG s.pri...@profihost.ag To: Mark Nelson mnel...@redhat.com, ceph-users@lists.ceph.com Sent: Tuesday,

Re: [ceph-users] any recommendation of using EnhanceIO?

2015-08-18 Thread Mark Nelson
On 08/18/2015 06:47 AM, Nick Fisk wrote: Just to chime in, I gave dmcache a limited test but its lack of proper writeback cache ruled it out for me. It only performs write back caching on blocks already on the SSD, whereas I need something that works like a Battery backed raid controller

Re: [ceph-users] any recommendation of using EnhanceIO?

2015-08-18 Thread Mark Nelson
On 08/18/2015 09:24 AM, Jan Schermer wrote: On 18 Aug 2015, at 15:50, Mark Nelson mnel...@redhat.com wrote: On 08/18/2015 06:47 AM, Nick Fisk wrote: Just to chime in, I gave dmcache a limited test but its lack of proper writeback cache ruled it out for me. It only performs write back

Re: [ceph-users] ceph cluster_network with linklocal ipv6

2015-08-18 Thread Jan Schermer
Should ceph care about what scope the address is in? We don't specify it for ipv4 anyway, or is link-scope special in some way? And isn't this the correct syntax actually? cluster_network = fe80::/64%cephnet On 18 Aug 2015, at 16:17, Wido den Hollander w...@42on.com wrote: On 18-08-15

Re: [ceph-users] ceph distributed osd

2015-08-18 Thread gjprabu
Hi Luis, What I mean is: we have three OSDs, each with a 1TB hard disk, and two pools (poolA and poolB) with replica 2. The write behavior is what confuses us. Our assumptions are below. PoolA -- may write to OSD1 and OSD2 (is this correct?) PoolB -- may write to OSD3

Re: [ceph-users] any recommendation of using EnhanceIO?

2015-08-18 Thread Nick Fisk
-Original Message- From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Mark Nelson Sent: 18 August 2015 14:51 To: Nick Fisk n...@fisk.me.uk; 'Jan Schermer' j...@schermer.cz Cc: ceph-users@lists.ceph.com Subject: Re: [ceph-users] any recommendation of using

Re: [ceph-users] any recommendation of using EnhanceIO?

2015-08-18 Thread Nick Fisk
snip Here's kind of how I see the field right now: 1) Cache at the client level. Likely fastest but obvious issues like above. RAID1 might be an option at increased cost. Lack of barriers in some implementations scary. Agreed. 2) Cache below the OSD. Not much recent

Re: [ceph-users] ceph cluster_network with linklocal ipv6

2015-08-18 Thread Wido den Hollander
On 18 Aug 2015, at 18:15, Jan Schermer j...@schermer.cz wrote: On 18 Aug 2015, at 17:57, Björn Lässig b.laes...@pengutronix.de wrote: On 08/18/2015 04:32 PM, Jan Schermer wrote: Should ceph care about what scope the address is in? We don't specify it for

Re: [ceph-users] any recommendation of using EnhanceIO?

2015-08-18 Thread Alex Gorbachev
Hi Jan, On Tue, Aug 18, 2015 at 5:00 AM, Jan Schermer j...@schermer.cz wrote: I already evaluated EnhanceIO in combination with CentOS 6 (and backported 3.10 and 4.0 kernel-lt if I remember correctly). It worked fine during benchmarks and stress tests, but once we ran DB2 on it, it panicked

Re: [ceph-users] any recommendation of using EnhanceIO?

2015-08-18 Thread Jan Schermer
On 18 Aug 2015, at 16:44, Nick Fisk n...@fisk.me.uk wrote: -Original Message- From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Mark Nelson Sent: 18 August 2015 14:51 To: Nick Fisk n...@fisk.me.uk; 'Jan Schermer' j...@schermer.cz Cc:

Re: [ceph-users] ceph cluster_network with linklocal ipv6

2015-08-18 Thread Jan Schermer
On 18 Aug 2015, at 17:57, Björn Lässig b.laes...@pengutronix.de wrote: On 08/18/2015 04:32 PM, Jan Schermer wrote: Should ceph care about what scope the address is in? We don't specify it for ipv4 anyway, or is link-scope special in some way? fe80::/64 is on every ipv6 enabled interface

Re: [ceph-users] any recommendation of using EnhanceIO?

2015-08-18 Thread Mark Nelson
On 08/18/2015 11:08 AM, Nick Fisk wrote: -Original Message- From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Mark Nelson Sent: 18 August 2015 15:55 To: Jan Schermer j...@schermer.cz Cc: ceph-users@lists.ceph.com; Nick Fisk n...@fisk.me.uk Subject: Re:

Re: [ceph-users] Repair inconsistent pgs..

2015-08-18 Thread Samuel Just
Also, what command are you using to take snapshots? -Sam On Tue, Aug 18, 2015 at 8:48 AM, Samuel Just sj...@redhat.com wrote: Is the number of inconsistent objects growing? Can you attach the whole ceph.log from the 6 hours before and after the snippet you linked above? Are you using

Re: [ceph-users] any recommendation of using EnhanceIO?

2015-08-18 Thread Nick Fisk
-Original Message- From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Mark Nelson Sent: 18 August 2015 15:55 To: Jan Schermer j...@schermer.cz Cc: ceph-users@lists.ceph.com; Nick Fisk n...@fisk.me.uk Subject: Re: [ceph-users] any recommendation of using

Re: [ceph-users] ceph cluster_network with linklocal ipv6

2015-08-18 Thread Björn Lässig
On 08/18/2015 04:32 PM, Jan Schermer wrote: Should ceph care about what scope the address is in? We don't specify it for ipv4 anyway, or is link-scope special in some way? fe80::/64 is on every ipv6 enabled interface ... that's different from legacy ip. And isn't this the correct syntax