Re: [ceph-users] Poor performance on all SSD cluster

2014-06-23 Thread Alexandre DERUMIER
I don't know if it's related, but "[Performance] Improvement on DB Performance" http://www.spinics.net/lists/ceph-devel/msg19062.html -- there is a patch here: https://github.com/ceph/ceph/pull/1848, already pushed to master. - Original message - From: "Robert van Leeuwen" To: ceph-users@l

Re: [ceph-users] Poor performance on all SSD cluster

2014-06-23 Thread Robert van Leeuwen
> All of which means that MySQL performance (looking at you, binlog) may > still suffer due to lots of small-block-size sync writes. Which begs the question: is anyone running a reasonably busy MySQL server on Ceph-backed storage? We tried and it did not perform well enough. We have a small ceph clu

Re: [ceph-users] Deep scrub versus osd scrub load threshold

2014-06-23 Thread Christian Balzer
Hello, On Mon, 23 Jun 2014 21:50:50 -0700 David Zafman wrote: > > By default osd_scrub_max_interval and osd_deep_scrub_interval are 1 week > 604800 seconds (60*60*24*7) and osd_scrub_min_interval is 1 day 86400 > seconds (60*60*24). As long as osd_scrub_max_interval <= > osd_deep_scrub_interv

Re: [ceph-users] Poor performance on all SSD cluster

2014-06-23 Thread Mark Kirkwood
On 24/06/14 17:37, Alexandre DERUMIER wrote: Hi Greg, So the only way to improve performance would be to not use O_DIRECT (as this should bypass rbd cache as well, right?). Yes, indeed, O_DIRECT bypasses the cache. BTW, do you need to use MySQL with O_DIRECT? The default innodb_flush_method is fda

Re: [ceph-users] Poor performance on all SSD cluster

2014-06-23 Thread Alexandre DERUMIER
Hi Greg, >>So the only way to improve performance would be to not use O_DIRECT (as this >>should bypass rbd cache as well, right?). Yes, indeed, O_DIRECT bypasses the cache. BTW, do you need to use MySQL with O_DIRECT? The default innodb_flush_method is fdatasync, so it should work with the cache. (but yo
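
For illustration, a minimal sketch of the two pieces being discussed: the guest's my.cnf line and the client-side rbd cache settings. The hypervisor is assumed to be QEMU/librbd, and none of the exact values used in this thread appear in the truncated preview.

  # guest /etc/mysql/my.cnf: keep the default flush method so InnoDB uses
  # fdatasync()/fsync() rather than O_DIRECT, allowing writes to hit the rbd cache
  [mysqld]
  innodb_flush_method = fdatasync

  # hypervisor /etc/ceph/ceph.conf: enable client-side caching for librbd
  [client]
  rbd cache = true
  rbd cache writethrough until flush = true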

Re: [ceph-users] Deep scrub versus osd scrub load threshold

2014-06-23 Thread David Zafman
By default osd_scrub_max_interval and osd_deep_scrub_interval are 1 week 604800 seconds (60*60*24*7) and osd_scrub_min_interval is 1 day 86400 seconds (60*60*24). As long as osd_scrub_max_interval <= osd_deep_scrub_interval then the load won’t impact when deep scrub occurs. I suggest that o
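
Expressed as a ceph.conf sketch, with the stock defaults quoted above (shown for reference, not as a tuning recommendation):

  [osd]
  osd scrub min interval = 86400        # 1 day (default)
  osd scrub max interval = 604800       # 1 week (default)
  osd deep scrub interval = 604800      # 1 week (default)
  osd scrub load threshold = 0.5        # the load cut-off under discussion (default)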

Re: [ceph-users] Poor performance on all SSD cluster

2014-06-23 Thread Christian Balzer
Hello, On Mon, 23 Jun 2014 10:26:32 -0700 Greg Poirier wrote: > 10 OSDs per node So 90 OSDs in total. > 12 physical cores hyperthreaded (24 logical cores exposed to OS) Sounds good. > 64GB RAM With SSDs the effect of a large pagecache on the storage nodes isn't that pronounced, but still nice.

Re: [ceph-users] Deep scrub versus osd scrub load threshold

2014-06-23 Thread Christian Balzer
Hello, On Mon, 23 Jun 2014 14:20:37 -0400 Gregory Farnum wrote: > Looks like it's a doc error (at least on master), but it might have > changed over time. If you're running Dumpling we should change the > docs. Nope, I'm running 0.80.1 currently. Christian > -Greg > Software Engineer #42 @ ht

Re: [ceph-users] Multiple hierarchies and custom placement

2014-06-23 Thread Shayan Saeed
Thanks for getting back with a helpful reply. Assuming that I change the source code to do custom placement, what are the places I need to look in the code to do that? I am currently trying to change the CRUSH code, but is there any place else I need to be concerned about? Regards, Shayan Saeed

Re: [ceph-users] HowTo mark an OSD as down?

2014-06-23 Thread Udo Lembke
Hi Henrik, On 23.06.2014 09:16, Henrik Korkuc wrote: > "ceph osd set noup" will prevent OSDs from coming up. Later, remember > to run "ceph osd unset noup". > > You can stop an OSD with "stop ceph-osd id=29". Thanks for the hint! Udo
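
Put together, the sequence from this thread looks roughly like the sketch below; the sysvinit line is an assumption for non-Upstart systems.

  ceph osd set noup              # keep OSDs from being marked up again
  stop ceph-osd id=29            # Upstart, as used above
  # service ceph stop osd.29     # sysvinit equivalent (assumption)
  ceph osd tree | grep osd.29    # verify the OSD now shows as down
  ceph osd unset noup            # clear the flag again afterwards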

Re: [ceph-users] Poor performance on all SSD cluster

2014-06-23 Thread Mark Nelson
On 06/23/2014 12:54 PM, Greg Poirier wrote: On Sun, Jun 22, 2014 at 6:44 AM, Mark Nelson wrote: RBD Cache is definitely going to help in this use case. This test is basically just sequentially writing a single 16k chunk of data out, one at a time. I

Re: [ceph-users] Behaviour of ceph pg repair on different replication levels

2014-06-23 Thread Gregory Farnum
On Mon, Jun 23, 2014 at 4:54 AM, Christian Eichelmann wrote: > Hi ceph users, > > since our cluster has had a few inconsistent PGs recently, I was > wondering what ceph pg repair does, depending on the replication level. > So I just wanted to check if my assumptions are correct: > > Replicatio

Re: [ceph-users] trying to interpret lines in osd.log

2014-06-23 Thread Gregory Farnum
On Mon, Jun 23, 2014 at 4:26 AM, Christian Kauhaus wrote: > I see several instances of the following log messages in the OSD logs each > day: > > 2014-06-21 02:05:27.740697 7fbc58b78700 0 -- 172.22.8.12:6810/31918 >> > 172.22.8.12:6800/28827 pipe(0x7fbe400029f0 sd=764 :6810 s=0 pgs=0 cs=0 l=0 >

Re: [ceph-users] Deep scrub versus osd scrub load threshold

2014-06-23 Thread Gregory Farnum
Looks like it's a doc error (at least on master), but it might have changed over time. If you're running Dumpling we should change the docs. -Greg Software Engineer #42 @ http://inktank.com | http://ceph.com On Sun, Jun 22, 2014 at 10:18 PM, Christian Balzer wrote: > > Hello, > > This weekend I

Re: [ceph-users] Multiple hierarchies and custom placement

2014-06-23 Thread Gregory Farnum
On Fri, Jun 20, 2014 at 4:23 PM, Shayan Saeed wrote: > Is it allowed for crush maps to have multiple hierarchies for different > pools? So for example, I want one pool to treat my cluster as flat with > every host being equal but the other pool to have a more hierarchical idea > as hosts->racks->r
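
A sketch of what such a map could look like; the bucket names, weights and rule numbers are made up, and the hosts/racks referenced under each root are assumed to be declared elsewhere in the map.

  # decompile, edit, recompile, inject:
  #   ceph osd getcrushmap -o map.bin
  #   crushtool -d map.bin -o map.txt
  #   crushtool -c map.txt -o map.new && ceph osd setcrushmap -i map.new

  root flat {                    # flat view: hosts directly under the root
          id -10
          alg straw
          hash 0
          item host-a weight 1.000
          item host-b weight 1.000
  }
  root racked {                  # hierarchical view: racks containing hosts
          id -20
          alg straw
          hash 0
          item rack-1 weight 2.000
          item rack-2 weight 2.000
  }
  rule flat_rule {
          ruleset 1
          type replicated
          min_size 1
          max_size 10
          step take flat
          step chooseleaf firstn 0 type host
          step emit
  }
  rule racked_rule {
          ruleset 2
          type replicated
          min_size 1
          max_size 10
          step take racked
          step chooseleaf firstn 0 type rack
          step emit
  }

  # then point each pool at its rule, e.g.:
  #   ceph osd pool set flatpool crush_ruleset 1
  #   ceph osd pool set rackedpool crush_ruleset 2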

Re: [ceph-users] Poor performance on all SSD cluster

2014-06-23 Thread Greg Poirier
On Sun, Jun 22, 2014 at 6:44 AM, Mark Nelson wrote: > RBD Cache is definitely going to help in this use case. This test is > basically just sequentially writing a single 16k chunk of data out, one at > a time. IE, entirely latency bound. At least on OSDs backed by XFS, you > have to wait for t

Re: [ceph-users] Level DB with RADOS

2014-06-23 Thread Gregory Farnum
Well, it's in the Ceph repository, in the "OSD" and "os" directories, available at https://github.com/ceph/ceph. But it's not the kind of thing you can really extract from Ceph, and if you're interested in getting involved in the project you're going to need to spend a lot of time poking around th
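
For anyone following that pointer, the directories in question sit under src/ in the tree, roughly:

  git clone https://github.com/ceph/ceph.git
  cd ceph
  ls src/os src/osd    # the object store and OSD code referred to above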

Re: [ceph-users] Poor performance on all SSD cluster

2014-06-23 Thread Greg Poirier
10 OSDs per node. 12 physical cores, hyperthreaded (24 logical cores exposed to the OS). 64GB RAM. Negligible load; iostat shows the disks are largely idle except for occasional bursty writes. Results of fio from one of the SSDs in the cluster: fiojob: (g=0): rw=randwrite, bs=4K-4K/4K-4K/4K-4K, ioengi
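
The job file itself is cut off above; a sketch of a job that would produce a header like the one quoted (the device name is a placeholder, and everything after rw/bs is guessed):

  [fiojob]
  filename=/dev/sdX      ; one of the SSDs (placeholder)
  rw=randwrite
  bs=4k
  ioengine=libaio
  direct=1
  iodepth=1
  numjobs=1
  runtime=60
  time_based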

[ceph-users] Ceph RGW + S3 Client (s3cmd)

2014-06-23 Thread Vickey Singh
Hello Cephers, I have followed the Ceph documentation and my rados gateway setup is working fine. # swift -V 1.0 -A http://bmi-pocfe2.scc.fi/auth -U scc:swift -K secretkey list Hello-World bmi-pocfe2 scc packstack test # # radosgw-admin bucket stats --bucket=scc { "bucket": "scc",
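
Since the subject mentions s3cmd, a sketch of the S3 side; the uid and display name are placeholders, and the host name is taken from the message above:

  # create an S3-capable user on the gateway
  radosgw-admin user create --uid=s3user --display-name="S3 test user"
  # copy access_key and secret_key from the JSON output into ~/.s3cfg:
  #   access_key  = <from output>
  #   secret_key  = <from output>
  #   host_base   = bmi-pocfe2.scc.fi
  #   host_bucket = %(bucket)s.bmi-pocfe2.scc.fi
  s3cmd ls                    # should list the same buckets swift shows
  s3cmd mb s3://newbucket     # create a bucket over S3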

[ceph-users] XFS - number of files in a directory

2014-06-23 Thread Guang Yang
Hello Cephers, We used to have a Ceph cluster with our data pool set up with 3 replicas. We estimated the number of files (given disk size and object size) for each PG was around 8K, and we disabled folder splitting, which means all files are located in the root PG folder. Our testing showed a good performanc
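
For context, the FileStore options that control when a PG directory is split into subdirectories are sketched below; the values are illustrative, and the preview does not show how splitting was actually disabled in this cluster.

  [osd]
  filestore merge threshold = 40    # illustrative; larger values delay splitting
  filestore split multiple = 8      # illustrative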

[ceph-users] best way to get the number of monitors

2014-06-23 Thread Jan Kalcic
Hi all, just like "ceph osd create", is there a similar command to obtain the ID of the next new monitor? (i.e. there are 2 mons in the cluster at the moment, the next one will be the third) Failing that, is there a good way to look up the current number of monitors? Thanks, Jan
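
Monitors are addressed by name rather than by a sequentially allocated id, so there is no direct counterpart to "ceph osd create"; the current count and names can be read with, for example:

  ceph mon stat          # one-line summary: monmap epoch, monitor count, quorum
  ceph mon dump          # full monmap, one line per monitor with name and address
  ceph quorum_status     # JSON output including the monmap and current quorum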

[ceph-users] 403 error on http://ceph.com/docs/master/

2014-06-23 Thread Christian Kauhaus
Hi, the "Documentation" link on the ceph.com home page leads to a 403 error page. Is this a web server malfunction/misconfiguration, or do the docs live under different URLs now? Regards, Christian -- Dipl.-Inf. Christian Kauhaus <>< · k...@gocept.com · systems administration gocept gmbh & co. kg ·

[ceph-users] Behaviour of ceph pg repair on different replication levels

2014-06-23 Thread Christian Eichelmann
Hi ceph users, since our cluster has had a few inconsistent PGs recently, I was wondering what ceph pg repair does, depending on the replication level. So I just wanted to check if my assumptions are correct: Replication 2x: Since the cluster cannot decide which version is the correct one, it wou
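
For reference, the commands involved (the pgid is a placeholder):

  ceph health detail | grep inconsistent    # list the PGs flagged inconsistent
  ceph pg repair 2.1f                       # ask the primary OSD to repair that PG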

[ceph-users] trying to interpret lines in osd.log

2014-06-23 Thread Christian Kauhaus
I see several instances of the following log messages in the OSD logs each day: 2014-06-21 02:05:27.740697 7fbc58b78700 0 -- 172.22.8.12:6810/31918 >> 172.22.8.12:6800/28827 pipe(0x7fbe400029f0 sd=764 :6810 s=0 pgs=0 cs=0 l=0 c=0x7fbe40003190).accept connect_seq 30 vs existing 29 state standby 2

[ceph-users] osd flapping ; heartbeat failed

2014-06-23 Thread eric mourgaya
Hi, my version of ceph is 0.72.2 on Scientific Linux with the 2.6.32-431.1.2.el6.x86_64 kernel. After network trouble on all my nodes, OSDs flap from up to down periodically. I have to set the nodown flag to stabilize the cluster. I have a public_network and a cluster_network. I have this message on most of
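
The flag mentioned above is set and cleared as sketched below; osd heartbeat grace is a further knob that may be relevant here, and the value shown is illustrative (the default is 20 seconds).

  ceph osd set nodown            # stop the monitors from marking flapping OSDs down
  ceph osd dump | grep flags     # confirm the flag is in place
  # ... investigate / repair the network ...
  ceph osd unset nodown

  [osd]
  osd heartbeat grace = 35       # illustrative; default is 20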

Re: [ceph-users] Poor performance on all SSD cluster

2014-06-23 Thread Mark Kirkwood
On 23/06/14 18:51, Christian Balzer wrote: On Sunday, June 22, 2014, Mark Kirkwood wrote: rbd cache max dirty = 1073741824 / rbd cache max dirty age = 100 -- Mark, you're giving it a 2GB cache. For a write test that's 1GB in size. "Aggressively set" is a bit of an understatement here. ^o^ Most people wil
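
For reference, those settings belong in the [client] section; a sketch is below. The rbd cache size line is an assumption based on the 2GB figure mentioned and does not appear in the quoted snippet.

  [client]
  rbd cache = true
  rbd cache size = 2147483648                 # 2 GB (assumption, see note above)
  rbd cache max dirty = 1073741824            # as quoted above
  rbd cache max dirty age = 100               # as quoted above
  rbd cache writethrough until flush = true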

Re: [ceph-users] HowTo mark an OSD as down?

2014-06-23 Thread Henrik Korkuc
On 2014.06.23 10:01, Udo Lembke wrote: > Hi, > AFAIK "ceph osd down osd.29" should mark osd.29 as down. > But what can I do if this doesn't happen? > > I got the following: > root@ceph-02:~# ceph osd down osd.29 > marked down osd.29. > > root@ceph-02:~# ceph osd tree > 2014-06-23 08:51:00.588042 7f

Re: [ceph-users] HowTo mark an OSD as down?

2014-06-23 Thread Udo Lembke
Hi again, found a solution: initctl stop ceph-osd id=29 root@ceph-02:~# ceph osd tree # id weight type name up/down reweight -1 203.8 root default -3 203.8 rack unknownrack -2 29.12 host ceph-01 52 3.64 osd.52

[ceph-users] HowTo mark an OSD as down?

2014-06-23 Thread Udo Lembke
Hi, AFAIK "ceph osd down osd.29" should mark osd.29 as down. But what can I do if this doesn't happen? I got the following: root@ceph-02:~# ceph osd down osd.29 marked down osd.29. root@ceph-02:~# ceph osd tree 2014-06-23 08:51:00.588042 7f15747f5700 0 -- :/1018258 >> 172.20.2.11:6789/0 pipe(0x7