Re: [ceph-users] Reducing the impact of OSD restarts (noout ain't up to snuff)

2016-02-12 Thread Robert LeBlanc
Christian, Yep, that describes what I see too. Good news is that I made a lot of progress on optimizing the queue today: a 10-50% performance increase in my microbenchmarks (that is only the improvement in enqueueing and dequeueing ops, which is a small part of the whole IO path, but every little bit

Re: [ceph-users] Reducing the impact of OSD restarts (noout ain't up to snuff)

2016-02-12 Thread Christian Balzer
Hello, for the record what Robert is writing below matches my experience the best. On Fri, 12 Feb 2016 22:17:01 + Steve Taylor wrote: > I could be wrong, but I didn't think a PG would have to peer when an OSD > is restarted with noout set. If I'm wrong, then this peering would > definitely

[ceph-users] learning about increasing osd / pg_num for a pool

2016-02-12 Thread John Hogenmiller (yt)
I started a cluster with 9 OSDs across 3 nodes. Then I expanded it to 419 OSDs across 7 nodes. Along the way, I increased the pg_num/pgp_num in the rbd pool. Thanks to help earlier on this list, I was able to do that. Tonight I started to do some perf testing and quickly realized that I never upda
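
For readers following along, increasing the placement group count on an existing pool is a two-step change (pg_num first, then pgp_num); a minimal sketch, assuming the rbd pool and a target of 4096 PGs (both values are illustrative, not taken from the thread):

    # raise the PG count, then let data rebalance by raising pgp_num to match
    ceph osd pool set rbd pg_num 4096
    ceph osd pool set rbd pgp_num 4096
    # verify
    ceph osd pool get rbd pg_num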

Re: [ceph-users] Reducing the impact of OSD restarts (noout ain't up to snuff)

2016-02-12 Thread Steve Taylor
I could be wrong, but I didn't think a PG would have to peer when an OSD is restarted with noout set. If I'm wrong, then this peering would definitely block I/O. I just did a quick test on a non-busy cluster and didn't see any peering when my OSD went down or up, but I'm not sure how good a test
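
A quick way to repeat Steve's test and watch for peering yourself; osd.3 and the sysvinit-style service commands are placeholders, adjust for your own cluster and init system:

    ceph osd set noout              # keep CRUSH from marking the OSD out
    service ceph stop osd.3
    ceph -w                         # watch for "peering" / "degraded" PG states
    service ceph start osd.3
    ceph osd unset noout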

Re: [ceph-users] LUG 2016

2016-02-12 Thread Brian Andrus
Hello fellow namesake. Though I'm doubtful there will be representation in an official capacity at the Lustre User's Group, you might want to check out Ceph Days... http://ceph.com/cephdays/ On Fri, Feb 12, 2016 at 1:13 PM, Andrus, Brian Contractor wrote: > Does anyone know if there will be an

[ceph-users] LUG 2016

2016-02-12 Thread Andrus, Brian Contractor
Does anyone know if there will be any representation of ceph at the Lustre Users' Group in Portland this year? If not, is there any event in the US that brings the ceph community together? Brian Andrus ITACS/Research Computing Naval Postgraduate School Monterey, California voice: 831-656-6238

Re: [ceph-users] cls_rbd ops on rbd_id.$name objects in EC pool

2016-02-12 Thread Nick Fisk
> -Original Message- > From: Nick Fisk [mailto:n...@fisk.me.uk] > Sent: 12 February 2016 13:31 > To: 'Sage Weil' > Cc: 'Jason Dillaman' ; 'Samuel Just' > ; ceph-users@lists.ceph.com; ceph- > de...@vger.kernel.org > Subject: RE: cls_rbd ops on rbd_id.$name objects in EC pool > > > -Ori

Re: [ceph-users] Reducing the impact of OSD restarts (noout ain't up to snuff)

2016-02-12 Thread Robert LeBlanc
-BEGIN PGP SIGNED MESSAGE- Hash: SHA256 What I've seen is that when an OSD starts up in a busy cluster, as soon as it is "in" (could be "out" before) it starts getting client traffic. However, it has to be "in" to start catching up and peering to the other OSDs in the cluster. The OSD is not
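
For anyone following the up/in distinction Robert draws here, the state of a single OSD can be checked like this (osd.12 is a placeholder ID):

    ceph osd tree | grep osd.12     # shows up/down status and CRUSH weight
    ceph osd dump | grep osd.12     # shows the up/in flags and last up/down epochs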

Re: [ceph-users] v10.0.3 released

2016-02-12 Thread Tyler Bishop
Great work as always, Sage! Tyler Bishop Chief Technical Officer 513-299-7108 x10 tyler.bis...@beyondhosting.net If you are not the intended recipient of this transmission you are notified that disclosing, copying, distributing or taking any action in reliance on the contents of this inf

Re: [ceph-users] Multipath devices with infernalis

2016-02-12 Thread Tyler Bishop
You're probably running into issues with sysvinit / upstart / whatever. Try partitioning the DM device and then mapping it directly in your ceph.conf under the osd section. It should work; ceph is just a process using the filesystem. Tyler Bishop Chief Technical Officer 513-299-7108 x
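
A minimal sketch of what Tyler is suggesting, assuming the multipath device has already been partitioned and the data partition is mounted at the usual OSD path; the device name and OSD ID are hypothetical, and the exact keys you need may vary by release:

    # /etc/fstab: mount the partition on the dm device at the OSD data dir
    /dev/mapper/mpatha-part1  /var/lib/ceph/osd/ceph-12  xfs  noatime  0 0

    # ceph.conf
    [osd.12]
        osd data = /var/lib/ceph/osd/ceph-12
        osd journal = /var/lib/ceph/osd/ceph-12/journal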

Re: [ceph-users] [Ceph-community] Getting WARN in __kick_osd_requests doing stress testing

2016-02-12 Thread Joao Eduardo Luis
Hi Bart, This email belongs in ceph-users (CC'ed), or maybe ceph-devel. You're unlikely to get answers to this on ceph-community. -Joao On 09/17/2015 11:33 PM, bart.bar...@osnexus.com wrote: > I'm running in a 3-node cluster and doing osd/rbd creation and deletion, > and ran across this WARN >

Re: [ceph-users] Reducing the impact of OSD restarts (noout ain't up to snuff)

2016-02-12 Thread Nick Fisk
I wonder if Christian is hitting some performance issue when one OSD or a number of OSDs all start up at once? Or maybe the OSD is still doing some internal startup procedure, and when the IO hits it on a very busy cluster, it causes it to become overloaded for a few seconds? I've seen similar thing
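
If the load really is recovery and backfill hitting a busy cluster, one commonly used mitigation (an illustration, not something Nick prescribes here) is to throttle recovery cluster-wide before restarting OSDs; the values are examples only:

    ceph tell osd.* injectargs '--osd_max_backfills 1 --osd_recovery_max_active 1 --osd_recovery_op_priority 1'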

Re: [ceph-users] Reducing the impact of OSD restarts (noout ain't up to snuff)

2016-02-12 Thread Steve Taylor
Nick is right. Setting noout is the right move in this scenario. Restarting an OSD shouldn't block I/O unless nodown is also set, however. The exception to this would be a case where min_size can't be achieved because of the down OSD, i.e. min_size=3 and 1 of 3 OSDs is restarting. That would cer
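
To check the values Steve mentions for a given pool (the pool name here is a placeholder):

    ceph osd pool get rbd size       # replica count
    ceph osd pool get rbd min_size   # minimum replicas required to serve I/O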

Re: [ceph-users] Reducing the impact of OSD restarts (noout ain't up to snuff)

2016-02-12 Thread Nick Fisk
> -Original Message- > From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of > Christian Balzer > Sent: 12 February 2016 15:38 > To: ceph-users@lists.ceph.com > Subject: Re: [ceph-users] Reducing the impact of OSD restarts (noout ain't > uptosnuff) > > On Fri, 12 Feb 2

Re: [ceph-users] Reducing the impact of OSD restarts (noout ain't up to snuff)

2016-02-12 Thread Christian Balzer
On Fri, 12 Feb 2016 15:56:31 +0100 Burkhard Linke wrote: > Hi, > > On 02/12/2016 03:47 PM, Christian Balzer wrote: > > Hello, > > > > yesterday I upgraded our most busy (in other words lethally overloaded) > > production cluster to the latest Firefly in preparation for a Hammer > > upgrade and th

Re: [ceph-users] ceph-disk activate fails (after 33 osd drives)

2016-02-12 Thread Alexey Sheplyakov
John, > 2016-02-12 12:53:43.340526 7f149bc71940 -1 journal FileJournal::_open: unable > to setup io_context (0) Success Try increasing aio-max-nr: echo 131072 > /proc/sys/fs/aio-max-nr Best regards, Alexey On Fri, Feb 12, 2016 at 4:51 PM, John Hogenmiller (yt) wrote: > > > I have 7 se
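
To make Alexey's change persistent across reboots (the value is the one suggested above; whether you need an even larger one depends on the OSD count):

    echo 131072 > /proc/sys/fs/aio-max-nr               # immediate change
    echo "fs.aio-max-nr = 131072" >> /etc/sysctl.conf   # persist across reboots
    sysctl -p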

Re: [ceph-users] Reducing the impact of OSD restarts (noout ain't up to snuff)

2016-02-12 Thread Burkhard Linke
Hi, On 02/12/2016 03:47 PM, Christian Balzer wrote: Hello, yesterday I upgraded our most busy (in other words lethally overloaded) production cluster to the latest Firefly in preparation for a Hammer upgrade and then phasing in of a cache tier. When restarting the OSDs it took 3 minutes (1 min

[ceph-users] Reducing the impact of OSD restarts (noout ain't up to snuff)

2016-02-12 Thread Christian Balzer
Hello, yesterday I upgraded our most busy (in other words lethally overloaded) production cluster to the latest Firefly in preparation for a Hammer upgrade and then phasing in of a cache tier. When restarting the OSDs it took 3 minutes (1 minute in a consecutive repeat to test the impact of prim
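
For context, the kind of restart sequence being discussed looks roughly like this on a Firefly-era sysvinit host (the OSD ID and init tooling are assumptions, not taken from the post):

    ceph osd set noout
    service ceph restart osd.12      # or restart all OSDs on the host: service ceph restart osd
    ceph -s                          # wait for peering/recovery to settle before the next host
    ceph osd unset noout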

Re: [ceph-users] ceph-disk activate fails (after 33 osd drives)

2016-02-12 Thread Alan Johnson
Can you check the value of kernel.pid_max? It may have to be increased for larger OSD counts and may have some bearing. From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of John Hogenmiller (yt) Sent: Friday, February 12, 2016 8:52 AM To: ceph-users@lists.ceph.com Subject:
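
Checking and raising kernel.pid_max as Alan suggests (the value shown is the kernel maximum and is only an example):

    sysctl kernel.pid_max                                 # current value, often 32768
    sysctl -w kernel.pid_max=4194303                      # immediate change
    echo "kernel.pid_max = 4194303" >> /etc/sysctl.conf   # persist across reboots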

[ceph-users] ceph-disk activate fails (after 33 osd drives)

2016-02-12 Thread John Hogenmiller (yt)
I have 7 servers, each containing 60 x 6TB drives in JBOD mode. When I first started, I only activated a couple of drives on 3 nodes as Ceph OSDs. Yesterday, I went to expand to the remaining nodes as well as prepare and activate all the drives. ceph-disk prepare worked just fine. However, ceph-disk
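
For reference, the prepare/activate pair John is using looks like this (device names are placeholders):

    ceph-disk prepare /dev/sdc       # partitions the disk and creates the filesystem and journal
    ceph-disk activate /dev/sdc1     # mounts the data partition and starts the OSD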

[ceph-users] OSDs crashing on garbage data

2016-02-12 Thread Jeffrey McDonald
Hi, I'm seeing a lot of errors like the following. The root cause appears to be the existence of a collection -- garbage data in the filestore. To clean it up, I have to remove a set of empty directories. The directories are old, created last August or September. I've had this happen a number
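
A hedged illustration of how one might locate the kind of empty directories Jeffrey describes; the path and OSD ID are generic, the OSD should be stopped first, and nothing should be removed without confirming it really is the garbage in question:

    # with osd.12 stopped
    find /var/lib/ceph/osd/ceph-12/current -type d -empty -print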

Re: [ceph-users] cls_rbd ops on rbd_id.$name objects in EC pool

2016-02-12 Thread Nick Fisk
> -Original Message- > From: ceph-devel-ow...@vger.kernel.org [mailto:ceph-devel- > ow...@vger.kernel.org] On Behalf Of Sage Weil > Sent: 12 February 2016 13:15 > To: Nick Fisk > Cc: 'Jason Dillaman' ; 'Samuel Just' > ; ceph-users@lists.ceph.com; ceph- > de...@vger.kernel.org > Subject: RE

Re: [ceph-users] cls_rbd ops on rbd_id.$name objects in EC pool

2016-02-12 Thread Sage Weil
On Thu, 11 Feb 2016, Sage Weil wrote: > On Thu, 11 Feb 2016, Nick Fisk wrote: > > That’s a relief, I was sensing a major case of face palm occurring when I > > read Jason's email!!! > > https://github.com/ceph/ceph/pull/7617 > > The tangled logic in maybe_handle_cache wasn't respecting the force

Re: [ceph-users] ceph 9.2.0 SAMSUNG ssd performance issue?

2016-02-12 Thread Nick Fisk
I will do my best to answer, but some of the questions are starting to stretch the limit of my knowledge > -Original Message- > From: Huan Zhang [mailto:huan.zhang...@gmail.com] > Sent: 12 February 2016 12:15 > To: Nick Fisk > Cc: Irek Fasikhov ; ceph-users us...@ceph.com> > Subject: Re

[ceph-users] Recomendations for building 1PB RadosGW with Erasure Code

2016-02-12 Thread Василий Ангапов
Hello, We are planning to build a 1PB Ceph cluster for RadosGW with Erasure Code. It will be used for storing online videos. We do not expect outstanding write performance; something like 200-300MB/s of sequential write will be quite enough, but data safety is very important. What are the most popular
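
For the sake of discussion, creating an erasure-coded pool looks roughly like this; the profile parameters (k=8, m=3, host failure domain), pool name, and PG count are illustrative only, not a recommendation from the thread:

    ceph osd erasure-code-profile set ec-8-3 k=8 m=3 ruleset-failure-domain=host
    ceph osd pool create ecpool 4096 4096 erasure ec-8-3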

Re: [ceph-users] cls_rbd ops on rbd_id.$name objects in EC pool

2016-02-12 Thread Sage Weil
On Thu, 11 Feb 2016, Robert LeBlanc wrote: > -BEGIN PGP SIGNED MESSAGE- > Hash: SHA256 > > Is this only a problem with EC base tiers or would replicated base > tiers see this too? In general proxying to the base tier will work just fine if it's replicated, so this is mostly an EC-only iss

Re: [ceph-users] ceph 9.2.0 SAMSUNG ssd performance issue?

2016-02-12 Thread Huan Zhang
My environment: 32 cores, Intel(R) Xeon(R) CPU E5-2640 v3 @ 2.60GHz, 10Gb NICs, 4 OSDs/host. My client is a database (MySQL) doing a direct/sync write per transaction, so it is a little bit sensitive to IO latency (sync/direct). I used SATA disks for the OSD backends and get ~100 IOPS at 4k / 1 iodepth, ~10ms IO latency, simila

Re: [ceph-users] nova instance cannot boot after remove cache tier--help

2016-02-12 Thread Квапил , Андрей
Good news: while I was writing the previous letter I found the solution to recover my VMs: ceph osd tier remove cold-storage I've been thinking about how this could cause what happened, but I still do not understand why the overlay option behaves so strangely. I know that the overlay option sets overlay
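
For completeness, the usual sequence for detaching a writeback cache tier looks like this; "cold-storage" follows the thread's naming for the base pool, while "hot-storage" is an assumed name for the cache pool:

    ceph osd tier cache-mode hot-storage forward     # stop caching new writes
    rados -p hot-storage cache-flush-evict-all       # flush and evict dirty objects
    ceph osd tier remove-overlay cold-storage        # stop redirecting client I/O
    ceph osd tier remove cold-storage hot-storage    # detach the cache tier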

Re: [ceph-users] nova instance cannot boot after remove cache tier--help

2016-02-12 Thread Квапил , Андрей
Hi, last night I had the same issue on Hammer LTS. I think that this is a Ceph bug. My history: Ceph version: 0.94.5 (9764da52395923e0b32908d83a9f7304401fee43) Distro: Debian 7 (Proxmox 3.4) Kernel: 2.6.32-39-pve We have 9x 6TB SAS drives in the main pool and 6x 128GB PCIe SSDs in the cache pool on 3

Re: [ceph-users] ceph 9.2.0 SAMSUNG ssd performance issue?

2016-02-12 Thread Nick Fisk
Write latency of 1.1ms is ok, but not brilliant. What IO size are you testing with? Don't forget if you have a journal latency of 1.1ms, excluding all other latency introduced by networking, replication and processing in the OSD code, you won't get more than about 900 iops. All the things I me
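
The ~900 iops figure follows directly from the latency: a single sync writer at iodepth=1 can only issue one write per round trip, so roughly

    1 write / 0.0011 s  =  ~909 writes/s

before any network, replication or OSD processing time is added on top.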

Re: [ceph-users] ceph 9.2.0 SAMSUNG ssd performance issue?

2016-02-12 Thread Huan Zhang
Thanks Nick. filestore -> journal_latency: ~1.1ms (214.0 / 180611 = 0.0011848669239415096), so the SSD journal write seems OK; any other ideas are highly appreciated! "filestore": { "journal_queue_max_ops": 300, "journal_queue_ops": 0, "journal_ops": 180611, "journal_queue_ma

Re: [ceph-users] ceph 9.2.0 SAMSUNG ssd performance issue?

2016-02-12 Thread Nick Fisk
> -Original Message- > From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of > Huan Zhang > Sent: 12 February 2016 10:00 > To: Irek Fasikhov > Cc: ceph-users > Subject: Re: [ceph-users] ceph 9.2.0 SAMSUNG ssd performance issue? > > "op_w_latency": > "avgcount":

Re: [ceph-users] ceph 9.2.0 SAMSUNG ssd performance issue?

2016-02-12 Thread Huan Zhang
"op_w_latency": "avgcount": 42991, "sum": 402.804741329 402.0/42991 0.009350794352306296 ~9ms latency, that means this ssd not suitable for journal device? "osd": { "op_wip": 0, "op": 58683, "op_in_bytes": 7309042294, "op_out_bytes": 5071374

Re: [ceph-users] Xeon-D 1540 Ceph Nodes

2016-02-12 Thread Nick Fisk
> -Original Message- > From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of > Wido den Hollander > Sent: 12 February 2016 09:15 > To: Schlacta, Christ ; Austin Johnson > > Cc: ceph-users@lists.ceph.com; Nick Fisk > Subject: Re: [ceph-users] Xeon-D 1540 Ceph Nodes >

Re: [ceph-users] ceph 9.2.0 SAMSUNG ssd performance issue?

2016-02-12 Thread Nick Fisk
200 iops is in line with the sync write latency you will get with either slow CPUs or 1Gb networking. What sort of hardware/networking are you running? With top of the range hardware and a replica count of 2-3, don't expect to get much above 500-750 iops for a single direct write. > -Original Mes

Re: [ceph-users] ceph 9.2.0 SAMSUNG ssd performance issue?

2016-02-12 Thread Wido den Hollander
> On 12 February 2016 at 10:14, Ferhat Ozkasgarli wrote: > > > Hello Huan, > > If you look at the comments section of Sebastien's blog ( > https://www.sebastien-han.fr/blog/2014/10/10/ceph-how-to-test-if-your-ssd-is-suitable-as-a-journal-device/) > you can see that Samsung SSDs behave very, very

Re: [ceph-users] Xeon-D 1540 Ceph Nodes

2016-02-12 Thread Wido den Hollander
> On 12 February 2016 at 6:55, Austin Johnson wrote: > > > The Supermicro 5018A-AR12L is built for object storage. In our testing, > they perform pretty well. You would have to invest in discrete 10G nics to > meet all of your requirements. > Using these in an archiving cluster in the Ne

Re: [ceph-users] ceph 9.2.0 SAMSUNG ssd performance issue?

2016-02-12 Thread Ferhat Ozkasgarli
Hello Huan, If you look at the comments section of Sebastien's blog ( https://www.sebastien-han.fr/blog/2014/10/10/ceph-how-to-test-if-your-ssd-is-suitable-as-a-journal-device/) you can see that Samsung SSDs behave very, very poorly in these tests: Samsung SSD 850 PRO 256GB 409600000 bytes (410 MB) copied
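
The numbers quoted are from the O_DSYNC-style test described on that page; a hedged sketch of that kind of test (the target file is a placeholder, and pointing it at a raw device is destructive):

    dd if=/dev/zero of=/path/to/testfile bs=4k count=100000 oflag=direct,dsync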

Re: [ceph-users] ceph 9.2.0 SAMSUNG ssd performance issue?

2016-02-12 Thread Huan Zhang
Thanks for the reply! Not very good, but it seems acceptable. What do you think the possible reasons are? Would the OSD perf counters be helpful for this? sudo fio --filename=/dev/sda2 --direct=1 --sync=1 --rw=write --bs=4k --numjobs=1 --iodepth=1 --runtime=60 --time_based --group_reporting --name=journal-test journal-te