Re: [ceph-users] ceph osd commit latency increase over time, until restart

2019-01-25 Thread Alexandre DERUMIER
he mailing list. Thanks Sage! - Original message - From: "Sage Weil" To: "aderumier" Cc: "ceph-users", "ceph-devel" Sent: Friday 25 January 2019 10:49:02 Subject: Re: ceph osd commit latency increase over time, until restart Can you captur

Re: [ceph-users] ceph osd commit latency increase over time, until restart

2019-01-27 Thread Alexandre DERUMIER
tLatency Rocksdb.submitLatency Rocksdb.submitSyncLatency RecoverystatePerf.repnotrecoveringLatency RecoverystatePerf.waitupthruLatency Osd.opRwPrepareLatency RecoverystatePerf.primaryLatency RecoverystatePerf.replicaactiveLatency RecoverystatePerf.startedLatency RecoverystatePerf.getlogLa

Re: [ceph-users] ceph osd commit latency increase over time, until restart

2019-01-29 Thread Alexandre DERUMIER
:49:02 Subject: Re: ceph osd commit latency increase over time, until restart Can you capture a perf top or perf record to see where the CPU time is going on one of the OSDs with a high latency? Thanks! sage On Fri, 25 Jan 2019, Alexandre DERUMIER wrote: > > Hi, > > I have

Re: [ceph-users] ceph osd commit latency increase over time, until restart

2019-01-30 Thread Alexandre DERUMIER
"Sage Weil" > To: "aderumier" > Cc: "ceph-users", "ceph-devel" > > Sent: Friday 25 January 2019 10:49:02 > Subject: Re: ceph osd commit latency increase over time, until restart > > Can you capture a perf top or perf record to see

Re: [ceph-users] ceph osd commit latency increase over time, until restart

2019-01-30 Thread Alexandre DERUMIER
AND "id" =~ /^([[osd]])$/ AND $timeFilter GROUP BY time($interval), "host", "id" fill(previous) - Original message - From: "Stefan Priebe, Profihost AG" To: "aderumier", "Sage Weil" Cc: "ceph-users", "ceph-devel"

Re: [ceph-users] ceph osd commit latency increase over time, until restart

2019-01-30 Thread Alexandre DERUMIER
original - From: "Mark Nelson" To: "ceph-users" Sent: Wednesday 30 January 2019 18:08:08 Subject: Re: [ceph-users] ceph osd commit latency increase over time, until restart On 1/30/19 7:45 AM, Alexandre DERUMIER wrote: >>> I don't see any smoking gun here... :/

Re: [ceph-users] ceph osd commit latency increase over time, until restart

2019-01-30 Thread Alexandre DERUMIER
rcredi 30 January 2019 19:50:20 Subject: Re: [ceph-users] ceph osd commit latency increase over time, until restart Hi, On 30.01.19 at 14:59, Alexandre DERUMIER wrote: > Hi Stefan, > >>> currently I'm in the process of switching back from jemalloc to tcmalloc >>> li

Re: [ceph-users] ceph osd commit latency increase over time, until restart

2019-02-04 Thread Alexandre DERUMIER
e) - Original message - From: "Stefan Priebe, Profihost AG" To: "aderumier" Cc: "Sage Weil", "ceph-users", "ceph-devel" Sent: Wednesday 30 January 2019 19:50:20 Subject: Re: [ceph-users] ceph osd commit latency increase over tim

Re: [ceph-users] ceph osd commit latency increase over time, until restart

2019-02-04 Thread Alexandre DERUMIER
mpool::pool_allocator<(mempool::pool_index_t)1, std::pair >, 256> >, std::pair&, std::pair*>::increment_slow() | | | | | | + 26.00% StupidAllocator::_aligned_len(interval_set, mempool::pool_allocator<(mempool::pool_index_t)1, std::pair >, 256> >

Re: [ceph-users] ceph osd commit latency increase over time, until restart

2019-02-04 Thread Alexandre DERUMIER
be to backport bitmap allocator from Nautilus >>and try the difference... Any plan to backport it to mimic? (But I can wait for Nautilus.) The perf results of the new bitmap allocator seem very promising from what I've seen in the PR. - Original message - From: "Igor Fedotov" To: "

Re: [ceph-users] ceph osd commit latency increase over time, until restart

2019-02-04 Thread Alexandre DERUMIER
s / max possible number of free extents) * 1000"); Here is a graph of the last month, with bluestore_fragmentation_micros and latency: http://odisoweb1.odiso.net/latency_vs_fragmentation_micros.png - Original message - From: "Alexandre Derumier" To: "Igor Fedotov" Cc: "Stefa

Re: [ceph-users] ceph osd commit latency increase over time, until restart

2019-02-08 Thread Alexandre DERUMIER
" To: "aderumier" Cc: "Stefan Priebe, Profihost AG", "Mark Nelson", "Sage Weil", "ceph-users", "ceph-devel" Sent: Tuesday 5 February 2019 18:56:51 Subject: Re: [ceph-users] ceph osd commit latency increase over time, until r

Re: [ceph-users] ceph osd commit latency increase over time, until restart

2019-02-08 Thread Alexandre DERUMIER
"avgcount": 243, "sum": 6.869296500, "avgtime": 0.028268709 }, "started_latency": { "avgcount": 1125, "sum": 13551384.917335850, "avgtime": 12045.675482076
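
The perf dump counters in this snippet come as `avgcount` / `sum` / `avgtime` triples. As a minimal sketch, assuming `avgtime` is simply `sum / avgcount` (the quoted numbers are consistent with that), the figures can be cross-checked as follows. The function name `avg_latency` is illustrative only and not part of Ceph.

```python
def avg_latency(total_seconds: float, count: int) -> float:
    """Return the mean latency in seconds, or 0.0 when there are no samples."""
    return total_seconds / count if count else 0.0


# Values copied from the "started_latency" counter quoted in the snippet.
started = {"avgcount": 1125, "sum": 13551384.917335850}

# Compare the recomputed mean against the "avgtime" field from the dump.
print(f"{avg_latency(started['sum'], started['avgcount']):.9f}")
```

The zero-count guard is there because idle counters in a dump report `avgcount` 0, where a plain division would fail.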

Re: [ceph-users] ceph osd commit latency increase over time, until restart

2019-02-08 Thread Alexandre DERUMIER
"sum": 13019.714424060, "avgtime": 0.000658067 }, "subop_pull": 0, "subop_pull_latency": { "avgcount": 0, "sum": 0.0, "avgtime": 0.0 }, "subop_push": 0, "subop_push_in_bytes": 0, "subop_push

Re: [ceph-users] ceph osd commit latency increase over time, until restart

2019-02-13 Thread Alexandre DERUMIER
2018.13:30.dump_mempools.txt Then it decreases over time (around 3.7G this morning), but RSS is still at 8G. I have also been graphing the mempool counters since yesterday, so I'll be able to track them over time. - Original message - From: "Igor Fedotov" To: "Alexandre Derumier"

Re: [ceph-users] ceph osd commit latency increase over time, until restart

2019-02-15 Thread Alexandre DERUMIER
o proceed with perf dumps collection to see if the picture stays the same. W.r.t. memory usage you observed I see nothing suspicious so far - No decrease in RSS report is a known artifact that seems to be safe. Thanks, Igor On 2/13/2019 11:42 AM, Alexandre DERUMIER wrote: > Hi Igor, >

Re: [ceph-users] ceph osd commit latency increase over time, until restart

2019-02-15 Thread Alexandre DERUMIER
t. We also increased memory target from 4G to 6G on >>these OSDs as the memory would allow it. I set memory to 6GB this morning, with 2 OSDs of 3TB per 6TB NVMe. (My last test was 8GB with 1 OSD of 6TB, but that didn't help.) - Original message - From: "Wido den Hollan

Re: [ceph-users] ceph osd commit latency increase over time, until restart

2019-02-16 Thread Alexandre DERUMIER
endredi 15 February 2019 14:59:30 Subject: Re: [ceph-users] ceph osd commit latency increase over time, until restart On 2/15/19 2:54 PM, Alexandre DERUMIER wrote: >>> Just wanted to chime in, I've seen this with Luminous+BlueStore+NVMe >>> OSDs as well. Over time their laten

Re: [ceph-users] ceph osd commit latency increase over time, until restart

2019-02-19 Thread Alexandre DERUMIER
), is lower than before (48GB (8osd x 6GB) vs 56GB (4osd x 12GB). I'll send more stats tomorrow. Alexandre - Original message - From: "Igor Fedotov" To: "Alexandre Derumier", "Wido den Hollander" Cc: "ceph-users", "ceph-devel" En

Re: [ceph-users] Intel P4600 3.2TB U.2 form factor NVMe firmware problems causing dead disks

2019-02-19 Thread Alexandre DERUMIER
I'm running some s4610 (SSDPE2KE064T8) with firmware VDV10140. I haven't had any problems with them in 6 months. But I remember that around September 2017, Supermicro warned me about a firmware bug on the s4600 (I don't know which firmware version). - Original message - From: "David Turner"

Re: [ceph-users] Replicating CephFS between clusters

2019-02-19 Thread Alexandre DERUMIER
Hi, I think that cephfs snap mirroring is coming for Nautilus: https://www.openstack.org/assets/presentation-media/2018.11.15-openstack-ceph-data-services.pdf (slide 26). But I don't know whether it's already in master. - Original message - From: "Vitaliy Filippov" To: "Marc Roos", "Bala

Re: [ceph-users] ceph osd commit latency increase over time, until restart

2019-02-20 Thread Alexandre DERUMIER
ory used (#free), is lower than before (48GB (8osd x 6GB) vs 56GB (4osd x 12GB). I'll send more stats tomorrow. Alexandre - Original message - From: "Igor Fedotov" To: "Alexandre Derumier", "Wido den Hollander" Cc: "ceph-users", "ceph-dev

Re: [ceph-users] ceph osd commit latency increase over time, until restart

2019-02-20 Thread Alexandre DERUMIER
ore (48GB (8osd x 6GB) vs 56GB (4osd x 12GB). I'll send more stats tomorrow. Alexandre - Original message - From: "Igor Fedotov" To: "Alexandre Derumier", "Wido den Hollander" Cc: "ceph-users", "ceph-devel" Sent: Tuesday 19 Févri

Re: [ceph-users] ceph osd commit latency increase over time, until restart

2019-02-20 Thread Alexandre DERUMIER
that total memory used (#free), is lower than before (48GB (8osd x 6GB) vs 56GB (4osd x 12GB). I'll send more stats tomorrow. Alexandre - Original message - From: "Igor Fedotov" To: "Alexandre Derumier", "Wido den Hollander" Cc: "ceph-users"

Re: [ceph-users] Ceph and TCP States

2019-02-25 Thread Alexandre DERUMIER
Hi, sorry to bump this old thread, but I had this problem recently, with a linux firewall between cephfs client and cluster the problem was easy to reproduce with #firewall is enable with iptables -A FORWARD -m conntrack --ctstate INVALID -j DROP iptables -A FORWARD -m conntrack --ctstate RE

Re: [ceph-users] ceph osd commit latency increase over time, until restart

2019-03-01 Thread Alexandre DERUMIER
Hi, some news: it seems that it has finally been stable for me for 1 week (around 0.7ms of commit latency on average). http://odisoweb1.odiso.net/osdstable.png The biggest change was on 18/02, when I finished rebuilding all my OSDs, with 2 OSDs of 3TB per 6TB NVMe. (previously I only have done it

Re: [ceph-users] rbd cache limiting IOPS

2019-03-08 Thread Alexandre DERUMIER
>>Which options do we have to increase IOPS while writeback cache is used? If I remember correctly, there is some kind of global lock/mutex in the rbd cache, and I think there is work in progress to improve it. (I think I saw a PR about this on the performance meeting pad some months ago.) - Original message

Re: [ceph-users] rbd cache limiting IOPS

2019-03-08 Thread Alexandre DERUMIER
>>(I think I see a PR about this on performance meeting pad some months ago) https://github.com/ceph/ceph/pull/25713 - Original message - From: "aderumier" To: "Engelmann Florian" Cc: "ceph-users" Sent: Friday 8 March 2019 15:03:23 Subject: Re: [ceph-users] rbd cache limiting IOPS >>Whic

Re: [ceph-users] Intel D3-S4610 performance

2019-03-14 Thread Alexandre DERUMIER
Hi, I'm running DC P4610 6TB (NVMe) with no performance problems. I'm not sure what the difference is with the D3-S4610. - Original message - From: "Kai Wembacher" To: "ceph-users" Sent: Tuesday 12 March 2019 09:13:44 Subject: [ceph-users] Intel D3-S4610 performance Hi everyone, I have an Intel D3-

Re: [ceph-users] VM management setup

2019-04-24 Thread Alexandre DERUMIER
+1 for Proxmox. (I'm a contributor and I can say that its Ceph support is very good.) - Original message - From: jes...@krogh.cc To: "ceph-users" Sent: Friday 5 April 2019 21:34:02 Subject: [ceph-users] VM management setup Hi. Knowing this is a bit off-topic but seeking recommendations and advise

Re: [ceph-users] How do you deal with "clock skew detected"?

2019-05-15 Thread Alexandre DERUMIER
Since I switched to chrony instead of ntpd/openntpd, I no longer get clock skew (chrony resyncs much faster). - Original message - From: "Jan Kasprzak" To: "ceph-users" Sent: Wednesday 15 May 2019 13:47:57 Subject: [ceph-users] How do you deal with "clock skew detected"? Hello, Ceph use

Re: [ceph-users] Poor performance for 512b aligned "partial" writes from Windows guests in OpenStack + potential fix

2019-05-15 Thread Alexandre DERUMIER
Many thanks for the analysis! I'm going to test with 4K on a heavy MSSQL database to see whether IOPS/latency improve. I'll report results in this thread. - Original message - From: "Trent Lloyd" To: "ceph-users" Sent: Friday 10 May 2019 09:59:39 Subject: [ceph-users] Poor per

Re: [ceph-users] Changing the release cadence

2019-06-05 Thread Alexandre DERUMIER
Hi, >>- November: If we release Octopus 9 months from the Nautilus release >>(planned for Feb, released in Mar) then we'd target this November. We >>could shift to a 12 months cadence after that. For the last 2 Debian releases, the freeze was around January-February; November seems to be a go

Re: [ceph-users] BlueStore bitmap allocator under Luminous and Mimic

2019-07-10 Thread Alexandre DERUMIER
> Can't say anything about latency. >>Anybody else? Wido? I have been running it on Mimic for 1 month with no problems so far, and it definitely fixes the latency increasing over time (i.e. the need to restart OSDs each week). Memory usage is almost the same as before. - Original message - From: "Kon

Re: [ceph-users] CephFS snapshot for backup & disaster recovery

2019-08-08 Thread Alexandre DERUMIER
Hi, >>I'm running a single-host Ceph cluster for CephFS and I'd like to keep >>backups in Amazon S3 for disaster recovery. Is there a simple way to extract >>a CephFS snapshot as a single file and/or to create a file that represents >>the incremental difference between two snapshots? I think i

Re: [ceph-users] Optimizing terrible RBD performance

2019-10-04 Thread Alexandre DERUMIER
Hi, >>dd if=/dev/zero of=/dev/rbd0 writes at 5MB/s - you are testing sequentially with a single thread/iodepth=1 here. So only 1 disk is used at a time, and you have network latency too. rados bench does 16 concurrent writes. Try testing with fio, for example, with bigger iodepth, small block/big b
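
To illustrate the iodepth=1 point: with one request in flight at a time, throughput is bounded by block size divided by per-request latency. The sketch below is a rough model only; the 4096-byte block size and 1 ms round-trip latency are assumed values picked for illustration, not measurements from this thread, and `max_throughput_mb_s` is a hypothetical helper name.

```python
def max_throughput_mb_s(block_bytes: int, latency_s: float, iodepth: int = 1) -> float:
    """Idealized upper bound in MB/s when `iodepth` requests stay in flight.

    Assumes per-request latency does not grow as the queue depth increases,
    which real clusters only approximate.
    """
    return iodepth * block_bytes / latency_s / 1e6


# Assumed example values: 4 KiB writes, 1 ms per request.
for depth in (1, 16):
    print(depth, max_throughput_mb_s(4096, 0.001, depth))
```

Under these assumptions the bound scales linearly with queue depth, which is why a single-threaded dd and a 16-way rados bench are not directly comparable.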
