Re: [ceph-users] ceph osd commit latency increase over time, until restart

2019-03-01 Thread Igor Fedotov
resending, not sure the prev email reached the mailing list... Hi Chen, thanks for the update. Will prepare patch to periodically reset StupidAllocator today. And just to let you know below is an e-mail from AdamK from RH which might explain the issue with the allocator. Also please note

Re: [ceph-users] ceph osd commit latency increase over time, until restart

2019-03-01 Thread Igor Fedotov
Hi Chen, thanks for the update. Will prepare patch to periodically reset StupidAllocator today. And just to let you know below is an e-mail from AdamK from RH which might explain the issue with the allocator. Also please note that StupidAllocator might not perform full defragmentation in

Re: [ceph-users] ceph osd commit latency increase over time, until restart

2019-03-01 Thread Alexandre DERUMIER
otov" , "ceph-users" , "ceph-devel" Sent: Thursday 28 February 2019 21:57:05 Subject: Re: [ceph-users] ceph osd commit latency increase over time, until restart Quoting Wido den Hollander (w...@42on.com): > Just wanted to chime in, I've seen this with Luminous+BlueStore+N

Re: [ceph-users] ceph osd commit latency increase over time, until restart

2019-02-28 Thread Igor Fedotov
Also I think it makes sense to create a ticket at this point. Any volunteers? On 3/1/2019 1:00 AM, Igor Fedotov wrote: Wondering if somebody would be able to apply a simple patch that periodically resets StupidAllocator? Just to verify/disprove the hypothesis that it's allocator related On

Re: [ceph-users] ceph osd commit latency increase over time, until restart

2019-02-28 Thread Igor Fedotov
Wondering if somebody would be able to apply a simple patch that periodically resets StupidAllocator? Just to verify/disprove the hypothesis that it's allocator related On 2/28/2019 11:57 PM, Stefan Kooman wrote: Quoting Wido den Hollander (w...@42on.com): Just wanted to chime in, I've seen

Re: [ceph-users] ceph osd commit latency increase over time, until restart

2019-02-28 Thread Stefan Kooman
Quoting Wido den Hollander (w...@42on.com): > Just wanted to chime in, I've seen this with Luminous+BlueStore+NVMe > OSDs as well. Over time their latency increased until we started to > notice I/O-wait inside VMs. On a Luminous 12.2.8 cluster with only SSDs we also hit this issue I guess.

Re: [ceph-users] ceph osd commit latency increase over time, until restart

2019-02-20 Thread Alexandre DERUMIER
abling scrubbing (currently it runs at night between 01:00-05:00) - Original message - From: "aderumier" To: "Igor Fedotov" Cc: "ceph-users" , "ceph-devel" Sent: Wednesday 20 February 2019 12:09:08 Subject: Re: [ceph-users] ceph osd commit latency

Re: [ceph-users] ceph osd commit latency increase over time, until restart

2019-02-20 Thread Alexandre DERUMIER
original - From: "aderumier" To: "Igor Fedotov" Cc: "ceph-users" , "ceph-devel" Sent: Wednesday 20 February 2019 11:39:34 Subject: Re: [ceph-users] ceph osd commit latency increase over time, until restart Hi, I have hit the bug again, but this time

Re: [ceph-users] ceph osd commit latency increase over time, until restart

2019-02-20 Thread Alexandre DERUMIER
osd restart, I'm between 0.7-1ms - Original message - From: "aderumier" To: "Igor Fedotov" Cc: "ceph-users" , "ceph-devel" Sent: Tuesday 19 February 2019 17:03:58 Subject: Re: [ceph-users] ceph osd commit latency increase over time, until restart

Re: [ceph-users] ceph osd commit latency increase over time, until restart

2019-02-19 Thread Alexandre DERUMIER
fine for a longer period we will go to 8GB per OSD > so it will max out on 80GB leaving 16GB as spare. > > As these OSDs were all restarted earlier this week I can't tell how it > will hold up over a longer period. Monitoring (Zabbix) shows the latency > is fine at the moment

Re: [ceph-users] ceph osd commit latency increase over time, until restart

2019-02-19 Thread Igor Fedotov
eplication time ? - Original message - From: "Wido den Hollander" To: "aderumier" Cc: "Igor Fedotov" , "ceph-users" , "ceph-devel" Sent: Friday 15 February 2019 14:59:30 Subject: Re: [ceph-users] ceph osd commit latency increase over time,

Re: [ceph-users] ceph osd commit latency increase over time, until restart

2019-02-16 Thread Alexandre DERUMIER
the latency is fine at the moment. Wido > > > - Original message - > From: "Wido den Hollander" > To: "Alexandre Derumier" , "Igor Fedotov" > > Cc: "ceph-users" , "ceph-devel" > > Sent: Friday 15 February 20

Re: [ceph-users] ceph osd commit latency increase over time, until restart

2019-02-15 Thread Wido den Hollander
r" > To: "Alexandre Derumier" , "Igor Fedotov" > > Cc: "ceph-users" , "ceph-devel" > > Sent: Friday 15 February 2019 14:50:34 > Subject: Re: [ceph-users] ceph osd commit latency increase over time, until > restart > > On

Re: [ceph-users] ceph osd commit latency increase over time, until restart

2019-02-15 Thread Alexandre DERUMIER
"Alexandre Derumier" , "Igor Fedotov" Cc: "ceph-users" , "ceph-devel" Sent: Friday 15 February 2019 14:50:34 Subject: Re: [ceph-users] ceph osd commit latency increase over time, until restart On 2/15/19 2:31 PM, Alexandre DERUMIER wrote: > Thanks

Re: [ceph-users] ceph osd commit latency increase over time, until restart

2019-02-15 Thread Wido den Hollander
ph-devel" > > Sent: Friday 15 February 2019 13:47:57 > Subject: Re: [ceph-users] ceph osd commit latency increase over time, until > restart > > Hi Alexander, > > I've read through your reports, nothing obvious so far. > > I can only see several times averag

Re: [ceph-users] ceph osd commit latency increase over time, until restart

2019-02-15 Thread Alexandre DERUMIER
"ceph-devel" Sent: Friday 15 February 2019 13:47:57 Subject: Re: [ceph-users] ceph osd commit latency increase over time, until restart Hi Alexander, I've read through your reports, nothing obvious so far. I can only see several times average latency increase for OSD write ops (i

Re: [ceph-users] ceph osd commit latency increase over time, until restart

2019-02-15 Thread Igor Fedotov
tore_fsck": { >> "items": 0, >> "bytes": 0 >> }, >> "bluestore_txc": { >> "items": 11, >> "bytes": 8184 >> }, >> "bluestore_writing_deferred": { >> "items": 5047,

Re: [ceph-users] ceph osd commit latency increase over time, until restart

2019-02-15 Thread Igor Fedotov
From: "Igor Fedotov" To: "Alexandre Derumier" Cc: "Sage Weil" , "ceph-users" , "ceph-devel" Sent: Monday 11 February 2019 12:03:17 Subject: Re: [ceph-users] ceph osd commit latency increase over time, until restart On 2/8/2019 6:57 PM, Alexandre

Re: [ceph-users] ceph osd commit latency increase over time, until restart

2019-02-13 Thread Alexandre DERUMIER
" , "ceph-users" , "ceph-devel" Sent: Monday 11 February 2019 12:03:17 Subject: Re: [ceph-users] ceph osd commit latency increase over time, until restart On 2/8/2019 6:57 PM, Alexandre DERUMIER wrote: > another mempool dump after 1h run. (laten

Re: [ceph-users] ceph osd commit latency increase over time, until restart

2019-02-11 Thread Igor Fedotov
: 1681, "stat_bytes": 6401248198656, "stat_bytes_used": 3777979072512, "stat_bytes_avail": 2623269126144, "copyfrom": 0, "tier_promote": 0, "tier_flush": 0, "tier_flush_fail": 0, "tier_try_flush": 0, "tier_try_flush_f

Re: [ceph-users] ceph osd commit latency increase over time, until restart

2019-02-08 Thread Alexandre DERUMIER
"osdmap_mapping": { "items": 0, "bytes": 0 }, "pgmap": { "items": 0, "bytes": 0 }, "mds_co": { "

Re: [ceph-users] ceph osd commit latency increase over time, until restart

2019-02-08 Thread Alexandre DERUMIER
count": 243, "sum": 6.869296500, "avgtime": 0.028268709 }, "started_latency": { "avgcount": 1125, "sum": 13551384.917335850, "avgtime": 12045.675482076

Re: [ceph-users] ceph osd commit latency increase over time, until restart

2019-02-08 Thread Alexandre DERUMIER
rumier" Cc: "Stefan Priebe, Profihost AG" , "Mark Nelson" , "Sage Weil" , "ceph-users" , "ceph-devel" Sent: Tuesday 5 February 2019 18:56:51 Subject: Re: [ceph-users] ceph osd commit latency increase over time, until restart On 2/4/

Re: [ceph-users] ceph osd commit latency increase over time, until restart

2019-02-05 Thread Igor Fedotov
e, Profihost AG" , "Mark Nelson" , "Sage Weil" , "ceph-users" , "ceph-devel" Sent: Monday 4 February 2019 16:04:38 Subject: Re: [ceph-users] ceph osd commit latency increase over time, until restart Thanks Igor, Could you please collect BlueStore

Re: [ceph-users] ceph osd commit latency increase over time, until restart

2019-02-04 Thread Alexandre DERUMIER
ebe, Profihost AG" , "Mark Nelson" , "Sage Weil" , "ceph-users" , "ceph-devel" Sent: Monday 4 February 2019 16:04:38 Subject: Re: [ceph-users] ceph osd commit latency increase over time, until restart Thanks Igor, >>Could you please collect

Re: [ceph-users] ceph osd commit latency increase over time, until restart

2019-02-04 Thread Alexandre DERUMIER
e, Profihost AG" , "Mark Nelson" Cc: "Sage Weil" , "ceph-users" , "ceph-devel" Sent: Monday 4 February 2019 15:51:30 Subject: Re: [ceph-users] ceph osd commit latency increase over time, until restart Hi Alexandre, looks like a bug in StupidAlloc

Re: [ceph-users] ceph osd commit latency increase over time, until restart

2019-02-04 Thread Igor Fedotov
:iterator, unsigned long) - Original message - From: "Alexandre Derumier" To: "Stefan Priebe, Profihost AG" Cc: "Sage Weil" , "ceph-users" , "ceph-devel" Sent: Monday 4 February 2019 09:38:11 Subject: Re: [ceph-users] ceph osd

Re: [ceph-users] ceph osd commit latency increase over time, until restart

2019-02-04 Thread Alexandre DERUMIER
, unsigned long) - Original message - From: "Alexandre Derumier" To: "Stefan Priebe, Profihost AG" Cc: "Sage Weil" , "ceph-users" , "ceph-devel" Sent: Monday 4 February 2019 09:38:11 Subject: Re: [ceph-users] ceph osd com

Re: [ceph-users] ceph osd commit latency increase over time, until restart

2019-02-04 Thread Alexandre DERUMIER
"ceph-users" , "ceph-devel" Sent: Wednesday 30 January 2019 19:58:15 Subject: Re: [ceph-users] ceph osd commit latency increase over time, until restart >>Thanks. Is there any reason you monitor op_w_latency but not >>op_r_latency but instead op_latency? >> >

Re: [ceph-users] ceph osd commit latency increase over time, until restart

2019-01-30 Thread Alexandre DERUMIER
on='osd' AND "id" =~ > /^([[osd]])$/ AND $timeFilter GROUP BY time($interval), "host", "id" > fill(previous) Thanks. Is there any reason you monitor op_w_latency but not op_r_latency but instead op_latency? Also why do you monitor op_w_process_latency? but not

Re: [ceph-users] ceph osd commit latency increase over time, until restart

2019-01-30 Thread Alexandre DERUMIER
"Mark Nelson" To: "ceph-users" Sent: Wednesday 30 January 2019 18:08:08 Subject: Re: [ceph-users] ceph osd commit latency increase over time, until restart On 1/30/19 7:45 AM, Alexandre DERUMIER wrote: >>> I don't see any smoking gun here... :/ > I need to test to co

Re: [ceph-users] ceph osd commit latency increase over time, until restart

2019-01-30 Thread Stefan Priebe - Profihost AG
", "id" > fill(previous) Thanks. Is there any reason you monitor op_w_latency but not op_r_latency but instead op_latency? Also why do you monitor op_w_process_latency? but not op_r_process_latency? greets, Stefan > > > > > > - Original message

Re: [ceph-users] ceph osd commit latency increase over time, until restart

2019-01-30 Thread Mark Nelson
On 1/30/19 7:45 AM, Alexandre DERUMIER wrote: I don't see any smoking gun here... :/ I need to test to compare when latency gets very high, but I need to wait more days/weeks. The main difference between a warm OSD and a cold one is that on startup the bluestore cache is empty. You

Re: [ceph-users] ceph osd commit latency increase over time, until restart

2019-01-30 Thread Alexandre DERUMIER
imeFilter GROUP BY time($interval), "host", "id" fill(previous) - Original message - From: "Stefan Priebe, Profihost AG" To: "aderumier" , "Sage Weil" Cc: "ceph-users" , "ceph-devel" Sent: Wednesday 30 January 2019 08:45:3

Re: [ceph-users] ceph osd commit latency increase over time, until restart

2019-01-30 Thread Alexandre DERUMIER
>>I don't see any smoking gun here... :/ I need to test to compare when latency gets very high, but I need to wait more days/weeks. >>The main difference between a warm OSD and a cold one is that on startup >>the bluestore cache is empty. You might try setting the bluestore cache

Re: [ceph-users] ceph osd commit latency increase over time, until restart

2019-01-30 Thread Sage Weil
On Wed, 30 Jan 2019, Alexandre DERUMIER wrote: > Hi, > > here some new results, > different osd/ different cluster > > before osd restart latency was between 2-5ms > after osd restart is around 1-1.5ms > > http://odisoweb1.odiso.net/cephperf2/bad.txt (2-5ms) >

Re: [ceph-users] ceph osd commit latency increase over time, until restart

2019-01-29 Thread Stefan Priebe - Profihost AG
Hi, On 30.01.19 at 08:33, Alexandre DERUMIER wrote: > Hi, > > here some new results, > different osd/ different cluster > > before osd restart latency was between 2-5ms > after osd restart is around 1-1.5ms > > http://odisoweb1.odiso.net/cephperf2/bad.txt (2-5ms) >

Re: [ceph-users] ceph osd commit latency increase over time, until restart

2019-01-29 Thread Alexandre DERUMIER
Hi, here some new results, different osd/ different cluster before osd restart latency was between 2-5ms after osd restart is around 1-1.5ms http://odisoweb1.odiso.net/cephperf2/bad.txt (2-5ms) http://odisoweb1.odiso.net/cephperf2/ok.txt (1-1.5ms) http://odisoweb1.odiso.net/cephperf2/diff.txt

Re: [ceph-users] ceph osd commit latency increase over time, until restart

2019-01-27 Thread Alexandre DERUMIER
([[osd]])$/ AND $timeFilter GROUP BY time($interval), "host", "id" fill(previous) dashboard is here: https://grafana.com/dashboards/7995 - Original message - From: "Marc Roos" To: "aderumier" Cc: "ceph-users" Sent: Sunday 27 January 2019

Re: [ceph-users] ceph osd commit latency increase over time, until restart

2019-01-27 Thread Marc Roos
RecoverystatePerf.waitremoterecoveryreservedLatency -Original Message- From: Alexandre DERUMIER [mailto:aderum...@odiso.com] Sent: Friday 25 January 2019 17:40 To: Sage Weil Cc: ceph-users; ceph-devel Subject: Re: [ceph-users] ceph osd commit latency increase over time, until restart also, here the result of "

Re: [ceph-users] ceph osd commit latency increase over time, until restart

2019-01-25 Thread Alexandre DERUMIER
January 2019 17:32:02 Subject: Re: [ceph-users] ceph osd commit latency increase over time, until restart Hi again, I was able to perf it today, before restart, commit latency was between 3-5ms after restart at 17:11, latency is around 1ms http://odisoweb1.odiso.net/osd3_latency_3ms_vs_1ms

Re: [ceph-users] ceph osd commit latency increase over time, until restart

2019-01-25 Thread Alexandre DERUMIER
Hi again, I was able to perf it today, before restart, commit latency was between 3-5ms after restart at 17:11, latency is around 1ms http://odisoweb1.odiso.net/osd3_latency_3ms_vs_1ms.png here some perf reports: with 3ms latency: - perf report by caller:

Re: [ceph-users] ceph osd commit latency increase over time, until restart

2019-01-25 Thread Alexandre DERUMIER
>>Can you capture a perf top or perf record to see where the CPU time is >>going on one of the OSDs with a high latency? Yes, sure. I'll do it next week and send the result to the mailing list. Thanks Sage ! - Original message - From: "Sage Weil" To: "aderumier" Cc: "ceph-users" , "ceph-devel"

Re: [ceph-users] ceph osd commit latency increase over time, until restart

2019-01-25 Thread Sage Weil
Can you capture a perf top or perf record to see where the CPU time is going on one of the OSDs with a high latency? Thanks! sage On Fri, 25 Jan 2019, Alexandre DERUMIER wrote: > > Hi, > > I have a strange behaviour of my osd, on multiple clusters, > > All clusters are running mimic

[ceph-users] ceph osd commit latency increase over time, until restart

2019-01-25 Thread Alexandre DERUMIER
Hi, I see a strange behaviour of my OSDs, on multiple clusters. All clusters are running mimic 13.2.1, bluestore, with ssd or nvme drives; the workload is rbd only, with qemu-kvm vms running with librbd + snapshot/rbd export-diff/snapshotdelete each day for backup. When the osd are freshly