Re: [ceph-users] Optimizing terrible RBD performance

2019-10-04 Thread Alexandre DERUMIER
Hi, >>dd if=/dev/zero of=/dev/rbd0 writes at 5MB/s - you are testing with a single thread/iodepth=1, sequentially, here. So only 1 disk at a time, and you have network latency too. rados bench does 16 concurrent writes. Try testing with fio, for example, with a bigger iodepth, small block/big
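
A deeper-queue fio run against an RBD image might look like the sketch below (fio's rbd engine; the pool/image names and queue depth are placeholders, not values from this thread):

  fio --ioengine=rbd --pool=rbd --rbdname=testimage --clientname=admin \
      --rw=randwrite --bs=4k --iodepth=32 --direct=1 \
      --runtime=60 --time_based --name=rbd-4k-qd32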

Re: [ceph-users] CephFS snapshot for backup & disaster recovery

2019-08-08 Thread Alexandre DERUMIER
Hi, >>I'm running a single-host Ceph cluster for CephFS and I'd like to keep >>backups in Amazon S3 for disaster recovery. Is there a simple way to extract >>a CephFS snapshot as a single file and/or to create a file that represents >>the incremental difference between two snapshots? I think

Re: [ceph-users] BlueStore bitmap allocator under Luminous and Mimic

2019-07-10 Thread Alexandre DERUMIER
> Can't say anything about latency. >>Anybody else? Wido? I've been running it on Mimic for 1 month, no problems so far, and it definitely fixes the latency increasing over time (i.e. needing to restart each OSD every week). Memory usage is almost the same as before. - Mail original - De:

Re: [ceph-users] Changing the release cadence

2019-06-05 Thread Alexandre DERUMIER
Hi, >>- November: If we release Octopus 9 months from the Nautilus release >>(planned for Feb, released in Mar) then we'd target this November. We >>could shift to a 12 month cadence after that. For the last 2 Debian releases, the freeze was around January-February; November seems to be a

Re: [ceph-users] Poor performance for 512b aligned "partial" writes from Windows guests in OpenStack + potential fix

2019-05-16 Thread Alexandre DERUMIER
Many thanks for the analysis! I'm going to test with 4K on a heavy MSSQL database to see whether I get an improvement in IOPS/latency. I'll report the results in this thread. - Mail original - De: "Trent Lloyd" À: "ceph-users" Envoyé: Vendredi 10 Mai 2019 09:59:39 Objet: [ceph-users] Poor

Re: [ceph-users] How do you deal with "clock skew detected"?

2019-05-15 Thread Alexandre DERUMIER
Since I switched to chrony instead of ntpd/openntpd, I don't have clock skew anymore (chrony is really faster to resync). - Mail original - De: "Jan Kasprzak" À: "ceph-users" Envoyé: Mercredi 15 Mai 2019 13:47:57 Objet: [ceph-users] How do you deal with "clock skew detected"? Hello, Ceph

Re: [ceph-users] VM management setup

2019-04-24 Thread Alexandre DERUMIER
+1 for Proxmox. (I'm a contributor and I can say that the Ceph support is very good.) - Mail original - De: jes...@krogh.cc À: "ceph-users" Envoyé: Vendredi 5 Avril 2019 21:34:02 Objet: [ceph-users] VM management setup Hi. Knowing this is a bit off-topic but seeking recommendations and advice

Re: [ceph-users] Intel D3-S4610 performance

2019-03-14 Thread Alexandre DERUMIER
Hi, I'm running DC P4610 6TB (NVMe) drives, with no performance problems. Not sure what the difference is with the D3-S4610. - Mail original - De: "Kai Wembacher" À: "ceph-users" Envoyé: Mardi 12 Mars 2019 09:13:44 Objet: [ceph-users] Intel D3-S4610 performance Hi everyone, I have an Intel

Re: [ceph-users] rbd cache limiting IOPS

2019-03-08 Thread Alexandre DERUMIER
>>(I think I saw a PR about this on the performance meeting pad some months ago) https://github.com/ceph/ceph/pull/25713 - Mail original - De: "aderumier" À: "Engelmann Florian" Cc: "ceph-users" Envoyé: Vendredi 8 Mars 2019 15:03:23 Objet: Re: [ceph-users] rbd cache limiting IOPS

Re: [ceph-users] rbd cache limiting IOPS

2019-03-08 Thread Alexandre DERUMIER
>>Which options do we have to increase IOPS while writeback cache is used? If I remember correctly, there is some kind of global lock/mutex with the rbd cache, and I think there is some work in progress to improve it. (I think I saw a PR about this on the performance meeting pad some months ago) - Mail

Re: [ceph-users] ceph osd commit latency increase over time, until restart

2019-03-01 Thread Alexandre DERUMIER
Hi, some news: it seems it has finally been stable for me for 1 week (around 0.7ms average commit latency). http://odisoweb1.odiso.net/osdstable.png The biggest change was on 18/02, when I finished rebuilding all my OSDs, with 2 OSDs of 3TB per 6TB NVMe. (previously I had only done

Re: [ceph-users] Ceph and TCP States

2019-02-25 Thread Alexandre DERUMIER
Hi, sorry to bump this old thread, but I had this problem recently, with a Linux firewall between the CephFS client and the cluster. The problem was easy to reproduce with: #firewall is enabled with iptables -A FORWARD -m conntrack --ctstate INVALID -j DROP iptables -A FORWARD -m conntrack --ctstate
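
The ruleset in question is truncated above; a sketch of the general shape of such a stateful FORWARD chain is below (the second and third rules are an assumption for illustration, not the poster's exact rules):

  iptables -A FORWARD -m conntrack --ctstate INVALID -j DROP
  iptables -A FORWARD -m conntrack --ctstate ESTABLISHED,RELATED -j ACCEPT
  iptables -A FORWARD -m conntrack --ctstate NEW -j ACCEPT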

Re: [ceph-users] ceph osd commit latency increase over time, until restart

2019-02-20 Thread Alexandre DERUMIER
, is lower than before (48GB (8osd x 6GB) vs 56GB (4osd x 12GB). I'll send more stats tomorrow. Alexandre - Mail original - De: "Igor Fedotov" À: "Alexandre Derumier" , "Wido den Hollander" Cc: "ceph-users" , "ceph-devel" Envoy

Re: [ceph-users] ceph osd commit latency increase over time, until restart

2019-02-20 Thread Alexandre DERUMIER
GB) vs 56GB (4osd x 12GB). I'll send more stats tomorrow. Alexandre - Mail original - De: "Igor Fedotov" À: "Alexandre Derumier" , "Wido den Hollander" Cc: "ceph-users" , "ceph-devel" Envoyé: Mardi 19 Février 2019 11:12:43 Ob

Re: [ceph-users] ceph osd commit latency increase over time, until restart

2019-02-20 Thread Alexandre DERUMIER
), is lower than before (48GB (8osd x 6GB) vs 56GB (4osd x 12GB). I'll send more stats tomorrow. Alexandre - Mail original - De: "Igor Fedotov" À: "Alexandre Derumier" , "Wido den Hollander" Cc: "ceph-users" , "ceph-devel" Envoyé:

Re: [ceph-users] Replicating CephFS between clusters

2019-02-19 Thread Alexandre DERUMIER
Hi, I think that CephFS snap mirroring is coming for Nautilus https://www.openstack.org/assets/presentation-media/2018.11.15-openstack-ceph-data-services.pdf (slide 26) But I don't know if it's already ready in master? - Mail original - De: "Vitaliy Filippov" À: "Marc Roos" ,

Re: [ceph-users] Intel P4600 3.2TB U.2 form factor NVMe firmware problems causing dead disks

2019-02-19 Thread Alexandre DERUMIER
I'm running some S4610 (SSDPE2KE064T8), with firmware VDV10140. I haven't had any problem with them for 6 months. But I remember that around September 2017, Supermicro warned me about a firmware bug on the S4600 (don't know which firmware version). - Mail original - De: "David Turner"

Re: [ceph-users] ceph osd commit latency increase over time, until restart

2019-02-19 Thread Alexandre DERUMIER
lower than before (48GB (8osd x 6GB) vs 56GB (4osd x 12GB). I'll send more stats tomorrow. Alexandre - Mail original - De: "Igor Fedotov" À: "Alexandre Derumier" , "Wido den Hollander" Cc: "ceph-users" , "ceph-devel" Envoyé: Mard

Re: [ceph-users] ceph osd commit latency increase over time, until restart

2019-02-16 Thread Alexandre DERUMIER
di 15 Février 2019 14:59:30 Objet: Re: [ceph-users] ceph osd commit latency increase over time, until restart On 2/15/19 2:54 PM, Alexandre DERUMIER wrote: >>> Just wanted to chime in, I've seen this with Luminous+BlueStore+NVMe >>> OSDs as well. Over time their latency increas

Re: [ceph-users] ceph osd commit latency increase over time, until restart

2019-02-15 Thread Alexandre DERUMIER
increased memory target from 4G to 6G on >>these OSDs as the memory would allow it. I have set memory to 6GB this morning, with 2 osds of 3TB for 6TB nvme. (my last test was 8gb with 1osd of 6TB, but that didn't help) - Mail original - De: "Wido den Hollander" À: &

Re: [ceph-users] ceph osd commit latency increase over time, until restart

2019-02-15 Thread Alexandre DERUMIER
collection to see if the picture stays the same. W.r.t. memory usage you observed I see nothing suspicious so far - No decrease in RSS report is a known artifact that seems to be safe. Thanks, Igor On 2/13/2019 11:42 AM, Alexandre DERUMIER wrote: > Hi Igor, > > Thanks again for h

Re: [ceph-users] ceph osd commit latency increase over time, until restart

2019-02-13 Thread Alexandre DERUMIER
.dump_mempools.txt Then it decreases over time (around 3.7G this morning), but RSS is still at 8G. I'm graphing the mempool counters too since yesterday, so I'll be able to track them over time. - Mail original - De: "Igor Fedotov" À: "Alexandre Derumier" Cc: "Sage Weil

Re: [ceph-users] ceph osd commit latency increase over time, until restart

2019-02-08 Thread Alexandre DERUMIER
": 13019.714424060, "avgtime": 0.000658067 }, "subop_pull": 0, "subop_pull_latency": { "avgcount": 0, "sum": 0.0, "avgtime": 0.0 }, "subop_push": 0, "subop_push_in_bytes": 0, "subop_push_late

Re: [ceph-users] ceph osd commit latency increase over time, until restart

2019-02-08 Thread Alexandre DERUMIER
count": 243, "sum": 6.869296500, "avgtime": 0.028268709 }, "started_latency": { "avgcount": 1125, "sum": 13551384.917335850, "avgtime": 12045.675482076

Re: [ceph-users] ceph osd commit latency increase over time, until restart

2019-02-08 Thread Alexandre DERUMIER
rumier" Cc: "Stefan Priebe, Profihost AG" , "Mark Nelson" , "Sage Weil" , "ceph-users" , "ceph-devel" Envoyé: Mardi 5 Février 2019 18:56:51 Objet: Re: [ceph-users] ceph osd commit latency increase over time, until restart On 2/4/

Re: [ceph-users] ceph osd commit latency increase over time, until restart

2019-02-04 Thread Alexandre DERUMIER
ax possible number of free extents) * 1000"); Here is a graph of the last month, with bluestore_fragmentation_micros and latency: http://odisoweb1.odiso.net/latency_vs_fragmentation_micros.png - Mail original - De: "Alexandre Derumier" À: "Igor Fedotov" Cc: "Stefan Pri

Re: [ceph-users] ceph osd commit latency increase over time, until restart

2019-02-04 Thread Alexandre DERUMIER
utilus >>and try the difference... Any plan to backport it to Mimic? (But I can wait for Nautilus.) The perf results of the new bitmap allocator seem very promising from what I've seen in the PR. - Mail original - De: "Igor Fedotov" À: "Alexandre Derumier" , "Stefan Prieb

Re: [ceph-users] ceph osd commit latency increase over time, until restart

2019-02-04 Thread Alexandre DERUMIER
l_allocator<(mempool::pool_index_t)1, std::pair >, 256> >, std::pair&, std::pair*>::increment_slow() | | | | | | + 26.00% StupidAllocator::_aligned_len(interval_set, mempool::pool_allocator<(mempool::pool_index_t)1, std::pair >, 256> >::iterator

Re: [ceph-users] ceph osd commit latency increase over time, until restart

2019-02-04 Thread Alexandre DERUMIER
-- De: "Stefan Priebe, Profihost AG" À: "aderumier" Cc: "Sage Weil" , "ceph-users" , "ceph-devel" Envoyé: Mercredi 30 Janvier 2019 19:50:20 Objet: Re: [ceph-users] ceph osd commit latency increase over time, until restart Hi, A

Re: [ceph-users] ceph osd commit latency increase over time, until restart

2019-01-30 Thread Alexandre DERUMIER
i 30 Janvier 2019 19:50:20 Objet: Re: [ceph-users] ceph osd commit latency increase over time, until restart Hi, Am 30.01.19 um 14:59 schrieb Alexandre DERUMIER: > Hi Stefan, > >>> currently i'm in the process of switching back from jemalloc to tcmalloc >>> like sugge

Re: [ceph-users] ceph osd commit latency increase over time, until restart

2019-01-30 Thread Alexandre DERUMIER
"Mark Nelson" À: "ceph-users" Envoyé: Mercredi 30 Janvier 2019 18:08:08 Objet: Re: [ceph-users] ceph osd commit latency increase over time, until restart On 1/30/19 7:45 AM, Alexandre DERUMIER wrote: >>> I don't see any smoking gun here... :/ > I need to test to co

Re: [ceph-users] ceph osd commit latency increase over time, until restart

2019-01-30 Thread Alexandre DERUMIER
imeFilter GROUP BY time($interval), "host", "id" fill(previous) - Mail original - De: "Stefan Priebe, Profihost AG" À: "aderumier" , "Sage Weil" Cc: "ceph-users" , "ceph-devel" Envoyé: Mercredi 30 Janvier 2019 08:45:3

Re: [ceph-users] ceph osd commit latency increase over time, until restart

2019-01-30 Thread Alexandre DERUMIER
" > Cc: "ceph-users" , "ceph-devel" > > Envoyé: Vendredi 25 Janvier 2019 10:49:02 > Objet: Re: ceph osd commit latency increase over time, until restart > > Can you capture a perf top or perf record to see where teh CPU time is > going on one

Re: [ceph-users] ceph osd commit latency increase over time, until restart

2019-01-29 Thread Alexandre DERUMIER
et: Re: ceph osd commit latency increase over time, until restart Can you capture a perf top or perf record to see where the CPU time is going on one of the OSDs with a high latency? Thanks! sage On Fri, 25 Jan 2019, Alexandre DERUMIER wrote: > > Hi, > > I have a strange

Re: [ceph-users] ceph osd commit latency increase over time, until restart

2019-01-27 Thread Alexandre DERUMIER
y Rocksdb.submitSyncLatency RecoverystatePerf.repnotrecoveringLatency RecoverystatePerf.waitupthruLatency Osd.opRwPrepareLatency RecoverystatePerf.primaryLatency RecoverystatePerf.replicaactiveLatency RecoverystatePerf.startedLatency RecoverystatePerf.getlogLatency RecoverystatePerf.initi

Re: [ceph-users] ceph osd commit latency increase over time, until restart

2019-01-25 Thread Alexandre DERUMIER
list. Thanks Sage ! - Mail original - De: "Sage Weil" À: "aderumier" Cc: "ceph-users" , "ceph-devel" Envoyé: Vendredi 25 Janvier 2019 10:49:02 Objet: Re: ceph osd commit latency increase over time, until restart Can you capture a perf t

Re: [ceph-users] ceph osd commit latency increase over time, until restart

2019-01-25 Thread Alexandre DERUMIER
users" , "ceph-devel" Envoyé: Vendredi 25 Janvier 2019 10:49:02 Objet: Re: ceph osd commit latency increase over time, until restart Can you capture a perf top or perf record to see where teh CPU time is going on one of the OSDs wth a high latency? Thanks! sage On Fri,

Re: [ceph-users] ceph osd commit latency increase over time, until restart

2019-01-25 Thread Alexandre DERUMIER
Cc: "ceph-users" , "ceph-devel" Envoyé: Vendredi 25 Janvier 2019 10:49:02 Objet: Re: ceph osd commit latency increase over time, until restart Can you capture a perf top or perf record to see where teh CPU time is going on one of the OSDs wth a high latency? Thanks! s

[ceph-users] ceph osd commit latency increase over time, until restart

2019-01-25 Thread Alexandre DERUMIER
Hi, I have a strange behaviour with my OSDs, on multiple clusters. All clusters are running Mimic 13.2.1, BlueStore, with SSD or NVMe drives; the workload is RBD only, with qemu-kvm VMs running librbd + snapshot / rbd export-diff / snapshot delete each day for backup. When the OSDs are freshly

Re: [ceph-users] MDS performance issue

2019-01-21 Thread Alexandre DERUMIER
>>How can you see that the cache is filling up and you need to execute >>"echo 2 > /proc/sys/vm/drop_caches"? You can monitor the number of ceph dentries in slabinfo. Here is a small script I'm running from cron: #!/bin/bash if pidof -o %PPID -x "dropcephinodecache.sh">/dev/null; then echo
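
The script itself is truncated above; a minimal sketch of that kind of watchdog is shown below (the slab name matches the kernel client's ceph_dentry_info cache, but the threshold and file name are illustrative assumptions, not the exact script from the thread):

  #!/bin/bash
  # drop caches when the kernel client holds too many ceph dentries
  THRESHOLD=1000000
  dentries=$(awk '/^ceph_dentry/ {print $2}' /proc/slabinfo)   # active objects
  if [ "${dentries:-0}" -gt "$THRESHOLD" ]; then
      echo 2 > /proc/sys/vm/drop_caches
  fi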

Re: [ceph-users] CephFS - Small file - single thread - read performance.

2019-01-18 Thread Alexandre DERUMIER
Hi, I don't have such big latencies: # time cat 50bytesfile > /dev/null real 0m0,002s user 0m0,001s sys 0m0,000s (It's on a Ceph SSD cluster (Mimic), kernel CephFS client (4.18), 10Gb network with low latency too; client/server have 3GHz CPUs) - Mail original - De:

Re: [ceph-users] Encryption questions

2019-01-10 Thread Alexandre DERUMIER
>>1) Are RBD connections encrypted or is there an option to use encryption >>between clients and Ceph? From reading the documentation, I have the >>impression that the only option to guarantee encryption in >>transit is to >>force clients to encrypt volumes via dmcrypt. Is there another option?

Re: [ceph-users] v13.2.4 Mimic released

2019-01-07 Thread Alexandre DERUMIER
Hi, >>* Ceph v13.2.2 includes a wrong backport, which may cause mds to go into >>'damaged' state when upgrading Ceph cluster from previous version. >>The bug is fixed in v13.2.3. If you are already running v13.2.2, >>upgrading to v13.2.3 does not require special action. Any special action

Re: [ceph-users] cephfs : rsync backup create cache pressure on clients, filling caps

2019-01-06 Thread Alexandre DERUMIER
Re: [ceph-users] cephfs : rsync backup create cache pressure on clients, filling caps On Fri, Jan 4, 2019 at 11:40 AM Alexandre DERUMIER wrote: > > Hi, > > I'm currently doing cephfs backup, through a dedicated clients mounting the > whole filesystem at root. > othe

[ceph-users] cephfs : rsync backup create cache pressure on clients, filling caps

2019-01-03 Thread Alexandre DERUMIER
Hi, I'm currently doing CephFS backups through a dedicated client mounting the whole filesystem at root. Other clients mount parts of the filesystem (kernel CephFS clients). I have around 22 million inodes; before backup, I have around 5M caps loaded by clients. #ceph daemonperf

Re: [ceph-users] cephfs kernel, hang with libceph: osdx X.X.X.X socket closed (con state OPEN)

2018-11-08 Thread Alexandre DERUMIER
have to cross subnets or VLANs, you will run into weird problems later. Fuse has much better tolerance for that scenario. From: ceph-users on behalf of Alexandre DERUMIER Sent: Friday, 9 November 2018 12:06:43 PM To: ceph-users Subject: Re: [ceph-users] cephfs kernel, hang with libceph:

Re: [ceph-users] cephfs kernel, hang with libceph: osdx X.X.X.X socket closed (con state OPEN)

2018-11-08 Thread Alexandre DERUMIER
OK, it seems to come from the firewall: I'm seeing dropped sessions exactly 15 min before the log. The dropped sessions are the sessions to the OSDs; the sessions to the mon && mds are OK. It seems that keepalive2 is used to monitor the mon session https://patchwork.kernel.org/patch/7105641/ but I'm not sure about the OSD sessions

Re: [ceph-users] cephfs kernel, hang with libceph: osdx X.X.X.X socket closed (con state OPEN)

2018-11-08 Thread Alexandre DERUMIER
To be more precise, the logs occur when the hang is finished. I have looked at the stats on 10 different hangs, and the duration is always around 15 minutes. Maybe related to: ms tcp read timeout Description: If a client or daemon makes a request to another Ceph daemon and does not drop an
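
The default for that option is 900 seconds (15 minutes), which would line up with the hang duration observed. A quick way to check it on a running daemon (the daemon name is a placeholder):

  ceph daemon osd.0 config get ms_tcp_read_timeout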

[ceph-users] cephfs kernel, hang with libceph: osdx X.X.X.X socket closed (con state OPEN)

2018-11-08 Thread Alexandre DERUMIER
Hi, we are currently testing CephFS with the kernel module (4.17 and 4.18) instead of fuse (which worked fine), and we get hangs; iowait jumps like crazy for around 20 min. The client is a qemu 2.12 VM with a virtio-net interface. In the client logs, we are seeing this kind of message: [jeu. nov. 8 12:20:18 2018]

Re: [ceph-users] Don't upgrade to 13.2.2 if you use cephfs

2018-10-17 Thread Alexandre DERUMIER
Hi, is it possible to have more info or an announcement about this problem? I'm currently waiting to migrate from Luminous to Mimic (I need the new quota feature for CephFS). Is it safe to upgrade to 13.2.2? Or is it better to wait for 13.2.3, or install 13.2.1 for now? -- Alexandre - Mail

Re: [ceph-users] Mimic and Debian 9

2018-10-17 Thread Alexandre DERUMIER
It's also possible to install the Ubuntu Xenial packages on Stretch, but this needs the old libssl1.0.0 package (you can manually install the deb from Jessie). - Mail original - De: "Hervé Ballans" À: "ceph-users" Envoyé: Mercredi 17 Octobre 2018 11:21:14 Objet: [ceph-users] Mimic and

Re: [ceph-users] Anyone tested Samsung 860 DCT SSDs?

2018-10-12 Thread Alexandre DERUMIER
I haven't tested them, but be careful of the endurance: 0.2 DWPD :/ - Mail original - De: "Kenneth Van Alstyne" À: "ceph-users" Envoyé: Vendredi 12 Octobre 2018 15:53:43 Objet: [ceph-users] Anyone tested Samsung 860 DCT SSDs? Cephers: As the subject suggests, has anyone tested Samsung 860

Re: [ceph-users] ceph mds memory usage 20GB : is it normal ?

2018-08-08 Thread Alexandre DERUMIER
Hi, I upgraded to 12.2.7 two weeks ago, and I don't see the memory increase anymore! (I can't confirm that it was related to your patch.) Thanks again for helping! Regards, Alexandre Derumier - Mail original - De: "Zheng Yan" À: "aderumier" Cc: "ceph

Re: [ceph-users] krbd vs librbd performance with qemu

2018-07-18 Thread Alexandre DERUMIER
Hi, qemu uses only 1 thread per disk; generally the performance limitation comes from the CPU (you can have 1 thread for each disk using iothreads). I'm not sure how it works with krbd, but with librbd and the qemu rbd driver, it only uses 1 core per disk. So you need a fast CPU frequency,

Re: [ceph-users] Fwd: v13.2.0 Mimic is out

2018-06-01 Thread Alexandre DERUMIER
CephFS snapshot is now stable and enabled by default on new filesystems :) Alexandre Derumier Ingénieur système et stockage Manager Infrastructure Fixe : +33 3 59 82 20 10 125 Avenue de la république 59110 La Madeleine [ https://twitter.com/OdisoHosting ] [ https

Re: [ceph-users] ceph mds memory usage 20GB : is it normal ?

2018-05-29 Thread Alexandre DERUMIER
al ? Could you try patch https://github.com/ceph/ceph/pull/22240/files. The leakage of MMDSBeacon messages could explain your issue. Regards Yan, Zheng On Mon, May 28, 2018 at 12:06 PM, Alexandre DERUMIER wrote: >>>could you send me full output of dump_mempools > > # ceph d

Re: [ceph-users] ceph mds memory usage 20GB : is it normal ?

2018-05-27 Thread Alexandre DERUMIER
sum": 0.0, "avgtime": 0.0 } }, "throttle-objecter_bytes": { "val": 0, "max": 104857600, "get_started": 0, "get": 0, "get_sum": 0,

Re: [ceph-users] ceph mds memory usage 20GB : is it normal ?

2018-05-24 Thread Alexandre DERUMIER
018 05:56:31 Objet: Re: [ceph-users] ceph mds memory usage 20GB : is it normal ? On Thu, May 24, 2018 at 11:34 PM, Alexandre DERUMIER <aderum...@odiso.com> wrote: >>>Still don't find any clue. Does the cephfs have idle period. If it >>>has, could you decrease mds's cac

Re: [ceph-users] ceph mds memory usage 20GB : is it normal ?

2018-05-24 Thread Alexandre DERUMIER
_size 1 ? - Mail original - De: "Zheng Yan" <uker...@gmail.com> À: "aderumier" <aderum...@odiso.com> Cc: "ceph-users" <ceph-users@lists.ceph.com> Envoyé: Jeudi 24 Mai 2018 16:27:21 Objet: Re: [ceph-users] ceph mds memory usage 20GB : i

Re: [ceph-users] ceph mds memory usage 20GB : is it normal ?

2018-05-24 Thread Alexandre DERUMIER
..@odiso.com> Cc: "ceph-users" <ceph-users@lists.ceph.com> Envoyé: Jeudi 24 Mai 2018 11:34:20 Objet: Re: [ceph-users] ceph mds memory usage 20GB : is it normal ? On Tue, May 22, 2018 at 3:11 PM, Alexandre DERUMIER <aderum...@odiso.com> wrote: > Hi,some new stats, mds me

Re: [ceph-users] Ceph replication factor of 2

2018-05-24 Thread Alexandre DERUMIER
Hi, >>My thoughts on the subject are that even though checksums do allow to find >>which replica is corrupt without having to figure which 2 out of 3 copies are >>the same, this is not the only reason min_size=2 was required. AFAIK, comparing copies (like checking that 2 out of 3 copies are the same) has

Re: [ceph-users] ceph mds memory usage 20GB : is it normal ?

2018-05-22 Thread Alexandre DERUMIER
"omap_wr": 3595801, "omap_rd": 232070972, "omap_del": 272598 }, "purge_queue": { "pq_executing_ops": 0, "pq_executing": 0, "pq_executed": 1659514 }, "throttle-msgr_dispatc

Re: [ceph-users] a big cluster or several small

2018-05-16 Thread Alexandre DERUMIER
Hi, >>Our main reason for using multiple clusters is that Ceph has a bad >>reliability history when scaling up and even now there are many issues >>unresolved (https://tracker.ceph.com/issues/21761 for example) so by >>dividing single, large cluster into few smaller ones, we reduce the impact

Re: [ceph-users] Intel Xeon Scalable and CPU frequency scaling on NVMe/SSD Ceph OSDs

2018-05-16 Thread Alexandre DERUMIER
Hi, I'm able to get a fixed frequency with intel_pstate=disable intel_idle.max_cstate=0 processor.max_cstate=1 Intel(R) Xeon(R) Gold 6128 CPU @ 3.40GHz # cat /proc/cpuinfo |grep MHz cpu MHz : 3400.002 cpu MHz : 3399.994 cpu MHz : 3399.995 cpu MHz : 3399.994 cpu
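
One common way to apply those kernel parameters persistently is via GRUB (a sketch; Debian/Ubuntu-style paths shown, adjust for your distro):

  # in /etc/default/grub, append to GRUB_CMDLINE_LINUX_DEFAULT:
  #   intel_pstate=disable intel_idle.max_cstate=0 processor.max_cstate=1
  update-grub
  # after a reboot, verify:
  grep MHz /proc/cpuinfo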

Re: [ceph-users] ceph mds memory usage 20GB : is it normal ?

2018-05-12 Thread Alexandre DERUMIER
0, "inode_max": 300, "inodes": 3001264, "inodes_top": 160517, "inodes_bottom": 226577, "inodes_pin_tail": 2614170, "inodes_pinned": 2770689, "inodes_expired": 2920014835, "inodes_with_caps":

Re: [ceph-users] ceph mds memory usage 20GB : is it normal ?

2018-05-12 Thread Alexandre DERUMIER
": 0, "traverse_dir_fetch": 180547123, "traverse_remote_ino": 122257, "traverse_lock": 5957156, "load_cent": 18446743934203149911, "q": 54, "exported": 0, "exported_inodes": 0, "imported": 0, "imported_in

Re: [ceph-users] ceph mds memory usage 20GB : is it normal ?

2018-05-11 Thread Alexandre DERUMIER
Hi, I'm still seeing the memory leak with 12.2.5; it seems to leak a few MB every 5 minutes. I'll try to resend some stats next weekend. - Mail original - De: "Patrick Donnelly" <pdonn...@redhat.com> À: "Brady Deetz" <bde...@gmail.com> Cc: "Alexandre Derumie

Re: [ceph-users] ceph mds memory usage 20GB : is it normal ?

2018-04-18 Thread Alexandre DERUMIER
"get_started": 0, "get": 831356927, "get_sum": 4299208168815, "get_or_fail_fail": 0, "get_or_fail_success": 831356927, "take": 0, "take_sum": 0, "put": 831356

Re: [ceph-users] ceph mds memory usage 20GB : is it normal ?

2018-04-14 Thread Alexandre DERUMIER
} }, "throttle-objecter_bytes": { "val": 0, "max": 104857600, "get_started": 0, "get": 0, "get_sum": 0, "get_or_fail_fail": 0, "get_or_fail_success": 0,

Re: [ceph-users] ceph mds memory usage 20GB : is it normal ?

2018-03-28 Thread Alexandre DERUMIER
nfs ->cephfs daemon running on this cluster. (but no client connected to it) >>Note [1] will be in 12.2.5 and may help with your issue. >>[1] https://github.com/ceph/ceph/pull/20527 ok thanks ! - Mail original - De: "Patrick Donnelly" <pdonn...@red

Re: [ceph-users] Luminous and jemalloc

2018-03-23 Thread Alexandre DERUMIER
Hi, I think it's no longer a problem since the async messenger became the default. The difference between jemalloc and tcmalloc is minimal now. Regards, Alexandre - Mail original - De: "Xavier Trilla" À: "ceph-users" Cc: "Arnau Marcé"

Re: [ceph-users] ceph mds memory usage 20GB : is it normal ?

2018-03-23 Thread Alexandre DERUMIER
01:08:46 Objet: Re: [ceph-users] ceph mds memory usage 20GB : is it normal ? Did the fs have lots of mount/umount? We recently found a memory leak bug in that area https://github.com/ceph/ceph/pull/20148 Regards Yan, Zheng On Thu, Mar 22, 2018 at 5:29 PM, Alexandre DERUMIER <aderum..

[ceph-users] ceph mds memory usage 20GB : is it normal ?

2018-03-22 Thread Alexandre DERUMIER
Hi, I've been running CephFS for 2 months now, and my active MDS memory usage is around 20G (still growing). ceph 1521539 10.8 31.2 20929836 20534868 ? Ssl janv.26 8573:34 /usr/bin/ceph-mds -f --cluster ceph --id 2 --setuser ceph --setgroup ceph USER PID %CPU %MEMVSZ RSS

Re: [ceph-users] Memory leak in Ceph OSD?

2018-03-06 Thread Alexandre DERUMIER
Hi, I'm also seeing a slow memory increase over time with my BlueStore NVMe OSDs (3.2TB each), with default ceph.conf settings (Ceph 12.2.2). Each OSD starts at around 5G of memory and goes up to 8GB. Currently I'm restarting them roughly every month to free memory. Here is a dump of osd.0 after 1 week

Re: [ceph-users] Migrating to new pools

2018-02-21 Thread Alexandre DERUMIER
Hi, if you use qemu, it's also possible to use the drive-mirror feature of qemu (it can mirror and migrate from one storage to another without downtime). I don't know if OpenStack has implemented it, but it's working fine on Proxmox. - Mail original - De: "Anthony D'Atri"
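
For a libvirt-managed guest, a drive mirror into RBD can be kicked off over QMP along these lines (a sketch only; the domain name, device alias and target image are placeholders, and the RBD image is assumed to be pre-created since mode is "existing"):

  virsh qemu-monitor-command myvm \
    '{"execute":"drive-mirror","arguments":{"device":"drive-virtio-disk0","target":"rbd:newpool/vm-disk-1","format":"raw","sync":"full","mode":"existing"}}'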

Re: [ceph-users] Does anyone else still experiancing memory issues with 12.2.2 and Bluestore?

2018-02-10 Thread Alexandre DERUMIER
Hi, I still have my OSD memory growing slowly. Default config, with SSD OSDs: they start at around 5GB and after 1-2 months are near 8GB. (Maybe related to fragmentation?) USER PID %CPU %MEMVSZ RSS TTY STAT START TIME COMMAND ceph 1718009 2.5 11.7 8542012 7725992 ? Ssl

Re: [ceph-users] Question about librbd with qemu-kvm

2018-01-02 Thread Alexandre DERUMIER
It's not possible to use multiple threads per disk in qemu currently (it's on the qemu roadmap), but you can create multiple disks/rbd images and use multiple qemu iothreads (1 per disk). (BTW, I'm able to reach around 70k IOPS max with 4k reads, with a 3.1GHz CPU, rbd_cache=none, disabling debug and
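
A sketch of the one-iothread-per-disk layout on the qemu command line (drive IDs and image names are placeholders; unrelated options omitted):

  qemu-system-x86_64 \
    -object iothread,id=iothread1 \
    -drive file=rbd:rbd/vm-disk-1,if=none,id=drive1,format=raw,cache=none \
    -device virtio-blk-pci,drive=drive1,iothread=iothread1 \
    -object iothread,id=iothread2 \
    -drive file=rbd:rbd/vm-disk-2,if=none,id=drive2,format=raw,cache=none \
    -device virtio-blk-pci,drive=drive2,iothread=iothread2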

Re: [ceph-users] Corrupted files on CephFS since Luminous upgrade

2017-12-08 Thread Alexandre DERUMIER
Have you disabled the fuse pagecache in your clients' ceph.conf? [client] fuse_disable_pagecache = true - Mail original - De: "Florent Bautista" À: "ceph-users" Envoyé: Vendredi 8 Décembre 2017 10:54:59 Objet: Re: [ceph-users] Corrupted

Re: [ceph-users] luminous vs jewel rbd performance

2017-09-21 Thread Alexandre DERUMIER
direct=1 [write-rbd1-4k-depth1] rbdname=rbd-tester-fio bs=4k iodepth=1 rw=write stonewall [write-rbd2-4k-depth16] rbdname=rbd-tester-fio-2 bs=4k iodepth=16 rw=write stonewall Raf On 20 September 2017 at 16:43, Alexandre DERUMIER < [ mailto:aderum...@odiso.com | aderum...@odis

Re: [ceph-users] luminous vs jewel rbd performance

2017-09-20 Thread Alexandre DERUMIER
Hi, so you also use FileStore on Luminous? Have you also upgraded librbd on the client? (Are you benchmarking inside a qemu machine, or directly with fio-rbd?) (I'm going to do a lot of benchmarks in the coming weeks; I'll post the results to the mailing list soon.) - Mail original - De: "Rafael

Re: [ceph-users] output discards (queue drops) on switchport

2017-09-08 Thread Alexandre DERUMIER
for you. (I'm currently trying to backport the latest mlx5 patches from kernel 4.12 to kernel 4.10, to see if that helps.) I have opened a thread on the pve-devel mailing list today. - Mail original - De: "Alexandre Derumier" <aderum...@odiso.com> À: "Burkhar

Re: [ceph-users] output discards (queue drops) on switchport

2017-09-08 Thread Alexandre DERUMIER
Hi, >> public network Mellanox ConnectX-4 Lx dual-port 25 GBit/s Which kernel/distro do you use? I have the same card, and I recently had a problem with the CentOS 7 kernel 3.10, with packet drops. I also have problems with the Ubuntu kernel 4.10 and LACP; kernels 4.4 and 4.12 are working fine for me.

Re: [ceph-users] PCIe journal benefit for SSD OSDs

2017-09-07 Thread Alexandre DERUMIER
Hi Stefan >>Have you already done tests how the performance changes with bluestore >>while putting all 3 block devices on the same ssd? I'm going to test BlueStore with 3 nodes, 18 x Intel S3610 1.6TB, in the coming weeks. I'll send the results to the mailing list. - Mail original - De: "Stefan

Re: [ceph-users] Broken Ceph Cluster when adding new one - Proxmox 5.0 & Ceph Luminous

2017-07-26 Thread Alexandre DERUMIER
Hi Phil, it's possible that RocksDB currently has a bug with some old CPUs (old Xeons and some Opterons). I have the same behaviour with a new cluster when creating mons: http://tracker.ceph.com/issues/20529 What is your CPU model? In your log: sh[1869]: in thread 7f6d85db3c80

Re: [ceph-users] qemu-img convert vs rbd import performance

2017-07-21 Thread Alexandre DERUMIER
be affecting the qemu-img convert performance? On Fri, Jul 21, 2017 at 2:24 PM, Alexandre DERUMIER < [ mailto:aderum...@odiso.com | aderum...@odiso.com ] > wrote: It's already in qemu 2.9 [ http://git.qemu.org/?p=qemu.git;a=commit;h=2d9187bc65727d9dd63e2c410b5500add3db0b0d | h

Re: [ceph-users] qemu-img convert vs rbd import performance

2017-07-21 Thread Alexandre DERUMIER
It's already in qemu 2.9 http://git.qemu.org/?p=qemu.git;a=commit;h=2d9187bc65727d9dd63e2c410b5500add3db0b0d "This patch introduces 2 new cmdline parameters. The -m parameter to specify the number of coroutines running in parallel (defaults to 8). And the -W parameter to allow qemu-img to
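
In practice that looks something like the sketch below when importing into RBD (qemu-img >= 2.9; the source file and pool/image names are placeholders):

  qemu-img convert -p -m 16 -W -O raw source.qcow2 rbd:rbd/vm-disk-1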

Re: [ceph-users] qemu-img convert vs rbd import performance

2017-07-21 Thread Alexandre DERUMIER
Hi, there is an RFC here: "[RFC] qemu-img: make convert async" https://patchwork.kernel.org/patch/9552415/ Maybe it could help. - Mail original - De: "Jason Dillaman" À: "Mahesh Jambhulkar" Cc: "ceph-users"

Re: [ceph-users] Ceph mount rbd

2017-06-30 Thread Alexandre DERUMIER
>>Of course, I always have to ask the use-case behind mapping the same image on >>multiple hosts. Perhaps CephFS would be a better fit if you are trying to >>serve out a filesystem? Hi Jason, currently I'm sharing rbd images between multiple webserver VMs with OCFS2 on top. They have old

Re: [ceph-users] Ceph packages for Debian Stretch?

2017-06-21 Thread Alexandre DERUMIER
Hi, Proxmox is maintaining a ceph-luminous repo for Stretch http://download.proxmox.com/debian/ceph-luminous/ The git repo is here, with the patches and modifications needed to get it to work: https://git.proxmox.com/?p=ceph.git;a=summary - Mail original - De: "Alfredo Deza" À: "Christian

Re: [ceph-users] Sharing SSD journals and SSD drive choice

2017-04-27 Thread Alexandre DERUMIER
Hi, >>What I'm trying to get from the list is /why/ the "enterprise" drives >>are important. Performance? Reliability? Something else? Performance, for sure (for SYNC writes, see https://www.sebastien-han.fr/blog/2014/10/10/ceph-how-to-test-if-your-ssd-is-suitable-as-a-journal-device/). Reliability
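
The journal-suitability test from that blog post is a single-job, queue-depth-1 synchronous write; a sketch of it is below (the device name is a placeholder, and it writes to the raw device, so use a scratch disk):

  fio --filename=/dev/sdX --direct=1 --sync=1 --rw=write --bs=4k \
      --numjobs=1 --iodepth=1 --runtime=60 --time_based --group_reporting \
      --name=journal-test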

Re: [ceph-users] ceph packages on stretch from eu.ceph.com

2017-04-26 Thread Alexandre DERUMIER
you can try the proxmox stretch repository if you want http://download.proxmox.com/debian/ceph-luminous/dists/stretch/ - Mail original - De: "Wido den Hollander" À: "ceph-users" , "Ronny Aasen" Envoyé: Mercredi 26

Re: [ceph-users] libjemalloc.so.1 not used?

2017-03-27 Thread Alexandre DERUMIER
You need to recompile Ceph with jemalloc, without the tcmalloc dev libraries installed. LD_PRELOAD has never worked for jemalloc and Ceph. - Mail original - De: "Engelmann Florian" À: "ceph-users" Envoyé: Lundi 27 Mars 2017 16:54:33

Re: [ceph-users] noout, nodown and blocked requests

2017-03-13 Thread Alexandre DERUMIER
Hi, >>Currently I have the noout and nodown flags set while doing the maintenance >>work. You only need noout to avoid rebalancing; see the documentation: http://docs.ceph.com/docs/kraken/rados/troubleshooting/troubleshooting-osd/ "STOPPING W/OUT REBALANCING". Your clients are hanging because of
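
For reference, the corresponding commands around a maintenance window (standard Ceph CLI):

  ceph osd set noout
  # ... perform maintenance / restart OSDs ...
  ceph osd unset noout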

Re: [ceph-users] Posix AIO vs libaio read performance

2017-03-10 Thread Alexandre DERUMIER
sixaio --buffered=0 --direct=1 --rw=randread --bs=4k --size=1024m --iodepth=32 It would be really helpful to know if I'm the only one affected or this is happening in all qemu + ceph setups. Thanks! Xavier El 10 mar 2017, a las 8:07, Alexandre DERUMIER < [ mailto:aderum...@odiso.com | ader

Re: [ceph-users] Posix AIO vs libaio read performance

2017-03-09 Thread Alexandre DERUMIER
>>But it still looks like there is some bottleneck in QEMU or librbd I cannot >>manage to find. You can improve latency on the client by disabling debug. On your client, create a /etc/ceph/ceph.conf with [global] debug asok = 0/0 debug auth = 0/0 debug buffer = 0/0 debug client = 0/0 debug

Re: [ceph-users] KVM/QEMU rbd read latency

2017-02-17 Thread Alexandre DERUMIER
son Dillaman" <jdill...@redhat.com> À: "aderumier" <aderum...@odiso.com> Cc: "Phil Lacroute" <lacro...@skyportsystems.com>, "ceph-users" <ceph-users@lists.ceph.com> Envoyé: Vendredi 17 Février 2017 15:16:39 Objet: Re: [ceph-users] KVM/QEM

Re: [ceph-users] KVM/QEMU rbd read latency

2017-02-16 Thread Alexandre DERUMIER
Hi, currently I can reduce the latency by: - compiling qemu to use jemalloc - disabling rbd_cache (or qemu cache=none) - disabling debug in /etc/ceph/ceph.conf on the client node: [global] debug asok = 0/0 debug auth = 0/0 debug buffer = 0/0 debug client = 0/0 debug context = 0/0 debug crush =
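
The full list of debug settings is truncated above; a minimal client-side sketch of the same idea (only a subset of the subsystems is shown, appended via a heredoc):

  cat >> /etc/ceph/ceph.conf <<'EOF'
  [global]
  debug ms = 0/0
  debug rbd = 0/0
  debug rados = 0/0
  debug objecter = 0/0
  debug auth = 0/0
  EOF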

Re: [ceph-users] Ceph counters decrementing after changing pg_num

2017-01-20 Thread Alexandre DERUMIER
If you change the pg_num value, Ceph will reshuffle almost all of the data, so depending on the size of your storage it can take some time... - Mail original - De: "Kai Storbeck" À: "ceph-users" Envoyé: Vendredi 20 Janvier 2017 17:17:08 Objet:

Re: [ceph-users] 答复: Does this indicate a "CPU bottleneck"?

2017-01-19 Thread Alexandre DERUMIER
Have you checked CPU usage on the clients? Also, when you increase the number of OSDs, do you increase pg_num? Can you provide your fio job config? - Mail original - De: "许雪寒" À: "John Spray" Cc: "ceph-users" Envoyé:

Re: [ceph-users] Calamari or Alternative

2017-01-13 Thread Alexandre DERUMIER
Another tool : http://openattic.org/ - Mail original - De: "Marko Stojanovic" À: "Tu Holmes" , "John Petrini" Cc: "ceph-users" Envoyé: Vendredi 13 Janvier 2017 09:30:16 Objet: Re: [ceph-users]
