Re: [ceph-users] Meetup in Paris

2014-02-05 Thread Alexandre DERUMIER
I'm on the waiting list too, but anyway, I'm not sure I can come yet. (I come from Lille, and I need to check with my boss to leave early.) By the way, any chance of having a ceph conference or meetup at the Solutions Linux expo in Paris this year? - Original Message - From: Loic Dachary

Re: [ceph-users] RBD Caching - How to enable?

2014-02-06 Thread Alexandre DERUMIER
The documentation states that setting rbd cache = true in [global] enables it, but doesn't elaborate on whether you need to restart any Ceph processes. It's on the client side! (so no need to restart ceph daemons) - Original Message - From: Graeme Lambert glamb...@adepteo.net To:
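
As a minimal sketch of the client-side setting discussed here (the section and option names follow standard ceph.conf conventions; the second option is an assumption about a commonly paired safety knob, not something stated in the thread):

    [client]
        rbd cache = true
        rbd cache writethrough until flush = true

Only the client process (the QEMU/librbd instance) needs to be restarted to pick this up; the osd and monitor daemons are unaffected.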

Re: [ceph-users] RBD Caching - How to enable?

2014-02-06 Thread Alexandre DERUMIER
.conf :) http://ceph.com/docs/next/rbd/qemu-rbd/ - Mail original - De: Graeme Lambert glamb...@adepteo.net À: Alexandre DERUMIER aderum...@odiso.com Cc: ceph-users@lists.ceph.com Envoyé: Jeudi 6 Février 2014 12:03:00 Objet: Re: [ceph-users] RBD Caching - How to enable? Hi

Re: [ceph-users] RBD Caching - How to enable?

2014-02-07 Thread Alexandre DERUMIER
] RBD Caching - How to enable? Does anybody else think there is a problem with the docs/settings here... Message: 13 Date: Thu, 06 Feb 2014 12:11:53 +0100 (CET) From: Alexandre DERUMIER aderum...@odiso.com To: Graeme Lambert glamb...@adepteo.net Cc: ceph-users@lists.ceph.com Subject

[ceph-users] Questions about coming cache pool

2014-02-07 Thread Alexandre DERUMIER
Hi, I have some questions about the coming cache pool feature. Is it only a cache? (is the data on both the cache pool and the main pool?) Or is the data migrated from the main pool to the cache pool? Do we need to enable replication on the cache pool? What happens if we lose osds from the cache pool?

Re: [ceph-users] Paris Ceph meetup

2014-02-18 Thread Alexandre DERUMIER
They were in French unfortunately. Hi, I'm interested in the French presentations ;) - Original Message - From: Loic Dachary l...@dachary.org To: Karan Singh karan.si...@csc.fi Cc: ceph-users ceph-users@lists.ceph.com, Ceph Community ceph-commun...@lists.ceph.com Sent: Tuesday 18 February

[ceph-users] questions about ceph cluster in multi-dacenter

2014-02-20 Thread Alexandre DERUMIER
Hi, I'm looking to build a ceph cluster in a multi-datacenter architecture. Data will be stored in DC1 and DC2 (mon + osds), and we'll have a third datacenter, DC3, with a monitor only for quorum. 2x dark fiber DC1-DC2 | | |

Re: [ceph-users] [Annonce]The progress of KeyValueStore in Firely

2014-02-27 Thread Alexandre DERUMIER
Thanks for the report! The results seem encouraging. (Is it the leveldb keystore?) Thanks to fio-rbd, it'll be easier to do random io benchmarks now! (I'm waiting to see if rocksdb will improve things in the future) Regards, Alexandre - Original Message - From: Haomai Wang

Re: [ceph-users] Enabling discard/trim

2014-03-04 Thread Alexandre DERUMIER
Hi, you should have -drive file=rbd:libvirt,discard=on on the command line to have discard enabled. - Original Message - From: ljm李嘉敏 jm...@ctrip.com To: ceph-us...@ceph.com Sent: Wednesday 5 March 2014 02:37:52 Subject: [ceph-users] Enabling discard/trim Dear all, I try to use
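
A slightly fuller sketch of such a QEMU invocation (pool, image and client id are placeholder assumptions; discard also needs a bus that supports it, such as virtio-scsi):

    qemu-system-x86_64 ... \
        -drive file=rbd:rbd/vm-disk:id=admin,format=raw,if=none,id=drive0,cache=writeback,discard=on \
        -device virtio-scsi-pci,id=scsi0 \
        -device scsi-hd,drive=drive0,bus=scsi0.0

Inside the guest, the filesystem must then be mounted with the discard option (or fstrim run periodically) for trims to actually reach the rbd image.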

Re: [ceph-users] v0.90 released

2014-12-28 Thread Alexandre DERUMIER
After apt-get update and upgrade I still see the 0.87 release .. any hint? What is your repository in sources.list? - Original Message - From: Zeeshan Ali Shah zas...@pdc.kth.se To: Florent MONTHEL fmont...@flox-arts.net Cc: Sage Weil sw...@redhat.com, ceph-users ceph-us...@ceph.com, René

Re: [ceph-users] erasure code : number of chunks for a small cluster ?

2015-02-02 Thread Alexandre DERUMIER
...@dachary.org À: aderumier aderum...@odiso.com, ceph-users ceph-users@lists.ceph.com Envoyé: Dimanche 1 Février 2015 18:42:51 Objet: Re: [ceph-users] erasure code : number of chunks for a small cluster ? Hi Alexandre, On 01/02/2015 18:15, Alexandre DERUMIER wrote: Hi, I'm currently trying

Re: [ceph-users] erasure code : number of chunks for a small cluster ?

2015-02-02 Thread Alexandre DERUMIER
host can be down for maintenance. Udo PS: you also can't change k+m on a pool later... On 01.02.2015 18:15, Alexandre DERUMIER wrote: Hi, I'm currently trying to understand how to setup correctly a pool with erasure code https://ceph.com/docs/v0.80/dev/osd_internals/erasure_coding

Re: [ceph-users] features of the next stable release

2015-02-03 Thread Alexandre DERUMIER
Hi, from my tests with giant, it was the cpu that limited performance on the osd. I'm going to do some benchmarks with 2x10 cores at 3.1GHz for 6 ssds next month. I'll post the results on the mailing list. - Original Message - From: mad Engineer themadengin...@gmail.com To: Gregory Farnum

Re: [ceph-users] Update 0.80.7 to 0.80.8 -- Restart Order

2015-02-03 Thread Alexandre DERUMIER
Debian deb package updates do not restart the services. (So, I think it should be the same for ubuntu.) You need to restart the daemons in this order: - monitor - osd - mds - rados gateway http://ceph.com/docs/master/install/upgrading-ceph/ - Original Message - From: Stephen Jahl
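
A minimal sketch of that restart order with the sysvinit scripts of the era (the radosgw service name varies by distro and is an assumption here):

    # on each monitor node, wait for HEALTH_OK before moving on
    /etc/init.d/ceph restart mon
    # on each OSD node
    /etc/init.d/ceph restart osd
    # on each MDS node
    /etc/init.d/ceph restart mds
    # on each gateway node
    service radosgw restart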

Re: [ceph-users] erasure code : number of chunks for a small cluster ?

2015-02-06 Thread Alexandre DERUMIER
Oh, I hadn't thought about that. Thanks Hector! - Original Message - From: Hector Martin hec...@marcansoft.com To: ceph-users ceph-users@lists.ceph.com Sent: Friday 6 February 2015 09:06:29 Subject: Re: [ceph-users] erasure code : number of chunks for a small cluster ? On 02/02/15 03:38,

[ceph-users] crush tunables : optimal : upgrade from firefly to hammer behaviour ?

2015-02-08 Thread Alexandre DERUMIER
Hi, I currently use the crush tunables optimal value. If I upgrade from firefly to hammer, will the optimal value be upgraded to the optimal values for hammer? So, do my clients (qemu-librbd) also need to be upgraded to hammer to support the new hammer features? If yes, I think to: - change

Re: [ceph-users] crush tunables : optimal : upgrade from firefly to hammer behaviour ?

2015-02-08 Thread Alexandre DERUMIER
: optimal : upgrade from firefly to hammer behaviour ? On Mon, 9 Feb 2015, Alexandre DERUMIER wrote: Hi, I'm currently use crush tunables optimal value. If I upgrade from firefly to hammer, does the optimal value will upgrade to optimal values for hammer. The tunables won't change

Re: [ceph-users] Journals on all SSD cluster

2015-01-21 Thread Alexandre DERUMIER
Hi, from my last benchmark, I was around 12 iops rand read 4k, 2iops rand write 4k (3 nodes with 2 ssd osds + journal, intel 3500 ssds). My main bottleneck was cpu (it was a 2x4 core 1.4GHz intel), both on the osd and the client. I'm going to test my production cluster next month, with bigger

Re: [ceph-users] Journals on all SSD cluster

2015-01-22 Thread Alexandre DERUMIER
...@gol.com À: ceph-users ceph-users@lists.ceph.com Cc: aderumier aderum...@odiso.com Envoyé: Jeudi 22 Janvier 2015 09:37:03 Objet: Re: [ceph-users] Journals on all SSD cluster Hello, On Thu, 22 Jan 2015 08:32:13 +0100 (CET) Alexandre DERUMIER wrote: Hi, From my last benchmark, Using which version

Re: [ceph-users] stuck with dell perc 710p / (aka mega raid 2208?)

2015-02-10 Thread Alexandre DERUMIER
Hi, you need to import the foreign config from the openmanage webui, somewhere in the storage controller section. BTW, I'm currently testing a new dell r630 with a perc h330 (lsi 3008). With this controller, it's possible to do hardware RAID for some disks and passthrough for other disks. So, perfect for ceph

Re: [ceph-users] Ceph Dumpling/Firefly/Hammer SSD/Memstore performance comparison

2015-02-18 Thread Alexandre DERUMIER
Nice work Mark! I don't see any tuning of sharding in the config file sample (osd_op_num_threads_per_shard, osd_op_num_shards, ...). As you only use 1 ssd for the bench, I think it should improve results for hammer? - Original Message - From: Mark Nelson mnel...@redhat.com To:

Re: [ceph-users] mongodb on top of rbd volumes (through krbd) ?

2015-02-12 Thread Alexandre DERUMIER
/2015 05:55, Alexandre DERUMIER wrote: Hi, I'm currently running a big mongodb cluster, around 2TB, (sharding + replication). And I have a lot of problems with mongo replication (out of syncs and need to full replicate again and again datas between my mongo replicats). So, I thinked

Re: [ceph-users] re: Upgrade 0.80.5 to 0.80.8 --the VM's read requestbecome too slow

2015-02-12 Thread Alexandre DERUMIER
Hi, can you test with rbd_cache disabled? I remember a bug detected in giant, not sure it's also the case for firefly. This was the tracker: http://tracker.ceph.com/issues/9513 But it has been solved and backported to firefly. Also, can you test 0.80.6 and 0.80.7? - Mail

Re: [ceph-users] Introducing Learning Ceph : The First ever Book on Ceph

2015-02-13 Thread Alexandre DERUMIER
Just buy it. Nice book; I haven't read all of it yet, but it seems to cover all the ceph features. Good job! - Original Message - From: Karan Singh karan.si...@csc.fi To: Ceph Community ceph-commun...@lists.ceph.com, ceph-users ceph-users@lists.ceph.com, ceph-maintain...@ceph.com,

Re: [ceph-users] OSD slow requests causing disk aborts in KVM

2015-02-13 Thread Alexandre DERUMIER
Can this timeout be increased in some way? I've searched around and found the /sys/block/sdx/device/timeout knob, which in my case is set to 30s. Yes, sure: echo 60 > /sys/block/sdx/device/timeout for 60s, for example. - Original Message - From: Krzysztof Nowicki

[ceph-users] new ssd intel s3610, has somebody tested them ?

2015-02-19 Thread Alexandre DERUMIER
Hi, Intel has just released the new ssd s3610: http://www.anandtech.com/show/8954/intel-launches-ssd-dc-s3610-s3710-enterprise-ssds The endurance is 10x bigger than the 3500, for a 10% cost addition. Has somebody already tested them? Regards, Alexandre

[ceph-users] erasure code : number of chunks for a small cluster ?

2015-02-01 Thread Alexandre DERUMIER
Hi, I'm currently trying to understand how to correctly set up a pool with erasure code https://ceph.com/docs/v0.80/dev/osd_internals/erasure_coding/developer_notes/ My cluster is 3 nodes with 6 osds on each node (18 osds total). I want to be able to survive 2 disk failures, but also a
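
As context, a minimal sketch of how an erasure-coded pool is created with a custom profile in the firefly-era CLI (the profile name, pg counts and k/m values are illustrative assumptions; note that with only 3 hosts, a host-level failure domain cannot satisfy k+m > 3, which is the crux of this thread):

    ceph osd erasure-code-profile set ec32 k=3 m=2 ruleset-failure-domain=osd
    ceph osd erasure-code-profile get ec32
    ceph osd pool create ecpool 128 128 erasure ec32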

Re: [ceph-users] slow read-performance inside the vm

2015-01-11 Thread Alexandre DERUMIER
Hi, also check your cpu usage; dell poweredge 2900 are quite old (6-8 years old). The more iops you need, the more cpu you need. I don't remember what the default blocksize of rados bench is. - Original Message - From: Patrik Plank pat...@plank.me To: ceph-users ceph-users@lists.ceph.com

Re: [ceph-users] Upgrade 0.80.5 to 0.80.8 --the VM's read request become too slow

2015-02-12 Thread Alexandre DERUMIER
Hi, have you also tested the 0.80.6 and 0.80.7 librbd? It could be useful to search the commits in git. (I'm not sure that all changes are in the release notes) - Original Message - From: 杨万元 yangwanyuan8...@gmail.com To: ceph-users ceph-users@lists.ceph.com Sent: Thursday 12 February 2015 04:14:15

Re: [ceph-users] OSD slow requests causing disk aborts in KVM

2015-02-12 Thread Alexandre DERUMIER
To my surprise however these slow requests caused aborts from the block device on the VM side, which ended up corrupting files. This is very strange, you shouldn't have corruption. Do you use writeback? If yes, have you disabled barriers on your filesystem? (What is the qemu version? guest

[ceph-users] mongodb on top of rbd volumes (through krbd) ?

2015-02-11 Thread Alexandre DERUMIER
Hi, I'm currently running a big mongodb cluster, around 2TB (sharding + replication), and I have a lot of problems with mongo replication (out-of-syncs and the need to fully re-replicate data again and again between my mongo replica sets). So, I thought of using rbd to replicate the storage and keep

Re: [ceph-users] ceph Performance random write is more then sequential

2015-02-04 Thread Alexandre DERUMIER
Hi, what I saw after enabling the RBD cache is that it works as expected, meaning sequential writes get better MBps than random writes. Can somebody explain this behaviour? This is because rbd_cache coalesces and merges ios into bigger ios, so it only helps with sequential workloads. You'll do fewer ios

Re: [ceph-users] SSD Hardware recommendation

2015-03-18 Thread Alexandre DERUMIER
will see it when the stuff is being assembled :o Do you feel that there something in the setup that could be better if you would decide on hardware as of today? Also, will you try out replication set of 2 as well? Thanks Josef On 18 Mar 2015, at 08:19, Alexandre DERUMIER aderum...@odiso.com

Re: [ceph-users] RBD read-ahead not working in 0.87.1

2015-03-18 Thread Alexandre DERUMIER
Hi, I don't know how rbd read-ahead works, but with qemu virtio-scsi you can get merged read requests (for sequential reads), so it issues bigger ops to the ceph cluster and improves throughput. virtio-blk merge requests will be supported in the coming qemu 2.3. (I'm not sure about virtio-win

Re: [ceph-users] ceph.conf

2015-03-18 Thread Alexandre DERUMIER
Hi, you don't need to define the ip and host for the osds, but you do need to define the monitor ips:
    [osd]
         keyring = /var/lib/ceph/osd/ceph-$id/keyring
    [mon.0]
         host = node1
         mon addr = X.X.X.X:6789
    [mon.1]
         host = node2
         mon addr = X.X.X.X:6789
    [mon.2]
         host

Re: [ceph-users] SSD Hardware recommendation

2015-03-18 Thread Alexandre DERUMIER
Hi Josef, I'm going to benchmark a 3-node cluster with 6 ssds in each node (2x10 cores at 3.1GHz). From my previous bench, you need fast cpus if you need a lot of iops, and writes are a lot more expensive than reads. Now, if you are doing only a small number of iops (big blocks / big throughput), you don't need too

Re: [ceph-users] [SPAM] Changing pg_num = RBD VM down !

2015-03-16 Thread Alexandre DERUMIER
That the full system slows down, OK, but a brutal stop... This is strange. It could be: - a qemu crash, maybe a bug in rbd block storage (if you use librbd) - the oom-killer on your host (any logs?) What is your qemu version? - Original Message - From: Florent Bautista flor...@coppint.com To:

Re: [ceph-users] [SPAM] Changing pg_num = RBD VM down !

2015-03-16 Thread Alexandre DERUMIER
: aderumier aderum...@odiso.com Cc: ceph-users ceph-users@lists.ceph.com Envoyé: Lundi 16 Mars 2015 11:14:45 Objet: Re: [ceph-users] [SPAM] Changing pg_num = RBD VM down ! On 03/16/2015 11:03 AM, Alexandre DERUMIER wrote: This is strange, that could be: - qemu crash, maybe a bug in rbd block

[ceph-users] centos vs ubuntu for production ceph cluster ?

2015-03-20 Thread Alexandre DERUMIER
Hi, I'll build my full ssd production cluster soon, and I wonder which distro is best tested by inktank and the ceph team? The ceph.com doc is quite old and doesn't have references for giant or hammer http://ceph.com/docs/master/start/os-recommendations/ It seems that in the past only ubuntu and rhel were well tested,

Re: [ceph-users] [SPAM] Changing pg_num = RBD VM down !

2015-03-16 Thread Alexandre DERUMIER
aderum...@odiso.com Cc: ceph-users ceph-users@lists.ceph.com Envoyé: Lundi 16 Mars 2015 12:35:11 Objet: Re: [ceph-users] [SPAM] Changing pg_num = RBD VM down ! On 03/16/2015 12:23 PM, Alexandre DERUMIER wrote: We use Proxmox, so I think it uses librbd ? As It's me that I made the proxmox rbd plugin

Re: [ceph-users] Multiple OSD's in a Each node with replica 2

2015-03-06 Thread Alexandre DERUMIER
Is it possible for all replicas of an object to be saved on the same node? No (unless you wrongly modify the crushmap manually). Is it possible to lose any? With replication x2, if you lose 2 osds on 2 different nodes holding the same object, you'll lose the object. Is there a mechanism

Re: [ceph-users] Strange krbd behaviour with queue depths

2015-03-06 Thread Alexandre DERUMIER
: 06 March 2015 16:02 To: Alexandre DERUMIER; Nick Fisk Cc: ceph-users Subject: Re: [ceph-users] Strange krbd behaviour with queue depths Nick, I think this is because the krbd you are using is using Nagle's algorithm, i.e. TCP_NODELAY = false by default. The latest krbd module

Re: [ceph-users] Update 0.80.5 to 0.80.8 --the VM's read request become too slow

2015-03-02 Thread Alexandre DERUMIER
I think this will be fixed in the next firefly point release. Tracker for the firefly 0.80.8 speed decrease: http://tracker.ceph.com/issues/10956 Jason Dillaman linked it to the famous object_cacher bug: http://tracker.ceph.com/issues/9854 - Original Message - From: Gregory Farnum g...@gregs42.com To:

Re: [ceph-users] Extreme slowness in SSD cluster with 3 nodes and 9 OSD with 3.16-3 kernel

2015-02-28 Thread Alexandre DERUMIER
cluster with 3 nodes and 9 OSD with 3.16-3 kernel Am 28.02.2015 um 12:43 schrieb Alexandre DERUMIER aderum...@odiso.com: Hi, First, test if your ssd can write fast with O_DSYNC check this blog: http://www.sebastien-han.fr/blog/2014/10/10/ceph-how-to-test-if-your-ssd-is-suitable

Re: [ceph-users] Extreme slowness in SSD cluster with 3 nodes and 9 OSD with 3.16-3 kernel

2015-02-28 Thread Alexandre DERUMIER
As an optimisation, try to set the io scheduler to noop, and also enable rbd_cache=true (it really helps for sequential writes). But your results seem quite low: 926kb/s with 4k is only around 200 io/s. Check that you don't have any big network latencies or an mtu fragmentation problem. Maybe also
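
A quick sketch of the scheduler change mentioned above (the device name is a placeholder; on newer kernels the equivalent choice is usually "none"):

    cat /sys/block/sdb/queue/scheduler        # show available / current scheduler
    echo noop > /sys/block/sdb/queue/scheduler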

Re: [ceph-users] Extreme slowness in SSD cluster with 3 nodes and 9 OSD with 3.16-3 kernel

2015-02-28 Thread Alexandre DERUMIER
Hi, first, test if your ssd can write fast with O_DSYNC; check this blog: http://www.sebastien-han.fr/blog/2014/10/10/ceph-how-to-test-if-your-ssd-is-suitable-as-a-journal-device/ Then, try with ceph Giant (or maybe wait for Hammer), because there are a lot of optimisations for ssd for threads
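
The kind of O_DSYNC test that blog describes boils down to a small synchronous dd run; a sketch (write to a scratch file or an unused partition, never a device holding data):

    dd if=/dev/zero of=/mnt/ssd/dsync-test bs=4k count=100000 oflag=direct,dsync

A journal-class SSD should sustain thousands of these 4k dsync writes per second; consumer drives often collapse to a few hundred.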

Re: [ceph-users] Possibly misleading/outdated documentation about qemu/kvm and rbd cache settings

2015-02-27 Thread Alexandre DERUMIER
Hi, from qemu rbd.c:

    if (flags & BDRV_O_NOCACHE) {
        rados_conf_set(s->cluster, "rbd_cache", "false");
    } else {
        rados_conf_set(s->cluster, "rbd_cache", "true");
    }

and block.c:

    int bdrv_parse_cache_flags(const char *mode, int *flags)
    {
        *flags &= ~BDRV_O_CACHE_MASK;
        if

Re: [ceph-users] v0.87.1 Giant released

2015-02-26 Thread Alexandre DERUMIER
I just upgraded my debian giant cluster:
1) on each node:
    apt-get update
    apt-get dist-upgrade
2) on each node:
    /etc/init.d/ceph restart mon
    # ceph -w --- verify that HEALTH is ok before doing another node
3) on each node:
    /etc/init.d/ceph restart osd

Re: [ceph-users] Ceph Hammer OSD Shard Tuning Test Results

2015-02-26 Thread Alexandre DERUMIER
comparison thread, Alexandre DERUMIER wondered if changing the default shard and threads per shard OSD settings might have a positive effect on performance in our tests. I went back and used one of the PCIe SSDs from our previous tests to experiment with a recent master pull. I wanted to know how

Re: [ceph-users] Perf problem after upgrade from dumpling to firefly

2015-03-04 Thread Alexandre DERUMIER
Mars 2015 16:42:13 Objet: Re: [ceph-users] Perf problem after upgrade from dumpling to firefly Only writes ;) Le mercredi 04 mars 2015 à 16:19 +0100, Alexandre DERUMIER a écrit : The change is only on OSD (and not on OSD journal). do you see twice iops for read and write ? if only read

Re: [ceph-users] CEPH hardware recommendations and cluster design questions

2015-03-04 Thread Alexandre DERUMIER
Hi, for hardware, inktank has good guides here: http://www.inktank.com/resource/inktank-hardware-selection-guide/ http://www.inktank.com/resource/inktank-hardware-configuration-guide/ ceph works well with multiple osd daemons (1 osd per disk), so you should not use raid. (xfs is the recommended

Re: [ceph-users] Perf problem after upgrade from dumpling to firefly

2015-03-04 Thread Alexandre DERUMIER
(and not on OSD journal). Le mercredi 04 mars 2015 à 15:05 +0100, Alexandre DERUMIER a écrit : The load problem is permanent : I have twice IO/s on HDD since firefly. Oh, permanent, that's strange. (If you don't see more traffic coming from clients, I don't understand...) do you see also

Re: [ceph-users] Perf problem after upgrade from dumpling to firefly

2015-03-04 Thread Alexandre DERUMIER
Hi, maybe this is related ?: http://tracker.ceph.com/issues/9503 Dumpling: removing many snapshots in a short time makes OSDs go berserk http://tracker.ceph.com/issues/9487 dumpling: snaptrimmer causes slow requests while backfilling. osd_snap_trim_sleep not helping

Re: [ceph-users] Perf problem after upgrade from dumpling to firefly

2015-03-04 Thread Alexandre DERUMIER
firefly. And yes, the problem hang the production at night during snap trimming. I suppose there is a new OSD parameter which change behavior of the journal, or something like that. But didn't find anything about that. Olivier Le mercredi 04 mars 2015 à 14:44 +0100, Alexandre DERUMIER

Re: [ceph-users] Ceph Hammer OSD Shard Tuning Test Results

2015-03-01 Thread Alexandre DERUMIER
, In the Ceph Dumpling/Firefly/Hammer SSD/Memstore performance comparison thread, Alexandre DERUMIER wondered if changing the default shard and threads per shard OSD settings might have a positive effect on performance in our tests. I went back and used one of the PCIe SSDs from our previous tests

Re: [ceph-users] Ceph Hammer OSD Shard Tuning Test Results

2015-03-01 Thread Alexandre DERUMIER
2015 22:49:23 Objet: Re: [ceph-users] Ceph Hammer OSD Shard Tuning Test Results Can I ask what xio and simple messenger are and the differences? Kind regards Kevin Walker +968 9765 1742 On 1 Mar 2015, at 18:38, Alexandre DERUMIER aderum...@odiso.com wrote: Hi Mark, I found an previous

Re: [ceph-users] re: Upgrade 0.80.5 to 0.80.8 --the VM's read requestbecome too slow

2015-02-21 Thread Alexandre DERUMIER
can sure this problem is cause from 0.80.8 2015-02-12 19:33 GMT+08:00 Alexandre DERUMIER aderum...@odiso.com : Hi, Can you test with disabling rbd_cache ? I remember of a bug detected in giant, not sure it's also the case for fireflt This was this tracker: http://tracker.ceph.com/issues

Re: [ceph-users] Ceph 0.87-1

2015-02-25 Thread Alexandre DERUMIER
Hi, I know that Loic Dachary is currently working on backporting new features to giant, and I see that 0.87.1 has been tagged in git too; here is the difference: https://github.com/ceph/ceph/compare/v0.87...v0.87.1 Loic, any announcement/release notes yet? - Original Message - From: Lindsay

Re: [ceph-users] Write IO Problem

2015-03-24 Thread Alexandre DERUMIER
Hi, dd if=/dev/zero of=tempfile bs=1M count=1024 conv=fdatasync,notrunc 1073741824 bytes (1.1 GB) copied, 2.53986 s, 423 MB/s How much do you get with O_DSYNC? (the ceph journal uses O_DSYNC, and some ssds are pretty slow with dsync)

Re: [ceph-users] Write IO Problem

2015-03-24 Thread Alexandre DERUMIER
07:56:33 +0100 (CET) Alexandre DERUMIER wrote: Hi, dd if=/dev/zero of=tempfile bs=1M count=1024 conv=fdatasync,notrunc 1073741824 Bytes (1,1 GB) kopiert, 2,53986 s, 423 MB/s How much do you get with o_dsync? (ceph journal use o_dsync, and some ssd are pretty slow with dsync

Re: [ceph-users] SSD Hardware recommendation

2015-03-23 Thread Alexandre DERUMIER
Hi, isn't it in the nature of ssds to have write amplification? Generally, they have an erase block size of 128k, so the worst case could be 128/4 = 32x write amplification. (Of course ssd algorithms and optimisations reduce this write amplification.) Now, it would be great to see if it's coming

Re: [ceph-users] SSD Hardware recommendation

2015-03-23 Thread Alexandre DERUMIER
(not tested, but I think that with the journal and O_DSYNC writes, it can give us ssd write amplification) Also, I think that enterprise ssds with a supercapacitor should be able to cache these o_dsync writes in the ssd buffer and do bigger writes to reduce amplification. I don't know how ssd internal

Re: [ceph-users] New Intel 750 PCIe SSD

2015-04-02 Thread Alexandre DERUMIER
The performance seems quite impressive (40iops 4k) :) For endurance, it looks like the intel s3500. I still prefer the s3610 for now: 400GB for $419, with 3.0PB endurance. - Original Message - From: Mark Nelson mnel...@redhat.com To: ceph-users ceph-users@lists.ceph.com Sent: Thursday 2

Re: [ceph-users] RADOS Bench slow write speed

2015-04-20 Thread Alexandre DERUMIER
Hi, for writes, ceph writes twice to the disk, once for the journal and once for the data (so half the write bandwidth), and the journal is written with O_DSYNC (you should test your disk with fio --sync=1 to compare). That's why the recommendation is to use ssds for the journal disks. - Original Message - From:
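
A minimal sketch of the fio comparison suggested above (the file path and size are placeholders; run it once without --sync=1 and once with it to see the synchronous-write penalty):

    fio --name=dsync-test --filename=/mnt/ssd/fio.tmp --size=1G \
        --rw=write --bs=4k --iodepth=1 --numjobs=1 --direct=1 --sync=1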

Re: [ceph-users] Possible improvements for a slow write speed (excluding independent SSD journals)

2015-04-20 Thread Alexandre DERUMIER
Hi, I'm currently benching a full ssd setup (I haven't finished yet), but with 4 osds, intel s3500 ssds (replication x1): with randwrite 4M I'm around 550MB/s, and with random 4K I'm around 4iops (1iops per osd; the limit is the disk's o_dsync write speed). This is with hammer. - Mail

Re: [ceph-users] strange benchmark problem : restarting osd daemon improve performance from 100k iops to 300k iops

2015-04-22 Thread Alexandre DERUMIER
[mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Mark Nelson Sent: Wednesday, April 22, 2015 7:31 PM To: Alexandre DERUMIER; Milosz Tanski Cc: ceph-devel; ceph-users Subject: Re: [ceph-users] strange benchmark problem : restarting osd daemon improve performance from 100k iops to 300k iops

Re: [ceph-users] strange benchmark problem : restarting osd daemon improve performance from 100k iops to 300k iops

2015-04-22 Thread Alexandre DERUMIER
: this is one of the reasons I like small nodes with single sockets and fewer OSDs. Mark On 04/22/2015 08:56 AM, Alexandre DERUMIER wrote: Hi, I have done a lot of test today, and it seem indeed numa related. My numastat was # numastat node0 node1 numa_hit 99075422 153976877 numa_miss

Re: [ceph-users] strange benchmark problem : restarting osd daemon improve performance from 100k iops to 300k iops

2015-04-22 Thread Alexandre DERUMIER
from 100k iops to 300k iops On Wed, Apr 22, 2015 at 5:01 AM, Alexandre DERUMIER aderum...@odiso.com wrote: I wonder if it could be numa related, I'm using centos 7.1, and auto numa balacning is enabled cat /proc/sys/kernel/numa_balancing = 1 Maybe osd daemon access to buffer on wrong

Re: [ceph-users] strange benchmark problem : restarting osd daemon improve performance from 100k iops to 300k iops

2015-04-24 Thread Alexandre DERUMIER
it a shot. :D Mark On 04/24/2015 12:36 PM, Stefan Priebe - Profihost AG wrote: Is jemalloc recommanded in general? Does it also work for firefly? Stefan Excuse my typo sent from my mobile phone. Am 24.04.2015 um 18:38 schrieb Alexandre DERUMIER aderum...@odiso.com mailto:aderum

Re: [ceph-users] strange benchmark problem : restarting osd daemon improve performance from 100k iops to 300k iops

2015-04-24 Thread Alexandre DERUMIER
. Thanks Regards Somnath -Original Message- From: ceph-users [mailto: ceph-users-boun...@lists.ceph.com ] On Behalf Of Alexandre DERUMIER Sent: Thursday, April 23, 2015 4:56 AM To: Mark Nelson Cc: ceph-users; ceph-devel; Milosz Tanski Subject: Re: [ceph-users] strange benchmark problem

Re: [ceph-users] Possible improvements for a slow write speed (excluding independent SSD journals)

2015-04-25 Thread Alexandre DERUMIER
I'm able to reach around 2-25000 iops with 4k blocks with the s3500 (with o_dsync) (so yes, around 80-100MB/s). I'll bench the new s3610 soon to compare. - Original Message - From: Anthony Levesque aleves...@gtcomm.net To: Christian Balzer ch...@gol.com Cc: ceph-users ceph-users@lists.ceph.com

Re: [ceph-users] strange benchmark problem : restarting osd daemon improve performance from 100k iops to 300k iops

2015-04-23 Thread Alexandre DERUMIER
, Srinivas -Original Message- From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Mark Nelson Sent: Wednesday, April 22, 2015 7:31 PM To: Alexandre DERUMIER; Milosz Tanski Cc: ceph-devel; ceph-users Subject: Re: [ceph-users] strange benchmark problem

Re: [ceph-users] strange benchmark problem : restarting osd daemon improve performance from 100k iops to 300k iops

2015-04-23 Thread Alexandre DERUMIER
[mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Mark Nelson Sent: Wednesday, April 22, 2015 7:31 PM To: Alexandre DERUMIER; Milosz Tanski Cc: ceph-devel; ceph-users Subject: Re: [ceph-users] strange benchmark problem : restarting osd daemon improve performance from 100k iops to 300k

Re: [ceph-users] strange benchmark problem : restarting osd daemon improve performance from 100k iops to 300k iops

2015-04-23 Thread Alexandre DERUMIER
how it does. In some ways I'm glad it turned out not to be NUMA. I still suspect we will have to deal with it at some point, but perhaps not today. ;) Mark On 04/23/2015 05:58 AM, Alexandre DERUMIER wrote: Maybe it's tcmalloc related I thinked to have patched it correctly, but perf show

[ceph-users] strange benchmark problem : restarting osd daemon improve performance from 100k iops to 300k iops

2015-04-22 Thread Alexandre DERUMIER
Hi, I was doing some benchmarks and I found a strange behaviour. Using fio with the rbd engine, I was able to reach around 100k iops (osd data in the linux buffer cache, iostat shows 0% disk access). Then, after restarting all the osd daemons, the same fio benchmark now shows around 300k iops. (osd data in
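
For reference, a sketch of a fio job using the rbd engine like the one described (pool, image and client names are placeholder assumptions):

    fio --name=rbd-randread --ioengine=rbd --clientname=admin \
        --pool=rbd --rbdname=bench-image --rw=randread --bs=4k \
        --iodepth=32 --numjobs=1 --runtime=60 --time_based --direct=1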

Re: [ceph-users] strange benchmark problem : restarting osd daemon improve performance from 100k iops to 300k iops

2015-04-22 Thread Alexandre DERUMIER
I wonder if it could be numa related. I'm using centos 7.1, and auto numa balancing is enabled: cat /proc/sys/kernel/numa_balancing = 1 Maybe the osd daemon accesses its buffers on the wrong numa node. I'll try to reproduce the problem. - Original Message - From: aderumier aderum...@odiso.com To:
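
A few commands that can help check this hypothesis (a sketch only; the node numbers are examples):

    cat /proc/sys/kernel/numa_balancing        # 1 = automatic NUMA balancing enabled
    echo 0 > /proc/sys/kernel/numa_balancing   # temporarily disable it for a test
    numastat                                   # per-node numa_hit / numa_miss counters
    numactl --cpunodebind=0 --membind=0 <command>   # pin a process to one node for comparison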

Re: [ceph-users] strange benchmark problem : restarting osd daemon improve performance from 100k iops to 300k iops

2015-04-27 Thread Alexandre DERUMIER
) Thanks in advance, JV On Sun, Apr 26, 2015 at 10:46 PM, Alexandre DERUMIER aderum...@odiso.com wrote: I'll retest tcmalloc, because I was prety sure to have patched it correctly. Ok, I really think I have patched tcmalloc wrongly. I have repatched it, reinstalled it, and now I'm getting

Re: [ceph-users] strange benchmark problem : restarting osd daemon improve performance from 100k iops to 300k iops

2015-04-26 Thread Alexandre DERUMIER
it a shot. :D Mark On 04/24/2015 12:36 PM, Stefan Priebe - Profihost AG wrote: Is jemalloc recommanded in general? Does it also work for firefly? Stefan Excuse my typo sent from my mobile phone. Am 24.04.2015 um 18:38 schrieb Alexandre DERUMIER aderum...@odiso.com mailto:aderum

Re: [ceph-users] strange benchmark problem : restarting osd daemon improve performance from 100k iops to 300k iops

2015-04-26 Thread Alexandre DERUMIER
Alexandre DERUMIER aderum...@odiso.com mailto:aderum...@odiso.com: Hi, I have finished to rebuild ceph with jemalloc, all seem to working fine. I got a constant 300k iops for the moment, so no speed regression. I'll do more long benchmark next week. Regards, Alexandre

Re: [ceph-users] Calamari server not working after upgrade 0.87-1 - 0.94-1

2015-04-27 Thread Alexandre DERUMIER
Hi, can you check /var/log/salt/minion on your ceph node? I have had a similar problem; I needed to remove the old key: rm /etc/salt/pki/minion/minion_master.pub then /etc/init.d/salt-minion restart (I don't know if calamari-ctl clear changes the salt master key) - Original Message - From:

Re: [ceph-users] strange benchmark problem : restarting osd daemon improve performance from 100k iops to 300k iops

2015-04-27 Thread Alexandre DERUMIER
conclusions about jemalloc vs tcmalloc until we can figure out what went wrong. Mark On 04/27/2015 12:46 AM, Alexandre DERUMIER wrote: I'll retest tcmalloc, because I was prety sure to have patched it correctly. Ok, I really think I have patched tcmalloc wrongly. I have repatched

Re: [ceph-users] strange benchmark problem : restarting osd daemon improve performance from 100k iops to 300k iops

2015-04-27 Thread Alexandre DERUMIER
Avril 2015 18:34:50 Objet: Re: [ceph-users] strange benchmark problem : restarting osd daemon improve performance from 100k iops to 300k iops On 04/27/2015 10:11 AM, Alexandre DERUMIER wrote: Is it possible that you were suffering from the bug during the first test but once reinstalled you

Re: [ceph-users] xfs corruption, data disaster!

2015-05-04 Thread Alexandre DERUMIER
Maybe this could help to repair the pgs? http://www.sebastien-han.fr/blog/2015/04/27/ceph-manually-repair-object/ (6 disks failing at the same time seems pretty strange. Do you have some kind of writeback cache enabled on these disks?) - Original Message - From: Yujian Peng

Re: [ceph-users] How to backup hundreds or thousands of TB

2015-05-06 Thread Alexandre DERUMIER
For the moment, you can use snapshots for backup https://ceph.com/community/blog/tag/backup/ I think that async mirroring is on the roadmap https://wiki.ceph.com/Planning/Blueprints/Hammer/RBD%3A_Mirroring If you use qemu, you can do a qemu full backup. (qemu incremental backup is coming for qemu
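
A minimal sketch of the snapshot-based approach (pool, image and snapshot names are placeholders):

    rbd snap create rbd/vm-disk@backup-20150506
    rbd export rbd/vm-disk@backup-20150506 /backup/vm-disk-20150506.img
    # incremental export against an earlier snapshot
    rbd export-diff --from-snap backup-20150505 rbd/vm-disk@backup-20150506 /backup/vm-disk-20150506.diff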

Re: [ceph-users] Networking question

2015-05-07 Thread Alexandre DERUMIER
Hi, if I have two networks (public and cluster network) and one link in the public network is broken (the cluster network is fine), what will I see in my cluster? See http://ceph.com/docs/master/rados/configuration/network-config-ref/ Only osd-to-osd traffic uses the private network, so if the public

Re: [ceph-users] osd does not start when object store is set to newstore

2015-05-11 Thread Alexandre DERUMIER
I tried searching on the internet and could not find an el7 package with the liburcu-bp.la file; let me know which rpm package has this libtool archive. Hi, maybe you can try ./install-deps.sh to install the needed dependencies. - Original Message - From: Srikanth Madugundi

Re: [ceph-users] Turning on rbd cache safely

2015-05-05 Thread Alexandre DERUMIER
Hi, rbd_cache is a client-side config only, so there is no need to restart the osds. If you set cache=writeback in libvirt, it'll enable it, so you don't need to set rbd_cache=true in ceph.conf (it should override it). You can verify it is enabled by doing a sequential write benchmark with 4k blocks; you should have a
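
A sketch of the corresponding libvirt disk definition (pool, image, monitor address and secret uuid are placeholder assumptions):

    <disk type='network' device='disk'>
      <driver name='qemu' type='raw' cache='writeback'/>
      <auth username='libvirt'>
        <secret type='ceph' uuid='REPLACE-WITH-SECRET-UUID'/>
      </auth>
      <source protocol='rbd' name='rbd/vm-disk'>
        <host name='10.0.0.1' port='6789'/>
      </source>
      <target dev='vda' bus='virtio'/>
    </disk>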

Re: [ceph-users] Debian Jessie packages?

2015-05-12 Thread Alexandre DERUMIER
Hi, as Debian Jessie has already been released for some time, I'd like to ask: are there any plans to build newer Ceph packages for it? Yes, it's planned; I'm currently helping to create images for the build integration platform. If you want, I have built ceph packages for jessie:

Re: [ceph-users] New Calamari server

2015-05-12 Thread Alexandre DERUMIER
Hi, when you removed salt from the nodes, did you remove the old master key /etc/salt/pki/minion/minion_master.pub? I had the same behavior as you when reinstalling the calamari server with salt previously installed on the ceph nodes (with an explicit error about the key in

Re: [ceph-users] very different performance on two volumes in the same pool #2

2015-05-11 Thread Alexandre DERUMIER
Hi, I'm currently doing benchmarks too, and I don't see this behavior. I get very nice performance of up to 200k IOPS. However, once the volume is written to (i.e. when I map it using rbd map and dd the whole volume with some random data) and I repeat the benchmark, random performance drops to ~23k IOPS.

[ceph-users] ceph-deploy : systemd unit files not deployed to a centos7 nodes

2015-04-17 Thread Alexandre DERUMIER
Hi, I'm currently trying to deploy a new ceph test cluster on centos7 (hammer) from ceph-deploy (on a debian wheezy), and it seems that the systemd unit files are not deployed. It seems that the ceph git repo has the systemd unit files https://github.com/ceph/ceph/tree/hammer/systemd I haven't looked inside the rpm

Re: [ceph-users] Network redundancy pro and cons, best practice, suggestions?

2015-04-13 Thread Alexandre DERUMIER
So what would you suggest, what are your experiences? Hi, you can have a look at the mellanox sx1012 for example http://www.mellanox.com/page/products_dyn?product_family=163 12 ports of 40GbE for around 4000€; you can use breakout cables to get 4x12 10GbE ports. They can be stacked with mlag and lacp

[ceph-users] full ssd setup preliminary hammer bench

2015-04-17 Thread Alexandre DERUMIER
Hi Mark, I finally got the hardware for my production full ssd cluster. Here is a first preliminary bench (1 osd). I got around 45K iops with randread 4K on a small 10GB rbd volume. I'm pretty happy because I no longer see a huge cpu difference between krbd and librbd. In my previous bench I was

Re: [ceph-users] full ssd setup preliminary hammer bench

2015-04-17 Thread Alexandre DERUMIER
Kozanecki | Linux Administrator | E: mkozane...@evertz.com -Original Message- From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Alexandre DERUMIER Sent: April-17-15 11:38 AM To: Mark Nelson; ceph-users Subject: [ceph-users] full ssd setup preliminary hammer

Re: [ceph-users] full ssd setup preliminary hammer bench

2015-04-17 Thread Alexandre DERUMIER
...@redhat.com, ceph-users ceph-users@lists.ceph.com Envoyé: Vendredi 17 Avril 2015 20:57:42 Objet: Re: [ceph-users] full ssd setup preliminary hammer bench Am 17.04.2015 um 17:37 schrieb Alexandre DERUMIER: Hi Mark, I finally got my hardware for my production full ssd cluster. Here a first

Re: [ceph-users] Cores/Memory/GHz recommendation for SSD based OSD servers

2015-04-02 Thread Alexandre DERUMIER
interfaces per node. Does that not limit the thgouthput and hence the load on the CPUs? -Sreenath On 4/2/15, Alexandre DERUMIER aderum...@odiso.com wrote: Hi, with HP SL4540 server? this model http://www8.hp.com/h20195/v2/GetDocument.aspx?docname=c04128155 ? (60

Re: [ceph-users] full ssd setup preliminary hammer bench

2015-04-19 Thread Alexandre DERUMIER
: [ceph-users] full ssd setup preliminary hammer bench Am 18.04.2015 um 07:24 schrieb Alexandre DERUMIER aderum...@odiso.com: any idea whether this might be the tcmalloc bug? I still don't known if centos/redhat packages have also the bug or not. gperftools.x86_64 2.1-1.el7 From

Re: [ceph-users] rbd cache + libvirt

2015-06-08 Thread Alexandre DERUMIER
Hi, looking at the latest version of QEMU, it seems that it has already behaved this way since the addition of rbd_cache parsing in rbd.c by Josh in 2012

Re: [ceph-users] rbd cache + libvirt

2015-06-09 Thread Alexandre DERUMIER
-users] rbd cache + libvirt On Tue, Jun 9, 2015 at 7:59 AM, Alexandre DERUMIER aderum...@odiso.com wrote: host conf : rbd_cache=true : guest cache=none : result : cache (wrong) Thanks Alexandre, so you are confirming that this exact case misbehaves

Re: [ceph-users] rbd_cache, limiting read on high iops around 40k

2015-06-09 Thread Alexandre DERUMIER
to performance. It would nice to route cause the problem if that is the case. On Tue, Jun 9, 2015 at 11:21 AM, Alexandre DERUMIER aderum...@odiso.com wrote: Hi, I'm doing benchmark (ceph master branch), with randread 4k qdepth=32, and rbd_cache=true seem to limit the iops around 40k no cache
