Re: [ceph-users] Consumer-grade SSD in Ceph

2019-12-19 Thread Udo Lembke
Hi, if you add an SSD with a short lifetime on more than one server, you can run into real trouble (data loss)! Even if all other SSDs are enterprise grade. Ceph mixes all data in PGs, which are spread over many disks - if one disk fails - no problem, but if the next two fail after that due to high io
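
As a rough way to keep an eye on remaining SSD lifetime (illustrative only - the exact SMART attribute names differ per vendor and model):
  smartctl -A /dev/sdX | egrep -i 'wear|wearout|percent'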

Re: [ceph-users] Broken Ceph Cluster when adding new one - Proxmox 5.0 & Ceph Luminous

2017-07-16 Thread Udo Lembke
Hi, On 16.07.2017 15:04, Phil Schwarz wrote: > ... > Same result, the OSD is known by the node, but not by the cluster. > ... Firewall? Or a mismatch in /etc/hosts or DNS?? Udo

Re: [ceph-users] Broken Ceph Cluster when adding new one - Proxmox 5.0 & Ceph Luminous

2017-07-15 Thread Udo Lembke
Hi, On 15.07.2017 16:01, Phil Schwarz wrote: > Hi, > ... > > While investigating, I wondered about my config: > Question relative to the /etc/hosts file: > Should I use private_replication_LAN IPs or public ones? private_replication_LAN!! And the pve-cluster should use another network (NICs) if

Re: [ceph-users] Re-weight Entire Cluster?

2017-05-29 Thread Udo Lembke
Hi Mike, On 30.05.2017 01:49, Mike Cave wrote: > > Greetings All, > > > > I recently started working with our ceph cluster here and have been > reading about weighting. > > > > It appears the current best practice is to weight each OSD according > to its size (3.64 for 4TB drive, 7.45 for
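
For reference, setting the CRUSH weight of an OSD to match its size in TB would look something like this (the OSD id and value here are just examples):
  ceph osd crush reweight osd.7 3.64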

Re: [ceph-users] How to think a two different disk's technologies architecture

2017-03-23 Thread Udo Lembke
Hi, ceph speeds up with more nodes and more OSDs - so go for 6 nodes with mixed SSD+SATA. Udo On 23.03.2017 18:55, Alejandro Comisario wrote: > Hi everyone! > I have to install a ceph cluster (6 nodes) with two "flavors" of > disks, 3 servers with SSD and 3 servers with SATA. > > Y will purchase

Re: [ceph-users] Upgrading 2K OSDs from Hammer to Jewel. Our experience

2017-03-11 Thread Udo Lembke
Hi, thanks for the useful info. On 11.03.2017 12:21, cephmailingl...@mosibi.nl wrote: > > Hello list, > > A week ago we upgraded our Ceph clusters from Hammer to Jewel and with > this email we want to share our experiences. > > ... > > > e) find /var/lib/ceph/ ! -uid 64045 -print0|xargs
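
The ownership fix referenced above follows the Jewel/Infernalis release notes; a minimal sketch (run on each node with the daemons stopped - stop them however your init system does it):
  chown -R ceph:ceph /var/lib/ceph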

Re: [ceph-users] Testing a node by fio - strange results to me

2017-01-22 Thread Udo Lembke
Hi, I don't use MDS, but I think it's the same as with RBD - the read data are cached on the OSD nodes. The 4MB chunks of the 3G file fit completely in the cache, the others do not. Udo On 18.01.2017 07:50, Ahmed Khuraidah wrote: > Hello community, > > I need your help to understand a little
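
To rule out the OSD-node page cache when repeating such a read test, one (illustrative) approach is to drop the caches on every OSD node between runs:
  sync
  echo 3 > /proc/sys/vm/drop_caches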

Re: [ceph-users] Why would "osd marked itself down" will not recognised?

2017-01-12 Thread Udo Lembke
Hi Sam, the web frontend of an external ceph-dash was interrupted until the node was up again. The reboot took approx. 5 min. But the ceph -w output showed some IO much sooner. I will look at the output again tomorrow and create a ticket. Thanks Udo On 12.01.2017 20:02, Samuel Just wrote: >

Re: [ceph-users] Ceph performance is too good (impossible..)...

2016-12-11 Thread Udo Lembke
Hi, but I assume you are also measuring cache in this scenario - the OSD nodes have cached the writes in the file buffer (because of this the latency should be very small). Udo On 12.12.2016 03:00, V Plus wrote: > Thanks Somnath! > As you recommended, I executed: > dd if=/dev/zero bs=1M count=4096 of=/dev/rbd0
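
A sketch of dd invocations that bypass or at least flush the client-side page cache, so the numbers are closer to the real device speed (the target path is just an example):
  dd if=/dev/zero of=/dev/rbd0 bs=1M count=4096 oflag=direct
  dd if=/dev/zero of=/dev/rbd0 bs=1M count=4096 conv=fdatasync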

Re: [ceph-users] 10.2.4 Jewel released

2016-12-09 Thread Udo Lembke
Hi, unfortunately there are no Debian Jessie packages... I didn't know that a recompile takes such a long time for ceph... I think such an important fix should hit the repos faster. Udo On 09.12.2016 18:54, Francois Lafont wrote: > On 12/09/2016 06:39 PM, Alex Evonosky wrote: > >> Sounds

Re: [ceph-users] Help needed ! cluster unstable after upgrade from Hammer to Jewel

2016-11-16 Thread Udo Lembke
Hi, On 16.11.2016 19:01, Vincent Godin wrote: > Hello, > > We now have a full cluster (Mon, OSD & Clients) in jewel 10.2.2 > (initial was hammer 0.94.5) but we have still some big problems on our > production environment : > > * some ceph filesystem are not mounted at startup and we have to >

Re: [ceph-users] Need help! Ceph backfill_toofull and recovery_wait+degraded

2016-11-01 Thread Udo Lembke
Hi again, and change the value with something like this ceph tell osd.* injectargs '--mon_osd_full_ratio 0.96' Udo On 01.11.2016 21:16, Udo Lembke wrote: > Hi Marcus, > > for a fast help you can perhaps increase the mon_osd_full_ratio? > > What values do you have? > Plea

Re: [ceph-users] Need help! Ceph backfill_toofull and recovery_wait+degraded

2016-11-01 Thread Udo Lembke
Hi Marcus, for a fast help you can perhaps increase the mon_osd_full_ratio? What values do you have? Please post the output of (on host ceph1, because osd.0.asok) ceph --admin-daemon /var/run/ceph/ceph-osd.0.asok config show | grep full_ratio after that it would be helpful to use on all hosts

Re: [ceph-users] multiple journals on SSD

2016-07-12 Thread Udo Lembke
Hi Vincent, On 12.07.2016 15:03, Vincent Godin wrote: > Hello. > > I've been testing Intel 3500 as journal store for few HDD-based OSD. I > stumble on issues with multiple partitions (>4) and UDEV (sda5, sda6,etc > sometime do not appear after partition creation). And I'm thinking that >

Re: [ceph-users] ceph storage capacity does not free when deleting contents from RBD volumes

2016-05-19 Thread Udo Lembke
Hi Albert, to free unused space you must enable trim (or do an fstrim) in the VM - and all parts of the storage chain must support this. The normal virtio driver doesn't support trim, but if you use SCSI disks with the virtio-scsi driver you can use it. Works well but needs some time for huge
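
A minimal sketch of the two pieces involved, assuming a qemu/KVM guest (the pool/image and mountpoint names are examples, and most management tools expose the same settings through their own config): the disk is attached with discard enabled, and the guest then trims the filesystem.
  # host side: qemu drive fragment with discard enabled (attach it to a virtio-scsi device)
  -drive file=rbd:pool/vm-disk,if=none,id=drive-scsi0,discard=unmap,cache=writeback
  # guest side: trim the mounted filesystem
  fstrim -v /mountpoint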

Re: [ceph-users] Slow read on RBD mount, Hammer 0.94.5

2016-04-25 Thread Udo Lembke
Hi Mike, On 21.04.2016 at 15:20, Mike Miller wrote: Hi Udo, thanks, just to make sure, further increased the readahead: $ sudo blockdev --getra /dev/rbd0 1048576 $ cat /sys/block/rbd0/queue/read_ahead_kb 524288 No difference here. The first one is sectors (512 bytes), the second one KB. oops,
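
In other words, the two values above describe the same setting in different units (512-byte sectors vs KB); setting it explicitly would look like this (the size itself is illustrative):
  blockdev --setra 1048576 /dev/rbd0                    # 1048576 sectors * 512 B = 512 MB
  echo 524288 > /sys/block/rbd0/queue/read_ahead_kb     # the same 512 MB expressed in KB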

Re: [ceph-users] Slow read on RBD mount, Hammer 0.94.5

2016-04-21 Thread Udo Lembke
Hi Mike, On 21.04.2016 at 09:07, Mike Miller wrote: Hi Nick and Udo, thanks, very helpful, I tweaked some of the config parameters along the lines Udo suggests, but still only some 80 MB/s or so. this means you have reached factor 3 (this is roughly the value I see with a single thread on

Re: [ceph-users] Howto reduce the impact from cephx with small IO

2016-04-21 Thread Udo Lembke
these tests: http://www.spinics.net/lists/ceph-devel/msg22416.html Mark On 04/20/2016 11:50 AM, Udo Lembke wrote: Hi, on an small test-system (3 nodes (mon + osd), 6 OSDs, ceph 0.94.6) I compare with and without cephx. I use fio for that inside an VM on an host, outside the 3 ceph-nodes

[ceph-users] Howto reduce the impact from cephx with small IO

2016-04-20 Thread Udo Lembke
Hi, on a small test system (3 nodes (mon + osd), 6 OSDs, ceph 0.94.6) I compared with and without cephx. I used fio for that inside a VM on a host outside the 3 ceph nodes, with this command: fio --max-jobs=1 --numjobs=1 --readwrite=read --blocksize=4k --size=4G --direct=1 --name=fiojob_4k

Re: [ceph-users] Slow read on RBD mount, Hammer 0.94.5

2016-04-20 Thread Udo Lembke
Hi Mike, I don't have experience with RBD mounts, but I see the same effect with RBD. You can do some tuning to get better results (disable debug and so on). As a hint, some values from a ceph.conf: [osd] debug asok = 0/0 debug auth = 0/0 debug buffer = 0/0 debug client = 0/0
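
A sketch of how such a section might continue in ceph.conf (the exact list of debug subsystems varies by release; these are just common examples, all silenced the same way):
  [osd]
  debug lockdep = 0/0
  debug ms = 0/0
  debug osd = 0/0
  debug filestore = 0/0
  debug journal = 0/0
  debug monc = 0/0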

Re: [ceph-users] Deprecating ext4 support

2016-04-12 Thread Udo Lembke
Hi Sage, we run ext4 only on our 8-node cluster with 110 OSDs and are quite happy with ext4. We started with XFS but the latency was much higher compared to ext4... But we use RBD only with "short" filenames like rbd_data.335986e2ae8944a.000761e1. If we can switch from Jewel to K* and

Re: [ceph-users] v0.94.6 Hammer released

2016-02-25 Thread Udo Lembke
Hi, On 24.02.2016 at 17:27, Alfredo Deza wrote: > On Wed, Feb 24, 2016 at 4:31 AM, Dan van der Ster wrote: >> Thanks Sage, looking forward to some scrub randomization. >> >> Were binaries built for el6? http://download.ceph.com/rpm-hammer/el6/x86_64/ > > We are no longer

Re: [ceph-users] All SSD Pool - Odd Performance

2015-11-22 Thread Udo Lembke
--verify=0 --verify_fatal=0 --numjobs=4 --rw=randwrite --blocksize=4k --group_reporting Udo On 22.11.2015 23:59, Udo Lembke wrote: > Hi Zoltan, > you are right ( but this was two running systems...). > > I see also an big failure: "--filename=/mnt/test.bin" (use simply > c

Re: [ceph-users] All SSD Pool - Odd Performance

2015-11-22 Thread Udo Lembke
ents clean tomorow. Udo On 22.11.2015 14:29, Zoltan Arnold Nagy wrote: > It would have been more interesting if you had tweaked only one option > as now we can’t be sure which changed had what impact… :-) > >> On 22 Nov 2015, at 04:29, Udo Lembke <ulem...@polarzone.de >

Re: [ceph-users] All SSD Pool - Odd Performance

2015-11-21 Thread Udo Lembke
Hi Sean, Haomai is right that qemu can have huge performance differences. I have done two tests against the same ceph cluster (different pools, but this should not make any difference). One test with Proxmox VE 4 (qemu 2.4, iothread for the device, and cache=writeback) gives 14856 iops. The same test with

Re: [ceph-users] two or three replicas?

2015-11-03 Thread Udo Lembke
Hi, for production (with enough OSDs) three replicas is the right choice. The chance of data loss if two OSDs fail at the same time is too high. And if this happens, most of your data is lost, because the data is spread over many OSDs... And yes - two replicas is faster for writes. Udo On
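
For completeness, the replica count and the minimum number of copies needed to serve IO are per-pool settings, e.g. (pool name illustrative):
  ceph osd pool set rbd size 3
  ceph osd pool set rbd min_size 2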

Re: [ceph-users] Network performance

2015-10-22 Thread Udo Lembke
Hi Jonas, you can create a bond over multiple NICs (which modes are possible depends on your switch) to use one IP address but more than one NIC. Udo On 21.10.2015 10:23, Jonas Björklund wrote: > Hello, > > In the configuration I have read about "cluster network" and "cluster addr". > Is it
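
A minimal sketch of such a bond on a Debian-style system (interface names, address and the bond mode are assumptions; the switch must support the chosen mode):
  # /etc/network/interfaces
  auto bond0
  iface bond0 inet static
      address 192.168.0.10
      netmask 255.255.255.0
      bond-slaves eth0 eth1
      bond-mode 802.3ad
      bond-miimon 100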

Re: [ceph-users] v0.94.4 Hammer released

2015-10-20 Thread Udo Lembke
Hi, have you changed the ownership as described in Sage's mail about "v9.1.0 Infernalis release candidate released"? #. Fix the ownership:: chown -R ceph:ceph /var/lib/ceph or set ceph.conf to use root instead? When upgrading, administrators have two options: #. Add

Re: [ceph-users] Cache tier experiences (for ample sized caches ^o^)

2015-10-07 Thread Udo Lembke
Hi Christian, On 07.10.2015 09:04, Christian Balzer wrote: > > ... > > My main suspect for the excessive slowness are actually the Toshiba DT > type drives used. > We only found out after deployment that these can go into a zombie mode > (20% of their usual performance for ~8 hours if not

Re: [ceph-users] [sepia] debian jessie repository ?

2015-09-25 Thread Udo Lembke
Hi, you can use this sources-list cat /etc/apt/sources.list.d/ceph.list deb http://gitbuilder.ceph.com/ceph-deb-jessie-x86_64-basic/ref/v0.94.3 jessie main Udo On 25.09.2015 15:10, Jogi Hofmüller wrote: > Hi, > > Am 2015-09-11 um 13:20 schrieb Florent B: > >> Jessie repository will be available

Re: [ceph-users] Ceph cluster NO read / write performance :: Ops are blocked

2015-09-07 Thread Udo Lembke
Hi Vickey, I had the same rados bench output after changing the motherboard of the monitor node with the lowest IP... Due to the new mainboard, I assume the hw clock was wrong during startup. Ceph health showed no errors, but none of the VMs were able to do IO (very high load on the VMs - but no traffic).

Re: [ceph-users] Storage node refurbishing, a "freeze" OSD feature would be nice

2015-08-31 Thread Udo Lembke
Hi Christian, for my setup "b" takes too long - too much data movement and stress on all nodes. I have simply (with replica 3) "set noout", reinstalled one node (with a new filesystem on the OSDs, but left them in the crushmap) and started all OSDs (on Friday night) - takes approx. less than one day
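
The flag handling itself is just (illustrative):
  ceph osd set noout
  # ... reinstall the node and bring its OSDs back ...
  ceph osd unset noout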

Re: [ceph-users] Different filesystems on OSD hosts at the same cluster

2015-08-07 Thread Udo Lembke
Hi, some time ago I switched all OSDs from XFS to ext4 (step by step). I had no issues during the mixed OSD-format period (the process took some weeks). And yes, for me ext4 also performs better (esp. the latencies). Udo On 07.08.2015 13:31, Межов Игорь Александрович wrote: Hi! We do some

Re: [ceph-users] Different filesystems on OSD hosts at the samecluster

2015-08-07 Thread Udo Lembke
1) the default is relatime which has minimal impact on performance 2) AFAIK some ceph features actually use atime (cache tiering was it?) or at least so I gathered from some bugs I saw Jan On 07 Aug 2015, at 16:30, Udo Lembke ulem...@polarzone.de wrote: Hi, I use the ext4-parameters like

Re: [ceph-users] dropping old distros: el6, precise 12.04, debian wheezy?

2015-07-30 Thread Udo Lembke
Hi, dropping Debian wheezy is quite fast - until now there aren't packages for jessie?! Dropping squeeze I understand, but wheezy at this time? Udo On 30.07.2015 15:54, Sage Weil wrote: As time marches on it becomes increasingly difficult to maintain proper builds and packages for older

Re: [ceph-users] Did maximum performance reached?

2015-07-28 Thread Udo Lembke
Hi, On 28.07.2015 12:02, Shneur Zalman Mattern wrote: Hi! And so, in your math I need to build size = osd, 30 replicas for my cluster of 120TB - to get my demands 30 replicas is the wrong math! Fewer replicas = more speed (because of less writing). More replicas, less speed. For data

Re: [ceph-users] different omap format in one cluster (.sst + .ldb) - new installed OSD-node don't start any OSD

2015-07-23 Thread Udo Lembke
osd? How many osds meet this problem? This assert failure means that the osd detects an upgraded pg meta object but failed to read (or lacks 1 key of) the meta keys from the object. On Thu, Jul 23, 2015 at 7:03 PM, Udo Lembke ulem...@polarzone.de wrote: On 21.07.2015 12:06, Udo Lembke wrote: Hi all

Re: [ceph-users] different omap format in one cluster (.sst + .ldb) - new installed OSD-node don't start any OSD

2015-07-23 Thread Udo Lembke
On 21.07.2015 12:06, Udo Lembke wrote: Hi all, ... Normally I would say, if one OSD node dies, I simply reinstall the OS and ceph and I'm back again... but this looks bad for me. Unfortunately the system also doesn't start 9 OSDs after I switched back to the old system disk... (only three

[ceph-users] different omap format in one cluster (.sst + .ldb) - new installed OSD-node don't start any OSD

2015-07-21 Thread Udo Lembke
Hi all, we had a ceph cluster with 7 OSD nodes (Debian Jessie (because of the patched tcmalloc) with ceph 0.94) which we expanded with one further node. For this node we use puppet with Debian 7.8, because ceph 0.92.2 doesn't install on Jessie (the upgrade to 0.94.1 worked on the other nodes but 0.94.2 looks not

Re: [ceph-users] He8 drives

2015-07-13 Thread Udo Lembke
Hi, I have just expanded our ceph cluster (7 nodes) with one 8TB HGST (changed from 4TB to 8TB) on each node (plus 11 4TB HGST). But I have set the primary affinity to 0 for the 8TB disks... in this case my performance values are not 8TB-disk related. Udo On 08.07.2015 02:28, Blair Bethwaite
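
Setting the primary affinity for such a disk looks like this (the OSD id is an example, and the monitors must allow it via mon_osd_allow_primary_affinity):
  ceph osd primary-affinity osd.84 0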

Re: [ceph-users] How to estimate whether putting a journal on SSD will help with performance?

2015-05-01 Thread Udo Lembke
Hi, On 01.05.2015 10:30, Piotr Wachowicz wrote: Is there any way to confirm (beforehand) that using SSDs for journals will help? yes, an SSD journal helps a lot (if you use the right SSDs) for write speed, and in my experience this also helped (but not too much) with read performance.

Re: [ceph-users] Hammer release data and a Design question

2015-03-27 Thread Udo Lembke
Hi, On 26.03.2015 11:18, 10 minus wrote: Hi, I'm just starting on a small Ceph implementation and wanted to know the release date for Hammer. Will it coincide with the release of Openstack. My Conf: (using 10G and Jumboframes on Centos 7 / RHEL7 ) 3x Mons (VMs) : CPU - 2 Memory - 4G

Re: [ceph-users] Strange osd in PG with new EC-Pool - pgs: 2 active+undersized+degraded

2015-03-26 Thread Udo Lembke
could have specified enough PGs to make it impossible to form PGs out of 84 OSDs (I'm assuming your SSDs are in a separate root) but I have to ask... -don- -Original Message- From: Udo Lembke [mailto:ulem...@polarzone.de] Sent: 25 March, 2015 08:54 To: Don Doerner; ceph-us

[ceph-users] How to see the content of an EC Pool after recreate the SSD-Cache tier?

2015-03-26 Thread Udo Lembke
Hi all, due to a very silly approach, I removed the cache tier of a filled EC pool. After recreating the pool and connecting it with the EC pool I don't see any content. How can I see the rbd_data and other files through the new SSD cache tier? I think that I must recreate the rbd_directory (and fill

Re: [ceph-users] How to see the content of an EC Pool after recreate the SSD-Cache tier?

2015-03-26 Thread Udo Lembke
, is to create new rbd disks and copy all blocks with rados get - file - rados put. The problem is the time it takes (days to weeks for 3 * 16TB)... Udo -Greg On Thu, Mar 26, 2015 at 8:56 AM, Udo Lembke ulem...@polarzone.de wrote: Hi Greg, ok! It looks like my problem is more

Re: [ceph-users] How to see the content of an EC Pool after recreate the SSD-Cache tier?

2015-03-26 Thread Udo Lembke
show up when listing on the cache pool. -Greg On Thu, Mar 26, 2015 at 3:43 AM, Udo Lembke ulem...@polarzone.de wrote: Hi all, due an very silly approach, I removed the cache tier of an filled EC pool. After recreate the pool and connect with the EC pool I don't see any content. How can I see

[ceph-users] Strange osd in PG with new EC-Pool - pgs: 2 active+undersized+degraded

2015-03-25 Thread Udo Lembke
Hi, due to two more hosts (now 7 storage nodes) I want to create a new EC pool and get a strange effect: ceph@admin:~$ ceph health detail HEALTH_WARN 2 pgs degraded; 2 pgs stuck degraded; 2 pgs stuck unclean; 2 pgs stuck undersized; 2 pgs undersized pg 22.3e5 is stuck unclean since forever,

Re: [ceph-users] Strange osd in PG with new EC-Pool - pgs: 2 active+undersized+degraded

2015-03-25 Thread Udo Lembke
300 PGs... Udo On 25.03.2015 14:52, Gregory Farnum wrote: On Wed, Mar 25, 2015 at 1:20 AM, Udo Lembke ulem...@polarzone.de wrote: Hi, due to two more hosts (now 7 storage nodes) I want to create a new EC pool and get a strange effect: ceph@admin:~$ ceph health detail HEALTH_WARN 2 pgs

Re: [ceph-users] Strange osd in PG with new EC-Pool - pgs: 2 active+undersized+degraded

2015-03-25 Thread Udo Lembke
Hi Don, thanks for the info! It looks like choose_tries set to 200 does the trick. But the setcrushmap takes a long, long time (alarming, but the clients still have IO)... hope it finishes soon ;-) Udo On 25.03.2015 16:00, Don Doerner wrote: Assuming you've calculated the number of PGs

[ceph-users] won leader election with quorum during osd setcrushmap

2015-03-25 Thread Udo Lembke
Hi, due to PG trouble with an EC pool I modified the crushmap (step set_choose_tries 200) from rule ec7archiv { ruleset 6 type erasure min_size 3 max_size 20 step set_chooseleaf_tries 5 step take default step chooseleaf indep 0 type host

Re: [ceph-users] Ceph BIG outage : 200+ OSD are down , OSD cannot create thread

2015-03-09 Thread Udo Lembke
Hi Tony, sounds like a good idea! Udo On 09.03.2015 21:55, Tony Harris wrote: I know I'm not even close to this type of a problem yet with my small cluster (both test and production clusters) - but it would be great if something like that could appear in the cluster HEALTHWARN, if Ceph could
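
The usual suspect for "cannot create thread" with hundreds of OSDs per host is the kernel thread/PID limit; a hedged example of raising it (the values are illustrative - persist them in /etc/sysctl.conf if they help):
  sysctl -w kernel.pid_max=4194303
  sysctl -w kernel.threads-max=2097152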

[ceph-users] too few pgs in cache tier

2015-02-27 Thread Udo Lembke
Hi all, we use an EC pool with a small cache tier in front of it for our archive data (4 * 16TB VM disks). The EC pool has k=3;m=2 because we started with 5 nodes and want to migrate to a new EC pool with k=5;m=2. Therefore we migrate one VM disk (16TB) from the ceph cluster to an FC RAID with the

Re: [ceph-users] Power failure recovery woes

2015-02-17 Thread Udo Lembke
Hi Jeff, is the osd /var/lib/ceph/osd/ceph-2 mounted? If not, does it help if you mount the osd and start it with service ceph start osd.2 ?? Udo On 17.02.2015 09:54, Jeff wrote: Hi, We had a nasty power failure yesterday and even with UPS's our small (5 node, 12 OSD) cluster is having

Re: [ceph-users] Placement Groups fail on fresh Ceph cluster installation with all OSDs up and in

2015-02-10 Thread Udo Lembke
Hi, you will get further trouble, because your weight is not correct. You need a weight of at least 0.01 for each OSD. This means your OSDs must be 10GB or greater! Udo On 10.02.2015 12:22, B L wrote: Hi Vickie, My OSD tree looks like this: ceph@ceph-node3:/home/ubuntu$ ceph osd tree #

Re: [ceph-users] Placement Groups fail on fresh Ceph cluster installation with all OSDs up and in

2015-02-10 Thread Udo Lembke
Hi, use: ceph osd crush set 0 0.01 pool=default host=ceph-node1 ceph osd crush set 1 0.01 pool=default host=ceph-node1 ceph osd crush set 2 0.01 pool=default host=ceph-node3 ceph osd crush set 3 0.01 pool=default host=ceph-node3 ceph osd crush set 4 0.01 pool=default host=ceph-node2 ceph osd crush

Re: [ceph-users] erasure code : number of chunks for a small cluster ?

2015-02-06 Thread Udo Lembke
On 06.02.2015 09:06, Hector Martin wrote: On 02/02/15 03:38, Udo Lembke wrote: With 3 hosts only you can't survive a full node failure, because for that you need hosts >= k + m. Sure you can. k=2, m=1 with the failure domain set to host will survive a full host failure. Hi, Alexandre

Re: [ceph-users] command to flush rbd cache?

2015-02-04 Thread Udo Lembke
Hi Dan, I mean qemu-kvm, i.e. librbd. But how can I tell kvm to flush the buffer? Udo On 05.02.2015 07:59, Dan Mick wrote: On 02/04/2015 10:44 PM, Udo Lembke wrote: Hi all, is there any command to flush the rbd cache like 'echo 3 > /proc/sys/vm/drop_caches' for the OS cache? Udo Do you

Re: [ceph-users] command to flush rbd cache?

2015-02-04 Thread Udo Lembke
Hi Josh, thanks for the info. detach/reattach should be fine for me, because it's only for performance testing. #2468 would be fine of course. Udo On 05.02.2015 08:02, Josh Durgin wrote: On 02/05/2015 07:44 AM, Udo Lembke wrote: Hi all, is there any command to flush the rbd cache like

[ceph-users] command to flush rbd cache?

2015-02-04 Thread Udo Lembke
Hi all, is there any command to flush the rbd cache like 'echo 3 > /proc/sys/vm/drop_caches' for the OS cache? Udo

Re: [ceph-users] Ceph Supermicro hardware recommendation

2015-02-04 Thread Udo Lembke
Hi Marco, On 04.02.2015 10:20, Colombo Marco wrote: ... We chose the 6TB disks, because we need a lot of storage in a small number of servers and we prefer servers without too many disks. However we plan to use max 80% of a 6TB disk 80% is too much! You will run into trouble. Ceph

Re: [ceph-users] erasure code : number of chunks for a small cluster ?

2015-02-01 Thread Udo Lembke
Hi Alexandre, nice to meet you here ;-) With 3 hosts only you can't survive a full node failure, because for that you need hosts >= k + m. And k:1 m:2 doesn't make any sense. I started with 5 hosts and use k:3, m:2. In this case two HDDs can fail or one host can be down for maintenance. Udo PS:
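
A sketch of creating such a profile and pool on a Hammer-era cluster (the profile/pool names and PG count are examples; newer releases spell the failure-domain option crush-failure-domain):
  ceph osd erasure-code-profile set ec32 k=3 m=2 ruleset-failure-domain=host
  ceph osd pool create ecpool 1024 1024 erasure ec32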

Re: [ceph-users] OSD capacity variance ?

2015-02-01 Thread Udo Lembke
Hi Howard, I assume it's a typo with 160 + 250 MB. Ceph OSDs must be at least 10GB to get a weight of 0.01. Udo On 31.01.2015 23:39, Howard Thomson wrote: Hi All, I am developing a custom disk storage backend for the Bacula backup system, and am in the process of setting up a trial Ceph system,

Re: [ceph-users] estimate the impact of changing pg_num

2015-02-01 Thread Udo Lembke
Hi Xu, On 01.02.2015 21:39, Xu (Simon) Chen wrote: RBD doesn't work extremely well when ceph is recovering - it is common to see hundreds or a few thousands of blocked requests (30s to finish). This translates high IO wait inside of VMs, and many applications don't deal with this well. this
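
One common way to soften the impact of recovery on client IO (an illustrative example, not necessarily what the rest of this mail suggested) is to throttle backfill and recovery:
  ceph tell osd.* injectargs '--osd-max-backfills 1 --osd-recovery-max-active 1 --osd-recovery-op-priority 1'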

Re: [ceph-users] RBD caching on 4K reads???

2015-01-30 Thread Udo Lembke
Hi Bruce, hmm, that sounds to me like the rbd cache. Can you check whether the cache is really disabled in the running config with ceph --admin-daemon /var/run/ceph/ceph-osd.0.asok config show | grep cache Udo On 30.01.2015 21:51, Bruce McFarland wrote: I have a cluster and have created a rbd device -

Re: [ceph-users] RBD caching on 4K reads???

2015-01-30 Thread Udo Lembke
verify if it’s disabled at the librbd level on the client. If you mean on the storage nodes I’ve had some issues dumping the config. Does the rbd caching occur on the storage nodes, client, or both? *From:*Udo Lembke [mailto:ulem...@polarzone.de] *Sent:* Friday, January 30, 2015 1:00 PM

Re: [ceph-users] Sizing SSD's for ceph

2015-01-29 Thread Udo Lembke
Hi, On 29.01.2015 07:53, Christian Balzer wrote: On Thu, 29 Jan 2015 01:30:41 + Ramakrishna Nishtala (rnishtal) wrote: * Per my understanding once writes are complete to journal then it is read again from the journal before writing to data disk. Does this mean, we have to do,

Re: [ceph-users] slow read-performance inside the vm

2015-01-27 Thread Udo Lembke
Hi Patrik, On 27.01.2015 14:06, Patrik Plank wrote: ... I am really happy, these values above are enough for my little amount of vms. Inside the vms I get now for write 80mb/s and read 130mb/s, with write-cache enabled. But there is one little problem. Are there some tuning

Re: [ceph-users] Better way to use osd's of different size

2015-01-16 Thread Udo Lembke
Hi Megov, you should weight the OSDs so the weight represents the size (like a weight of 3.68 for a 4TB HDD). ceph-deploy does this automatically. Nevertheless, even with the correct weight the disks are not filled in equal distribution. For that purpose you can use reweight for single OSDs, or automatically

[ceph-users] Part 2: ssd osd fails often with FAILED assert(soid scrubber.start || soid = scrubber.end)

2015-01-14 Thread Udo Lembke
Hi again, sorry for not threading, but my last email didn't come back on the mailing list (it often misses some posts!). Just after sending the last mail, another SSD failed for the first time - in this case a cheap one, but with the same error: root@ceph-04:/var/log/ceph# more ceph-osd.62.log 2015-01-13

[ceph-users] ssd osd fails often with FAILED assert(soid scrubber.start || soid = scrubber.end)

2015-01-13 Thread Udo Lembke
Hi, since last Thursday we have had an SSD pool (cache tier) in front of an EC pool and have been filling the pools with data via rsync (approx. 50MB/s). The SSD pool has three disks and one of them (a DC S3700) has failed four times since then. I simply start the OSD again and the pool gets rebuilt and works again for

Re: [ceph-users] backfill_toofull, but OSDs not full

2015-01-09 Thread Udo Lembke
Hi, I had a similar effect two weeks ago - 1 PG backfill_toofull, and after reweighting and deleting there was enough free space, but the rebuild process stopped after a while. After stopping and starting ceph on the second node, the rebuild process ran without trouble and the backfill_toofull was gone.

Re: [ceph-users] Improving Performance with more OSD's?

2015-01-04 Thread Udo Lembke
Hi Lindsay, On 05.01.2015 06:52, Lindsay Mathieson wrote: ... So two OSD Nodes had: - Samsung 840 EVO SSD for Op. Sys. - Intel 530 SSD for Journals (10GB Per OSD) - 3TB WD Red - 1 TB WD Blue - 1 TB WD Blue - Each disk weighted at 1.0 - Primary affinity of the WD Red (slow) set to 0 the

Re: [ceph-users] v0.90 released

2014-12-23 Thread Udo Lembke
Hi Sage, On 23.12.2014 15:39, Sage Weil wrote: ... You can't reduce the PG count without creating new (smaller) pools and migrating data. does this also work with the pool metadata, or is this pool essential for ceph? Udo

Re: [ceph-users] Any Good Ceph Web Interfaces?

2014-12-23 Thread Udo Lembke
Hi, for monitoring only I use the Ceph Dashboard https://github.com/Crapworks/ceph-dash/ For me it's a nice tool for a good overview - for administration I use the CLI. Udo On 23.12.2014 01:11, Tony wrote: Please don't mention calamari :-) The best web interface for ceph that actually

Re: [ceph-users] How to see which crush tunables are active in a ceph-cluster?

2014-12-20 Thread Udo Lembke
a chooseleaf_vary_r 1 (from 0) takes roughly the same time to finish?? Regards Udo On 04.12.2014 14:09, Udo Lembke wrote: Hi, to answer myself. With ceph osd crush show-tunables I see a little bit more, but I don't know how far away from the firefly tunables I am on the production cluster

Re: [ceph-users] How to see which crush tunables are active in a ceph-cluster?

2014-12-20 Thread Udo Lembke
at the same time. On Sat, Dec 20, 2014 at 3:26 AM, Udo Lembke ulem...@polarzone.de wrote: Hi, for the information of other cephers... I switched from unknown crush tunables to firefly and it took 6 hours (30.853% degradation) to finish on our

Re: [ceph-users] Help with SSDs

2014-12-18 Thread Udo Lembke
Hi Mark, On 18.12.2014 07:15, Mark Kirkwood wrote: While you can't do much about the endurance lifetime being a bit low, you could possibly improve performance using a journal *file* that is located on the 840's (you'll need to symlink it - disclaimer - have not tried this myself, but will

[ceph-users] Any tuning of LVM-Storage inside an VM related to ceph?

2014-12-18 Thread Udo Lembke
Hi all, I have some fileservers with insufficient read speed. Enabling read-ahead inside the VM improves the read speed, but it looks like this has a drawback during LVM operations like pvmove. For test purposes, I moved the LVM storage inside a VM from vdb to vdc1. It takes days, because it's

Re: [ceph-users] Reproducable Data Corruption with cephfs kernel driver

2014-12-18 Thread Udo Lembke
Hi Lindsay, have you tried the different cache options (no cache, write through, ...) which Proxmox offers for the drive? Udo On 18.12.2014 05:52, Lindsay Mathieson wrote: I've been experimenting with CephFS for running KVM images (proxmox). cephfs fuse version - 0.87 cephfs kernel module

Re: [ceph-users] Help with SSDs

2014-12-17 Thread Udo Lembke
Hi Mikaël, I have EVOs too, what do you mean by not playing well with D_SYNC? Is there something I can test on my side to compare results with you, as I have mine flashed? http://www.sebastien-han.fr/blog/2014/10/10/ceph-how-to-test-if-your-ssd-is-suitable-as-a-journal-device/ describes how
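
The linked article boils down to a small dd test with O_DIRECT and D_SYNC; a sketch (the device name is an example - this writes to it, so use a scratch disk or partition):
  dd if=/dev/zero of=/dev/sdX bs=4k count=100000 oflag=direct,dsync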

Re: [ceph-users] Multiple issues :( Ubuntu 14.04, latest Ceph

2014-12-15 Thread Udo Lembke
Hi Benjamin, On 15.12.2014 03:31, Benjamin wrote: Hey there, I've set up a small VirtualBox cluster of Ceph VMs. I have one ceph-admin0 node, and three ceph0,ceph1,ceph2 nodes for a total of 4. I've been following this guide: http://ceph.com/docs/master/start/quick-ceph-deploy/ to the

Re: [ceph-users] Multiple issues :( Ubuntu 14.04, latest Ceph

2014-12-15 Thread Udo Lembke
Hi, see here: https://www.mail-archive.com/ceph-users@lists.ceph.com/msg15546.html Udo On 16.12.2014 05:39, Benjamin wrote: I increased the OSDs to 10.5GB each and now I have a different issue... cephy@ceph-admin0:~/ceph-cluster$ echo {Test-data} testfile.txt

[ceph-users] For all LSI SAS9201-16i user - don't upgrate to firmware P20

2014-12-11 Thread Udo Lembke
Hi all, I have upgraded two LSI SAS9201-16i HBAs to the latest firmware P20.00.00 and after that I got the following syslog messages: Dec 9 18:11:31 ceph-03 kernel: [ 484.602834] mpt2sas0: log_info(0x3108): originator(PL), code(0x08), sub_code(0x) Dec 9 18:12:15 ceph-03 kernel: [

Re: [ceph-users] Old OSDs on new host, treated as new?

2014-12-05 Thread Udo Lembke
Hi, perhaps a stupid question, but why did you change the hostname? I haven't tried it, but I guess if you boot the node with a new hostname, the old hostname is still in the crush map, but without any OSDs - because they are on the new host. Don't know (I guess not) if the degradation level stays at 5% if

[ceph-users] How to see which crush tunables are active in a ceph-cluster?

2014-12-01 Thread Udo Lembke
Hi all, http://ceph.com/docs/master/rados/operations/crush-map/#crush-tunables describes how to set the tunables to legacy, argonaut, bobtail, firefly or optimal. But how can I see which profile is active in a ceph cluster? With ceph osd getcrushmap I don't get really much info (only tunable
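
As it turns out later in this thread, the closest thing to an answer is the following command (interpreting its output still needs some care across releases):
  ceph osd crush show-tunables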

Re: [ceph-users] Typical 10GbE latency

2014-11-06 Thread Udo Lembke
Hi, from one host to five OSD-hosts. NIC Intel 82599EB; jumbo-frames; single Switch IBM G8124 (blade network). rtt min/avg/max/mdev = 0.075/0.114/0.231/0.037 ms rtt min/avg/max/mdev = 0.088/0.164/0.739/0.072 ms rtt min/avg/max/mdev = 0.081/0.141/0.229/0.030 ms rtt min/avg/max/mdev =

Re: [ceph-users] Typical 10GbE latency

2014-11-06 Thread Udo Lembke
on the host? Thanks. Thu Nov 06 2014 at 16:57:36, Udo Lembke ulem...@polarzone.de: Hi, from one host to five OSD-hosts. NIC Intel 82599EB; jumbo-frames; single Switch IBM G8124 (blade network). rtt min/avg/max/mdev = 0.075/0.114/0.231/0.037 ms

Re: [ceph-users] question about activate OSD

2014-10-31 Thread Udo Lembke
Hi German, if I'm right, the journal creation on /dev/sdc1 failed (perhaps because you only specified /dev/sdc instead of /dev/sdc1?). Do you have partitions on sdc? Udo On 31.10.2014 22:02, German Anders wrote: Hi all, I'm having some issues while trying to activate a new osd in a new

Re: [ceph-users] Replacing a disk: Best practices?

2014-10-16 Thread Udo Lembke
On 15.10.2014 22:08, Iban Cabrillo wrote: HI Cephers, I have another question related to this issue: what would be the procedure to restore a failed server (a whole server, for example due to motherboard trouble with no damage on the disks)? Regards, I Hi, - change the server board. -

Re: [ceph-users] [PG] Slow request *** seconds old,v4 currently waiting for pg to exist locally

2014-09-25 Thread Udo Lembke
Hi, it looks like some OSDs are down?! What is the output of ceph osd tree? Udo On 25.09.2014 04:29, Aegeaner wrote: The cluster health state is WARN: health HEALTH_WARN 118 pgs degraded; 8 pgs down; 59 pgs incomplete; 28 pgs peering; 292 pgs stale; 87 pgs stuck inactive;

Re: [ceph-users] [PG] Slow request *** seconds old,v4 currently waiting for pg to exist locally

2014-09-25 Thread Udo Lembke
Hi again, sorry - forgot my post... see osdmap e421: 9 osds: 9 up, 9 in shows that all your 9 osds are up! Do you have trouble with your journal/filesystem? Udo On 25.09.2014 08:01, Udo Lembke wrote: Hi, it looks like some OSDs are down?! What is the output of ceph osd tree? Udo On

Re: [ceph-users] Newbie Ceph Design Questions

2014-09-22 Thread Udo Lembke
Hi Christian, On 22.09.2014 05:36, Christian Balzer wrote: Hello, On Sun, 21 Sep 2014 21:00:48 +0200 Udo Lembke wrote: Hi Christian, On 21.09.2014 07:18, Christian Balzer wrote: ... Personally I found ext4 to be faster than XFS in nearly all use cases and the lack of full, real kernel

Re: [ceph-users] Newbie Ceph Design Questions

2014-09-21 Thread Udo Lembke
Hi Christian, On 21.09.2014 07:18, Christian Balzer wrote: ... Personally I found ext4 to be faster than XFS in nearly all use cases and the lack of full, real kernel integration of ZFS is something that doesn't appeal to me either. a little bit OT... what kind of ext4-mount options do you
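
For context, OSD mount options for ext4 are set in ceph.conf much like the XFS ones elsewhere in this archive; a hedged example (the option values are illustrative, not a recommendation):
  [osd]
  osd mkfs type = ext4
  osd mount options ext4 = rw,noatime,user_xattr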

[ceph-users] kvm guest with rbd-disks are unaccesible after app. 3h afterwards one OSD node fails

2014-09-01 Thread Udo Lembke
Hi list, over the weekend one of five OSD nodes failed (hung with a kernel panic). The cluster degraded (12 of 60 osds), but the noout flag is set from our monitoring host in this case. But around three hours later the kvm guests, which use storage on the ceph cluster (and do writes), are

Re: [ceph-users] slow read speeds from kernel rbd (Firefly 0.80.4)

2014-07-26 Thread Udo Lembke
Hi, I don't see an improvement with tcp_window_scaling=0 in my configuration. Rather the other way around: the iperf performance is much lower: root@ceph-03:~# iperf -c 172.20.2.14 Client connecting to 172.20.2.14, TCP port 5001 TCP window size:

Re: [ceph-users] slow read speeds from kernel rbd (Firefly 0.80.4)

2014-07-24 Thread Udo Lembke
Hi Steve, I'm also looking for improvements for single-thread reads. Somewhat higher values (twice?) should be possible with your config. I have 5 nodes with 60 4TB HDDs and got the following: rados -p test bench -b 4194304 60 seq -t 1 --no-cleanup Total time run: 60.066934 Total reads

Re: [ceph-users] slow read speeds from kernel rbd (Firefly 0.80.4)

2014-07-24 Thread Udo Lembke
Hi again, forgot to say - I'm still on 0.72.2! Udo

Re: [ceph-users] ceph osd crush tunables optimal AND add new OSD at the same time

2014-07-14 Thread Udo Lembke
Hi, which values all change with ceph osd crush tunables optimal? Is it perhaps possible to change some parameters on the weekends before the upgrade runs, to have more time? (depends on whether the parameters are available in 0.72...). The warning said it can take days... we have a cluster

Re: [ceph-users] Generic Tuning parameters?

2014-06-28 Thread Udo Lembke
Hi Erich, I'm also searching for improvements. You should use the right mount options to prevent fragmentation (for XFS). [osd] osd mount options xfs = rw,noatime,inode64,logbsize=256k,delaylog,allocsize=4M osd_op_threads = 4 osd_disk_threads = 4 With 45 OSDs per node you need a powerful

Re: [ceph-users] How to improve performance of ceph objcect storage cluster

2014-06-26 Thread Udo Lembke
Hi, On 25.06.2014 16:48, Aronesty, Erik wrote: I'm assuming you're testing the speed of cephfs (the file system) and not ceph object storage. for my part I mean object storage (VM disk via rbd). Udo
