[ceph-users] kvm guests with rbd disks are inaccessible approx. 3h after one OSD node fails

2014-09-01 Thread Udo Lembke
Hi list, over the weekend one of five OSD nodes failed (hung with a kernel panic). The cluster degraded (12 of 60 osds down), but our monitoring host sets the noout flag in this case. But around three hours later the kvm guests which use storage on the ceph cluster (and do writes) are

Re: [ceph-users] Newbie Ceph Design Questions

2014-09-21 Thread Udo Lembke
Hi Christian, On 21.09.2014 07:18, Christian Balzer wrote: ... Personally I found ext4 to be faster than XFS in nearly all use cases and the lack of full, real kernel integration of ZFS is something that doesn't appeal to me either. a little bit OT... what kind of ext4-mount options do you

Re: [ceph-users] Newbie Ceph Design Questions

2014-09-22 Thread Udo Lembke
Hi Christian, On 22.09.2014 05:36, Christian Balzer wrote: Hello, On Sun, 21 Sep 2014 21:00:48 +0200 Udo Lembke wrote: Hi Christian, On 21.09.2014 07:18, Christian Balzer wrote: ... Personally I found ext4 to be faster than XFS in nearly all use cases and the lack of full, real kernel

Re: [ceph-users] [PG] Slow request *** seconds old,v4 currently waiting for pg to exist locally

2014-09-25 Thread Udo Lembke
Hi, it looks like some osds are down?! What is the output of ceph osd tree? Udo On 25.09.2014 04:29, Aegeaner wrote: The cluster health state is WARN: health HEALTH_WARN 118 pgs degraded; 8 pgs down; 59 pgs incomplete; 28 pgs peering; 292 pgs stale; 87 pgs stuck inactive;

Re: [ceph-users] [PG] Slow request *** seconds old,v4 currently waiting for pg to exist locally

2014-09-25 Thread Udo Lembke
Hi again, sorry - forget my post... see osdmap e421: 9 osds: 9 up, 9 in which shows that all your 9 osds are up! Do you have trouble with your journal/filesystem? Udo On 25.09.2014 08:01, Udo Lembke wrote: Hi, it looks like some osds are down?! What is the output of ceph osd tree? Udo On

Re: [ceph-users] Replacing a disk: Best practices?

2014-10-16 Thread Udo Lembke
On 15.10.2014 22:08, Iban Cabrillo wrote: Hi Cephers, I have another question related to this issue: what would be the procedure to recover from a server failure (a whole server, for example due to motherboard trouble, with no damage to the disks)? Regards, I Hi, - change the motherboard. -
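A minimal sketch of such a whole-server recovery, assuming the OSD and journal disks survived (host name, OSD id and init style are examples only):

    ceph osd set noout                 # keep the down OSDs from being rebalanced away
    # replace the motherboard / reinstall the OS if needed,
    # restore /etc/ceph/ceph.conf and the OSD keyrings, mount the OSD disks
    service ceph start osd.12          # start each OSD of that host again
    ceph osd unset noout               # once all OSDs are back up and in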

Re: [ceph-users] question about activate OSD

2014-10-31 Thread Udo Lembke
Hi German, if I'm right the journal creation on /dev/sdc1 failed (perhaps because you only said /dev/sdc instead of /dev/sdc1?). Do you have partitions on sdc? Udo On 31.10.2014 22:02, German Anders wrote: Hi all, I'm having some issues while trying to activate a new osd in a new

Re: [ceph-users] Typical 10GbE latency

2014-11-06 Thread Udo Lembke
Hi, from one host to five OSD-hosts. NIC Intel 82599EB; jumbo-frames; single Switch IBM G8124 (blade network). rtt min/avg/max/mdev = 0.075/0.114/0.231/0.037 ms rtt min/avg/max/mdev = 0.088/0.164/0.739/0.072 ms rtt min/avg/max/mdev = 0.081/0.141/0.229/0.030 ms rtt min/avg/max/mdev =

Re: [ceph-users] Typical 10GbE latency

2014-11-06 Thread Udo Lembke
on the host? Thanks. Thu Nov 06 2014 at 16:57:36, Udo Lembke ulem...@polarzone.de: Hi, from one host to five OSD-hosts. NIC Intel 82599EB; jumbo-frames; single Switch IBM G8124 (blade network). rtt min/avg/max/mdev = 0.075/0.114/0.231/0.037 ms

[ceph-users] How to see which crush tunables are active in a ceph-cluster?

2014-12-01 Thread Udo Lembke
Hi all, http://ceph.com/docs/master/rados/operations/crush-map/#crush-tunables describes how to set the tunables to legacy, argonaut, bobtail, firefly or optimal. But how can I see which profile is active in a ceph cluster? With ceph osd getcrushmap I don't get really much info (only tunable

Re: [ceph-users] Old OSDs on new host, treated as new?

2014-12-05 Thread Udo Lembke
Hi, perhaps a stupid question, but why did you change the hostname? I haven't tried it, but I guess if you boot the node with a new hostname, the old hostname stays in the crush map, but without any OSDs - because they are now on the new host. I don't know (I guess not) if the degradation level stays at 5% if

[ceph-users] For all LSI SAS9201-16i users - don't upgrade to firmware P20

2014-12-11 Thread Udo Lembke
Hi all, I have upgraded two LSI SAS9201-16i HBAs to the latest firmware P20.00.00 and after that I got the following syslog messages: Dec 9 18:11:31 ceph-03 kernel: [ 484.602834] mpt2sas0: log_info(0x3108): originator(PL), code(0x08), sub_code(0x) Dec 9 18:12:15 ceph-03 kernel: [

Re: [ceph-users] Multiple issues :( Ubuntu 14.04, latest Ceph

2014-12-15 Thread Udo Lembke
Hi Benjamin, On 15.12.2014 03:31, Benjamin wrote: Hey there, I've set up a small VirtualBox cluster of Ceph VMs. I have one ceph-admin0 node, and three ceph0,ceph1,ceph2 nodes for a total of 4. I've been following this guide: http://ceph.com/docs/master/start/quick-ceph-deploy/ to the

Re: [ceph-users] Multiple issues :( Ubuntu 14.04, latest Ceph

2014-12-15 Thread Udo Lembke
Hi, see here: https://www.mail-archive.com/ceph-users@lists.ceph.com/msg15546.html Udo On 16.12.2014 05:39, Benjamin wrote: I increased the OSDs to 10.5GB each and now I have a different issue... cephy@ceph-admin0:~/ceph-cluster$ echo {Test-data} > testfile.txt

Re: [ceph-users] Help with SSDs

2014-12-17 Thread Udo Lembke
Hi Mikaël, I have EVOs too, what do you mean by not playing well with D_SYNC? Is there something I can test on my side to compare results with you, as I have mine flashed? http://www.sebastien-han.fr/blog/2014/10/10/ceph-how-to-test-if-your-ssd-is-suitable-as-a-journal-device/ describes how
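The test described in that blog post boils down to a small dd run with O_DIRECT and D_SYNC; a sketch, assuming the SSD is mounted on /mnt/ssd (path and sizes are examples, and the target must be a scratch file):

    # WARNING: only run against a scratch file, not a device or file in use
    dd if=/dev/zero of=/mnt/ssd/journal-test bs=4k count=100000 oflag=direct,dsync
    # a journal-suitable SSD sustains this at a high rate; many consumer models collapse here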

Re: [ceph-users] Help with SSDs

2014-12-18 Thread Udo Lembke
Hi Mark, On 18.12.2014 07:15, Mark Kirkwood wrote: While you can't do much about the endurance lifetime being a bit low, you could possibly improve performance using a journal *file* that is located on the 840's (you'll need to symlink it - disclaimer - have not tried this myself, but will
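A rough sketch of the journal-file-on-SSD idea Mark describes (untested, as he says; paths, mount point and the OSD id are examples only):

    service ceph stop osd.29
    ceph-osd -i 29 --flush-journal
    rm /var/lib/ceph/osd/ceph-29/journal            # old journal link/file
    ln -s /mnt/ssd840/osd.29.journal /var/lib/ceph/osd/ceph-29/journal
    ceph-osd -i 29 --mkjournal
    service ceph start osd.29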

[ceph-users] Any tuning of LVM storage inside a VM related to ceph?

2014-12-18 Thread Udo Lembke
Hi all, I have some fileservers with insufficient read speed. Enabling read-ahead inside the VM improves the read speed, but it looks like this has a drawback during lvm operations like pvmove. For test purposes, I moved the lvm storage inside a VM from vdb to vdc1. It takes days, because it
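For reference, read-ahead inside the VM is set per block device via sysfs; the device name and value below are examples only:

    cat /sys/block/vdb/queue/read_ahead_kb           # default is usually 128
    echo 4096 > /sys/block/vdb/queue/read_ahead_kb   # e.g. 4 MB read-ahead for the test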

Re: [ceph-users] Reproducible Data Corruption with cephfs kernel driver

2014-12-18 Thread Udo Lembke
Hi Lindsay, have you tried the different cache options (no cache, write through, ...) which proxmox offers for the drive? Udo On 18.12.2014 05:52, Lindsay Mathieson wrote: I've been experimenting with CephFS for running KVM images (proxmox). cephfs fuse version - 0.87 cephfs kernel module

Re: [ceph-users] OSD space usage 2x object size after rados put

2014-04-10 Thread Udo Lembke
Hi, On 10.04.2014 20:03, Russell E. Glaue wrote: I am seeing the same thing, and was wondering the same. We have 16 OSDs on 4 hosts. The File system is Xfs. The OS is CentOS 6.4. ceph version 0.72.2 I am importing a 3.3TB disk image into a rbd image. At 2.6TB, and still importing, 5.197TB

Re: [ceph-users] mon server down

2014-04-15 Thread Udo Lembke
Hi, is the mon process running? (netstat -an | grep 6789 | grep -i listen) Is the filesystem nearly full? (df -k) Any error output if you start the mon in the foreground (here mon b)? ceph-mon -i b -d -c /etc/ceph/ceph.conf Udo On 15.04.2014 16:11, Jonathan Gowar wrote: Hi, I had an OSD
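The three checks from the answer above, laid out as commands (mon id "b" as in the post):

    netstat -an | grep 6789 | grep -i listen    # is the mon listening?
    df -k                                       # is the mon's filesystem nearly full?
    ceph-mon -i b -d -c /etc/ceph/ceph.conf     # run mon.b in the foreground and watch for errors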

Re: [ceph-users] SSD journal overload?

2014-04-28 Thread Udo Lembke
Hi, perhaps due to IOs from the journal? You can test with iostat (like iostat -dm 5 sdg). On Debian iostat is in the package sysstat. Udo On 28.04.2014 07:38, Indra Pramana wrote: Hi Craig, Good day to you, and thank you for your enquiry. As per your suggestion, I have created a 3rd

Re: [ceph-users] Slow IOPS on RBD compared to journal and backing devices

2014-05-08 Thread Udo Lembke
Hi, I don't think that's related, but how full is your ceph cluster? Perhaps it has something to do with fragmentation on the xfs filesystem (xfs_db -c frag -r device)? Udo On 08.05.2014 02:57, Christian Balzer wrote: Hello, ceph 0.72 on Debian Jessie, 2 storage nodes with 2 OSDs

Re: [ceph-users] Slow IOPS on RBD compared to journal and backing devices

2014-05-08 Thread Udo Lembke
Hi again, sorry, too fast - but this can't be a problem due to your 4GB cache... Udo On 08.05.2014 17:20, Udo Lembke wrote: Hi, I don't think that's related, but how full is your ceph cluster? Perhaps it has something to do with fragmentation on the xfs filesystem (xfs_db -c frag -r

[ceph-users] Number of PGs with multiple pools

2014-06-03 Thread Udo Lembke
Hi all, I know the formula (num osds * 100 / replica) for pg_num and pgp_num (rounded up to the next power of 2). But does something change with two (or three) active pools? E.g. we have two pools which each would get a pg_num of 4096. Should I use 4096 or 2048 because of the two pools? best
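A worked example of that formula, with made-up numbers (60 OSDs, replica 3):

    echo $(( 60 * 100 / 3 ))      # -> 2000
    # next power of 2 -> 2048 PGs as the total target for the cluster;
    # with two equally used pools one would rather give each pool half of
    # that (e.g. 1024 each) than 2048 each, so the per-OSD PG count stays
    # in the intended range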

[ceph-users] HowTo mark an OSD as down?

2014-06-23 Thread Udo Lembke
Hi, AFAIK ceph osd down osd.29 should mark osd.29 as down. But what should one do if this doesn't happen? I got the following: root@ceph-02:~# ceph osd down osd.29 marked down osd.29. root@ceph-02:~# ceph osd tree 2014-06-23 08:51:00.588042 7f15747f5700 0 -- :/1018258 172.20.2.11:6789/0

Re: [ceph-users] HowTo mark an OSD as down?

2014-06-23 Thread Udo Lembke
to force use of aio anyway 2014-06-23 09:08:05.313059 7f1ecb5d6780 -1 flushed journal /srv/journal/osd.29.journal for object store /var/lib/ceph/osd/ceph-29 root@ceph-02:~# umount /var/lib/ceph/osd/ceph-29 But why doesn't ceph osd down osd.29 work? Udo On 23.06.2014 09:01, Udo Lembke wrote: Hi

Re: [ceph-users] HowTo mark an OSD as down?

2014-06-23 Thread Udo Lembke
Hi Henrik, On 23.06.2014 09:16, Henrik Korkuc wrote: ceph osd set noup will prevent osd's from becoming up. Later remember to run ceph osd unset noup. You can stop OSD with stop ceph-osd id=29. Thanks for the hint! Udo
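A short sketch of that sequence for taking one OSD down for maintenance (OSD id and upstart syntax as in the thread):

    ceph osd set noup            # keep the OSD from being marked up again
    stop ceph-osd id=29          # actually stop the daemon
    # ... maintenance ...
    start ceph-osd id=29
    ceph osd unset noup          # let it rejoin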

Re: [ceph-users] How to improve performance of ceph object storage cluster

2014-06-25 Thread Udo Lembke
Hi, I am also searching for ways to tune the single-thread performance. You can try the following parameters: [osd] osd mount options xfs = rw,noatime,inode64,logbsize=256k,delaylog,allocsize=4M osd_op_threads = 4 osd_disk_threads = 4 Udo On 25.06.2014 08:52, wsnote wrote: OS: CentOS 6.5 Version:
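The same settings, laid out as a ceph.conf fragment (values straight from the post; whether they help depends on the hardware):

    [osd]
    osd mount options xfs = rw,noatime,inode64,logbsize=256k,delaylog,allocsize=4M
    osd_op_threads = 4
    osd_disk_threads = 4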

Re: [ceph-users] How to improve performance of ceph object storage cluster

2014-06-26 Thread Udo Lembke
Hi, On 25.06.2014 16:48, Aronesty, Erik wrote: I'm assuming you're testing the speed of cephfs (the file system) and not ceph object storage. For my part I mean object storage (VM disks via rbd). Udo

Re: [ceph-users] Generic Tuning parameters?

2014-06-28 Thread Udo Lembke
Hi Erich, I'm also searching for improvements. You should use the right mount options to prevent fragmentation (for XFS). [osd] osd mount options xfs = rw,noatime,inode64,logbsize=256k,delaylog,allocsize=4M osd_op_threads = 4 osd_disk_threads = 4 With 45 OSDs per node you need a powerful

Re: [ceph-users] ceph osd crush tunables optimal AND add new OSD at the same time

2014-07-14 Thread Udo Lembke
Hi, which values all change with ceph osd crush tunables optimal? Is it perhaps possible to change some parameters on the weekends before the upgrade runs, to have more time? (depends on whether the parameters are available in 0.72...). The warning says it can take days... we have a cluster

Re: [ceph-users] slow read speeds from kernel rbd (Firefly 0.80.4)

2014-07-24 Thread Udo Lembke
Hi Steve, I'm also looking for improvements in single-thread reads. Slightly higher values (twice?) should be possible with your config. I have 5 nodes with 60 4-TB hdds and got the following: rados -p test bench -b 4194304 60 seq -t 1 --no-cleanup Total time run: 60.066934 Total reads
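For reference, a single-thread read benchmark like the one quoted needs a prior write run that leaves its objects in place; a sketch (pool name "test" as above):

    rados -p test bench 60 write -b 4194304 -t 1 --no-cleanup
    rados -p test bench 60 seq -t 1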

Re: [ceph-users] slow read speeds from kernel rbd (Firefly 0.80.4)

2014-07-24 Thread Udo Lembke
Hi again, forgot to say - I'm still on 0.72.2! Udo

Re: [ceph-users] slow read speeds from kernel rbd (Firefly 0.80.4)

2014-07-26 Thread Udo Lembke
Hi, I don't see an improvement with tcp_window_scaling=0 in my configuration. Rather the other way around: the iperf performance is much lower: root@ceph-03:~# iperf -c 172.20.2.14 Client connecting to 172.20.2.14, TCP port 5001 TCP window size:

Re: [ceph-users] ceph (deploy?) and drive paths / mounting / best practice.

2013-11-19 Thread Udo Lembke
On 19.11.2013 06:56, Robert van Leeuwen wrote: Hi, ... It looks like it is just using /dev/sdX for this instead of the /dev/disk/by-id /by-path given by ceph-deploy. ... Hi Robert, I'm using disk labels in fstab: LABEL=osd.0 /var/lib/ceph/osd/ceph-0 xfs noatime,nodiratime 0
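A sketch of that label-based mounting, assuming an XFS OSD filesystem on /dev/sdb1 (device and label are examples only):

    xfs_admin -L osd.0 /dev/sdb1
    # /etc/fstab
    LABEL=osd.0  /var/lib/ceph/osd/ceph-0  xfs  noatime,nodiratime  0  0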

[ceph-users] OSD hierarchy and crush

2013-12-20 Thread Udo Lembke
Hi, yesterday I expanded our 3-node ceph cluster with a fourth node (an additional 13 OSDs - all OSDs have the same size (4TB)). I used the same command as before to add the OSDs and change the weight: ceph osd crush set 44 0.2 pool=default rack=unknownrack host=ceph-04 But ceph osd tree shows all OSDs

Re: [ceph-users] One OSD always dying

2014-01-15 Thread Udo Lembke
Hi, perhaps the disk has a problem? Have you looked with smartctl? (apt-get install smartmontools; smartctl -A /dev/sdX) Udo On 15.01.2014 10:49, Rottmann, Jonas (centron GmbH) wrote: Hi, I now did an upgrade to dumpling (ceph version 0.67.5 (a60ac9194718083a4b6a225fc17cad6096c69bd1)),

[ceph-users] rbd client affected with only one node down

2014-01-21 Thread Udo Lembke
Hi, I need a little bit of help. We have a 4-node ceph cluster and the clients run into trouble if one node is down (due to maintenance). After the node is switched on again, ceph health shows (for a short time): HEALTH_WARN 4 pgs incomplete; 14 pgs peering; 370 pgs stale; 12 pgs stuck unclean; 36

Re: [ceph-users] rbd client affected with only one node down

2014-01-22 Thread Udo Lembke
Hi Aaron, thanks for the very useful hint! With ceph osd set noout it works without trouble. A typical beginner's mistake. regards Udo On 21.01.2014 20:45, Aaron Ten Clay wrote: Udo, I think you might have better luck using ceph osd set noout before doing maintenance, rather than ceph

Re: [ceph-users] PG not getting clean

2014-02-14 Thread Udo Lembke
On 14.02.2014 17:58, Karan Singh wrote: mds cluster is degraded Hi, have you tried to create two more mds? About degraded: have you changed the weight of the osds after the cluster was healthy? Udo

Re: [ceph-users] 1 mons down, ceph-create-keys

2014-02-15 Thread Udo Lembke
Hi, perhaps your filesystem is too full? (df -k; du -hs /var/lib/ceph/mon/ceph-st3/store.db) What output/error message do you get if you start the mon in the foreground? ceph-mon -i st3 -d -c /etc/ceph/ceph.conf Udo On 15.02.2014 09:30, Vadim Vatlin wrote: Hello Could you help me please ceph

Re: [ceph-users] Problem starting RADOS Gateway

2014-02-15 Thread Udo Lembke
Hi, does ceph -s also get stuck on a missing keyring? Do you have a keyring like: cat /etc/ceph/keyring [client.admin] key = AQCdkHZR2NBYMBAATe/rqIwCI96LTuyS3gmMXp== Or do you have another keyring defined in ceph.conf (global section - keyring = /etc/ceph/keyring)? The key is in ceph - see ceph

[ceph-users] How to fix an incomplete PG on a 2-copy ceph cluster?

2014-02-16 Thread Udo Lembke
Hi, I switched some disks from manual formatting to ceph-deploy (because of slightly different xfs parameters) - all disks are on a single node of a 4-node cluster. After rebuilding the osd disks one PG is incomplete: ceph -s cluster 591db070-15c1-4c7a-b107-67717bdb87d9 health HEALTH_WARN 1 pgs

Re: [ceph-users] How to fix an incomplete PG on a 2-copy ceph cluster?

2014-02-16 Thread Udo Lembke
On Sun, Feb 16, 2014 at 12:32 AM, Udo Lembke ulem...@polarzone.de wrote: Hi, I switched some disks from manual formatting to ceph-deploy (because of slightly different xfs parameters) - all disks are on a single node of a 4-node cluster. After rebuilding the osd disks one PG is incomplete: ceph -s

Re: [ceph-users] How to fix an incomplete PG on a 2-copy ceph cluster?

2014-02-18 Thread Udo Lembke
Hi Greg, I have used the ultimate way with ceph osd lost 42 --yes-i-really-mean-it but the pg is still down: ceph -s cluster 591db070-15c1-4c7a-b107-67717bdb87d9 health HEALTH_WARN 206 pgs degraded; 1 pgs down; 57 pgs incomplete; 1 pgs peering; 31 pgs stuck inactive; 145 pgs stuck

[ceph-users] correct way to increase the weight of all OSDs from 1 to 3.64

2014-03-04 Thread Udo Lembke
Hi all, I started the ceph cluster with a weight of 1 for all osd disks (4TB). Later I switched to ceph-deploy, and ceph-deploy normally uses a weight of 3.64 for these disks, which makes much more sense! Now I want to change the weight of all 52 osds (on 4 nodes) to 3.64 and the question is,

Re: [ceph-users] correct way to increase the weight of all OSDs from 1 to 3.64

2014-03-04 Thread Udo Lembke
Hi Sage, thanks for the info! I will try it at the weekend. Udo On 04.03.2014 15:16, Sage Weil wrote: The goal should be to increase the weights in unison, which should prevent any actual data movement (modulo some rounding error, perhaps). At the moment that can't be done via the CLI, but
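The non-CLI way Sage hints at is editing the decompiled crushmap offline so that all weights change in one step; a sketch (file names are examples only):

    ceph osd getcrushmap -o crush.bin
    crushtool -d crush.bin -o crush.txt
    # edit crush.txt: set every osd item weight (and the host/root bucket sums) to 3.64 in one go
    crushtool -c crush.txt -o crush.new
    ceph osd setcrushmap -i crush.new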

Re: [ceph-users] How to see which crush tunables are active in a ceph-cluster?

2014-12-20 Thread Udo Lembke
a chooseleaf_vary_r 1 (from 0) take roughly the same time to finish?? Regards Udo On 04.12.2014 14:09, Udo Lembke wrote: Hi, to answer myself: with ceph osd crush show-tunables I see a little bit more, but I don't know how far away from the firefly tunables I am on the production cluster
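The two ways to inspect the active tunables that come up in this thread, as commands:

    ceph osd crush show-tunables        # dump of the tunable values in effect
    ceph osd getcrushmap -o crush.bin
    crushtool -d crush.bin -o crush.txt
    head crush.txt                      # the "tunable ..." lines at the top of the decompiled map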

Re: [ceph-users] How to see which crush tunables are active in a ceph-cluster?

2014-12-20 Thread Udo Lembke
at the same time. On Sat, Dec 20, 2014 at 3:26 AM, Udo Lembke ulem...@polarzone.de wrote: Hi, for the information of other cephers... I switched from unknown crush tunables to firefly and it took 6 hours (30.853% degradation) to finish on our

Re: [ceph-users] v0.90 released

2014-12-23 Thread Udo Lembke
Hi Sage, On 23.12.2014 15:39, Sage Weil wrote: ... You can't reduce the PG count without creating new (smaller) pools and migrating data. Does this also work for the pool metadata, or is this pool essential for ceph? Udo

Re: [ceph-users] Any Good Ceph Web Interfaces?

2014-12-23 Thread Udo Lembke
Hi, for monitoring only I use the Ceph Dashboard https://github.com/Crapworks/ceph-dash/ For me it's a nice tool for a good overview - for administration I use the CLI. Udo On 23.12.2014 01:11, Tony wrote: Please don't mention calamari :-) The best web interface for ceph that actually

Re: [ceph-users] command to flush rbd cache?

2015-02-04 Thread Udo Lembke
Hi Dan, I mean qemu-kvm, i.e. librbd. But how can I tell kvm to flush the buffer? Udo On 05.02.2015 07:59, Dan Mick wrote: On 02/04/2015 10:44 PM, Udo Lembke wrote: Hi all, is there any command to flush the rbd cache, like echo 3 > /proc/sys/vm/drop_caches for the OS cache? Udo Do you

Re: [ceph-users] command to flush rbd cache?

2015-02-04 Thread Udo Lembke
Hi Josh, thanks for the info. Detach/reattach should be fine for me, because it's only for performance testing. #2468 would be fine of course. Udo On 05.02.2015 08:02, Josh Durgin wrote: On 02/05/2015 07:44 AM, Udo Lembke wrote: Hi all, is there any command to flush the rbd cache like

[ceph-users] command to flush rbd cache?

2015-02-04 Thread Udo Lembke
Hi all, is there any command to flush the rbd cache, like echo 3 > /proc/sys/vm/drop_caches for the OS cache? Udo

Re: [ceph-users] Ceph Supermicro hardware recommendation

2015-02-04 Thread Udo Lembke
Hi Marco, On 04.02.2015 10:20, Colombo Marco wrote: ... We chose the 6TB disks because we need a lot of storage in a small number of servers and we prefer servers without too many disks. However we plan to use max 80% of a 6TB disk. 80% is too much! You will run into trouble. Ceph

Re: [ceph-users] erasure code : number of chunks for a small cluster ?

2015-02-06 Thread Udo Lembke
On 06.02.2015 09:06, Hector Martin wrote: On 02/02/15 03:38, Udo Lembke wrote: With only 3 hosts you can't survive a full node failure, because for that you need hosts >= k + m. Sure you can. k=2, m=1 with the failure domain set to host will survive a full host failure. Hi, Alexandre

Re: [ceph-users] Better way to use osd's of different size

2015-01-16 Thread Udo Lembke
Hi Megov, you should weight the OSDs so they represent the size (like a weight of 3.68 for a 4TB HDD). ceph-deploy does this automatically. Nevertheless, even with the correct weight the disks are not filled in an equal distribution. For that purpose you can use reweight for single OSDs, or automatically
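A sketch of the two mechanisms mentioned (OSD id, weight and threshold are examples only):

    ceph osd crush reweight osd.12 3.64     # permanent crush weight matching a 4TB disk
    ceph osd reweight osd.12 0.9            # temporary override for a single over-full OSD
    ceph osd reweight-by-utilization 120    # automatic override for OSDs above 120% of the average fill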

Re: [ceph-users] Power failure recovery woes

2015-02-17 Thread Udo Lembke
Hi Jeff, is the osd /var/lib/ceph/osd/ceph-2 mounted? If not, does it help if you mount the osd and start it with service ceph start osd.2? Udo On 17.02.2015 09:54, Jeff wrote: Hi, We had a nasty power failure yesterday and even with UPSs our small (5 node, 12 OSD) cluster is having

Re: [ceph-users] Sizing SSD's for ceph

2015-01-29 Thread Udo Lembke
Hi, On 29.01.2015 07:53, Christian Balzer wrote: On Thu, 29 Jan 2015 01:30:41 + Ramakrishna Nishtala (rnishtal) wrote: * Per my understanding, once writes are complete to the journal, it is read again from the journal before writing to the data disk. Does this mean we have to do,

Re: [ceph-users] RBD caching on 4K reads???

2015-01-30 Thread Udo Lembke
Hi Bruce, hmm, that sounds to me like the rbd cache. Can you check whether the cache is really disabled in the running config with ceph --admin-daemon /var/run/ceph/ceph-osd.0.asok config show | grep cache Udo On 30.01.2015 21:51, Bruce McFarland wrote: I have a cluster and have created an rbd device -

Re: [ceph-users] RBD caching on 4K reads???

2015-01-30 Thread Udo Lembke
verify if it’s disabled at the librbd level on the client. If you mean on the storage nodes, I’ve had some issues dumping the config. Does the rbd caching occur on the storage nodes, the client, or both? From: Udo Lembke [mailto:ulem...@polarzone.de] Sent: Friday, January 30, 2015 1:00 PM
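To the question in context: the rbd cache lives in librbd on the client, so it has to be checked there. A sketch, assuming an admin socket is enabled for the client (the socket path below is an example):

    # ceph.conf on the client
    [client]
    admin socket = /var/run/ceph/$cluster-$type.$id.$pid.$cctid.asok
    # then, against the socket of the running qemu/librbd process:
    ceph --admin-daemon /var/run/ceph/ceph-client.admin.1234.56789.asok config show | grep rbd_cache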

Re: [ceph-users] erasure code : number of chunks for a small cluster ?

2015-02-01 Thread Udo Lembke
Hi Alexandre, nice to meet you here ;-) With only 3 hosts you can't survive a full node failure, because for that you need hosts >= k + m. And k:1, m:2 doesn't make any sense. I started with 5 hosts and use k:3, m:2. In this case two hdds can fail, or one host can be down for maintenance. Udo PS:
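A sketch of an EC profile matching that k:3, m:2 layout with a host failure domain (profile name, pool name and PG count are examples; parameter names follow the hammer-era syntax):

    ceph osd erasure-code-profile set ec32 k=3 m=2 ruleset-failure-domain=host
    ceph osd pool create ecarchiv 4096 4096 erasure ec32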

Re: [ceph-users] OSD capacity variance ?

2015-02-01 Thread Udo Lembke
Hi Howard, I assume it's a typo with 160 + 250 MB. Ceph OSDs must be at least 10GB to get a weight of 0.01. Udo On 31.01.2015 23:39, Howard Thomson wrote: Hi All, I am developing a custom disk storage backend for the Bacula backup system, and am in the process of setting up a trial Ceph system,

Re: [ceph-users] estimate the impact of changing pg_num

2015-02-01 Thread Udo Lembke
Hi Xu, On 01.02.2015 21:39, Xu (Simon) Chen wrote: RBD doesn't work extremely well when ceph is recovering - it is common to see hundreds or a few thousand blocked requests (>30s to finish). This translates to high IO wait inside the VMs, and many applications don't deal with this well. this

Re: [ceph-users] slow read-performance inside the vm

2015-01-27 Thread Udo Lembke
Hi Patrik, On 27.01.2015 14:06, Patrik Plank wrote: ... I am really happy, these values above are enough for my small number of vms. Inside the vms I now get 80mb/s for writes and 130mb/s for reads, with write cache enabled. But there is one little problem. Are there some tuning

Re: [ceph-users] backfill_toofull, but OSDs not full

2015-01-09 Thread Udo Lembke
Hi, I had a similar effect two weeks ago - 1 PG backfill_toofull, and due to reweighting and deleting there was enough free space, but the rebuild process stopped after a while. After stopping and starting ceph on the second node, the rebuild process ran without trouble and the backfill_toofull was gone.

[ceph-users] ssd osd fails often with FAILED assert(soid < scrubber.start || soid >= scrubber.end)

2015-01-13 Thread Udo Lembke
Hi, since last Thursday we have had an ssd pool (cache tier) in front of an ec pool and have been filling the pools with data via rsync (approx. 50MB/s). The ssd pool has three disks and one of them (a DC S3700) has failed four times since then. I simply start the osd again and the pool is rebuilt and works again for

[ceph-users] Part 2: ssd osd fails often with FAILED assert(soid < scrubber.start || soid >= scrubber.end)

2015-01-14 Thread Udo Lembke
Hi again, sorry for not threading, but my last email didn't come back on the mailing list (I often miss some posts!). Just after sending the last mail, for the first time another SSD failed - in this case a cheap one, but with the same error: root@ceph-04:/var/log/ceph# more ceph-osd.62.log 2015-01-13

Re: [ceph-users] Placement Groups fail on fresh Ceph cluster installation with all OSDs up and in

2015-02-10 Thread Udo Lembke
Hi, you will get further trouble, because your weight is not correct. You need a weight >= 0.01 for each OSD. This means your OSDs must be 10GB or greater! Udo On 10.02.2015 12:22, B L wrote: Hi Vickie, My OSD tree looks like this: ceph@ceph-node3:/home/ubuntu$ ceph osd tree #

Re: [ceph-users] Placement Groups fail on fresh Ceph cluster installation with all OSDs up and in

2015-02-10 Thread Udo Lembke
Hi, use: ceph osd crush set 0 0.01 pool=default host=ceph-node1 ceph osd crush set 1 0.01 pool=default host=ceph-node1 ceph osd crush set 2 0.01 pool=default host=ceph-node3 ceph osd crush set 3 0.01 pool=default host=ceph-node3 ceph osd crush set 4 0.01 pool=default host=ceph-node2 ceph osd crush

Re: [ceph-users] Improving Performance with more OSD's?

2015-01-04 Thread Udo Lembke
Hi Lindsay, On 05.01.2015 06:52, Lindsay Mathieson wrote: ... So two OSD Nodes had: - Samsung 840 EVO SSD for Op. Sys. - Intel 530 SSD for Journals (10GB Per OSD) - 3TB WD Red - 1 TB WD Blue - 1 TB WD Blue - Each disk weighted at 1.0 - Primary affinity of the WD Red (slow) set to 0 the

Re: [ceph-users] Ceph BIG outage : 200+ OSD are down , OSD cannot create thread

2015-03-09 Thread Udo Lembke
Hi Tony, sounds like a good idea! Udo On 09.03.2015 21:55, Tony Harris wrote: I know I'm not even close to this type of a problem yet with my small cluster (both test and production clusters) - but it would be great if something like that could appear in the cluster HEALTHWARN, if Ceph could

[ceph-users] Strange osd in PG with new EC-Pool - pgs: 2 active+undersized+degraded

2015-03-25 Thread Udo Lembke
Hi, due to two more hosts (now 7 storage nodes) I want to create a new ec pool and get a strange effect: ceph@admin:~$ ceph health detail HEALTH_WARN 2 pgs degraded; 2 pgs stuck degraded; 2 pgs stuck unclean; 2 pgs stuck undersized; 2 pgs undersized pg 22.3e5 is stuck unclean since forever,

Re: [ceph-users] Strange osd in PG with new EC-Pool - pgs: 2 active+undersized+degraded

2015-03-25 Thread Udo Lembke
300 PGs... Udo On 25.03.2015 14:52, Gregory Farnum wrote: On Wed, Mar 25, 2015 at 1:20 AM, Udo Lembke ulem...@polarzone.de wrote: Hi, due to two more hosts (now 7 storage nodes) I want to create a new ec pool and get a strange effect: ceph@admin:~$ ceph health detail HEALTH_WARN 2 pgs

Re: [ceph-users] Strange osd in PG with new EC-Pool - pgs: 2 active+undersized+degraded

2015-03-25 Thread Udo Lembke
Hi Don, thanks for the info! It looks like choose_tries set to 200 does the trick. But the setcrushmap takes a long, long time (alarming, but the clients still have IO)... hope it's finished soon ;-) Udo On 25.03.2015 16:00, Don Doerner wrote: Assuming you've calculated the number of PGs

[ceph-users] won leader election with quorum during osd setcrushmap

2015-03-25 Thread Udo Lembke
Hi, due to PG trouble with an EC pool I modified the crushmap (step set_choose_tries 200) from rule ec7archiv { ruleset 6 type erasure min_size 3 max_size 20 step set_chooseleaf_tries 5 step take default step chooseleaf indep 0 type host
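The modified rule reads roughly like this (reconstructed from the fragment above, with the added line marked; the trailing emit step is assumed):

    rule ec7archiv {
            ruleset 6
            type erasure
            min_size 3
            max_size 20
            step set_chooseleaf_tries 5
            step set_choose_tries 200        # <- the added tunable
            step take default
            step chooseleaf indep 0 type host
            step emit
    }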

Re: [ceph-users] Strange osd in PG with new EC-Pool - pgs: 2 active+undersized+degraded

2015-03-26 Thread Udo Lembke
could have specified enough PGs to make it impossible to form PGs out of 84 OSDs (I'm assuming your SSDs are in a separate root) but I have to ask... -don- -Original Message- From: Udo Lembke [mailto:ulem...@polarzone.de] Sent: 25 March, 2015 08:54 To: Don Doerner; ceph-us

[ceph-users] How to see the content of an EC pool after recreating the SSD cache tier?

2015-03-26 Thread Udo Lembke
Hi all, due to a very silly approach, I removed the cache tier of a filled EC pool. After recreating the pool and connecting it with the EC pool I don't see any content. How can I see the rbd_data and other objects through the new ssd cache tier? I think I must recreate the rbd_directory (and fill

Re: [ceph-users] Hammer release date and a Design question

2015-03-27 Thread Udo Lembke
Hi, On 26.03.2015 11:18, 10 minus wrote: Hi, I'm just starting on a small Ceph implementation and wanted to know the release date for Hammer. Will it coincide with the release of OpenStack? My conf (using 10G and jumbo frames on CentOS 7 / RHEL7): 3x Mons (VMs): CPU - 2 Memory - 4G

[ceph-users] too few pgs in cache tier

2015-02-27 Thread Udo Lembke
Hi all, we use an EC pool with a small cache tier in front of it for our archive data (4 * 16TB VM disks). The ec pool has k=3;m=2 because we started with 5 nodes and want to migrate to a new ec pool with k=5;m=2. Therefore we migrate one VM disk (16TB) from the ceph cluster to an fc-raid with the

Re: [ceph-users] How to see the content of an EC pool after recreating the SSD cache tier?

2015-03-26 Thread Udo Lembke
, is to create new rbd disks and copy all blocks with rados get - file - rados put. The problem is the time it takes (days to weeks for 3 * 16TB)... Udo -Greg On Thu, Mar 26, 2015 at 8:56 AM, Udo Lembke ulem...@polarzone.de wrote: Hi Greg, ok! It looks like my problem is more

Re: [ceph-users] How to see the content of an EC pool after recreating the SSD cache tier?

2015-03-26 Thread Udo Lembke
show up when listing on the cache pool. -Greg On Thu, Mar 26, 2015 at 3:43 AM, Udo Lembke ulem...@polarzone.de wrote: Hi all, due to a very silly approach, I removed the cache tier of a filled EC pool. After recreating the pool and connecting it with the EC pool I don't see any content. How can I see

Re: [ceph-users] How to estimate whether putting a journal on SSD will help with performance?

2015-05-01 Thread Udo Lembke
Hi, On 01.05.2015 10:30, Piotr Wachowicz wrote: Is there any way to confirm (beforehand) that using SSDs for journals will help? Yes, an SSD journal helps a lot for write speed (if you use the right SSDs), and in my experience it also helped (but not too much) for read performance.

Re: [ceph-users] Has the maximum performance been reached?

2015-07-28 Thread Udo Lembke
Hi, On 28.07.2015 12:02, Shneur Zalman Mattern wrote: Hi! And so, by your math I need to build size = osd, 30 replicas for my cluster of 120TB - to get my demands. 30 replicas is the wrong math! Fewer replicas = more speed (because of less writing). More replicas = less speed. For data

Re: [ceph-users] dropping old distros: el6, precise 12.04, debian wheezy?

2015-07-30 Thread Udo Lembke
Hi, dropping debian wheezy is quite fast - until now there aren't even packages for jessie?! Dropping squeeze I understand, but wheezy at this time? Udo On 30.07.2015 15:54, Sage Weil wrote: As time marches on it becomes increasingly difficult to maintain proper builds and packages for older

Re: [ceph-users] Different filesystems on OSD hosts at the same cluster

2015-08-07 Thread Udo Lembke
Hi, some time ago I switched all OSDs from XFS to ext4 (step by step). I had no issues during the mixed osd-format period (the process took some weeks). And yes, for me ext4 also performs better (esp. the latencies). Udo On 07.08.2015 13:31, Межов Игорь Александрович wrote: Hi! We do some
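For reference, the filestore mkfs/mount options for ext4 OSDs go into ceph.conf much like the xfs ones; a hypothetical fragment (the option values are assumptions, not the ones used in this thread):

    [osd]
    osd mkfs type = ext4
    osd mount options ext4 = user_xattr,rw,noatime,nodiratime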

Re: [ceph-users] Different filesystems on OSD hosts at the same cluster

2015-08-07 Thread Udo Lembke
1) the default is relatime which has minimal impact on performance 2) AFAIK some ceph features actually use atime (cache tiering was it?) or at least so I gathered from some bugs I saw Jan On 07 Aug 2015, at 16:30, Udo Lembke ulem...@polarzone.de wrote: Hi, I use the ext4-parameters like

Re: [ceph-users] different omap format in one cluster (.sst + .ldb) - newly installed OSD node doesn't start any OSD

2015-07-23 Thread Udo Lembke
osd? How many osds have this problem? This assert failure means that the osd detects an upgraded pg meta object but failed to read meta keys from the object (or 1 key is missing). On Thu, Jul 23, 2015 at 7:03 PM, Udo Lembke ulem...@polarzone.de wrote: On 21.07.2015 12:06, Udo Lembke wrote: Hi all

Re: [ceph-users] different omap format in one cluster (.sst + .ldb) - newly installed OSD node doesn't start any OSD

2015-07-23 Thread Udo Lembke
On 21.07.2015 12:06, Udo Lembke wrote: Hi all, ... Normally I would say, if one OSD node dies, I simply reinstall the OS and ceph and I'm back again... but this looks bad to me. Unfortunately the system also doesn't start 9 of the OSDs when I switch back to the old system disk... (only three

Re: [ceph-users] He8 drives

2015-07-13 Thread Udo Lembke
Hi, I have just expanded our ceph cluster (7 nodes) with one 8TB HGST per node (a change from 4TB to 8TB; alongside the 11 existing 4TB HGSTs). But I have set the primary affinity to 0 for the 8TB disks... so my performance values are not 8TB-disk related. Udo On 08.07.2015 02:28, Blair Bethwaite
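The primary affinity mentioned is set per OSD; a sketch (the OSD id is an example, and older releases need the mon option enabled first):

    # ceph.conf, [mon] or [global]:  mon osd allow primary affinity = true
    ceph osd primary-affinity osd.84 0     # never act as primary, still store replicas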

[ceph-users] different omap format in one cluster (.sst + .ldb) - newly installed OSD node doesn't start any OSD

2015-07-21 Thread Udo Lembke
Hi all, we have a ceph cluster with 7 OSD nodes (Debian Jessie (because of the patched tcmalloc) with ceph 0.94) which we expanded with one further node. For this node we used puppet with Debian 7.8, because ceph 0.92.2 doesn't install on Jessie (the 0.94.1 upgrade worked on the other nodes but 0.94.2 looks not

Re: [ceph-users] Network performance

2015-10-22 Thread Udo Lembke
Hi Jonas, you can create a bond over multiple NICs (which modes are possible depends on your switch) to use one IP address but more than one NIC. Udo On 21.10.2015 10:23, Jonas Björklund wrote: > Hello, > > In the configuration I have read about "cluster network" and "cluster addr". > Is it
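A minimal Debian bonding sketch for two NICs behind one IP (interface names, address and mode are examples; the usable modes depend on the switch):

    # /etc/network/interfaces (needs the ifenslave package)
    auto bond0
    iface bond0 inet static
        address 172.20.2.11
        netmask 255.255.255.0
        bond-slaves eth0 eth1
        bond-mode 802.3ad
        bond-miimon 100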

Re: [ceph-users] two or three replicas?

2015-11-03 Thread Udo Lembke
Hi, for production (with enough OSDs) three replicas is the right choice. The chance of data loss if two OSDs fail at the same time is too high. And if that happens, most of your data is lost, because the data is spread over many OSDs... And yes - two replicas is faster for writes. Udo On

Re: [ceph-users] v0.94.4 Hammer released

2015-10-20 Thread Udo Lembke
Hi, have you changed the ownership as described in Sage's mail about "v9.1.0 Infernalis release candidate released"? #. Fix the ownership:: chown -R ceph:ceph /var/lib/ceph or set ceph.conf to use root instead? When upgrading, administrators have two options: #. Add
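The two options referenced from the Infernalis/9.1.0 release notes, side by side:

    # option 1: keep running the daemons as root
    # ceph.conf, [global]:
    setuser match path = /var/lib/ceph/$type/$cluster-$id
    # option 2: fix the ownership (can take a long time on big OSDs)
    chown -R ceph:ceph /var/lib/ceph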

Re: [ceph-users] Cache tier experiences (for ample sized caches ^o^)

2015-10-07 Thread Udo Lembke
Hi Christian, On 07.10.2015 09:04, Christian Balzer wrote: > > ... > > My main suspect for the excessive slowness are actually the Toshiba DT > type drives used. > We only found out after deployment that these can go into a zombie mode > (20% of their usual performance for ~8 hours if not

Re: [ceph-users] Storage node refurbishing, a "freeze" OSD feature would be nice

2015-08-31 Thread Udo Lembke
Hi Christian, for my setup "b" takes too long - too much data movement and stress on all nodes. I simply (with replica 3) "set noout", reinstalled one node (with a new filesystem on the OSDs, but leaving them in the crushmap) and started all OSDs again (on Friday night) - it takes approx. less than one day

Re: [ceph-users] Ceph cluster NO read / write performance :: Ops are blocked

2015-09-07 Thread Udo Lembke
Hi Vickey, I had the same rados bench output after changing the motherboard of the monitor node with the lowest IP... Due to the new mainboard, I assume the hw clock was wrong during startup. Ceph health showed no errors, but none of the VMs were able to do IO (very high load on the VMs - but no traffic).

Re: [ceph-users] [sepia] debian jessie repository ?

2015-09-25 Thread Udo Lembke
Hi, you can use this sources list: cat /etc/apt/sources.list.d/ceph.list deb http://gitbuilder.ceph.com/ceph-deb-jessie-x86_64-basic/ref/v0.94.3 jessie main Udo On 25.09.2015 15:10, Jogi Hofmüller wrote: > Hi, > > On 2015-09-11 at 13:20, Florent B wrote: > >> Jessie repository will be available

Re: [ceph-users] All SSD Pool - Odd Performance

2015-11-22 Thread Udo Lembke
--verify=0 --verify_fatal=0 --numjobs=4 --rw=randwrite --blocksize=4k --group_reporting Udo On 22.11.2015 23:59, Udo Lembke wrote: > Hi Zoltan, > you are right (but this was two running systems...). > > I also see a big mistake: "--filename=/mnt/test.bin" (use simply > c
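A complete fio invocation around the flags quoted above might look like this (device path, iodepth and runtime are examples; pointing it at a file on a mounted filesystem tests the filesystem, not the raw rbd device):

    fio --name=randwrite-test --ioengine=libaio --direct=1 \
        --filename=/dev/rbd0 --iodepth=32 --numjobs=4 \
        --rw=randwrite --blocksize=4k --runtime=60 --time_based \
        --verify=0 --verify_fatal=0 --group_reporting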

Re: [ceph-users] All SSD Pool - Odd Performance

2015-11-22 Thread Udo Lembke
ents clean tomorrow. Udo On 22.11.2015 14:29, Zoltan Arnold Nagy wrote: > It would have been more interesting if you had tweaked only one option > as now we can’t be sure which change had what impact… :-) > >> On 22 Nov 2015, at 04:29, Udo Lembke <ulem...@polarzone.de >
