[ceph-users] Unable to add CEPH as Primary Storage - libvirt error undefined storage pool type

2014-04-28 Thread Andrija Panic
Hi, I'm trying to add CEPH as Primary Storage, but my libvirt 0.10.2 (CentOS 6.5) complains: - internal error missing backend for pool type 8. Is it possible that the libvirt 0.10.2 (shipped with CentOS 6.5) was not compiled with RBD support? Can't find how to check this... I'm able
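
A rough way to check whether a libvirt build has the RBD storage backend compiled in (a sketch; the pool XML path is a placeholder):

  # does the libvirt daemon link against librbd at all?
  ldd /usr/sbin/libvirtd | grep -i librbd
  # defining a pool whose XML has <pool type='rbd'> is another test: a build without
  # RBD support rejects it with an "unknown/missing backend for pool type" error
  virsh pool-define /path/to/rbd-pool.xml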

Re: [ceph-users] Unable to add CEPH as Primary Storage - libvirt error undefined storage pool type

2014-04-28 Thread Andrija Panic
Thank you very much Wido, any suggestion on compiling libvirt with RBD support (I already found a way), or perhaps using some prebuilt package that you would recommend? Best On 28 April 2014 13:25, Wido den Hollander w...@42on.com wrote: On 04/28/2014 12:49 PM, Andrija Panic wrote: Hi, I'm trying
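
For reference, a rough outline of rebuilding libvirt with the RBD backend switched on (a sketch; the devel package names assume the ceph.com el6 repository is configured):

  yum install librbd1-devel librados2-devel    # RBD/RADOS headers needed by the configure step
  ./configure --with-storage-rbd               # enable the RBD storage pool backend
  make && make install                         # or rebuild the SRPM with the same configure flag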

Re: [ceph-users] Unable to add CEPH as Primary Storage - libvirt error undefined storage pool type

2014-04-28 Thread Andrija Panic
Thanks Dan :) On 28 April 2014 15:02, Dan van der Ster daniel.vanders...@cern.ch wrote: On 28/04/14 14:54, Wido den Hollander wrote: On 04/28/2014 02:15 PM, Andrija Panic wrote: Thank you very much Wido, any suggestion on compiling libvirt with support (I already found a way) or perhaps

Re: [ceph-users] Unable to add CEPH as Primary Storage - libvirt error undefined storage pool type

2014-04-28 Thread Andrija Panic
Dan, is this maybe just RBD support for the KVM package (I already have RBD-enabled qemu, qemu-img etc. from the ceph.com site)? I need just libvirt with RBD support? Thanks On 28 April 2014 15:05, Andrija Panic andrija.pa...@gmail.com wrote: Thanks Dan :) On 28 April 2014 15:02, Dan van der Ster
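
To confirm the qemu side already speaks RBD (a sketch; pool and image names are placeholders, and output differs per version):

  qemu-img --help | grep -o rbd          # 'rbd' should appear in the supported formats list
  qemu-img info rbd:mypool/myimage       # only works if qemu-img was built with RBD support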

[ceph-users] OSD not starting at boot time

2014-04-30 Thread Andrija Panic
Hi, I was wondering why the OSDs would not start at boot time; this happens on 1 server (2 OSDs). If I check with chkconfig ceph --list, I can see that it should start - that is, the MON on this server does start, but the OSDs do not. I can normally start them with: service ceph start osd.X
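
A few things worth checking for sysvinit-started OSDs on CentOS 6 (a sketch; it assumes the stock init script and default data paths):

  chkconfig --list ceph                       # enabled for the right runlevels?
  hostname -s
  grep -A2 '\[osd\.' /etc/ceph/ceph.conf      # does each [osd.X] section have host = <this short hostname>?
  ls /var/lib/ceph/osd/ceph-*/sysvinit        # marker file the init script looks for on ceph-deploy'ed OSDs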

[ceph-users] Migrate system VMs from local storage to CEPH

2014-05-02 Thread Andrija Panic
Hi. I was wondering what would be the correct way to migrate system VMs (storage, console, VR) from local storage to CEPH. I'm on CS 4.2.1 and will soon be updating to 4.3... Is it enough to just change the global setting system.vm.use.local.storage from true to FALSE, and then destroy the system VMs
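
For what it's worth, the global setting can also be flipped via cloudmonkey (a sketch; most global settings only take effect after a management-server restart):

  cloudmonkey update configuration name=system.vm.use.local.storage value=false
  service cloudstack-management restart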

Re: [ceph-users] Migrate system VMs from local storage to CEPH

2014-05-05 Thread Andrija Panic
Thank you very much Wido, that's exactly what I was looking for. Thanks On 4 May 2014 18:30, Wido den Hollander w...@42on.com wrote: On 05/02/2014 04:06 PM, Andrija Panic wrote: Hi. I was wondering what would be the correct way to migrate system VMs (storage, console, VR) from local storage

Re: [ceph-users] Migrate system VMs from local storage to CEPH

2014-05-05 Thread Andrija Panic
Will try creating the tag inside the CS database, since GUI/cloudmonkey editing of the existing offering is NOT possible... On 5 May 2014 16:04, Brian Rak b...@gameservers.com wrote: This would be a better question for the Cloudstack community. On 5/2/2014 10:06 AM, Andrija Panic wrote: Hi. I
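
A hypothetical sketch of the database route; the table and column names here are assumptions and must be verified against your CloudStack version (back up the database first):

  # hypothetical only - verify the schema before touching anything
  mysql -u cloud -p cloud -e "SELECT id, name, tags FROM disk_offering;"
  mysql -u cloud -p cloud -e "UPDATE disk_offering SET tags = 'rbd' WHERE id = <offering_id>;"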

Re: [ceph-users] Migrate system VMs from local storage to CEPH

2014-05-05 Thread Andrija Panic
suggestion, please? Thanks, Andrija On 5 May 2014 16:11, Andrija Panic andrija.pa...@gmail.com wrote: Will try creating the tag inside the CS database, since GUI/cloudmonkey editing of the existing offering is NOT possible... On 5 May 2014 16:04, Brian Rak b...@gameservers.com wrote: This would

Re: [ceph-users] Replace journals disk

2014-05-06 Thread Andrija Panic
elegant than these manual steps... Cheers On 6 May 2014 12:52, Gandalf Corvotempesta gandalf.corvotempe...@gmail.com wrote: 2014-05-06 12:39 GMT+02:00 Andrija Panic andrija.pa...@gmail.com: Good question - I'm also interested. Do you want to move the journal to a dedicated disk/partition, i.e. on an SSD
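
For reference, the usual manual sequence for moving one OSD's journal onto a new SSD partition (a sketch; osd.3 and the partition are placeholders, and the OSD must be stoppable cleanly):

  ceph osd set noout                          # avoid rebalancing while the OSD is briefly down
  service ceph stop osd.3
  ceph-osd -i 3 --flush-journal               # drain whatever is still sitting in the old journal
  rm /var/lib/ceph/osd/ceph-3/journal
  ln -s /dev/disk/by-partuuid/<ssd-partition-uuid> /var/lib/ceph/osd/ceph-3/journal
  ceph-osd -i 3 --mkjournal                   # initialize the journal on the SSD
  service ceph start osd.3
  ceph osd unset noout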

Re: [ceph-users] Migrate system VMs from local storage to CEPH

2014-05-06 Thread Andrija Panic
wrote: On 05/05/2014 11:40 PM, Andrija Panic wrote: Hi Wido, thanks again for inputs. Everything is fine, except for the Software Router - it doesn't seem to get created on CEPH, no matter what I try. There is a separate offering for the VR, have you checked that? But this is more

Re: [ceph-users] NFS over CEPH - best practice

2014-05-07 Thread Andrija Panic
Mapping RBD image to 2 or more servers is the same as a shared storage device (SAN) - so from there on, you could do any clustering you want, based on what Wido said... On 7 May 2014 12:43, Andrei Mikhailovsky and...@arhont.com wrote: Wido, would this work if I were to run nfs over two or

[ceph-users] qemu-img break cloudstack snapshot

2014-05-10 Thread Andrija Panic
Hi, just to share my issue with the qemu-img provided by CEPH (Red Hat caused the problem, not CEPH): the newest qemu-img - /qemu-img-0.12.1.2-2.415.el6.3ceph.x86_64.rpm - was built from RHEL 6.5 source code, where Red Hat removed the -s parameter, so snapshotting in CloudStack up to 4.2.1 does not work, I
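
For context, the flag in question is the source-snapshot option of qemu-img convert (illustrative only, not the exact CloudStack invocation; file and snapshot names are placeholders):

  # upstream qemu-img of that generation accepted a source snapshot for convert:
  qemu-img convert -f qcow2 -O qcow2 -s mysnapshot source.qcow2 backup.qcow2
  # on the RHEL 6.5-based rebuild the -s flag was dropped, so this call fails
  # and CloudStack's volume snapshot backup breaks with it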

Re: [ceph-users] client: centos6.4 no rbd.ko

2014-05-14 Thread Andrija Panic
Try a 3.x kernel from the elrepo repo... works for me, cloudstack/ceph... Sent from Google Nexus 4 On May 14, 2014 11:56 AM, maoqi1982 maoqi1...@126.com wrote: Hi list, our ceph (0.72) cluster on ubuntu 12.04 is OK. The client server runs openstack, installed on CentOS 6.4 final; the kernel is up to
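
Roughly, pulling a 3.x mainline kernel (which ships rbd.ko) from ELRepo on CentOS 6 looks like this (a sketch; the exact elrepo-release RPM version changes over time):

  rpm --import https://www.elrepo.org/RPM-GPG-KEY-elrepo.org
  yum install http://www.elrepo.org/elrepo-release-6-6.el6.elrepo.noarch.rpm   # version number may differ
  yum --enablerepo=elrepo-kernel install kernel-ml                             # or kernel-lt for the long-term branch
  # set the new kernel as the grub default, reboot, then:
  modprobe rbd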

[ceph-users] Cluster status reported wrongly as HEALTH_WARN

2014-06-17 Thread Andrija Panic
Hi, I have a 3 node (2 OSDs per node) CEPH cluster, running fine, not much data, network also fine: Ceph ceph-0.72.2. When I issue the ceph status command, I randomly get HEALTH_OK, and immediately after that, when repeating the command, I get HEALTH_WARN. Example given below - these commands were issued
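
When the status flaps like this, it helps to ask each monitor directly what it is unhappy about (a sketch; mon ids are assumed to be the short hostnames, and the option names should be cross-checked against the config show output):

  ceph health detail                          # names the offending mon and the reason, e.g. low disk space
  df -h /var/lib/ceph/mon                     # free space under the monitor's data dir
  ceph --admin-daemon /var/run/ceph/ceph-mon.$(hostname -s).asok config show | grep mon_data_avail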

Re: [ceph-users] Cluster status reported wrongly as HEALTH_WARN

2014-06-17 Thread Andrija Panic
:44 +0200 Andrija Panic wrote: Hi, I have a 3 node (2 OSDs per node) CEPH cluster, running fine, not much data, network also fine: Ceph ceph-0.72.2. When I issue the ceph status command, I randomly get HEALTH_OK, and immediately after that, when repeating the command, I get HEALTH_WARN. Example

Re: [ceph-users] Cluster status reported wrongly as HEALTH_WARN

2014-06-17 Thread Andrija Panic
be a disk space issue. Regards, *Stanislav Yanchev* Core System Administrator Mobile: +359 882 549 441 s.yanc...@maxtelecom.bg www.maxtelecom.bg *From:* ceph-users [mailto:ceph-users-boun...@lists.ceph.com] *On Behalf Of *Andrija Panic *Sent:* Tuesday, June 17, 2014

Re: [ceph-users] Cluster status reported wrongly as HEALTH_WARN

2014-06-18 Thread Andrija Panic
...@inktank.com wrote: Try running ceph health detail on each of the monitors. Your disk space thresholds probably aren't configured correctly or something. -Greg Software Engineer #42 @ http://inktank.com | http://ceph.com On Tue, Jun 17, 2014 at 2:09 AM, Andrija Panic andrija.pa...@gmail.com wrote

Re: [ceph-users] Cluster status reported wrongly as HEALTH_WARN

2014-06-18 Thread Andrija Panic
As stupid as I could do it... After lowering the mon data . threshold from 20% to 15%, it seems I forgot to restart the MON service on this one node... I apologize for bugging you, and thanks again everybody. Andrija On 18 June 2014 09:49, Andrija Panic andrija.pa...@gmail.com wrote: Hi Gregory

Re: [ceph-users] Cluster status reported wrongly as HEALTH_WARN

2014-06-18 Thread Andrija Panic
Thanks Greg, seems like I'm going to update soon... Thanks again, Andrija On 18 June 2014 14:06, Gregory Farnum g...@inktank.com wrote: The lack of warnings in ceph -w for this issue is a bug in Emperor. It's resolved in Firefly. -Greg On Wed, Jun 18, 2014 at 3:49 AM, Andrija Panic

[ceph-users] Mixing CEPH versions on new ceph nodes...

2014-07-02 Thread Andrija Panic
Hi, I have an existing CEPH cluster of 3 nodes, version 0.72.2. I'm in the process of installing CEPH on a 4th node, but now the CEPH version is 0.80.1. Will this cause problems running mixed CEPH versions? I intend to upgrade CEPH on the existing 3 nodes anyway. Recommended steps? Thanks -- Andrija
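
The usual rolling order, sketched per node (monitors first, then OSDs; note the RPM auto-restart caveat discussed further down in this archive):

  # on each monitor node, one at a time
  yum update ceph
  service ceph restart mon
  ceph -s                    # wait until all mons are back in quorum
  # then on each OSD node, one at a time
  yum update ceph
  service ceph restart osd
  ceph -s                    # wait for HEALTH_OK before moving to the next node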

Re: [ceph-users] Mixing CEPH versions on new ceph nodes...

2014-07-03 Thread Andrija Panic
/02/2014 04:08 PM, Andrija Panic wrote: Hi, I have existing CEPH cluster of 3 nodes, versions 0.72.2 I'm in a process of installing CEPH on 4th node, but now CEPH version is 0.80.1 Will this make problems running mixed CEPH versions ? No, but the recommendation is not to have this running

Re: [ceph-users] Mixing CEPH versions on new ceph nodes...

2014-07-03 Thread Andrija Panic
Thanks a lot Wido, will do... Andrija On 3 July 2014 13:12, Wido den Hollander w...@42on.com wrote: On 07/03/2014 10:59 AM, Andrija Panic wrote: Hi Wido, thanks for answers - I have mons and OSD on each host... server1: mon + 2 OSDs, same for server2 and server3. Any Proposed upgrade

Re: [ceph-users] Mixing CEPH versions on new ceph nodes...

2014-07-03 Thread Andrija Panic
Wido, one final question: since I compiled libvirt 1.2.3 using ceph-devel 0.72 - do I need to recompile libvirt again now with ceph-devel 0.80? Perhaps not a smart question, but I need to make sure I don't screw something up... Thanks for your time, Andrija On 3 July 2014 14:27, Andrija Panic

Re: [ceph-users] Mixing CEPH versions on new ceph nodes...

2014-07-03 Thread Andrija Panic
Thanks again a lot. On 3 July 2014 15:20, Wido den Hollander w...@42on.com wrote: On 07/03/2014 03:07 PM, Andrija Panic wrote: Wido, one final question: since I compiled libvirt1.2.3 usinfg ceph-devel 0.72 - do I need to recompile libvirt again now with ceph-devel 0.80 ? Perhaps

[ceph-users] [URGENT]. Can't connect to CEPH after upgrade from 0.72 to 0.80

2014-07-12 Thread Andrija Panic
Hi, sorry to bother you, but I have an urgent situation: upgraded CEPH from 0.72 to 0.80 (CentOS 6.5), and now all my CloudStack HOSTS can not connect. I did a basic yum update ceph on the first MON leader, and all CEPH services on that HOST were restarted - did the same on the other CEPH nodes (I have

Re: [ceph-users] [URGENT]. Can't connect to CEPH after upgrade from 0.72 to 0.80

2014-07-13 Thread Andrija Panic
suggestion on need to recompile libvirt ? I got info from Wido, that libvirt does NOT need to be recompiled Best On 13 July 2014 08:35, Mark Kirkwood mark.kirkw...@catalyst.net.nz wrote: On 13/07/14 17:07, Andrija Panic wrote: Hi, Sorry to bother, but I have urgent situation: upgraded

Re: [ceph-users] [URGENT]. Can't connect to CEPH after upgrade from 0.72 to 0.80

2014-07-13 Thread Andrija Panic
for your time for my issue... Best. Andrija On 13 July 2014 10:20, Mark Kirkwood mark.kirkw...@catalyst.net.nz wrote: On 13/07/14 19:15, Mark Kirkwood wrote: On 13/07/14 18:38, Andrija Panic wrote: Any suggestion on need to recompile libvirt ? I got info from Wido, that libvirt does

[ceph-users] ceph osd crush tunables optimal AND add new OSD at the same time

2014-07-13 Thread Andrija Panic
Hi, after the ceph upgrade (0.72.2 to 0.80.3) I issued ceph osd crush tunables optimal, and after only a few minutes I added 2 more OSDs to the CEPH cluster... So these 2 changes were more or less done at the same time - rebalancing because of tunables optimal, and rebalancing
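
One way to avoid stacking the two rebalances (also suggested further down in this thread) is to hold recovery while both changes are made, then release it once; roughly:

  ceph osd set nobackfill
  ceph osd set norecover
  ceph osd crush tunables optimal      # change #1
  # add the new OSDs here              # change #2
  ceph osd unset nobackfill
  ceph osd unset norecover             # a single combined rebalance starts now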

Re: [ceph-users] Mixing CEPH versions on new ceph nodes...

2014-07-13 Thread Andrija Panic
daemons automatically? Since it makes sense to have all MONs updated first, and then the OSDs (and perhaps after that the MDS, if using it...) Upgraded to the 0.80.3 release btw. Thanks for your help again. Andrija On 3 July 2014 15:21, Andrija Panic andrija.pa...@gmail.com wrote: Thanks again a lot. On 3

Re: [ceph-users] ceph osd crush tunables optimal AND add new OSD at the same time

2014-07-14 Thread Andrija Panic
in our upgrade process 2. What options should we have used to keep our vms alive Cheers Andrei -- *From: *Andrija Panic andrija.pa...@gmail.com *To: *ceph-users@lists.ceph.com *Sent: *Sunday, 13 July, 2014 9:54:17 PM *Subject: *[ceph-users] ceph osd crush

Re: [ceph-users] ceph osd crush tunables optimal AND add new OSD at the same time

2014-07-14 Thread Andrija Panic
of overhead related to rebalancing... and it's clearly not perfect yet. :/ sage On Sun, 13 Jul 2014, Andrija Panic wrote: Hi, after the ceph upgrade (0.72.2 to 0.80.3) I issued ceph osd crush tunables optimal and after only a few minutes I added 2 more OSDs to the CEPH cluster

Re: [ceph-users] ceph osd crush tunables optimal AND add new OSD at the same time

2014-07-14 Thread Andrija Panic
Udo, I had all VMs completely non-operational - so don't set optimal for now... On 14 July 2014 20:48, Udo Lembke ulem...@polarzone.de wrote: Hi, which values are all changed with ceph osd crush tunables optimal? Is it perhaps possible to change some parameters on the weekends before the upgrade
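
If a cluster becomes unusable after the switch, the profile can be put back (a sketch; this triggers another round of data movement, so throttle recovery first):

  ceph osd crush tunables legacy       # or 'bobtail' as a middle ground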

Re: [ceph-users] ceph osd crush tunables optimal AND add new OSD at the same time

2014-07-15 Thread Andrija Panic
not perfect yet. :/ sage On Sun, 13 Jul 2014, Andrija Panic wrote: Hi, after the ceph upgrade (0.72.2 to 0.80.3) I issued ceph osd crush tunables optimal and after only a few minutes I added 2 more OSDs to the CEPH cluster... So these 2 changes were more or less done

Re: [ceph-users] v0.80.4 Firefly released

2014-07-16 Thread Andrija Panic
Hi Sage, can anyone confirm whether there is still a bug inside the RPMs that does an automatic CEPH service restart after updating packages? We are instructed to first update/restart the MONs, and after that the OSDs - but that is impossible if we have MON+OSDs on the same host... since ceph is automatically

Re: [ceph-users] ceph osd crush tunables optimal AND add new OSD at the same time

2014-07-16 Thread Andrija Panic
prohibited. If you have received this email in error please immediately advise us by return email at and...@arhont.com and delete and purge the email and any attachments without making a copy. -- *From: *Quenten Grasso qgra...@onq.com.au *To: *Andrija Panic andrija.pa

[ceph-users] Show IOps per VM/client to find heavy users...

2014-08-08 Thread Andrija Panic
Hi, we just had some new clients, and have suffered a very big degradation in CEPH performance for some reason (we are using CloudStack). I'm wondering if there is a way to monitor OP/s or similar usage per connected client, so we can isolate the heavy client? Also, what is the general best
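
The log-parsing approach that comes up later in this thread (Dan's rbd-io-stats.pl) needs the OSDs to log their ops for a while; a rough sketch, where debug_ms 1 is an assumption about the level the script expects:

  ceph tell osd.* injectargs '--debug_ms 1'    # temporarily raise message logging on all OSDs
  sleep 600                                    # let a window of traffic accumulate
  ceph tell osd.* injectargs '--debug_ms 0'
  # then feed /var/log/ceph/ceph-osd.*.log to the rbd-io-stats.pl script linked below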

Re: [ceph-users] Show IOps per VM/client to find heavy users...

2014-08-08 Thread Andrija Panic
Thanks Wido, yes I'm aware of CloudStack in that sense, but would prefer some precise OP/s per ceph Image at least... Will check CloudStack then... Thx On 8 August 2014 13:53, Wido den Hollander w...@42on.com wrote: On 08/08/2014 01:51 PM, Andrija Panic wrote: Hi, we just had some new

Re: [ceph-users] Show IOps per VM/client to find heavy users...

2014-08-08 Thread Andrija Panic
- could not find anything with google... Thanks again Wido. Andrija On 8 August 2014 14:07, Wido den Hollander w...@42on.com wrote: On 08/08/2014 02:02 PM, Andrija Panic wrote: Thanks Wido, yes I'm aware of CloudStack in that sense, but would prefer some precise OP/s per ceph Image at least

Re: [ceph-users] Show IOps per VM/client to find heavy users...

2014-08-08 Thread Andrija Panic
eat up the entire iops capacity of the cluster. Cheers, Dan -- Dan van der Ster || Data Storage Services || CERN IT Department -- On 08 Aug 2014, at 13:51, Andrija Panic andrija.pa...@gmail.com wrote: Hi, we just had some new clients, and have suffered very big degradation in CEPH

Re: [ceph-users] Show IOps per VM/client to find heavy users...

2014-08-08 Thread Andrija Panic
Thanks again, and btw, besides being Friday I'm also on vacation - so double the joy of troubleshooting performance problems :))) Thx :) On 8 August 2014 16:01, Dan Van Der Ster daniel.vanders...@cern.ch wrote: Hi, On 08 Aug 2014, at 15:55, Andrija Panic andrija.pa...@gmail.com wrote

Re: [ceph-users] Show IOps per VM/client to find heavy users...

2014-08-08 Thread Andrija Panic
Storage Services || CERN IT Department -- On 08 Aug 2014, at 13:51, Andrija Panic andrija.pa...@gmail.com wrote: Hi, we just had some new clients, and have suffered a very big degradation in CEPH performance for some reason (we are using CloudStack). I'm

Re: [ceph-users] Show IOps per VM/client to find heavy users...

2014-08-11 Thread Andrija Panic
: Writes per RBD: Writes per object: Writes per length: . . . On 8 August 2014 16:01, Dan Van Der Ster daniel.vanders...@cern.ch wrote: Hi, On 08 Aug 2014, at 15:55, Andrija Panic andrija.pa...@gmail.com wrote: Hi Dan, thank you very much for the script, will check it out... no throttling

Re: [ceph-users] Show IOps per VM/client to find heavy users...

2014-08-11 Thread Andrija Panic
to fix this... ? Thanks, Andrija On 11 August 2014 12:46, Andrija Panic andrija.pa...@gmail.com wrote: Hi Dan, the script provided seems to not work on my ceph cluster :( This is ceph version 0.80.3 I get empty results, on both debug level 10 and the maximum level of 20... [root@cs1

Re: [ceph-users] Show IOps per VM/client to find heavy users...

2014-08-11 Thread Andrija Panic
/cernceph/ceph-scripts/blob/master/tools/rbd-io-stats.pl Cheers, Dan -- Dan van der Ster || Data Storage Services || CERN IT Department -- On 11 Aug 2014, at 12:48, Andrija Panic andrija.pa...@gmail.com wrote: I appologize, clicked the Send button to fast... Anyway, I can see

[ceph-users] Multiply OSDs per host strategy ?

2013-10-16 Thread Andrija Panic
Hi, I have 2 x 2TB disks in 3 servers, so a total of 6 disks... I have deployed a total of 6 OSDs, i.e.: host1 = osd.0 and osd.1, host2 = osd.2 and osd.3, host3 = osd.4 and osd.5. Now, since I will have a total of 3 replicas (original + 2 replicas), I want my replica placement to be such that I don't end
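
The default CRUSH rule already does this as long as its failure domain is the host; a quick way to check (a sketch):

  ceph osd getcrushmap -o map.bin
  crushtool -d map.bin -o map.txt
  # in the rule used by the pool, the leaf type is the failure domain:
  #   step chooseleaf firstn 0 type host   -> replicas land on distinct hosts
  #   step chooseleaf firstn 0 type osd    -> two copies may end up on the same server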

Re: [ceph-users] Multiply OSDs per host strategy ?

2013-10-16 Thread Andrija Panic
-map/ Cheers, Mike Dawson On 10/16/2013 5:16 PM, Andrija Panic wrote: Hi, I have 2 x 2TB disks in 3 servers, so a total of 6 disks... I have deployed a total of 6 OSDs, i.e.: host1 = osd.0 and osd.1, host2 = osd.2 and osd.3, host3 = osd.4 and osd.5. Now, since I will have a total of 3

Re: [ceph-users] RBD read-ahead not working in 0.87.1

2015-03-18 Thread Andrija Panic
Actually, good question - is RBD caching possible at all with Windows guests, if using the latest VirtIO drivers? Linux caching (write caching, writeback) is working fine with the newer virtio drivers... Thanks On 18 March 2015 at 10:39, Alexandre DERUMIER aderum...@odiso.com wrote: Hi, I
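
For completeness, the librbd cache is enabled on the hypervisor side, independent of the guest OS (a sketch; the VM disk typically also needs cache='writeback' in its libvirt definition):

  [client]
      rbd cache = true
      rbd cache writethrough until flush = true    # stays writethrough until the guest issues a flush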

Re: [ceph-users] Doesn't Support Qcow2 Disk images

2015-03-12 Thread Andrija Panic
ceph is RAW format - should be all fine...so VM will be using that RAW format On 12 March 2015 at 09:03, Azad Aliyar azad.ali...@sparksupport.com wrote: Community please explain the 2nd warning on this page: http://ceph.com/docs/master/rbd/rbd-openstack/ Important Ceph doesn’t support QCOW2
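
Converting an existing QCOW2 image to RAW on RBD is a one-liner with an rbd-enabled qemu-img (a sketch; pool and image names are placeholders):

  qemu-img convert -f qcow2 -O raw template.qcow2 rbd:cloudstack/template-image
  # or convert locally and import with the rbd CLI:
  qemu-img convert -f qcow2 -O raw template.qcow2 template.raw
  rbd import template.raw cloudstack/template-image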

Re: [ceph-users] Rebalance/Backfill Throtling - anything missing here?

2015-03-05 Thread Andrija Panic
for some reason... it just stayed degraded... so this is the reason why I started the OSD back up, and then set it to out...) Thanks On 4 March 2015 at 17:54, Andrija Panic andrija.pa...@gmail.com wrote: Hi Robert, I already have this stuff set. CEPH is 0.87.0 now... Thanks, will schedule

Re: [ceph-users] Rebalance/Backfill Throtling - anything missing here?

2015-03-05 Thread Andrija Panic
at the same time. If you try this, please report back on your experience. I'm might try it in my lab, but I'm really busy at the moment so I don't know if I'll get to it real soon. On Thu, Mar 5, 2015 at 12:53 PM, Andrija Panic andrija.pa...@gmail.com wrote: Hi Robert, it seems I have

[ceph-users] [rbd cache experience - given]

2015-03-07 Thread Andrija Panic
Hi there, just wanted to share some benchmark experience with RBD caching, which I have just (partially) implemented. These are not nicely formatted results, just raw numbers to understand the difference. *INFRASTRUCTURE: - 3 hosts with: 12 x 4TB drives, 6 journals on 1 SSD, 6 journals on

Re: [ceph-users] Adding Monitor

2015-03-13 Thread Andrija Panic
Georgios, you need to have a deployment server and cd into the folder that you used originally while deploying CEPH - in this folder you should already have ceph.conf, the client.admin keyring and other stuff - which is required to connect to the cluster... and provision new MONs or OSDs, etc. Message:

Re: [ceph-users] Adding Monitor

2015-03-13 Thread Andrija Panic
Check firewall - I hit this issue over and over again... On 13 March 2015 at 22:25, Georgios Dimitrakakis gior...@acmac.uoc.gr wrote: On an already available cluster I 've tried to add a new monitor! I have used ceph-deploy mon create {NODE} where {NODE}=the name of the node and then I
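
On CentOS the usual culprit is iptables blocking the monitor port; a quick sketch for the new mon node:

  iptables -L -n | grep 6789                             # is 6789/tcp already allowed?
  iptables -I INPUT -p tcp --dport 6789 -j ACCEPT        # monitor port
  iptables -I INPUT -p tcp --dport 6800:7300 -j ACCEPT   # OSD/MDS port range, if those run here too
  service iptables save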

Re: [ceph-users] Public Network Meaning

2015-03-14 Thread Andrija Panic
Public network is client-to-OSD traffic - and if you have NOT explicitly defined a cluster network, then OSD-to-OSD replication also takes place over the same network. Otherwise, you can define a public and a cluster (private) network - so OSD replication will happen over dedicated NICs (cluster network)
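
For reference, the split is just two lines in ceph.conf (placeholder subnets); every daemon needs the same entries and a restart afterwards:

  [global]
      public network  = 192.168.10.0/24    # client <-> MON/OSD traffic
      cluster network = 192.168.20.0/24    # OSD <-> OSD replication and backfill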

Re: [ceph-users] [SPAM] Changing pg_num = RBD VM down !

2015-03-14 Thread Andrija Panic
changing the PG number causes a LOT of data rebalancing (in my case it was 80%), which I learned the hard way... On 14 March 2015 at 18:49, Gabri Mate mailingl...@modernbiztonsag.org wrote: I had the same issue a few days ago. I was increasing the pg_num of one pool from 512 to 1024 and all the VMs
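
To soften the hit, pg_num can be raised in smaller steps, with pgp_num following once each step settles (a sketch; pool name and step size are placeholders):

  ceph osd pool set <poolname> pg_num 640      # small step up from 512 instead of jumping straight to 1024
  ceph osd pool set <poolname> pgp_num 640     # only after the new PGs have been created
  # wait for the rebalance to finish (ceph -s), then repeat with the next step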

Re: [ceph-users] Public Network Meaning

2015-03-14 Thread Andrija Panic
This is how I did it, and then restart each OSD one by one, but monitor with ceph -s; when ceph is healthy, proceed with the next OSD restart... Make sure the networks are fine on the physical nodes, and that you can ping in between... [global] x x x x x x # ###

Re: [ceph-users] {Disarmed} Re: Public Network Meaning

2015-03-14 Thread Andrija Panic
Georgios, no need to put ANYTHING if you don't plan to split client-to-OSD vs OSD-to-OSD replication onto 2 different network cards/networks - for performance reasons. If you have only 1 network - simply DON'T configure networks at all inside your ceph.conf file... if you have 2 x 1G cards in the servers,

Re: [ceph-users] {Disarmed} Re: {Disarmed} Re: Public Network Meaning

2015-03-14 Thread Andrija Panic
In that case - yes... put everything on 1 card - or if both cards are 1G (or the same speed for that matter...) - then you might want to block all external traffic except e.g. SSH and WEB, but allow ALL traffic between all CEPH OSDs... so you can still use that network for public/client traffic - not sure

Re: [ceph-users] Turning on SCRUB back on - any suggestion ?

2015-03-13 Thread Andrija Panic
Thanks Wido - I will do that. On 13 March 2015 at 09:46, Wido den Hollander w...@42on.com wrote: On 13-03-15 09:42, Andrija Panic wrote: Hi all, I have set nodeep-scrub and noscrub while I had small/slow hardware for the cluster. It has been off for a while now. Now we

[ceph-users] Turning on SCRUB back on - any suggestion ?

2015-03-13 Thread Andrija Panic
Hi all, I have set nodeep-scrub and noscrub while I had small/slow hardware for the cluster. It has been off for a while now. Now we are upgraded with hardware/networking/SSDs and I would like to activate - or unset these flags. Since I now have 3 servers with 12 OSDs each (SSD based Journals)

Re: [ceph-users] Turning on SCRUB back on - any suggestion ?

2015-03-13 Thread Andrija Panic
Nice - so I just realized I need to manually scrub 1216 placements groups :) On 13 March 2015 at 10:16, Andrija Panic andrija.pa...@gmail.com wrote: Thanks Wido - I will do that. On 13 March 2015 at 09:46, Wido den Hollander w...@42on.com wrote: On 13-03-15 09:42, Andrija Panic wrote

Re: [ceph-users] Turning on SCRUB back on - any suggestion ?

2015-03-13 Thread Andrija Panic
Hollander wrote: On 13-03-15 09:42, Andrija Panic wrote: Hi all, I have set nodeep-scrub and noscrub while I had small/slow hardware for the cluster. It has been off for a while now. Now we are upgraded with hardware/networking/SSDs and I would like to activate - or unset these flags

Re: [ceph-users] Turning on SCRUB back on - any suggestion ?

2015-03-13 Thread Andrija Panic
Hm... nice. Thx guys On 13 March 2015 at 12:33, Henrik Korkuc li...@kirneh.eu wrote: I think the settings apply to both kinds of scrubs On 3/13/15 13:31, Andrija Panic wrote: Interesting... thx for that Henrik. BTW, my placement groups are around 1800 objects each (ceph pg dump) - meaning

Re: [ceph-users] Turning on SCRUB back on - any suggestion ?

2015-03-13 Thread Andrija Panic
Will do, of course :) THx Wido for quick help, as always ! On 13 March 2015 at 12:04, Wido den Hollander w...@42on.com wrote: On 13-03-15 12:00, Andrija Panic wrote: Nice - so I just realized I need to manually scrub 1216 placements groups :) With manual I meant using a script. Loop
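
A minimal version of such a loop (a sketch; the sleep is arbitrary and should match how much scrub load the cluster tolerates):

  ceph osd unset noscrub
  ceph osd unset nodeep-scrub
  for pg in $(ceph pg dump pgs_brief 2>/dev/null | awk '/^[0-9]+\./ {print $1}'); do
      ceph pg deep-scrub "$pg"
      sleep 60        # spread the deep-scrubs out instead of queueing all 1216 at once
  done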

Re: [ceph-users] Rebalance/Backfill Throtling - anything missing here?

2015-03-03 Thread Andrija Panic
ceph]# ceph --admin-daemon /var/run/ceph/ceph-osd.94.asok config show | grep osd_recovery_delay_start osd_recovery_delay_start: 10 2015-03-03 13:13 GMT+03:00 Andrija Panic andrija.pa...@gmail.com: HI Guys, yesterday I removed 1 OSD from the cluster (out of 42 OSDs), and it caused over 37% of

Re: [ceph-users] Rebalance/Backfill Throtling - anything missing here?

2015-03-03 Thread Andrija Panic
GMT+03:00 Andrija Panic andrija.pa...@gmail.com: Hi Irek, yes, stopping the OSD (or setting it to OUT) resulted in only 3% of data degraded and moved/recovered. When I afterwards removed it from the Crush map with ceph osd crush rm id, that's when the stuff with 37% happened. And thanks Irek for the help

[ceph-users] [URGENT-HELP] - Ceph rebalancing again after taking OSD out of CRUSH map

2015-03-02 Thread Andrija Panic
Hi people, I had one OSD crash, so the rebalancing happened - all fine (some 3% of the data was moved around and rebalanced), my previous recovery/backfill throttling was applied fine and we didn't have an unusable cluster. Now I used the procedure to remove this crashed OSD completely
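
The double rebalance ("out" moves data once, "crush remove" moves it again) can be avoided by draining the OSD through its CRUSH weight first (a sketch for one example id):

  ceph osd crush reweight osd.23 0     # drains the OSD; the single big rebalance happens here
  # wait for recovery to finish (ceph -s), then:
  ceph osd out 23
  service ceph stop osd.23
  ceph osd crush remove osd.23         # no further data movement this time
  ceph auth del osd.23
  ceph osd rm 23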

Re: [ceph-users] [URGENT-HELP] - Ceph rebalancing again after taking OSD out of CRUSH map

2015-03-02 Thread Andrija Panic
, when my cluster completely collapsed during data rebalancing... I don't see any option to contribute to the documentation? Best On 2 March 2015 at 16:07, Wido den Hollander w...@42on.com wrote: On 03/02/2015 03:56 PM, Andrija Panic wrote: Hi people, I had one OSD crash, so the rebalancing

Re: [ceph-users] Implement replication network with live cluster

2015-03-04 Thread Andrija Panic
are good to go by restarting one OSD at a time. On Wed, Mar 4, 2015 at 4:17 AM, Andrija Panic andrija.pa...@gmail.com wrote: Hi, I'm having a live cluster with only a public network (so no explicit network configuration in the ceph.conf file). I'm wondering what is the procedure

Re: [ceph-users] Implement replication network with live cluster

2015-03-04 Thread Andrija Panic
configuration to all OSDs and restart them one by one. Make sure the network is of course up and running and it should work. On Wed, Mar 4, 2015 at 4:17 AM, Andrija Panic andrija.pa...@gmail.com wrote: Hi, I'm having a live cluster with only a public network (so no explicit network configuration

Re: [ceph-users] Implement replication network with live cluster

2015-03-04 Thread Andrija Panic
. Tired, sorry... On 4 March 2015 at 17:48, Andrija Panic andrija.pa...@gmail.com wrote: That was my thought, yes - I found this blog that confirms what you are saying I guess: http://www.sebastien-han.fr/blog/2012/07/29/tip-ceph-public-slash-private-network-configuration/ I will do

Re: [ceph-users] Implement replication network with live cluster

2015-03-04 Thread Andrija Panic
, Andrija Panic andrija.pa...@gmail.com wrote: That was my thought, yes - I found this blog that confirms what you are saying, I guess: http://www.sebastien-han.fr/blog/2012/07/29/tip-ceph-public-slash-private-network-configuration/ I will do that... Thx I guess it doesn't matter, since my Crush Map

Re: [ceph-users] Rebalance/Backfill Throtling - anything missing here?

2015-03-04 Thread Andrija Panic
you are running, but I think there was some priority work done in firefly to help make backfills lower priority. I think it has gotten better in later versions. On Wed, Mar 4, 2015 at 1:35 AM, Andrija Panic andrija.pa...@gmail.com wrote: Thank you Robert - I'm wondering when I do remove total

Re: [ceph-users] Rebalance/Backfill Throtling - anything missing here?

2015-03-04 Thread Andrija Panic
adding new nodes, when nobackfill and norecover is set, you can add them in so that the one big relocate fills the new drives too. On Tue, Mar 3, 2015 at 5:58 AM, Andrija Panic andrija.pa...@gmail.com wrote: Thx Irek. Number of replicas is 3. I have 3 servers with 2 OSDs on them on 1g switch

[ceph-users] Implement replication network with live cluster

2015-03-04 Thread Andrija Panic
Hi, I'm having a live cluster with only a public network (so no explicit network configuration in the ceph.conf file). I'm wondering what is the procedure to implement a dedicated Replication/Private and Public network. I've read the manual and know how to do it in ceph.conf, but I'm wondering, since this

Re: [ceph-users] Rebalance/Backfill Throtling - anything missing here?

2015-03-03 Thread Andrija Panic
potentially go with 7 x the same number of misplaced objects...? Any thoughts? Thanks On 3 March 2015 at 12:14, Andrija Panic andrija.pa...@gmail.com wrote: Thanks Irek. Does this mean that after peering for each PG, there will be a delay of 10 sec, meaning that every once in a while, I will have

Re: [ceph-users] Rebalance/Backfill Throtling - anything missing here?

2015-03-03 Thread Andrija Panic
, the correct option is to remove the entire node, rather than each disk individually 2015-03-03 14:27 GMT+03:00 Andrija Panic andrija.pa...@gmail.com: Another question - I mentioned here 37% of objects being moved around - these are MISPLACED objects (degraded objects were 0.001%, after I removed

[ceph-users] Rebalance/Backfill Throtling - anything missing here?

2015-03-03 Thread Andrija Panic
HI Guys, yesterday I removed 1 OSD from the cluster (out of 42 OSDs), and it caused over 37% of the data to rebalance - let's say this is fine (this is when I removed it from the Crush Map). I'm wondering - I had previously set some throttling mechanisms, but during the first 1h of rebalancing, my rate of
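
For reference, the knobs usually turned down for gentler backfill/recovery (a sketch, injected live; the values are conservative examples):

  ceph tell osd.* injectargs '--osd_max_backfills 1'
  ceph tell osd.* injectargs '--osd_recovery_max_active 1'
  ceph tell osd.* injectargs '--osd_recovery_op_priority 1'
  ceph tell osd.* injectargs '--osd_client_op_priority 63'   # keep client IO ahead of recovery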

Re: [ceph-users] replace dead SSD journal

2015-05-06 Thread Andrija Panic
Well, seems like they are on satellite :) On 6 May 2015 at 02:58, Matthew Monaco m...@monaco.cx wrote: On 05/05/2015 08:55 AM, Andrija Panic wrote: Hi, small update: in 3 months we lost 5 out of 6 Samsung 128GB 850 PROs (just a few days in between each SSD death) - can't believe

Re: [ceph-users] replace dead SSD journal

2015-05-05 Thread Andrija Panic
Hi, small update: in 3 months we lost 5 out of 6 Samsung 128GB 850 PROs (just a few days in between each SSD death) - can't believe it - NOT due to wearing out... I really hope we got a defective series from the supplier... Regards On 18 April 2015 at 14:24, Andrija Panic andrija.pa...@gmail.com

[ceph-users] replace dead SSD journal

2015-04-17 Thread Andrija Panic
Hi guys, I have 1 SSD that hosted 6 OSDs' journals, and it is dead, so 6 OSDs are down, ceph rebalanced, etc. Now I have a new SSD inside, and I will partition it etc. - but I would like to know how to proceed now with the journal recreation for those 6 OSDs that are down. Should I flush the journal
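
Since the journal died uncleanly, its unflushed writes are gone, so the safe route (the one suggested in the replies) is to rebuild the affected OSDs against the new SSD partitions; a sketch for one of them, with host and device names as placeholders:

  # remove the dead OSD (repeat for each of the six)
  ceph osd out 11
  ceph osd crush remove osd.11
  ceph auth del osd.11
  ceph osd rm 11
  # recreate it with its journal on a partition of the new SSD
  ceph-deploy osd create HOST:sdc:/dev/sdb1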

Re: [ceph-users] replace dead SSD journal

2015-04-17 Thread Andrija Panic
, at 18:49, Andrija Panic andrija.pa...@gmail.com wrote: 12 osds down - I expect less work with removing and adding osd? On Apr 17, 2015 6:35 PM, Krzysztof Nowicki krzysztof.a.nowi...@gmail.com wrote: Why not just wipe out the OSD filesystem, run ceph-osd --mkfs with the existing OSD UUID, copy

Re: [ceph-users] replace dead SSD journal

2015-04-17 Thread Andrija Panic
Thx guys, that's what I will be doing in the end. Cheers On Apr 17, 2015 6:24 PM, Robert LeBlanc rob...@leblancnet.us wrote: Delete and re-add all six OSDs. On Fri, Apr 17, 2015 at 3:36 AM, Andrija Panic andrija.pa...@gmail.com wrote: Hi guys, I have 1 SSD that hosted 6 OSDs' journals

Re: [ceph-users] replace dead SSD journal

2015-04-17 Thread Andrija Panic
2015 at 18:31, Andrija Panic andrija.pa...@gmail.com wrote: Thx guys, that's what I will be doing in the end. Cheers On Apr 17, 2015 6:24 PM, Robert LeBlanc rob...@leblancnet.us wrote: Delete and re-add all six OSDs. On Fri, Apr 17, 2015 at 3:36 AM, Andrija Panic andrija.pa

Re: [ceph-users] replace dead SSD journal

2015-04-17 Thread Andrija Panic
on the SSD? /Josef On 17 Apr 2015 20:05, Andrija Panic andrija.pa...@gmail.com wrote: An SSD that hosted journals for 6 OSDs died - 2 x SSDs died, so 12 OSDs are down, and the rebalancing is about to finish... after which I need to fix the OSDs. On 17 April 2015 at 19:01, Josef Johansson jo...@oderland.se

Re: [ceph-users] replace dead SSD journal

2015-04-17 Thread Andrija Panic
) for about half a year now. So far so good. I'll be keeping a closer eye on them. Fri, 17 Apr 2015, 21:07, Andrija Panic andrija.pa...@gmail.com wrote: nah... Samsung 850 PRO 128GB - dead after 3 months - 2 of these died... wearing level is 96%, so only 4% wasted... (yes I know

Re: [ceph-users] replace dead SSD journal

2015-04-18 Thread Andrija Panic
yes I know, but too late now, I'm afraid :) On 18 April 2015 at 14:18, Josef Johansson jose...@gmail.com wrote: Have you looked into the Samsung 845 DC? They are not that expensive last time I checked. /Josef On 18 Apr 2015 13:15, Andrija Panic andrija.pa...@gmail.com wrote: might be true

Re: [ceph-users] replace dead SSD journal

2015-04-18 Thread Andrija Panic
inclined to suffer this fate. Regards Mark On 18/04/15 22:23, Andrija Panic wrote: these 2 drives are on the regular SATA (on-board) controller, and besides this, there are 12 x 4TB on the front of the servers - normal backplane on the front. Anyway, we are going to check those dead SSDs

[ceph-users] ceph-deploy journal on separate partition - quck info needed

2015-04-17 Thread Andrija Panic
Hi all, when I run: ceph-deploy osd create SERVER:sdi:/dev/sdb5 (sdi = previously ZAP-ed 4TB drive) (sdb5 = previously manually created empty partition with fdisk) Is ceph-deploy going to create journal properly on sdb5 (something similar to: ceph-osd -i $ID --mkjournal ), or do I need to do
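
After ceph-deploy finishes, the journal location can be double-checked on the OSD node (a sketch; the osd id is a placeholder):

  ls -l /var/lib/ceph/osd/ceph-12/journal    # should be a symlink pointing at the sdb5 partition
  ceph --admin-daemon /var/run/ceph/ceph-osd.12.asok config show | grep osd_journal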

Re: [ceph-users] ceph-deploy journal on separate partition - quck info needed

2015-04-17 Thread Andrija Panic
was created properly. The OSD would not start if the journal was not created. On Fri, Apr 17, 2015 at 2:43 PM, Andrija Panic andrija.pa...@gmail.com wrote: Hi all, when I run: ceph-deploy osd create SERVER:sdi:/dev/sdb5 (sdi = previously ZAP-ed 4TB drive) (sdb5 = previously manually created

Re: [ceph-users] replace dead SSD journal

2015-04-18 Thread Andrija Panic
...@me.com wrote: On 17/04/2015, at 21.07, Andrija Panic andrija.pa...@gmail.com wrote: nah... Samsung 850 PRO 128GB - dead after 3 months - 2 of these died... wearing level is 96%, so only 4% wasted... (yes I know these are not enterprise, etc…) Damn… but maybe your surname says it all

Re: [ceph-users] replace dead SSD journal

2015-04-18 Thread Andrija Panic
be a defect there as well. On 18 Apr 2015 09:42, Steffen W Sørensen ste...@me.com wrote: On 17/04/2015, at 21.07, Andrija Panic andrija.pa...@gmail.com wrote: nah... Samsung 850 PRO 128GB - dead after 3 months - 2 of these died... wearing level is 96%, so only 4% wasted... (yes I know

Re: [ceph-users] Repair inconsistent pgs..

2015-08-20 Thread Andrija Panic
Guys, I'm Igor's colleague, working a bit on CEPH together with Igor. This is a production cluster, and we are becoming more desperate as time goes by. I'm not sure if this is the appropriate place to seek commercial support, but anyhow, here it is... If anyone feels like it and has some experience
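
Not a substitute for proper support, but the standard first steps for inconsistent PGs are (a sketch; on this release, repair trusts the primary copy, so check which replica is actually bad first):

  ceph health detail | grep inconsistent     # lists the affected PG ids
  ceph pg deep-scrub <pgid>                  # re-scrub to confirm the inconsistency
  ceph pg repair <pgid>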

Re: [ceph-users] Broken snapshots... CEPH 0.94.2

2015-08-20 Thread Andrija Panic
This was related to the caching layer, which doesn't support snapshotting per the docs... for the sake of closing the thread. On 17 August 2015 at 21:15, Voloshanenko Igor igor.voloshane...@gmail.com wrote: Hi all, can you please help me with an unexplained situation... All snapshots inside ceph are broken...

Re: [ceph-users] which SSD / experiences with Samsung 843T vs. Intel s3700

2015-08-25 Thread Andrija Panic
Make sure you test whatever you decide. We just learned this the hard way with the Samsung 850 Pro, which is total crap, more than you could imagine... Andrija On Aug 25, 2015 11:25 AM, Jan Schermer j...@schermer.cz wrote: I would recommend the Samsung 845 DC PRO (not EVO, not just PRO). Very cheap,
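
The quick journal-suitability test that gets passed around (a sketch; point it at a scratch file or partition, since it overwrites the target):

  # a ceph journal cares about small O_DSYNC writes, which is exactly what kills consumer SSDs
  dd if=/dev/zero of=/path/on/ssd/testfile bs=4k count=100000 oflag=direct,dsync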

Re: [ceph-users] which SSD / experiences with Samsung 843T vs. Intel s3700

2015-08-25 Thread Andrija Panic
and performance was acceptable... now we are upgrading to Intel S3500... Best Any details on that? On Tue, 25 Aug 2015 11:42:47 +0200, Andrija Panic andrija.pa...@gmail.com wrote: Make sure you test whatever you decide. We just learned this the hard way with the Samsung 850 Pro, which is total crap, more

Re: [ceph-users] which SSD / experiences with Samsung 843T vs. Intel s3700

2015-08-25 Thread Andrija Panic
. Yes, it's cheaper than the S3700 (about 2x), and not as durable for writes, but we think it's better to replace 1 SSD per year than to pay double the price now. 2015-08-25 12:59 GMT+03:00 Andrija Panic andrija.pa...@gmail.com: And should I mention that in another CEPH installation we had

Re: [ceph-users] which SSD / experiences with Samsung 843T vs. Intel s3700

2015-08-25 Thread Andrija Panic
And should I mention that in another CEPH installation we had Samsung 850 Pro 128GB and all 6 SSDs died in a 2-month period - they simply disappeared from the system, so not wear-out... Never again will we buy Samsung :) On Aug 25, 2015 11:57 AM, Andrija Panic andrija.pa...@gmail.com wrote: First read

Re: [ceph-users] which SSD / experiences with Samsung 843T vs. Intel s3700

2015-09-04 Thread Andrija Panic
to be the case. > > QH > > On Fri, Sep 4, 2015 at 12:53 PM, Andrija Panic <andrija.pa...@gmail.com> > wrote: > >> Hi James, >> >> I had 3 CEPH nodes as folowing: 12 OSDs(HDD) and 2 SSDs (2x 6 Journals >> partitions on each SSD) - SSDs just vanished with n
