[ceph-users] Cache-tier problem when cache becomes full

2015-04-17 Thread Xavier Serrano
Hello all, We are trying to run some tests on a cache-tier Ceph cluster, but we are encountering serious problems, which eventually leave the cluster unusable. We are apparently doing something wrong, but we have no idea what it could be. We'd really appreciate it if someone could point us to what

Re: [ceph-users] Cache-tier problem when cache becomes full

2015-04-17 Thread LOPEZ Jean-Charles
Hi Xavier, see comments inline. JC On 16 Apr 2015, at 23:02, Xavier Serrano xserrano+c...@ac.upc.edu wrote: Hello all, We are trying to run some tests on a cache-tier Ceph cluster, but we are encountering serious problems, which eventually leave the cluster unusable. We are apparently
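For context, the tiering agent only starts flushing and evicting once explicit limits are set on the cache pool, so a cache with no limits simply fills up. A minimal sketch of the relevant settings (the pool name cachepool and the values are placeholders, not taken from this thread):

    ceph osd pool set cachepool target_max_bytes 100000000000     # absolute size cap for the cache pool
    ceph osd pool set cachepool target_max_objects 1000000        # optional object-count cap
    ceph osd pool set cachepool cache_target_dirty_ratio 0.4      # start flushing dirty objects at 40% of the target
    ceph osd pool set cachepool cache_target_full_ratio 0.8       # start evicting clean objects at 80% of the target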

Re: [ceph-users] Ceph.com

2015-04-17 Thread Wido den Hollander
On 16-04-15 19:31, Ferber, Dan wrote: Thanks for working on this Patrick. I have looked for a mirror that I can point all the ceph.com references to in /usr/lib/python2.6/site-packages/ceph_deploy/hosts/centos/install.py. So I can get ceph-deploy to work. I tried eu.ceph.com but it

[ceph-users] replace dead SSD journal

2015-04-17 Thread Andrija Panic
Hi guys, I have 1 SSD that hosted the journals of 6 OSDs; it is dead, so 6 OSDs are down and ceph has rebalanced etc. Now I have a new SSD inside, and I will partition it etc - but I would like to know how to proceed now with the journal recreation for those 6 OSDs that are down. Should I flush journal

Re: [ceph-users] Ceph.com

2015-04-17 Thread Kurt Bauer
Ferber, Dan wrote: Thanks for working on this Patrick. I have looked for a mirror that I can point all the ceph.com references to in /usr/lib/python2.6/site-packages/ceph_deploy/hosts/centos/install.py. So I can get ceph-deploy to work. I tried eu.ceph.com but it does not work for this

Re: [ceph-users] Ceph repo - RSYNC?

2015-04-17 Thread Matt Taylor
Australian/Oceanic users can also rsync from here: rsync://ceph.mirror.digitalpacific.com.au/ceph As Wido mentioned before, you can also obtain packages from here too: http://ceph.mirror.digitalpacific.com.au/ Mirror is located in Sydney, Australia and syncs directly from eu.ceph.com Cheers,

Re: [ceph-users] CEPHFS with erasure code

2015-04-17 Thread Loic Dachary
Hi, You should set up a cache tier for CephFS to use and have the erasure coded pool behind it. You will find detailed information at http://docs.ceph.com/docs/master/rados/operations/cache-tiering/ Cheers On 17/04/2015 12:39, MEGATEL / Rafał Gawron wrote: Hello I would like to create cephfs with
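A minimal sketch of that layout, assuming the erasure-code profile is named default and using made-up pool names and PG counts:

    ceph osd pool create ecdata 128 128 erasure default     # erasure-coded data pool
    ceph osd pool create cachepool 128 128                  # replicated pool used as the cache tier
    ceph osd tier add ecdata cachepool
    ceph osd tier cache-mode cachepool writeback
    ceph osd tier set-overlay ecdata cachepool
    ceph osd pool create cephfs_metadata 128 128            # metadata stays on a replicated pool
    ceph fs new cephfs cephfs_metadata ecdata               # CephFS I/O then goes through the cache tier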

Re: [ceph-users] All pools have size=3 but MB data and MB used ratio is 1 to 5

2015-04-17 Thread Saverio Proto
Do you by any chance have your OSDs placed at a local directory path rather than on an otherwise unused physical disk? No, I have 18 disks per server. Each OSD is mapped to a physical disk. Here is the output of one server: ansible@zrh-srv-m-cph02:~$ df -h Filesystem Size Used Avail

Re: [ceph-users] ceph-deploy : systemd unit files not deployed to a centos7 nodes

2015-04-17 Thread Ken Dreyer
As you've seen, a set of systemd unit files has been committed to git, but the packages do not yet use them. There is an open ticket for this task, http://tracker.ceph.com/issues/11344 . Feel free to add yourself as a watcher on that if you are interested in the progress. - Ken On 04/17/2015

Re: [ceph-users] advantages of multiple pools?

2015-04-17 Thread Saverio Proto
For example you can assign different read/write permissions and different keyrings to different pools. 2015-04-17 16:00 GMT+02:00 Chad William Seys cws...@physics.wisc.edu: Hi All, What are the advantages of having multiple ceph pools (if they use the whole cluster)? Thanks! C.
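For example, something along these lines (client and pool names are made up):

    ceph auth get-or-create client.app1 mon 'allow r' osd 'allow rw pool=app1-pool'
    ceph auth get-or-create client.backup mon 'allow r' osd 'allow r pool=app1-pool, allow rw pool=backup-pool'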

[ceph-users] advantages of multiple pools?

2015-04-17 Thread Chad William Seys
Hi All, What are the advantages of having multiple ceph pools (if they use the whole cluster)? Thanks! C.

[ceph-users] CEPHFS with erasure code

2015-04-17 Thread MEGATEL / Rafał Gawron
Hello I would like to create cephfs with erasure code. I defined my default ec-profile:
ceph osd erasure-code-profile get default
directory=/usr/lib64/ceph/erasure-code
k=3
m=1
plugin=jerasure
ruleset-failure-domain=host
technique=reed_sol_van
How can I create cephfs with this profile? I tried to create

Re: [ceph-users] metadata management in case of ceph object storage and ceph block storage

2015-04-17 Thread Steffen W Sørensen
On 17/04/2015, at 07.33, Josef Johansson jose...@gmail.com wrote: To your question, which I’m not sure I understand completely. So yes, you don’t need the MDS if you just keep track of block storage and object storage. (i.e. images for KVM) So the Mon keeps track of the metadata for

Re: [ceph-users] All pools have size=3 but MB data and MB used ratio is 1 to 5

2015-04-17 Thread Georgios Dimitrakakis
Hi! Do you by any chance have your OSDs placed at a local directory path rather than on an otherwise unused physical disk? If I remember correctly from a similar setup that I performed in the past, the ceph df command accounts for the entire disk and not just for the OSD data directory. I am

Re: [ceph-users] replace dead SSD journal

2015-04-17 Thread Steffen W Sørensen
I have 1 SSD that hosted 6 OSD's Journals, that is dead, so 6 OSD down, ceph rebalanced etc. Now I have new SSD inside, and I will partition it etc - but would like to know, how to proceed now, with the journal recreation for those 6 OSDs that are down now. Well assuming the OSDs are
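Assuming the OSD data filesystems are intact and new journal partitions already exist, the per-OSD sequence usually looks roughly like this (the OSD id and partition UUID are placeholders; an untested sketch of the idea):

    service ceph stop osd.6                                                      # if the daemon is still trying to run
    ln -sf /dev/disk/by-partuuid/JOURNAL-PARTUUID /var/lib/ceph/osd/ceph-6/journal
    ceph-osd -i 6 --mkjournal                                                    # create a fresh journal on the new partition
    service ceph start osd.6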

[ceph-users] ceph-deploy : systemd unit files not deployed to a centos7 nodes

2015-04-17 Thread Alexandre DERUMIER
Hi, I'm currently trying to deploy a new ceph test cluster on centos7 (hammer) from ceph-deploy (on a debian wheezy). And it seems that the systemd unit files are not deployed. The ceph git repo does have systemd unit files: https://github.com/ceph/ceph/tree/hammer/systemd I haven't looked inside the rpm

Re: [ceph-users] advantages of multiple pools?

2015-04-17 Thread Lionel Bouton
On 04/17/15 16:01, Saverio Proto wrote: For example you can assign different read/write permissions and different keyrings to different pools. From memory you can set different replication settings, use a cache pool or not, use specific crush map rules too. Lionel Bouton

Re: [ceph-users] ceph on Debian Jessie stopped working

2015-04-17 Thread Chad William Seys
Hi Greg, Thanks for the reply. After looking more closely at /etc/ceph/rbdmap I discovered it was corrupted. That was the only problem. I think the dmesg line 'rbd: no image name provided' is also a clue to this! Hope that helps any other newbies! :) Thanks again, Chad.
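For anyone hitting the same thing: entries in /etc/ceph/rbdmap are expected to look roughly like this (pool, image and keyring names are examples):

    # poolname/imagename  id=client,keyring=/path/to/keyring
    rbd/myimage  id=admin,keyring=/etc/ceph/ceph.client.admin.keyring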

Re: [ceph-users] ceph-deploy : systemd unit files not deployed to a centos7 nodes

2015-04-17 Thread HEWLETT, Paul (Paul)** CTR **
I would be very keen for this to be implemented in Hammer and am willing to help test it... Paul Hewlett Senior Systems Engineer Velocix, Cambridge Alcatel-Lucent t: +44 1223 435893 m: +44 7985327353

Re: [ceph-users] many slow requests on different osds (scrubbing disabled)

2015-04-17 Thread Craig Lewis
I've seen something like this a few times. Once, I lost the battery in my battery backed RAID card. That caused all the OSDs on that host to be slow, which triggered slow request notices pretty much cluster wide. It was only when I histogrammed the slow request notices that I saw most of them

Re: [ceph-users] replace dead SSD journal

2015-04-17 Thread Andrija Panic
The SSD that hosted journals for 6 OSDs died - 2 x SSDs died in total, so 12 OSDs are down, and rebalancing is about to finish... after which I need to fix the OSDs. On 17 April 2015 at 19:01, Josef Johansson jo...@oderland.se wrote: Hi, Did 6 other OSDs go down when re-adding? /Josef On 17 Apr 2015, at

[ceph-users] Query regarding integrating Ceph with Vcenter/Clustered Esxi hosts.

2015-04-17 Thread Vivek Varghese Cherian
Hi all, I have a setup where I can launch VMs from a standalone VMware ESXi host, which acts as an iSCSI initiator, and a ceph rbd block device that is exported as an iSCSI target. When launching VMs from the standalone ESXi host integrated with ceph, it prompts me to choose

Re: [ceph-users] replace dead SSD journal

2015-04-17 Thread Robert LeBlanc
Delete and re-add all six OSDs. On Fri, Apr 17, 2015 at 3:36 AM, Andrija Panic andrija.pa...@gmail.com wrote: Hi guys, I have 1 SSD that hosted 6 OSD's Journals, that is dead, so 6 OSD down, ceph rebalanced etc. Now I have new SSD inside, and I will partition it etc - but would like to
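The usual delete/re-add sequence per OSD looks roughly like this (osd.6, HOST and the devices are placeholders; re-creation via ceph-deploy as used elsewhere in the thread):

    ceph osd out 6
    service ceph stop osd.6                     # on the OSD host
    ceph osd crush remove osd.6
    ceph auth del osd.6
    ceph osd rm 6
    ceph-deploy osd create HOST:sdX:/dev/sdY1   # recreate against the new journal partition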

Re: [ceph-users] Managing larger ceph clusters

2015-04-17 Thread Craig Lewis
I'm running a small cluster, but I'll chime in since nobody else has. Cern had a presentation a while ago (dumpling time-frame) about their deployment. They go over some of your questions: http://www.slideshare.net/Inktank_Ceph/scaling-ceph-at-cern My philosophy on Config Management is that it

Re: [ceph-users] Ceph on Solaris / Illumos

2015-04-17 Thread Michal Kozanecki
Performance on ZFS on Linux (ZoL) seems to be fine, as long as you use the generic Ceph filesystem implementation (writeahead) and not the ZFS-specific Ceph implementation; the CoW snapshotting that Ceph does with ZFS support compiled in absolutely kills performance. I suspect the same would go with
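If I read the writeahead remark correctly, that corresponds to pinning the generic filestore journal mode in ceph.conf instead of relying on a ZFS-aware build (a sketch, not verified on a ZoL deployment):

    [osd]
        filestore journal writeahead = true    # force the generic writeahead journal mode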

Re: [ceph-users] replace dead SSD journal

2015-04-17 Thread Josef Johansson
Hi, Did 6 other OSDs go down when re-adding? /Josef On 17 Apr 2015, at 18:49, Andrija Panic andrija.pa...@gmail.com wrote: 12 osds down - I expect less work with removing and adding osd? On Apr 17, 2015 6:35 PM, Krzysztof Nowicki krzysztof.a.nowi...@gmail.com

[ceph-users] full ssd setup preliminary hammer bench

2015-04-17 Thread Alexandre DERUMIER
Hi Mark, I finally got my hardware for my production full ssd cluster. Here is a first preliminary bench (1 osd). I got around 45K iops with 4K randread on a small 10GB rbd volume. I'm pretty happy because I no longer see a huge cpu difference between krbd and librbd. In my previous bench I was
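For reference, a 4K randread run like that is typically driven with something along these lines using the fio rbd engine (pool/image names and job parameters are assumptions, not the exact command used in this benchmark):

    fio --name=randread-4k --ioengine=rbd --clientname=admin --pool=rbd --rbdname=test10g \
        --rw=randread --bs=4k --iodepth=32 --numjobs=4 --direct=1 \
        --time_based --runtime=60 --group_reporting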

Re: [ceph-users] full ssd setup preliminary hammer bench

2015-04-17 Thread Michal Kozanecki
Any quick write performance data? Michal Kozanecki | Linux Administrator | E: mkozane...@evertz.com

Re: [ceph-users] replace dead SSD journal

2015-04-17 Thread Andrija Panic
Thx guys, that's what I will be doing in the end. Cheers On Apr 17, 2015 6:24 PM, Robert LeBlanc rob...@leblancnet.us wrote: Delete and re-add all six OSDs. On Fri, Apr 17, 2015 at 3:36 AM, Andrija Panic andrija.pa...@gmail.com wrote: Hi guys, I have 1 SSD that hosted 6 OSD's Journals,

Re: [ceph-users] Ceph.com

2015-04-17 Thread Paul Mansfield
On 16/04/15 17:34, Chris Armstrong wrote: Thanks for the update, Patrick. Our Docker builds were failing due to the mirror being down. I appreciate being able to check the mailing list and quickly see what's going on! if you're accessing the ceph repo all the time, it's probably worth the

Re: [ceph-users] replace dead SSD journal

2015-04-17 Thread Andrija Panic
12 OSDs down - I expect less work with removing and adding OSDs? On Apr 17, 2015 6:35 PM, Krzysztof Nowicki krzysztof.a.nowi...@gmail.com wrote: Why not just wipe out the OSD filesystem, run ceph-osd --mkfs with the existing OSD UUID, copy the keyring and let it populate itself? Fri, 17 Apr

Re: [ceph-users] full ssd setup preliminary hammer bench

2015-04-17 Thread Stefan Priebe
On 17.04.2015 at 17:37, Alexandre DERUMIER wrote: Hi Mark, I finally got my hardware for my production full ssd cluster. Here is a first preliminary bench (1 osd). I got around 45K iops with 4K randread on a small 10GB rbd volume. I'm pretty happy because I no longer see a huge cpu

Re: [ceph-users] replace dead SSD journal

2015-04-17 Thread Andrija Panic
nah, Samsung 850 PRO 128GB - dead after 3 months - 2 of these died... wear leveling is at 96%, so only 4% worn... (yes I know these are not enterprise, etc...) On 17 April 2015 at 21:01, Josef Johansson jose...@gmail.com wrote: tough luck, hope everything comes up ok afterwards. What models on

Re: [ceph-users] Managing larger ceph clusters

2015-04-17 Thread Steve Anthony
For reference, I'm currently running 26 nodes (338 OSDs); will be 35 nodes (455 OSDs) in the near future. Node/OSD provisioning and replacements: Mostly I'm using ceph-deploy, at least to do node/osd adds and replacements. Right now the process is: Use FAI (http://fai-project.org) to setup

Re: [ceph-users] replace dead SSD journal

2015-04-17 Thread Josef Johansson
the massive rebalancing does not affect the ssds in a good way either. But from what I've gathered the Pro should be fine. Massive amount of write errors in the logs? /Josef On 17 Apr 2015 21:07, Andrija Panic andrija.pa...@gmail.com wrote: nah, Samsung 850 PRO 128GB - dead after 3 months - 2 of

Re: [ceph-users] replace dead SSD journal

2015-04-17 Thread Josef Johansson
tough luck, hope everything comes up ok afterwards. What models are the SSDs? /Josef On 17 Apr 2015 20:05, Andrija Panic andrija.pa...@gmail.com wrote: The SSD that hosted journals for 6 OSDs died - 2 x SSDs died in total, so 12 OSDs are down, and rebalancing is about to finish... after which I need to fix the

Re: [ceph-users] replace dead SSD journal

2015-04-17 Thread Krzysztof Nowicki
I have had two of them in my cluster (plus one 256GB version) for about half a year now. So far so good. I'll be keeping a closer eye on them. Fri, 17 Apr 2015, 21:07 Andrija Panic andrija.pa...@gmail.com wrote: nah, Samsung 850 PRO 128GB - dead after 3 months - 2 of these died...

Re: [ceph-users] replace dead SSD journal

2015-04-17 Thread Andrija Panic
damn, good news for me, possibly bad news for you :) what is the wear leveling (smartctl -a /dev/sdX) - attribute near the end of the attribute list... thx On 17 April 2015 at 21:12, Krzysztof Nowicki krzysztof.a.nowi...@gmail.com wrote: I have had two of them in my cluster (plus one 256GB version) for
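For the record, the check being asked for is along these lines (the device name is a placeholder):

    smartctl -a /dev/sdX | grep -i Wear_Leveling
    # the normalized VALUE column starts at 100 and counts down as the NAND wears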

Re: [ceph-users] Managing larger ceph clusters

2015-04-17 Thread Quentin Hartman
I also have a fairly small deployment of 14 nodes, 42 OSDs, but even I use some automation. I do my OS installs and partitioning with PXE / kickstart, then use chef for my baseline install of the normal server stuff in our env and admin accounts. Then the ceph-specific stuff I handle by hand and

Re: [ceph-users] Upgrade from Giant 0.87-1 to Hammer 0.94-1

2015-04-17 Thread Chad William Seys
Now I also know I have too many PGs! It is fairly confusing to talk about PGs on the Pool page, but only vaguely talk about the number of PGs for the cluster. Here are some examples of confusing statements with suggested alternatives from the online docs:
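For what it's worth, the rule of thumb in the docs is cluster-wide: total PGs ≈ (OSD count × 100) / replica count, rounded to a power of two, and that total is shared by all pools rather than applied per pool. A worked example with assumed numbers:

    # 20 OSDs, size=3 pools:
    #   (20 * 100) / 3 ≈ 667  ->  round to 512 or 1024 PGs in total
    # that total is split across the pools, e.g. 4 pools x 128-256 PGs each, not 1024 PGs per pool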

Re: [ceph-users] ceph-deploy journal on separate partition - quick info needed

2015-04-17 Thread Robert LeBlanc
If the journal file on the osd is a symlink to the partition and the OSD process is running, then the journal was created properly. The OSD would not start if the journal was not created. On Fri, Apr 17, 2015 at 2:43 PM, Andrija Panic andrija.pa...@gmail.com wrote: Hi all, when I run:
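A quick way to double-check on the OSD host (the osd id is a placeholder):

    ls -l /var/lib/ceph/osd/ceph-12/journal        # should be a symlink
    readlink -f /var/lib/ceph/osd/ceph-12/journal  # should resolve to the journal partition, e.g. /dev/sdb5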

[ceph-users] ceph-deploy journal on separate partition - quick info needed

2015-04-17 Thread Andrija Panic
Hi all, when I run: ceph-deploy osd create SERVER:sdi:/dev/sdb5 (sdi = previously ZAP-ed 4TB drive) (sdb5 = previously manually created empty partition with fdisk) Is ceph-deploy going to create the journal properly on sdb5 (something similar to: ceph-osd -i $ID --mkjournal ), or do I need to do

Re: [ceph-users] replace dead SSD journal

2015-04-17 Thread Krzysztof Nowicki
Checked the SMART status. All of the Samsungs have a Wear Leveling Count equal to 99 (raw values 29, 36 and 15). I'm going to have to monitor them - I could afford losing one of them, but losing two would mean loss of data. Fri, 17 Apr 2015 at 21:22, Josef Johansson jose...@gmail.com

Re: [ceph-users] ceph-deploy journal on separate partition - quick info needed

2015-04-17 Thread Andrija Panic
ok, thx Robert - I expected that, so this is fine then - just did it on 12 OSDs and all is fine... thx again On 17 April 2015 at 23:38, Robert LeBlanc rob...@leblancnet.us wrote: If the journal file on the osd is a symlink to the partition and the OSD process is running, then the journal was

[ceph-users] CephFS and Erasure Codes

2015-04-17 Thread Ben Randall
Hello all, I am considering using Ceph for a new deployment and have a few questions about the current implementation of erasure codes. I understand that erasure codes have been enabled for pools, but that erasure coded pools cannot be used as the basis of a Ceph FS. Is it fair to infer that

Re: [ceph-users] CephFS and Erasure Codes

2015-04-17 Thread Loic Dachary
Hi, Although erasure coded pools cannot be used with CephFS, they can be used behind a replicated cache pool as explained at http://docs.ceph.com/docs/master/rados/operations/cache-tiering/. Cheers On 18/04/2015 00:26, Ben Randall wrote: Hello all, I am considering using Ceph for a new

Re: [ceph-users] Ceph on Solaris / Illumos

2015-04-17 Thread Jake Young
On Friday, April 17, 2015, Michal Kozanecki mkozane...@evertz.com wrote: Performance on ZFS on Linux (ZoL) seems to be fine, as long as you use the CEPH generic filesystem implementation (writeahead) and not the specific CEPH ZFS implementation, CoW snapshoting that CEPH does with ZFS support

Re: [ceph-users] replace dead SSD journal

2015-04-17 Thread Krzysztof Nowicki
Why not just wipe out the OSD filesystem, run ceph-osd --mkfs with the existing OSD UUID, copy the keyring and let it populate itself? Fri, 17 Apr 2015 at 18:31, Andrija Panic andrija.pa...@gmail.com wrote: Thx guys, that's what I will be doing in the end. Cheers On Apr 17, 2015
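Spelled out, that in-place rebuild would look roughly like this (osd id, device and UUID are placeholders; an untested sketch of the idea, not a verified procedure):

    mkfs.xfs -f /dev/sdi1                                     # wipe the OSD data filesystem
    mount /dev/sdi1 /var/lib/ceph/osd/ceph-6
    ceph-osd -i 6 --mkfs --osd-uuid ORIGINAL-OSD-UUID         # keep the original OSD UUID
    cp /path/to/saved/osd.6.keyring /var/lib/ceph/osd/ceph-6/keyring   # reuse the existing key
    service ceph start osd.6                                  # the OSD then backfills from its peers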

Re: [ceph-users] metadata management in case of ceph object storage and ceph block storage

2015-04-17 Thread pragya jain
Thanks to all for your reply - Regards, Pragya Jain, Department of Computer Science, University of Delhi, Delhi, India On Friday, 17 April 2015 4:36 PM, Steffen W Sørensen ste...@me.com wrote: On 17/04/2015, at 07.33, Josef Johansson jose...@gmail.com wrote: To your question, which

Re: [ceph-users] full ssd setup preliminary hammer bench

2015-04-17 Thread Alexandre DERUMIER
Any quick write performance data? 4k randwrite iops: 12K, host cpu: 85.5% idle, client cpu: 98.5% idle, disk util: 100% (this is the bottleneck). These S3500 drives can do around 25K random 4K writes with O_DSYNC. So, with ceph's double write (journal + data), that explains the 12K -
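In other words, roughly 25K raw O_DSYNC write iops / 2 (each client write hits both the journal and the data partition on the same SSD) ≈ 12.5K, which lines up with the ~12K observed.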

Re: [ceph-users] full ssd setup preliminary hammer bench

2015-04-17 Thread Alexandre DERUMIER
any idea whether this might be the tcmalloc bug? I still don't know whether the centos/redhat packages also have the bug or not. gperftools.x86_64 2.1-1.el7 - Original Mail - From: Stefan Priebe s.pri...@profihost.ag To: aderumier aderum...@odiso.com, Mark Nelson
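If it does turn out to be the tcmalloc thread-cache issue, a first check and the commonly mentioned workaround look roughly like this (the el7 package name and the value are assumptions, and the environment variable only helps if the installed tcmalloc build actually honors it):

    rpm -q gperftools-libs                                    # which tcmalloc build the OSDs link against
    export TCMALLOC_MAX_TOTAL_THREAD_CACHE_BYTES=134217728    # 128 MB thread cache, set before starting ceph-osd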

Re: [ceph-users] full ssd setup preliminary hammer bench

2015-04-17 Thread Stefan Priebe - Profihost AG
On 18.04.2015 at 07:24, Alexandre DERUMIER aderum...@odiso.com wrote: any idea whether this might be the tcmalloc bug? I still don't know whether the centos/redhat packages also have the bug or not. gperftools.x86_64 2.1-1.el7 From the version number it looks buggy. I'm