Hello all,
We are trying to run some tests on a cache-tier Ceph cluster, but
we are encountering serious problems, which eventually leave the cluster
unusable.
We are apparently doing something wrong, but we have no idea of
what it could be. We'd really appreciate it if someone could point out what
Hi Xavier
see comments inline
JC
On 16 Apr 2015, at 23:02, Xavier Serrano xserrano+c...@ac.upc.edu wrote:
Hello all,
We are trying to run some tests on a cache-tier Ceph cluster, but
we are encountering serious problems, which eventually leave the cluster
unusable.
We are apparently
On 16-04-15 19:31, Ferber, Dan wrote:
Thanks for working on this Patrick. I have looked for a mirror that I can
point all the ceph.com references to in
/usr/lib/python2.6/site-packages/ceph_deploy/hosts/centos/install.py. So I
can get ceph-deploy to work.
I tried eu.ceph.com but it
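For reference, ceph-deploy can also be pointed at a mirror from the command
line, which may avoid editing install.py entirely. A sketch, assuming a
hammer-era ceph-deploy and using eu.ceph.com paths only as examples:
ceph-deploy install --repo-url http://eu.ceph.com/rpm-hammer/el7 \
  --gpg-url http://eu.ceph.com/keys/release.asc NODE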
Hi guys,
I have 1 SSD that hosted 6 OSDs' journals; it died, so 6 OSDs went down,
ceph rebalanced etc.
Now I have a new SSD inside, and I will partition it etc - but would like to
know how to proceed with the journal recreation for those 6 OSDs that
are down now.
Should I flush the journal
Ferber, Dan wrote:
Thanks for working on this Patrick. I have looked for a mirror that I can
point all the ceph.com references to in
/usr/lib/python2.6/site-packages/ceph_deploy/hosts/centos/install.py. So I
can get ceph-deploy to work.
I tried eu.ceph.com but it does not work for this
Australian/Oceanic users can also rsync from here:
rsync://ceph.mirror.digitalpacific.com.au/ceph
As Wido mentioned before, you can also obtain packages from here too:
http://ceph.mirror.digitalpacific.com.au/
Mirror is located in Sydney, Australia and syncs directly from eu.ceph.com
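A minimal pull with rsync might look like this (the local destination
directory is just an example):
rsync -avP rsync://ceph.mirror.digitalpacific.com.au/ceph /srv/mirror/ceph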
Cheers,
Hi,
You should set a cache tier for CephFS to use and have the erasure coded pool
behind it. You will find detailed information at
http://docs.ceph.com/docs/master/rados/operations/cache-tiering/
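A minimal sketch of that setup, assuming hammer-era commands; pool names and
PG counts are examples, see the URL above for the real tuning:
ceph osd pool create ecpool 128 128 erasure default   # erasure coded base pool
ceph osd pool create cachepool 128                    # replicated cache pool
ceph osd tier add ecpool cachepool                    # attach the cache tier
ceph osd tier cache-mode cachepool writeback          # cache absorbs reads/writes
ceph osd tier set-overlay ecpool cachepool            # redirect client IO to the cache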
Cheers
On 17/04/2015 12:39, MEGATEL / Rafał Gawron wrote:
Hello
I would like to create cephfs with
Do you by any chance have your OSDs placed at a local directory path rather
than on an otherwise unused physical disk?
No, I have 18 Disks per Server. Each OSD is mapped to a physical disk.
Here in the output of one server:
ansible@zrh-srv-m-cph02:~$ df -h
Filesystem Size Used Avail
As you've seen, a set of systemd unit files has been committed to git,
but the packages do not yet use them.
There is an open ticket for this task,
http://tracker.ceph.com/issues/11344 . Feel free to add yourself as a
watcher on that if you are interested in the progress.
- Ken
On 04/17/2015
For example you can assign different read/write permissions and
different keyrings to different pools.
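For instance, a sketch of per-pool cephx caps (client and pool names are
made-up examples):
ceph auth get-or-create client.app1 mon 'allow r' osd 'allow rwx pool=pool-a'
ceph auth get-or-create client.app2 mon 'allow r' osd 'allow r pool=pool-b'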
2015-04-17 16:00 GMT+02:00 Chad William Seys cws...@physics.wisc.edu:
Hi All,
What are the advantages of having multiple ceph pools (if they use the
whole cluster)?
Thanks!
C.
Hi All,
What are the advantages of having multiple ceph pools (if they use the
whole cluster)?
Thanks!
C.
Hello
I would like to create cephfs with erasure coding.
I defined my default ec-profile:
ceph osd erasure-code-profile get default
directory=/usr/lib64/ceph/erasure-code
k=3
m=1
plugin=jerasure
ruleset-failure-domain=host
technique=reed_sol_van
How can I create cephfs with this profile?
I tried to create
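One possible direction, as others point out in this thread: CephFS cannot sit
directly on an erasure coded pool, so a replicated cache tier goes in front of
it. A sketch with hammer-era commands; names and PG counts are examples:
ceph osd erasure-code-profile set myprofile k=3 m=1 ruleset-failure-domain=host
ceph osd pool create ecdata 128 128 erasure myprofile
ceph osd pool create ecdata-cache 128
ceph osd tier add ecdata ecdata-cache
ceph osd tier cache-mode ecdata-cache writeback
ceph osd tier set-overlay ecdata ecdata-cache
ceph osd pool create cephfs-metadata 128      # metadata pool stays replicated
ceph fs new cephfs cephfs-metadata ecdata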
On 17/04/2015, at 07.33, Josef Johansson jose...@gmail.com wrote:
To your question, which I’m not sure I understand completely.
So yes, you don’t need the MDS if you just keep track of block storage and
object storage. (i.e. images for KVM)
So the Mon keeps track of the metadata for
Hi!
Do you by any chance have your OSDs placed at a local directory path
rather than on an otherwise unused physical disk?
If I remember correctly from a similar setup that I had performed in
the past, the ceph df command accounts for the entire disk and not just
for the OSD data directory. I am
I have 1 SSD that hosted 6 OSDs' journals; it died, so 6 OSDs went down, ceph
rebalanced etc.
Now I have a new SSD inside, and I will partition it etc - but would like to
know how to proceed with the journal recreation for those 6 OSDs that
are down now.
Well assuming the OSDs are
Hi,
I'm currently trying to deploy a new ceph test cluster on centos7 (hammer)
from ceph-deploy (on a debian wheezy).
It seems that the systemd unit files are not deployed.
The ceph git repo does have systemd unit files:
https://github.com/ceph/ceph/tree/hammer/systemd
I haven't looked inside the rpm
On 04/17/15 16:01, Saverio Proto wrote:
For example you can assign different read/write permissions and
different keyrings to different pools.
From memory you can set different replication settings, use a cache
pool or not, use specific crush map rules too.
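For example, per-pool replication and crush rule settings might look like
this (pool names and rule id are examples; crush_ruleset is the hammer-era
option name):
ceph osd pool set pool-a size 3            # 3 replicas for important data
ceph osd pool set pool-b size 2            # 2 replicas for scratch data
ceph osd pool set pool-b crush_ruleset 1   # pin pool-b to a different rule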
Lionel Bouton
Hi Greg,
Thanks for the reply. After looking more closely at /etc/ceph/rbdmap I
discovered it was corrupted. That was the only problem.
I think the dmesg line
'rbd: no image name provided'
is also a clue to this!
Hope that helps any other newbies! :)
Thanks again,
Chad.
I would be very keen for this to be implemented in Hammer and am willing to
help test it...
Paul Hewlett
Senior Systems Engineer
Velocix, Cambridge
Alcatel-Lucent
t: +44 1223 435893 m: +44 7985327353
From: ceph-users
I've seen something like this a few times.
Once, I lost the battery in my battery backed RAID card. That caused all
the OSDs on that host to be slow, which triggered slow request notices
pretty much cluster wide. It was only when I histogrammed the slow request
notices that I saw most of them
Each SSD hosted journals for 6 OSDs - 2 x SSD died, so 12 OSDs are
down, and rebalancing is about to finish... after which I need to fix the OSDs.
On 17 April 2015 at 19:01, Josef Johansson jo...@oderland.se wrote:
Hi,
Did 6 other OSDs go down when re-adding?
/Josef
On 17 Apr 2015, at
Hi all,
I have a setup where I can launch VMs from a standalone VMware ESXi host,
which acts as an iSCSI initiator, and a Ceph RBD block device that is
exported as an iSCSI target.
When launching VMs from the standalone ESXi host
integrated with Ceph, it is prompting me to choose
Delete and re-add all six OSDs.
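A sketch of the remove/re-add cycle for a single OSD (the id and device names
are examples; repeat for each affected OSD):
ceph osd out 12
ceph osd crush remove osd.12
ceph auth del osd.12
ceph osd rm 12
ceph-deploy osd create SERVER:sdi:/dev/sdb5   # data disk + new journal partition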
On Fri, Apr 17, 2015 at 3:36 AM, Andrija Panic andrija.pa...@gmail.com
wrote:
Hi guys,
I have 1 SSD that hosted 6 OSDs' journals; it died, so 6 OSDs went down,
ceph rebalanced etc.
Now I have a new SSD inside, and I will partition it etc - but would like to
I'm running a small cluster, but I'll chime in since nobody else has.
CERN had a presentation a while ago (dumpling time-frame) about their
deployment. They go over some of your questions:
http://www.slideshare.net/Inktank_Ceph/scaling-ceph-at-cern
My philosophy on Config Management is that it
Performance on ZFS on Linux (ZoL) seems to be fine, as long as you use the
generic Ceph filesystem implementation (writeahead) and not the specific Ceph
ZFS implementation; the CoW snapshotting that Ceph does with ZFS support
compiled in absolutely kills performance. I suspect the same would go with
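For reference, the journal mode can be pinned in ceph.conf; a sketch for a
filestore OSD, assuming the hammer-era option name:
[osd]
filestore journal writeahead = true   # force generic writeahead journaling on ZFS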
Hi,
Did 6 other OSDs go down when re-adding?
/Josef
On 17 Apr 2015, at 18:49, Andrija Panic andrija.pa...@gmail.com wrote:
12 OSDs are down - I was hoping for less work than removing and re-adding each OSD?
On Apr 17, 2015 6:35 PM, Krzysztof Nowicki krzysztof.a.nowi...@gmail.com
Hi Mark,
I finally got my hardware for my production full ssd cluster.
Here a first preliminary bench. (1osd).
I got around 45K iops with randread 4K with a small 10GB rbd volume
I'm pretty happy because I no longer see a huge cpu difference between krbd
and librbd.
In my previous bench I was
Any quick write performance data?
Michal Kozanecki | Linux Administrator | E: mkozane...@evertz.com
-Original Message-
From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of
Alexandre DERUMIER
Sent: April-17-15 11:38 AM
To: Mark Nelson; ceph-users
Subject:
Thx guys, that's what I will be doing in the end.
Cheers
On Apr 17, 2015 6:24 PM, Robert LeBlanc rob...@leblancnet.us wrote:
Delete and re-add all six OSDs.
On Fri, Apr 17, 2015 at 3:36 AM, Andrija Panic andrija.pa...@gmail.com
wrote:
Hi guys,
I have 1 SSD that hosted 6 OSDs' journals,
On 16/04/15 17:34, Chris Armstrong wrote:
Thanks for the update, Patrick. Our Docker builds were failing due to
the mirror being down. I appreciate being able to check the mailing list
and quickly see what's going on!
if you're accessing the ceph repo all the time, it's probably worth the
12 OSDs are down - I was hoping for less work than removing and re-adding each OSD?
On Apr 17, 2015 6:35 PM, Krzysztof Nowicki krzysztof.a.nowi...@gmail.com
wrote:
Why not just wipe out the OSD filesystem, run ceph-osd --mkfs with the
existing OSD UUID, copy the keyring and let it populate itself?
On Fri, 17 Apr
On 17.04.2015 at 17:37, Alexandre DERUMIER wrote:
Hi Mark,
I finally got my hardware for my production full ssd cluster.
Here a first preliminary bench. (1osd).
I got around 45K iops with randread 4K with a small 10GB rbd volume
I'm pretty happy because I no longer see a huge cpu
nah, Samsung 850 PRO 128GB - dead after 3 months - 2 of these died...
wear leveling value is 96%, so only 4% worn... (yes I know these are not
enterprise, etc...)
On 17 April 2015 at 21:01, Josef Johansson jose...@gmail.com wrote:
tough luck, hope everything comes up ok afterwards. What models on
For reference, I'm currently running 26 nodes (338 OSDs); will be 35
nodes (455 OSDs) in the near future.
Node/OSD provisioning and replacements:
Mostly I'm using ceph-deploy, at least to do node/osd adds and
replacements. Right now the process is:
Use FAI (http://fai-project.org) to setup
the massive rebalancing does not affect the ssds in a good way either. But
from what I've gathered the Pro should be fine. Massive amounts of write
errors in the logs?
/Josef
On 17 Apr 2015 21:07, Andrija Panic andrija.pa...@gmail.com wrote:
nah, Samsung 850 PRO 128GB - dead after 3 months - 2 of
tough luck, hope everything comes up ok afterwards. What models are the SSDs?
/Josef
On 17 Apr 2015 20:05, Andrija Panic andrija.pa...@gmail.com wrote:
Each SSD hosted journals for 6 OSDs - 2 x SSD died, so 12 OSDs are
down, and rebalancing is about to finish... after which I need to fix the
I have had two of them in my cluster (plus one 256GB version) for about half
a year now. So far so good. I'll be keeping a closer eye on them.
On Fri, 17 Apr 2015 at 21:07, Andrija Panic andrija.pa...@gmail.com
wrote:
nah, Samsung 850 PRO 128GB - dead after 3 months - 2 of these died...
damn, good news for me, possibly bad news for you :)
what is the wear leveling (smartctl -a /dev/sdX) - attribute near the end of
the attribute list...
thx
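For example (Samsung drives expose this as SMART attribute 177; the device
path is an example):
smartctl -a /dev/sda | grep -i Wear_Leveling_Count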
On 17 April 2015 at 21:12, Krzysztof Nowicki krzysztof.a.nowi...@gmail.com
wrote:
I have two of them in my cluster (plus one 256GB version) for
I also have a fairly small deployment of 14 nodes, 42 OSDs, but even I use
some automation. I do my OS installs and partitioning with PXE / kickstart,
then use chef for my baseline install of the normal server stuff in our
env and admin accounts. Then the ceph-specific stuff I handle by hand and
Now I also know I have too many PGs!
It is fairly confusing to talk about PGs on the Pool page, but only vaguely
talk about the number of PGs for the cluster.
Here are some examples of confusing statements with suggested alternatives
from the online docs:
If the journal file on the osd is a symlink to the partition and the OSD
process is running, then the journal was created properly. The OSD would
not start if the journal was not created.
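A quick way to verify (OSD id is an example, default filestore layout
assumed):
ls -l /var/lib/ceph/osd/ceph-12/journal   # should be a symlink to the journal partition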
On Fri, Apr 17, 2015 at 2:43 PM, Andrija Panic andrija.pa...@gmail.com
wrote:
Hi all,
when I run:
Hi all,
when I run:
ceph-deploy osd create SERVER:sdi:/dev/sdb5
(sdi = previously ZAP-ed 4TB drive)
(sdb5 = previously manually created empty partition with fdisk)
Is ceph-deploy going to create the journal properly on sdb5 (something
similar to: ceph-osd -i $ID --mkjournal), or do I need to do
Checked the SMART status. All of the Samsungs have Wear Leveling Count
equal to 99 (raw values 29, 36 and 15). I'm going to have to monitor them -
I could afford losing one of them, but losing two would mean loss of data.
On Fri, 17 Apr 2015 at 21:22, Josef Johansson jose...@gmail.com
ok, thx Robert - I expected that so this is fine then - just done it on 12
OSDs and all fine...
thx again
On 17 April 2015 at 23:38, Robert LeBlanc rob...@leblancnet.us wrote:
If the journal file on the osd is a symlink to the partition and the OSD
process is running, then the journal was
Hello all,
I am considering using Ceph for a new deployment and have a few
questions about the current implementation of erasure codes.
I understand that erasure codes have been enabled for pools, but that
erasure coded pools cannot be used as the basis of a Ceph FS. Is it
fair to infer that
Hi,
Although erasure coded pools cannot be used with CephFS, they can be used
behind a replicated cache pool as explained at
http://docs.ceph.com/docs/master/rados/operations/cache-tiering/.
Cheers
On 18/04/2015 00:26, Ben Randall wrote:
Hello all,
I am considering using Ceph for a new
On Friday, April 17, 2015, Michal Kozanecki mkozane...@evertz.com wrote:
Performance on ZFS on Linux (ZoL) seems to be fine, as long as you use the
generic Ceph filesystem implementation (writeahead) and not the specific
Ceph ZFS implementation; the CoW snapshotting that Ceph does with ZFS support
Why not just wipe out the OSD filesystem, run ceph-osd --mkfs with the
existing OSD UUID, copy the keyring and let it populate itself?
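A sketch of the journal-only recreation, assuming the OSD data is intact
(OSD id and partition are examples; after an unclean shutdown with a dead
journal, the full remove/re-add suggested earlier is the safer route):
service ceph stop osd.12                             # if it is still running
ln -sf /dev/sdb5 /var/lib/ceph/osd/ceph-12/journal   # point at the new partition
ceph-osd -i 12 --mkjournal                           # recreate the journal
service ceph start osd.12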
On Fri, 17 Apr 2015 at 18:31, Andrija Panic andrija.pa...@gmail.com
wrote:
Thx guys, that's what I will be doing in the end.
Cheers
On Apr 17, 2015
Thanks to all for your reply.
Regards,
Pragya Jain
Department of Computer Science, University of Delhi, Delhi, India
On Friday, 17 April 2015 4:36 PM, Steffen W Sørensen ste...@me.com wrote:
On 17/04/2015, at 07.33, Josef Johansson jose...@gmail.com wrote:
To your question, which
Any quick write performance data?
4k randwrite:
iops: 12K
host cpu: 85.5% idle
client cpu: 98.5% idle
disk util: 100% (this is the bottleneck).
These s3500 drives can do around 25K random 4K writes with O_DSYNC.
So, with the ceph double write (journal + data), 25K / 2 ≈ 12.5K, which
explains the 12K.
any idea whether this might be the tcmalloc bug?
I still don't know if the centos/redhat packages also have the bug or not.
gperftools.x86_64 2.1-1.el7
- Original Message -
From: Stefan Priebe s.pri...@profihost.ag
To: aderumier aderum...@odiso.com, Mark Nelson
On 18.04.2015 at 07:24, Alexandre DERUMIER aderum...@odiso.com wrote:
any idea whether this might be the tcmalloc bug?
I still don't know if the centos/redhat packages also have the bug or not.
gperftools.x86_64 2.1-1.el7
From the version number it looks buggy. I'm