come from the same batch)
Cheers,
Robert van Leeuwen
complex, having to rule out more potential causes.
Not saying it cannot work perfectly fine.
I'd rather just not take any chances with the storage system...
Cheers,
Robert van Leeuwen
of a PITA.
There are also some performance considerations with those filesystems, so you
should really do some proper testing before any large-scale deployment.
Cheers,
Robert van Leeuwen
So... the idea was that Ceph would provide the required clustered-filesystem
element, and it was the only FS that provided the required resize-on-the-fly
and snapshotting features that were needed.
I can't see it working with one shared LUN. In theory I can't see why it
couldn't work, but
This is similar to iSCSI, except that the data is distributed across x Ceph
nodes.
Just as with iSCSI, you should NOT mount this in two locations unless you run
a clustered filesystem (e.g. GFS / OCFS).
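For what it's worth, a minimal sketch of what the shared-RBD setup would look like (pool/image names are made up, and OCFS2 needs its own cluster stack configured first, so treat this as illustrative only):

  # map the same image on both hosts
  rbd map mypool/shared-image
  # create the clustered filesystem ONCE, from one host
  mkfs.ocfs2 /dev/rbd/mypool/shared-image
  # then mount it on both hosts
  mount /dev/rbd/mypool/shared-image /mnt/shared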
Cheers,
Robert
with a replica count of 2 it will show up as using
twice the amount of space: each 1 GB written consumes 2 GB of raw capacity.
So 1 GB of usage on a 1396 GB cluster will result in:
1394 GB / 1396 GB avail
Cheers,
Robert van Leeuwen
We are at the end of the process of designing and purchasing storage to
provide Ceph based backend for VM images, VM boot (ephemeral) disks,
persistent volumes (and possibly object storage) for our future OpenStack
cloud.
We considered many options and we chose to prefer commodity storage
this is a very good point that I totally overlooked. I concentrated more on
the IOPS alignment plus write durability,
and forgot to check the sequential write bandwidth.
Again, this totally depends on the expected load.
Running lots of VMs usually tends to end up as random IOPS on your
All of which means that MySQL performance (looking at you, binlog) may
still suffer due to lots of small-block-size sync writes.
Which raises the question:
Is anyone running a reasonably busy MySQL server on Ceph-backed storage?
We tried and it did not perform well enough.
We have a small Ceph ...
When I manually add the OSD the following process just hangs:
ceph-osd -i 10 --osd-journal=/mnt/ceph/journal_vg_sda/journal --mkfs --mkkey
Running ceph-0.72.1 on CentOS.
Any tips?
Thx,
Robert van Leeuwen
Try adding --debug-osd=20 and --debug-filestore=20.
The logs might tell you more about why it isn't going through.
Nothing of interest there :(
What I do notice is that when I run ceph-deploy it references a keyring
that does not exist:
--keyring /var/lib/ceph/tmp/mnt.J7nSi0/keyring
If I look on
Hi,
I cannot add a new OSD to a current Ceph cluster.
It just hangs, here is the debug log:
ceph-osd -d --debug-ms=20 --debug-osd=20 --debug-filestore=31 -i 10
--osd-journal=/mnt/ceph/journal_vg_sda/journal0 --mkfs --mkjournal --mkkey
2014-07-09 10:50:28.934959
This is ceph 0.72.1 on CentOS.
Found the issue:
Although I installed the specific Ceph version (0.72.1), the latest leveldb
was installed.
Apparently this breaks things...
Cheers,
Robert van Leeuwen
Which leveldb from where? 1.12.0-5, which tends to be in the el6/el7 repos, is
broken for Ceph.
You need to remove the “basho fix” patch.
1.7.0 is the only readily available version that works, though it is so old
that I suspect it is responsible for various
issues we see.
Apparently at some
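For anyone hitting this on CentOS, a rough way to check and pin the package (version strings are from memory, verify against what your repos actually carry):

  rpm -q leveldb
  yum downgrade leveldb-1.7.0
  # and add exclude=leveldb to the repo config so yum
  # does not pull the broken build back in on the next update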
hierarchy indeed.
There is a nice explanation here:
http://ceph.com/docs/master/rados/operations/crush-map/
Note that your clients will write to both the local and the remote DC, so it
will impact write latency!
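As a very rough sketch (bucket and rule names are made up; check the syntax against the docs above), a rule that puts one replica in each datacenter could look like this in the decompiled crushmap:

  rule replicated_two_dc {
      ruleset 1
      type replicated
      min_size 2
      max_size 2
      step take root
      step choose firstn 0 type datacenter
      step chooseleaf firstn 1 type host
      step emit
  }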
Cheers,
Robert van Leeuwen
only (pools).
(I'd be happy to hear numbers from anyone running high random-write loads on it :)
And another nice hardware-scaling PDF from DreamHost...
https://objects.dreamhost.com/inktankweb/Inktank_Hardware_Configuration_Guide.pdf
Cheers,
Robert van Leeuwen
it in production...
Cheers,
Robert van Leeuwen
if they are all busy ;)
I honestly do not know how Ceph behaves when it is CPU-starved, but I guess it
might not be pretty.
Since your whole environment will come crumbling down if your storage becomes
unavailable, it is not a risk I would take lightly.
Cheers,
Robert van Leeuwen
This is what I set in the ceph client:
[client]
rbd cache = true
rbd cache writethrough until flush = true
Anyone else noticed this behaviour before or have some troubleshooting tips?
Thx,
Robert van Leeuwen
The behavior you both are seeing is fixed by making flush requests
asynchronous in the qemu driver. This was fixed upstream in qemu 1.4.2
and 1.5.0. If you've installed from ceph-extras, make sure you're using
the .async rpms [1] (we should probably remove the non-async ones at
this point).
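(A quick, if crude, way to check which build you ended up with; package naming may differ per distro:

  rpm -qa | grep qemu

and look for the .async suffix in the package names.)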
the EPEL and Ceph repos can be added there?
If not, make sure you have a lot of time and patience to copy stuff around.
Cheers,
Robert van Leeuwen
I tried putting Flashcache on my spindle OSDs using an Intel SSD and it works
great.
This is getting me read and write SSD caching instead of just write
performance on the journal.
It should also allow me to protect the OSD journal on the same drive as the
OSD data and still get benefits
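For reference, creating a Flashcache device looks roughly like this (device names are placeholders, and writeback mode has real data-loss implications if the SSD dies, so test carefully):

  flashcache_create -p back osd0_cache /dev/sdX1 /dev/sdY
  mkfs.xfs /dev/mapper/osd0_cache
  mount /dev/mapper/osd0_cache /var/lib/ceph/osd/ceph-0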
any need to do this in
Ceph but I am curious what Ceph does)
I guess it is pretty tricky to handle since the load can be either raw
bandwidth or number of IOPS.
Cheers,
Robert van Leeuwen
for this instead of the /dev/disk/by-id or /dev/disk/by-path given by
ceph-deploy.
So I am wondering how other people are setting up machines and how things work :)
Thx,
Robert van Leeuwen
Hi,
I'm playing with our new Ceph cluster and it seems that Ceph does not
gracefully handle a maxed-out cluster network.
I had nodes flapping once every few minutes when pushing a lot of traffic to
them, so I decided to set the noup and nodown flags as described in the docs.
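(For the archives, that is:

  ceph osd set noup
  ceph osd set nodown

and ceph osd unset noup / nodown to clear them again.)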
like to do a partition/format and some ceph commands to get
stuff working again...
Thx,
Robert van Leeuwen
to recreate the pool and I was not using the recommended settings, I did not
really dive into the issue.
I will not stray too far from the recommended settings in the future though :)
Cheers,
Robert van Leeuwen
is that I/O will grind to a halt during PG creation (1000 PGs took
a few minutes).
Also expect reduced performance during the rebalance of the data.
The OSDs will be quite busy during that time.
I would certainly pick a time with low traffic to do this.
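For reference, the change itself is just (pool name and count are examples; pgp_num has to be raised as well before the data actually rebalances):

  ceph osd pool set mypool pg_num 2048
  ceph osd pool set mypool pgp_num 2048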
Cheers,
Robert van Leeuwen
for 100 percent.
Also, the usage dropped to 0% pretty much immediately after the benchmark, so
it looks like it's not lagging behind the journal.
Did not really test reads yet; since we have so much read cache (128 GB per
node), I assume we will mostly be write-limited.
Cheers,
Robert van Leeuwen
sequential
writes.
Cheers,
Robert van Leeuwen
On 3 Dec 2013, at 17:02, Mike Dawson mike.daw...@cloudapt.com wrote:
Robert,
Do you have rbd writeback cache enabled on these volumes? That could
certainly explain the higher-than-expected write performance. Any chance you
this mailing list is a community effort.
If you want immediate and official 24x7 support, buy support @ www.inktank.com
Cheers,
Robert van Leeuwen
for flashcache instead of journals)
Cheers,
Robert van Leeuwen
making it fully
random.
I would expect a performance of 100 to 200 IOPS max.
Doing an iostat -x or atop should show this bottleneck immediately.
This is also the reason to go with SSDs: they have reasonable random IO
performance.
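For example, something like:

  iostat -x 1

and watch the await and %util columns for the OSD disks; spindles pegged near 100% util with high await are exactly this bottleneck.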
Cheers,
Robert van Leeuwen
On 6 Dec 2013, at 17
for objects in the pool in order to
acknowledge a write operation to the client. If the minimum is not met, Ceph
will not acknowledge the write to the client. This setting ensures a minimum
number of replicas when operating in degraded mode.
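For reference, that is the per-pool min_size setting (pool name is an example):

  ceph osd pool get mypool min_size
  ceph osd pool set mypool min_size 2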
Cheers,
Robert van Leeuwen
use RAID 10 and do 2 instead of 3
replicas.
Cheers,
Robert van Leeuwen
From: ceph-users-boun...@lists.ceph.com [ceph-users-boun...@lists.ceph.com] on
behalf of nicolasc [nicolas.cance...@surfsara.nl]
Sent: Thursday, December 12, 2013 5:23 PM
To: Craig Lewis
trivial to change the code to support this and would be a very small change
compared to erasure coding.
(I looked a bit at the CRUSH map bucket types, but it *seems* that all bucket
types will still stripe the PGs across all nodes within a failure domain.)
Cheers,
Robert van Leeuwen
Result jobs=1:
iops=297
Result jobs=16:
iops=1200
I'm running the fio bench from a KVM virtual machine.
It seems that a single write thread is not able to go above 300 IOPS (latency?).
Ceph can handle more IOPS if you start more parallel write threads.
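For anyone wanting to reproduce the comparison, a fio invocation of roughly this shape should do (all parameters are illustrative; the original job file is not shown in the thread):

  fio --name=bench --ioengine=libaio --direct=1 --rw=randwrite \
      --bs=4k --size=1G --iodepth=1 --numjobs=1 --group_reporting

and the same again with --numjobs=16.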
Cheers,
Robert van Leeuwen
I think all brands have their own quirks; the question is which ones you are
most comfortable living with.
(e.g. we have no support contracts with Supermicro and just keep parts in stock)
Cheers,
Robert van Leeuwen
to now it seems that only Intel has done its homework.
In general they *seem* to be the most reliable SSD provider.
Cheers,
Robert van Leeuwen
Hi,
We experience something similar with our OpenStack Swift setup.
You can change the sysctl vm.vfs_cache_pressure to make sure more inodes are
kept in cache.
(Do not set this to 0, because you will trigger the OOM killer at some point ;)
We also decided to go for nodes with more memory
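For reference, the sysctl mentioned above is set like this (the value 50 is only an illustration; tune and test for your workload):

  sysctl -w vm.vfs_cache_pressure=50
  # persist across reboots:
  echo "vm.vfs_cache_pressure = 50" >> /etc/sysctl.conf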
I'm hoping to get some feedback on the Dell H310 (LSI SAS2008 chipset).
Based on searching I'd done previously, I got the impression that people
generally recommended avoiding it in favour of the higher-specced H710
(LSI SAS2208 chipset).
Purely based on the controller chip it should be OK.
We
failure domains by default.
My guess is that the default CRUSH map treats a node as a single failure
domain.
So, edit the crushmap to allow replicas within one node, or add a second node.
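For a single-node test setup, one way is to tell CRUSH to pick leaves at the OSD level instead of the host level, e.g. in ceph.conf before the cluster is created:

  [global]
  osd crush chooseleaf type = 0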
Cheers,
Robert van Leeuwen
to be sure.
In general I would go for 3 replicas and no RAID.
Cheers,
Robert van Leeuwen
In my experience it is good enough for some low-write instances, but not for
write-intensive applications like MySQL.
Cheers,
Robert van Leeuwen
consistent.
I think Ceph is working on something similar for the RADOS Gateway.
Cheers,
Robert van Leeuwen
will be.
Although bandwidth during a rebalance of data might also be problematic...
Cheers,
Robert van Leeuwen