Hi Pieter,

At the time, our cluster environment was Ubuntu 10.04 + Kernel-2.6.32 + ocfs2-tools-1.4.3.

Later we upgraded to Ubuntu 10.10 + Kernel-2.6.35 + ocfs2-tools-1.6.4.

We tried to use OCFS2 in production in 2010, but were forced to migrate to a failover cluster design with Ext4.

This is the bug that affected us:
https://oss.oracle.com/bugzilla/show_bug.cgi?id=1297

Even today, its status is "NEW".

Thanks!
--
Thiago Henrique
www.adminlinux.com.br


On 26-05-2014 18:39, Smart Weblications GmbH - Florian Wiessner wrote:
On 26.05.2014 15:52, Listas@Adminlinux wrote:
Thanks Pieter!

I tried using OCFS2 over DRBD, but was not satisfied. I was affected by
various bugs in OCFS2, and Oracle was not committed to fixing them.


When did you try it? We use such a setup with ocfs2 on top of rbd with 3.10.40
but also hit bugs with earlier kernel versions. Which bugs did you hit? I
noticed some ocfs2 changes in the changelog between 3.10.20 and 3.10.40...

I also did online resize of rbd image and then online resize of ocfs2 without
problems.



On 24-05-2014 09:14, Pieter Koorts wrote:
If looking for a DRBD alternative and not wanting to use CephFS, is it
not possible to just use something like OCFS2 or GFS on top of an RBD
block device, with all worker nodes accessing it via GFS or OCFS2
(obviously with write-through mode)?

Would this method not present some advantages over DRBD?

DRBD has its uses and will never go away but it does have limited
scalability in the general sense.


Hi!

I have failover clusters for some applications, generally with 2 members
configured with Ubuntu + DRBD + Ext4. For example, my IMAP cluster works
fine with ~50k email accounts and my HTTP cluster hosts ~2k sites.

My mailbox servers are also multiple DRBD based cluster pairs.
For performance in fully redundant storage there isn't anything better
(in the OSS, generic hardware section at least).

See design here: http://adminlinux.com.br/cluster_design.txt

I would like to provide load balancing instead of just failover. So, I
would like to use a distributed filesystem architecture. As we
know, Ext4 isn't a distributed filesystem, so I wish to use Ceph in my
clusters.

You will find that all cluster/distributed filesystems have severe
performance shortcomings when compared to something like Ext4.

On top of that, CephFS isn't ready for production as the MDS isn't HA.

A potential middle way might be to use Ceph/RBD volumes formatted in Ext4.
That doesn't give you shared access, but it will allow you to separate
storage and compute nodes, so when one compute node becomes busy, mount
that volume from a more powerful compute node instead.

All that said, I can't see any way or reason to replace my mailbox DRBD
clusters with Ceph in the foreseeable future.
To get similar performance/reliability to DRBD I would have to spend 3-4
times the money.

Where Ceph/RBD works well is in situations where you can't fit the compute
needs into a storage node (as required with DRBD) and where you want to
access things from multiple compute nodes, primarily for migration
purposes.
In short, as shared storage for VMs.

Any suggestions for design of the cluster with Ubuntu+Ceph?

I built a simple cluster of 2 servers to test simultaneous reading and
writing with Ceph. My conf: http://adminlinux.com.br/ceph_conf.txt

Again, CephFS isn't ready for production, but other than that I know very
little about it as I don't use it.
However, your version of Ceph is severely outdated; you really should be
looking at something more recent to rule out that you're experiencing
long-fixed bugs. The same goes for your entire setup and kernel.

Also Ceph only starts to perform decently with many OSDs (disks) and
the journals on SSDs instead of being on the same disk.
Think DRBD AL metadata-internal, but with MUCH more impact.

Regards,

Christian
But in my simultaneous benchmarks I found errors in reading and writing. I
ran "iozone -t 5 -r 4k -s 2m" simultaneously on both servers in the
cluster. The performance was poor and there were errors like this:

Error in file: Found ?0? Expecting ?6d6d6d6d6d6d6d6d? addr b6600000
Error in file: Position 1060864
Record # 259 Record size 4 kb
where b6600000 loop 0
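
For readers decoding the error above, here is a minimal sketch (not part of the original thread) that checks the reported numbers against the iozone options quoted in the message. The values are copied from the error text and the "-r 4k -s 2m" flags. Reading "Found 0" as zero-filled data being returned instead of the written pattern is an assumption, not something the log states.

```python
# Values copied from the iozone command line and error message above.
RECORD_SIZE = 4 * 1024          # -r 4k
FILE_SIZE = 2 * 1024 * 1024     # -s 2m
POSITION = 1060864              # "Position" in the error message
REPORTED_RECORD = 259           # "Record # 259"

# The pattern iozone expected to read back, as printed in the error.
EXPECTED = bytes.fromhex("6d6d6d6d6d6d6d6d")

# Which record does the failing byte offset fall in, and where inside it?
record_index, offset_in_record = divmod(POSITION, RECORD_SIZE)

print("record index:", record_index)
print("offset inside record:", offset_in_record)
print("matches reported record:", record_index == REPORTED_RECORD)
print("inside the test file:", POSITION < FILE_SIZE)
print("expected pattern bytes:", EXPECTED)
```

The offset is an exact multiple of the 4 KB record size and lands on the record number iozone reports, so the mismatch begins at a record boundary rather than in the middle of a record.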

Performance graphs of benchmark: http://adminlinux.com.br/ceph_bench.html

Can you help me find what I did wrong?




_______________________________________________
ceph-users mailing list
[email protected]
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
