Hi Marc,

Thanks for the reply.

I know the RBDSR project very well.
I was one of the first contributors to the project: https://github.com/rposudnevskiy/RBDSR/graphs/contributors
I rewrote the whole installation script to make setup easy and to allow installation across all Xen clusters in a few commands.

I ran that plugin for about 6 months in 2017 and everything went fine until a disk image suddenly locked up and never came back. There was an error with the "tap-ctl" driver and I had a terrible time rebuilding that image from backups.
It was a nightmare.

"tap-ctl" is a driver bundled with the XEN distribution which needs to be overwritten to make this plugin work. It is a brute-force patching approach, with no guarantee that the driver will not simply be overwritten again by a future Xen update.
This is unsafe.
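
Just to show how fragile this is, here is the kind of check I would run after every XenServer update (a rough sketch; the path /usr/sbin/tap-ctl may differ on your dom-0):

     # record a checksum of the patched binary right after installing the plugin
     md5sum /usr/sbin/tap-ctl > /root/tap-ctl.md5

     # after any XenServer update, check whether the patch survived
     md5sum -c /root/tap-ctl.md5 || echo "tap-ctl was replaced, re-apply the RBDSR patch"

     # rpm can also report whether the file differs from the stock package
     rpm -Vf /usr/sbin/tap-ctl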

Until your team joined the project a few months ago, the code was written mainly by a saint called Roman, with very little support from the community. Recently I have seen a lot of new work done by your team, and it looks great; however, I guess it is still not production ready.

By the way, I have started to think that the real issue here is not Ceph but XEN. In order to install the Ceph client on XenServer you need to break the repo, break the driver, and install newer libraries that break XEN's code somewhere.
I have done it, but it led to errors somewhere in the storage handler.
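
Just to give an idea of what "break the repo" means, it is roughly something like this (a rough sketch; repo file names and package versions depend on the XenServer and Ceph releases):

     # enable the CentOS repos that XenServer ships disabled on purpose
     sed -i 's/enabled=0/enabled=1/' /etc/yum.repos.d/CentOS-Base.repo

     # pull in the Ceph client bits the plugin needs
     yum install -y ceph-common rbd-nbd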

I think that as long as Xen stays with Citrix, this hypervisor will always have heavy incompatibility issues. Citrix is too slow and too committed to what the big brands need rather than to what SysAdmins need.
But I see a light of hope: https://xcp-ng.org
XCP-ng is the new version of XenServer, built from the original sources but compiled fully featured by a very good team of developers. They are rewriting everything that is absurd in the official XenServer and removing all the limits, and they have started to deliver updates through a repo instead of that tricky and absurd patch system. I hope they will update the OS and use a custom kernel fully compatible with the Ceph client.

About this, I wrote to Roman a few weeks ago proposing to merge his RBDSR plugin into that project and join forces with XCP-ng.
What do you think about it?

Going back to your question about being pioneers.
I had the same decision to make a few months ago, so I really see your point.
In the end, iSCSI-ceph seems to have a much larger community behind it than RBDSR (which is a project of interest only to Xen people). So yes, at this time you and I are pioneers for both technologies, but what about in 3 months?
Will iSCSI-ceph still be a 'pioneer' technology? I guess not.
Will RBDSR still be a 'pioneer' technology? I would say... it always will be, since it needs to hack the OS to work.

So, without any kind of support, we will probably go on with iSCSI-ceph, even if this means losing 4x speed.
I need to stay safe.

But of course, I like people who find a good reason to change my mind :)







On 13/06/2018 09:25, Marc Schöchlin wrote:
Hi Max,

just a sidenote: we are using a fork of RBDSR
(https://github.com/vico-research-and-consulting/RBDSR) to connect
XENServer 7.2 Community to RBDs directly using rbd-nbd.
After a bit of hacking this works pretty well: direct RBD creation from
the storage repo, live migration between xen nodes within a pool, and
migration between pools.
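
For reference, under the hood the attach path is plain rbd-nbd; a
minimal sketch (pool and image names are just placeholders):

     # map an RBD image into dom-0 as a block device
     rbd-nbd map rbd_xen/vm-disk-001

     # list current mappings and detach again
     rbd-nbd list-mapped
     rbd-nbd unmap /dev/nbd0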

A few weeks ago we had to decide between two options:

   * Do pioneer work with the new and really fancy active/active
     multipathed LIO TCMU-U infrastructure
       o compile a bunch of software and create packages for ubuntu
         ("ceph-iscsi-config", "python-rtslib-fb", "a recent
         linux-kernel", "ceph-iscsi-cli", "ceph-iscsi-tools",
         "tcmu-runner", ...)
       o get the setup stable, work on a complex setup with no real userbase
       o deploy iscsi gateways on our 5 osd nodes and distribute workload
         on some pairs of gateways
       o have krbd load on osd nodes, add complexity on ceph-upgrades
       o accept the limitations of iscsi, especially hacky edge-cases if
         gateways go away and return
       o write automation code for our centralized operations inventory
         system which interfaces and manages the associations of xen vms
         to iscsi storage repos, iscsi volumes, rbd images
   * Do pioneer work on RBDSR
       o improve the storage gluecode of RBDSR
       o get the setup stable, work on a simpler setup with no userbase
       o have real multipathed rbd access without limitations
       o get good performance, especially from an overall-cluster view
         (especially LACP bandwidth usage between xen servers and ceph osds)
       o have librbd (rbd-nbd, ceph-client) workload on every xenserver dom-0
         => better scaling: more xen servers -> better overall performance
       o utilize rbd cache (there is nothing comparable in XENServer 7.2
         Community)
       o use the capabilities of ceph to create snapshots and clone
         systems (see the sketch below)
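
As a sketch of that last point, the snapshot/clone workflow we get from
ceph for free (image and snapshot names are placeholders):

     # snapshot a VM disk, protect the snapshot, and clone it into a new image
     rbd snap create rbd_xen/vm-disk-001@golden
     rbd snap protect rbd_xen/vm-disk-001@golden
     rbd clone rbd_xen/vm-disk-001@golden rbd_xen/vm-disk-002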

What do you think about that?

Regards
Marc


On 12.06.2018 at 21:03, Max Cuttins wrote:
Hi everybody,

I have a running iSCSI-ceph environment connected to XenServer 7.2.
I have some doubts and rookie questions about iSCSI.

1) Xen refused to connect to the iSCSI gateway until I turned on
multipath on Xen.
That is fine with me. But is it right to say that multipath is much more
than just a feature, and is in fact a mandatory way to connect?
Is this normal? I thought iSCSI multipath was backward-compatible with
the single-path setup.

2) The connection was established correctly with multipath.
I see on the XEN dashboard:

     *2 of 2 paths active* (2 iSCSI sessions)

I have read around that, for now, the iSCSI gateway only supports
active/passive multipath.
Is this already working? :)
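
Side note: the paths can also be double-checked from dom-0 with the
standard tools:

     # show the iSCSI sessions and the multipath topology from dom-0
     iscsiadm -m session
     multipath -ll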

3) I see "optimized/not optimized" on my Ceph dashboard.
What does this stand for?

4) Performance.
I ran a simple test (nothing statistically proven), and I saw these
values:

     dd if=/dev/zero of=/iscsi-test/testfile bs=1G count=1 oflag=direct
     1073741824 bytes (1.1 GB) copied, 6.72009 s, *160 MB/s*

     dd if=/dev/zero of=/ceph-test/testfile bs=1G count=1 oflag=direct
     1073741824 bytes (1.1 GB) copied, 1.57821 s, *680 MB/s*

Of course I expected a drop (due to the overhead of iSCSI)... but this
is 4x slower than the direct client, which seems a little bit high to me.
However... is this *more-or-less* the drop I should expect with iSCSI,
or will this gap be reduced in the future?






_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
