On 2020-10-01 04:33, Jeremey Wise wrote:
> I have for many years used gluster because...well, 3 nodes, and so long
> as I can pull a drive out I can get my data, and with three copies I
> have a much higher chance of getting it.
>
> Downsides to gluster: slower (it's my home...meh...and I have SSDs to
> avoid MTBF issues), and with VDO and thin provisioning I've not had issues.
>
> BUT...gluster seems to be falling out of favor, especially as I move
> towards OCP.
>
> So...CEPH. I have one SSD in each of the three servers, so I have some
> space to play.
>
> I googled around and found no clean deployment notes or guides on
> CEPH + oVirt.
Greetings,

First, the legalese: the below is all personal views and experiences. I am not speaking on my employer's behalf on any of it; I just happen to have gained most of the experience from my work for them...yadda yadda yadda...and so on and so forth. This is all just me and my thoughts/opinions. :-D *sigh* It sucks we are in a world where that garbage has to be said when talking about work-related experiences... Anyway...

We've been running CephFS since Firefly in 2014. Yeah, I know. We were crazy, but the risk of data loss vs. speed was within the threshold of what we were trying to do. Fast-forward six years and we've got two CephFS clusters as primary storage for high-performance clusters where we very much care about performance AND the risk of data loss. We've also got two deployments of oVirt with CephFS as the filesystem. In other words, I've got some experience with this setup, and we are /very/ happy with it. :-)

I'm so happy with it that it is easier/faster for me to list the bad than to list the good.

1. Red Hat (our OS, to satisfy the "enterprise" check-box for the audit-heads) and I have gone round and round multiple times over the years. In short, don't expect excellent support out of oVirt for Ceph. Want to use Ceph via iSCSI or Cinder? Whooo boy, do I have some horror stories for you! That's one of the many reasons we prefer CephFS. But say that to them and you get blank looks until they've escalated the ticket sufficiently high up the chain, and even then it's not reassuring...

However, if you present CephFS to oVirt as NFS, it works...but you don't get the high-availability or high-performance aspects of scaling your metadata nodes when coming from oVirt. You _SHOULD_ scale your metadata nodes (as with everything in Ceph, scaling in threes is best), but oVirt won't let you mount "cephmds01,cephmds02,cephmds03".
It will gladly tell you that it works, but the moment you start a VM on it, oVirt freaks out, and it has done so since I reported it years ago (I recently confirmed this again on 4.4 with CentOS 8). But if you just mount "cephmds01" and then hack around on your IP routes in your switch to handle the distribution of the data, it's fine. Honestly, even if you just mount a single host, so long as you /know/ that and you _plan_ upgrades/failures/etc. around it, it's still fine. It just really sucks that RH pushes Ceph and claims it's a valued FS, but then doesn't really support anything but their cloud variations of Ceph; step outside their very narrow definitions and you get a *shrug*. Sigh...anyway...digressing, as this isn't the time/place for my rants. :-D

Point being, if you are going RH, don't expect to use any of their helper scripts or minimal install builds or anything like that. Do a minimal OS install, add the CephFS drivers, then install oVirt (or...I forget what they call it..) and configure Ceph like you would NFS. It should be fine afterwards. I've rarely found significant differences between the community version of oVirt and the RH version (when comparing the same/similar versions), including the support for Ceph.

2. We get incredible performance out of Ceph, but it does require tuning. Ceph crushes the pre-packaged vendors we ran tests against, in part because it is flexible enough that we can swap out the bits we need to scale - and we can do that FAR cheaper than the pre-packaged solutions allow. Yes, in threes for the servers: three metadata servers, three monitors (we double those two services up on the same servers), and storage in blocks of three. If your SSDs are fast enough, one SSD for every two spinning disks is a great ratio. And rebuild times across the cluster are only as fast as your back-plane, so you should have a dedicated back-plane network in addition to your primary network.
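In Ceph terms, that dedicated back-plane is what the docs call the "cluster network", kept separate from the client-facing "public network". A minimal ceph.conf sketch of the split (the subnets are hypothetical, not anything from the setup described here):

```ini
[global]
# Client-facing traffic: oVirt hosts, MDS and monitor chatter.
public_network = 10.0.0.0/24
# OSD replication and recovery traffic - the "back-plane" - on its own links,
# so rebuilds don't starve the clients.
cluster_network = 10.0.1.0/24
```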
Everyone wants their primary network fast, but your back-plane should be equally fast if not faster (and no, don't add just one "fast" network - it should be two). So you are going to need to plan and tweak your install. Just throwing parts at Ceph and expecting it to work will get you mixed results at best.

3. I'd never run it at my home. My home oVirt system mounts NFS to a ZFS filesystem. Nothing fancy, either: striped mirrors ensure good read/write speed with good fault tolerance. I threw in two cheap SSDs as a log drive and a cache drive (those two SSDs made HUGE performance gains for the oVirt VMs) and it's been smooth sailing since. It's trivial to manage/upgrade and has FAR less overhead than Ceph.

That's really just the warnings I've got for you. I'm a HUGE fan of oVirt; we've done some pretty nutty stuff with it in testing, and I trust it in multiple environments where we throw some pretty heavy loads at it. I've got TONS of praise for oVirt and the whole team that backs it. It's fantastic. And I do love Ceph (and specifically CephFS), and we get incredible performance that I could gush over all day long.

If you are planning on building Ceph on the cheap, plan replication in sets of three, and prepare for lots of tweaking and tuning. If you are in the position to buy, I *HIGHLY* recommend at least talking to https://softiron.com (I do not work for them, I do not get any kick-back from them; I'm just very pleased with their product). They focus on Ceph and they do it well, but they still let you tweak as needed. And since they build off of Arm processors, all the power and heat come from the drives...these things run super-cool. Loads more efficient than the home-built stuff we ran for years.

I'm even a huge fan of running oVirt with CephFS storage! I _REALLY_ wish the combo were treated better. But most of my frustrations are many years old at this point, and we've figured out workarounds in the meantime.
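For concreteness, the single-host workaround from point 1 looks roughly like this. This is only a sketch under my own assumptions, not the author's exact setup: "cephmds01" is the hypothetical monitor/MDS host from above, and "ovirt"/the secretfile are a hypothetical CephX client you would have created beforehand.

```shell
# Client bits for CephFS (package name from the upstream Ceph repos;
# adjust for your distro and Ceph release).
dnf install -y ceph-common

# Manual test mount with the kernel CephFS client, pointing at a SINGLE
# host rather than the comma-separated list that trips oVirt up.
mkdir -p /mnt/cephfs
mount -t ceph cephmds01:6789:/ /mnt/cephfs \
    -o name=ovirt,secretfile=/etc/ceph/ovirt.secret
```

Once that works by hand, the usual way to hand it to oVirt is a "POSIX compliant FS" storage domain with Path `cephmds01:6789:/`, VFS Type `ceph`, and the same mount options; listing multiple hosts in the Path is where the breakage described above shows up.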
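And for anyone tempted by the home ZFS setup in point 3, it can be sketched like this. The device names (sdb–sdg) and pool/dataset names are hypothetical placeholders, not the author's actual layout:

```shell
# Striped mirrors: two mirror vdevs striped together, so I/O spreads across
# both pairs while each pair tolerates a single-disk failure.
zpool create tank \
    mirror /dev/sdb /dev/sdc \
    mirror /dev/sdd /dev/sde

# One cheap SSD as a separate intent-log device (helps the sync writes that
# NFS issues) and one as an L2ARC read cache.
zpool add tank log /dev/sdf
zpool add tank cache /dev/sdg

# A dataset exported over NFS for oVirt to use as an NFS storage domain.
zfs create -o sharenfs=on tank/ovirt
```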
It's too much for me to want to mess with at home, but so long as you plan out your Ceph install and you are prepared to be the odd-ball using CephFS + oVirt (workarounds included), it's a great setup. I absolutely believe that we've gotten a HUGE return on our investment in Ceph...but I'm also using it for high-speed data computations in a big cluster; the oVirt + CephFS is an add-on to the HPC + CephFS. The ROI on oVirt is also huge, because we were never satisfied with other virtualization solutions, and while OpenStack worked for us, it was FAR more overhead than we needed or could support with a team as small as ours. So I'm a big believer that our specific use cases for both are a massive ROI win.

Should you decide to move forward with CephFS + oVirt and you have questions, feel free to reach out to me. No promises that your problems will be the same as mine, but I can at least share some experiences/config-settings with you.

Good luck!
~Stack~

_______________________________________________
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: https://www.ovirt.org/community/about/community-guidelines/
List Archives: https://lists.ovirt.org/archives/list/users@ovirt.org/message/YSMPMBZ435SK6UHYSWHLQLG4YRO5LAQ3/