Re: [ceph-users] After power outage, nearly all vm volumes corrupted and unmountable
Hello Gary,

On 2018-07-06 13:55, Gary Molenkamp wrote:
> # parted /dev/sdb
> GNU Parted 3.1
> Using /dev/sdb
> Welcome to GNU Parted! Type 'help' to view a list of commands.
> (parted) p
> Model: QEMU QEMU HARDDISK (scsi)
> Disk /dev/sdb: 42.9GB
> Sector size (logical/physical): 512B/512B
> Partition Table: msdos
> Disk Flags:
>
> Number  Start   End     Size    Type     File system  Flags
>  1      1049kB  42.9GB  42.9GB  primary  xfs          boot
>
> # mount -t xfs /dev/sdb temp
> mount: wrong fs type, bad option, bad superblock on /dev/sdb, missing
> codepage or helper program, or other error
>
>        In some cases useful info is found in syslog - try
>        dmesg | tail or so.
>
> # xfs_repair /dev/sdb
> Phase 1 - find and verify superblock...
> bad primary superblock - bad magic number !!!
> attempting to find secondary superblock...

You have a partition on your block device, but you are trying to mount and repair the block device itself. What happens if you mount /dev/sdb1 (instead of /dev/sdb) and, if needed, run xfs_repair on /dev/sdb1?

Kind regards,
Cybertinus

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
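For reference, the suggested fix as a command sequence; this is a sketch only: /mnt/temp is an assumed mount point, and xfs_repair should only be used if the mount of the partition still fails.

```shell
# Operate on the partition (/dev/sdb1), not the whole disk (/dev/sdb).
# /mnt/temp is an assumed mount point; create it first if needed.
mkdir -p /mnt/temp
mount -t xfs /dev/sdb1 /mnt/temp

# Only if that mount also fails: check the kernel log, then repair the
# partition. xfs_repair must be run against an unmounted filesystem.
dmesg | tail
xfs_repair /dev/sdb1
```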
Re: [ceph-users] what happen to the OSDs if the OS disk dies?
Hello Felix,

When you put your OS on a single drive and that drive fails, you will lose all the OSDs on that machine, because the entire machine goes down. The PGs that now miss a partner are going to be replicated again. So, in your case, the PGs that are on those 11 OSDs. This rebuilding doesn't start right away, so you can safely reboot an OSD host without starting a major rebalance of your data.

I would put 2 drives in RAID1 if I were you. Putting 2 SSDs in the rear 2.5" slots, as suggested by Brian, sounds like the best option to me. This way you don't lose a massive amount of storage (2 x 10 x 8 = 160 TB you would lose otherwise, just for the OS installation...).

---
Kind regards,
Cybertinus

On 12-08-2016 13:41, Félix Barbeira wrote:
> Hi,
>
> I'm planning to make a ceph cluster but I have a serious doubt. At this moment we have ~10 DELL R730xd servers with 12x4TB SATA disks. The official ceph docs say:
>
> "We recommend using a dedicated drive for the operating system and software, and one drive for each Ceph OSD Daemon you run on the host."
>
> I could use for example 1 disk for the OS and 11 for OSD data. In the operating system I would run 11 daemons to control the OSDs. But... what happens to the cluster if the disk with the OS fails?? Maybe the cluster thinks that 11 OSDs failed and tries to replicate all that data over the cluster... that sounds no good.
>
> Should I use 2 disks for the OS in a RAID1? In this case I'm "wasting" 8TB only for the ~10GB that the OS needs. All the docs I've been reading say ceph has no single point of failure, so I think this scenario must have an optimal solution; maybe somebody could help me.
>
> Thanks in advance.
>
> --
> Félix Barbeira.
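As a side note on the "safely reboot" point: the usual way to keep Ceph from starting that rebuild during planned maintenance is the noout flag. A minimal sketch, assuming a reasonably recent Ceph CLI:

```shell
# Tell the cluster not to mark down OSDs "out", so no rebalance starts
# while the host is being worked on.
ceph osd set noout

# ... reboot the OSD host, replace the OS disk, reinstall, etc. ...

# Once the host and its OSDs are back up, restore normal behaviour.
ceph osd unset noout

# Verify the cluster state afterwards.
ceph -s
```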
Re: [ceph-users] Is Ceph the right tool for me?
Hello,

That's a clear answer :). Thank you for that.

The core issue I'm trying to solve is indeed the lack of HA. At a previous employer I saw Ceph in action, and Ceph itself did what it promised. In the end it had way too little performance for what we were trying to do with it, but as a technology it rocked. That's the reason I'm looking at Ceph now, for my own project. I don't need the performance now.

I will look into DRBD and HAST (the FreeBSD version of DRBD) in a bit more detail; maybe that's the way to go. And when the DRBD/HAST cluster has become too small for what I use it for, I can always upgrade to Ceph at that point :).

Kind regards,
Cybertinus

On Fri, June 26, 2015 13:03, Nick Fisk wrote:
> -----Original Message-----
> From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of John Spray
> Sent: 26 June 2015 11:37
> To: Cybertinus; ceph-users@lists.ceph.com
> Subject: Re: [ceph-users] Is Ceph the right tool for me?
>
>> On 25/06/2015 23:28, Cybertinus wrote:
>>> Hello everybody,
>>>
>>> I'm looking at Ceph as an alternative for my current storage solution, but I'm wondering if it is the right choice for me. I'm hoping you guys can help me decide.
>>>
>>> The current setup is a FreeBSD 10.1 machine running entirely on ZFS. The function of the machine is offsite backup for important data. For some (fairly rapidly changing) data this server is the only backup of it. But because the data is changing fairly quickly (every day at least), I'm looking to make this server more HA than it is now.
>>
>> I think that last sentence is the key bit -- you're looking for an HA solution for your filer. While Ceph is highly available, it's probably huge overkill if you're just looking to add an additional server to make your storage HA.
>>
>> At two-node scale, you'll get better and more consistent performance from a local filesystem (like ZFS) on a dual-ported direct attached storage array than you would from *any* distributed filesystem. If the array is out of the question, then look to DRBD.
>>
>> All that said, you should definitely play with Ceph anyway, even if you don't need it for this project; it's awesome :-)
>>
>> John
>
> I would echo what John has just said. If your problem with ZFS is scaling it, or solving issues like performance or reliability at scale, then Ceph is your answer. But it sounds like you would be better off turning/rebuilding your current solution into something that has HA. Check out the LSI Syncro cards for a slightly easier implementation.
Re: [ceph-users] Is Ceph the right tool for me?
Hello,

Here is some more info:

At the moment it is a very small dataset: 500 GB. I've got 4 TB of storage available in the server (2x 4TB in a ZFS mirror) and I'm going to upgrade that to 12 TB net in the near future, just to have more spindles available (4x 4TB in raidz1). When this 12 TB is full (and I'm not sure at what rate this will go; it could be full in 3 months or 2 years), I need to upgrade my setup to something else. I'm looking at my options now, so I can have a plan ready on how to proceed at that point and don't have to rush things.

I'm using ZFS directly. It's DAS. The restore time isn't guaranteed.

I want the backup solution to be HA because, when a backup is made from some system towards my storage, I need to have my service available. Like I said, this is the only backup available for a lot of the data stored on this server. Most of the source machines don't have any redundancy built in, so having backups is important.

I don't have spare hardware in stock. I can arrange something when it goes horribly wrong, but it will impact my services in another way. It's not nice, but it will work. The risk is not extremely high given the age of the hardware. It isn't brand new, but it's not that old either: it's 2 years old now and has been running production for a year. But the PSU is single, for instance. The machine has a Xeon E3-1220 V2 as CPU, so like I said: not that old :).

I'm just storing backups on this server. I'm not backing up this server itself (yet ;). I just realized how stupid I am for not doing that; it will be fixed in the very near future :p).

Making a ZFS snapshot *and* keeping the DB consistent isn't that hard. You just run a FLUSH TABLES WITH READ LOCK command, which makes MySQL write everything to the datafiles and lock the tables against writes. The statement only returns once it holds the locks.
Then you have the SYSTEM command at the mysql prompt, which you can use to run a system command; so here you can let the mysql command-line tool start the snapshot process. When that's done, you just unlock the tables again, and in your snapshot the data is consistent.

I don't have an example ready for this trick with a ZFS snapshot (because I don't back up the FreeBSD machine yet), but I do apply it on a CentOS 6 VM of mine. There I'm making an LVM snapshot instead of a ZFS snapshot, but the essence of the trick stays the same. This is how I do it there (replace the parts between <> with your own settings):

mysql -h localhost -u root -p<mysql_root_password> -e 'FLUSH TABLES WITH READ LOCK; SYSTEM /sbin/lvcreate --snapshot --size <snapshot_size> --name <snapshot_name> /dev/mapper/<vgname>-<lvname>; UNLOCK TABLES;'

And for keeping the OwnCloud database in sync: right after creating the snapshot and the clone, I remove the entire cache for the specific user from the cache table in the OwnCloud database. I know this isn't recommended, but it works like a charm. Still, the cache in OwnCloud is something that keeps bothering me, and even the removal of those lines isn't that great all the time. Sometimes I need to redo the removal by hand, and I don't like that.

On Fri, June 26, 2015 11:58, 10 minus wrote:
> Hi,
>
> As Christian has mentioned, a bit more detailed information will do us good.
>
> We had explored CephFS, but performance was an issue vis-a-vis ZFS when we tested (more than a year back), so we did not get into details. I will let the CephFS experts chip in here on the present state of CephFS.
>
> How are you using ZFS on your main site: nfs, cifs, iscsi? How much data are we talking about?
>
> Yes, one machine is a SPOF, but these are the questions you should ask or answer:
> - Is there a business requirement to restore data in a defined time?
> - How much data is in play here?
> - What are the odds it fails (hw quality is improving by the day, which doesn't mean it won't fail)?
> - How fast can one replace failed HW (we have spare HW always available)?
> - Do you need always-on backup, especially offsite backup?
> - Have you explored the tape option?
>
> We are using ZFS on Solaris and FreeBSD as a filer (nfs/cifs) and we keep three copies of snapshots (we have 5 TB of data):
> - local on the filers (a snapshot every hour, kept for 2 days)
> - onsite on another machine (a snapshot copy every 12 hrs, kept for 1 week)
> - offsite (a snapshot copy every day, kept for 4 weeks; then from offsite to tape)
>
> For DB backup we have a system in place, but it does not rely on ZFS snapshots. Would love to know how you manage DB backups with ZFS snapshots.
>
> ZFS is a mature technology.
>
> P.S. We use ceph for openstack (ephemeral/cinder/glance) with no backup. (One year on we are still learning new things and it has just worked.)
>
> On Fri, Jun 26, 2015 at 9:00 AM, Christian Balzer ch...@gol.com wrote:
>> Hello,
>>
>> On Fri, 26 Jun 2015 00:28:20 +0200 Cybertinus wrote:
>>> Hello everybody,
>>>
>>> I'm looking at Ceph as an alternative for my current storage solution, but I'm wondering
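A ZFS variant of the lock-and-snapshot trick described earlier in the thread; a minimal sketch only, not a tested production command: the dataset name tank/mysql and the snapshot label are assumptions.

```shell
# Build the snapshot name up front, so the locked window stays short.
# tank/mysql is an assumed dataset holding the MySQL datadir.
snapname="tank/mysql@mysql-backup-$(date +%Y-%m-%d)"

# FLUSH TABLES WITH READ LOCK flushes the datafiles and blocks writes;
# SYSTEM runs the snapshot while the lock is held; UNLOCK TABLES releases it.
# The lock is held only for the duration of this single mysql invocation,
# which is exactly what makes the one-liner work.
mysql -h localhost -u root -p'<mysql_root_password>' -e "
  FLUSH TABLES WITH READ LOCK;
  SYSTEM zfs snapshot ${snapname};
  UNLOCK TABLES;"
```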
[ceph-users] Is Ceph the right tool for me?
Hello everybody,

I'm looking at Ceph as an alternative for my current storage solution, but I'm wondering if it is the right choice for me. I'm hoping you guys can help me decide.

The current setup is a FreeBSD 10.1 machine running entirely on ZFS. The function of the machine is offsite backup for important data. For some (fairly rapidly changing) data this server is the only backup of it. But because the data is changing fairly quickly (every day at least), I'm looking to make this server more HA than it is now. It is just one FreeBSD machine, so this is an enormous SPOF of course.

The ZFS functionality I use most is the snapshot technology. I've got multiple users on this server and each user has its own filesystem within the pool. I just snapshot each filesystem regularly, and that way I enable the users to go back in time. I've looked at the snapshot functionality of Ceph, but it's not clear to me what exactly I can snapshot with it.

Furthermore: what is the best way to hook Ceph up to the application I use to transfer the data from the users to the backup server? Today I'm using OwnCloud, which is (in essence) a WebDAV server. Now I'm thinking about replacing OwnCloud with something custom built. That way I can let PHP talk directly to librados, which makes it easy to store the data. Or I can keep on using OwnCloud and just hook up Ceph via CephFS. This has the added advantage that I don't have to get my head around the concept of object storage :p ;).

Kind regards,
Cybertinus
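The per-user snapshot scheme described above can be sketched in a couple of shell lines; tank/users and the example user name are assumptions, not the actual layout:

```shell
# One timestamped snapshot per run; -r recurses into every per-user child
# filesystem under the (assumed) tank/users dataset in a single command.
stamp=$(date +%Y-%m-%d_%H%M)
zfs snapshot -r "tank/users@backup-${stamp}"

# Users can then go back in time themselves via the hidden .zfs directory,
# e.g. (hypothetical user "alice"):
# ls /tank/users/alice/.zfs/snapshot/
```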