Re: [CentOS-virt] Fwd: Xen4CentOS kernel panic on dom0 reboot
On Sat, Mar 08, 2014 at 09:09:07AM -0600, PJ Welsh wrote:

No, I have not followed those instructions yet. These were production servers for which I had scheduled firmware updates late Sunday evening. The first time, I thought the error was a fluke, and I only began to research it after the second failure (and still no firmware updates, due to the power-cycle). I may try to sneak in a restart of one of the systems late Sunday night US CT.

OK.

Still not sure why the running VMs would stop the reboot... The server shows that it was supposed to be restarting. I have had a similar "stuck on restarting" message (minus all the umount errors) on some Dell T105s running CentOS 6.5, and the reboot=pci grub.conf kernel option is what ended up working for them. I have not tested that possible option yet either, since that would take 2 reboots to put into place.

Yeah, it's worth testing both, to figure out what's wrong.

-- Pasi

On Fri, Mar 7, 2014 at 10:43 PM, Pasi Kärkkäinen pa...@iki.fi wrote:

On Thu, Mar 06, 2014 at 01:54:22PM -0600, Phillippe Welsh wrote:

Subject: Re: [CentOS-virt] Xen4CentOS kernel panic on dom0 reboot

On Wed, Mar 5, 2014 at 10:17 AM, David Vrabel david.vrabel at citrix.com wrote:

On 05/03/14 15:09, Karl Johnson wrote:

I've been using Xen4CentOS for the last 3 months. It's working fine and dom0/domUs are stable, but the server kernel panics when rebooting and has to be hard reset manually. It has panicked on the last 3 reboots.

There is a xenbus device still present, and during shutdown it is trying to set it to CLOSED, but at this point xenstored isn't running and the xenbus write stalls. Do you have VMs that are still running when you attempt a reboot? If so, shutting them down will likely avoid this. Can you provide the output of xenstore-ls prior to attempting a reboot?

I thought Xen init.d scripts would stop all of them before rebooting?
Here's the output of chkconfig and xenstore-ls: http://pastebin.centos.org/8186/

Thanks, Karl

I've got a me-too on the reboot hang issue for 2 different Dell R710s with xen-4.2.4-29.el6 and at least the kernels kernel-3.10.25-11.el6.centos.alt.x86_64 and kernel-3.10.23-11.el6.centos.alt.x86_64. I have not tried to reboot with the latest kernel-3.10.32-11.el6.centos.alt.x86_64 (if the kernel even makes a difference). I *have* had the dom0_mem=1024M,max:1024M option in place for all of them, with only 6 VMs. Any new suggestions?

So did you make sure all the VMs are shut down before trying to reboot dom0?

-- Pasi

___ CentOS-virt mailing list CentOS-virt@centos.org http://lists.centos.org/mailman/listinfo/centos-virt
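The suggestion in this thread, to make sure every guest is shut down before dom0 reboots, could be scripted roughly as follows. This is only a sketch: it assumes the usual `xl list` column layout (name first, domain ID second, dom0 at ID 0) and that `xl shutdown -w` waits for the guest to finish, which should be checked against the installed Xen version. The sample text and the guest names `web01` and `db01` are made up for illustration and are not output from the servers discussed here.

```python
from typing import List


def running_domus(xl_list_output: str) -> List[str]:
    """Return the names of all domains except dom0 from `xl list` text."""
    names = []
    for line in xl_list_output.strip().splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 2:
            continue
        name, domid = fields[0], fields[1]
        if domid != "0":  # domain ID 0 is dom0 itself
            names.append(name)
    return names


def shutdown_commands(names: List[str]) -> List[List[str]]:
    """Build one `xl shutdown -w <name>` argv per guest.

    Assumption: -w makes xl wait until the guest has shut down.
    """
    return [["xl", "shutdown", "-w", name] for name in names]


# Illustrative sample only; not captured from the systems in this thread.
SAMPLE = """\
Name                                        ID   Mem VCPUs      State   Time(s)
Domain-0                                     0  1024     4     r-----     812.3
web01                                        3  2048     2     -b----     120.9
db01                                         5  4096     2     -b----     340.1
"""

if __name__ == "__main__":
    for argv in shutdown_commands(running_domus(SAMPLE)):
        print(" ".join(argv))
```

On a real dom0 the text would come from running `xl list` (for example via `subprocess.run(["xl", "list"], capture_output=True, text=True)`), and each generated command would be executed before issuing the reboot.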
Re: [CentOS-virt] Remove Centos from AWS marketplace
On Sun, Mar 09, 2014 at 11:28:07AM -0400, Digimer wrote:

Would you mind elaborating on this? If a snapshot is a point-in-time image of a VM (or even a normal FS), why would DB backups be at risk (assuming things like fsync are used)? I'm asking in general terms... no idea if this is something AWS specific.

Database disk snapshots may include transactions in flight, and the on-disk image may not be in a consistent state. Databases such as Oracle try to work around this by ensuring that writes occur in a specific order and by having a good recovery process (each data file has a change number; determine the best change number to start from, roll forward from there to recover, then roll back any incomplete transactions), but that is considered crash recovery and shouldn't be part of BAU (business as usual) activity.

Other databases may not be so good at recovery (mysql?), and so you run the risk of database corruption if you need to restore the snapshot.

If you rely on disk snapshots, then it's recommended that you do a proper db dump before the snapshot is taken, so that you can recover the database from the dump file and not the snapshot.

-- rgds Stephen
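The recovery sequence described above (start from the data file's change number, roll forward, then roll back incomplete transactions) can be illustrated with a toy model. This is not how Oracle or any real database implements it; the record layout, the `COMMIT` marker, and the `recover` function are all hypothetical and exist only to show the two phases.

```python
from typing import Dict, List, Optional, Tuple

COMMIT = "COMMIT"  # hypothetical marker: a record with this key commits its transaction

# (change_number, txn_id, key, before_value, after_value); None means "key absent".
Record = Tuple[int, str, str, Optional[int], Optional[int]]


def recover(data: Dict[str, int], data_change: int, log: List[Record]) -> Dict[str, int]:
    """Toy crash recovery: roll forward from data_change, then undo uncommitted work."""
    state = dict(data)
    ordered = sorted(log, key=lambda rec: rec[0])
    committed = {txn for _, txn, key, _, _ in ordered if key == COMMIT}

    # Roll forward: replay every change newer than what the data file contains.
    for change, txn, key, before, after in ordered:
        if change > data_change and key != COMMIT:
            if after is None:
                state.pop(key, None)
            else:
                state[key] = after

    # Roll back: walk the log backwards and restore the before-image of every
    # change that belongs to a transaction which never committed.
    for change, txn, key, before, after in reversed(ordered):
        if key == COMMIT or txn in committed:
            continue
        if before is None:
            state.pop(key, None)
        else:
            state[key] = before
    return state


if __name__ == "__main__":
    snapshot = {"a": 1, "b": 5}  # data file contents as of change number 10
    log = [
        (9, "t1", "a", 0, 1),           # already reflected in the snapshot
        (11, "t1", COMMIT, None, None),
        (12, "t2", "b", 5, 7),
        (13, "t3", "c", None, 3),       # t3 was still in flight at snapshot time
        (14, "t2", COMMIT, None, None),
    ]
    print(recover(snapshot, 10, log))
```

The in-flight transaction `t3` is the case the thread warns about: its write reaches the restored image during roll forward and has to be undone, which is why restoring from a snapshot is crash recovery and not a routine backup path.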
Re: [CentOS-virt] Remove Centos from AWS marketplace
On Sun, Mar 9, 2014 at 11:28 AM, Digimer li...@alteeve.ca wrote:

Would you mind elaborating on this? If a snapshot is a point-in-time image of a VM (or even a normal FS), why would DB backups be at risk (assuming things like fsync are used)? I'm asking in general terms... no idea if this is something AWS specific.

digimer

It's a general issue. If a system snapshot is used to correctly preserve both the disk image and the state of the VM, including memory, well and good. The state is recoverable. There's always a risk that interrupted network transactions left things in an unexpectedly inconsistent state that the VM is not equipped to handle: I'm thinking particularly of wget or other download transactions where the download software was not intelligent enough to verify the download before proceeding. I've been through this a lot lately with chef software. It's compounded by network-based filesystem transactions, such as interactions with NFS or CIFS filesystems, which cannot be synchronized with the OS snapshot.

But simply relying on the disk image from such an AWS snapshot, without recovering the full system state, is a potential adventure. I've not myself had the opportunity to play with this kind of restoration, so I'm uncertain whether AWS allows access to the plain disk image, or would automatically bring the full VM state with it for re-activation of the snapshot.

If you're just getting at the disk images, using fsync before the snapshots is helpful, but the atomicity of any transaction that is in progress at the time of the disk image snapshot cannot be verified. This particularly includes precisely the sort of page-mapped data, sitting in RAM, that the fsync command helps write to disk.
And snapshots scheduled from outside controllers, such as automatic snapshots, cannot be reliably synced with system-specific fsync or database suspension commands without a great deal of integration between the outside system and the local host, which VMs are not normally supposed to need.

I went through a great deal of this some years back: shutting down databases, running LVM to get a disk snapshot, then running rsnapshot against the *snapshot* to avoid getting an inconsistent state of the database into the backup system.

And there are some *funky* databases out there. Ask sometime about the "use hardlinked RCS files for source control of multiple project branches" approach, if you'd like to wince a lot.
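The procedure described above (stop the database, take an LVM snapshot, back up from the snapshot) might be laid out as an ordered command plan like the one below. It is a sketch under several assumptions: the service name, volume group, logical volume, snapshot name, size, and mount point are placeholders; the `service` command is assumed to control the database on the host; and rsnapshot is assumed to be configured separately so that its backup point is the snapshot's mount point. The function only builds the list and runs nothing.

```python
from typing import List


def snapshot_backup_plan(
    service: str,
    volume_group: str,
    logical_volume: str,
    snapshot_name: str = "dbsnap",      # placeholder snapshot LV name
    snapshot_size: str = "1G",          # room for writes made while the snapshot exists
    mount_point: str = "/mnt/dbsnap",   # placeholder; must match the rsnapshot config
    rsnapshot_interval: str = "daily",  # placeholder; must exist in the rsnapshot config
) -> List[List[str]]:
    """Return the commands, in order, for a cold LVM-snapshot backup."""
    origin = f"/dev/{volume_group}/{logical_volume}"
    snap = f"/dev/{volume_group}/{snapshot_name}"
    return [
        ["service", service, "stop"],    # quiesce the database first
        ["lvcreate", "--snapshot", "--size", snapshot_size,
         "--name", snapshot_name, origin],
        ["service", service, "start"],   # downtime ends once the snapshot exists
        ["mount", "-o", "ro", snap, mount_point],
        ["rsnapshot", rsnapshot_interval],  # back up the frozen copy, not the live data
        ["umount", mount_point],
        ["lvremove", "-f", snap],        # drop the snapshot so it stops consuming space
    ]


if __name__ == "__main__":
    for argv in snapshot_backup_plan("mysqld", "vg0", "db"):
        print(" ".join(argv))
```

The ordering is the point: the database is down only for the instant the snapshot is created, and the slow copy runs against a frozen image. Mount options depend on the filesystem, so the read-only mount shown here is only a starting point.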
Re: [CentOS-virt] Remove Centos from AWS marketplace
On 09/03/14 11:52 AM, Nico Kadel-Garcia wrote:

On Sun, Mar 9, 2014 at 11:28 AM, Digimer li...@alteeve.ca wrote:

Would you mind elaborating on this? If a snapshot is a point-in-time image of a VM (or even a normal FS), why would DB backups be at risk (assuming things like fsync are used)? I'm asking in general terms... no idea if this is something AWS specific.

digimer

It's a general issue. If a system snapshot is used to correctly preserve both the disk image and the state of the VM, including memory, well and good. The state is recoverable. There's always a risk that interrupted network transactions left things in an unexpectedly inconsistent state that the VM is not equipped to handle: I'm thinking particularly of wget or other download transactions where the download software was not intelligent enough to verify the download before proceeding. I've been through this a lot lately with chef software. It's compounded by network-based filesystem transactions, such as interactions with NFS or CIFS filesystems, which cannot be synchronized with the OS snapshot.

But simply relying on the disk image from such an AWS snapshot, without recovering the full system state, is a potential adventure. I've not myself had the opportunity to play with this kind of restoration, so I'm uncertain whether AWS allows access to the plain disk image, or would automatically bring the full VM state with it for re-activation of the snapshot.

If you're just getting at the disk images, using fsync before the snapshots is helpful, but the atomicity of any transaction that is in progress at the time of the disk image snapshot cannot be verified. This particularly includes precisely the sort of page-mapped data, sitting in RAM, that the fsync command helps write to disk.
And snapshots scheduled from outside controllers, such as automatic snapshots, cannot be reliably synced with system-specific fsync or database suspension commands without a great deal of integration between the outside system and the local host, which VMs are not normally supposed to need.

I went through a great deal of this some years back: shutting down databases, running LVM to get a disk snapshot, then running rsnapshot against the *snapshot* to avoid getting an inconsistent state of the database into the backup system.

And there are some *funky* databases out there. Ask sometime about the "use hardlinked RCS files for source control of multiple project branches" approach, if you'd like to wince a lot.

This is very useful, thank you kindly for sharing. I suppose I always considered the "it's like recovering from the server losing power" case as "usually works", and equated that to a "good enough" backup.

So I suppose, at best, using snapshot images as a backup ... backup method would be valid... I could see the benefit of recovering the VM and then, if anything wasn't right, using it as the target for restoring data from the proper backup.

Thanks again!

-- Digimer Papers and Projects: https://alteeve.ca/w/ What if the cure for cancer is trapped in the mind of a person without access to education?
Re: [CentOS-virt] Remove Centos from AWS marketplace
On 09/03/14 11:43 AM, Stephen Harris wrote:

On Sun, Mar 09, 2014 at 11:28:07AM -0400, Digimer wrote:

Would you mind elaborating on this? If a snapshot is a point-in-time image of a VM (or even a normal FS), why would DB backups be at risk (assuming things like fsync are used)? I'm asking in general terms... no idea if this is something AWS specific.

Database disk snapshots may include transactions in flight, and the on-disk image may not be in a consistent state. Databases such as Oracle try to work around this by ensuring that writes occur in a specific order and by having a good recovery process (each data file has a change number; determine the best change number to start from, roll forward from there to recover, then roll back any incomplete transactions), but that is considered crash recovery and shouldn't be part of BAU (business as usual) activity.

Other databases may not be so good at recovery (mysql?), and so you run the risk of database corruption if you need to restore the snapshot.

If you rely on disk snapshots, then it's recommended that you do a proper db dump before the snapshot is taken, so that you can recover the database from the dump file and not the snapshot.

Thanks for the reply, Stephen. I also replied to Nico, and my comments there can be directed to you as well. :)

-- Digimer Papers and Projects: https://alteeve.ca/w/ What if the cure for cancer is trapped in the mind of a person without access to education?