Re: [Xen-devel] [PATCH v3 00/11] xen: Initial kexec/kdump implementation
On 01/11/2013 01:08 PM, Vivek Goyal wrote:
>> A signed /sbin/kexec would realistically have to be statically linked,
>> at least in the short term; otherwise the libraries and ld.so would need
>> verification as well.
>
> Yes. That's the expectation. Sign only statically linked executables
> which don't do any of dlopen() stuff either. In fact in the patch, I
> fail the exec() if signed executable has interpreter.

As I said, though (and possibly not for kexec, that depends): in the long
term we probably want a way to be able to sign all kinds of binaries in
the system.

    -hpa

--
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel. I don't speak on their behalf.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Re: [Xen-devel] [PATCH v3 00/11] xen: Initial kexec/kdump implementation
On Fri, Jan 11, 2013 at 01:03:41PM -0800, H. Peter Anvin wrote:
> On 01/11/2013 12:52 PM, Vivek Goyal wrote:
> >
> > Eric,
> >
> > In a private conversation, David Howells suggested why not pass kernel
> > signature in a segment to kernel and kernel can do the verification.
> >
> > /sbin/kexec signature is verified by kernel at exec() time. Then
> > /sbin/kexec just passes one signature segment (after regular segment) for
> > each segment being loaded. The segments which don't have signature,
> > are passed with section size 0. And signature passing behavior can be
> > controlled by one new kexec flag.
> >
> > That way /sbin/kexec does not have to worry about doing any verification
> > by itself. In fact, I am not sure how it can do the verification when
> > crypto libraries it will need are not signed (assuming they are not
> > statically linked in).
> >
> > What do you think about this idea?
> >
>
> A signed /sbin/kexec would realistically have to be statically linked,
> at least in the short term; otherwise the libraries and ld.so would need
> verification as well.

Yes. That's the expectation. Sign only statically linked executables which
don't do any of dlopen() stuff either. In fact in the patch, I fail the
exec() if signed executable has interpreter.

Thanks
Vivek
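The rule Vivek describes (refuse the exec() when a signed executable asks for an interpreter) can be pictured as a scan of the ELF program headers for a PT_INTERP entry. What follows is only a rough sketch, not the code from his patch: the `sketch_` names and the trimmed-down header struct are hypothetical, and only the type values (PT_LOAD = 1, PT_INTERP = 3) come from the ELF specification.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, trimmed-down program header. A real check would walk the
 * kernel's own ELF program header structures; only the type matters here. */
struct sketch_phdr {
    uint32_t p_type;
};

#define SKETCH_PT_LOAD   1u  /* loadable segment (ELF PT_LOAD) */
#define SKETCH_PT_INTERP 3u  /* interpreter request (ELF PT_INTERP) */

/* Return 1 if any program header requests an interpreter such as ld.so. */
int sketch_has_interp(const struct sketch_phdr *ph, size_t n)
{
    assert(ph != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        if (ph[i].p_type == SKETCH_PT_INTERP)
            return 1;
    return 0;
}

/* Policy from the thread: a signed executable that has an interpreter is
 * refused. Returns 0 to allow the exec(), -1 to fail it. Unsigned
 * executables are not affected by this check. */
int sketch_check_signed_exec(int is_signed,
                             const struct sketch_phdr *ph, size_t n)
{
    if (is_signed && sketch_has_interp(ph, n))
        return -1;
    return 0;
}
```

A statically linked binary normally has no PT_INTERP entry, so it passes; a dynamically linked one does not. Note that this check alone says nothing about dlopen(), which is why the thread also asks that signed binaries avoid it.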
Re: [Xen-devel] [PATCH v3 00/11] xen: Initial kexec/kdump implementation
On 01/11/2013 12:52 PM, Vivek Goyal wrote:
>
> Eric,
>
> In a private conversation, David Howells suggested why not pass kernel
> signature in a segment to kernel and kernel can do the verification.
>
> /sbin/kexec signature is verified by kernel at exec() time. Then
> /sbin/kexec just passes one signature segment (after regular segment) for
> each segment being loaded. The segments which don't have signature,
> are passed with section size 0. And signature passing behavior can be
> controlled by one new kexec flag.
>
> That way /sbin/kexec does not have to worry about doing any verification
> by itself. In fact, I am not sure how it can do the verification when
> crypto libraries it will need are not signed (assuming they are not
> statically linked in).
>
> What do you think about this idea?
>

A signed /sbin/kexec would realistically have to be statically linked,
at least in the short term; otherwise the libraries and ld.so would need
verification as well.

Now, that *might* very well have some real value -- there are certainly
users out there who would very much want only binaries signed with
specific keys to get run on their system.

    -hpa
Re: [Xen-devel] [PATCH v3 00/11] xen: Initial kexec/kdump implementation
On Fri, Jan 11, 2013 at 12:26:56PM -0800, Eric W. Biederman wrote:

[..]
> Recently there is a desire to figure out how to have /sbin/kexec support
> signed kernel images. What will probably happen is to have a specially
> trusted userspace application perform the verification. Sort of like
> dom0 for the linux userspace. A few other ideas have been batted around
> but none that have stuck.

[ CC David Howells ]

Eric,

In a private conversation, David Howells suggested why not pass kernel
signature in a segment to kernel and kernel can do the verification.

/sbin/kexec signature is verified by kernel at exec() time. Then
/sbin/kexec just passes one signature segment (after regular segment) for
each segment being loaded. The segments which don't have signature,
are passed with section size 0. And signature passing behavior can be
controlled by one new kexec flag.

That way /sbin/kexec does not have to worry about doing any verification
by itself. In fact, I am not sure how it can do the verification when
crypto libraries it will need are not signed (assuming they are not
statically linked in).

What do you think about this idea?

Thanks
Vivek
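The segment layout proposed above (one signature segment after each regular segment, with size 0 when there is no signature) could be built in userspace roughly as follows. This is a sketch under stated assumptions rather than an agreed interface: the `sketch_` types and the function are hypothetical, the segment fields merely mirror the kexec segment layout discussed in this thread, and the choice to leave `mem`/`memsz` at zero for signature segments is an assumption, since the thread does not say where signatures would live.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical segment descriptor, modelled on the kexec segment layout
 * discussed in this thread (buffer, buffer size, destination, size). */
struct sketch_segment {
    const void *buf;
    size_t bufsz;
    unsigned long mem;
    size_t memsz;
};

/* Hypothetical detached signature; len == 0 means "segment is unsigned". */
struct sketch_sig {
    const void *data;
    size_t len;
};

/* Interleave segments and signatures: out[2*i] is segment i and
 * out[2*i + 1] is its signature segment. An unsigned segment still gets
 * a placeholder entry with size 0, as the proposal describes.
 * 'out' must have room for 2 * n entries. Returns the entry count. */
size_t sketch_interleave(const struct sketch_segment *segs,
                         const struct sketch_sig *sigs,
                         size_t n, struct sketch_segment *out)
{
    size_t k = 0;

    assert(segs != NULL && sigs != NULL && out != NULL);
    for (size_t i = 0; i < n; i++) {
        out[k++] = segs[i];
        out[k].buf = sigs[i].data;
        out[k].bufsz = sigs[i].len;
        out[k].mem = 0;     /* assumption: signatures are not loaded */
        out[k].memsz = 0;   /* into the destination image            */
        k++;
    }
    return k;
}
```

The resulting array would then be handed to the kernel together with the new flag mentioned above, so that the kernel knows every second entry is a signature rather than loadable data.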
Re: [Xen-devel] [PATCH v3 00/11] xen: Initial kexec/kdump implementation
On Fri, Jan 11, 2013 at 12:26:48PM -0800, H. Peter Anvin wrote:
> >
> >And there is nothing fancy to be done for EFI and SecureBoot? Or is
> >that something that the kernel has to handle on its own (so somehow
> >passing some certificates to somewhere).
> >
>
> For EFI, no... other than passing the EFI parameters, which
> apparently is *not* currently done (David Woodhouse is working on
> it.) Secure boot is still a work in progress.

For secureboot, as a first step in that direction, I just wrote some code
to sign ELF executables and to verify them in the kernel upon exec(). I am
planning to post RFC code soon (most likely next week).

Hopefully we will be able to sign a statically linked /sbin/kexec and give
it an extra capability (upon signature verification) to be able to call
sys_exec().

Thanks
Vivek
Re: [Xen-devel] [PATCH v3 00/11] xen: Initial kexec/kdump implementation
> And there is nothing fancy to be done for EFI and SecureBoot? Or is
> that something that the kernel has to handle on its own (so somehow
> passing some certificates to somewhere).

For EFI, no... other than passing the EFI parameters, which apparently is
*not* currently done (David Woodhouse is working on it.) Secure boot is
still a work in progress.

--
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel. I don't speak on their behalf.
Re: [Xen-devel] [PATCH v3 00/11] xen: Initial kexec/kdump implementation
Konrad Rzeszutek Wilk writes:

> On Thu, Jan 10, 2013 at 08:16:48PM -0800, Eric W. Biederman wrote:
>> The basic kexec interface is.
>>
>> load ranges of virtual addresses physical addresses.
>> jump to the physical address with identity mapped page tables.
>>
>> There are a few flags to allow for different usage scenarios like
>> kexec on panic vs normal kexec.
>
> And there is nothing fancy to be done for EFI and SecureBoot?

There is a mess with EFI. Reports are that EFI is a bug-ridden pile, and
people keep advocating that we make more and more EFI calls in the main
kernel.

There is an argument over set_virtual_mapping, which is a call that can be
made only once and which relocates the EFI code to a different address.
That makes life inconvenient for kexec. There is another argument that EFI
doesn't actually work if you don't make the set_virtual_mapping call, so we
can't remove it and always use physical addresses.

Frankly the only sane way to run a linux kernel under EFI is to scrape up
the information needed to talk to the hardware directly and ignore EFI.
That is what we have historically done in the face of BIOS madness and if
anything the situation is worse with EFI, but it looks like we are going to
have to learn that the hard way.

Recently there is a desire to figure out how to have /sbin/kexec support
signed kernel images. What will probably happen is to have a specially
trusted userspace application perform the verification. Sort of like
dom0 for the linux userspace. A few other ideas have been batted around
but none that have stuck.

None of that is really about SecureBoot. It is all about trusting the
kernel binary but not trusting userspace, with SecureBoot being an excuse
for coming up with a policy like that. It looks like the answer to
SecureBoot at this point may simply be to reconfigure your BIOS or root
Windows and EFI to get the hardware to do what you want.

So the answer looking forward for Xen dom0 is:

A trusted /sbin/kexec won't require changes.

The other suggested solution is a flag that says a specific chunk of the
loaded image is a signature that the magic trust fairies can verify. As
long as you have a flag bit free you should be able to implement that
policy if we ever implement it.

Eric
Re: [Xen-devel] [PATCH v3 00/11] xen: Initial kexec/kdump implementation
David Vrabel writes:

> On 11/01/13 13:22, Daniel Kiper wrote:
>> On Thu, Jan 10, 2013 at 02:19:55PM +, David Vrabel wrote:
>>> On 04/01/13 17:01, Daniel Kiper wrote:
>>>> My .5 cents:
>>>>   - We should focus on KEXEC_CMD_kexec_load and KEXEC_CMD_kexec_unload;
>>>>     probably we should introduce KEXEC_CMD_kexec_load2 and
>>>>     KEXEC_CMD_kexec_unload2; load should __LOAD__ kernel image and
>>>>     other things into hypervisor memory;
>>>
>>> Yes, but I don't see how we can easily support both ABIs. I'd be
>>> in favour of replacing the existing hypercalls and requiring updated
>>> kexec tools in dom0 (this isn't that different to requiring the correct
>>> libxc in dom0).
>>
>> Why? Just define new structures for new functions of kexec hypercall.
>> That should suffice.
>
> The current hypervisor ABI depends on an internal kernel ABI (i.e., the
> ABI provided by relocate_kernel). We do not want hypervisor internals
> to be constrained by having to be compatible with kernel internals.

I think this is violent agreement. A new call with new arguments seems
agreed upon. The only question seems to be what happens to the old
hypercall: either keep the current deprecated hypercall with the current
ABI and do not update it, or modify the current hypercall to return the
Xen equivalent of -ENOSYS.

Certainly /sbin/kexec will only support the new hypercall once the support
has merged.

>> No, please do not do that. When you call HYPERVISOR_kexec_op(KEXEC_CMD_kexec)
>> system is completely shutdown. Return from HYPERVISOR_kexec_op(KEXEC_CMD_kexec)
>> would require to restore some kernel functionalities. It may be impossible
>> in some cases. Additionally, it means that some changes should be made
>> in generic kexec code path. As I know kexec maintainers are very reluctant
>> to make such things.
>
> Huh? There only needs to be a call to a new hypervisor_crash_kexec()
> function (which would then call the Xen specific crash hypercall) at the
> very beginning of crash_kexec(). If this returns the normal
> crash/shutdown path is done (which could even include a guest kexec!).

Can you imagine what crash_kexec would look like if every architecture
hard coded its own little piece in there?

The practical issue with changing crash_kexec is that you are hard coding
Xen policy just before a jump to a piece of code whose purpose is to
implement policy.

From a maintenance and code comprehension standpoint it is much cleaner to
put the hypervisor_crash_kexec() hypercall into the code that is loaded
with sys_kexec_load and is branched to by crash_kexec. I would have no
problem with hard coding that behavior into /sbin/kexec in the case of Xen
dom0.

Having any code have different semantics when running under Xen is a
maintenance nightmare, and is why we are having this conversation years and
years after the initial deployment of Xen. A tiny hard coded stub that
calls a hypercall should work indefinitely with no one having to do
anything.

Eric
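The two options Eric names for the old hypercall (leave it working but frozen, or have it fail with the equivalent of -ENOSYS) differ only in what the dispatcher does for the old sub-ops. A minimal sketch, assuming hypothetical sub-op names and a hypothetical `retire_old_abi` switch; none of these identifiers or values are taken from Xen, and the "handlers" are reduced to a return code:

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical sub-op identifiers. They stand in for the existing
 * load/unload sub-ops and the proposed *2 variants; the real names and
 * numeric values are defined by Xen and are not reproduced here. */
enum sketch_kexec_cmd {
    SKETCH_CMD_LOAD_OLD,
    SKETCH_CMD_UNLOAD_OLD,
    SKETCH_CMD_LOAD2,
    SKETCH_CMD_UNLOAD2
};

/* Dispatch a sub-op. With retire_old_abi == 0 the old sub-ops keep
 * working unchanged (option 1); with retire_old_abi == 1 they report
 * -ENOSYS so that outdated tools fail cleanly (option 2). The new
 * sub-ops behave the same in both cases. Returns 0 on success. */
int sketch_kexec_op(enum sketch_kexec_cmd cmd, int retire_old_abi)
{
    assert(retire_old_abi == 0 || retire_old_abi == 1);

    switch (cmd) {
    case SKETCH_CMD_LOAD2:
    case SKETCH_CMD_UNLOAD2:
        return 0;                           /* new ABI: always handled */
    case SKETCH_CMD_LOAD_OLD:
    case SKETCH_CMD_UNLOAD_OLD:
        return retire_old_abi ? -ENOSYS : 0;
    }
    return -EINVAL;                         /* unknown sub-op */
}
```

Either way the new sub-ops are unaffected, which matches the point that /sbin/kexec would only need to know about the new call.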
Re: [Xen-devel] [PATCH v3 00/11] xen: Initial kexec/kdump implementation
On Fri, Jan 11, 2013 at 03:22:35PM +, David Vrabel wrote:
> On 11/01/13 13:22, Daniel Kiper wrote:
> > On Thu, Jan 10, 2013 at 02:19:55PM +, David Vrabel wrote:
> >> On 04/01/13 17:01, Daniel Kiper wrote:
> >>> My .5 cents:
> >>>   - We should focus on KEXEC_CMD_kexec_load and KEXEC_CMD_kexec_unload;
> >>>     probably we should introduce KEXEC_CMD_kexec_load2 and
> >>>     KEXEC_CMD_kexec_unload2;
> >>>     load should __LOAD__ kernel image and other things into hypervisor
> >>>     memory;
> >>
> >> Yes, but I don't see how we can easily support both ABIs. I'd be
> >> in favour of replacing the existing hypercalls and requiring updated
> >> kexec tools in dom0 (this isn't that different to requiring the correct
> >> libxc in dom0).
> >
> > Why? Just define new structures for new functions of kexec hypercall.
> > That should suffice.
>
> The current hypervisor ABI depends on an internal kernel ABI (i.e., the
> ABI provided by relocate_kernel). We do not want hypervisor internals
> to be constrained by having to be compatible with kernel internals.

I agree. I did not suggest staying with the current interface. The old
KEXEC_CMD_kexec_load and KEXEC_CMD_kexec_unload should stay as they are for
backward compatibility (maybe someday they should be removed). However, I
do not see any problem in adding new KEXEC_CMD_kexec_load2 and
KEXEC_CMD_kexec_unload2 functions with completely new arguments to the
existing kexec hypercall. Let's say something like this:

    struct kexec_segment {
        void *buf;
        size_t bufsz;
        unsigned long mem;
        size_t memsz;
    };

    struct xen_kexec_load2 {
        unsigned long entry;
        unsigned long nr_segments;
        struct kexec_segment *segments;
        unsigned long flags;
    };

    struct xen_kexec_load2 xkl2;
    ...
    rc = HYPERVISOR_kexec_op(KEXEC_CMD_kexec_load2, );

Regarding relocate_kernel(), it should be Xen hypervisor specific, but
probably most of the code will be similar to its Linux kernel version.
At the end it should only leave the machine in a state identical to the
state left by the Linux kernel version of relocate_kernel(), just to be
compatible with existing kexec/kdump implementations.

> >>> probably we should introduce KEXEC_CMD_kexec_load2 and KEXEC_CMD_k
>
> >>>   - Hmmm... Now I think that we should still use kexec syscall to load
> >>>     image into Xen memory (with new KEXEC_CMD_kexec_load2) because it
> >>>     establishes all things which are needed to call kdump if dom0
> >>>     crashes; however, I could be wrong...
> >>
> >> I don't think we need the kexec syscall. The kernel can unconditionally
> >> do the crash hypercall, which will return if the kdump kernel isn't
> >> loaded and the kernel can fall back to the regular non-kexec panic.
> >
> > No, please do not do that. When you call
> > HYPERVISOR_kexec_op(KEXEC_CMD_kexec)
> > system is completely shutdown. Return from
> > HYPERVISOR_kexec_op(KEXEC_CMD_kexec)
> > would require to restore some kernel functionalities. It may be impossible
> > in some cases. Additionally, it means that some changes should be made
> > in generic kexec code path. As I know kexec maintainers are very reluctant
> > to make such things.
>
> Huh? There only needs to be a call to a new hypervisor_crash_kexec()
> function (which would then call the Xen specific crash hypercall) at the
> very beginning of crash_kexec(). If this returns the normal
> crash/shutdown path is done (which could even include a guest kexec!).

I am still not convinced. However, go ahead with your vision in this case.
Later we will see whether it makes sense.

Daniel
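To make the proposed load2 arguments a little more concrete, here is a sketch of the kind of sanity check a receiver of `struct xen_kexec_load2` might run before copying anything. The two structures are copied from the proposal in the message above; everything prefixed `sketch_` is hypothetical, and the specific rules (a cap of 16 segments, 4 KiB pages, buffer no larger than its destination) are assumptions loosely borrowed from what the Linux kexec syscall checks, not part of the proposal.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Structures as proposed in the message above. */
struct kexec_segment {
    void *buf;
    size_t bufsz;
    unsigned long mem;
    size_t memsz;
};

struct xen_kexec_load2 {
    unsigned long entry;
    unsigned long nr_segments;
    struct kexec_segment *segments;
    unsigned long flags;
};

#define SKETCH_SEGMENT_MAX 16UL    /* assumed cap on segment count */
#define SKETCH_PAGE_SIZE   4096UL  /* assumed page size */

/* Return 0 if the request looks sane, -EINVAL otherwise. */
int sketch_validate_load2(const struct xen_kexec_load2 *l)
{
    /* The alignment test below relies on a power-of-two page size. */
    assert((SKETCH_PAGE_SIZE & (SKETCH_PAGE_SIZE - 1)) == 0);

    if (l == NULL || l->segments == NULL)
        return -EINVAL;
    if (l->nr_segments == 0 || l->nr_segments > SKETCH_SEGMENT_MAX)
        return -EINVAL;

    for (unsigned long i = 0; i < l->nr_segments; i++) {
        const struct kexec_segment *s = &l->segments[i];

        if (s->bufsz > s->memsz)              /* source must fit */
            return -EINVAL;
        if (s->mem & (SKETCH_PAGE_SIZE - 1))  /* page-aligned target */
            return -EINVAL;
    }
    return 0;
}
```

Keeping the argument layout this close to the syscall's is what makes such checks easy to share, which is the point Daniel makes about reusing the Linux implementation.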
Re: [Xen-devel] [PATCH v3 00/11] xen: Initial kexec/kdump implementation
On Thu, Jan 10, 2013 at 08:16:48PM -0800, Eric W. Biederman wrote:
> Konrad Rzeszutek Wilk writes:
>
> > On Mon, Jan 07, 2013 at 01:34:04PM +0100, Daniel Kiper wrote:
> >> I think that new kexec hypercall function should mimic kexec syscall.
> >> It means that all arguments passed to hypercall should have same types
> >> if it is possible or if it is not possible then conversion should be done
> >> in very easy way. Additionally, I think that one call of new hypercall
> >> load function should load all needed things in right place and
> >> return relevant status. Last but not least, new functionality should
> >
> > We are not restricted to just _one_ hypercall. And this loading
> > thing could be similar to the microcode hypercall - which just points
> > to a virtual address along with the length - and says 'load me'.
> >
> >> be available through /dev/xen/privcmd or directly from kernel without
> >> bigger effort.
> >
> > Perhaps we should have an email thread on xen-devel where we hash out
> > some ideas. Eric, would you be OK included on this - it would make
> > sense for this mechanism to be as future-proof as possible - and I am not
> > sure what your plans for kexec are in the future?
>
> The basic kexec interface is.
>
> load ranges of virtual addresses physical addresses.
> jump to the physical address with identity mapped page tables.
>
> There are a few flags to allow for different usage scenarios like
> kexec on panic vs normal kexec.

And there is nothing fancy to be done for EFI and SecureBoot? Or is
that something that the kernel has to handle on its own (so somehow
passing some certificates to somewhere).

>
> It is very very simple and very extensible. All of the weird glue
> happens in userspace.
>
> Eric
Re: [Xen-devel] [PATCH v3 00/11] xen: Initial kexec/kdump implementation
On 11/01/13 13:22, Daniel Kiper wrote:
> On Thu, Jan 10, 2013 at 02:19:55PM +, David Vrabel wrote:
>> On 04/01/13 17:01, Daniel Kiper wrote:
>>> My .5 cents:
>>>   - We should focus on KEXEC_CMD_kexec_load and KEXEC_CMD_kexec_unload;
>>>     probably we should introduce KEXEC_CMD_kexec_load2 and
>>>     KEXEC_CMD_kexec_unload2;
>>>     load should __LOAD__ kernel image and other things into hypervisor
>>>     memory;
>>
>> Yes, but I don't see how we can easily support both ABIs. I'd be
>> in favour of replacing the existing hypercalls and requiring updated
>> kexec tools in dom0 (this isn't that different to requiring the correct
>> libxc in dom0).
>
> Why? Just define new structures for new functions of kexec hypercall.
> That should suffice.

The current hypervisor ABI depends on an internal kernel ABI (i.e., the
ABI provided by relocate_kernel). We do not want hypervisor internals
to be constrained by having to be compatible with kernel internals.

>>>   - Hmmm... Now I think that we should still use kexec syscall to load image
>>>     into Xen memory (with new KEXEC_CMD_kexec_load2) because it establishes
>>>     all things which are needed to call kdump if dom0 crashes; however,
>>>     I could be wrong...
>>
>> I don't think we need the kexec syscall. The kernel can unconditionally
>> do the crash hypercall, which will return if the kdump kernel isn't
>> loaded and the kernel can fall back to the regular non-kexec panic.
>
> No, please do not do that. When you call HYPERVISOR_kexec_op(KEXEC_CMD_kexec)
> system is completely shutdown. Return from HYPERVISOR_kexec_op(KEXEC_CMD_kexec)
> would require to restore some kernel functionalities. It may be impossible
> in some cases. Additionally, it means that some changes should be made
> in generic kexec code path. As I know kexec maintainers are very reluctant
> to make such things.

Huh? There only needs to be a call to a new hypervisor_crash_kexec()
function (which would then call the Xen specific crash hypercall) at the
very beginning of crash_kexec(). If this returns, the normal
crash/shutdown path is done (which could even include a guest kexec!).

David
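The control flow David describes (try the Xen crash hypercall first, fall through to the normal path if it returns) can be modelled in a few lines. This is only an illustration of the ordering, not kernel code: all `sketch_` names are hypothetical, the hypercall is replaced by a function pointer, and because a successful crash hypercall would never return in reality, the stub signals "Xen took over" with a nonzero return value instead.

```c
#include <assert.h>
#include <stddef.h>

/* Which path the crash handler ends up on. */
enum sketch_crash_path {
    SKETCH_XEN_KDUMP,     /* hypervisor took over; the real call never returns */
    SKETCH_NORMAL_CRASH   /* hypercall returned: regular crash/shutdown path */
};

/* Hypothetical stand-in for the Xen crash hypercall. Nonzero means
 * "a kdump image was loaded and Xen would have jumped to it". */
typedef int (*sketch_crash_hypercall_fn)(void);

/* Model of crash_kexec() with the hypercall tried at the very beginning,
 * as suggested in the message above. */
enum sketch_crash_path sketch_crash_kexec(sketch_crash_hypercall_fn hypercall)
{
    assert(hypercall != NULL);

    if (hypercall())
        return SKETCH_XEN_KDUMP;

    /* The hypercall returned, so no Xen crash image was loaded: continue
     * with the usual path, which could itself be a guest-level kexec. */
    return SKETCH_NORMAL_CRASH;
}

/* Two stub hypercalls for exercising both outcomes. */
int sketch_no_image_loaded(void) { return 0; }
int sketch_image_loaded(void)    { return 1; }
```

The objection elsewhere in the thread is about where this call should live (generic crash_kexec() versus the code loaded by /sbin/kexec), not about the ordering itself.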
Re: [Xen-devel] [PATCH v3 00/11] xen: Initial kexec/kdump implementation
On Mon, Jan 07, 2013 at 01:49:44PM +, Ian Campbell wrote:
> On Mon, 2013-01-07 at 12:34 +, Daniel Kiper wrote:
> > I think that new kexec hypercall function should mimic kexec syscall.
>
> We want to have an interface that can be used by non-Linux domains (both
> dom0 and domU) as well though, so please bear this in mind.

I agree, but all arguments passed to the kexec syscall are quite generic
and they do not impose any limitations. Just look into
include/linux/kexec.h. That is why I think that a lot of things could be
taken from the Linux kexec implementation.

Daniel
Re: [Xen-devel] [PATCH v3 00/11] xen: Initial kexec/kdump implementation
On Thu, Jan 10, 2013 at 02:19:55PM +, David Vrabel wrote:
> On 04/01/13 17:01, Daniel Kiper wrote:
> > On Fri, Jan 04, 2013 at 02:38:44PM +, David Vrabel wrote:
> >> On 04/01/13 14:22, Daniel Kiper wrote:
> >>> On Wed, Jan 02, 2013 at 11:26:43AM +, Andrew Cooper wrote:
> >>>> On 27/12/12 18:02, Eric W. Biederman wrote:
> >>>>> Andrew Cooper writes:
> >>>>>
> >>>>>> On 27/12/2012 07:53, Eric W. Biederman wrote:
> >>>>>>> The syscall ABI still has the wrong semantics.
> >>>>>>>
> >>>>>>> Aka totally unmaintainable and unmergeable.
> >>>>>>>
> >>>>>>> The concept of domU support is also strange. What does domU support
> >>>>>>> even mean, when the dom0 support is loading a kernel to pick up Xen
> >>>>>>> when Xen falls over.
> >>>>>> There are two requirements pulling at this patch series, but I agree
> >>>>>> that we need to clarify them.
> >>>>> It probably makes sense to split them apart a little even.
> >>>>
> >>>> Thinking about this split, there might be a way to simplify it even more.
> >>>>
> >>>> /sbin/kexec can load the "Xen" crash kernel itself by issuing
> >>>> hypercalls using /dev/xen/privcmd. This would remove the need for
> >>>> the dom0 kernel to distinguish between loading a crash kernel for
> >>>> itself and loading a kernel for Xen.
> >>>>
> >>>> Or is this just a silly idea complicating the matter?
> >>>
> >>> This is impossible with current Xen kexec/kdump interface.
> >>> It should be changed to do that. However, I suppose that
> >>> Xen community would not be interested in such changes.
> >>
> >> I don't see why the hypercall ABI cannot be extended with new sub-ops
> >> that do the right thing -- the existing ABI is a bit weird.
> >>
> >> I plan to start prototyping something shortly (hopefully next week) for
> >> the Xen kexec case.
> >
> > Wow... As I can see, this time the Xen community is interested...
> > That is great. I agree that current kexec interface is not ideal.
>
> I spent some more time looking at the existing interface and
> implementation and it really is broken.
>
> > David, I am happy to help in that process. However, if you wish I could
> > carry it myself. Anyway, it looks that I should hold on with my
> > Linux kexec/kdump patches.
>
> I should be able to post some prototype patches for Xen in a few weeks.
> No guarantees though.

That is great. If you need any help drop me a line.

> > My .5 cents:
> >   - We should focus on KEXEC_CMD_kexec_load and KEXEC_CMD_kexec_unload;
> >     probably we should introduce KEXEC_CMD_kexec_load2 and
> >     KEXEC_CMD_kexec_unload2;
> >     load should __LOAD__ kernel image and other things into hypervisor
> >     memory;
>
> Yes, but I don't see how we can easily support both ABIs. I'd be
> in favour of replacing the existing hypercalls and requiring updated
> kexec tools in dom0 (this isn't that different to requiring the correct
> libxc in dom0).

Why? Just define new structures for new functions of kexec hypercall.
That should suffice.

> >     I suppose that almost all things could be copied from
> >     linux/kernel/kexec.c,
> >     linux/arch/x86/kernel/{machine_kexec_$(BITS).c,relocate_kernel_$(BITS).c};
> >     I think that KEXEC_CMD_kexec should stay as is,
>
> I don't think we want all the junk from Linux inside Xen -- we only want
> to support the kdump case and do not have to handle returning from the
> kexec image.

I do not want to implement kexec jump or anything like that. However, I
think it is worth reusing the code that can be reused. As far as I know,
a lot of code in Xen was taken from the Linux kernel with smaller or
bigger changes. Why would we want to reinvent the wheel this time?

Additionally, we should not drop kexec support. It is the main part of
kdump. In the kdump case the new kernel (and other stuff) is placed in
preallocated space, in contrast to kexec. That's all. kexec is useful if
you would like to quickly (skipping BIOS) switch from Xen to baremetal
Linux. If you drop kexec support from Xen then you need to alter the
kexec-tools package in a bunch of distros to take the new Xen behavior
into account. I do not think that is what we want to do.

> >   - Hmmm... Now I think that we should still use kexec syscall to load image
> >     into Xen memory (with new KEXEC_CMD_kexec_load2) because it establishes
> >     all things which are needed to call kdump if dom0 crashes; however,
> >     I could be wrong...
>
> I don't think we need the kexec syscall. The kernel can unconditionally
> do the crash hypercall, which will return if the kdump kernel isn't
> loaded and the kernel can fall back to the regular non-kexec panic.

No, please do not do that. When you call HYPERVISOR_kexec_op(KEXEC_CMD_kexec)
the system is completely shut down. A return from
HYPERVISOR_kexec_op(KEXEC_CMD_kexec) would require restoring some kernel
functionality. It may be impossible in some cases. Additionally, it means
that some changes would have to be made in the generic kexec code path. As
far as I know, the kexec maintainers are very reluctant to make such things.

> This will allow the kexec syscall to be used only for the domU kexec case.
>
> >   - last but not least, we should think about support for PV guests too.
>
> I won't be looking at this.

OK.

> To avoid confusion about the two largely
Re: [Xen-devel] [PATCH v3 00/11] xen: Initial kexec/kdump implementation
On Mon, Jan 07, 2013 at 01:49:44PM +, Ian Campbell wrote:
> On Mon, 2013-01-07 at 12:34 +, Daniel Kiper wrote:
>> I think that the new kexec hypercall function should mimic the kexec syscall.
>
> We want to have an interface that can be used by non-Linux domains (both dom0 and domU) as well though, so please bear this in mind.

I agree, but all arguments passed to the kexec syscall are quite generic and they do not impose any limitations. Just look into include/linux/kexec.h. That is why I think that a lot of things could be taken from the Linux kexec implementation.

Daniel
Re: [Xen-devel] [PATCH v3 00/11] xen: Initial kexec/kdump implementation
On 11/01/13 13:22, Daniel Kiper wrote:
> On Thu, Jan 10, 2013 at 02:19:55PM +, David Vrabel wrote:
>> On 04/01/13 17:01, Daniel Kiper wrote:
>>> My .5 cents:
>>>   - We should focus on KEXEC_CMD_kexec_load and KEXEC_CMD_kexec_unload; probably we should introduce KEXEC_CMD_kexec_load2 and KEXEC_CMD_kexec_unload2; load should __LOAD__ kernel image and other things into hypervisor memory;
>>
>> Yes, but I don't see how we can easily support both ABIs. I'd be in favour of replacing the existing hypercalls and requiring updated kexec tools in dom0 (this isn't that different to requiring the correct libxc in dom0).
>
> Why? Just define new structures for the new functions of the kexec hypercall. That should suffice.

The current hypervisor ABI depends on an internal kernel ABI (i.e., the ABI provided by relocate_kernel). We do not want hypervisor internals to be constrained by having to be compatible with kernel internals.

>>>   - Hmmm... Now I think that we should still use kexec syscall to load image into Xen memory (with new KEXEC_CMD_kexec_load2) because it establishes all things which are needed to call kdump if dom0 crashes; however, I could be wrong...
>>
>> I don't think we need the kexec syscall. The kernel can unconditionally do the crash hypercall, which will return if the kdump kernel isn't loaded and the kernel can fall back to the regular non-kexec panic.
>
> No, please do not do that. When you call HYPERVISOR_kexec_op(KEXEC_CMD_kexec) the system is completely shut down. Returning from HYPERVISOR_kexec_op(KEXEC_CMD_kexec) would require restoring some kernel functionality. It may be impossible in some cases. Additionally, it means that some changes would have to be made in the generic kexec code path. As far as I know, kexec maintainers are very reluctant to make such changes.

Huh? There only needs to be a call to a new hypervisor_crash_kexec() function (which would then call the Xen-specific crash hypercall) at the very beginning of crash_kexec(). If this returns, the normal crash/shutdown path is done (which could even include a guest kexec!).

David
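The control flow David describes can be modelled in a few lines of plain C. This is only a sketch under stated assumptions: hypervisor_crash_kexec() is the name proposed in the message, but its body here is a stand-in that never issues a real hypercall, and the two "image loaded" variables and the outcome enum are hypothetical names invented for the illustration.

```c
#include <assert.h>   /* for callers that check the outcome */
#include <stdbool.h>

/* Hypothetical model state: is a crash image loaded into Xen / into the guest? */
static bool xen_crash_image_loaded;
static bool guest_crash_image_loaded;

enum crash_outcome {
    CRASH_XEN_KEXEC,    /* Xen starts its crash kernel; on real hardware this never returns */
    CRASH_GUEST_KEXEC,  /* the guest-level kexec image takes over */
    CRASH_PANIC         /* regular non-kexec panic path */
};

/* Stand-in for the Xen-specific crash hypercall wrapper. It returns true when
 * Xen "takes over", which in this model simply means an image was loaded. */
static bool hypervisor_crash_kexec(void)
{
    return xen_crash_image_loaded;
}

/* Model of crash_kexec() with the hypervisor call placed at the very beginning. */
static enum crash_outcome crash_kexec_sketch(void)
{
    if (hypervisor_crash_kexec())
        return CRASH_XEN_KEXEC;
    if (guest_crash_image_loaded)
        return CRASH_GUEST_KEXEC;
    return CRASH_PANIC;
}
```

The point of the sketch is the ordering: the hypervisor gets the first chance, and everything after it is the unchanged crash path. Whether the kernel can still continue safely after the hypercall returns is exactly what Daniel questions.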
Re: [Xen-devel] [PATCH v3 00/11] xen: Initial kexec/kdump implementation
On Thu, Jan 10, 2013 at 08:16:48PM -0800, Eric W. Biederman wrote:
> Konrad Rzeszutek Wilk konrad.w...@oracle.com writes:
>> On Mon, Jan 07, 2013 at 01:34:04PM +0100, Daniel Kiper wrote:
>>> I think that the new kexec hypercall function should mimic the kexec syscall. It means that all arguments passed to the hypercall should have the same types if possible, or if that is not possible then the conversion should be very easy. Additionally, I think that one call of the new hypercall load function should load all needed things in the right place and return the relevant status. Last but not least, new functionality should
>>
>> We are not restricted to just _one_ hypercall. And this loading thing could be similar to the microcode hypercall - which just points to a virtual address along with the length - and says 'load me'.
>>
>>> be available through /dev/xen/privcmd or directly from the kernel without bigger effort.
>>
>> Perhaps we should have an email thread on xen-devel where we hash out some ideas. Eric, would you be OK being included on this - it would make sense for this mechanism to be as future-proof as possible - and I am not sure what your plans for kexec are in the future?
>
> The basic kexec interface is: load ranges of virtual addresses at physical addresses, then jump to the physical address with identity-mapped page tables.
>
> There are a few flags to allow for different usage scenarios like kexec on panic vs normal kexec.

And there is nothing fancy to be done for EFI and SecureBoot? Or is that something that the kernel has to handle on its own (so somehow passing some certificates to somewhere).

> It is very very simple and very extensible. All of the weird glue happens in userspace.
>
> Eric
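Konrad's remark that loading need not be a single hypercall, and that each call could just pass an address and a length, can be illustrated with a toy model. Everything below is an assumption made for the sketch: the chunk size, the flag names, the struct layout and toy_load_hypercall() are hypothetical, and the "hypervisor side" is just a static buffer in the same program.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define CHUNK_SIZE   4096u    /* assumed size of one upload chunk */
#define CHUNK_DATA   0x1u     /* hypothetical flag: chunk carries image data */
#define CHUNK_EOF    0x2u     /* hypothetical flag: last chunk of the image */
#define REGION_SIZE  16384u   /* toy stand-in for the reserved crash region */

struct load_chunk {
    const unsigned char *addr;  /* where the chunk lives in the caller */
    size_t len;                 /* chunk length in bytes */
    unsigned int flags;         /* CHUNK_* bits */
};

/* Toy "hypervisor" state. */
static unsigned char crash_region[REGION_SIZE];
static size_t crash_used;
static int crash_complete;

/* Toy stand-in for one load hypercall: append the chunk to the crash region. */
static int toy_load_hypercall(const struct load_chunk *c)
{
    assert(crash_used <= REGION_SIZE);
    if (crash_complete || c->len > REGION_SIZE - crash_used)
        return -1;                       /* already finalised, or no room left */
    if (c->len > 0)
        memcpy(crash_region + crash_used, c->addr, c->len);
    crash_used += c->len;
    if (c->flags & CHUNK_EOF)
        crash_complete = 1;
    return 0;
}

/* Caller side: iterate over the image and issue one call per chunk. */
static int upload_image(const unsigned char *img, size_t len)
{
    size_t off = 0;
    do {
        size_t n = len - off < CHUNK_SIZE ? len - off : CHUNK_SIZE;
        struct load_chunk c = { img + off, n, CHUNK_DATA };
        if (off + n == len)
            c.flags |= CHUNK_EOF;        /* mark the final chunk */
        if (toy_load_hypercall(&c) != 0)
            return -1;
        off += n;
    } while (off < len);
    return 0;
}
```

A real interface would also have to report how much reserved space is available and let the caller unload or restart an upload; the sketch leaves both out.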
Re: [Xen-devel] [PATCH v3 00/11] xen: Initial kexec/kdump implementation
On Fri, Jan 11, 2013 at 03:22:35PM +, David Vrabel wrote:
> On 11/01/13 13:22, Daniel Kiper wrote:
>> On Thu, Jan 10, 2013 at 02:19:55PM +, David Vrabel wrote:
>>> On 04/01/13 17:01, Daniel Kiper wrote:
>>>> My .5 cents:
>>>>   - We should focus on KEXEC_CMD_kexec_load and KEXEC_CMD_kexec_unload; probably we should introduce KEXEC_CMD_kexec_load2 and KEXEC_CMD_kexec_unload2; load should __LOAD__ kernel image and other things into hypervisor memory;
>>>
>>> Yes, but I don't see how we can easily support both ABIs. I'd be in favour of replacing the existing hypercalls and requiring updated kexec tools in dom0 (this isn't that different to requiring the correct libxc in dom0).
>>
>> Why? Just define new structures for the new functions of the kexec hypercall. That should suffice.
>
> The current hypervisor ABI depends on an internal kernel ABI (i.e., the ABI provided by relocate_kernel). We do not want hypervisor internals to be constrained by having to be compatible with kernel internals.

I agree. I did not suggest staying with the current interface. The old KEXEC_CMD_kexec_load and KEXEC_CMD_kexec_unload should stay as is for backward compatibility (maybe someday they should be removed). However, I do not see any problem in adding new KEXEC_CMD_kexec_load2 and KEXEC_CMD_kexec_unload2 functions with completely new arguments to the existing kexec hypercall. Let's say something like this:

    struct kexec_segment {
        void *buf;
        size_t bufsz;
        unsigned long mem;
        size_t memsz;
    };

    struct xen_kexec_load2 {
        unsigned long entry;
        unsigned long nr_segments;
        struct kexec_segment *segments;
        unsigned long flags;
    };

    struct xen_kexec_load2 xkl2;

    ...

    rc = HYPERVISOR_kexec_op(KEXEC_CMD_kexec_load2, &xkl2);

Regarding relocate_kernel(), it should be Xen hypervisor specific, but probably most of the code will be similar to its Linux kernel version. At the end it only has to leave the machine in a state identical to the state left by the Linux kernel version of relocate_kernel(), just to be compatible with existing kexec/kdump implementations.

>>>> probably we should introduce KEXEC_CMD_kexec_load2 and KEXEC_CMD_k
>>>>   - Hmmm... Now I think that we should still use kexec syscall to load image into Xen memory (with new KEXEC_CMD_kexec_load2) because it establishes all things which are needed to call kdump if dom0 crashes; however, I could be wrong...
>>>
>>> I don't think we need the kexec syscall. The kernel can unconditionally do the crash hypercall, which will return if the kdump kernel isn't loaded and the kernel can fall back to the regular non-kexec panic.
>>
>> No, please do not do that. When you call HYPERVISOR_kexec_op(KEXEC_CMD_kexec) the system is completely shut down. Returning from HYPERVISOR_kexec_op(KEXEC_CMD_kexec) would require restoring some kernel functionality. It may be impossible in some cases. Additionally, it means that some changes would have to be made in the generic kexec code path. As far as I know, kexec maintainers are very reluctant to make such changes.
>
> Huh? There only needs to be a call to a new hypervisor_crash_kexec() function (which would then call the Xen-specific crash hypercall) at the very beginning of crash_kexec(). If this returns, the normal crash/shutdown path is done (which could even include a guest kexec!).

I am still not convinced. However, go ahead with your vision in this case. Later we will see whether it makes sense.

Daniel
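To make the proposed argument layout more concrete, here is a sketch of the kind of sanity checking a receiver of struct xen_kexec_load2 might do before touching any memory. The two structures are copied from the proposal above; the checker itself, its name, the page size, the segment limit and the specific rules (page alignment, no overlap, entry point inside a segment) are assumptions made for illustration and are not taken from Xen or from any posted patch.

```c
#include <assert.h>   /* for callers that check the result */
#include <errno.h>
#include <stddef.h>

#define SKETCH_PAGE_SIZE     4096UL  /* assumed page size */
#define SKETCH_MAX_SEGMENTS  16UL    /* assumed upper bound on segments */

struct kexec_segment {
    void *buf;          /* source buffer in the caller */
    size_t bufsz;       /* bytes to copy from buf */
    unsigned long mem;  /* destination physical address */
    size_t memsz;       /* size of the destination range */
};

struct xen_kexec_load2 {
    unsigned long entry;
    unsigned long nr_segments;
    struct kexec_segment *segments;
    unsigned long flags;
};

/* Returns 0 if the request looks sane, -EINVAL otherwise. */
static int xen_kexec_load2_check(const struct xen_kexec_load2 *l)
{
    int entry_ok = 0;

    if (l == NULL || l->segments == NULL ||
        l->nr_segments == 0 || l->nr_segments > SKETCH_MAX_SEGMENTS)
        return -EINVAL;

    for (unsigned long i = 0; i < l->nr_segments; i++) {
        const struct kexec_segment *s = &l->segments[i];

        if (s->memsz == 0 || s->bufsz > s->memsz)
            return -EINVAL;              /* source must fit the destination */
        if (s->mem % SKETCH_PAGE_SIZE || s->memsz % SKETCH_PAGE_SIZE)
            return -EINVAL;              /* destination must be page aligned */
        if (s->mem + s->memsz < s->mem)
            return -EINVAL;              /* address range wraps around */

        for (unsigned long j = 0; j < i; j++) {
            const struct kexec_segment *t = &l->segments[j];
            if (s->mem < t->mem + t->memsz && t->mem < s->mem + s->memsz)
                return -EINVAL;          /* destinations overlap */
        }
        if (l->entry >= s->mem && l->entry < s->mem + s->memsz)
            entry_ok = 1;                /* entry point lies in a loaded range */
    }
    return entry_ok ? 0 : -EINVAL;
}
```

Because the structures carry only addresses, sizes and flags, nothing in this check depends on Linux internals, which is the property both sides of the discussion seem to want from a new sub-op.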
Re: [Xen-devel] [PATCH v3 00/11] xen: Initial kexec/kdump implementation
David Vrabel david.vra...@citrix.com writes:
> On 11/01/13 13:22, Daniel Kiper wrote:
>> On Thu, Jan 10, 2013 at 02:19:55PM +, David Vrabel wrote:
>>> On 04/01/13 17:01, Daniel Kiper wrote:
>>>> My .5 cents:
>>>>   - We should focus on KEXEC_CMD_kexec_load and KEXEC_CMD_kexec_unload; probably we should introduce KEXEC_CMD_kexec_load2 and KEXEC_CMD_kexec_unload2; load should __LOAD__ kernel image and other things into hypervisor memory;
>>>
>>> Yes, but I don't see how we can easily support both ABIs. I'd be in favour of replacing the existing hypercalls and requiring updated kexec tools in dom0 (this isn't that different to requiring the correct libxc in dom0).
>>
>> Why? Just define new structures for the new functions of the kexec hypercall. That should suffice.
>
> The current hypervisor ABI depends on an internal kernel ABI (i.e., the ABI provided by relocate_kernel). We do not want hypervisor internals to be constrained by having to be compatible with kernel internals.

I think this is violent agreement. A new call with new arguments seems agreed upon. The only question seems to be what happens to the old hypercall: either keep the current deprecated hypercall with the current ABI and do not update it, or modify the current hypercall to return the Xen equivalent of -ENOSYS.

Certainly /sbin/kexec will only support the new hypercall once the support has merged.

>> No, please do not do that. When you call HYPERVISOR_kexec_op(KEXEC_CMD_kexec) the system is completely shut down. Returning from HYPERVISOR_kexec_op(KEXEC_CMD_kexec) would require restoring some kernel functionality. It may be impossible in some cases. Additionally, it means that some changes would have to be made in the generic kexec code path. As far as I know, kexec maintainers are very reluctant to make such changes.
>
> Huh? There only needs to be a call to a new hypervisor_crash_kexec() function (which would then call the Xen-specific crash hypercall) at the very beginning of crash_kexec(). If this returns, the normal crash/shutdown path is done (which could even include a guest kexec!).

Can you imagine what crash_kexec would look like if every architecture hard-coded its own little piece in there?

The practical issue with changing crash_kexec is that you are hard-coding Xen policy just before a jump to a piece of code whose purpose is to implement policy.

From a maintenance and code comprehension standpoint it is much cleaner to put the hypervisor_crash_kexec() hypercall into the code that is loaded with sys_kexec_load and is branched to by crash_kexec. I would have no problem with hard-coding that behavior into /sbin/kexec in the case of Xen dom0.

Having any code have different semantics when running under Xen is a maintenance nightmare, and it is why we are having this conversation years and years after the initial deployment of Xen. A tiny hard-coded stub that calls a hypercall should work indefinitely with no one having to do anything.

Eric
Re: [Xen-devel] [PATCH v3 00/11] xen: Initial kexec/kdump implementation
Konrad Rzeszutek Wilk konrad.w...@oracle.com writes:
> On Thu, Jan 10, 2013 at 08:16:48PM -0800, Eric W. Biederman wrote:
>> The basic kexec interface is: load ranges of virtual addresses at physical addresses, then jump to the physical address with identity-mapped page tables.
>>
>> There are a few flags to allow for different usage scenarios like kexec on panic vs normal kexec.
>
> And there is nothing fancy to be done for EFI and SecureBoot?

There is a mess with EFI. Reports are that EFI is a bug-ridden pile, and people keep advocating that we make more and more EFI calls in the main kernel.

There is an argument over set_virtual_mapping, which is a call that can be made only once and which relocates the EFI code to a different address, which makes life inconvenient for kexec. There is another argument that EFI doesn't actually work if you don't make the set_virtual_mapping call, so we can't remove it and always use physical addresses.

Frankly, the only sane way to run a Linux kernel under EFI is to scrape up the information needed to talk to the hardware directly and ignore EFI. That is what we have historically done in the face of BIOS madness, and if anything the situation is worse with EFI, but it looks like we are going to have to learn that the hard way.

Recently there is a desire to figure out how to make /sbin/kexec support signed kernel images. What will probably happen is that a specially trusted userspace application performs the verification. Sort of like dom0 for the Linux userspace. A few other ideas have been batted around, but none have stuck.

None of that is really about SecureBoot. It is all about trusting the kernel binary but not trusting userspace, with SecureBoot being an excuse for coming up with a policy like that. It looks like the answer to SecureBoot at this point may simply be to reconfigure your BIOS, or root Windows and EFI, to get the hardware to do what you want.

So the answer looking forward for Xen dom0 is: a trusted /sbin/kexec won't require changes. The other suggested solution is a flag that says a specific chunk of the loaded image is a signature that the magic trust fairies can verify. As long as you have a flag bit free, you should be able to implement that policy if we ever implement it.

Eric
Re: [Xen-devel] [PATCH v3 00/11] xen: Initial kexec/kdump implementation
> And there is nothing fancy to be done for EFI and SecureBoot? Or is that something that the kernel has to handle on its own (so somehow passing some certificates to somewhere).

For EFI, no... other than passing the EFI parameters, which apparently is *not* currently done (David Woodhouse is working on it.)

Secure boot is still a work in progress.

--
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel. I don't speak on their behalf.
Re: [Xen-devel] [PATCH v3 00/11] xen: Initial kexec/kdump implementation
On Fri, Jan 11, 2013 at 12:26:48PM -0800, H. Peter Anvin wrote:
>> And there is nothing fancy to be done for EFI and SecureBoot? Or is that something that the kernel has to handle on its own (so somehow passing some certificates to somewhere).
>
> For EFI, no... other than passing the EFI parameters, which apparently is *not* currently done (David Woodhouse is working on it.)
>
> Secure boot is still a work in progress.

For secure boot, as a first step in that direction, I just wrote some code to sign an ELF executable and to verify it in the kernel upon exec(). I am planning to post RFC code soon (most likely next week).

Hopefully we will be able to sign a statically linked /sbin/kexec and give it an extra capability (upon signature verification) so that it is able to call sys_exec().

Thanks
Vivek
Re: [Xen-devel] [PATCH v3 00/11] xen: Initial kexec/kdump implementation
On Fri, Jan 11, 2013 at 12:26:56PM -0800, Eric W. Biederman wrote:

[..]
> Recently there is a desire to figure out how to make /sbin/kexec support signed kernel images. What will probably happen is that a specially trusted userspace application performs the verification. Sort of like dom0 for the Linux userspace. A few other ideas have been batted around, but none have stuck.

[ CC David Howells ]

Eric,

In a private conversation, David Howells suggested passing the kernel signature to the kernel in a segment, so that the kernel can do the verification.

The /sbin/kexec signature is verified by the kernel at exec() time. Then /sbin/kexec just passes one signature segment (after the regular segment) for each segment being loaded. For segments which don't have a signature, the signature segment is passed with size 0. The signature-passing behavior can be controlled by one new kexec flag.

That way /sbin/kexec does not have to worry about doing any verification by itself. In fact, I am not sure how it could do the verification when the crypto libraries it would need are not signed (assuming they are not statically linked in).

What do you think about this idea?

Thanks
Vivek
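The segment layout suggested here (one signature segment after each regular segment, size 0 when there is no signature, plus one flag) can be sketched as follows. This only illustrates the pairing rule; the simplified segment type, the payload type, the function name and the flag value are hypothetical and do not correspond to an existing kexec interface.

```c
#include <assert.h>
#include <stddef.h>

#define KEXEC_SKETCH_SIG_SEGMENTS 0x1UL  /* hypothetical "signatures follow" flag */

/* Simplified segment: only the source buffer matters for this illustration. */
struct seg {
    const void *buf;
    size_t bufsz;
};

/* One image component with an optional detached signature. */
struct payload {
    const void *data;
    size_t len;
    const void *sig;     /* NULL when the component is unsigned */
    size_t sig_len;
};

/* Fill out[] so that out[2*i] is the data segment and out[2*i + 1] is its
 * signature segment (size 0 if unsigned). Returns the number of segments
 * written, or 0 if out[] is too small. */
static size_t build_signed_segments(const struct payload *in, size_t n,
                                    struct seg *out, size_t out_cap,
                                    unsigned long *flags)
{
    assert(in != NULL && out != NULL && flags != NULL);
    if (n > out_cap / 2)
        return 0;

    for (size_t i = 0; i < n; i++) {
        int has_sig = in[i].sig != NULL && in[i].sig_len > 0;

        out[2 * i].buf = in[i].data;
        out[2 * i].bufsz = in[i].len;
        out[2 * i + 1].buf = has_sig ? in[i].sig : NULL;
        out[2 * i + 1].bufsz = has_sig ? in[i].sig_len : 0;
    }
    *flags |= KEXEC_SKETCH_SIG_SEGMENTS;  /* tell the receiver to expect pairs */
    return 2 * n;
}
```

With this layout the receiver can find the signature for segment i at index 2*i + 1 without any extra table, and a zero size marks an unsigned segment.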
Re: [Xen-devel] [PATCH v3 00/11] xen: Initial kexec/kdump implementation
On 01/11/2013 12:52 PM, Vivek Goyal wrote:
> Eric,
>
> In a private conversation, David Howells suggested passing the kernel signature to the kernel in a segment, so that the kernel can do the verification.
>
> The /sbin/kexec signature is verified by the kernel at exec() time. Then /sbin/kexec just passes one signature segment (after the regular segment) for each segment being loaded. For segments which don't have a signature, the signature segment is passed with size 0. The signature-passing behavior can be controlled by one new kexec flag.
>
> That way /sbin/kexec does not have to worry about doing any verification by itself. In fact, I am not sure how it could do the verification when the crypto libraries it would need are not signed (assuming they are not statically linked in).
>
> What do you think about this idea?

A signed /sbin/kexec would realistically have to be statically linked, at least in the short term; otherwise the libraries and ld.so would need verification as well.

Now, that *might* very well have some real value -- there are certainly users out there who would very much want only binaries signed with specific keys to get run on their system.

-hpa
Re: [Xen-devel] [PATCH v3 00/11] xen: Initial kexec/kdump implementation
Konrad Rzeszutek Wilk writes:
> On Mon, Jan 07, 2013 at 01:34:04PM +0100, Daniel Kiper wrote:
>> I think that the new kexec hypercall function should mimic the kexec syscall. It means that all arguments passed to the hypercall should have the same types if possible, or if that is not possible then the conversion should be very easy. Additionally, I think that one call of the new hypercall load function should load all needed things in the right place and return the relevant status. Last but not least, new functionality should
>
> We are not restricted to just _one_ hypercall. And this loading thing could be similar to the microcode hypercall - which just points to a virtual address along with the length - and says 'load me'.
>
>> be available through /dev/xen/privcmd or directly from the kernel without bigger effort.
>
> Perhaps we should have an email thread on xen-devel where we hash out some ideas. Eric, would you be OK being included on this - it would make sense for this mechanism to be as future-proof as possible - and I am not sure what your plans for kexec are in the future?

The basic kexec interface is: load ranges of virtual addresses at physical addresses, then jump to the physical address with identity-mapped page tables.

There are a few flags to allow for different usage scenarios like kexec on panic vs normal kexec.

It is very very simple and very extensible. All of the weird glue happens in userspace.

Eric
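The "load ranges at physical addresses" half of the interface Eric describes can be mimicked with a toy model in which an array stands in for physical memory. This is an illustrative sketch only: the array, the range struct and load_ranges() are hypothetical, and the second half of the interface (the jump with identity-mapped page tables) is deliberately not modelled.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define PHYS_SIZE 4096u
static unsigned char phys[PHYS_SIZE];   /* toy stand-in for physical memory */

/* One range: copy bufsz bytes from buf to offset mem and zero-fill the rest
 * of the memsz-byte destination. */
struct range {
    const void *buf;
    size_t bufsz;
    size_t mem;
    size_t memsz;
};

/* Returns 0 on success, -1 if a range does not fit. Ranges copied before the
 * failing one are not rolled back in this sketch. */
static int load_ranges(const struct range *r, size_t n)
{
    assert(r != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (r[i].bufsz > r[i].memsz || r[i].mem > PHYS_SIZE ||
            r[i].memsz > PHYS_SIZE - r[i].mem)
            return -1;
        if (r[i].bufsz > 0)
            memcpy(phys + r[i].mem, r[i].buf, r[i].bufsz);
        memset(phys + r[i].mem + r[i].bufsz, 0, r[i].memsz - r[i].bufsz);
    }
    return 0;
}
```

Everything else (parsing the kernel image, building boot parameters, choosing addresses) would be the "weird glue" that stays in userspace.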
Re: [Xen-devel] [PATCH v3 00/11] xen: Initial kexec/kdump implementation
On 04/01/13 17:01, Daniel Kiper wrote:
> On Fri, Jan 04, 2013 at 02:38:44PM +, David Vrabel wrote:
>> On 04/01/13 14:22, Daniel Kiper wrote:
>>> On Wed, Jan 02, 2013 at 11:26:43AM +, Andrew Cooper wrote:
>>>> On 27/12/12 18:02, Eric W. Biederman wrote:
>>>>> Andrew Cooper writes:
>>>>>> On 27/12/2012 07:53, Eric W. Biederman wrote:
>>>>>>> The syscall ABI still has the wrong semantics.
>>>>>>>
>>>>>>> Aka totally unmaintainable and unmergeable.
>>>>>>>
>>>>>>> The concept of domU support is also strange. What does domU support even mean, when the dom0 support is loading a kernel to pick up Xen when Xen falls over.
>>>>>>
>>>>>> There are two requirements pulling at this patch series, but I agree that we need to clarify them.
>>>>>
>>>>> It probably makes sense to split them apart a little even.
>>>>
>>>> Thinking about this split, there might be a way to simplify it even more.
>>>>
>>>> /sbin/kexec can load the "Xen" crash kernel itself by issuing hypercalls using /dev/xen/privcmd. This would remove the need for the dom0 kernel to distinguish between loading a crash kernel for itself and loading a kernel for Xen.
>>>>
>>>> Or is this just a silly idea complicating the matter?
>>>
>>> This is impossible with the current Xen kexec/kdump interface. It should be changed to do that. However, I suppose that the Xen community would not be interested in such changes.
>>
>> I don't see why the hypercall ABI cannot be extended with new sub-ops that do the right thing -- the existing ABI is a bit weird.
>>
>> I plan to start prototyping something shortly (hopefully next week) for the Xen kexec case.
>
> Wow... As I can see, this time the Xen community is interested in... That is great. I agree that the current kexec interface is not ideal.

I spent some more time looking at the existing interface and implementation and it really is broken.

> David, I am happy to help in that process. However, if you wish I could carry it myself. Anyway, it looks like I should hold off on my Linux kexec/kdump patches.

I should be able to post some prototype patches for Xen in a few weeks. No guarantees though.

> My .5 cents:
>   - We should focus on KEXEC_CMD_kexec_load and KEXEC_CMD_kexec_unload; probably we should introduce KEXEC_CMD_kexec_load2 and KEXEC_CMD_kexec_unload2; load should __LOAD__ kernel image and other things into hypervisor memory;

Yes, but I don't see how we can easily support both ABIs. I'd be in favour of replacing the existing hypercalls and requiring updated kexec tools in dom0 (this isn't that different to requiring the correct libxc in dom0).

>     I suppose that almost all things could be copied from linux/kernel/kexec.c, linux/arch/x86/kernel/{machine_kexec_$(BITS).c,relocate_kernel_$(BITS).c}; I think that KEXEC_CMD_kexec should stay as is,

I don't think we want all the junk from Linux inside Xen -- we only want to support the kdump case and do not have to handle returning from the kexec image.

>   - Hmmm... Now I think that we should still use kexec syscall to load image into Xen memory (with new KEXEC_CMD_kexec_load2) because it establishes all things which are needed to call kdump if dom0 crashes; however, I could be wrong...

I don't think we need the kexec syscall. The kernel can unconditionally do the crash hypercall, which will return if the kdump kernel isn't loaded and the kernel can fall back to the regular non-kexec panic.

This will allow the kexec syscall to be used only for the domU kexec case.

>   - last but not least, we should think about support for PV guests too.

I won't be looking at this.

To avoid confusion about the two largely orthogonal sorts of kexec, how about defining some terms? I suggest:

Xen kexec: Xen executes the image in response to a Xen crash or a hypercall from a privileged domain.

Guest kexec: The guest kernel executes the image within the domain in response to a guest kernel crash or a system call.

David
Re: [Xen-devel] [PATCH v3 00/11] xen: Initial kexec/kdump implementation
Konrad Rzeszutek Wilk konrad.w...@oracle.com writes: On Mon, Jan 07, 2013 at 01:34:04PM +0100, Daniel Kiper wrote: I think that new kexec hypercall function should mimics kexec syscall. It means that all arguments passed to hypercall should have same types if it is possible or if it is not possible then conversion should be done in very easy way. Additionally, I think that one call of new hypercall load function should load all needed thinks in right place and return relevant status. Last but not least, new functionality should We are not restricted to just _one_ hypercall. And this loading thing could be similar to the micrcode hypercall - which just points to a virtual address along with the length - and says 'load me'. be available through /dev/xen/privcmd or directly from kernel without bigger effort. Perhaps we should have a email thread on xen-devel where we hash out some ideas. Eric, would you be OK included on this - it would make sense for this mechanism to be as future-proof as possible - and I am not sure what your plans for kexec are in the future? The basic kexec interface is. load ranges of virtual addresses physical addresses. jump to the physical address with identity mapped page tables. There are a few flags to allow for different usage scenarios like kexec on panic vs normal kexec. It is very very simple and very extensible. All of the weird glue happens in userspace. Eric -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [Xen-devel] [PATCH v3 00/11] xen: Initial kexec/kdump implementation
On 04/01/13 17:01, Daniel Kiper wrote:
> On Fri, Jan 04, 2013 at 02:38:44PM +, David Vrabel wrote:
>> On 04/01/13 14:22, Daniel Kiper wrote:
>>> On Wed, Jan 02, 2013 at 11:26:43AM +, Andrew Cooper wrote:
>>>> On 27/12/12 18:02, Eric W. Biederman wrote:
>>>>> Andrew Cooper <andrew.coop...@citrix.com> writes:
>>>>>
>>>>>> On 27/12/2012 07:53, Eric W. Biederman wrote:
>>>>>>> The syscall ABI still has the wrong semantics.
>>>>>>>
>>>>>>> Aka totally unmaintainable and umergeable.
>>>>>>>
>>>>>>> The concept of domU support is also strange. What does domU support
>>>>>>> even mean, when the dom0 support is loading a kernel to pick up Xen
>>>>>>> when Xen falls over.
>>>>>> There are two requirements pulling at this patch series, but I agree
>>>>>> that we need to clarify them.
>>>>> It probably make sense to split them apart a little even.
>>>>
>>>> Thinking about this split, there might be a way to simply it even more.
>>>>
>>>> /sbin/kexec can load the "Xen" crash kernel itself by issuing
>>>> hypercalls using /dev/xen/privcmd. This would remove the need for
>>>> the dom0 kernel to distinguish between loading a crash kernel for
>>>> itself and loading a kernel for Xen.
>>>>
>>>> Or is this just a silly idea complicating the matter?
>>>
>>> This is impossible with current Xen kexec/kdump interface.
>>> It should be changed to do that. However, I suppose that
>>> Xen community would not be interested in such changes.
>>
>> I don't see why the hypercall ABI cannot be extended with new sub-ops
>> that do the right thing -- the existing ABI is a bit weird.
>>
>> I plan to start prototyping something shortly (hopefully next week) for
>> the Xen kexec case.
>
> Wow... As I can this time Xen community is interested in...
> That is great. I agree that current kexec interface is not ideal.

I spent some more time looking at the existing interface and
implementation and it really is broken.

> David, I am happy to help in that process. However, if you wish I could
> carry it myself. Anyway, it looks that I should hold on with my
> Linux kexec/kdump patches.

I should be able to post some prototype patches for Xen in a few weeks.
No guarantees though.

> My .5 cents:
>   - We should focus on KEXEC_CMD_kexec_load and KEXEC_CMD_kexec_unload;
>     probably we should introduce KEXEC_CMD_kexec_load2 and
>     KEXEC_CMD_kexec_unload2; load should __LOAD__ kernel image and other
>     things into hypervisor memory;

Yes, but I don't see how we can easily support both ABIs. I'd be in
favour of replacing the existing hypercalls and requiring updated kexec
tools in dom0 (this isn't that different to requiring the correct libxc
in dom0).

>     I suppose that allmost all things could be copied from
>     linux/kernel/kexec.c,
>     linux/arch/x86/kernel/{machine_kexec_$(BITS).c,relocate_kernel_$(BITS).c};
>     I think that KEXEC_CMD_kexec should stay as is,

I don't think we want all the junk from Linux inside Xen -- we only want
to support the kdump case and do not have to handle returning from the
kexec image.

>   - Hmmm... Now I think that we should still use kexec syscall to load
>     image into Xen memory (with new KEXEC_CMD_kexec_load2) because it
>     establishes all things which are needed to call kdump if dom0 crashes;
>     however, I could be wrong...

I don't think we need the kexec syscall. The kernel can unconditionally
do the crash hypercall, which will return if the kdump kernel isn't
loaded, and the kernel can then fall back to the regular non-kexec panic.

This will allow the kexec syscall to be used only for the domU kexec
case.

>   - last but not least, we should think about support for PV guests too.

I won't be looking at this.

To avoid confusion about the two largely orthogonal sorts of kexec, how
about defining some terms? I suggest:

Xen kexec: Xen executes the image in response to a Xen crash or a
hypercall from a privileged domain.

Guest kexec: The guest kernel executes the image within the domain in
response to a guest kernel crash or a system call.

David
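The fallback David describes (always issue the crash hypercall, and carry on with the ordinary panic path if it comes back) can be sketched as below. This is only an illustration under stated assumptions: `guest_panic`, `crash_hypercall_fn`, the two stub functions and the return convention are all hypothetical, and a real crash hypercall would not return at all when a kdump image is loaded.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical outcome labels, used only so the sketch can be checked. */
enum panic_path { PANIC_XEN_KDUMP, PANIC_REGULAR };

/*
 * Stand-in for the crash hypercall. Assumed convention for this sketch:
 * 0 means Xen took over (a real call would never return in that case),
 * non-zero means no kdump image is loaded and the call came back.
 */
typedef int (*crash_hypercall_fn)(void);

/* Stubs modelling the two situations. */
int crash_with_image(void)    { return 0; }
int crash_without_image(void) { return -1; }

/* Guest panic path: try the hypercall unconditionally, then fall back. */
enum panic_path guest_panic(crash_hypercall_fn crash)
{
    assert(crash != NULL);
    if (crash() == 0)
        return PANIC_XEN_KDUMP;   /* Xen kexec path was taken */
    return PANIC_REGULAR;         /* regular non-kexec panic */
}
```

The point of the design is that the guest kernel needs no knowledge of whether a Xen crash image is loaded; the hypercall's return is the only signal.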
Re: [Xen-devel] [PATCH v3 00/11] xen: Initial kexec/kdump implementation
On Mon, Jan 07, 2013 at 01:34:04PM +0100, Daniel Kiper wrote:
> On Fri, Jan 04, 2013 at 02:11:46PM -0500, Konrad Rzeszutek Wilk wrote:
> > On Fri, Jan 04, 2013 at 06:07:51PM +0100, Daniel Kiper wrote:
> > > On Fri, Jan 04, 2013 at 02:41:17PM +, Jan Beulich wrote:
> > > > >>> On 04.01.13 at 15:22, Daniel Kiper wrote:
> > > > > On Wed, Jan 02, 2013 at 11:26:43AM +, Andrew Cooper wrote:
> > > > >> /sbin/kexec can load the "Xen" crash kernel itself by issuing
> > > > >> hypercalls using /dev/xen/privcmd. This would remove the need for
> > > > >> the dom0 kernel to distinguish between loading a crash kernel for
> > > > >> itself and loading a kernel for Xen.
> > > > >>
> > > > >> Or is this just a silly idea complicating the matter?
> > > > >
> > > > > This is impossible with current Xen kexec/kdump interface.
> > > >
> > > > Why?
> > >
> > > Because current KEXEC_CMD_kexec_load does not load kernel
> > > image and other things into Xen memory. It means that it
> > > should live somewhere in dom0 Linux kernel memory.
> >
> > We could have a very simple hypercall which would have:
> >
> > struct fancy_new_hypercall {
> >     xen_pfn_t payload; // IN
> >     ssize_t len; // IN
> > #define DATA (1<<1)
> > #define DATA_EOF (1<<2)
> > #define DATA_KERNEL (1<<3)
> > #define DATA_RAMDISK (1<<4)
> >     unsigned int flags; // IN
> >     unsigned int status; // OUT
> > };
> >
> > which would in a loop just iterate over the payloads and
> > let the hypervisor stick it in the crashkernel space.
> >
> > This is all hand-waving of course. There probably would be a need
> > to figure out how much space you have in the reserved Xen's
> > 'crashkernel' memory region too.
>
> I think that new kexec hypercall function should mimics kexec syscall.
> It means that all arguments passed to hypercall should have same types
> if it is possible or if it is not possible then conversion should be done
> in very easy way. Additionally, I think that one call of new hypercall
> load function should load all needed thinks in right place and
> return relevant status. Last but not least, new functionality should

We are not restricted to just _one_ hypercall. And this loading thing
could be similar to the microcode hypercall - which just points to a
virtual address along with the length - and says 'load me'.

> be available through /dev/xen/privcmd or directly from kernel without
> bigger effort.

Perhaps we should have an email thread on xen-devel where we hash out
some ideas.

Eric, would you be OK being included on this - it would make sense for
this mechanism to be as future-proof as possible - and I am not sure
what your plans for kexec are in the future?

>
> Daniel
Re: [Xen-devel] [PATCH v3 00/11] xen: Initial kexec/kdump implementation
On Mon, 2013-01-07 at 12:34 +, Daniel Kiper wrote:
> I think that new kexec hypercall function should mimics kexec syscall.

We want to have an interface that can be used by non-Linux domains (both
dom0 and domU) as well though, so please bear this in mind.

Historically we've not always been good at this when the hypercall
interface is strongly tied to a particular guest implementation (in some
sense this is the problem with the current kexec hypercall).

Also, what makes for a good syscall interface does not necessarily make
for a good hypercall interface.

Ian.
Re: [Xen-devel] [PATCH v3 00/11] xen: Initial kexec/kdump implementation
On Fri, Jan 04, 2013 at 02:11:46PM -0500, Konrad Rzeszutek Wilk wrote:
> On Fri, Jan 04, 2013 at 06:07:51PM +0100, Daniel Kiper wrote:
> > On Fri, Jan 04, 2013 at 02:41:17PM +, Jan Beulich wrote:
> > > >>> On 04.01.13 at 15:22, Daniel Kiper wrote:
> > > > On Wed, Jan 02, 2013 at 11:26:43AM +, Andrew Cooper wrote:
> > > >> /sbin/kexec can load the "Xen" crash kernel itself by issuing
> > > >> hypercalls using /dev/xen/privcmd. This would remove the need for
> > > >> the dom0 kernel to distinguish between loading a crash kernel for
> > > >> itself and loading a kernel for Xen.
> > > >>
> > > >> Or is this just a silly idea complicating the matter?
> > > >
> > > > This is impossible with current Xen kexec/kdump interface.
> > >
> > > Why?
> >
> > Because current KEXEC_CMD_kexec_load does not load kernel
> > image and other things into Xen memory. It means that it
> > should live somewhere in dom0 Linux kernel memory.
>
> We could have a very simple hypercall which would have:
>
> struct fancy_new_hypercall {
>     xen_pfn_t payload; // IN
>     ssize_t len; // IN
> #define DATA (1<<1)
> #define DATA_EOF (1<<2)
> #define DATA_KERNEL (1<<3)
> #define DATA_RAMDISK (1<<4)
>     unsigned int flags; // IN
>     unsigned int status; // OUT
> };
>
> which would in a loop just iterate over the payloads and
> let the hypervisor stick it in the crashkernel space.
>
> This is all hand-waving of course. There probably would be a need
> to figure out how much space you have in the reserved Xen's
> 'crashkernel' memory region too.

I think that the new kexec hypercall function should mimic the kexec
syscall. That means all arguments passed to the hypercall should have
the same types where possible; where that is not possible, the
conversion should be very easy. Additionally, I think that one call of
the new hypercall load function should load all needed things in the
right place and return a relevant status. Last but not least, the new
functionality should be available through /dev/xen/privcmd or directly
from the kernel without much effort.

Daniel
Re: [Xen-devel] [PATCH v3 00/11] xen: Initial kexec/kdump implementation
On Mon, 2013-01-07 at 10:46 +, Andrew Cooper wrote:
> Given that /sbin/kexec creates a binary blob in memory, surely the most
> simple thing is to get it to suitably mlock() the region and give a list
> of VAs to the hypervisor.

More than likely. The DOMID_KEXEC thing was just a random musing ;-)

> This way, Xen can properly take care of what it does with information
> and where. For example, at the moment, allowing dom0 to choose where
> gets overwritten in the Xen crash area is a recipe for disaster if a
> crash occurs midway through loading/reloading the crash kernel.

That's true. I think there is a double buffering scheme in the current
implementation and we should preserve that in any new implementation.

Ian.
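A minimal sketch of the double-buffering idea Ian mentions: a new image is staged into the idle slot, and the active slot is only switched once staging has succeeded, so a crash midway through a reload still finds the previous image intact. This is not a description of how Xen actually implements it; `crash_slots`, `SLOT_SIZE` and the `slots_*` functions are hypothetical names chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SLOT_SIZE 64 /* arbitrary size for the sketch */

/* Two image slots; 'active' is the one a crash would boot, -1 if none. */
struct crash_slots {
    unsigned char slot[2][SLOT_SIZE];
    size_t        len[2];
    int           active;
};

void slots_init(struct crash_slots *s)
{
    memset(s, 0, sizeof *s);
    s->active = -1;
}

/* Copy an image into the idle slot. The active slot is never touched.
 * Returns the staged slot index, or -1 if the image does not fit. */
int slots_stage(struct crash_slots *s, const void *img, size_t len)
{
    int idle = (s->active == 0) ? 1 : 0;

    if (len > SLOT_SIZE)
        return -1;
    memcpy(s->slot[idle], img, len);
    s->len[idle] = len;
    return idle;
}

/* Make a fully staged slot the active one (a single index update). */
void slots_commit(struct crash_slots *s, int staged)
{
    assert(staged == 0 || staged == 1);
    s->active = staged;
}

/* Image a crash would use right now, or NULL if nothing is loaded. */
const unsigned char *slots_current(const struct crash_slots *s, size_t *len)
{
    if (s->active < 0)
        return NULL;
    *len = s->len[s->active];
    return s->slot[s->active];
}
```

Between `slots_stage()` and `slots_commit()` the old image stays the current one, which is the property Andrew's concern about a crash during reloading calls for.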
Re: [Xen-devel] [PATCH v3 00/11] xen: Initial kexec/kdump implementation
On 07/01/13 10:25, Ian Campbell wrote:
> On Fri, 2013-01-04 at 19:11 +, Konrad Rzeszutek Wilk wrote:
>> On Fri, Jan 04, 2013 at 06:07:51PM +0100, Daniel Kiper wrote:
>>> Because current KEXEC_CMD_kexec_load does not load kernel
>>> image and other things into Xen memory. It means that it
>>> should live somewhere in dom0 Linux kernel memory.
>> We could have a very simple hypercall which would have:
>>
>> struct fancy_new_hypercall {
>>     xen_pfn_t payload; // IN
>
> This would have to be XEN_GUEST_HANDLE(something) since userspace cannot
> figure out what pfns back its memory. In any case since the hypervisor
> is going to want to copy the data into the crashkernel space a virtual
> address is convenient to have.
>
>>     ssize_t len; // IN
>> #define DATA (1<<1)
>> #define DATA_EOF (1<<2)
>> #define DATA_KERNEL (1<<3)
>> #define DATA_RAMDISK (1<<4)
>>     unsigned int flags; // IN
>>     unsigned int status; // OUT
>> };
>>
>> which would in a loop just iterate over the payloads and
>> let the hypervisor stick it in the crashkernel space.
>>
>> This is all hand-waving of course. There probably would be a need
>> to figure out how much space you have in the reserved Xen's
>> 'crashkernel' memory region too.
>
> This is probably a mad idea but it's Monday morning and I'm sleep
> deprived so I'll throw it out there...
>
> What about adding DOMID_KEXEC (similar DOMID_IO etc)? This would allow
> dom0 to map the kexec memory space with the usual privcmd mmap
> hypercalls and build things in it directly.
>
> OK, I suspect this might not be practical for a variety of reasons (lack
> of a p2m for such domains so no way to find out the list of mfns, dom0
> userspace simply doesn't have sufficient context to write sensible
> things here, etc) but maybe someone has a better head on today...
>
> Ian.

Given that /sbin/kexec creates a binary blob in memory, surely the most
simple thing is to get it to suitably mlock() the region and give a list
of VAs to the hypervisor.

This way, Xen can properly take care of what it does with information
and where. For example, at the moment, allowing dom0 to choose where
gets overwritten in the Xen crash area is a recipe for disaster if a
crash occurs midway through loading/reloading the crash kernel.

~Andrew
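As a rough illustration of Andrew's suggestion, the userspace side could pin the blob with mlock() and describe it as a list of virtual addresses. The sketch below assumes one entry per page, which is only one possible shape for such a list; `va_extent`, `list_pages` and `pin_region` are hypothetical names, and how the list would actually be handed to Xen is deliberately left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>

/* Hypothetical list entry: one page-sized extent of the blob. */
struct va_extent {
    uintptr_t va;
    size_t    len;
};

/* Pin the blob so its pages stay resident; wraps POSIX mlock(). */
int pin_region(const void *buf, size_t len)
{
    return mlock(buf, len);
}

/*
 * Fill out[] with one entry per page touched by [buf, buf + len).
 * 'page' must be a power of two. Returns the number of entries,
 * or -1 if out[] (capacity 'max') is too small.
 */
long list_pages(const void *buf, size_t len, size_t page,
                struct va_extent *out, size_t max)
{
    uintptr_t start, end, va;
    size_t n = 0;

    assert(page != 0 && (page & (page - 1)) == 0);
    if (len == 0)
        return 0;

    start = (uintptr_t)buf & ~(uintptr_t)(page - 1); /* round down */
    end = (uintptr_t)buf + len;

    for (va = start; va < end; va += page) {
        if (n == max)
            return -1;
        out[n].va = va;
        out[n].len = page;
        n++;
    }
    return (long)n;
}
```

A caller would typically obtain the page size with sysconf(_SC_PAGESIZE), call `pin_region()` on the blob, and only then build the list, so that the addresses stay backed by memory while the hypervisor copies from them.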
Re: [Xen-devel] [PATCH v3 00/11] xen: Initial kexec/kdump implementation
On Fri, 2013-01-04 at 19:11 +, Konrad Rzeszutek Wilk wrote:
> On Fri, Jan 04, 2013 at 06:07:51PM +0100, Daniel Kiper wrote:
> > On Fri, Jan 04, 2013 at 02:41:17PM +, Jan Beulich wrote:
> > > >>> On 04.01.13 at 15:22, Daniel Kiper wrote:
> > > > On Wed, Jan 02, 2013 at 11:26:43AM +, Andrew Cooper wrote:
> > > >> /sbin/kexec can load the "Xen" crash kernel itself by issuing
> > > >> hypercalls using /dev/xen/privcmd. This would remove the need for
> > > >> the dom0 kernel to distinguish between loading a crash kernel for
> > > >> itself and loading a kernel for Xen.
> > > >>
> > > >> Or is this just a silly idea complicating the matter?
> > > >
> > > > This is impossible with current Xen kexec/kdump interface.
> > >
> > > Why?
> >
> > Because current KEXEC_CMD_kexec_load does not load kernel
> > image and other things into Xen memory. It means that it
> > should live somewhere in dom0 Linux kernel memory.
>
> We could have a very simple hypercall which would have:
>
> struct fancy_new_hypercall {
>     xen_pfn_t payload; // IN

This would have to be XEN_GUEST_HANDLE(something) since userspace cannot
figure out what pfns back its memory. In any case, since the hypervisor
is going to want to copy the data into the crashkernel space, a virtual
address is convenient to have.

>     ssize_t len; // IN
> #define DATA (1<<1)
> #define DATA_EOF (1<<2)
> #define DATA_KERNEL (1<<3)
> #define DATA_RAMDISK (1<<4)
>     unsigned int flags; // IN
>     unsigned int status; // OUT
> };
>
> which would in a loop just iterate over the payloads and
> let the hypervisor stick it in the crashkernel space.
>
> This is all hand-waving of course. There probably would be a need
> to figure out how much space you have in the reserved Xen's
> 'crashkernel' memory region too.

This is probably a mad idea but it's Monday morning and I'm sleep
deprived so I'll throw it out there...

What about adding DOMID_KEXEC (similar to DOMID_IO etc.)? This would
allow dom0 to map the kexec memory space with the usual privcmd mmap
hypercalls and build things in it directly.

OK, I suspect this might not be practical for a variety of reasons (lack
of a p2m for such domains so no way to find out the list of mfns, dom0
userspace simply doesn't have sufficient context to write sensible
things here, etc.) but maybe someone has a better head on today...

Ian.
Re: [Xen-devel] [PATCH v3 00/11] xen: Initial kexec/kdump implementation
On Fri, Jan 04, 2013 at 06:07:51PM +0100, Daniel Kiper wrote:
> On Fri, Jan 04, 2013 at 02:41:17PM +, Jan Beulich wrote:
> > >>> On 04.01.13 at 15:22, Daniel Kiper wrote:
> > > On Wed, Jan 02, 2013 at 11:26:43AM +, Andrew Cooper wrote:
> > >> /sbin/kexec can load the "Xen" crash kernel itself by issuing
> > >> hypercalls using /dev/xen/privcmd. This would remove the need for
> > >> the dom0 kernel to distinguish between loading a crash kernel for
> > >> itself and loading a kernel for Xen.
> > >>
> > >> Or is this just a silly idea complicating the matter?
> > >
> > > This is impossible with current Xen kexec/kdump interface.
> >
> > Why?
>
> Because current KEXEC_CMD_kexec_load does not load kernel
> image and other things into Xen memory. It means that it
> should live somewhere in dom0 Linux kernel memory.

We could have a very simple hypercall which would have:

struct fancy_new_hypercall {
    xen_pfn_t payload; // IN
    ssize_t len; // IN
#define DATA (1<<1)
#define DATA_EOF (1<<2)
#define DATA_KERNEL (1<<3)
#define DATA_RAMDISK (1<<4)
    unsigned int flags; // IN
    unsigned int status; // OUT
};

which would in a loop just iterate over the payloads and
let the hypervisor stick it in the crashkernel space.

This is all hand-waving of course. There probably would be a need
to figure out how much space you have in the reserved Xen's
'crashkernel' memory region too.

>
> Daniel
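The loop Konrad hand-waves at can be sketched in plain userspace C, with the hypervisor side replaced by a mock. The flag values are the ones from the proposal; everything else is an assumption made for illustration: a plain pointer stands in for `xen_pfn_t`, `size_t` replaces `ssize_t`, and `crash_region`, `mock_load_chunk`, `load_image` and the `LOAD_*` status codes are hypothetical. No real Xen interface is used.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Flag values as written in the proposal. */
#define DATA         (1u << 1)
#define DATA_EOF     (1u << 2)
#define DATA_KERNEL  (1u << 3)
#define DATA_RAMDISK (1u << 4)

/* Hypothetical status codes; the proposal does not define any. */
enum { LOAD_OK = 0, LOAD_NO_SPACE = 1, LOAD_BAD_FLAGS = 2 };

/* Stand-in for the proposed argument block (pointer instead of a pfn). */
struct fancy_new_hypercall {
    const void  *payload; /* IN */
    size_t       len;     /* IN */
    unsigned int flags;   /* IN */
    unsigned int status;  /* OUT */
};

/* Stand-in for the reserved 'crashkernel' region. */
struct crash_region {
    unsigned char *base;
    size_t         size;
    size_t         used;
    int            complete; /* set once a DATA_EOF chunk is accepted */
};

/* Mock hypervisor side: append one chunk, refusing it if it won't fit. */
void mock_load_chunk(struct crash_region *r, struct fancy_new_hypercall *op)
{
    if (!(op->flags & DATA)) {
        op->status = LOAD_BAD_FLAGS;
        return;
    }
    if (op->len > r->size - r->used) {
        op->status = LOAD_NO_SPACE;
        return;
    }
    memcpy(r->base + r->used, op->payload, op->len);
    r->used += op->len;
    if (op->flags & DATA_EOF)
        r->complete = 1;
    op->status = LOAD_OK;
}

/* Caller side: feed an image in chunks, marking the last with DATA_EOF. */
unsigned int load_image(struct crash_region *r, const unsigned char *img,
                        size_t len, size_t chunk, unsigned int kind)
{
    size_t off = 0;

    assert(chunk > 0);
    do {
        size_t n = (len - off < chunk) ? len - off : chunk;
        struct fancy_new_hypercall op = { img + off, n, DATA | kind, 0 };

        if (off + n == len)
            op.flags |= DATA_EOF;
        mock_load_chunk(r, &op);
        if (op.status != LOAD_OK)
            return op.status; /* stop at the first rejected chunk */
        off += n;
    } while (off < len);
    return LOAD_OK;
}
```

Note that in this naive form a load that runs out of space leaves a partial image behind in the region, which is one reason the caller needs to know the available space up front, as the message points out.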
Re: [Xen-devel] [PATCH v3 00/11] xen: Initial kexec/kdump implementation
On Fri, Jan 04, 2013 at 02:41:17PM +, Jan Beulich wrote:
> >>> On 04.01.13 at 15:22, Daniel Kiper wrote:
> > On Wed, Jan 02, 2013 at 11:26:43AM +, Andrew Cooper wrote:
> >> /sbin/kexec can load the "Xen" crash kernel itself by issuing
> >> hypercalls using /dev/xen/privcmd. This would remove the need for
> >> the dom0 kernel to distinguish between loading a crash kernel for
> >> itself and loading a kernel for Xen.
> >>
> >> Or is this just a silly idea complicating the matter?
> >
> > This is impossible with current Xen kexec/kdump interface.
>
> Why?

Because the current KEXEC_CMD_kexec_load does not load the kernel image
and other things into Xen memory. It means that they have to live
somewhere in dom0 Linux kernel memory.

Daniel
Re: [Xen-devel] [PATCH v3 00/11] xen: Initial kexec/kdump implementation
On Fri, Jan 04, 2013 at 02:38:44PM +, David Vrabel wrote:
> On 04/01/13 14:22, Daniel Kiper wrote:
> > On Wed, Jan 02, 2013 at 11:26:43AM +, Andrew Cooper wrote:
> >> On 27/12/12 18:02, Eric W. Biederman wrote:
> >>> Andrew Cooper writes:
> >>>
> >>>> On 27/12/2012 07:53, Eric W. Biederman wrote:
> >>>>> The syscall ABI still has the wrong semantics.
> >>>>>
> >>>>> Aka totally unmaintainable and umergeable.
> >>>>>
> >>>>> The concept of domU support is also strange. What does domU support
> >>>>> even mean, when the dom0 support is loading a kernel to pick up Xen
> >>>>> when Xen falls over.
> >>>> There are two requirements pulling at this patch series, but I agree
> >>>> that we need to clarify them.
> >>> It probably make sense to split them apart a little even.
> >>>
> >>>
> >>
> >> Thinking about this split, there might be a way to simply it even more.
> >>
> >> /sbin/kexec can load the "Xen" crash kernel itself by issuing
> >> hypercalls using /dev/xen/privcmd. This would remove the need for
> >> the dom0 kernel to distinguish between loading a crash kernel for
> >> itself and loading a kernel for Xen.
> >>
> >> Or is this just a silly idea complicating the matter?
> >
> > This is impossible with current Xen kexec/kdump interface.
> > It should be changed to do that. However, I suppose that
> > Xen community would not be interested in such changes.
>
> I don't see why the hypercall ABI cannot be extended with new sub-ops
> that do the right thing -- the existing ABI is a bit weird.
>
> I plan to start prototyping something shortly (hopefully next week) for
> the Xen kexec case.

Wow... As I can see, this time the Xen community is interested...
That is great. I agree that the current kexec interface is not ideal.

David, I am happy to help in that process. However, if you wish I could
carry it myself. Anyway, it looks like I should hold off on my
Linux kexec/kdump patches.

My .5 cents:
  - We should focus on KEXEC_CMD_kexec_load and KEXEC_CMD_kexec_unload;
    probably we should introduce KEXEC_CMD_kexec_load2 and
    KEXEC_CMD_kexec_unload2; load should __LOAD__ the kernel image and
    other things into hypervisor memory; I suppose that almost all
    things could be copied from linux/kernel/kexec.c,
    linux/arch/x86/kernel/{machine_kexec_$(BITS).c,relocate_kernel_$(BITS).c};
    I think that KEXEC_CMD_kexec should stay as is,
  - Hmmm... Now I think that we should still use the kexec syscall to
    load the image into Xen memory (with the new KEXEC_CMD_kexec_load2)
    because it establishes all the things which are needed to call kdump
    if dom0 crashes; however, I could be wrong...
  - last but not least, we should think about support for PV guests too.

Daniel
Re: [Xen-devel] [PATCH v3 00/11] xen: Initial kexec/kdump implementation
>>> On 04.01.13 at 15:22, Daniel Kiper wrote:
> On Wed, Jan 02, 2013 at 11:26:43AM +, Andrew Cooper wrote:
>> /sbin/kexec can load the "Xen" crash kernel itself by issuing
>> hypercalls using /dev/xen/privcmd. This would remove the need for
>> the dom0 kernel to distinguish between loading a crash kernel for
>> itself and loading a kernel for Xen.
>>
>> Or is this just a silly idea complicating the matter?
>
> This is impossible with current Xen kexec/kdump interface.

Why?

> It should be changed to do that. However, I suppose that
> Xen community would not be interested in such changes.

And again - why?

Jan
Re: [Xen-devel] [PATCH v3 00/11] xen: Initial kexec/kdump implementation
On 04/01/13 14:22, Daniel Kiper wrote:
> On Wed, Jan 02, 2013 at 11:26:43AM +, Andrew Cooper wrote:
>> On 27/12/12 18:02, Eric W. Biederman wrote:
>>> Andrew Cooper writes:
>>>
>>>> On 27/12/2012 07:53, Eric W. Biederman wrote:
>>>>> The syscall ABI still has the wrong semantics.
>>>>>
>>>>> Aka totally unmaintainable and umergeable.
>>>>>
>>>>> The concept of domU support is also strange. What does domU support even
>>>>> mean, when the dom0 support is loading a kernel to pick up Xen when Xen
>>>>> falls over.
>>>> There are two requirements pulling at this patch series, but I agree
>>>> that we need to clarify them.
>>> It probably make sense to split them apart a little even.
>>>
>>>
>>
>> Thinking about this split, there might be a way to simply it even more.
>>
>> /sbin/kexec can load the "Xen" crash kernel itself by issuing
>> hypercalls using /dev/xen/privcmd. This would remove the need for
>> the dom0 kernel to distinguish between loading a crash kernel for
>> itself and loading a kernel for Xen.
>>
>> Or is this just a silly idea complicating the matter?
>
> This is impossible with current Xen kexec/kdump interface.
> It should be changed to do that. However, I suppose that
> Xen community would not be interested in such changes.

I don't see why the hypercall ABI cannot be extended with new sub-ops
that do the right thing -- the existing ABI is a bit weird.

I plan to start prototyping something shortly (hopefully next week) for
the Xen kexec case.

David
Re: [Xen-devel] [PATCH v3 00/11] xen: Initial kexec/kdump implementation
On Fri, Jan 04, 2013 at 03:22:57PM +0100, Daniel Kiper wrote: > On Wed, Jan 02, 2013 at 11:26:43AM +, Andrew Cooper wrote: > > On 27/12/12 18:02, Eric W. Biederman wrote: > > >Andrew Cooper writes: > > > > > >>On 27/12/2012 07:53, Eric W. Biederman wrote: > > >>>The syscall ABI still has the wrong semantics. > > >>> > > >>>Aka totally unmaintainable and umergeable. > > >>> > > >>>The concept of domU support is also strange. What does domU support > > >>>even mean, when the dom0 support is loading a kernel to pick up Xen when > > >>>Xen falls over. > > >>There are two requirements pulling at this patch series, but I agree > > >>that we need to clarify them. > > >It probably make sense to split them apart a little even. > > > > > > > > > > Thinking about this split, there might be a way to simply it even more. > > > > /sbin/kexec can load the "Xen" crash kernel itself by issuing > > hypercalls using /dev/xen/privcmd. This would remove the need for > > the dom0 kernel to distinguish between loading a crash kernel for > > itself and loading a kernel for Xen. > > > > Or is this just a silly idea complicating the matter? > > This is impossible with current Xen kexec/kdump interface. > It should be changed to do that. However, I suppose that > Xen community would not be interested in such changes. Why not? What is involved in it? IMHO I believe anybody would welcome a new clean design that solves this thorny problem? -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [Xen-devel] [PATCH v3 00/11] xen: Initial kexec/kdump implementation
On Fri, 2013-01-04 at 14:22 +, Daniel Kiper wrote: > On Wed, Jan 02, 2013 at 11:26:43AM +, Andrew Cooper wrote: > > On 27/12/12 18:02, Eric W. Biederman wrote: > > >Andrew Cooper writes: > > > > > >>On 27/12/2012 07:53, Eric W. Biederman wrote: > > >>>The syscall ABI still has the wrong semantics. > > >>> > > >>>Aka totally unmaintainable and umergeable. > > >>> > > >>>The concept of domU support is also strange. What does domU support > > >>>even mean, when the dom0 support is loading a kernel to pick up Xen when > > >>>Xen falls over. > > >>There are two requirements pulling at this patch series, but I agree > > >>that we need to clarify them. > > >It probably make sense to split them apart a little even. > > > > > > > > > > Thinking about this split, there might be a way to simply it even more. > > > > /sbin/kexec can load the "Xen" crash kernel itself by issuing > > hypercalls using /dev/xen/privcmd. This would remove the need for > > the dom0 kernel to distinguish between loading a crash kernel for > > itself and loading a kernel for Xen. > > > > Or is this just a silly idea complicating the matter? > > This is impossible with current Xen kexec/kdump interface. > It should be changed to do that. However, I suppose that > Xen community would not be interested in such changes. The current HYPERVISOR_kexec interface is pretty fricken bad (it basically hardcodes the Linux Circa-2.6.18 internal interface!). I'd be all for a new HYPERVISOR_kexec (with the old gaining a _compat suffix) which implements something more generic that isn't tied to a particular dom0 kernel implementation (be it differing versions of Linux or e.g. *BSD). If that enables /sbin/kexec to load the kernel directly then so much the better, assuming the /sbin/kexec maintainers are happy with that approach. Ian. 
Re: [Xen-devel] [PATCH v3 00/11] xen: Initial kexec/kdump implementation
On Wed, Jan 02, 2013 at 11:26:43AM +, Andrew Cooper wrote: > On 27/12/12 18:02, Eric W. Biederman wrote: > >Andrew Cooper writes: > > > >>On 27/12/2012 07:53, Eric W. Biederman wrote: > >>>The syscall ABI still has the wrong semantics. > >>> > >>>Aka totally unmaintainable and umergeable. > >>> > >>>The concept of domU support is also strange. What does domU support even > >>>mean, when the dom0 support is loading a kernel to pick up Xen when Xen > >>>falls over. > >>There are two requirements pulling at this patch series, but I agree > >>that we need to clarify them. > >It probably make sense to split them apart a little even. > > > > > > Thinking about this split, there might be a way to simply it even more. > > /sbin/kexec can load the "Xen" crash kernel itself by issuing > hypercalls using /dev/xen/privcmd. This would remove the need for > the dom0 kernel to distinguish between loading a crash kernel for > itself and loading a kernel for Xen. > > Or is this just a silly idea complicating the matter? This is impossible with current Xen kexec/kdump interface. It should be changed to do that. However, I suppose that Xen community would not be interested in such changes. Daniel -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [Xen-devel] [PATCH v3 00/11] xen: Initial kexec/kdump implementation
On Fri, Jan 04, 2013 at 02:38:44PM +, David Vrabel wrote: On 04/01/13 14:22, Daniel Kiper wrote: On Wed, Jan 02, 2013 at 11:26:43AM +, Andrew Cooper wrote: On 27/12/12 18:02, Eric W. Biederman wrote: Andrew Cooper <andrew.coop...@citrix.com> writes: On 27/12/2012 07:53, Eric W. Biederman wrote: The syscall ABI still has the wrong semantics. Aka totally unmaintainable and umergeable. The concept of domU support is also strange. What does domU support even mean, when the dom0 support is loading a kernel to pick up Xen when Xen falls over. There are two requirements pulling at this patch series, but I agree that we need to clarify them. It probably make sense to split them apart a little even. Thinking about this split, there might be a way to simply it even more. /sbin/kexec can load the Xen crash kernel itself by issuing hypercalls using /dev/xen/privcmd. This would remove the need for the dom0 kernel to distinguish between loading a crash kernel for itself and loading a kernel for Xen. Or is this just a silly idea complicating the matter? This is impossible with current Xen kexec/kdump interface. It should be changed to do that. However, I suppose that Xen community would not be interested in such changes. I don't see why the hypercall ABI cannot be extended with new sub-ops that do the right thing -- the existing ABI is a bit weird. I plan to start prototyping something shortly (hopefully next week) for the Xen kexec case.

Wow... As far as I can see, this time the Xen community is interested in such changes. That is great. I agree that the current kexec interface is not ideal. David, I am happy to help in that process. However, if you wish, I could carry it out myself. Anyway, it looks like I should hold off on my Linux kexec/kdump patches.

My .5 cents:
  - We should focus on KEXEC_CMD_kexec_load and KEXEC_CMD_kexec_unload; probably we should introduce KEXEC_CMD_kexec_load2 and KEXEC_CMD_kexec_unload2; load should __LOAD__ the kernel image and other things into hypervisor memory; I suppose that almost all things could be copied from linux/kernel/kexec.c, linux/arch/x86/kernel/{machine_kexec_$(BITS).c,relocate_kernel_$(BITS).S}; I think that KEXEC_CMD_kexec should stay as is,
  - Hmmm... Now I think that we should still use the kexec syscall to load the image into Xen memory (with the new KEXEC_CMD_kexec_load2) because it establishes all the things which are needed to call kdump if dom0 crashes; however, I could be wrong...
  - last but not least, we should think about support for PV guests too.

Daniel
Re: [Xen-devel] [PATCH v3 00/11] xen: Initial kexec/kdump implementation
On Fri, Jan 04, 2013 at 02:41:17PM +, Jan Beulich wrote: On 04.01.13 at 15:22, Daniel Kiper daniel.ki...@oracle.com wrote: On Wed, Jan 02, 2013 at 11:26:43AM +, Andrew Cooper wrote: /sbin/kexec can load the Xen crash kernel itself by issuing hypercalls using /dev/xen/privcmd. This would remove the need for the dom0 kernel to distinguish between loading a crash kernel for itself and loading a kernel for Xen. Or is this just a silly idea complicating the matter? This is impossible with current Xen kexec/kdump interface. Why? Because current KEXEC_CMD_kexec_load does not load kernel image and other things into Xen memory. It means that it should live somewhere in dom0 Linux kernel memory. Daniel -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [Xen-devel] [PATCH v3 00/11] xen: Initial kexec/kdump implementation
On Fri, Jan 04, 2013 at 06:07:51PM +0100, Daniel Kiper wrote: On Fri, Jan 04, 2013 at 02:41:17PM +, Jan Beulich wrote: On 04.01.13 at 15:22, Daniel Kiper <daniel.ki...@oracle.com> wrote: On Wed, Jan 02, 2013 at 11:26:43AM +, Andrew Cooper wrote: /sbin/kexec can load the Xen crash kernel itself by issuing hypercalls using /dev/xen/privcmd. This would remove the need for the dom0 kernel to distinguish between loading a crash kernel for itself and loading a kernel for Xen. Or is this just a silly idea complicating the matter? This is impossible with current Xen kexec/kdump interface. Why? Because current KEXEC_CMD_kexec_load does not load kernel image and other things into Xen memory. It means that it should live somewhere in dom0 Linux kernel memory.

We could have a very simple hypercall which would have:

    struct fancy_new_hypercall {
            xen_pfn_t payload;      // IN
            ssize_t len;            // IN
    #define DATA            (1<<1)
    #define DATA_EOF        (1<<2)
    #define DATA_KERNEL     (1<<3)
    #define DATA_RAMDISK    (1<<4)
            unsigned int flags;     // IN
            unsigned int status;    // OUT
    };

which would in a loop just iterate over the payloads and let the hypervisor stick them in the crashkernel space. This is all hand-waving of course. There would probably also be a need to figure out how much space you have in Xen's reserved 'crashkernel' memory region.

Daniel
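To make the "iterate over the payloads" idea above a bit more concrete, here is a rough user-space sketch in C. It is not Xen code and not a proposed ABI: the hypercall is replaced by a stub (fake_load_hypercall) that copies into a static buffer standing in for the reserved crashkernel region, the payload is an ordinary pointer rather than a xen_pfn_t, and the names load_chunk, load_image, CHUNK_SIZE and the flag values are all made up for illustration. It only shows the shape of the loop: split the image into page-sized pieces, mark the last one with an EOF flag, and stop when the reserved region runs out of space.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define CHUNK_SIZE 4096u                /* assumed page-sized payloads */

/* Placeholder flag values, for this sketch only. */
#define DATA_EOF     (1u << 0)
#define DATA_KERNEL  (1u << 1)
#define DATA_RAMDISK (1u << 2)

/* Hypothetical argument block, loosely following the struct in the mail. */
struct load_chunk {
    const unsigned char *payload;       /* IN: stands in for a xen_pfn_t */
    size_t len;                         /* IN */
    unsigned int flags;                 /* IN */
    unsigned int status;                /* OUT: 0 on success */
};

/* Stand-in for Xen's reserved 'crashkernel' region. */
static unsigned char crash_region[4 * CHUNK_SIZE];
static size_t crash_used;

/* Stub "hypercall": copy one chunk into the region if it still fits. */
static int fake_load_hypercall(struct load_chunk *c)
{
    assert(c != NULL);
    if (c->len > sizeof(crash_region) - crash_used) {
        c->status = 1;                  /* out of reserved space */
        return -1;
    }
    memcpy(crash_region + crash_used, c->payload, c->len);
    crash_used += c->len;
    c->status = 0;
    return 0;
}

/* Caller side: feed an image to the stub one chunk at a time. */
static int load_image(const unsigned char *img, size_t size, unsigned int kind)
{
    size_t off = 0;

    while (off < size) {
        struct load_chunk c;
        size_t n = size - off;

        if (n > CHUNK_SIZE)
            n = CHUNK_SIZE;
        c.payload = img + off;
        c.len = n;
        /* Tag the final chunk so the receiver knows the image is complete. */
        c.flags = kind | (off + n == size ? DATA_EOF : 0u);
        c.status = 0;
        if (fake_load_hypercall(&c) != 0)
            return -1;
        off += n;
    }
    return 0;
}
```

A caller would do something like load_image(kernel, kernel_size, DATA_KERNEL) followed by load_image(initrd, initrd_size, DATA_RAMDISK). A real interface would presumably also need a way to query the free space up front, as noted above, rather than failing part-way through a load.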
Re: [Xen-devel] [PATCH v3 00/11] xen: Initial kexec/kdump implementation
>>> On 02.01.13 at 12:26, Andrew Cooper wrote: > On 27/12/12 18:02, Eric W. Biederman wrote: >> It probably make sense to split them apart a little even. > > Thinking about this split, there might be a way to simply it even more. > > /sbin/kexec can load the "Xen" crash kernel itself by issuing hypercalls > using /dev/xen/privcmd. This would remove the need for the dom0 kernel > to distinguish between loading a crash kernel for itself and loading a > kernel for Xen. > > Or is this just a silly idea complicating the matter?

I don't think so (and suggested that before as a response to an earlier submission of this patch set), and it would make most of the discussion here moot.

Jan
Re: [PATCH v3 00/11] xen: Initial kexec/kdump implementation
On Thu, 2012-12-27 at 14:18 +, Andrew Cooper wrote: > Many cloud customers and service providers want the ability for a VM > administrator to be able to load a kdump/kexec kernel within a > domain[1]. This allows the VM administrator to take more proactive > steps to isolate the cause of a crash, the state of which is most likely > discarded while tearing down the domain. The result being that as far > as Xen is concerned, the domain is still alive, while the kdump > kernel/environment can work its usual magic. I am not aware of any > feature like this existing in the past. I have a feeling that some versions of the classic-Xen port supported domU kexec as well. Certainly there was some work on that back in 2005, although I can't see much evidence that that attempt ever went anywhere so maybe I'm imagining things. It's possible that I'm confusing domU kexec support with support for domU kexec in some dom0 kernels. That was/is used to support "kexec" from a PV bootloader into the real kernel (which looks to the host a lot like a domU kexec would). Ian. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [Xen-devel] [PATCH v3 00/11] xen: Initial kexec/kdump implementation
Andrew Cooper writes: > On 27/12/12 18:02, Eric W. Biederman wrote: >> Andrew Cooper writes: >> >>> On 27/12/2012 07:53, Eric W. Biederman wrote: The syscall ABI still has the wrong semantics. Aka totally unmaintainable and umergeable. The concept of domU support is also strange. What does domU support even mean, when the dom0 support is loading a kernel to pick up Xen when Xen falls over. >>> There are two requirements pulling at this patch series, but I agree >>> that we need to clarify them. >> It probably make sense to split them apart a little even. >> >> > > Thinking about this split, there might be a way to simply it even more. > > /sbin/kexec can load the "Xen" crash kernel itself by issuing > hypercalls using /dev/xen/privcmd. This would remove the need for the > dom0 kernel to distinguish between loading a crash kernel for itself > and loading a kernel for Xen. > > Or is this just a silly idea complicating the matter? At a first approximation it sounds reasonable. If the Xen kexec actually copies the loaded kernel to somewhere internal like the linux kexec that would be entirely reasonable. If Xen has other requirements on the dom0 case you might not be able to implement the call without linux kernel support. But if you can implement it all in terms of /dev/xen/privcmd go for it. Eric -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [Xen-devel] [PATCH v3 00/11] xen: Initial kexec/kdump implementation
On 27/12/12 18:02, Eric W. Biederman wrote: Andrew Cooper writes: On 27/12/2012 07:53, Eric W. Biederman wrote: The syscall ABI still has the wrong semantics. Aka totally unmaintainable and umergeable. The concept of domU support is also strange. What does domU support even mean, when the dom0 support is loading a kernel to pick up Xen when Xen falls over. There are two requirements pulling at this patch series, but I agree that we need to clarify them. It probably make sense to split them apart a little even. Thinking about this split, there might be a way to simply it even more. /sbin/kexec can load the "Xen" crash kernel itself by issuing hypercalls using /dev/xen/privcmd. This would remove the need for the dom0 kernel to distinguish between loading a crash kernel for itself and loading a kernel for Xen. Or is this just a silly idea complicating the matter? ~Andrew -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v3 00/11] xen: Initial kexec/kdump implementation
> Andrew Cooper writes: > > > On 27/12/2012 07:53, Eric W. Biederman wrote: > >> The syscall ABI still has the wrong semantics. > >> > >> Aka totally unmaintainable and umergeable. > >> > >> The concept of domU support is also strange. What does domU support even > >> mean, when the dom0 support is loading a kernel to pick up Xen when Xen > >> falls over. > > > > There are two requirements pulling at this patch series, but I agree > > that we need to clarify them. > > It probably make sense to split them apart a little even. > > > When dom0 loads a crash kernel, it is loading one for Xen to use. As a > > dom0 crash causes a Xen crash, having dom0 set up a kdump kernel for > > itself is completely useless. This ability is present in "classic Xen > > dom0" kernels, but the feature is currently missing in PVOPS. > > > Many cloud customers and service providers want the ability for a VM > > administrator to be able to load a kdump/kexec kernel within a > > domain[1]. This allows the VM administrator to take more proactive > > steps to isolate the cause of a crash, the state of which is most likely > > discarded while tearing down the domain. The result being that as far > > as Xen is concerned, the domain is still alive, while the kdump > > kernel/environment can work its usual magic. I am not aware of any > > feature like this existing in the past. > > Which makes domU support semantically just the normal kexec/kdump > support. Got it.

To some extent. It is true for HVM and PVonHVM guests. However, PV guests require a somewhat different kexec/kdump implementation than plain kexec/kdump. The proposed firmware support has almost all of the required features. The (few) PV-guest-specific features will be added later, after agreement on generic firmware support that is sufficient at least for dom0. It looks like I should replace domU with PV guest in the patch description.

> The point of implementing domU is for those times when the hypervisor > admin and the kernel admin are different.

Right.

> For domU support modifying or adding alternate versions of > machine_kexec.c and relocate_kernel.S to add paravirtualization support > make sense.

It is not sufficient. Please look above.

> There is the practical argument that for implementation efficiency of > crash dumps it would be better if that support came from the hypervisor > or the hypervisor environment. But this gets into the practical reality

I am thinking about that.

> that the hypervisor environment does not do that today. Furthermore > kexec all by itself working in a paravirtualized environment under Xen > makes sense. > > domU support is what Peter was worrying about for cleanliness, and > we need some x86 backend ops there, and generally to be careful.

As far as I know, we do not need any additional pv_ops stuff if we place all the needed things in the kexec firmware support.

Daniel
Re: [PATCH v3 00/11] xen: Initial kexec/kdump implementation
> On 12/26/2012 06:18 PM, Daniel Kiper wrote:
> > Hi,
> >
> > This set of patches contains initial kexec/kdump implementation for Xen v3.
> > Currently only dom0 is supported, however, almost all infrastructure
> > required for domU support is ready.
> >
> > Jan Beulich suggested to merge Xen x86 assembler code with baremetal x86
> > code. This could simplify and slightly reduce the size of the kernel code.
> > However, this solution requires some changes in baremetal x86 code. First
> > of all, the code which establishes the transition page table should be
> > moved back from machine_kexec_$(BITS).c to relocate_kernel_$(BITS).S.
> > Another important thing which should be changed in that case is the format
> > of the page_list array. The Xen kexec hypercall requires physical addresses
> > to alternate with virtual ones. These and other required changes have not
> > been done in this version because I am not sure that solution will be
> > accepted by kexec/kdump maintainers. I hope that this email sparks
> > discussion about that topic.
>
> I want a detailed list of the constraints that this assumes and
> therefore imposes on the native implementation as a result of this. We
> have had way too many patches where Xen PV hacks effectively nailgun
> arbitrary, and sometimes poor, design decisions in place and now we
> can't fix them.

OK, but now I think that we should leave this discussion until all
details of the generic kexec/kdump code are agreed. Sorry for that.

Daniel
Re: [PATCH v3 00/11] xen: Initial kexec/kdump implementation
Andrew Cooper writes:

> On 27/12/2012 07:53, Eric W. Biederman wrote:
>> The syscall ABI still has the wrong semantics.
>>
>> Aka totally unmaintainable and unmergeable.
>>
>> The concept of domU support is also strange. What does domU support even
>> mean, when the dom0 support is loading a kernel to pick up Xen when Xen
>> falls over?
>
> There are two requirements pulling at this patch series, but I agree
> that we need to clarify them.

It probably makes sense to split them apart a little even.

> When dom0 loads a crash kernel, it is loading one for Xen to use. As a
> dom0 crash causes a Xen crash, having dom0 set up a kdump kernel for
> itself is completely useless. This ability is present in "classic Xen
> dom0" kernels, but the feature is currently missing in PVOPS.

> Many cloud customers and service providers want the ability for a VM
> administrator to be able to load a kdump/kexec kernel within a
> domain[1]. This allows the VM administrator to take more proactive
> steps to isolate the cause of a crash, the state of which is most likely
> discarded while tearing down the domain. The result is that as far
> as Xen is concerned, the domain is still alive, while the kdump
> kernel/environment can work its usual magic. I am not aware of any
> feature like this existing in the past.

Which makes domU support semantically just the normal kexec/kdump
support. Got it.

The point of implementing domU is for those times when the hypervisor
admin and the kernel admin are different.

For domU support modifying or adding alternate versions of
machine_kexec.c and relocate_kernel.S to add paravirtualization support
makes sense.

There is the practical argument that for implementation efficiency of
crash dumps it would be better if that support came from the hypervisor
or the hypervisor environment. But this gets into the practical reality
that the hypervisor environment does not do that today. Furthermore
kexec all by itself working in a paravirtualized environment under Xen
makes sense.

domU support is what Peter was worrying about for cleanliness, and
we need some x86 backend ops there, and generally to be careful.

For dom0 support we need to extend the kexec_load system call, and
get it right.

When we are done I expect both dom0 and domU support of kexec to work
in dom0. I don't know if the normal kexec or kdump case will ever make
sense in dom0 but there is no reason for that case to be broken.

> ~Andrew
>
> [1] http://lists.xen.org/archives/html/xen-devel/2012-11/msg01274.html

Eric
Re: [PATCH v3 00/11] xen: Initial kexec/kdump implementation
On 27/12/2012 07:53, Eric W. Biederman wrote:
> The syscall ABI still has the wrong semantics.
>
> Aka totally unmaintainable and unmergeable.
>
> The concept of domU support is also strange. What does domU support even
> mean, when the dom0 support is loading a kernel to pick up Xen when Xen
> falls over?

There are two requirements pulling at this patch series, but I agree
that we need to clarify them.

When dom0 loads a crash kernel, it is loading one for Xen to use. As a
dom0 crash causes a Xen crash, having dom0 set up a kdump kernel for
itself is completely useless. This ability is present in "classic Xen
dom0" kernels, but the feature is currently missing in PVOPS.

Many cloud customers and service providers want the ability for a VM
administrator to be able to load a kdump/kexec kernel within a
domain[1]. This allows the VM administrator to take more proactive
steps to isolate the cause of a crash, the state of which is most likely
discarded while tearing down the domain. The result is that as far
as Xen is concerned, the domain is still alive, while the kdump
kernel/environment can work its usual magic. I am not aware of any
feature like this existing in the past.

~Andrew

[1] http://lists.xen.org/archives/html/xen-devel/2012-11/msg01274.html

> I expect a lot of decisions about what code can be shared and what code
> can't are going to be driven by the simple question: what does the
> syscall mean?
>
> Sharing machine_kexec.c and relocate_kernel.S does not make much sense to
> me when what you are doing is effectively passing your arguments through
> to the Xen version of kexec.
>
> Either Xen has its own version of those routines or I expect the Xen
> version of kexec is buggy. I can't imagine what sharing that code would
> mean. By the same token I can't see any need to duplicate the code either.
>
> Furthermore since this is just passing data from one version of the
> syscall to another I expect you can share the majority of the code across
> all architectures that implement Xen. The only part I can see being arch
> specific is the Xen syscall stub.
>
> With respect to the proposed semantics of silently giving the kexec
> system call a different meaning when running under Xen, /sbin/kexec has
> to act somewhat differently when loading code into the Xen hypervisor so
> there is no point not making that explicit in the ABI.
>
> Eric
Re: [PATCH v3 00/11] xen: Initial kexec/kdump implementation
The syscall ABI still has the wrong semantics.

Aka totally unmaintainable and unmergeable.

The concept of domU support is also strange. What does domU support even
mean, when the dom0 support is loading a kernel to pick up Xen when Xen
falls over?

I expect a lot of decisions about what code can be shared and what code
can't are going to be driven by the simple question: what does the
syscall mean?

Sharing machine_kexec.c and relocate_kernel.S does not make much sense to
me when what you are doing is effectively passing your arguments through
to the Xen version of kexec.

Either Xen has its own version of those routines or I expect the Xen
version of kexec is buggy. I can't imagine what sharing that code would
mean. By the same token I can't see any need to duplicate the code either.

Furthermore since this is just passing data from one version of the
syscall to another I expect you can share the majority of the code across
all architectures that implement Xen. The only part I can see being arch
specific is the Xen syscall stub.

With respect to the proposed semantics of silently giving the kexec
system call a different meaning when running under Xen, /sbin/kexec has
to act somewhat differently when loading code into the Xen hypervisor so
there is no point not making that explicit in the ABI.

Eric
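One way to read the last point above is that the caller states the load target in the flags word, instead of the kernel inferring it from the fact that it runs under Xen. A minimal C sketch of that idea follows. Every name and flag value in it is hypothetical (the DEMO_ prefix marks them); none of it is taken from the patch series or from linux/kexec.h.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical flag bits for this sketch only; NOT the values in linux/kexec.h. */
#define DEMO_KEXEC_ON_CRASH    0x1UL
#define DEMO_KEXEC_HYPERVISOR  0x2UL  /* hypothetical: image is meant for the hypervisor */
#define DEMO_KEXEC_KNOWN_FLAGS (DEMO_KEXEC_ON_CRASH | DEMO_KEXEC_HYPERVISOR)

enum demo_target { DEMO_TARGET_KERNEL, DEMO_TARGET_HYPERVISOR };

/*
 * Choose the load target from the flags the caller passed.
 * Returns 0 and sets *out on success, -EINVAL on a bad request.
 * The target never depends silently on hypervisor_present: without
 * the explicit flag the image is always for the running kernel.
 */
int demo_select_target(unsigned long flags, int hypervisor_present,
                       enum demo_target *out)
{
    assert(out != NULL);

    if (flags & ~DEMO_KEXEC_KNOWN_FLAGS)
        return -EINVAL;             /* reject unknown flag bits */

    if (flags & DEMO_KEXEC_HYPERVISOR) {
        if (!hypervisor_present)
            return -EINVAL;         /* nothing to hand the image to */
        *out = DEMO_TARGET_HYPERVISOR;
        return 0;
    }

    *out = DEMO_TARGET_KERNEL;
    return 0;
}
```

With this shape, a plain call with no flag keeps its usual meaning inside dom0, and a request for the hypervisor target fails loudly when no hypervisor is present rather than being reinterpreted.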
Re: [PATCH v3 00/11] xen: Initial kexec/kdump implementation
On 12/26/2012 06:18 PM, Daniel Kiper wrote:
> Hi,
>
> This set of patches contains initial kexec/kdump implementation for Xen v3.
> Currently only dom0 is supported, however, almost all infrastructure
> required for domU support is ready.
>
> Jan Beulich suggested to merge Xen x86 assembler code with baremetal x86
> code. This could simplify and slightly reduce the size of the kernel code.
> However, this solution requires some changes in baremetal x86 code. First
> of all, the code which establishes the transition page table should be
> moved back from machine_kexec_$(BITS).c to relocate_kernel_$(BITS).S.
> Another important thing which should be changed in that case is the format
> of the page_list array. The Xen kexec hypercall requires physical addresses
> to alternate with virtual ones. These and other required changes have not
> been done in this version because I am not sure that solution will be
> accepted by kexec/kdump maintainers. I hope that this email sparks
> discussion about that topic.

I want a detailed list of the constraints that this assumes and
therefore imposes on the native implementation as a result of this. We
have had way too many patches where Xen PV hacks effectively nailgun
arbitrary, and sometimes poor, design decisions in place and now we
can't fix them.

	-hpa

--
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel. I don't speak on their behalf.
[PATCH v3 00/11] xen: Initial kexec/kdump implementation
Hi,

This set of patches contains initial kexec/kdump implementation for Xen v3.
Currently only dom0 is supported, however, almost all infrastructure
required for domU support is ready.

Jan Beulich suggested to merge Xen x86 assembler code with baremetal x86
code. This could simplify and slightly reduce the size of the kernel code.
However, this solution requires some changes in baremetal x86 code. First
of all, the code which establishes the transition page table should be
moved back from machine_kexec_$(BITS).c to relocate_kernel_$(BITS).S.
Another important thing which should be changed in that case is the format
of the page_list array. The Xen kexec hypercall requires physical addresses
to alternate with virtual ones. These and other required changes have not
been done in this version because I am not sure that solution will be
accepted by kexec/kdump maintainers. I hope that this email sparks
discussion about that topic.

Daniel

 arch/x86/Kconfig                     |    3 +
 arch/x86/include/asm/kexec.h         |   10 +-
 arch/x86/include/asm/xen/hypercall.h |    6 +
 arch/x86/include/asm/xen/kexec.h     |   79
 arch/x86/kernel/machine_kexec_64.c   |   12 +-
 arch/x86/kernel/vmlinux.lds.S        |    7 +-
 arch/x86/xen/Kconfig                 |    1 +
 arch/x86/xen/Makefile                |    3 +
 arch/x86/xen/enlighten.c             |   11 +
 arch/x86/xen/kexec.c                 |  150 +++
 arch/x86/xen/machine_kexec_32.c      |  226 +++
 arch/x86/xen/machine_kexec_64.c      |  318 +++
 arch/x86/xen/relocate_kernel_32.S    |  323 +++
 arch/x86/xen/relocate_kernel_64.S    |  309 ++
 drivers/xen/sys-hypervisor.c         |   42 ++-
 include/linux/kexec.h                |   26 ++-
 include/xen/interface/xen.h          |   33 ++
 kernel/Makefile                      |    1 +
 kernel/kexec-firmware.c              |  743 ++
 kernel/kexec.c                       |   46 ++-
 20 files changed, 2331 insertions(+), 18 deletions(-)

Daniel Kiper (11):
      kexec: introduce kexec firmware support
      x86/kexec: Add extra pointers to transition page table PGD, PUD, PMD and PTE
      xen: Introduce architecture independent data for kexec/kdump
      x86/xen: Introduce architecture dependent data for kexec/kdump
      x86/xen: Register resources required by kexec-tools
      x86/xen: Add i386 kexec/kdump implementation
      x86/xen: Add x86_64 kexec/kdump implementation
      x86/xen: Add kexec/kdump Kconfig and makefile rules
      x86/xen/enlighten: Add init and crash kexec/kdump hooks
      drivers/xen: Export vmcoreinfo through sysfs
      x86: Add Xen kexec control code size check to linker script
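The page_list change described in the cover letter, where physical addresses alternate with virtual ones, can be pictured with a small C sketch. The page kinds, the DEMO_ names, and the fixed direct-map offset are assumptions made for illustration only; the actual slot layout is defined by the Xen kexec hypercall interface and is not reproduced here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical page kinds, chosen for illustration only. */
enum demo_page {
    DEMO_CONTROL_PAGE,
    DEMO_PGD,
    DEMO_PUD,
    DEMO_PMD,
    DEMO_PTE,
    DEMO_NR_PAGES
};

/* Each page takes two consecutive slots: physical address, then virtual. */
#define DEMO_PA_SLOT(page)  (2 * (size_t)(page))
#define DEMO_VA_SLOT(page)  (2 * (size_t)(page) + 1)
#define DEMO_PAGE_LIST_LEN  (2 * DEMO_NR_PAGES)

/* Assumed constant offset; stands in for a kernel's virt-to-phys helper. */
#define DEMO_DIRECT_MAP_OFFSET UINT64_C(0xffff880000000000)

uint64_t demo_virt_to_phys(uint64_t va)
{
    assert(va >= DEMO_DIRECT_MAP_OFFSET);
    return va - DEMO_DIRECT_MAP_OFFSET;
}

/* Fill page_list so that slot 2*i holds the PA and slot 2*i+1 the VA of page i. */
void demo_fill_page_list(uint64_t page_list[DEMO_PAGE_LIST_LEN],
                         const uint64_t va_of_page[DEMO_NR_PAGES])
{
    for (size_t page = 0; page < DEMO_NR_PAGES; page++) {
        page_list[DEMO_PA_SLOT(page)] = demo_virt_to_phys(va_of_page[page]);
        page_list[DEMO_VA_SLOT(page)] = va_of_page[page];
    }
}
```

A list indexed only by physical-address slots would have to be widened in this way, which is one concrete example of the kind of constraint on the native code that the cover letter raises.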