Re: [Xen-devel] [PATCH v3 00/11] xen: Initial kexec/kdump implementation

2013-01-11 Thread H. Peter Anvin

On 01/11/2013 01:08 PM, Vivek Goyal wrote:
> >
> > A signed /sbin/kexec would realistically have to be statically linked,
> > at least in the short term; otherwise the libraries and ld.so would need
> > verification as well.
>
> Yes. That's the expectation. Sign only statically linked executables which
> don't do any of dlopen() stuff either.
>
> In fact in the patch, I fail the exec() if signed executable has
> interpreter.
>

As I said, though (and possibly not for kexec, that depends): in the
long term we probably want a way to be able to sign all kinds of binaries
in the system.


-hpa


--
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel.  I don't speak on their behalf.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [Xen-devel] [PATCH v3 00/11] xen: Initial kexec/kdump implementation

2013-01-11 Thread Vivek Goyal
On Fri, Jan 11, 2013 at 01:03:41PM -0800, H. Peter Anvin wrote:
> On 01/11/2013 12:52 PM, Vivek Goyal wrote:
> > 
> > Eric,
> > 
> > In a private conversation, David Howells suggested why not pass kernel
> > signature in a segment to kernel and kernel can do the verification.
> > 
> > /sbin/kexec signature is verified by kernel at exec() time. Then
> > /sbin/kexec just passes one signature segment (after regular segment) for
> > each segment being loaded. The segments which don't have signature,
> > are passed with section size 0. And signature passing behavior can be
> > controlled by one new kexec flag.
> > 
> > That way /sbin/kexec does not have to worry about doing any verification
> > by itself. In fact, I am not sure how it can do the verification when
> > crypto libraries it will need are not signed (assuming they are not
> > statically linked in).
> > 
> > What do you think about this idea?
> > 
> 
> A signed /sbin/kexec would realistically have to be statically linked,
> at least in the short term; otherwise the libraries and ld.so would need
> verification as well.

Yes. That's the expectation. Sign only statically linked executables which
don't do any of dlopen() stuff either.

In fact in the patch, I fail the exec() if signed executable has
interpreter.
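
For illustration, the interpreter check described above could look roughly like the user-space sketch below. It is not the code from the patch: the helper name elf64_has_interpreter() is made up here, only 64-bit ELF images are handled, and the real check would sit in the kernel's exec() path. The idea is simply that a PT_INTERP program header means the binary asks for a dynamic loader such as ld.so, so a signed binary carrying one gets refused.

```c
#include <assert.h>
#include <elf.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, not taken from the patch.
 * Returns 1 if the 64-bit ELF image has a PT_INTERP program header,
 * 0 if it has none, and -1 if the image is malformed or not ELF64.
 */
int elf64_has_interpreter(const unsigned char *image, size_t len)
{
    Elf64_Ehdr ehdr;

    assert(image != NULL);
    if (len < sizeof(ehdr))
        return -1;
    memcpy(&ehdr, image, sizeof(ehdr));

    if (memcmp(ehdr.e_ident, ELFMAG, SELFMAG) != 0 ||
        ehdr.e_ident[EI_CLASS] != ELFCLASS64)
        return -1;
    if (ehdr.e_phentsize != sizeof(Elf64_Phdr) || ehdr.e_phoff > len)
        return -1;

    for (size_t i = 0; i < ehdr.e_phnum; i++) {
        Elf64_Phdr phdr;
        size_t off = (size_t)ehdr.e_phoff + i * sizeof(phdr);

        /* Reject program header tables that run past the image. */
        if (off > len || len - off < sizeof(phdr))
            return -1;
        memcpy(&phdr, image + off, sizeof(phdr));
        if (phdr.p_type == PT_INTERP)
            return 1;
    }
    return 0;
}
```

A signing policy along the lines of this thread would treat both 1 and -1 as a reason to fail the exec() of a signed binary.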

Thanks
Vivek


Re: [Xen-devel] [PATCH v3 00/11] xen: Initial kexec/kdump implementation

2013-01-11 Thread H. Peter Anvin
On 01/11/2013 12:52 PM, Vivek Goyal wrote:
> 
> Eric,
> 
> In a private conversation, David Howells suggested why not pass kernel
> signature in a segment to kernel and kernel can do the verification.
> 
> /sbin/kexec signature is verified by kernel at exec() time. Then
> /sbin/kexec just passes one signature segment (after regular segment) for
> each segment being loaded. The segments which don't have signature,
> are passed with section size 0. And signature passing behavior can be
> controlled by one new kexec flag.
> 
> That way /sbin/kexec does not have to worry about doing any verification
> by itself. In fact, I am not sure how it can do the verification when
> crypto libraries it will need are not signed (assuming they are not
> statically linked in).
> 
> What do you think about this idea?
> 

A signed /sbin/kexec would realistically have to be statically linked,
at least in the short term; otherwise the libraries and ld.so would need
verification as well.

Now, that *might* very well have some real value -- there are certainly
users out there who would very much want only binaries signed with
specific keys to get run on their system.

-hpa




Re: [Xen-devel] [PATCH v3 00/11] xen: Initial kexec/kdump implementation

2013-01-11 Thread Vivek Goyal
On Fri, Jan 11, 2013 at 12:26:56PM -0800, Eric W. Biederman wrote:

[..]
> Recently there is a desire to figure out how to make /sbin/kexec support
> signed kernel images.  What will probably happen is to have a specially
> trusted userspace application perform the verification.  Sort of like
> dom0 for the linux userspace.  A few other ideas have been batted around
> but none that have stuck.

[ CC David Howells ]

Eric,

In a private conversation, David Howells suggested why not pass kernel
signature in a segment to kernel and kernel can do the verification.

/sbin/kexec signature is verified by kernel at exec() time. Then
/sbin/kexec just passes one signature segment (after regular segment) for
each segment being loaded. The segments which don't have signature,
are passed with section size 0. And signature passing behavior can be
controlled by one new kexec flag.
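
A rough user-space sketch of the layout described above, where every regular segment is followed by one signature segment and a missing signature is passed with size 0. The names struct signature and interleave_signatures() are hypothetical, the segment structure follows the layout quoted elsewhere in this thread, and leaving mem/memsz of a signature segment at 0 is an assumption, since the proposal does not say where signatures would be placed. The new kexec flag itself is not shown because no value has been proposed.

```c
#include <assert.h>
#include <stddef.h>

/* Same field layout as the kexec segment structure quoted in this thread. */
struct kexec_segment {
    void *buf;          /* source buffer in user space */
    size_t bufsz;       /* number of bytes to copy from buf */
    unsigned long mem;  /* destination physical address */
    size_t memsz;       /* bytes reserved at the destination */
};

/* Hypothetical container for an optional detached signature. */
struct signature {
    void *data;         /* NULL when the segment is unsigned */
    size_t len;         /* 0 when the segment is unsigned */
};

/*
 * Writes 2 * nr entries to out: out[2*i] is the regular segment and
 * out[2*i + 1] is its signature segment (bufsz == 0 if unsigned).
 * Returns the number of entries written.
 */
size_t interleave_signatures(const struct kexec_segment *segs,
                             const struct signature *sigs,
                             size_t nr, struct kexec_segment *out)
{
    assert(segs != NULL && sigs != NULL && out != NULL);

    for (size_t i = 0; i < nr; i++) {
        out[2 * i] = segs[i];
        out[2 * i + 1] = (struct kexec_segment){
            .buf = sigs[i].data,
            .bufsz = sigs[i].len,
            .mem = 0,    /* assumption: signatures are not loaded anywhere */
            .memsz = 0,
        };
    }
    return 2 * nr;
}
```

With this layout the kernel side can pair entries by index alone, which is why unsigned segments still need a zero-sized placeholder.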

That way /sbin/kexec does not have to worry about doing any verification
by itself. In fact, I am not sure how it can do the verification when
crypto libraries it will need are not signed (assuming they are not
statically linked in).

What do you think about this idea?

Thanks
Vivek


Re: [Xen-devel] [PATCH v3 00/11] xen: Initial kexec/kdump implementation

2013-01-11 Thread Vivek Goyal
On Fri, Jan 11, 2013 at 12:26:48PM -0800, H. Peter Anvin wrote:
> >
> >And there is nothing fancy to be done for EFI and SecureBoot? Or is
> >that something that the kernel has to handle on its own (so somehow
> >passing some certificates to somewhere).
> >
> 
> For EFI, no... other than passing the EFI parameters, which
> apparently is *not* currently done (David Woodhouse is working on
> it.)  Secure boot is still a work in progress.

For secureboot, as a first step in that direction, I just wrote some code
to sign ELF executables and to verify them in the kernel upon exec(). I
am planning to post RFC code soon (most likely next week).

Hopefully we will be able to sign a statically linked /sbin/kexec and give
it an extra capability (upon signature verification) to be able to call
sys_exec().

Thanks
Vivek


Re: [Xen-devel] [PATCH v3 00/11] xen: Initial kexec/kdump implementation

2013-01-11 Thread H. Peter Anvin


> And there is nothing fancy to be done for EFI and SecureBoot? Or is
> that something that the kernel has to handle on its own (so somehow
> passing some certificates to somewhere).



For EFI, no... other than passing the EFI parameters, which apparently 
is *not* currently done (David Woodhouse is working on it.)  Secure boot 
is still a work in progress.



--
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel.  I don't speak on their behalf.



Re: [Xen-devel] [PATCH v3 00/11] xen: Initial kexec/kdump implementation

2013-01-11 Thread Eric W. Biederman
Konrad Rzeszutek Wilk  writes:

> On Thu, Jan 10, 2013 at 08:16:48PM -0800, Eric W. Biederman wrote:

>> The basic kexec interface is.
>> 
>> load ranges of virtual addresses to physical addresses.
>> jump to the physical address  with identity mapped page tables.
>> 
>> There are a few flags to allow for different usage scenarios like
>> kexec on panic vs normal kexec.
>
> And there is nothing fancy to be done for EFI and SecureBoot?

There is a mess with EFI.  Reports are that EFI is a bug ridden pile,
and people keep advocating that we make more and more EFI calls in the
main kernel.  There is an argument over set_virtual_mapping, which is a
call that can be made only once which relocates the EFI code to a
different address, which makes life inconvenient for kexec.  There is
another argument that EFI doesn't actually work if you don't make the
set_virtual_mapping call so we can't remove it and always use physical
addresses.

Frankly the only sane way to run a linux kernel under EFI is to scrape
up the information needed to talk to the hardware directly and ignore
EFI.  That is what we have historically done in the face of BIOS madness
and if anything the situation is worse with EFI, but it looks like we
are going to have to learn that the hard way.

Recently there is a desire to figure out how to make /sbin/kexec support
signed kernel images.  What will probably happen is to have a specially
trusted userspace application perform the verification.  Sort of like
dom0 for the linux userspace.  A few other ideas have been batted around
but none that have stuck.

None of that is really about SecureBoot.  It is all trusting the kernel
binary but not trusting userspace.  With SecureBoot being an excuse for
coming up with a policy like that.

It looks like the answer to SecureBoot at this point may simply be just
reconfigure your BIOS or root Windows and EFI to get the hardware to do
what you want.

So the answer looking forward for Xen dom0 is: A trusted /sbin/kexec
won't require changes.  The other suggested solution is a flag that says a
specific chunk of the loaded image is a signature that the magic trust
fairies can verify.  As long as you have a flag bit free you should be
able to implement that policy if we ever implement it.

Eric


Re: [Xen-devel] [PATCH v3 00/11] xen: Initial kexec/kdump implementation

2013-01-11 Thread Eric W. Biederman
David Vrabel  writes:

> On 11/01/13 13:22, Daniel Kiper wrote:
>> On Thu, Jan 10, 2013 at 02:19:55PM +, David Vrabel wrote:
>>> On 04/01/13 17:01, Daniel Kiper wrote:
>>>> My .5 cents:
>>>>   - We should focus on KEXEC_CMD_kexec_load and KEXEC_CMD_kexec_unload;
>>>> probably we should introduce KEXEC_CMD_kexec_load2 and
>>>> KEXEC_CMD_kexec_unload2;
>>>> load should __LOAD__ kernel image and other things into hypervisor
>>>> memory;
>>>
>>> Yes, but I don't see how we can easily support both ABIs easily.  I'd be
>>> in favour of replacing the existing hypercalls and requiring updated
>>> kexec tools in dom0 (this isn't that different to requiring the correct
>>> libxc in dom0).
>> 
>> Why? Just define new structures for new functions of kexec hypercall.
>> That should suffice.
>
> The current hypervisor ABI depends on an internal kernel ABI (i.e., the
> ABI provided by relocate_kernel).  We do not want hypervisor internals
> to be constrained by having to be compatible with kernel internals.

I think this is violent agreement.  A new call with new arguments seems
agreed upon.  The only question seems to be what happens to the old
hypercall: either keep the current, deprecated hypercall with its current
ABI and stop updating it, or modify it to return the Xen equivalent of
-ENOSYS.

Certainly /sbin/kexec will only support the new hypercall once the
support has merged.

>> No, please do not do that. When you call HYPERVISOR_kexec_op(KEXEC_CMD_kexec)
>> system is completely shutdown. Return from
>> HYPERVISOR_kexec_op(KEXEC_CMD_kexec)
>> would require to restore some kernel functionalities. It may be impossible
>> in some cases. Additionally, it means that some changes should be made
>> in generic kexec code path. As I know kexec maintainers are very reluctant
>> to make such things.
>
> Huh?  There only needs to be a call to a new hypervisor_crash_kexec()
> function (which would then call the Xen specific crash hypercall) at the
> very beginning of crash_kexec().  If this returns the normal
> crash/shutdown path is done (which could even include a guest kexec!).

Can you imagine what crash_kexec would look like if every architecture
would hard code their own little piece in there?

The practical issue with changing crash_kexec is that you are hard
coding Xen policy just before a jump to a piece of code whose purpose
is to implement policy.

From a maintenance and code comprehension standpoint it is much cleaner
to put the hypervisor_crash_kexec() hypercall into the code that is
loaded with sys_kexec_load and is branched to by crash_kexec.  I would
have no problem with hard coding that behavior into /sbin/kexec in
the case of Xen dom0.

Having any code have different semantics when running under Xen is a
maintenance nightmare, and why we are having the conversation years and
years after the initial deployment of Xen.  A tiny hard coded stub that
calls a hypercall should work indefinitely with no one having to do
anything.
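
To make the shape of that argument concrete, here is a small user-space model of it. Nothing here is kernel code: generic_crash_kexec(), load_crash_image() and xen_crash_stub() are made-up names, and the counter merely stands in for issuing the Xen crash hypercall. The point of the sketch is that the generic crash path only jumps to whatever entry point was loaded, while the Xen-specific behaviour lives entirely in the loaded stub.

```c
#include <assert.h>
#include <stddef.h>

typedef void (*crash_entry_fn)(void);

/* Entry point of whatever image was loaded for the crash case. */
static crash_entry_fn crash_image_entry;

/* Stand-in for "the Xen crash hypercall was issued". */
int xen_crash_hypercalls;

/* Models the tiny stub that /sbin/kexec would load on a Xen dom0. */
void xen_crash_stub(void)
{
    xen_crash_hypercalls++;
}

/* Models loading an image for the crash case. */
void load_crash_image(crash_entry_fn entry)
{
    assert(entry != NULL);
    crash_image_entry = entry;
}

/*
 * Models the generic crash path: it knows nothing about Xen and simply
 * branches to the loaded entry point. Returns -1 if nothing is loaded.
 */
int generic_crash_kexec(void)
{
    if (crash_image_entry == NULL)
        return -1;
    crash_image_entry();
    return 0;
}
```

In this model a bare-metal system would load a different entry point, and generic_crash_kexec() would not change at all.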

Eric



Re: [Xen-devel] [PATCH v3 00/11] xen: Initial kexec/kdump implementation

2013-01-11 Thread Daniel Kiper
On Fri, Jan 11, 2013 at 03:22:35PM +, David Vrabel wrote:
> On 11/01/13 13:22, Daniel Kiper wrote:
> > On Thu, Jan 10, 2013 at 02:19:55PM +, David Vrabel wrote:
> >> On 04/01/13 17:01, Daniel Kiper wrote:
> >>> My .5 cents:
> >>>   - We should focus on KEXEC_CMD_kexec_load and KEXEC_CMD_kexec_unload;
> >>> probably we should introduce KEXEC_CMD_kexec_load2 and 
> >>> KEXEC_CMD_kexec_unload2;
> >>> load should __LOAD__ kernel image and other things into hypervisor 
> >>> memory;
> >>
> >> Yes, but I don't see how we can easily support both ABIs easily.  I'd be
> >> in favour of replacing the existing hypercalls and requiring updated
> >> kexec tools in dom0 (this isn't that different to requiring the correct
> >> libxc in dom0).
> >
> > Why? Just define new structures for new functions of kexec hypercall.
> > That should suffice.
>
> The current hypervisor ABI depends on an internal kernel ABI (i.e., the
> ABI provided by relocate_kernel).  We do not want hypervisor internals
> to be constrained by having to be compatible with kernel internals.

I agree. I did not suggest staying with the current interface. The old
KEXEC_CMD_kexec_load and KEXEC_CMD_kexec_unload should stay as they are for
backward compatibility (maybe someday they should be removed). However, I do
not see any problem in adding new KEXEC_CMD_kexec_load2 and
KEXEC_CMD_kexec_unload2 functions with completely new arguments to the
existing kexec hypercall. Let's say something like this:

struct kexec_segment {
  void *buf;
  size_t bufsz;
  unsigned long mem;
  size_t memsz;
};

struct xen_kexec_load2 {
  unsigned long entry;
  unsigned long nr_segments;
  struct kexec_segment *segments;
  unsigned long flags;
};

struct xen_kexec_load2 xkl2;

...

rc = HYPERVISOR_kexec_op(KEXEC_CMD_kexec_load2, &xkl2);
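
As a self-contained illustration of how the proposed structures could be filled in, the sketch below stubs out the hypercall so it can be compiled and run anywhere. The structures follow the proposal above; the sub-op number, the stubbed HYPERVISOR_kexec_op() with its checks, and the wrapper xen_load_image() are placeholders invented for this example and say nothing about what Xen would really do.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct kexec_segment {
    void *buf;
    size_t bufsz;
    unsigned long mem;
    size_t memsz;
};

struct xen_kexec_load2 {
    unsigned long entry;
    unsigned long nr_segments;
    struct kexec_segment *segments;
    unsigned long flags;
};

/* Placeholder sub-op number; the thread does not assign one. */
#define KEXEC_CMD_kexec_load2 100

/* Stub standing in for the real hypercall (assumption, not Xen code). */
static int HYPERVISOR_kexec_op(unsigned long cmd, void *arg)
{
    const struct xen_kexec_load2 *load = arg;

    if (cmd != KEXEC_CMD_kexec_load2 || load == NULL)
        return -ENOSYS;
    if (load->nr_segments == 0 || load->segments == NULL)
        return -EINVAL;
    return 0;
}

/* Hypothetical wrapper: describe the image and hand it to the hypervisor. */
int xen_load_image(unsigned long entry, struct kexec_segment *segments,
                   unsigned long nr_segments, unsigned long flags)
{
    struct xen_kexec_load2 xkl2 = {
        .entry = entry,
        .nr_segments = nr_segments,
        .segments = segments,
        .flags = flags,
    };

    assert(nr_segments == 0 || segments != NULL);
    return HYPERVISOR_kexec_op(KEXEC_CMD_kexec_load2, &xkl2);
}
```

Keeping the argument layout close to the kexec syscall means a caller can pass its existing segment array through almost unchanged.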

Regarding relocate_kernel(), it should be Xen hypervisor specific, but
probably most of the code will be similar to its Linux kernel version.
At the end it only has to leave the machine in a state identical to the one
left by the Linux kernel version of relocate_kernel(), just to be compatible
with existing kexec/kdump implementations.

> >>>   - Hmmm... Now I think that we should still use kexec syscall to load 
> >>> image
> >>> into Xen memory (with new KEXEC_CMD_kexec_load2) because it 
> >>> establishes
> >>> all things which are needed to call kdump if dom0 crashes; however,
> >>> I could be wrong...
> >>
> >> I don't think we need the kexec syscall.  The kernel can unconditionally
> >> do the crash hypercall, which will return if the kdump kernel isn't
> >> loaded and the kernel can fall back to the regular non-kexec panic.
> >
> > No, please do not do that. When you call
> > HYPERVISOR_kexec_op(KEXEC_CMD_kexec)
> > system is completely shutdown. Return from
> > HYPERVISOR_kexec_op(KEXEC_CMD_kexec)
> > would require to restore some kernel functionalities. It may be impossible
> > in some cases. Additionally, it means that some changes should be made
> > in generic kexec code path. As I know kexec maintainers are very reluctant
> > to make such things.
>
> Huh?  There only needs to be a call to a new hypervisor_crash_kexec()
> function (which would then call the Xen specific crash hypercall) at the
> very beginning of crash_kexec().  If this returns the normal
> crash/shutdown path is done (which could even include a guest kexec!).

I am still not convinced. However, go ahead with your vision in this case.
Later we will see whether it makes sense.

Daniel


Re: [Xen-devel] [PATCH v3 00/11] xen: Initial kexec/kdump implementation

2013-01-11 Thread Konrad Rzeszutek Wilk
On Thu, Jan 10, 2013 at 08:16:48PM -0800, Eric W. Biederman wrote:
> Konrad Rzeszutek Wilk  writes:
> 
> > On Mon, Jan 07, 2013 at 01:34:04PM +0100, Daniel Kiper wrote:
> >> I think that new kexec hypercall function should mimic kexec syscall.
> >> It means that all arguments passed to hypercall should have same types
> >> if it is possible or if it is not possible then conversion should be done
> >> in very easy way. Additionally, I think that one call of new hypercall
> >> load function should load all needed things in right place and
> >> return relevant status. Last but not least, new functionality should
> >
> > We are not restricted to just _one_ hypercall. And this loading
> > thing could be similar to the microcode hypercall - which just points
> > to a virtual address along with the length - and says 'load me'.
> >
> >> be available through /dev/xen/privcmd or directly from kernel without
> >> bigger effort.
> >
> > Perhaps we should have an email thread on xen-devel where we hash out
> > some ideas. Eric, would you be OK included on this - it would make
> > sense for this mechanism to be as future-proof as possible - and I am not
> > sure what your plans for kexec are in the future?
> 
> The basic kexec interface is.
> 
> load ranges of virtual addresses to physical addresses.
> jump to the physical address  with identity mapped page tables.
> 
> There are a few flags to allow for different usage scenarios like
> kexec on panic vs normal kexec.

And there is nothing fancy to be done for EFI and SecureBoot? Or is
that something that the kernel has to handle on its own (so somehow
passing some certificates to somewhere).

> 
> It is very very simple and very extensible.  All of the weird glue
> happens in userspace.
> 
> Eric


Re: [Xen-devel] [PATCH v3 00/11] xen: Initial kexec/kdump implementation

2013-01-11 Thread David Vrabel
On 11/01/13 13:22, Daniel Kiper wrote:
> On Thu, Jan 10, 2013 at 02:19:55PM +, David Vrabel wrote:
>> On 04/01/13 17:01, Daniel Kiper wrote:
>>> My .5 cents:
>>>   - We should focus on KEXEC_CMD_kexec_load and KEXEC_CMD_kexec_unload;
>>> probably we should introduce KEXEC_CMD_kexec_load2 and 
>>> KEXEC_CMD_kexec_unload2;
>>> load should __LOAD__ kernel image and other things into hypervisor 
>>> memory;
>>
>> Yes, but I don't see how we can easily support both ABIs easily.  I'd be
>> in favour of replacing the existing hypercalls and requiring updated
>> kexec tools in dom0 (this isn't that different to requiring the correct
>> libxc in dom0).
> 
> Why? Just define new structures for new functions of kexec hypercall.
> That should suffice.

The current hypervisor ABI depends on an internal kernel ABI (i.e., the
ABI provided by relocate_kernel).  We do not want hypervisor internals
to be constrained by having to be compatible with kernel internals.

>>>   - Hmmm... Now I think that we should still use kexec syscall to load image
>>> into Xen memory (with new KEXEC_CMD_kexec_load2) because it establishes
>>> all things which are needed to call kdump if dom0 crashes; however,
>>> I could be wrong...
>>
>> I don't think we need the kexec syscall.  The kernel can unconditionally
>> do the crash hypercall, which will return if the kdump kernel isn't
>> loaded and the kernel can fall back to the regular non-kexec panic.
> 
> No, please do not do that. When you call HYPERVISOR_kexec_op(KEXEC_CMD_kexec)
> system is completely shutdown. Return from HYPERVISOR_kexec_op(KEXEC_CMD_kexec)
> would require to restore some kernel functionalities. It may be impossible
> in some cases. Additionally, it means that some changes should be made
> in generic kexec code path. As I know kexec maintainers are very reluctant
> to make such things.

Huh?  There only needs to be a call to a new hypervisor_crash_kexec()
function (which would then call the Xen specific crash hypercall) at the
very beginning of crash_kexec().  If this returns the normal
crash/shutdown path is done (which could even include a guest kexec!).
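
The control flow proposed above can be modelled in a few lines of user-space C. This is only a sketch under stated assumptions: the two flags stand in for "a crash image is loaded in Xen" and "a crash image is loaded in the guest", crash_kexec_sketch() is a made-up name, and the stubbed hypervisor_crash_kexec() returns a boolean where the real call would simply not return on success.

```c
#include <assert.h>
#include <stdbool.h>

enum crash_path {
    CRASH_VIA_XEN,          /* the Xen crash hypercall took over */
    CRASH_VIA_GUEST_KEXEC,  /* fell back to a guest kexec */
    CRASH_VIA_PANIC         /* fell back to the regular panic path */
};

/* Stand-ins for system state (assumptions of this model). */
bool xen_crash_image_loaded;
bool guest_crash_image_loaded;

/*
 * Models the Xen-specific crash hypercall. On a real system it would not
 * return when a crash image is loaded; here true means "would not return".
 */
static bool hypervisor_crash_kexec(void)
{
    return xen_crash_image_loaded;
}

/* Models crash_kexec() with the hypervisor call at its very beginning. */
enum crash_path crash_kexec_sketch(void)
{
    enum crash_path path;

    if (hypervisor_crash_kexec())
        path = CRASH_VIA_XEN;
    else if (guest_crash_image_loaded)
        path = CRASH_VIA_GUEST_KEXEC;
    else
        path = CRASH_VIA_PANIC;

    /* The fallbacks are only reachable when Xen had nothing loaded. */
    assert(path == CRASH_VIA_XEN || !xen_crash_image_loaded);
    return path;
}
```

The disagreement in this thread is about whether the hypercall can safely return to the caller at all, which this model assumes it can.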

David


Re: [Xen-devel] [PATCH v3 00/11] xen: Initial kexec/kdump implementation

2013-01-11 Thread Daniel Kiper
On Mon, Jan 07, 2013 at 01:49:44PM +, Ian Campbell wrote:
> On Mon, 2013-01-07 at 12:34 +, Daniel Kiper wrote:
> > I think that new kexec hypercall function should mimic kexec syscall.
> 
> We want to have an interface that can be used by non-Linux domains (both dom0
> and domU) as well though, so please bear this in mind.

I agree, but all arguments passed to the kexec syscall are quite generic and they
do not impose any limitations. Just look into include/linux/kexec.h.
That is why I think that a lot of things could be taken from
the Linux kexec implementation.

Daniel


Re: [Xen-devel] [PATCH v3 00/11] xen: Initial kexec/kdump implementation

2013-01-11 Thread Daniel Kiper
On Thu, Jan 10, 2013 at 02:19:55PM +, David Vrabel wrote:
> On 04/01/13 17:01, Daniel Kiper wrote:
> > On Fri, Jan 04, 2013 at 02:38:44PM +, David Vrabel wrote:
> >> On 04/01/13 14:22, Daniel Kiper wrote:
> >>> On Wed, Jan 02, 2013 at 11:26:43AM +, Andrew Cooper wrote:
> >>>> On 27/12/12 18:02, Eric W. Biederman wrote:
> >>>>> Andrew Cooper  writes:
> >>>>>
> >>>>>> On 27/12/2012 07:53, Eric W. Biederman wrote:
> >>>>>>> The syscall ABI still has the wrong semantics.
> >>>>>>>
> >>>>>>> Aka totally unmaintainable and unmergeable.
> >>>>>>>
> >>>>>>> The concept of domU support is also strange.  What does domU support
> >>>>>>> even mean, when the dom0 support is loading a kernel to pick up Xen
> >>>>>>> when Xen falls over.
> >>>>>> There are two requirements pulling at this patch series, but I agree
> >>>>>> that we need to clarify them.
> >>>>> It probably makes sense to split them apart a little even.
> >>>>>
> >>>>>
> >>>>
> >>>> Thinking about this split, there might be a way to simplify it even more.
> >>>>
> >>>> /sbin/kexec can load the "Xen" crash kernel itself by issuing
> >>>> hypercalls using /dev/xen/privcmd.  This would remove the need for
> >>>> the dom0 kernel to distinguish between loading a crash kernel for
> >>>> itself and loading a kernel for Xen.
> >>>>
> >>>> Or is this just a silly idea complicating the matter?
> >>>
> >>> This is impossible with current Xen kexec/kdump interface.
> >>> It should be changed to do that. However, I suppose that
> >>> Xen community would not be interested in such changes.
> >>
> >> I don't see why the hypercall ABI cannot be extended with new sub-ops
> >> that do the right thing -- the existing ABI is a bit weird.
> >>
> >> I plan to start prototyping something shortly (hopefully next week) for
> >> the Xen kexec case.
> >
> > Wow... As I can see, this time Xen community is interested in...
> > That is great. I agree that current kexec interface is not ideal.
>
> I spent some more time looking at the existing interface and
> implementation and it really is broken.
>
> > David, I am happy to help in that process. However, if you wish I could
> > carry it myself. Anyway, it looks that I should hold on with my
> > Linux kexec/kdump patches.
>
> I should be able to post some prototype patches for Xen in a few weeks.
>  No guarantees though.

That is great. If you need any help drop me a line.

> > My .5 cents:
> >   - We should focus on KEXEC_CMD_kexec_load and KEXEC_CMD_kexec_unload;
> > probably we should introduce KEXEC_CMD_kexec_load2 and 
> > KEXEC_CMD_kexec_unload2;
> > load should __LOAD__ kernel image and other things into hypervisor 
> > memory;
>
> Yes, but I don't see how we can easily support both ABIs easily.  I'd be
> in favour of replacing the existing hypercalls and requiring updated
> kexec tools in dom0 (this isn't that different to requiring the correct
> libxc in dom0).

Why? Just define new structures for new functions of kexec hypercall.
That should suffice.

> > I suppose that almost all things could be copied from
> > linux/kernel/kexec.c,
> > 
> > linux/arch/x86/kernel/{machine_kexec_$(BITS).c,relocate_kernel_$(BITS).c};
> > I think that KEXEC_CMD_kexec should stay as is,
>
> I don't think we want all the junk from Linux inside Xen -- we only want
> to support the kdump case and do not have to handle returning from the
> kexec image.

I do not want to implement kexec jump or stuff like that. However, I think
that it is worth reusing code which could be reused. As far as I know, a lot
of stuff in Xen was taken with smaller or bigger changes from the Linux kernel.
Why would we like to reinvent the wheel this time?

Additionally, we should not drop kexec support. It is the main part of kdump.
In the case of kdump the new kernel (and other stuff) is placed in preallocated
space, in contrast to kexec. That's all. kexec is useful if you would like
to quickly (skipping BIOS) switch from Xen to bare-metal Linux. If you drop
kexec support from Xen then you need to alter the kexec-tools package in a bunch
of distros to take the new Xen behavior into account.
I think that is not what we want to do.

> >   - Hmmm... Now I think that we should still use kexec syscall to load image
> > into Xen memory (with new KEXEC_CMD_kexec_load2) because it establishes
> > all things which are needed to call kdump if dom0 crashes; however,
> > I could be wrong...
>
> I don't think we need the kexec syscall.  The kernel can unconditionally
> do the crash hypercall, which will return if the kdump kernel isn't
> loaded and the kernel can fall back to the regular non-kexec panic.

No, please do not do that. When you call HYPERVISOR_kexec_op(KEXEC_CMD_kexec)
the system is completely shut down. Returning from HYPERVISOR_kexec_op(KEXEC_CMD_kexec)
would require restoring some kernel functionality. It may be impossible
in some cases. Additionally, it means that some changes would have to be made
in the generic kexec code path. As far as I know, kexec maintainers are very
reluctant to make such things.

Re: [Xen-devel] [PATCH v3 00/11] xen: Initial kexec/kdump implementation

2013-01-11 Thread Daniel Kiper
On Thu, Jan 10, 2013 at 02:19:55PM +, David Vrabel wrote:
 On 04/01/13 17:01, Daniel Kiper wrote:
  On Fri, Jan 04, 2013 at 02:38:44PM +, David Vrabel wrote:
  On 04/01/13 14:22, Daniel Kiper wrote:
  On Wed, Jan 02, 2013 at 11:26:43AM +, Andrew Cooper wrote:
  On 27/12/12 18:02, Eric W. Biederman wrote:
  Andrew Cooperandrew.coop...@citrix.com  writes:
 
  On 27/12/2012 07:53, Eric W. Biederman wrote:
  The syscall ABI still has the wrong semantics.
 
  Aka totally unmaintainable and umergeable.
 
  The concept of domU support is also strange.  What does domU support 
  even mean, when the dom0 support is loading a kernel to pick up Xen 
  when Xen falls over.
  There are two requirements pulling at this patch series, but I agree
  that we need to clarify them.
  It probably make sense to split them apart a little even.
 
 
 
  Thinking about this split, there might be a way to simply it even more.
 
  /sbin/kexec can load the Xen crash kernel itself by issuing
  hypercalls using /dev/xen/privcmd.  This would remove the need for
  the dom0 kernel to distinguish between loading a crash kernel for
  itself and loading a kernel for Xen.
 
  Or is this just a silly idea complicating the matter?
 
  This is impossible with current Xen kexec/kdump interface.
  It should be changed to do that. However, I suppose that
  Xen community would not be interested in such changes.
 
  I don't see why the hypercall ABI cannot be extended with new sub-ops
  that do the right thing -- the existing ABI is a bit weird.
 
  I plan to start prototyping something shortly (hopefully next week) for
  the Xen kexec case.
 
  Wow... As I can this time Xen community is interested in...
  That is great. I agree that current kexec interface is not ideal.

 I spent some more time looking at the existing interface and
 implementation and it really is broken.

  David, I am happy to help in that process. However, if you wish I could
  carry it myself. Anyway, it looks that I should hold on with my
  Linux kexec/kdump patches.

 I should be able to post some prototype patches for Xen in a few weeks.
  No guarantees though.

That is great. If you need any help drop me a line.

  My .5 cents:
- We should focus on KEXEC_CMD_kexec_load and KEXEC_CMD_kexec_unload;
  probably we should introduce KEXEC_CMD_kexec_load2 and 
  KEXEC_CMD_kexec_unload2;
  load should __LOAD__ kernel image and other things into hypervisor 
  memory;

 Yes, but I don't see how we can easily support both ABIs easily.  I'd be
 in favour of replacing the existing hypercalls and requiring updated
 kexec tools in dom0 (this isn't that different to requiring the correct
 libxc in dom0).

Why? Just define new strutures for new functions of kexec hypercall.
That should suffice.

  I suppose that allmost all things could be copied from 
  linux/kernel/kexec.c,
  
  linux/arch/x86/kernel/{machine_kexec_$(BITS).c,relocate_kernel_$(BITS).c};
  I think that KEXEC_CMD_kexec should stay as is,

 I don't think we want all the junk from Linux inside Xen -- we only want
 to support the kdump case and do not have to handle returning from the
 kexec image.

I do not want to implement kexec jump or stuff like. However, I think that
it is worth use code which could be used. As I know there are lot of stuff
which was taken with smaller or bigger changes from Linux Kernel.
Why we would like to reinvent the wheel this time?

Additionally, we should not drop kexec support. It is main part of kdump.
In case of kdump new kernel (and other stuff) is placed in prealocated
space in contrary to kexec. That's all. kexec is useful if you would like
to quickly (skipping BIOS) switch from Xen to baremetal Linux. If you drop
kexec support from Xen then you need alter kexec-tools package in bunch
of distros to take into account new Xen behavior.
I think that it is not we want to do.

- Hmmm... Now I think that we should still use kexec syscall to load image
  into Xen memory (with new KEXEC_CMD_kexec_load2) because it establishes
  all things which are needed to call kdump if dom0 crashes; however,
  I could be wrong...

 I don't think we need the kexec syscall.  The kernel can unconditionally
 do the crash hypercall, which will return if the kdump kernel isn't
 loaded and the kernel can fall back to the regular non-kexec panic.

No, please do not do that. When you call HYPERVISOR_kexec_op(KEXEC_CMD_kexec)
the system is completely shut down. Returning from
HYPERVISOR_kexec_op(KEXEC_CMD_kexec) would require restoring some kernel
functionality, which may be impossible in some cases. Additionally, it means
that some changes would have to be made in the generic kexec code path. As far
as I know, the kexec maintainers are very reluctant to accept such changes.

 This will allow the kexec syscall to be used only for the domU kexec case.

- last but not least, we should think about support for PV guests
  too.

 I won't be looking at this.

OK.

 To avoid confusion about the two largely 

Re: [Xen-devel] [PATCH v3 00/11] xen: Initial kexec/kdump implementation

2013-01-11 Thread Daniel Kiper
On Mon, Jan 07, 2013 at 01:49:44PM +, Ian Campbell wrote:
 On Mon, 2013-01-07 at 12:34 +, Daniel Kiper wrote:
  I think that new kexec hypercall function should mimics kexec syscall.
 
 We want to have an interface can be used by non-Linux domains (both dom0
 and domU) as well though, so please bear this in mind.

I agree, but all arguments passed to the kexec syscall are quite generic and
they do not impose any limitations. Just look at include/linux/kexec.h.
That is why I think that a lot of things could be taken from the
Linux kexec implementation.

Daniel


Re: [Xen-devel] [PATCH v3 00/11] xen: Initial kexec/kdump implementation

2013-01-11 Thread David Vrabel
On 11/01/13 13:22, Daniel Kiper wrote:
 On Thu, Jan 10, 2013 at 02:19:55PM +, David Vrabel wrote:
 On 04/01/13 17:01, Daniel Kiper wrote:
 My .5 cents:
   - We should focus on KEXEC_CMD_kexec_load and KEXEC_CMD_kexec_unload;
 probably we should introduce KEXEC_CMD_kexec_load2 and 
 KEXEC_CMD_kexec_unload2;
 load should __LOAD__ kernel image and other things into hypervisor 
 memory;

 Yes, but I don't see how we can easily support both ABIs easily.  I'd be
 in favour of replacing the existing hypercalls and requiring updated
 kexec tools in dom0 (this isn't that different to requiring the correct
 libxc in dom0).
 
 Why? Just define new strutures for new functions of kexec hypercall.
 That should suffice.

The current hypervisor ABI depends on an internal kernel ABI (i.e., the
ABI provided by relocate_kernel).  We do not want hypervisor internals
to be constrained by having to be compatible with kernel internals.

   - Hmmm... Now I think that we should still use kexec syscall to load image
 into Xen memory (with new KEXEC_CMD_kexec_load2) because it establishes
 all things which are needed to call kdump if dom0 crashes; however,
 I could be wrong...

 I don't think we need the kexec syscall.  The kernel can unconditionally
 do the crash hypercall, which will return if the kdump kernel isn't
 loaded and the kernel can fall back to the regular non-kexec panic.
 
 No, please do not do that. When you call HYPERVISOR_kexec_op(KEXEC_CMD_kexec)
 system is completly shutdown. Return form HYPERVISOR_kexec_op(KEXEC_CMD_kexec)
 would require to restore some kernel functionalities. It maybe impossible
 in some cases. Additionally, it means that some changes should be made
 in generic kexec code path. As I know kexec maintainers are very reluctant
 to make such things.

Huh?  There only needs to be a call to a new hypervisor_crash_kexec()
function (which would then call the Xen specific crash hypercall) at the
very beginning of crash_kexec().  If this returns, the normal
crash/shutdown path is taken (which could even include a guest kexec!).

David


Re: [Xen-devel] [PATCH v3 00/11] xen: Initial kexec/kdump implementation

2013-01-11 Thread Konrad Rzeszutek Wilk
On Thu, Jan 10, 2013 at 08:16:48PM -0800, Eric W. Biederman wrote:
 Konrad Rzeszutek Wilk konrad.w...@oracle.com writes:
 
  On Mon, Jan 07, 2013 at 01:34:04PM +0100, Daniel Kiper wrote:
  I think that new kexec hypercall function should mimics kexec syscall.
  It means that all arguments passed to hypercall should have same types
  if it is possible or if it is not possible then conversion should be done
  in very easy way. Additionally, I think that one call of new hypercall
  load function should load all needed thinks in right place and
  return relevant status. Last but not least, new functionality should
 
  We are not restricted to just _one_ hypercall. And this loading
  thing could be similar to the micrcode hypercall - which just points
  to a virtual address along with the length - and says 'load me'.
 
  be available through /dev/xen/privcmd or directly from kernel without
  bigger effort.
 
  Perhaps we should have a email thread on xen-devel where we hash out
  some ideas. Eric, would you be OK included on this - it would make
  sense for this mechanism to be as future-proof as possible - and I am not
  sure what your plans for kexec are in the future?
 
 The basic kexec interface is.
 
 load ranges of virtual addresses physical addresses.
 jump to the physical address  with identity mapped page tables.
 
 There are a few flags to allow for different usage scenarios like
 kexec on panic vs normal kexec.

And there is nothing fancy to be done for EFI and SecureBoot? Or is
that something that the kernel has to handle on its own (for example, by
somehow passing some certificates somewhere)?

 
 It is very very simple and very extensible.  All of the weird glue
 happens in userspace.
 
 Eric


Re: [Xen-devel] [PATCH v3 00/11] xen: Initial kexec/kdump implementation

2013-01-11 Thread Daniel Kiper
On Fri, Jan 11, 2013 at 03:22:35PM +, David Vrabel wrote:
 On 11/01/13 13:22, Daniel Kiper wrote:
  On Thu, Jan 10, 2013 at 02:19:55PM +, David Vrabel wrote:
  On 04/01/13 17:01, Daniel Kiper wrote:
  My .5 cents:
- We should focus on KEXEC_CMD_kexec_load and KEXEC_CMD_kexec_unload;
  probably we should introduce KEXEC_CMD_kexec_load2 and 
  KEXEC_CMD_kexec_unload2;
  load should __LOAD__ kernel image and other things into hypervisor 
  memory;
 
  Yes, but I don't see how we can easily support both ABIs easily.  I'd be
  in favour of replacing the existing hypercalls and requiring updated
  kexec tools in dom0 (this isn't that different to requiring the correct
  libxc in dom0).
 
  Why? Just define new strutures for new functions of kexec hypercall.
  That should suffice.

 The current hypervisor ABI depends on an internal kernel ABI (i.e., the
 ABI provided by relocate_kernel).  We do not want hypervisor internals
 to be constrained by having to be compatible with kernel internals.

I agree. I did not suggest staying with the current interface. The old
KEXEC_CMD_kexec_load and KEXEC_CMD_kexec_unload should stay as they are for
backward compatibility (maybe someday they should be removed). However, I do
not see any problem in adding new KEXEC_CMD_kexec_load2 and
KEXEC_CMD_kexec_unload2 functions with completely new arguments to the
existing kexec hypercall. Let's say something like this:

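/* One chunk of the image: copy bufsz bytes from buf to mem. */
struct kexec_segment {
  void *buf;          /* source buffer */
  size_t bufsz;       /* size of the source buffer */
  unsigned long mem;  /* destination address */
  size_t memsz;       /* size of the destination region */
};

/* Argument for the proposed KEXEC_CMD_kexec_load2 function. */
struct xen_kexec_load2 {
  unsigned long entry;             /* address to jump to */
  unsigned long nr_segments;       /* number of entries in segments */
  struct kexec_segment *segments;
  unsigned long flags;
};

struct xen_kexec_load2 xkl2;

...

/* The hypercall wrapper takes a pointer to the argument structure. */
rc = HYPERVISOR_kexec_op(KEXEC_CMD_kexec_load2, &xkl2);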

Regarding relocate_kernel(), it should be specific to the Xen hypervisor, but
most of the code will probably be similar to the Linux kernel version.
At the end it only has to leave the machine in a state identical to the state
left by the Linux kernel version of relocate_kernel(), just to be compatible
with existing kexec/kdump implementations.

- Hmmm... Now I think that we should still use kexec syscall to load 
  image
  into Xen memory (with new KEXEC_CMD_kexec_load2) because it 
  establishes
  all things which are needed to call kdump if dom0 crashes; however,
  I could be wrong...
 
  I don't think we need the kexec syscall.  The kernel can unconditionally
  do the crash hypercall, which will return if the kdump kernel isn't
  loaded and the kernel can fall back to the regular non-kexec panic.
 
  No, please do not do that. When you call 
  HYPERVISOR_kexec_op(KEXEC_CMD_kexec)
  system is completly shutdown. Return form 
  HYPERVISOR_kexec_op(KEXEC_CMD_kexec)
  would require to restore some kernel functionalities. It maybe impossible
  in some cases. Additionally, it means that some changes should be made
  in generic kexec code path. As I know kexec maintainers are very reluctant
  to make such things.

 Huh?  There only needs to be a call to a new hypervisor_crash_kexec()
 function (which would then call the Xen specific crash hypercall) at the
 very beginning of crash_kexec().  If this returns the normal
 crash/shutdown path is done (which could even include a guest kexec!).

I am still not convinced. However, go ahead with your vision in this case.
Later we will see whether it makes sense.

Daniel
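
The load2 argument proposed above leaves open what the hypervisor would check before accepting a request. As a rough sketch only: the helper name load2_request_is_sane(), the 4 KiB page size and the specific checks are assumptions for illustration, loosely modelled on the kind of sanity checks a segment-based loader needs. Nothing in the thread specifies them.

```c
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_PAGE_SIZE 4096UL  /* assumption: 4 KiB pages */

/* Same layout as the structures proposed in the message above. */
struct kexec_segment {
    void *buf;          /* source buffer */
    size_t bufsz;       /* size of the source buffer */
    unsigned long mem;  /* destination address */
    size_t memsz;       /* size of the destination region */
};

struct xen_kexec_load2 {
    unsigned long entry;
    unsigned long nr_segments;
    struct kexec_segment *segments;
    unsigned long flags;
};

/* Hypothetical helper: returns true if every segment is well formed. */
static bool load2_request_is_sane(const struct xen_kexec_load2 *req)
{
    for (unsigned long i = 0; i < req->nr_segments; i++) {
        const struct kexec_segment *s = &req->segments[i];
        unsigned long end = s->mem + s->memsz;

        /* The source must fit into the destination region. */
        if (s->bufsz > s->memsz)
            return false;
        /* The destination range must not wrap around. */
        if (end < s->mem)
            return false;
        /* Destination start and size must be page aligned. */
        if (s->mem % SKETCH_PAGE_SIZE || s->memsz % SKETCH_PAGE_SIZE)
            return false;
        /* Destination ranges of different segments must not overlap. */
        for (unsigned long j = 0; j < i; j++) {
            const struct kexec_segment *o = &req->segments[j];
            unsigned long oend = o->mem + o->memsz;

            if (s->mem < oend && o->mem < end)
                return false;
        }
    }
    return true;
}
```

A real implementation would also have to validate the entry point and the flags, and copy the segment array out of guest memory before looking at it.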


Re: [Xen-devel] [PATCH v3 00/11] xen: Initial kexec/kdump implementation

2013-01-11 Thread Eric W. Biederman
David Vrabel david.vra...@citrix.com writes:

 On 11/01/13 13:22, Daniel Kiper wrote:
 On Thu, Jan 10, 2013 at 02:19:55PM +, David Vrabel wrote:
 On 04/01/13 17:01, Daniel Kiper wrote:
 My .5 cents:
   - We should focus on KEXEC_CMD_kexec_load and KEXEC_CMD_kexec_unload;
 probably we should introduce KEXEC_CMD_kexec_load2 and 
 KEXEC_CMD_kexec_unload2;
 load should __LOAD__ kernel image and other things into hypervisor 
 memory;

 Yes, but I don't see how we can easily support both ABIs easily.  I'd be
 in favour of replacing the existing hypercalls and requiring updated
 kexec tools in dom0 (this isn't that different to requiring the correct
 libxc in dom0).
 
 Why? Just define new strutures for new functions of kexec hypercall.
 That should suffice.

 The current hypervisor ABI depends on an internal kernel ABI (i.e., the
 ABI provided by relocate_kernel).  We do not want hypervisor internals
 to be constrained by having to be compatible with kernel internals.

I think this is violent agreement.  A new call with new arguments seems
agreed upon.  The only question seems to be what happens to the old
hypercall: keep the current, deprecated hypercall with the current
ABI and not update it, or modify the current hypercall to return
the Xen equivalent of -ENOSYS.

Certainly /sbin/kexec will only support the new hypercall once the
support has merged.

 No, please do not do that. When you call HYPERVISOR_kexec_op(KEXEC_CMD_kexec)
 system is completly shutdown. Return form 
 HYPERVISOR_kexec_op(KEXEC_CMD_kexec)
 would require to restore some kernel functionalities. It maybe impossible
 in some cases. Additionally, it means that some changes should be made
 in generic kexec code path. As I know kexec maintainers are very reluctant
 to make such things.

 Huh?  There only needs to be a call to a new hypervisor_crash_kexec()
 function (which would then call the Xen specific crash hypercall) at the
 very beginning of crash_kexec().  If this returns the normal
 crash/shutdown path is done (which could even include a guest kexec!).

Can you imagine what crash_kexec would look like if every architecture
would hard code their own little piece in there?

The practical issue with changing crash_kexec is that you are hard
coding Xen policy just before a jump to a piece of code whose purpose
is to implement policy.

From a maintenance and code comprehension standpoint it is much cleaner
to put the hypervisor_crash_kexec() hypercall into the code that is
loaded with sys_kexec_load and is branched to by crash_kexec.  I would
have no problem with hard coding that behavior into /sbin/kexec in
the case of Xen dom0.

Having any code have different semantics when running under Xen is a
maintenance nightmare, and is why we are having this conversation years and
years after the initial deployment of Xen.  A tiny hard coded stub that
calls a hypercall should work indefinitely with no one having to do
anything.

Eric
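
The control flow being debated can be modelled in a few lines. This is a userspace sketch only: struct crash_env, the outcome enum and the -ENOENT return value are assumptions made for illustration, and hypervisor_crash_kexec() here is a stand-in for the function David names, not real kernel code. A real crash hypercall would not return at all once Xen starts its crash image.

```c
#include <errno.h>
#include <stdbool.h>

/* What ends up handling the crash in this model. */
enum crash_outcome {
    CRASH_HANDLED_BY_XEN,  /* Xen ran its own crash image */
    CRASH_GUEST_KEXEC,     /* the guest jumped to its own crash image */
    CRASH_PLAIN_PANIC      /* nothing loaded: regular panic path */
};

/* Stand-in state for the sketch (hypothetical). */
struct crash_env {
    bool xen_crash_image_loaded;
    bool guest_crash_image_loaded;
};

/*
 * Models the Xen crash hypercall: 0 stands for "Xen took over",
 * a negative value means no crash image is loaded in Xen.
 */
static int hypervisor_crash_kexec(const struct crash_env *env)
{
    return env->xen_crash_image_loaded ? 0 : -ENOENT;
}

/* The ordering David describes: try Xen first, then fall back. */
static enum crash_outcome crash_kexec_sketch(const struct crash_env *env)
{
    if (hypervisor_crash_kexec(env) == 0)
        return CRASH_HANDLED_BY_XEN;
    if (env->guest_crash_image_loaded)
        return CRASH_GUEST_KEXEC;
    return CRASH_PLAIN_PANIC;
}
```

Eric's objection is about where this logic lives, not about the ordering itself: he would place the hypercall in the stub loaded via sys_kexec_load rather than in crash_kexec().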



Re: [Xen-devel] [PATCH v3 00/11] xen: Initial kexec/kdump implementation

2013-01-11 Thread Eric W. Biederman
Konrad Rzeszutek Wilk konrad.w...@oracle.com writes:

 On Thu, Jan 10, 2013 at 08:16:48PM -0800, Eric W. Biederman wrote:

 The basic kexec interface is.
 
 load ranges of virtual addresses physical addresses.
 jump to the physical address  with identity mapped page tables.
 
 There are a few flags to allow for different usage scenarios like
 kexec on panic vs normal kexec.

 And there is nothing fancy to be done for EFI and SecureBoot?

There is a mess with EFI.  Reports are that EFI is a bug ridden pile,
and people keep advocating that we make more and more EFI calls in the
main kernel.  There is an argument over set_virtual_mapping, which is a
call that can be made only once which relocates the EFI code to a
different address, which makes life inconvenient for kexec.  There is
another argument that EFI doesn't actually work if you don't make the
set_virtual_mapping call so we can't remove it and always use physical
addresses.

Frankly the only sane way to run a linux kernel under EFI is to scrape
up the information needed to talk to the hardware directly and ignore
EFI.  That is what we have historically done in the face of BIOS madness
and if anything the situation is worse with EFI, but it looks like we
are going to have to learn that the hard way.

Recently there is a desire to figure out how to have /sbin/kexec support
signed kernel images.  What will probably happen is to have a specially
trusted userspace application perform the verification.  Sort of like
dom0 for the linux userspace.  A few other ideas have been batted around
but none that have stuck.

None of that is really about SecureBoot.  It is all about trusting the kernel
binary but not trusting userspace, with SecureBoot being an excuse for
coming up with a policy like that.

It looks like the answer to SecureBoot at this point may simply be just
reconfigure your BIOS or root Windows and EFI to get the hardware to do
what you want.

So the answer looking forward for Xen dom0 is: a trusted /sbin/kexec
won't require changes.  The other suggested solution is a flag that says a
specific chunk of the loaded image is a signature that the magic trust
fairies can verify.  As long as you have a flag bit free you should be
able to implement that policy if we ever implement it.

Eric


Re: [Xen-devel] [PATCH v3 00/11] xen: Initial kexec/kdump implementation

2013-01-11 Thread H. Peter Anvin


And there is nothing fancy to be done for EFI and SecureBoot? Or is
that something that the kernel has to handle on its own (so somehow
passing some certificates to somewhere).



For EFI, no... other than passing the EFI parameters, which apparently 
is *not* currently done (David Woodhouse is working on it.)  Secure boot 
is still a work in progress.



--
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel.  I don't speak on their behalf.



Re: [Xen-devel] [PATCH v3 00/11] xen: Initial kexec/kdump implementation

2013-01-11 Thread Vivek Goyal
On Fri, Jan 11, 2013 at 12:26:48PM -0800, H. Peter Anvin wrote:
 
 And there is nothing fancy to be done for EFI and SecureBoot? Or is
 that something that the kernel has to handle on its own (so somehow
 passing some certificates to somewhere).
 
 
 For EFI, no... other than passing the EFI parameters, which
 apparently is *not* currently done (David Woodhouse is working on
 it.)  Secure boot is still a work in progress.

For secureboot, as a first step in that direction, I just wrote some code
to sign an ELF executable and verify it in the kernel upon exec(). I
am planning to post RFC code soon (most likely next week).

Hopefully we will be able to sign a statically linked /sbin/kexec and give
it an extra capability (upon signature verification) to be able to call
sys_exec().

Thanks
Vivek


Re: [Xen-devel] [PATCH v3 00/11] xen: Initial kexec/kdump implementation

2013-01-11 Thread Vivek Goyal
On Fri, Jan 11, 2013 at 12:26:56PM -0800, Eric W. Biederman wrote:

[..]
 Recently there is a desire to figure out how to /sbin/kexec support
 signed kernel images.  What will probably happen is to have a specially
 trusted userspace application perform the verification.  Sort of like
 dom0 for the linux userspace.  A few other ideas have been batted around
 but none that have stuck.

[ CC David Howells ]

Eric,

In a private conversation, David Howells suggested passing the kernel
signature to the kernel in a segment, so that the kernel can do the
verification.

The /sbin/kexec signature is verified by the kernel at exec() time. Then
/sbin/kexec just passes one signature segment (after the regular segment) for
each segment being loaded. Segments which don't have a signature are passed
with a signature segment of size 0. The signature passing behavior can be
controlled by one new kexec flag.

That way /sbin/kexec does not have to worry about doing any verification
by itself. In fact, I am not sure how it could do the verification when
the crypto libraries it would need are not signed (assuming they are not
statically linked in).

What do you think about this idea?

Thanks
Vivek
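
The segment layout described above can be sketched as follows. Everything here is hypothetical: the flag name and value, struct signed_input and interleave_signatures() are made up for illustration, and zeroing mem/memsz in the signature entries is an assumption, since the proposal does not say where (or whether) signatures are placed in memory.

```c
#include <stddef.h>

/* Hypothetical flag: placeholder name and value, not a real kexec flag. */
#define SKETCH_KEXEC_SIG_SEGMENTS 0x00000100UL

struct kexec_segment {
    const void *buf;    /* source buffer */
    size_t bufsz;       /* size of the source buffer */
    unsigned long mem;  /* destination address */
    size_t memsz;       /* size of the destination region */
};

/* A regular segment plus its optional detached signature. */
struct signed_input {
    struct kexec_segment seg;
    const void *sig;    /* NULL if the segment is unsigned */
    size_t sigsz;       /* 0 if the segment is unsigned */
};

/*
 * Emits each regular segment followed by one signature segment.
 * Unsigned segments get a signature segment of size 0.
 * out must have room for 2 * n entries; returns the entry count.
 */
static size_t interleave_signatures(const struct signed_input *in, size_t n,
                                    struct kexec_segment *out)
{
    for (size_t i = 0; i < n; i++) {
        out[2 * i] = in[i].seg;
        out[2 * i + 1] = (struct kexec_segment){
            .buf = in[i].sig,
            .bufsz = in[i].sigsz,
            .mem = 0,     /* assumption: signatures are not loaded anywhere */
            .memsz = 0,
        };
    }
    return 2 * n;
}
```

With this layout the kernel can pair entry 2i with entry 2i+1 without any extra metadata, and a caller would set the (hypothetical) flag to announce that the segment list is interleaved.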


Re: [Xen-devel] [PATCH v3 00/11] xen: Initial kexec/kdump implementation

2013-01-11 Thread H. Peter Anvin
On 01/11/2013 12:52 PM, Vivek Goyal wrote:
 
 Eric,
 
 In a private conversation, David Howells suggested why not pass kernel
 signature in a segment to kernel and kernel can do the verification.
 
 /sbin/kexec signature is verified by kernel at exec() time. Then
 /sbin/kexec just passes one signature segment (after regular segment) for
 each segment being loaded. The segments which don't have signature,
 are passed with section size 0. And signature passing behavior can be
 controlled by one new kexec flag.
 
 That way /sbin/kexec does not have to worry about doing any verification
 by itself. In fact, I am not sure how it can do the verification when
 crypto libraries it will need are not signed (assuming they are not
 statically linked in).
 
 What do you think about this idea?
 

A signed /sbin/kexec would realistically have to be statically linked,
at least in the short term; otherwise the libraries and ld.so would need
verification as well.

Now, that *might* very well have some real value -- there are certainly
users out there who would very much want only binaries signed with
specific keys to get run on their system.

-hpa




Re: [Xen-devel] [PATCH v3 00/11] xen: Initial kexec/kdump implementation

2013-01-11 Thread Vivek Goyal
On Fri, Jan 11, 2013 at 01:03:41PM -0800, H. Peter Anvin wrote:
 On 01/11/2013 12:52 PM, Vivek Goyal wrote:
  
  Eric,
  
  In a private conversation, David Howells suggested why not pass kernel
  signature in a segment to kernel and kernel can do the verification.
  
  /sbin/kexec signature is verified by kernel at exec() time. Then
  /sbin/kexec just passes one signature segment (after regular segment) for
  each segment being loaded. The segments which don't have signature,
  are passed with section size 0. And signature passing behavior can be
  controlled by one new kexec flag.
  
  That way /sbin/kexec does not have to worry about doing any verification
  by itself. In fact, I am not sure how it can do the verification when
  crypto libraries it will need are not signed (assuming they are not
  statically linked in).
  
  What do you think about this idea?
  
 
 A signed /sbin/kexec would realistically have to be statically linked,
 at least in the short term; otherwise the libraries and ld.so would need
 verification as well.

Yes. That's the expectation. Sign only statically linked executables which
don't do any dlopen() stuff either.

In fact, in the patch I fail the exec() if a signed executable has an
interpreter.

Thanks
Vivek
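
The patch itself is not shown in this thread, so the following is only a userspace sketch of the kind of check described: a dynamically linked ELF binary requests its interpreter (ld.so) through a PT_INTERP program header, so an image without one does not ask the kernel to load ld.so. The function name and return convention are assumptions, and the sketch handles only 64-bit images in host byte order.

```c
#include <elf.h>
#include <stddef.h>
#include <string.h>

/*
 * Returns 1 if the ELF64 image in buf has a PT_INTERP program header,
 * 0 if it has none, and -1 if the image is malformed or not ELF64.
 */
static int elf64_interp_status(const unsigned char *buf, size_t len)
{
    Elf64_Ehdr eh;

    if (len < sizeof(eh))
        return -1;
    memcpy(&eh, buf, sizeof(eh));

    if (memcmp(eh.e_ident, ELFMAG, SELFMAG) != 0 ||
        eh.e_ident[EI_CLASS] != ELFCLASS64)
        return -1;
    if (eh.e_phentsize != sizeof(Elf64_Phdr) || eh.e_phoff > len)
        return -1;

    for (size_t i = 0; i < eh.e_phnum; i++) {
        Elf64_Phdr ph;
        size_t off = (size_t)eh.e_phoff + i * sizeof(ph);

        /* Each program header must lie fully inside the image. */
        if (off > len || len - off < sizeof(ph))
            return -1;
        memcpy(&ph, buf + off, sizeof(ph));

        if (ph.p_type == PT_INTERP)
            return 1;
    }
    return 0;
}
```

A signing policy along the lines discussed would refuse to grant extra privileges whenever this returns anything other than 0. Note that the absence of PT_INTERP says nothing about dlopen(), which is why the message above lists it as a separate requirement.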


Re: [Xen-devel] [PATCH v3 00/11] xen: Initial kexec/kdump implementation

2013-01-11 Thread H. Peter Anvin

On 01/11/2013 01:08 PM, Vivek Goyal wrote:


A signed /sbin/kexec would realistically have to be statically linked,
at least in the short term; otherwise the libraries and ld.so would need
verification as well.


Yes. That's the expectation. Sign only statically linked exeutables which
don't do any of dlopen() stuff either.

In fact in the patch, I fail the exec() if signed executable has
interpreter.



As I said, though (and possibly not for kexec, that depends): in the
long term we probably want a way to be able to sign all kinds of binaries
in the system.


-hpa


--
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel.  I don't speak on their behalf.



Re: [Xen-devel] [PATCH v3 00/11] xen: Initial kexec/kdump implementation

2013-01-10 Thread Eric W. Biederman
Konrad Rzeszutek Wilk  writes:

> On Mon, Jan 07, 2013 at 01:34:04PM +0100, Daniel Kiper wrote:
>> I think that new kexec hypercall function should mimics kexec syscall.
>> It means that all arguments passed to hypercall should have same types
>> if it is possible or if it is not possible then conversion should be done
>> in very easy way. Additionally, I think that one call of new hypercall
>> load function should load all needed thinks in right place and
>> return relevant status. Last but not least, new functionality should
>
> We are not restricted to just _one_ hypercall. And this loading
> thing could be similar to the micrcode hypercall - which just points
> to a virtual address along with the length - and says 'load me'.
>
>> be available through /dev/xen/privcmd or directly from kernel without
>> bigger effort.
>
> Perhaps we should have a email thread on xen-devel where we hash out
> some ideas. Eric, would you be OK included on this - it would make
> sense for this mechanism to be as future-proof as possible - and I am not
> sure what your plans for kexec are in the future?

The basic kexec interface is:

load ranges of virtual addresses to physical addresses.
jump to the physical address with identity mapped page tables.

There are a few flags to allow for different usage scenarios like
kexec on panic vs normal kexec.

It is very very simple and very extensible.  All of the weird glue
happens in userspace.

Eric
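
The interface described above maps onto a small data structure. The sketch below does not call the real syscall. It only builds a request of the same shape (entry point, segment list, flags); struct kexec_request, make_request() and entry_is_loaded() are hypothetical names, and SKETCH_KEXEC_ON_CRASH is a local constant intended to mirror the KEXEC_ON_CRASH flag from Linux's kexec.h.

```c
#include <stdbool.h>
#include <stddef.h>

/* Local mirror of the "kexec on panic" flag (see lead-in). */
#define SKETCH_KEXEC_ON_CRASH 0x00000001UL

/* One range: bufsz bytes at virtual address buf go to physical mem. */
struct kexec_segment {
    const void *buf;
    size_t bufsz;
    unsigned long mem;
    size_t memsz;
};

/* Hypothetical container for the arguments of one load request. */
struct kexec_request {
    unsigned long entry;                   /* physical address to jump to */
    unsigned long nr_segments;
    const struct kexec_segment *segments;
    unsigned long flags;
};

/* Builds a request; on_panic selects kexec on panic vs normal kexec. */
static struct kexec_request make_request(unsigned long entry,
                                         const struct kexec_segment *segs,
                                         unsigned long nr, bool on_panic)
{
    struct kexec_request req = {
        .entry = entry,
        .nr_segments = nr,
        .segments = segs,
        .flags = on_panic ? SKETCH_KEXEC_ON_CRASH : 0,
    };
    return req;
}

/* True if the entry point lies inside one of the loaded ranges. */
static bool entry_is_loaded(const struct kexec_request *req)
{
    for (unsigned long i = 0; i < req->nr_segments; i++) {
        const struct kexec_segment *s = &req->segments[i];

        if (req->entry >= s->mem && req->entry - s->mem < s->memsz)
            return true;
    }
    return false;
}
```

Everything else (parsing the kernel image, building boot parameters, choosing addresses) is the userspace glue the message refers to.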


Re: [Xen-devel] [PATCH v3 00/11] xen: Initial kexec/kdump implementation

2013-01-10 Thread David Vrabel
On 04/01/13 17:01, Daniel Kiper wrote:
> On Fri, Jan 04, 2013 at 02:38:44PM +, David Vrabel wrote:
>> On 04/01/13 14:22, Daniel Kiper wrote:
>>> On Wed, Jan 02, 2013 at 11:26:43AM +, Andrew Cooper wrote:
 On 27/12/12 18:02, Eric W. Biederman wrote:
> Andrew Cooper  writes:
>
>> On 27/12/2012 07:53, Eric W. Biederman wrote:
>>> The syscall ABI still has the wrong semantics.
>>>
>>> Aka totally unmaintainable and umergeable.
>>>
>>> The concept of domU support is also strange.  What does domU support 
>>> even mean, when the dom0 support is loading a kernel to pick up Xen 
>>> when Xen falls over.
>> There are two requirements pulling at this patch series, but I agree
>> that we need to clarify them.
> It probably make sense to split them apart a little even.
>
>

 Thinking about this split, there might be a way to simply it even more.

 /sbin/kexec can load the "Xen" crash kernel itself by issuing
 hypercalls using /dev/xen/privcmd.  This would remove the need for
 the dom0 kernel to distinguish between loading a crash kernel for
 itself and loading a kernel for Xen.

 Or is this just a silly idea complicating the matter?
>>>
>>> This is impossible with current Xen kexec/kdump interface.
>>> It should be changed to do that. However, I suppose that
>>> Xen community would not be interested in such changes.
>>
>> I don't see why the hypercall ABI cannot be extended with new sub-ops
>> that do the right thing -- the existing ABI is a bit weird.
>>
>> I plan to start prototyping something shortly (hopefully next week) for
>> the Xen kexec case.
> 
> Wow... As I can this time Xen community is interested in...
> That is great. I agree that current kexec interface is not ideal.

I spent some more time looking at the existing interface and
implementation and it really is broken.

> David, I am happy to help in that process. However, if you wish I could
> carry it myself. Anyway, it looks that I should hold on with my
> Linux kexec/kdump patches.

I should be able to post some prototype patches for Xen in a few weeks.
 No guarantees though.

> My .5 cents:
>   - We should focus on KEXEC_CMD_kexec_load and KEXEC_CMD_kexec_unload;
> probably we should introduce KEXEC_CMD_kexec_load2 and 
> KEXEC_CMD_kexec_unload2;
> load should __LOAD__ kernel image and other things into hypervisor memory;

Yes, but I don't see how we can easily support both ABIs.  I'd be
in favour of replacing the existing hypercalls and requiring updated
kexec tools in dom0 (this isn't that different to requiring the correct
libxc in dom0).

> I suppose that allmost all things could be copied from 
> linux/kernel/kexec.c,
> linux/arch/x86/kernel/{machine_kexec_$(BITS).c,relocate_kernel_$(BITS).c};
> I think that KEXEC_CMD_kexec should stay as is,

I don't think we want all the junk from Linux inside Xen -- we only want
to support the kdump case and do not have to handle returning from the
kexec image.

>   - Hmmm... Now I think that we should still use kexec syscall to load image
> into Xen memory (with new KEXEC_CMD_kexec_load2) because it establishes
> all things which are needed to call kdump if dom0 crashes; however,
> I could be wrong...

I don't think we need the kexec syscall.  The kernel can unconditionally
do the crash hypercall, which will return if the kdump kernel isn't
loaded and the kernel can fall back to the regular non-kexec panic.

This will allow the kexec syscall to be used only for the domU kexec case.

>   - last but not least, we should think about support for PV guests
> too.

I won't be looking at this.

To avoid confusion about the two largely orthogonal sorts of kexec how
about defining some terms.  I suggest:

Xen kexec: Xen executes the image in response to a Xen crash or a
hypercall from a privileged domain.

Guest kexec: The guest kernel executes the image within the domain in
response to a guest kernel crash or a system call.

David


Re: [Xen-devel] [PATCH v3 00/11] xen: Initial kexec/kdump implementation

2013-01-10 Thread David Vrabel
On 04/01/13 17:01, Daniel Kiper wrote:
> On Fri, Jan 04, 2013 at 02:38:44PM +0000, David Vrabel wrote:
>> On 04/01/13 14:22, Daniel Kiper wrote:
>>> On Wed, Jan 02, 2013 at 11:26:43AM +0000, Andrew Cooper wrote:
>>>> On 27/12/12 18:02, Eric W. Biederman wrote:
>>>>> Andrew Cooper <andrew.coop...@citrix.com> writes:
>>>>>
>>>>>> On 27/12/2012 07:53, Eric W. Biederman wrote:
>>>>>>> The syscall ABI still has the wrong semantics.
>>>>>>>
>>>>>>> Aka totally unmaintainable and unmergeable.
>>>>>>>
>>>>>>> The concept of domU support is also strange.  What does domU support
>>>>>>> even mean, when the dom0 support is loading a kernel to pick up Xen
>>>>>>> when Xen falls over.
>>>>>> There are two requirements pulling at this patch series, but I agree
>>>>>> that we need to clarify them.
>>>>> It probably makes sense to split them apart a little even.
>>>>>
>>>>>
>>>>
>>>> Thinking about this split, there might be a way to simplify it even more.
>>>>
>>>> /sbin/kexec can load the "Xen" crash kernel itself by issuing
>>>> hypercalls using /dev/xen/privcmd.  This would remove the need for
>>>> the dom0 kernel to distinguish between loading a crash kernel for
>>>> itself and loading a kernel for Xen.
>>>>
>>>> Or is this just a silly idea complicating the matter?
>>>
>>> This is impossible with current Xen kexec/kdump interface.
>>> It should be changed to do that. However, I suppose that
>>> Xen community would not be interested in such changes.
>>
>> I don't see why the hypercall ABI cannot be extended with new sub-ops
>> that do the right thing -- the existing ABI is a bit weird.
>>
>> I plan to start prototyping something shortly (hopefully next week) for
>> the Xen kexec case.
>
> Wow... As I can see, this time the Xen community is interested...
> That is great. I agree that current kexec interface is not ideal.

I spent some more time looking at the existing interface and
implementation and it really is broken.

> David, I am happy to help in that process. However, if you wish I could
> carry it myself. Anyway, it looks like I should hold off on my
> Linux kexec/kdump patches.

I should be able to post some prototype patches for Xen in a few weeks.
 No guarantees though.

> My .5 cents:
>   - We should focus on KEXEC_CMD_kexec_load and KEXEC_CMD_kexec_unload;
> probably we should introduce KEXEC_CMD_kexec_load2 and
> KEXEC_CMD_kexec_unload2;
> load should __LOAD__ kernel image and other things into hypervisor memory;

Yes, but I don't see how we can easily support both ABIs.  I'd be
in favour of replacing the existing hypercalls and requiring updated
kexec tools in dom0 (this isn't that different to requiring the correct
libxc in dom0).

> I suppose that almost all things could be copied from linux/kernel/kexec.c,
> linux/arch/x86/kernel/{machine_kexec_$(BITS).c,relocate_kernel_$(BITS).c};
> I think that KEXEC_CMD_kexec should stay as is,

I don't think we want all the junk from Linux inside Xen -- we only want
to support the kdump case and do not have to handle returning from the
kexec image.

>   - Hmmm... Now I think that we should still use kexec syscall to load image
> into Xen memory (with new KEXEC_CMD_kexec_load2) because it establishes
> all things which are needed to call kdump if dom0 crashes; however,
> I could be wrong...

I don't think we need the kexec syscall.  The kernel can unconditionally
do the crash hypercall, which will return if the kdump kernel isn't
loaded and the kernel can fall back to the regular non-kexec panic.

This will allow the kexec syscall to be used only for the domU kexec case.

>   - last but not least, we should think about support for PV guests
> too.

I won't be looking at this.

To avoid confusion about the two largely orthogonal sorts of kexec how
about defining some terms.  I suggest:

Xen kexec: Xen executes the image in response to a Xen crash or a
hypercall from a privileged domain.

Guest kexec: The guest kernel executes the image within the domain in
response to a guest kernel crash or a system call.

David


Re: [Xen-devel] [PATCH v3 00/11] xen: Initial kexec/kdump implementation

2013-01-07 Thread Konrad Rzeszutek Wilk
On Mon, Jan 07, 2013 at 01:34:04PM +0100, Daniel Kiper wrote:
> On Fri, Jan 04, 2013 at 02:11:46PM -0500, Konrad Rzeszutek Wilk wrote:
> > On Fri, Jan 04, 2013 at 06:07:51PM +0100, Daniel Kiper wrote:
> > > On Fri, Jan 04, 2013 at 02:41:17PM +0000, Jan Beulich wrote:
> > > > >>> On 04.01.13 at 15:22, Daniel Kiper  wrote:
> > > > > On Wed, Jan 02, 2013 at 11:26:43AM +0000, Andrew Cooper wrote:
> > > > >> /sbin/kexec can load the "Xen" crash kernel itself by issuing
> > > > >> hypercalls using /dev/xen/privcmd.  This would remove the need for
> > > > >> the dom0 kernel to distinguish between loading a crash kernel for
> > > > >> itself and loading a kernel for Xen.
> > > > >>
> > > > >> Or is this just a silly idea complicating the matter?
> > > > >
> > > > > This is impossible with current Xen kexec/kdump interface.
> > > >
> > > > Why?
> > >
> > > Because current KEXEC_CMD_kexec_load does not load kernel
> > > image and other things into Xen memory. It means that it
> > > should live somewhere in dom0 Linux kernel memory.
> >
> > We could have a very simple hypercall which would have:
> >
> > struct fancy_new_hypercall {
> > xen_pfn_t payload; // IN
> > ssize_t len; // IN
> > #define DATA (1<<1)
> > #define DATA_EOF (1<<2)
> > #define DATA_KERNEL (1<<3)
> > #define DATA_RAMDISK (1<<4)
> > unsigned int flags; // IN
> > unsigned int status; // OUT
> > };
> >
> > which would in a loop just iterate over the payloads and
> > let the hypervisor stick it in the crashkernel space.
> >
> > This is all hand-waving of course. There probably would be a need
> > to figure out how much space you have in the reserved Xen's
> > 'crashkernel' memory region too.
> 
> I think that new kexec hypercall function should mimic kexec syscall.
> It means that all arguments passed to hypercall should have same types
> if it is possible or if it is not possible then conversion should be done
> in very easy way. Additionally, I think that one call of new hypercall
> load function should load all needed things in right place and
> return relevant status. Last but not least, new functionality should

We are not restricted to just _one_ hypercall. And this loading
thing could be similar to the microcode hypercall - which just points
to a virtual address along with the length - and says 'load me'.

> be available through /dev/xen/privcmd or directly from kernel without
> bigger effort.

Perhaps we should have an email thread on xen-devel where we hash out
some ideas. Eric, would you be OK being included on this - it would make
sense for this mechanism to be as future-proof as possible - and I am not
sure what your plans for kexec are in the future?
> 
> Daniel


Re: [Xen-devel] [PATCH v3 00/11] xen: Initial kexec/kdump implementation

2013-01-07 Thread Ian Campbell
On Mon, 2013-01-07 at 12:34 +0000, Daniel Kiper wrote:
> I think that new kexec hypercall function should mimic kexec syscall.

We want to have an interface that can be used by non-Linux domains (both
dom0 and domU) as well though, so please bear this in mind.

Historically we've not always been good at this when the hypercall
interface is strongly tied to a particular guest implementation (in some
sense this is the problem with the current kexec hypercall).

Also what makes for a good syscall interface does not necessarily make
for a good hypercall interface.

Ian.



Re: [Xen-devel] [PATCH v3 00/11] xen: Initial kexec/kdump implementation

2013-01-07 Thread Daniel Kiper
On Fri, Jan 04, 2013 at 02:11:46PM -0500, Konrad Rzeszutek Wilk wrote:
> On Fri, Jan 04, 2013 at 06:07:51PM +0100, Daniel Kiper wrote:
> > On Fri, Jan 04, 2013 at 02:41:17PM +0000, Jan Beulich wrote:
> > > >>> On 04.01.13 at 15:22, Daniel Kiper  wrote:
> > > > On Wed, Jan 02, 2013 at 11:26:43AM +0000, Andrew Cooper wrote:
> > > >> /sbin/kexec can load the "Xen" crash kernel itself by issuing
> > > >> hypercalls using /dev/xen/privcmd.  This would remove the need for
> > > >> the dom0 kernel to distinguish between loading a crash kernel for
> > > >> itself and loading a kernel for Xen.
> > > >>
> > > >> Or is this just a silly idea complicating the matter?
> > > >
> > > > This is impossible with current Xen kexec/kdump interface.
> > >
> > > Why?
> >
> > Because current KEXEC_CMD_kexec_load does not load kernel
> > image and other things into Xen memory. It means that it
> > should live somewhere in dom0 Linux kernel memory.
>
> We could have a very simple hypercall which would have:
>
> struct fancy_new_hypercall {
>   xen_pfn_t payload; // IN
>   ssize_t len; // IN
> #define DATA (1<<1)
> #define DATA_EOF (1<<2)
> #define DATA_KERNEL (1<<3)
> #define DATA_RAMDISK (1<<4)
>   unsigned int flags; // IN
>   unsigned int status; // OUT
> };
>
> which would in a loop just iterate over the payloads and
> let the hypervisor stick it in the crashkernel space.
>
> This is all hand-waving of course. There probably would be a need
> to figure out how much space you have in the reserved Xen's
> 'crashkernel' memory region too.

I think that the new kexec hypercall function should mimic the kexec syscall.
It means that all arguments passed to the hypercall should have the same
types if possible; if that is not possible, then conversion should be
very easy. Additionally, I think that one call of the new hypercall
load function should load all needed things in the right place and
return the relevant status. Last but not least, the new functionality should
be available through /dev/xen/privcmd or directly from the kernel without
much effort.

Daniel


Re: [Xen-devel] [PATCH v3 00/11] xen: Initial kexec/kdump implementation

2013-01-07 Thread Ian Campbell
On Mon, 2013-01-07 at 10:46 +0000, Andrew Cooper wrote:

> Given that /sbin/kexec creates a binary blob in memory, surely the most 
> simple thing is to get it to suitably mlock() the region and give a list 
> of VAs to the hypervisor.

More than likely. The DOMID_KEXEC thing was just a random musing ;-)

> This way, Xen can properly take care of what it does with information 
> and where.  For example, at the moment, allowing dom0 to choose where 
> gets overwritten in the Xen crash area is a recipe for disaster if a 
> crash occurs midway through loading/reloading the crash kernel.

That's true. I think there is a double buffering scheme in the current
thing and we should preserve that in any new implementation.
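
For illustration only, the generic double-buffering pattern looks roughly like
the sketch below. It is not a description of what Xen currently does; the
slot size, the names and the use of C11 atomics are all assumptions. A new
image is always written into the slot that a crash would not use, and the
active index is flipped only once the copy is complete.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_SLOT_SIZE 64u   /* tiny slot, just for the example */

struct sketch_crash_area {
    unsigned char slot[2][SKETCH_SLOT_SIZE];
    size_t len[2];
    atomic_int active;   /* slot a crash would use, or -1 if none is loaded */
};

void sketch_area_init(struct sketch_crash_area *a)
{
    memset(a->slot, 0, sizeof a->slot);
    a->len[0] = a->len[1] = 0;
    atomic_init(&a->active, -1);
}

/* Copy img into the spare slot, then publish it. Returns the slot index
 * now active, or -1 if the image does not fit (active slot is untouched). */
int sketch_area_load(struct sketch_crash_area *a, const void *img, size_t len)
{
    assert(a != NULL && img != NULL);
    if (len > SKETCH_SLOT_SIZE)
        return -1;
    int cur = atomic_load(&a->active);
    int spare = (cur == 0) ? 1 : 0;
    memcpy(a->slot[spare], img, len);   /* a crash here still sees slot cur */
    a->len[spare] = len;
    atomic_store(&a->active, spare);    /* single switch-over point */
    return spare;
}
```

A crash that happens midway through a reload therefore still finds the
previous, complete image.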

Ian.



Re: [Xen-devel] [PATCH v3 00/11] xen: Initial kexec/kdump implementation

2013-01-07 Thread Andrew Cooper

On 07/01/13 10:25, Ian Campbell wrote:
> On Fri, 2013-01-04 at 19:11 +0000, Konrad Rzeszutek Wilk wrote:
>> On Fri, Jan 04, 2013 at 06:07:51PM +0100, Daniel Kiper wrote:
>>> Because current KEXEC_CMD_kexec_load does not load kernel
>>> image and other things into Xen memory. It means that it
>>> should live somewhere in dom0 Linux kernel memory.
>> We could have a very simple hypercall which would have:
>>
>> struct fancy_new_hypercall {
>> xen_pfn_t payload; // IN
> This would have to be XEN_GUEST_HANDLE(something) since userspace cannot
> figure out what pfns back its memory. In any case since the hypervisor
> is going to want to copy the data into the crashkernel space a virtual
> address is convenient to have.
>
>> ssize_t len; // IN
>> #define DATA (1<<1)
>> #define DATA_EOF (1<<2)
>> #define DATA_KERNEL (1<<3)
>> #define DATA_RAMDISK (1<<4)
>> unsigned int flags; // IN
>> unsigned int status; // OUT
>> };
>>
>> which would in a loop just iterate over the payloads and
>> let the hypervisor stick it in the crashkernel space.
>>
>> This is all hand-waving of course. There probably would be a need
>> to figure out how much space you have in the reserved Xen's
>> 'crashkernel' memory region too.
> This is probably a mad idea but it's Monday morning and I'm sleep
> deprived so I'll throw it out there...
>
> What about adding DOMID_KEXEC (similar DOMID_IO etc)? This would allow
> dom0 to map the kexec memory space with the usual privcmd mmap
> hypercalls and build things in it directly.
>
> OK, I suspect this might not be practical for a variety of reasons (lack
> of a p2m for such domains so no way to find out the list of mfns, dom0
> userspace simply doesn't have sufficient context to write sensible
> things here, etc) but maybe someone has a better head on today...
>
> Ian.



Given that /sbin/kexec creates a binary blob in memory, surely the most 
simple thing is to get it to suitably mlock() the region and give a list 
of VAs to the hypervisor.


This way, Xen can properly take care of what it does with information 
and where.  For example, at the moment, allowing dom0 to choose where 
gets overwritten in the Xen crash area is a recipe for disaster if a 
crash occurs midway through loading/reloading the crash kernel.
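
A small userspace sketch of that idea, under stated assumptions: the range
structure and both function names are invented for the example, and how the
list would actually reach the hypervisor is left out entirely. The blob is
pinned with mlock() and then described as a list of per-page virtual address
ranges.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>

/* One entry of the list handed over: a virtual address and a length. */
struct sketch_va_range {
    uintptr_t va;
    size_t len;
};

/* Split [base, base+len) at page boundaries. Returns the number of entries
 * written to out, or -1 if more than max entries would be needed. */
long sketch_build_va_list(const void *base, size_t len, size_t page,
                          struct sketch_va_range *out, size_t max)
{
    assert(page != 0 && out != NULL);
    size_t n = 0;
    uintptr_t va = (uintptr_t)base;
    while (len > 0) {
        size_t chunk = page - (size_t)(va % page);   /* bytes left in page */
        if (chunk > len)
            chunk = len;
        if (n == max)
            return -1;
        out[n].va = va;
        out[n].len = chunk;
        n++;
        va += chunk;
        len -= chunk;
    }
    return (long)n;
}

/* Pin the blob so its pages stay resident while the list is in use.
 * Returns 0 on success, -1 with errno set on failure (as mlock() does). */
int sketch_pin_blob(const void *base, size_t len)
{
    return mlock(base, len);
}
```

Because only virtual addresses are passed, the choice of where the data ends
up inside the crash area stays with Xen.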


~Andrew



Re: [Xen-devel] [PATCH v3 00/11] xen: Initial kexec/kdump implementation

2013-01-07 Thread Ian Campbell
On Fri, 2013-01-04 at 19:11 +0000, Konrad Rzeszutek Wilk wrote:
> On Fri, Jan 04, 2013 at 06:07:51PM +0100, Daniel Kiper wrote:
> > On Fri, Jan 04, 2013 at 02:41:17PM +0000, Jan Beulich wrote:
> > > >>> On 04.01.13 at 15:22, Daniel Kiper  wrote:
> > > > On Wed, Jan 02, 2013 at 11:26:43AM +0000, Andrew Cooper wrote:
> > > >> /sbin/kexec can load the "Xen" crash kernel itself by issuing
> > > >> hypercalls using /dev/xen/privcmd.  This would remove the need for
> > > >> the dom0 kernel to distinguish between loading a crash kernel for
> > > >> itself and loading a kernel for Xen.
> > > >>
> > > >> Or is this just a silly idea complicating the matter?
> > > >
> > > > This is impossible with current Xen kexec/kdump interface.
> > >
> > > Why?
> > 
> > Because current KEXEC_CMD_kexec_load does not load kernel
> > image and other things into Xen memory. It means that it
> > should live somewhere in dom0 Linux kernel memory.
> 
> We could have a very simple hypercall which would have:
> 
> struct fancy_new_hypercall {
>   xen_pfn_t payload; // IN

This would have to be XEN_GUEST_HANDLE(something) since userspace cannot
figure out what pfns back its memory. In any case since the hypervisor
is going to want to copy the data into the crashkernel space a virtual
address is convenient to have.

>   ssize_t len; // IN
> #define DATA (1<<1)
> #define DATA_EOF (1<<2)
> #define DATA_KERNEL (1<<3)
> #define DATA_RAMDISK (1<<4)
>   unsigned int flags; // IN
>   unsigned int status; // OUT
> };
> 
> which would in a loop just iterate over the payloads and
> let the hypervisor stick it in the crashkernel space.
> 
> This is all hand-waving of course. There probably would be a need
> to figure out how much space you have in the reserved Xen's
> 'crashkernel' memory region too.

This is probably a mad idea but it's Monday morning and I'm sleep
deprived so I'll throw it out there...

What about adding DOMID_KEXEC (similar to DOMID_IO etc)? This would allow
dom0 to map the kexec memory space with the usual privcmd mmap
hypercalls and build things in it directly.

OK, I suspect this might not be practical for a variety of reasons (lack
of a p2m for such domains so no way to find out the list of mfns, dom0
userspace simply doesn't have sufficient context to write sensible
things here, etc) but maybe someone has a better head on today...

Ian.



Re: [Xen-devel] [PATCH v3 00/11] xen: Initial kexec/kdump implementation

2013-01-04 Thread Konrad Rzeszutek Wilk
On Fri, Jan 04, 2013 at 06:07:51PM +0100, Daniel Kiper wrote:
> On Fri, Jan 04, 2013 at 02:41:17PM +0000, Jan Beulich wrote:
> > >>> On 04.01.13 at 15:22, Daniel Kiper  wrote:
> > > On Wed, Jan 02, 2013 at 11:26:43AM +0000, Andrew Cooper wrote:
> > >> /sbin/kexec can load the "Xen" crash kernel itself by issuing
> > >> hypercalls using /dev/xen/privcmd.  This would remove the need for
> > >> the dom0 kernel to distinguish between loading a crash kernel for
> > >> itself and loading a kernel for Xen.
> > >>
> > >> Or is this just a silly idea complicating the matter?
> > >
> > > This is impossible with current Xen kexec/kdump interface.
> >
> > Why?
> 
> Because current KEXEC_CMD_kexec_load does not load kernel
> image and other things into Xen memory. It means that it
> should live somewhere in dom0 Linux kernel memory.

We could have a very simple hypercall which would have:

struct fancy_new_hypercall {
xen_pfn_t payload; // IN
ssize_t len; // IN
#define DATA (1<<1)
#define DATA_EOF (1<<2)
#define DATA_KERNEL (1<<3)
#define DATA_RAMDISK (1<<4)
unsigned int flags; // IN
unsigned int status; // OUT
};

which would in a loop just iterate over the payloads and
let the hypervisor stick it in the crashkernel space.

This is all hand-waving of course. There probably would be a need
to figure out how much space you have in the reserved Xen's
'crashkernel' memory region too.
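
To make the "iterate over the payloads" loop a bit more concrete, here is a
hand-wavy sketch in the same spirit. The flag values are the ones proposed
above; everything else is an assumption: the payload is shown as a plain
pointer so the example is self-contained, and the hypercall itself is replaced
by a hypothetical callback (sketch_issue_fn).

```c
#include <assert.h>
#include <stddef.h>

/* Flag values as in the proposal above. */
#define DATA         (1u << 1)
#define DATA_EOF     (1u << 2)
#define DATA_KERNEL  (1u << 3)
#define DATA_RAMDISK (1u << 4)

struct sketch_load_op {
    const void *payload;   /* IN: chunk to copy (plain pointer, an assumption) */
    size_t len;            /* IN: chunk length in bytes */
    unsigned int flags;    /* IN: DATA | kind, plus DATA_EOF on the last chunk */
    unsigned int status;   /* OUT: 0 means success in this sketch */
};

/* Hypothetical stand-in for issuing one hypercall. */
typedef int (*sketch_issue_fn)(struct sketch_load_op *op, void *ctx);

/* Feed one image in fixed-size chunks. Returns the number of chunks issued,
 * or -1 as soon as one of them fails. */
long sketch_load_image(const unsigned char *img, size_t len, size_t chunk,
                       unsigned int kind, sketch_issue_fn issue, void *ctx)
{
    assert(img != NULL && chunk != 0 && issue != NULL);
    long n = 0;
    size_t off = 0;
    do {
        size_t this_len = (len - off < chunk) ? len - off : chunk;
        struct sketch_load_op op = { img + off, this_len, DATA | kind, 0 };
        off += this_len;
        if (off == len)
            op.flags |= DATA_EOF;   /* mark the final chunk */
        if (issue(&op, ctx) != 0 || op.status != 0)
            return -1;
        n++;
    } while (off < len);
    return n;
}

/* Trivial callback that just records what it was given. */
struct sketch_tally {
    size_t bytes;
    unsigned int last_flags;
};

int sketch_tally_issue(struct sketch_load_op *op, void *ctx)
{
    struct sketch_tally *t = ctx;
    t->bytes += op->len;
    t->last_flags = op->flags;
    op->status = 0;
    return 0;
}
```

A real version would also need the space check mentioned above before the
first chunk is sent.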

> 
> Daniel


Re: [Xen-devel] [PATCH v3 00/11] xen: Initial kexec/kdump implementation

2013-01-04 Thread Daniel Kiper
On Fri, Jan 04, 2013 at 02:41:17PM +0000, Jan Beulich wrote:
> >>> On 04.01.13 at 15:22, Daniel Kiper  wrote:
> > On Wed, Jan 02, 2013 at 11:26:43AM +0000, Andrew Cooper wrote:
> >> /sbin/kexec can load the "Xen" crash kernel itself by issuing
> >> hypercalls using /dev/xen/privcmd.  This would remove the need for
> >> the dom0 kernel to distinguish between loading a crash kernel for
> >> itself and loading a kernel for Xen.
> >>
> >> Or is this just a silly idea complicating the matter?
> >
> > This is impossible with current Xen kexec/kdump interface.
>
> Why?

Because current KEXEC_CMD_kexec_load does not load kernel
image and other things into Xen memory. It means that it
should live somewhere in dom0 Linux kernel memory.

Daniel


Re: [Xen-devel] [PATCH v3 00/11] xen: Initial kexec/kdump implementation

2013-01-04 Thread Daniel Kiper
On Fri, Jan 04, 2013 at 02:38:44PM +0000, David Vrabel wrote:
> On 04/01/13 14:22, Daniel Kiper wrote:
> > On Wed, Jan 02, 2013 at 11:26:43AM +0000, Andrew Cooper wrote:
> >> On 27/12/12 18:02, Eric W. Biederman wrote:
> >>> Andrew Cooper  writes:
> >>>
> >>>> On 27/12/2012 07:53, Eric W. Biederman wrote:
> >>>>> The syscall ABI still has the wrong semantics.
> >>>>>
> >>>>> Aka totally unmaintainable and unmergeable.
> >>>>>
> >>>>> The concept of domU support is also strange.  What does domU support
> >>>>> even mean, when the dom0 support is loading a kernel to pick up Xen
> >>>>> when Xen falls over.
> >>>> There are two requirements pulling at this patch series, but I agree
> >>>> that we need to clarify them.
> >>> It probably makes sense to split them apart a little even.
> >>>
> >>>
> >>
> >> Thinking about this split, there might be a way to simplify it even more.
> >>
> >> /sbin/kexec can load the "Xen" crash kernel itself by issuing
> >> hypercalls using /dev/xen/privcmd.  This would remove the need for
> >> the dom0 kernel to distinguish between loading a crash kernel for
> >> itself and loading a kernel for Xen.
> >>
> >> Or is this just a silly idea complicating the matter?
> >
> > This is impossible with current Xen kexec/kdump interface.
> > It should be changed to do that. However, I suppose that
> > Xen community would not be interested in such changes.
>
> I don't see why the hypercall ABI cannot be extended with new sub-ops
> that do the right thing -- the existing ABI is a bit weird.
>
> I plan to start prototyping something shortly (hopefully next week) for
> the Xen kexec case.

Wow... As I can see, this time the Xen community is interested.
That is great. I agree that the current kexec interface is not ideal.

David, I am happy to help in that process. However, if you wish I could
carry it myself. Anyway, it looks like I should hold off on my
Linux kexec/kdump patches.

My .5 cents:
  - We should focus on KEXEC_CMD_kexec_load and KEXEC_CMD_kexec_unload;
    probably we should introduce KEXEC_CMD_kexec_load2 and
    KEXEC_CMD_kexec_unload2; load should __LOAD__ the kernel image and
    other things into hypervisor memory; I suppose that almost all of it
    could be copied from linux/kernel/kexec.c and
    linux/arch/x86/kernel/{machine_kexec_$(BITS).c,relocate_kernel_$(BITS).c};
    I think that KEXEC_CMD_kexec should stay as is.
  - Hmmm... Now I think that we should still use the kexec syscall to load
    the image into Xen memory (with the new KEXEC_CMD_kexec_load2) because
    it establishes everything that is needed to call kdump if dom0 crashes;
    however, I could be wrong...
  - Last but not least, we should think about support for PV guests too.
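
To make the load2/unload2 idea concrete, here is a rough sketch of a
sub-op dispatcher in which load really copies the image into memory owned
by the (simulated) hypervisor. The enum values, struct hv_image and
kexec_op() are hypothetical names for illustration only, and malloc()
stands in for hypervisor memory; nothing here is an existing Xen
interface.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Sub-op numbers are illustrative; the *2 names follow the proposal
 * above and do not exist in any released interface. */
enum kexec_cmd {
    CMD_KEXEC,
    CMD_LOAD,
    CMD_UNLOAD,
    CMD_LOAD2,
    CMD_UNLOAD2
};

/* Image held in (simulated) hypervisor memory. */
struct hv_image {
    void *copy;
    size_t len;
};

static int kexec_op(struct hv_image *img, enum kexec_cmd cmd,
                    const void *buf, size_t len)
{
    assert(img != NULL);

    switch (cmd) {
    case CMD_LOAD2:
        if (img->copy != NULL || buf == NULL || len == 0)
            return -1;         /* already loaded, or nothing to load */
        img->copy = malloc(len);
        if (img->copy == NULL)
            return -1;
        memcpy(img->copy, buf, len);   /* the caller's buffer is no longer needed */
        img->len = len;
        return 0;
    case CMD_UNLOAD2:
        if (img->copy == NULL)
            return -1;         /* nothing loaded */
        free(img->copy);
        img->copy = NULL;
        img->len = 0;
        return 0;
    default:
        return -2;             /* legacy sub-ops are outside this sketch */
    }
}
```

Because the image is copied at load time, later changes to the caller's
buffer do not affect what would be booted.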

Daniel


Re: [Xen-devel] [PATCH v3 00/11] xen: Initial kexec/kdump implementation

2013-01-04 Thread Jan Beulich
>>> On 04.01.13 at 15:22, Daniel Kiper wrote:
> On Wed, Jan 02, 2013 at 11:26:43AM +0000, Andrew Cooper wrote:
>> /sbin/kexec can load the "Xen" crash kernel itself by issuing
>> hypercalls using /dev/xen/privcmd.  This would remove the need for
>> the dom0 kernel to distinguish between loading a crash kernel for
>> itself and loading a kernel for Xen.
>>
>> Or is this just a silly idea complicating the matter?
> 
> This is impossible with current Xen kexec/kdump interface.

Why?

> It should be changed to do that. However, I suppose that
> Xen community would not be interested in such changes.

And again - why?

Jan



Re: [Xen-devel] [PATCH v3 00/11] xen: Initial kexec/kdump implementation

2013-01-04 Thread David Vrabel
On 04/01/13 14:22, Daniel Kiper wrote:
> On Wed, Jan 02, 2013 at 11:26:43AM +0000, Andrew Cooper wrote:
>> On 27/12/12 18:02, Eric W. Biederman wrote:
>>> Andrew Cooper writes:
>>>
>>>> On 27/12/2012 07:53, Eric W. Biederman wrote:
>>>>> The syscall ABI still has the wrong semantics.
>>>>>
>>>>> Aka totally unmaintainable and unmergeable.
>>>>>
>>>>> The concept of domU support is also strange.  What does domU support even
>>>>> mean, when the dom0 support is loading a kernel to pick up Xen when Xen
>>>>> falls over.
>>>> There are two requirements pulling at this patch series, but I agree
>>>> that we need to clarify them.
>>> It probably makes sense to split them apart a little even.
>>>
>>>
>>
>> Thinking about this split, there might be a way to simplify it even more.
>>
>> /sbin/kexec can load the "Xen" crash kernel itself by issuing
>> hypercalls using /dev/xen/privcmd.  This would remove the need for
>> the dom0 kernel to distinguish between loading a crash kernel for
>> itself and loading a kernel for Xen.
>>
>> Or is this just a silly idea complicating the matter?
> 
> This is impossible with current Xen kexec/kdump interface.
> It should be changed to do that. However, I suppose that
> Xen community would not be interested in such changes.

I don't see why the hypercall ABI cannot be extended with new sub-ops
that do the right thing -- the existing ABI is a bit weird.

I plan to start prototyping something shortly (hopefully next week) for
the Xen kexec case.

David


Re: [Xen-devel] [PATCH v3 00/11] xen: Initial kexec/kdump implementation

2013-01-04 Thread Konrad Rzeszutek Wilk
On Fri, Jan 04, 2013 at 03:22:57PM +0100, Daniel Kiper wrote:
> On Wed, Jan 02, 2013 at 11:26:43AM +0000, Andrew Cooper wrote:
> > On 27/12/12 18:02, Eric W. Biederman wrote:
> > >Andrew Cooper writes:
> > >
> > >>On 27/12/2012 07:53, Eric W. Biederman wrote:
> > >>>The syscall ABI still has the wrong semantics.
> > >>>
> > >>>Aka totally unmaintainable and unmergeable.
> > >>>
> > >>>The concept of domU support is also strange.  What does domU support
> > >>>even mean, when the dom0 support is loading a kernel to pick up Xen when
> > >>>Xen falls over.
> > >>There are two requirements pulling at this patch series, but I agree
> > >>that we need to clarify them.
> > >It probably makes sense to split them apart a little even.
> > >
> > >
> >
> > Thinking about this split, there might be a way to simplify it even more.
> >
> > /sbin/kexec can load the "Xen" crash kernel itself by issuing
> > hypercalls using /dev/xen/privcmd.  This would remove the need for
> > the dom0 kernel to distinguish between loading a crash kernel for
> > itself and loading a kernel for Xen.
> >
> > Or is this just a silly idea complicating the matter?
> 
> This is impossible with current Xen kexec/kdump interface.
> It should be changed to do that. However, I suppose that
> Xen community would not be interested in such changes.

Why not? What is involved in it? IMHO anybody would welcome
a clean new design that solves this thorny problem.


Re: [Xen-devel] [PATCH v3 00/11] xen: Initial kexec/kdump implementation

2013-01-04 Thread Ian Campbell
On Fri, 2013-01-04 at 14:22 +0000, Daniel Kiper wrote:
> On Wed, Jan 02, 2013 at 11:26:43AM +0000, Andrew Cooper wrote:
> > On 27/12/12 18:02, Eric W. Biederman wrote:
> > >Andrew Cooper writes:
> > >
> > >>On 27/12/2012 07:53, Eric W. Biederman wrote:
> > >>>The syscall ABI still has the wrong semantics.
> > >>>
> > >>>Aka totally unmaintainable and unmergeable.
> > >>>
> > >>>The concept of domU support is also strange.  What does domU support
> > >>>even mean, when the dom0 support is loading a kernel to pick up Xen when
> > >>>Xen falls over.
> > >>There are two requirements pulling at this patch series, but I agree
> > >>that we need to clarify them.
> > >It probably makes sense to split them apart a little even.
> > >
> > >
> >
> > Thinking about this split, there might be a way to simplify it even more.
> >
> > /sbin/kexec can load the "Xen" crash kernel itself by issuing
> > hypercalls using /dev/xen/privcmd.  This would remove the need for
> > the dom0 kernel to distinguish between loading a crash kernel for
> > itself and loading a kernel for Xen.
> >
> > Or is this just a silly idea complicating the matter?
> 
> This is impossible with current Xen kexec/kdump interface.
> It should be changed to do that. However, I suppose that
> Xen community would not be interested in such changes.

The current HYPERVISOR_kexec interface is pretty fricken bad (it
basically hardcodes the Linux Circa-2.6.18 internal interface!).

I'd be all for a new HYPERVISOR_kexec (with the old gaining a _compat
suffix) which implements something more generic that isn't tied to a
particular dom0 kernel implementation (be it differing versions of Linux
or e.g. *BSD).

If that enables /sbin/kexec to load the kernel directly then so much the
better, assuming the /sbin/kexec maintainers are happy with that
approach.
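
One way to picture the "_compat" arrangement is a thin shim that
translates the old request into the new generic one and then calls the
new handler. The sketch below assumes made-up layouts for both requests
(struct load_req, struct load_req_compat) purely for illustration;
neither matches the real Xen structures.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_SEGS 4u

/* Hypothetical generic request: a list of destination segments. */
struct seg {
    uint64_t dest;
    uint64_t len;
};

struct load_req {
    struct seg segs[MAX_SEGS];
    unsigned int nr_segs;
    uint64_t entry;
};

/* Hypothetical legacy request: one image, start address and size only. */
struct load_req_compat {
    uint64_t start;
    uint64_t size;
};

/* New handler: returns the total number of bytes accepted, or -1. */
static long long do_kexec_load(const struct load_req *req)
{
    long long total = 0;
    unsigned int i;

    assert(req != NULL);
    if (req->nr_segs == 0 || req->nr_segs > MAX_SEGS)
        return -1;
    for (i = 0; i < req->nr_segs; i++)
        total += (long long)req->segs[i].len;
    return total;
}

/* Compat shim: express the old request in the new format, then forward. */
static long long do_kexec_load_compat(const struct load_req_compat *old)
{
    struct load_req req = { .nr_segs = 1, .entry = old->start };

    req.segs[0].dest = old->start;
    req.segs[0].len = old->size;
    return do_kexec_load(&req);
}
```

With this shape only one real implementation has to be maintained; the
old entry point is reduced to a translation step.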

Ian.



Re: [Xen-devel] [PATCH v3 00/11] xen: Initial kexec/kdump implementation

2013-01-04 Thread Daniel Kiper
On Wed, Jan 02, 2013 at 11:26:43AM +0000, Andrew Cooper wrote:
> On 27/12/12 18:02, Eric W. Biederman wrote:
> >Andrew Cooper writes:
> >
> >>On 27/12/2012 07:53, Eric W. Biederman wrote:
> >>>The syscall ABI still has the wrong semantics.
> >>>
> >>>Aka totally unmaintainable and unmergeable.
> >>>
> >>>The concept of domU support is also strange.  What does domU support even
> >>>mean, when the dom0 support is loading a kernel to pick up Xen when Xen
> >>>falls over.
> >>There are two requirements pulling at this patch series, but I agree
> >>that we need to clarify them.
> >It probably makes sense to split them apart a little even.
> >
> >
>
> Thinking about this split, there might be a way to simplify it even more.
>
> /sbin/kexec can load the "Xen" crash kernel itself by issuing
> hypercalls using /dev/xen/privcmd.  This would remove the need for
> the dom0 kernel to distinguish between loading a crash kernel for
> itself and loading a kernel for Xen.
>
> Or is this just a silly idea complicating the matter?

This is impossible with current Xen kexec/kdump interface.
It should be changed to do that. However, I suppose that
Xen community would not be interested in such changes.

Daniel


Re: [Xen-devel] [PATCH v3 00/11] xen: Initial kexec/kdump implementation

2013-01-03 Thread Jan Beulich
>>> On 02.01.13 at 12:26, Andrew Cooper wrote:
> On 27/12/12 18:02, Eric W. Biederman wrote:
>> It probably makes sense to split them apart a little even.
> 
> Thinking about this split, there might be a way to simplify it even more.
> 
> /sbin/kexec can load the "Xen" crash kernel itself by issuing hypercalls 
> using /dev/xen/privcmd.  This would remove the need for the dom0 kernel 
> to distinguish between loading a crash kernel for itself and loading a 
> kernel for Xen.
> 
> Or is this just a silly idea complicating the matter?

I don't think so (and suggested that before as a response to an
earlier submission of this patch set), and it would make most of
the discussion here moot.

Jan



Re: [PATCH v3 00/11] xen: Initial kexec/kdump implementation

2013-01-02 Thread Ian Campbell
On Thu, 2012-12-27 at 14:18 +0000, Andrew Cooper wrote:
> Many cloud customers and service providers want the ability for a VM
> administrator to be able to load a kdump/kexec kernel within a
> domain[1].  This allows the VM administrator to take more proactive
> steps to isolate the cause of a crash, the state of which is most likely
> discarded while tearing down the domain.  The result being that as far
> as Xen is concerned, the domain is still alive, while the kdump
> kernel/environment can work its usual magic.  I am not aware of any
> feature like this existing in the past.

I have a feeling that some versions of the classic-Xen port supported
domU kexec as well. Certainly there was some work on that back in 2005,
although I can't see much evidence that that attempt ever went anywhere
so maybe I'm imagining things.

It's possible that I'm confusing domU kexec support with support for
domU kexec in some dom0 kernels. That was/is used to support "kexec"
from a PV bootloader into the real kernel (which looks to the host a lot
like a domU kexec would).

Ian.



Re: [Xen-devel] [PATCH v3 00/11] xen: Initial kexec/kdump implementation

2013-01-02 Thread Eric W. Biederman
Andrew Cooper writes:

> On 27/12/12 18:02, Eric W. Biederman wrote:
>> Andrew Cooper writes:
>>
>>> On 27/12/2012 07:53, Eric W. Biederman wrote:
>>>> The syscall ABI still has the wrong semantics.
>>>>
>>>> Aka totally unmaintainable and unmergeable.
>>>>
>>>> The concept of domU support is also strange.  What does domU support even
>>>> mean, when the dom0 support is loading a kernel to pick up Xen when Xen
>>>> falls over.
>>> There are two requirements pulling at this patch series, but I agree
>>> that we need to clarify them.
>> It probably makes sense to split them apart a little even.
>>
>>
>
> Thinking about this split, there might be a way to simplify it even more.
>
> /sbin/kexec can load the "Xen" crash kernel itself by issuing
> hypercalls using /dev/xen/privcmd.  This would remove the need for the
> dom0 kernel to distinguish between loading a crash kernel for itself
> and loading a kernel for Xen.
>
> Or is this just a silly idea complicating the matter?

At a first approximation it sounds reasonable.

If the Xen kexec actually copies the loaded kernel to somewhere internal,
as the Linux kexec does, that would be entirely reasonable.  If Xen has
other requirements for the dom0 case, you might not be able to implement
the call without Linux kernel support.

But if you can implement it all in terms of /dev/xen/privcmd go for it.
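
For reference, "copies the loaded kernel to somewhere internal" could
look roughly like the sketch below. The segment fields are modelled on
the buf/bufsz/mem/memsz layout of the Linux kexec segment, but here mem
is just an offset into a plain byte array standing in for the internal
region; struct segment and copy_segments() are illustrative names, not
an existing API.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* One piece of the image: bufsz bytes of data placed at offset mem,
 * occupying memsz bytes there (the tail is zero-filled). */
struct segment {
    const void *buf;
    size_t bufsz;
    size_t mem;     /* offset into the internal region (an assumption) */
    size_t memsz;
};

static int copy_segments(unsigned char *region, size_t region_size,
                         const struct segment *segs, size_t nr)
{
    size_t i;

    assert(region != NULL);

    /* Validate everything first so a bad segment leaves the region untouched. */
    for (i = 0; i < nr; i++) {
        const struct segment *s = &segs[i];

        if (s->bufsz > s->memsz)
            return -1;
        if (s->mem > region_size || s->memsz > region_size - s->mem)
            return -1;
    }

    for (i = 0; i < nr; i++) {
        const struct segment *s = &segs[i];

        if (s->bufsz > 0)
            memcpy(region + s->mem, s->buf, s->bufsz);
        memset(region + s->mem + s->bufsz, 0, s->memsz - s->bufsz);
    }
    return 0;
}
```

Once the copy has been made, the caller's buffers are no longer needed,
which is the property that makes the privcmd route plausible.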

Eric



Re: [Xen-devel] [PATCH v3 00/11] xen: Initial kexec/kdump implementation

2013-01-02 Thread Andrew Cooper

On 27/12/12 18:02, Eric W. Biederman wrote:
> Andrew Cooper writes:
>
>> On 27/12/2012 07:53, Eric W. Biederman wrote:
>>> The syscall ABI still has the wrong semantics.
>>>
>>> Aka totally unmaintainable and unmergeable.
>>>
>>> The concept of domU support is also strange.  What does domU support even mean,
>>> when the dom0 support is loading a kernel to pick up Xen when Xen falls over.
>> There are two requirements pulling at this patch series, but I agree
>> that we need to clarify them.
> It probably makes sense to split them apart a little even.

Thinking about this split, there might be a way to simplify it even more.

/sbin/kexec can load the "Xen" crash kernel itself by issuing hypercalls 
using /dev/xen/privcmd.  This would remove the need for the dom0 kernel 
to distinguish between loading a crash kernel for itself and loading a 
kernel for Xen.


Or is this just a silly idea complicating the matter?

~Andrew


Re: [Xen-devel] [PATCH v3 00/11] xen: Initial kexec/kdump implementation

2013-01-02 Thread Eric W. Biederman
Andrew Cooper andrew.coop...@citrix.com writes:

> On 27/12/12 18:02, Eric W. Biederman wrote:
>> Andrew Cooper andrew.coop...@citrix.com writes:
>>
>>> On 27/12/2012 07:53, Eric W. Biederman wrote:
>>>> The syscall ABI still has the wrong semantics.
>>>>
>>>> Aka totally unmaintainable and unmergeable.
>>>>
>>>> The concept of domU support is also strange.  What does domU support even
>>>> mean, when the dom0 support is loading a kernel to pick up Xen when Xen
>>>> falls over.
>>> There are two requirements pulling at this patch series, but I agree
>>> that we need to clarify them.
>> It probably makes sense to split them apart a little even.
>
> Thinking about this split, there might be a way to simplify it even more.
>
> /sbin/kexec can load the Xen crash kernel itself by issuing
> hypercalls using /dev/xen/privcmd.  This would remove the need for the
> dom0 kernel to distinguish between loading a crash kernel for itself
> and loading a kernel for Xen.
>
> Or is this just a silly idea complicating the matter?

At a first approximation it sounds reasonable.

If the Xen kexec actually copies the loaded kernel to somewhere internal
like the linux kexec that would be entirely reasonable.  If Xen has
other requirements on the dom0 case you might not be able to implement
the call without linux kernel support.

But if you can implement it all in terms of /dev/xen/privcmd go for it.

Eric



Re: [PATCH v3 00/11] xen: Initial kexec/kdump implementation

2013-01-02 Thread Ian Campbell
On Thu, 2012-12-27 at 14:18 +, Andrew Cooper wrote:
> Many cloud customers and service providers want the ability for a VM
> administrator to be able to load a kdump/kexec kernel within a
> domain[1].  This allows the VM administrator to take more proactive
> steps to isolate the cause of a crash, the state of which is most likely
> discarded while tearing down the domain.  The result being that as far
> as Xen is concerned, the domain is still alive, while the kdump
> kernel/environment can work its usual magic.  I am not aware of any
> feature like this existing in the past.

I have a feeling that some versions of the classic-Xen port supported
domU kexec as well. Certainly there was some work on that back in 2005,
although I can't see much evidence that that attempt ever went anywhere
so maybe I'm imagining things.

It's possible that I'm confusing domU kexec support with support for
domU kexec in some dom0 kernels. That was/is used to support kexec
from a PV bootloader into the real kernel (which looks to the host a lot
like a domU kexec would).

Ian.



Re: [PATCH v3 00/11] xen: Initial kexec/kdump implementation

2012-12-27 Thread Daniel Kiper
> Andrew Cooper  writes:
>
> > On 27/12/2012 07:53, Eric W. Biederman wrote:
> >> The syscall ABI still has the wrong semantics.
> >>
> >> Aka totally unmaintainable and unmergeable.
> >>
> >> The concept of domU support is also strange.  What does domU support even
> >> mean, when the dom0 support is loading a kernel to pick up Xen when Xen
> >> falls over.
> >
> > There are two requirements pulling at this patch series, but I agree
> > that we need to clarify them.
>
> It probably makes sense to split them apart a little even.
>
> > When dom0 loads a crash kernel, it is loading one for Xen to use.  As a
> > dom0 crash causes a Xen crash, having dom0 set up a kdump kernel for
> > itself is completely useless.  This ability is present in "classic Xen
> > dom0" kernels, but the feature is currently missing in PVOPS.
>
> > Many cloud customers and service providers want the ability for a VM
> > administrator to be able to load a kdump/kexec kernel within a
> > domain[1].  This allows the VM administrator to take more proactive
> > steps to isolate the cause of a crash, the state of which is most likely
> > discarded while tearing down the domain.  The result being that as far
> > as Xen is concerned, the domain is still alive, while the kdump
> > kernel/environment can work its usual magic.  I am not aware of any
> > feature like this existing in the past.
>
> Which makes domU support semantically just the normal kexec/kdump
> support.  Got it.

To some extent. That is true for HVM and PVonHVM guests. However,
PV guests require a slightly different kexec/kdump implementation
than plain kexec/kdump. The proposed firmware support has almost
all of the required features. The few PV-guest-specific features will
be added later (after the generic firmware support, which is
sufficient at least for dom0, has been agreed).

It looks like I should replace domU with PV guest in the patch description.

> The point of implementing domU is for those times when the hypervisor
> admin and the kernel admin are different.

Right.

> For domU support modifying or adding alternate versions of
> machine_kexec.c and relocate_kernel.S to add paravirtualization support
> makes sense.

That is not sufficient. Please see above.

> There is the practical argument that for implementation efficiency of
> crash dumps it would be better if that support came from the hypervisor
> or the hypervisor environment.  But this gets into the practical reality

I am thinking about that.

> that the hypervisor environment does not do that today.  Furthermore
> kexec all by itself working in a paravirtualized environment under Xen
> makes sense.
>
> domU support is what Peter was worrying about for cleanliness, and
> we need some x86 backend ops there, and generally to be careful.

As far as I know, we do not need any additional pv_ops stuff
if we place everything needed in the kexec firmware support.

Daniel


Re: [PATCH v3 00/11] xen: Initial kexec/kdump implementation

2012-12-27 Thread Daniel Kiper
> On 12/26/2012 06:18 PM, Daniel Kiper wrote:
> > Hi,
> >
> > This set of patches contains initial kexec/kdump implementation for Xen v3.
> > Currently only dom0 is supported, however, almost all infrastructure
> > required for domU support is ready.
> >
> > Jan Beulich suggested merging the Xen x86 assembler code with the baremetal
> > x86 code. This could simplify the kernel code and reduce its size a bit.
> > However, this solution requires some changes in the baremetal x86 code.
> > First of all, the code which establishes the transition page table should
> > be moved back from machine_kexec_$(BITS).c to relocate_kernel_$(BITS).S.
> > Another important thing which should be changed in that case is the format
> > of the page_list array: the Xen kexec hypercall requires physical addresses
> > to alternate with virtual ones. These and other required changes have not
> > been made in this version because I am not sure that solution will be
> > accepted by the kexec/kdump maintainers.
> > I hope that this email sparks a discussion about that topic.
>
> I want a detailed list of the constraints that this assumes and 
> therefore imposes on the native implementation as a result of this.  We 
> have had way too many patches where Xen PV hacks effectively nailgun 
> arbitrary, and sometimes poor, design decisions in place and now we 
> can't fix them.

OK, but I now think that we should postpone this discussion
until all details regarding the generic kexec/kdump code
have been agreed. Sorry about that.

Daniel


Re: [PATCH v3 00/11] xen: Initial kexec/kdump implementation

2012-12-27 Thread Eric W. Biederman
Andrew Cooper  writes:

> On 27/12/2012 07:53, Eric W. Biederman wrote:
>> The syscall ABI still has the wrong semantics.
>>
>> Aka totally unmaintainable and unmergeable.
>>
>> The concept of domU support is also strange.  What does domU support even 
>> mean, when the dom0 support is loading a kernel to pick up Xen when Xen 
>> falls over.
>
> There are two requirements pulling at this patch series, but I agree
> that we need to clarify them.

It probably makes sense to split them apart a little even.

> When dom0 loads a crash kernel, it is loading one for Xen to use.  As a
> dom0 crash causes a Xen crash, having dom0 set up a kdump kernel for
> itself is completely useless.  This ability is present in "classic Xen
> dom0" kernels, but the feature is currently missing in PVOPS.

> Many cloud customers and service providers want the ability for a VM
> administrator to be able to load a kdump/kexec kernel within a
> domain[1].  This allows the VM administrator to take more proactive
> steps to isolate the cause of a crash, the state of which is most likely
> discarded while tearing down the domain.  The result being that as far
> as Xen is concerned, the domain is still alive, while the kdump
> kernel/environment can work its usual magic.  I am not aware of any
> feature like this existing in the past.

Which makes domU support semantically just the normal kexec/kdump
support.  Got it.

The point of implementing domU is for those times when the hypervisor
admin and the kernel admin are different.

For domU support, modifying or adding alternate versions of
machine_kexec.c and relocate_kernel.S to add paravirtualization support
makes sense.

There is the practical argument that for implementation efficiency of
crash dumps it would be better if that support came from the hypervisor
or the hypervisor environment.  But this gets into the practical reality
that the hypervisor environment does not do that today.  Furthermore
kexec all by itself working in a paravirtualized environment under Xen
makes sense.

domU support is what Peter was worrying about for cleanliness, and
we need some x86 backend ops there, and generally to be careful.


For dom0 support we need to extend the kexec_load system call, and
get it right.

When we are done I expect both dom0 and domU support of kexec to work
in dom0.  I don't know if the normal kexec or kdump case will ever make
sense in dom0 but there is no reason for that case to be broken.

> ~Andrew
>
> [1] http://lists.xen.org/archives/html/xen-devel/2012-11/msg01274.html

Eric


Re: [PATCH v3 00/11] xen: Initial kexec/kdump implementation

2012-12-27 Thread Andrew Cooper
On 27/12/2012 07:53, Eric W. Biederman wrote:
> The syscall ABI still has the wrong semantics.
>
> Aka totally unmaintainable and unmergeable.
>
> The concept of domU support is also strange.  What does domU support even 
> mean, when the dom0 support is loading a kernel to pick up Xen when Xen falls 
> over.

There are two requirements pulling at this patch series, but I agree
that we need to clarify them.

When dom0 loads a crash kernel, it is loading one for Xen to use.  As a
dom0 crash causes a Xen crash, having dom0 set up a kdump kernel for
itself is completely useless.  This ability is present in "classic Xen
dom0" kernels, but the feature is currently missing in PVOPS.

Many cloud customers and service providers want the ability for a VM
administrator to be able to load a kdump/kexec kernel within a
domain[1].  This allows the VM administrator to take more proactive
steps to isolate the cause of a crash, the state of which is most likely
discarded while tearing down the domain.  The result being that as far
as Xen is concerned, the domain is still alive, while the kdump
kernel/environment can work its usual magic.  I am not aware of any
feature like this existing in the past.

~Andrew

[1] http://lists.xen.org/archives/html/xen-devel/2012-11/msg01274.html

>
> I expect a lot of decisions about what code can be shared and what code can't
> are going to be driven by the simple question: what does the syscall mean?
>
> Sharing machine_kexec.c and relocate_kernel.S does not make much sense to me
> when what you are doing is effectively passing your arguments through to the
> Xen version of kexec.
>
> Either Xen has its own version of those routines or I expect the Xen version
> of kexec is buggy.  I can't imagine what sharing that code would mean.  By
> the same token I can't see any need to duplicate the code either.
>
> Furthermore, since this is just passing data from one version of the syscall
> to another, I expect you can share the majority of the code across all
> architectures that implement Xen.  The only part I can see being arch
> specific is the Xen syscall stub.
>
> With respect to the proposed semantics of silently giving the kexec system
> call a different meaning when running under Xen: /sbin/kexec has to act
> somewhat differently when loading code into the Xen hypervisor, so there is
> no point in not making that explicit in the ABI.
>
> Eric
>



Re: [PATCH v3 00/11] xen: Initial kexec/kdump implementation

2012-12-26 Thread Eric W. Biederman
The syscall ABI still has the wrong semantics.

Aka totally unmaintainable and unmergeable.

The concept of domU support is also strange.  What does domU support even mean,
when the dom0 support is loading a kernel to pick up Xen when Xen falls over?

I expect a lot of decisions about what code can be shared and what code can't
are going to be driven by the simple question: what does the syscall mean?

Sharing machine_kexec.c and relocate_kernel.S does not make much sense to me
when what you are doing is effectively passing your arguments through to the
Xen version of kexec.

Either Xen has its own version of those routines or I expect the Xen version
of kexec is buggy.  I can't imagine what sharing that code would mean.  By the
same token I can't see any need to duplicate the code either.

Furthermore, since this is just passing data from one version of the syscall to
another, I expect you can share the majority of the code across all
architectures that implement Xen.  The only part I can see being arch specific
is the Xen syscall stub.

With respect to the proposed semantics of silently giving the kexec system call
a different meaning when running under Xen: /sbin/kexec has to act somewhat
differently when loading code into the Xen hypervisor, so there is no point in
not making that explicit in the ABI.

Eric
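
One way to read "explicit in the ABI" is a dedicated flag on the load call, so that the target of the load never depends on what the kernel happens to be running on. The sketch below is only an illustration of that idea: KEXEC_TARGET_HYPERVISOR, its value, and kexec_pick_target() are hypothetical names invented here, not part of the kexec_load ABI or of this series. KEXEC_ON_CRASH is the existing flag.

```c
#define KEXEC_ON_CRASH          0x00000001UL  /* existing kexec_load flag */
#define KEXEC_TARGET_HYPERVISOR 0x00000100UL  /* hypothetical, illustration only */

enum kexec_target {
    TARGET_KERNEL,      /* image is for the running kernel itself */
    TARGET_HYPERVISOR   /* image is to be handed to the hypervisor */
};

/* Hypothetical dispatcher: the caller states the target in the flags,
 * nothing is inferred from the environment. */
static enum kexec_target kexec_pick_target(unsigned long flags)
{
    return (flags & KEXEC_TARGET_HYPERVISOR) ? TARGET_HYPERVISOR
                                             : TARGET_KERNEL;
}
```

With something along these lines, /sbin/kexec would set the flag only when it has prepared an image for Xen, and a plain load in dom0 would keep its normal meaning.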



Re: [PATCH v3 00/11] xen: Initial kexec/kdump implementation

2012-12-26 Thread H. Peter Anvin

On 12/26/2012 06:18 PM, Daniel Kiper wrote:
> Hi,
>
> This set of patches contains initial kexec/kdump implementation for Xen v3.
> Currently only dom0 is supported, however, almost all infrastructure
> required for domU support is ready.
>
> Jan Beulich suggested merging the Xen x86 assembler code with the baremetal
> x86 code. This could simplify the kernel code and reduce its size a bit.
> However, this solution requires some changes in the baremetal x86 code.
> First of all, the code which establishes the transition page table should
> be moved back from machine_kexec_$(BITS).c to relocate_kernel_$(BITS).S.
> Another important thing which should be changed in that case is the format
> of the page_list array: the Xen kexec hypercall requires physical addresses
> to alternate with virtual ones. These and other required changes have not
> been made in this version because I am not sure that solution will be
> accepted by the kexec/kdump maintainers.
> I hope that this email sparks a discussion about that topic.

I want a detailed list of the constraints that this assumes and 
therefore imposes on the native implementation as a result of this.  We 
have had way too many patches where Xen PV hacks effectively nailgun 
arbitrary, and sometimes poor, design decisions in place and now we 
can't fix them.


-hpa

--
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel.  I don't speak on their behalf.



[PATCH v3 00/11] xen: Initial kexec/kdump implementation

2012-12-26 Thread Daniel Kiper

Hi,

This set of patches contains initial kexec/kdump implementation for Xen v3.
Currently only dom0 is supported, however, almost all infrastructure
required for domU support is ready.

Jan Beulich suggested merging the Xen x86 assembler code with the baremetal
x86 code. This could simplify the kernel code and reduce its size a bit.
However, this solution requires some changes in the baremetal x86 code.
First of all, the code which establishes the transition page table should
be moved back from machine_kexec_$(BITS).c to relocate_kernel_$(BITS).S.
Another important thing which should be changed in that case is the format
of the page_list array: the Xen kexec hypercall requires physical addresses
to alternate with virtual ones. These and other required changes have not
been made in this version because I am not sure that solution will be
accepted by the kexec/kdump maintainers.
I hope that this email sparks a discussion about that topic.

Daniel

 arch/x86/Kconfig                     |    3 +
 arch/x86/include/asm/kexec.h         |   10 +-
 arch/x86/include/asm/xen/hypercall.h |    6 +
 arch/x86/include/asm/xen/kexec.h     |   79
 arch/x86/kernel/machine_kexec_64.c   |   12 +-
 arch/x86/kernel/vmlinux.lds.S        |    7 +-
 arch/x86/xen/Kconfig                 |    1 +
 arch/x86/xen/Makefile                |    3 +
 arch/x86/xen/enlighten.c             |   11 +
 arch/x86/xen/kexec.c                 |  150 +++
 arch/x86/xen/machine_kexec_32.c      |  226 +++
 arch/x86/xen/machine_kexec_64.c      |  318 +++
 arch/x86/xen/relocate_kernel_32.S    |  323 +++
 arch/x86/xen/relocate_kernel_64.S    |  309 ++
 drivers/xen/sys-hypervisor.c         |   42 ++-
 include/linux/kexec.h                |   26 ++-
 include/xen/interface/xen.h          |   33 ++
 kernel/Makefile                      |    1 +
 kernel/kexec-firmware.c              |  743 ++
 kernel/kexec.c                       |   46 ++-
 20 files changed, 2331 insertions(+), 18 deletions(-)

Daniel Kiper (11):
  kexec: introduce kexec firmware support
  x86/kexec: Add extra pointers to transition page table PGD, PUD, PMD and PTE
  xen: Introduce architecture independent data for kexec/kdump
  x86/xen: Introduce architecture dependent data for kexec/kdump
  x86/xen: Register resources required by kexec-tools
  x86/xen: Add i386 kexec/kdump implementation
  x86/xen: Add x86_64 kexec/kdump implementation
  x86/xen: Add kexec/kdump Kconfig and makefile rules
  x86/xen/enlighten: Add init and crash kexec/kdump hooks
  drivers/xen: Export vmcoreinfo through sysfs
  x86: Add Xen kexec control code size check to linker script
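
The page_list format mentioned in the cover letter (physical addresses alternating with virtual ones) can be pictured with a small sketch. This is an illustration under stated assumptions, not code from the series: struct page_ref and fill_page_list() are hypothetical names, and the real array also has fixed slot meanings that are not modelled here.

```c
#include <stddef.h>
#include <stdint.h>

/* One page needed by the relocation code, known by both addresses. */
struct page_ref {
    uint64_t phys;  /* physical (machine) address */
    uint64_t virt;  /* virtual address of the same page */
};

/* Hypothetical helper: write pages[] into list[] as
 * phys, virt, phys, virt, ...
 * Returns the number of entries written, or 0 if list[] is too small
 * (or if there was nothing to write). */
static size_t fill_page_list(uint64_t *list, size_t list_len,
                             const struct page_ref *pages, size_t n_pages)
{
    if (n_pages > list_len / 2)
        return 0;
    for (size_t i = 0; i < n_pages; i++) {
        list[2 * i]     = pages[i].phys;  /* even slot: physical */
        list[2 * i + 1] = pages[i].virt;  /* odd slot: virtual */
    }
    return 2 * n_pages;
}
```

The point of the sketch is only the layout: any code shared between baremetal and Xen would have to agree on which slot holds which kind of address, which is one of the constraints asked about in the replies.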


[PATCH v3 00/11] xen: Initial kexec/kdump implementation

2012-12-26 Thread Daniel Kiper

Hi,

This set of patches contains initial kexec/kdump implementation for Xen v3.
Currently only dom0 is supported, however, almost all infrustructure
required for domU support is ready.

Jan Beulich suggested to merge Xen x86 assembler code with baremetal x86 code.
This could simplify and reduce a bit size of kernel code. However, this solution
requires some changes in baremetal x86 code. First of all code which establishes
transition page table should be moved back from machine_kexec_$(BITS).c to
relocate_kernel_$(BITS).S. Another important thing which should be changed in 
that
case is format of page_list array. Xen kexec hypercall requires to alternate 
physical
addresses with virtual ones. These and other required stuff have not been done 
in that
version because I am not sure that solution will be accepted by kexec/kdump 
maintainers.
I hope that this email spark discussion about that topic.

Daniel

 arch/x86/Kconfig |3 +
 arch/x86/include/asm/kexec.h |   10 +-
 arch/x86/include/asm/xen/hypercall.h |6 +
 arch/x86/include/asm/xen/kexec.h |   79 
 arch/x86/kernel/machine_kexec_64.c   |   12 +-
 arch/x86/kernel/vmlinux.lds.S|7 +-
 arch/x86/xen/Kconfig |1 +
 arch/x86/xen/Makefile|3 +
 arch/x86/xen/enlighten.c |   11 +
 arch/x86/xen/kexec.c |  150 +++
 arch/x86/xen/machine_kexec_32.c  |  226 +++
 arch/x86/xen/machine_kexec_64.c  |  318 +++
 arch/x86/xen/relocate_kernel_32.S|  323 +++
 arch/x86/xen/relocate_kernel_64.S|  309 ++
 drivers/xen/sys-hypervisor.c |   42 ++-
 include/linux/kexec.h|   26 ++-
 include/xen/interface/xen.h  |   33 ++
 kernel/Makefile  |1 +
 kernel/kexec-firmware.c  |  743 ++
 kernel/kexec.c   |   46 ++-
 20 files changed, 2331 insertions(+), 18 deletions(-)

Daniel Kiper (11):
  kexec: introduce kexec firmware support
  x86/kexec: Add extra pointers to transition page table PGD, PUD, PMD and PTE
  xen: Introduce architecture independent data for kexec/kdump
  x86/xen: Introduce architecture dependent data for kexec/kdump
  x86/xen: Register resources required by kexec-tools
  x86/xen: Add i386 kexec/kdump implementation
  x86/xen: Add x86_64 kexec/kdump implementation
  x86/xen: Add kexec/kdump Kconfig and makefile rules
  x86/xen/enlighten: Add init and crash kexec/kdump hooks
  drivers/xen: Export vmcoreinfo through sysfs
  x86: Add Xen kexec control code size check to linker script


Re: [PATCH v3 00/11] xen: Initial kexec/kdump implementation

2012-12-26 Thread H. Peter Anvin

On 12/26/2012 06:18 PM, Daniel Kiper wrote:

> Hi,
>
> This is v3 of the patch set containing the initial kexec/kdump implementation
> for Xen. Currently only dom0 is supported; however, almost all of the
> infrastructure required for domU support is ready.
>
> Jan Beulich suggested merging the Xen x86 assembler code with the baremetal
> x86 code. This could simplify the kernel code and reduce its size a bit.
> However, this solution requires some changes in the baremetal x86 code. First
> of all, the code which establishes the transition page table should be moved
> back from machine_kexec_$(BITS).c to relocate_kernel_$(BITS).S. Another
> important thing which should be changed in that case is the format of the
> page_list array: the Xen kexec hypercall requires physical addresses to
> alternate with virtual ones. These and other required changes have not been
> made in this version because I am not sure that the solution will be accepted
> by the kexec/kdump maintainers.
> I hope that this email will spark a discussion about that topic.



I want a detailed list of the constraints that this assumes and 
therefore imposes on the native implementation as a result of this.  We 
have had way too many patches where Xen PV hacks effectively nailgun 
arbitrary, and sometimes poor, design decisions in place and now we 
can't fix them.


-hpa

--
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel.  I don't speak on their behalf.



Re: [PATCH v3 00/11] xen: Initial kexec/kdump implementation

2012-12-26 Thread Eric W. Biederman
The syscall ABI still has the wrong semantics.

Aka totally unmaintainable and unmergeable.

The concept of domU support is also strange.  What does domU support even mean, 
when the dom0 support is loading a kernel to pick up Xen when Xen falls over?

I expect a lot of decisions about what code can be shared and what code can't 
are going to be driven by the simple question: what does the syscall mean?

Sharing machine_kexec.c and relocate_kernel.S does not make much sense to me 
when what you are doing is effectively passing your arguments through to the 
Xen version of kexec.

Either Xen has its own version of those routines or I expect the Xen version 
of kexec is buggy.   I can't imagine what sharing that code would mean.  By the 
same token I can't see any need to duplicate the code either.

Furthermore since this is just passing data from one version of the syscall to 
another I expect you can share the majority of the code across all 
architectures that implement Xen.  The only part I can see being arch specific 
is the Xen syscall stub.

With respect to the proposed semantics of silently giving the kexec system call 
a different meaning when running under Xen: /sbin/kexec has to act somewhat 
differently when loading code into the Xen hypervisor, so there is no point in 
not making that explicit in the ABI.
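
As a rough illustration of what "explicit in the ABI" could look like, the
sketch below picks the load path from a caller-supplied flag instead of from
whether the kernel happens to run under Xen, and keeps the only
hypervisor-specific piece behind a stub pointer. Every name and value here
(DEMO_KEXEC_TARGET_HYPERVISOR, demo_pick_target, demo_load,
demo_hypercall_stub) is hypothetical; none of it is an existing kexec flag or
Xen interface, and the native path is deliberately left out.

```c
#include <stddef.h>

/* Hypothetical flag bit; not a real kexec_load flag. */
#define DEMO_KEXEC_TARGET_HYPERVISOR 0x1UL

enum demo_target { DEMO_TARGET_NATIVE, DEMO_TARGET_HYPERVISOR };

/* Hypothetical arch-specific stub that would issue the hypercall. */
typedef long (*demo_hypercall_stub)(const void *segments, unsigned long nr);

/* The target depends only on what the caller asked for. */
static enum demo_target demo_pick_target(unsigned long flags)
{
    return (flags & DEMO_KEXEC_TARGET_HYPERVISOR)
        ? DEMO_TARGET_HYPERVISOR : DEMO_TARGET_NATIVE;
}

/* Arch-independent pass-through: hand the segments to the stub unchanged.
 * Returns -1 if the hypervisor path is requested but no stub is available. */
static long demo_load(unsigned long flags, const void *segments,
                      unsigned long nr, demo_hypercall_stub stub)
{
    if (demo_pick_target(flags) == DEMO_TARGET_HYPERVISOR)
        return stub ? stub(segments, nr) : -1;
    return 0; /* native load path elided in this sketch */
}
```

With a shape like this, /sbin/kexec states which image it is loading, and only
the stub would differ between architectures.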

Eric
