On Mon, 2006-08-21 at 21:41 -0600, Eric W. Biederman wrote: > Magnus Damm <[EMAIL PROTECTED]> writes: > > > On Mon, 2006-08-21 at 15:02 -0600, Eric W. Biederman wrote: > >> Magnus Damm <[EMAIL PROTECTED]> writes: > >> > >> > x86_64: Reload CS when startup_64 is used. > >> > > >> > The current x86_64 startup code never reloads CS during the early boot > > process > >> > if the 64-bit function startup_64 is used as entry point. The 32-bit > >> > entry > >> > point startup_32 does the right thing and reloads CS, and this is what > >> > most > >> > people are using if they use bzImage. > >> > > >> > This patch fixes the case when the Linux kernel is booted into using > >> > kexec > >> > under Xen. The Xen hypervisor is using large CS values which makes the > > x86_64 > >> > kernel fail - but only if vmlinux is booted, bzImage works well because > >> > it > >> > is using the 32-bit entry point. > >> > > >> > The main question is if we require that the boot loader should setup CS > >> > to some certain offset to be able to boot the kernel. The sane solution > >> > IMO > >> > should be that the kernel requires that the loaded descriptors are > >> > correct, > >> > but that the exact offset within the GDT the boot loader is using should > >> > not > >> > matter. This is the way the i386 boot works if I understand things > > correctly. > >> > >> What extra reload of cs does Xen introduce? > > > > None, but Xen is using CS values that are very different from Linux: > > > > xen/include/public/arch-x86_64.h: > > > > #define FLAT_RING3_CS32 0xe023 /* GDT index 260 */ > > #define FLAT_RING3_CS64 0xe033 /* GDT index 261 */ > > #define FLAT_RING3_DS32 0xe02b /* GDT index 262 */ > > #define FLAT_RING3_DS64 0x0000 /* NULL selector */ > > #define FLAT_RING3_SS32 0xe02b /* GDT index 262 */ > > #define FLAT_RING3_SS64 0xe02b /* GDT index 262 */ > > > > The main problem is that startup_64 depends on that CS is set to > > __KERNEL_CS when it shouldn't. On top of that the purgatory code in > > kexec-tools doesn't setup CS when using a 64-bit entry point. The > > following (mangled) patch solves that for me: > > I believe the cs shadow registers are fine.
Yeah, I know you do. =) > > --- 0001/purgatory/arch/x86_64/entry64.S > > +++ work/purgatory/arch/x86_64/entry64.S 2006-08-18 > > 15:34:23.000000000 +0900 > > @@ -37,8 +37,9 @@ entry64: > > movl %eax, %fs > > movl %eax, %gs > > > > - /* In 64bit mode the code segment is meaningless */ > > Would you care to explain to me how the above comment is not true. > As I recall the only meaning %cs has is if you are a kernel > or user space process. The base address and everything else > mean nothing. > > In addition the value in %cs never means anything. It is the > values in the cs shadow registers that count. The value in %cs > just reflects where those values in %cs came from. > > So if we never reload %cs and only use the shadow values why does > the value in %cs matter? Well, I cannot explain it well myself. But if you try setting CS to something else than 0x18 and start the x86_64 kernel using startup_64 you will see what I am talking about. It does not work. A good example of when this doesn't work is when my page_table_a patches are used on x86_64 to boot a vmlinux. It does not work because CS is set to something else then __KERNEL_CS. Things just happen to work in most cases today (excluding being rebooted from Xen) because either a) vmlinux is booted and CS is set to 0x18 by Linux before kexec:ing. or b) bzImage is booted (32 bit entry point) > >> I'm not really comfortable with a half virtualized case. > > > > That I don't understand, care to explain more? > > The only case where I can think of that the value in %cs would matter > is if you change %cs. The only way I can see that is if you are > half para-virtualized. Because we are running privileged > instructions in startup_64 it shouldn't work for the paravirtualized > case, and it should work just like it does today in the > fully virtualized case. I'm not talking about running Linux under Xen, this is a standard non-virtualized kernel that is started _from_ a Xen environment using kexec-tools. Think kdump kernel being started by the Xen hypervisor on panic. Does that make sense? Thanks, / magnus _______________________________________________ fastboot mailing list [email protected] https://lists.osdl.org/mailman/listinfo/fastboot
