Re: [RFC, PATCH] Fixup COMPAT_VDSO to work with CONFIG_PARAVIRT
On Mar 15 2007 20:03, Zachary Amsden wrote:
> Well testing that is not so fun. I installed SUSE Pro 9.0, and strings on
> ld.so contains the magic at_sysinfo assert! But it doesn't install TLS
> libraries, so I'll have to install them by hand.

9.0 is kinda old. And if you want some TLS libs, install the _i686_ glibc
package (not done by default).

Jan
--
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
ASR: Address Space Randomization (was: [RFC, PATCH] Fixup COMPAT_VDSO to work with CONFIG_PARAVIRT)
ebiederm wrote:
> I'm tempted to rant on the pure insanity of address space randomization
> but that is a whole other issue...

Please do rant; all I can see asr brings is one big performance hit. Of
course, it's not enough to just attack this at the kernel, but glibc has
to play accordingly as well.

Thanks!

--
Al
Re: [RFC, PATCH] Fixup COMPAT_VDSO to work with CONFIG_PARAVIRT
Eric W. Biederman wrote:
> There are three ways of finding the VDSO.
> - AT_SYSINFO
> - AT_SYSINFO_EHDR
> - known fixed address (see x86_64)
>
> Currently it doesn't sound like you need to deal with the known fixed
> address case but COMPAT_VDSO also provides that.

Yes, I don't think any 32-bit userspace expects a fixed address any more.
64-bit is another matter.

> If userspace uses AT_SYSINFO the premise is that it is expecting
> not to need to perform any relocation processing.
>
> If userspace uses AT_SYSINFO_EHDR it expects it needs to perform
> relocation processing and fixes up whatever needs fixing up.

Correct.

> With the module code we have shown the kernel is capable of performing
> relocation processing at times and it works.

Good point. It would be good to reuse that machinery.

> So is it possible to simply relocate the normal vdso and fixup
> its program header so it shows that relocation is not necessary.
> If you can do that and still export AT_SYSINFO so the problem user
> space still runs you are good. (If you can relocate the vdso
> you should be able to relocate it anywhere).

Yes. The plan is to map the relocated compat vdso at some fixed address
in all processes, and map a non-relocated non-compat vdso at some
randomized address (it will probably be the same bits in either case).
We could map a relocated vdso to a randomized address (ie, only one vdso
mapping), but that would require a per-process copy of the vdso and
effort to relocate on each exec.

> Otherwise it probably just makes sense to simply not export a VDSO
> on those systems.

We did that in an earlier version of the patch, and Ingo complained,
with some justification.

> This would leave COMPAT_VDSO for the case where you must use one magic
> fixed address, and if user space does not require that it means
> COMPAT_VDSO could be completely removed.

FC1 and SuSE 9 both shipped with broken glibcs which require
weak-COMPAT_VDSO (not fixed address, but pre-relocated). There are still
enough of these around that we need to cater to them.

    J
Re: [RFC, PATCH] Fixup COMPAT_VDSO to work with CONFIG_PARAVIRT
Eric W. Biederman wrote:
> I'm not quite familiar with the context. And I'm too lazy to look right now.
> What is the difference with COMPAT_VDSO that it doesn't do relocation?
> What are we preserving?

COMPAT_VDSO causes the link address to be fixed at compile time to match
the virtual address of the VDSO. !COMPAT_VDSO just links at zero.

> The practical question here is if we already have all of the relocation
> logic for the VDSO why do we need to add more?

There wasn't relocation logic before, the VDSO just got remapped to a
different virtual address without any relocation at all. Which is safe,
because it is all hand-coded assembly, relocatable code. But not
complete, since the ELF headers don't have any fixup applied for the
relocation, and there are broken linkers which look at the ELF headers
and assert fail if ph->p_vaddr != _rtld_local._dl_sysinfo_dso; these
broken dynamic linkers are what COMPAT_VDSO is protecting.

> I'm tempted to rant on the pure insanity of address space randomization
> but that is a whole other issue...

Firesticks in ant nests is all I'm saying about that.

Zach
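The condition Zach describes can be sketched in a few lines of C. This is only an illustration of the link-address versus map-address comparison, not glibc's actual code: the helper name `vdso_link_matches_map` is hypothetical, and the sketch assumes a 32-bit ELF image whose program headers are reachable through `e_phoff`.

```c
#include <elf.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not glibc code): returns 1 when the first PT_LOAD
 * segment's link address (p_vaddr) equals the address the image is mapped
 * at, 0 otherwise. A loader that asserts this equality breaks when the
 * vDSO is mapped somewhere other than where it was linked.
 */
static int vdso_link_matches_map(const void *image, uint32_t mapped_at)
{
    const Elf32_Ehdr *ehdr = image;
    const Elf32_Phdr *phdr =
        (const Elf32_Phdr *)((const char *)image + ehdr->e_phoff);

    for (size_t i = 0; i < ehdr->e_phnum; i++) {
        if (phdr[i].p_type == PT_LOAD)
            return phdr[i].p_vaddr == mapped_at;
    }
    return 0; /* no loadable segment found */
}
```

With an image linked at zero (!COMPAT_VDSO) the comparison only holds if the image is also mapped at zero, which is why the affected loaders need either a fixed mapping or fixed-up headers.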
Re: [RFC, PATCH] Fixup COMPAT_VDSO to work with CONFIG_PARAVIRT
Jeremy Fitzhardinge <[EMAIL PROTECTED]> writes:

> Ingo Molnar wrote:
>> that's what is the case right now, but much of the intention behind the
>> vma based vDSO is to enable per-process randomized vdso locations, and
>> various distributions do that. So the 'modern' vDSO concept is very much
>> relocatable.
>
> No, the point is that it never needs relocating. The kernel can map it
> anywhere and userspace can cope. It's only the broken glibcs which
> require relocation.

Ok. So to summarize.

There are three ways of finding the VDSO.
- AT_SYSINFO
- AT_SYSINFO_EHDR
- known fixed address (see x86_64)

Currently it doesn't sound like you need to deal with the known fixed
address case but COMPAT_VDSO also provides that.

If userspace uses AT_SYSINFO the premise is that it is expecting
not to need to perform any relocation processing.

If userspace uses AT_SYSINFO_EHDR it expects it needs to perform
relocation processing and fixes up whatever needs fixing up.

With the module code we have shown the kernel is capable of performing
relocation processing at times and it works.

So is it possible to simply relocate the normal vdso and fixup
its program header so it shows that relocation is not necessary.
If you can do that and still export AT_SYSINFO so the problem user
space still runs you are good. (If you can relocate the vdso
you should be able to relocate it anywhere).

Otherwise it probably just makes sense to simply not export a VDSO
on those systems.

This would leave COMPAT_VDSO for the case where you must use one magic
fixed address, and if user space does not require that it means
COMPAT_VDSO could be completely removed.

Eric
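For readers unfamiliar with the first two lookup methods, here is a minimal userspace sketch. It assumes a Linux system with glibc 2.16 or later, which provides getauxval(); the helper names are hypothetical. AT_SYSINFO may be reported as 0 on architectures that do not supply it.

```c
#include <elf.h>       /* AT_SYSINFO, AT_SYSINFO_EHDR, ELFMAG, SELFMAG */
#include <string.h>
#include <sys/auxv.h>  /* getauxval(), glibc >= 2.16 */

/* Address of the vDSO's ELF header from the auxiliary vector, or 0. */
static unsigned long vdso_ehdr_from_auxv(void)
{
    return getauxval(AT_SYSINFO_EHDR);
}

/* Legacy syscall entry point from the auxiliary vector, or 0 if absent. */
static unsigned long vdso_entry_from_auxv(void)
{
    return getauxval(AT_SYSINFO);
}

/* Returns 1 if addr is non-zero and points at the ELF magic bytes. */
static int looks_like_elf(unsigned long addr)
{
    return addr != 0 && memcmp((const void *)addr, ELFMAG, SELFMAG) == 0;
}
```

A consumer of AT_SYSINFO_EHDR gets a full ELF image and can apply its own load offset; a consumer of AT_SYSINFO only gets an entry point, which is the case the thread is worried about.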
Re: [RFC, PATCH] Fixup COMPAT_VDSO to work with CONFIG_PARAVIRT
>>> Jeremy Fitzhardinge <[EMAIL PROTECTED]> 16.03.07 17:46 >>>
>Jan Beulich wrote:
>> I have one, too (which is one reason why I created the original Xen patch).
>
>It's some version of SuSE 9, right? What glibc version?

Yes. 2.3.2.

Jan
Re: [RFC, PATCH] Fixup COMPAT_VDSO to work with CONFIG_PARAVIRT
Jan Beulich wrote:
> I have one, too (which is one reason why I created the original Xen patch).

It's some version of SuSE 9, right? What glibc version?

>>> I'm playing safe. Binary identical relocation to 0xe000 was my goal.
>>
>> Yeah, fair enough. But as Eric likes to keep pointing out, an
>> executable ELF file need not have any sections at all, so the only safe
>> course for anything "real" is via the section headers.
>
> Program headers you mean.

Er, yep.

>> So I guess the right thing to do is relocate the dynamic stuff via
>> PT_DYNAMIC, and relocate the symtab if it's present.
>
> Symtab should also be deduced from program headers.

Well, the normal symtab might be completely missing. But yes, the
dynamic symtab should be in the PT_DYNAMIC.

    J
Re: [RFC, PATCH] Fixup COMPAT_VDSO to work with CONFIG_PARAVIRT
* Jeremy Fitzhardinge <[EMAIL PROTECTED]> wrote:

> > The practical question here is if we already have all of the
> > relocation logic for the VDSO why do we need to add more?
>
> The kernel doesn't normally ever relocate the vdso; [...]

that's what is the case right now, but much of the intention behind the
vma based vDSO is to enable per-process randomized vdso locations, and
various distributions do that. So the 'modern' vDSO concept is very much
relocatable.

	Ingo
Re: [RFC, PATCH] Fixup COMPAT_VDSO to work with CONFIG_PARAVIRT
Ingo Molnar wrote:
> that's what is the case right now, but much of the intention behind the
> vma based vDSO is to enable per-process randomized vdso locations, and
> various distributions do that. So the 'modern' vDSO concept is very much
> relocatable.

No, the point is that it never needs relocating. The kernel can map it
anywhere and userspace can cope. It's only the broken glibcs which
require relocation.

    J
Re: [RFC, PATCH] Fixup COMPAT_VDSO to work with CONFIG_PARAVIRT
Eric W. Biederman wrote:
> I'm not quite familiar with the context. And I'm too lazy to look right now.
> What is the difference with COMPAT_VDSO that it doesn't do relocation?
> What are we preserving?

The issue is that with COMPAT_VDSO, the vdso gets mapped at two places:
one random address, and one fixed address (traditionally 0xe000 I think,
but that's not mandatory). The important point is that the fixed address
is the same one that the vdso itself is linked for, so that old broken
glibcs that some vendors shipped won't explode (because they use
AT_SYSINFO but not AT_SYSINFO_EHDR, so they don't account for the
difference in link and map address).

The problem with the COMPAT_VDSO with paravirt is that the hypervisor
may steal some of the kernel address space, and so push down the address
where the fixed address vdso can be mapped. Zach's patch relocates the
immobile COMPAT_VDSO version of the vdso page so that map=link address,
regardless of where the kernel's runtime environment puts the top of the
kernel address space.

I guess the other solution is to simply put the compat_vdso mapping at
some low address (like the top of the user address space), and not worry
about it moving. I don't know if this would work, but I seem to remember
someone mentioning that it had been done in the past.

> The practical question here is if we already have all of the relocation
> logic for the VDSO why do we need to add more?

The kernel doesn't normally ever relocate the vdso; usermode can
generally cope with it wherever it gets mapped.

    J
Re: [RFC, PATCH] Fixup COMPAT_VDSO to work with CONFIG_PARAVIRT
Zachary Amsden <[EMAIL PROTECTED]> writes:

> Paravirt-ops guests which move the fixmap also end up moving the syscall
> VDSO. This fails if it is prelinked at a fixed address, which is why
> COMPAT_VDSO is broken under CONFIG_VMI (and also under CONFIG_XEN).
> Several options are available to try to address this. Jan had cooked up
> a patch for Xen that used build magic to find the parts of the VDSO that
> need relocation. I don't like the idea of having auto-generated
> relocations, as someday something could change between two linked
> objects (timestamp, elf notes perhaps) that is not a relocation. So I
> prefer human supervision over the relocation and explicitly fixing
> everything by hand. I'm not necessarily advocating one solution over the
> other; my way is more code to maintain if the VDSO linkage changes. I'm
> looking for feedback about which way is best.
>
> Also, it appears that COMPAT_VDSO could disappear entirely. Since this
> approach should work with older broken ld.so (2.3.2 is the version, I
> believe), we should be able to switch over completely to using the gate
> vma style of linking the vdso. One can even get the address
> randomization benefits by simply running fixup on the vdso if you are
> prepared to take the cost of allocating an extra page per process. Or
> you could randomize just once at boot, which makes the randomization
> per-machine, still sufficient to slow network based worm attacks which
> might rely on a fixed VDSO address.
>
> Clearly this patch needs more testing and feedback, which I'm sure it
> will get...
>
> Zach
>
> P.S. - Eric, I've copied you as you appear to be an ELF expert, or at
> least have a greater grasp of Elven Magic than me, and I'm hoping I got
> all the dynamic tags which need relocation right.
>
> Invoke black magic to relocate the VDSO even when COMPAT_VDSO is enabled
> by fixing up the ELF object.

I'm not quite familiar with the context. And I'm too lazy to look right now.
What is the difference with COMPAT_VDSO that it doesn't do relocation?
What are we preserving?

The practical question here is if we already have all of the relocation
logic for the VDSO why do we need to add more?

I'm tempted to rant on the pure insanity of address space randomization
but that is a whole other issue...

Eric
Re: [RFC, PATCH] Fixup COMPAT_VDSO to work with CONFIG_PARAVIRT
Jan Beulich wrote:
>> So I guess the right thing to do is relocate the dynamic stuff via
>> PT_DYNAMIC, and relocate the symtab if it's present.
>
> Symtab should also be deduced from program headers.

Learning more all the time..

> I'm actually surprised this got re-implemented from scratch, when my
> patch already had both variants (one just #ifdef-ed out), and was tested
> in both forms (actually, I first implemented the ELF form, and only
> after seeing the bloat it added to the sources I came up with the second
> variant, which in the end unfortunately didn't add significantly less
> bloat to the Makefile).

This wasn't re-implemented from scratch - I did this in another lifetime:

http://lists.xensource.com/archives/html/xen-devel/2005-08/msg00284.html

Either way of doing things is fine with me - I would just prefer that if
it has to get down and dirty, we do it in source rather than hidden in a
makefile. But just a personal preference.

Zach
Re: [RFC, PATCH] Fixup COMPAT_VDSO to work with CONFIG_PARAVIRT
>>> Jeremy Fitzhardinge <[EMAIL PROTECTED]> 16.03.07 06:10 >>>
>Zachary Amsden wrote:
>> Well testing that is not so fun. I installed SUSE Pro 9.0, and
>> strings on ld.so contains the magic at_sysinfo assert! But it doesn't
>> install TLS libraries, so I'll have to install them by hand.
>>
>> It works - in theory. Look, a puppy!
>>
>> Scratchbox is rumored to produce the fabled assertion even on modern
>> distros by installing its own toolchain which includes the dreaded glibc.
>
>I think Andi and Andrew have boxes which are afflicted.

I have one, too (which is one reason why I created the original Xen patch).

>> I'm playing safe. Binary identical relocation to 0xe000 was my goal.
>
>Yeah, fair enough. But as Eric likes to keep pointing out, an
>executable ELF file need not have any sections at all, so the only safe
>course for anything "real" is via the section headers.

Program headers you mean.

>So I guess the right thing to do is relocate the dynamic stuff via
>PT_DYNAMIC, and relocate the symtab if it's present.

Symtab should also be deduced from program headers.

I'm actually surprised this got re-implemented from scratch, when my
patch already had both variants (one just #ifdef-ed out), and was tested
in both forms (actually, I first implemented the ELF form, and only
after seeing the bloat it added to the sources I came up with the second
variant, which in the end unfortunately didn't add significantly less
bloat to the Makefile).

Jan
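The program-header approach discussed above might look roughly like the following. This is a sketch under stated assumptions, not the patch that was posted: the function names are hypothetical, the image is assumed to be a 32-bit ET_DYN object linked at zero and fully present in memory, and the list of address-carrying dynamic tags is an assumption about what a small vDSO contains rather than an exhaustive list. Symbol values (st_value) are deliberately left alone here.

```c
#include <elf.h>
#include <stddef.h>
#include <stdint.h>

/* Tags whose d_un member holds an address (assumed, non-exhaustive list). */
static int dyn_tag_is_address(Elf32_Sword tag)
{
    switch (tag) {
    case DT_HASH:
    case DT_STRTAB:
    case DT_SYMTAB:
    case DT_PLTGOT:
    case DT_VERSYM:
    case DT_VERDEF:
        return 1;
    default:
        return 0;
    }
}

/*
 * Shift an in-memory image linked at 0 so that its headers claim it is
 * linked at `base`. The dynamic section is found through PT_DYNAMIC, never
 * through section headers. Returns 0 on success, or -1 when no PT_DYNAMIC
 * segment exists (the ELF and program headers are still shifted).
 */
static int relocate_image_via_phdrs(void *image, Elf32_Addr base)
{
    Elf32_Ehdr *ehdr = image;
    Elf32_Phdr *phdr = (Elf32_Phdr *)((char *)image + ehdr->e_phoff);
    Elf32_Dyn *dyn = NULL;

    for (size_t i = 0; i < ehdr->e_phnum; i++) {
        /* Locate the dynamic array by file offset, not by address. */
        if (phdr[i].p_type == PT_DYNAMIC)
            dyn = (Elf32_Dyn *)((char *)image + phdr[i].p_offset);
        phdr[i].p_vaddr += base;
        phdr[i].p_paddr += base;
    }
    ehdr->e_entry += base;

    if (dyn == NULL)
        return -1;

    /* Visit every entry, including the first, up to the DT_NULL terminator. */
    for (; dyn->d_tag != DT_NULL; dyn++) {
        if (dyn_tag_is_address(dyn->d_tag))
            dyn->d_un.d_ptr += base;
    }
    return 0;
}
```

Because nothing here reads e_shoff or section names, the sketch keeps working on an image whose section headers have been stripped, which is the point Eric and Jan raise.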
Re: [RFC, PATCH] Fixup COMPAT_VDSO to work with CONFIG_PARAVIRT
Jeremy Fitzhardinge wrote:
>>>> +	} else if (strcmp(secstrings+sechdrs[i].sh_name, ".dynamic") == 0) {
>>>> +		Elf32_Dyn *dyn = (void *)hdr + sechdrs[i].sh_offset;
>>>> +		int tag;
>>>> +		while ((tag = (++dyn)->d_tag) != DT_NULL)
>>>
>>> Um, no.
>>
>> Walk based on size instead?
>
> No, I was just complaining about the embedded assignment, before dinner,
> so I was overly terse.

My last embedded assignment was a robot microcontroller, and I dropped
out of that class. So I _need_ embedded assignments.

Zach
Re: [RFC, PATCH] Fixup COMPAT_VDSO to work with CONFIG_PARAVIRT
Zachary Amsden wrote:
> Well testing that is not so fun. I installed SUSE Pro 9.0, and
> strings on ld.so contains the magic at_sysinfo assert! But it doesn't
> install TLS libraries, so I'll have to install them by hand.
>
> It works - in theory. Look, a puppy!
>
> Scratchbox is rumored to produce the fabled assertion even on modern
> distros by installing its own toolchain which includes the dreaded glibc.

I think Andi and Andrew have boxes which are afflicted.

> I'm playing safe. Binary identical relocation to 0xe000 was my goal.

Yeah, fair enough. But as Eric likes to keep pointing out, an
executable ELF file need not have any sections at all, so the only safe
course for anything "real" is via the section headers.

So I guess the right thing to do is relocate the dynamic stuff via
PT_DYNAMIC, and relocate the symtab if it's present.

>>> +	} else if (strcmp(secstrings+sechdrs[i].sh_name, ".dynamic") == 0) {
>>> +		Elf32_Dyn *dyn = (void *)hdr + sechdrs[i].sh_offset;
>>> +		int tag;
>>> +		while ((tag = (++dyn)->d_tag) != DT_NULL)
>>
>> Um, no.
>
> Walk based on size instead?

No, I was just complaining about the embedded assignment, before dinner,
so I was overly terse.

    J
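One way to write the dynamic-array walk without the embedded assignment is sketched below; the helper name `find_dyn_entry` is hypothetical. Note that, as quoted, the pre-increment in `(++dyn)->d_tag` advances the pointer before the first read, so the entry at index 0 would never be examined; a plain `for` loop avoids that as well.

```c
#include <elf.h>
#include <stddef.h>

/*
 * Hypothetical helper: return the first entry carrying `tag`, or NULL.
 * The array must end with a DT_NULL entry. The loop tests the current
 * entry before advancing, so index 0 is included.
 */
static Elf32_Dyn *find_dyn_entry(Elf32_Dyn *dyn, Elf32_Sword tag)
{
    for (; dyn->d_tag != DT_NULL; dyn++) {
        if (dyn->d_tag == tag)
            return dyn;
    }
    return NULL;
}
```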
Re: [RFC, PATCH] Fixup COMPAT_VDSO to work with CONFIG_PARAVIRT
Jeremy Fitzhardinge wrote:
> Zachary Amsden wrote:
>> Invoke black magic to relocate the VDSO even when COMPAT_VDSO is enabled
>> by fixing up the ELF object.
>
> So does it actually work? Can you boot the broken distros with this in
> place?

Well testing that is not so fun. I installed SUSE Pro 9.0, and strings
on ld.so contains the magic at_sysinfo assert! But it doesn't install
TLS libraries, so I'll have to install them by hand.

It works - in theory. Look, a puppy!

Scratchbox is rumored to produce the fabled assertion even on modern
distros by installing its own toolchain which includes the dreaded glibc.

> Using sections is wrong; you should be going through the phdrs, and
> looking for PT_DYNAMIC for relocation.

Will do.

> Does anyone expect the symbolic info to be correct? It might be better
> to just stomp it so nobody gets any ideas. On the other hand, we don't
> want to break compatibility with anything...

I'm playing safe. Binary identical relocation to 0xe000 was my goal.

>> +	} else if (strcmp(secstrings+sechdrs[i].sh_name, ".dynamic") == 0) {
>> +		Elf32_Dyn *dyn = (void *)hdr + sechdrs[i].sh_offset;
>> +		int tag;
>> +		while ((tag = (++dyn)->d_tag) != DT_NULL)
>
> Um, no.

Walk based on size instead?

>> +	} else if (strcmp(secstrings+sechdrs[i].sh_name, ".useless") == 0) {
>> +		/* This is demonic; see vsyscall.lds.S; it puts the
>> +		 * .got in a section named .useless */
>> +		uint32_t *got = (void *)hdr + sechdrs[i].sh_offset;
>> +		*got += VDSO_HIGH_BASE;
>> +	}
>
> This won't get relocated with one of the other relocations? It's in the
> text phdr.

Hmm, I can try that. Thanks for the suggestions / fixes.

Zach
Re: [RFC, PATCH] Fixup COMPAT_VDSO to work with CONFIG_PARAVIRT
Zachary Amsden wrote:
> Invoke black magic to relocate the VDSO even when COMPAT_VDSO is enabled
> by fixing up the ELF object.

So does it actually work? Can you boot the broken distros with this in
place?

> Signed-off-by: Zachary Amsden <[EMAIL PROTECTED]>
>
> Index: linux-2.6.21/arch/i386/kernel/entry.S
> ===
> --- linux-2.6.21.orig/arch/i386/kernel/entry.S	2007-03-06 18:51:33.0 -0800
> +++ linux-2.6.21/arch/i386/kernel/entry.S	2007-03-15 18:14:11.0 -0800
> @@ -305,16 +305,12 @@ sysenter_past_esp:
>  	pushl $(__USER_CS)
>  	CFI_ADJUST_CFA_OFFSET 4
>  	/*CFI_REL_OFFSET cs, 0*/
> -#ifndef CONFIG_COMPAT_VDSO
>  	/*
>  	 * Push current_thread_info()->sysenter_return to the stack.
>  	 * A tiny bit of offset fixup is necessary - 4*4 means the 4 words
>  	 * pushed above; +8 corresponds to copy_thread's esp0 setting.
>  	 */
>  	pushl (TI_sysenter_return-THREAD_SIZE+8+4*4)(%esp)
> -#else
> -	pushl $SYSENTER_RETURN
> -#endif
>  	CFI_ADJUST_CFA_OFFSET 4
>  	CFI_REL_OFFSET eip, 0
>
> Index: linux-2.6.21/arch/i386/kernel/sysenter.c
> ===
> --- linux-2.6.21.orig/arch/i386/kernel/sysenter.c	2007-03-06 18:51:34.0 -0800
> +++ linux-2.6.21/arch/i386/kernel/sysenter.c	2007-03-15 18:27:43.0 -0800
> @@ -72,6 +72,99 @@ extern const char vsyscall_int80_start,
>  extern const char vsyscall_sysenter_start, vsyscall_sysenter_end;
>  static struct page *syscall_pages[1];
>
> +#ifdef CONFIG_COMPAT_VDSO
> +static void fixup_vsyscall_elf(char *page)
> +{
> +	Elf32_Ehdr *hdr;
> +	Elf32_Shdr *sechdrs;
> +	Elf32_Phdr *phdr;
> +	char *secstrings;
> +	int i, j, n;
> +
> +	hdr = (Elf32_Ehdr *)page;
> +
> +	printk("Remapping vsyscall page to %08x\n", (unsigned int)VDSO_HIGH_BASE);
> +
> +	/* Sanity checks against insmoding binaries or wrong arch,
> +	   weird elf version */
> +	if (memcmp(hdr->e_ident, ELFMAG, 4) != 0 ||
> +	    !elf_check_arch(hdr) ||
> +	    hdr->e_type != ET_DYN)
> +		panic("Bogus ELF in vsyscall DSO\n");
> +
> +	hdr->e_entry += VDSO_HIGH_BASE;
> +	sechdrs = (void *)hdr + hdr->e_shoff;
> +	secstrings = (void *)hdr + sechdrs[hdr->e_shstrndx].sh_offset;
> +
> +	for (i = 1; i < hdr->e_shnum; i++) {

Using sections is wrong; you should be going through the phdrs, and
looking for PT_DYNAMIC for relocation.

> +		if (!(sechdrs[i].sh_flags & SHF_ALLOC))
> +			continue;
> +
> +		sechdrs[i].sh_addr += VDSO_HIGH_BASE;
> +		if (strcmp(secstrings+sechdrs[i].sh_name, ".dynsym") == 0) {
> +			Elf32_Sym *sym = (void *)hdr + sechdrs[i].sh_offset;
> +			n = sechdrs[i].sh_size / sizeof(*sym);
> +			for (j = 1; j < n; j++) {
> +				int ndx = sym[j].st_shndx;
> +				if (ndx == SHN_UNDEF || ndx == SHN_ABS)
> +					continue;
> +				sym[j].st_value += VDSO_HIGH_BASE;
> +			}

Does anyone expect the symbolic info to be correct? It might be better
to just stomp it so nobody gets any ideas. On the other hand, we don't
want to break compatibility with anything...

> +		} else if (strcmp(secstrings+sechdrs[i].sh_name, ".dynamic") == 0) {
> +			Elf32_Dyn *dyn = (void *)hdr + sechdrs[i].sh_offset;
> +			int tag;
> +			while ((tag = (++dyn)->d_tag) != DT_NULL)

Um, no.

> +		} else if (strcmp(secstrings+sechdrs[i].sh_name, ".useless") == 0) {
> +			/* This is demonic; see vsyscall.lds.S; it puts the
> +			 * .got in a section named .useless */
> +			uint32_t *got = (void *)hdr + sechdrs[i].sh_offset;
> +			*got += VDSO_HIGH_BASE;
> +		}

This won't get relocated with one of the other relocations? It's in the
text phdr.

> +	}
> +	phdr = (void *)hdr + hdr->e_phoff;
> +	for (i = 0; i < hdr->e_phnum; i++) {
> +		phdr[i].p_vaddr += VDSO_HIGH_BASE;
> +		phdr[i].p_paddr += VDSO_HIGH_BASE;
> +	}
> +
> +#if 0
> +/*
> + * To verify the binary image in memory is identical, linked in the VDSO page
> + * from a COMPAT_VDSO compile without this patch; then diff the two. For a
> + * non-relocated fixmap, the VDSO image is identical.
> + */
> +{
> +	extern const char vsyscall_orig_start, vsyscall_orig_end;
> +	int *l1 = (int *)page, *l2 = (int *)&vsyscall_orig_start;
> +	int foo = vsyscall_orig_end - vsyscall_orig_start / 4;
> +	for (i = 0; i < foo; i++) {
> +		if (l1[i] != l2[i]) {
> +			printk("vsyscall - delta [%03x] orig %08x
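The word-by-word comparison in the disabled verification block can be sketched as a standalone helper; the name `count_word_deltas` is hypothetical. One thing worth noting about the quoted line `vsyscall_orig_end - vsyscall_orig_start / 4`: as written, the division applies only to the second operand, and the operands are the `char` objects rather than their addresses, so the word count would presumably need `(&vsyscall_orig_end - &vsyscall_orig_start) / 4`. The sketch below sidesteps this by taking the length in bytes explicitly.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: count the 32-bit words that differ between two
 * images. nbytes is the image length in bytes; a trailing partial word,
 * if any, is ignored.
 */
static size_t count_word_deltas(const uint32_t *a, const uint32_t *b,
                                size_t nbytes)
{
    size_t nwords = nbytes / sizeof(uint32_t);
    size_t deltas = 0;

    for (size_t i = 0; i < nwords; i++) {
        if (a[i] != b[i])
            deltas++;
    }
    return deltas;
}
```

A result of zero for a fixed-up image compared against a stock COMPAT_VDSO build would correspond to the "binary identical" goal stated in the thread.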
[RFC, PATCH] Fixup COMPAT_VDSO to work with CONFIG_PARAVIRT
Paravirt-ops guests which move the fixmap also end up moving the syscall
VDSO. This fails if it is prelinked at a fixed address, which is why
COMPAT_VDSO is broken under CONFIG_VMI (and also under CONFIG_XEN).
Several options are available to try to address this. Jan had cooked up
a patch for Xen that used build magic to find the parts of the VDSO that
need relocation. I don't like the idea of having auto-generated
relocations, as someday something could change between two linked
objects (timestamp, elf notes perhaps) that is not a relocation. So I
prefer human supervision over the relocation and explicitly fixing
everything by hand. I'm not necessarily advocating one solution over the
other; my way is more code to maintain if the VDSO linkage changes. I'm
looking for feedback about which way is best.

Also, it appears that COMPAT_VDSO could disappear entirely. Since this
approach should work with older broken ld.so (2.3.2 is the version, I
believe), we should be able to switch over completely to using the gate
vma style of linking the vdso. One can even get the address
randomization benefits by simply running fixup on the vdso if you are
prepared to take the cost of allocating an extra page per process. Or
you could randomize just once at boot, which makes the randomization
per-machine, still sufficient to slow network based worm attacks which
might rely on a fixed VDSO address.

Clearly this patch needs more testing and feedback, which I'm sure it
will get...

Zach

P.S. - Eric, I've copied you as you appear to be an ELF expert, or at
least have a greater grasp of Elven Magic than me, and I'm hoping I got
all the dynamic tags which need relocation right.

Invoke black magic to relocate the VDSO even when COMPAT_VDSO is enabled
by fixing up the ELF object.

Signed-off-by: Zachary Amsden <[EMAIL PROTECTED]>

Index: linux-2.6.21/arch/i386/kernel/entry.S
===
--- linux-2.6.21.orig/arch/i386/kernel/entry.S	2007-03-06 18:51:33.0 -0800
+++ linux-2.6.21/arch/i386/kernel/entry.S	2007-03-15 18:14:11.0 -0800
@@ -305,16 +305,12 @@ sysenter_past_esp:
 	pushl $(__USER_CS)
 	CFI_ADJUST_CFA_OFFSET 4
 	/*CFI_REL_OFFSET cs, 0*/
-#ifndef CONFIG_COMPAT_VDSO
 	/*
 	 * Push current_thread_info()->sysenter_return to the stack.
 	 * A tiny bit of offset fixup is necessary - 4*4 means the 4 words
 	 * pushed above; +8 corresponds to copy_thread's esp0 setting.
 	 */
 	pushl (TI_sysenter_return-THREAD_SIZE+8+4*4)(%esp)
-#else
-	pushl $SYSENTER_RETURN
-#endif
 	CFI_ADJUST_CFA_OFFSET 4
 	CFI_REL_OFFSET eip, 0
 
Index: linux-2.6.21/arch/i386/kernel/sysenter.c
===
--- linux-2.6.21.orig/arch/i386/kernel/sysenter.c	2007-03-06 18:51:34.0 -0800
+++ linux-2.6.21/arch/i386/kernel/sysenter.c	2007-03-15 18:27:43.0 -0800
@@ -72,6 +72,99 @@ extern const char vsyscall_int80_start,
 extern const char vsyscall_sysenter_start, vsyscall_sysenter_end;
 static struct page *syscall_pages[1];
 
+#ifdef CONFIG_COMPAT_VDSO
+static void fixup_vsyscall_elf(char *page)
+{
+	Elf32_Ehdr *hdr;
+	Elf32_Shdr *sechdrs;
+	Elf32_Phdr *phdr;
+	char *secstrings;
+	int i, j, n;
+
+	hdr = (Elf32_Ehdr *)page;
+
+	printk("Remapping vsyscall page to %08x\n", (unsigned int)VDSO_HIGH_BASE);
+
+	/* Sanity checks against insmoding binaries or wrong arch,
+	   weird elf version */
+	if (memcmp(hdr->e_ident, ELFMAG, 4) != 0 ||
+	    !elf_check_arch(hdr) ||
+	    hdr->e_type != ET_DYN)
+		panic("Bogus ELF in vsyscall DSO\n");
+
+	hdr->e_entry += VDSO_HIGH_BASE;
+	sechdrs = (void *)hdr + hdr->e_shoff;
+	secstrings = (void *)hdr + sechdrs[hdr->e_shstrndx].sh_offset;
+
+	for (i = 1; i < hdr->e_shnum; i++) {
+		if (!(sechdrs[i].sh_flags & SHF_ALLOC))
+			continue;
+
+		sechdrs[i].sh_addr += VDSO_HIGH_BASE;
+		if (strcmp(secstrings+sechdrs[i].sh_name, ".dynsym") == 0) {
+			Elf32_Sym *sym = (void *)hdr + sechdrs[i].sh_offset;
+			n = sechdrs[i].sh_size / sizeof(*sym);
+			for (j = 1; j < n; j++) {
+				int ndx = sym[j].st_shndx;
+				if (ndx == SHN_UNDEF || ndx == SHN_ABS)
+					continue;
+				sym[j].st_value += VDSO_HIGH_BASE;
+			}
+		} else if (strcmp(secstrings+sechdrs[i].sh_name, ".dynamic") == 0) {
+			Elf32_Dyn *dyn = (void *)hdr + sechdrs[i].sh_offset;
+			int tag;
+			while ((tag = (++dyn)->d_tag) != DT_NULL)
+				switch(tag)