Re: [RFC] kexec_file: Add support for purgatory built as PIE

2016-11-04 Thread Thiago Jung Bauermann
Hello Eric,

Am Freitag, 4. November 2016, 10:13:39 BRST schrieb Eric W. Biederman:
> Baoquan He  writes:
> > On 11/02/16 at 04:00am, Thiago Jung Bauermann wrote:
> >> Hello,
> >> 
> >> The kexec_file code currently builds the purgatory as a partially linked
> >> object (using ld -r). Is there a particular reason to use that instead
> >> of a position independent executable (PIE)?
> > 
> > It's taken as "-r", relocatable in user space kexec-tools too originally.
> > I think Vivek just keeps it the same when moving into kernel.
> 
> At least on x86 using just -r removed the need for a GOT and all of the
> other nasty dynamic relocatable bits, that are not needed when the you
> don't want to share your text bits with the page cache.
> 
> I can see reaons for refactoring code but I expect PIE expecutables need
> a GOT and all of that pain in the neck stuff that can just be avoided by
> building the code to run at an absolute address.

At least on powerpc, building the purgatory as PIE resulted in only the 
following differences:

1. A lot less relocation types to deal with.
2. __kexec_load_purgatory needs to use the program headers rather than the 
   section headers to figure out how to load the binary.
3. Symbol values are absolute addresses instead of relative to the start of 
   the section.

2. is an advantage too because it's actually easier to use the program headers 
because unlike section headers, the purpose of program headers is to provide 
the information needed by a program loader. You can see this by comparing the 
two implementations of __kexec_load_purgatory in the WIP patch I posted. The 
one using program headers is simpler.

3. isn't a problem, it's easy to convert the absolute addresses back into 
relative ones, as can be seen in my patch.

> So far I have not seen ELF relocations that are difficult to process.

The problem is not that it's difficult to process, but that on powerpc it 
takes a lot of code to implement that processing. In v9 of the kexec_file_load 
implementation for powerpc, the switch statement implementing all the 
relocation types (shared by powerpc's module_64.c and machine_kexec_file_64.c) 
has 200 lines. The switch statement implementing only the relocation types 
used by the PIE purgatory has 26 lines.

This is not a problem in x86, though: the purgatory built as a relocatable 
object has only two relocation types.

-- 
Thiago Jung Bauermann
IBM Linux Technology Center



Re: [RFC] kexec_file: Add support for purgatory built as PIE

2016-11-04 Thread Thiago Jung Bauermann
Hello Baoquan,

Am Freitag, 4. November 2016, 15:38:40 BRST schrieb Baoquan He:
> On 11/02/16 at 04:00am, Thiago Jung Bauermann wrote:
> > Hello,
> > 
> > The kexec_file code currently builds the purgatory as a partially linked
> > object (using ld -r). Is there a particular reason to use that instead of
> > a position independent executable (PIE)?
> 
> It's taken as "-r", relocatable in user space kexec-tools too originally.
> I think Vivek just keeps it the same when moving into kernel.

Ok. If that's the only reason then PIE is better suited at least for powerpc.

> > I found a discussion from 2013 in the archives but from what I understood
> > it was about the purgatory as a separate object vs having it linked into
> > the kernel, which is different from what I'm asking:
> > 
> > http://lists.infradead.org/pipermail/kexec/2013-December/010535.html
> > 
> > Here is my motivation for this question:
> >  On ppc64 purgatory.ro has 12 relocation types when built as a partially
> > 
> > linked object. This makes arch_kexec_apply_relocations_add duplicate a lot
> > of code with module_64.c:apply_relocate_add to implement these
> > relocations. The alternative is to do some refactoring so that both
> > functions can share the implementation of the relocations. This is done
> > in patches 5 and 6 of the
> > kexec_file_load implementation for powerpc:
> In user space kexec-tools utility, you also got this problem?

Yes, kexec-tools' purgatory.ro has 10 relocation types instead of 12 (I don't 
know why), but that's still a lot.

> > @@ -942,7 +1085,13 @@ static Elf_Sym *kexec_purgatory_find_symbol(struct
> > purgatory_info *pi,> 
> > /* Go through symbols for a match */
> > for (k = 0; k < sechdrs[i].sh_size/sizeof(Elf_Sym); k++) {
> > 
> > -   if (ELF_ST_BIND(syms[k].st_info) != STB_GLOBAL)
> > +   /*
> > +* FIXME: See if we can or should export the .TOC.
> > +* symbol as global instead of searching local symbols
> > +* here.
> > +*/
> > +   if (ELF_ST_BIND(syms[k].st_info) != STB_GLOBAL &&
> > +   ELF_ST_BIND(syms[k].st_info) != STB_LOCAL)
> > 
> > continue;
> > 
> > if (strcmp(strtab + syms[k].st_name, name) != 0)
> > 

I don't need the change above anymore. I found a way to obtain the TOC pointer 
without looking for the .TOC. symbol.

-- 
Thiago Jung Bauermann
IBM Linux Technology Center



Re: [RFC] kexec_file: Add support for purgatory built as PIE

2016-11-04 Thread Eric W. Biederman
Baoquan He  writes:

> On 11/02/16 at 04:00am, Thiago Jung Bauermann wrote:
>> Hello,
>> 
>> The kexec_file code currently builds the purgatory as a partially linked 
>> object 
>> (using ld -r). Is there a particular reason to use that instead of a 
>> position 
>> independent executable (PIE)?
>
> It's taken as "-r", relocatable in user space kexec-tools too originally.
> I think Vivek just keeps it the same when moving into kernel.

At least on x86 using just -r removed the need for a GOT and all of the
other nasty dynamic relocatable bits, that are not needed when the you
don't want to share your text bits with the page cache.

I can see reaons for refactoring code but I expect PIE expecutables need
a GOT and all of that pain in the neck stuff that can just be avoided by
building the code to run at an absolute address.

So far I have not seen ELF relocations that are difficult to process.

Eric


Re: [RFC] kexec_file: Add support for purgatory built as PIE

2016-11-04 Thread Baoquan He
On 11/02/16 at 04:00am, Thiago Jung Bauermann wrote:
> Hello,
> 
> The kexec_file code currently builds the purgatory as a partially linked 
> object 
> (using ld -r). Is there a particular reason to use that instead of a position 
> independent executable (PIE)?

It's taken as "-r", relocatable in user space kexec-tools too originally.
I think Vivek just keeps it the same when moving into kernel.

> 
> I found a discussion from 2013 in the archives but from what I understood it 
> was about the purgatory as a separate object vs having it linked into the 
> kernel, which is different from what I'm asking:
> 
> http://lists.infradead.org/pipermail/kexec/2013-December/010535.html
> 
> Here is my motivation for this question:
> 
>  On ppc64 purgatory.ro has 12 relocation types when built as a partially 
> linked object. This makes arch_kexec_apply_relocations_add duplicate a lot of 
> code with module_64.c:apply_relocate_add to implement these relocations. The 
> alternative is to do some refactoring so that both functions can share the 
> implementation of the relocations. This is done in patches 5 and 6 of the 
> kexec_file_load implementation for powerpc:

In user space kexec-tools utility, you also got this problem?

> 
> https://lists.ozlabs.org/pipermail/linuxppc-dev/2016-October/149984.html
> 
> Michael Ellerman would prefer if module_64.c didn't need to be changed, and 
> suggested that the purgatory could be a position independent executable. 
> Indeed, in that case there are only 4 relocation types in purgatory.ro (which 
> aren't even implemented in module_64.c:apply_relocate_add), so the relocation 
> code for the purgatory can leave that file alone and have its own relocation 
> implementation.
> 
> Also, the purgatory is an executable and not an intermediary output from the 
> compiler, so in my mind it makes sense conceptually that it is easier to 
> build 
> it as a PIE than as a partially linked object.
> 
> The patch below adds the support needed in kexec_file.c to allow powerpc-
> specific code to load and relocate a purgatory binary built as PIE. This is 
> WIP 
> and can probably be refined a bit. Would you accept a change along these 
> lines?
> 
> Signed-off-by: Thiago Jung Bauermann 
> ---
>  arch/Kconfig|   3 +
>  kernel/kexec_file.c | 159 
> ++--
>  kernel/kexec_internal.h |  26 
>  3 files changed, 183 insertions(+), 5 deletions(-)
> 
> diff --git a/arch/Kconfig b/arch/Kconfig
> index 659bdd079277..7fd6879be222 100644
> --- a/arch/Kconfig
> +++ b/arch/Kconfig
> @@ -5,6 +5,9 @@
>  config KEXEC_CORE
>   bool
>  
> +config HAVE_KEXEC_FILE_PIE_PURGATORY
> + bool
> +
>  config OPROFILE
>   tristate "OProfile system profiling"
>   depends on PROFILING
> diff --git a/kernel/kexec_file.c b/kernel/kexec_file.c
> index 0c2df7f73792..dfc3e015160d 100644
> --- a/kernel/kexec_file.c
> +++ b/kernel/kexec_file.c
> @@ -633,7 +633,149 @@ static int kexec_calculate_store_digests(struct kimage 
> *image)
>   return ret;
>  }
>  
> -/* Actually load purgatory. Lot of code taken from kexec-tools */
> +#ifdef CONFIG_HAVE_KEXEC_FILE_PIE_PURGATORY
> +/* Load PIE purgatory using the program header information. */
> +static int __kexec_load_purgatory(struct kimage *image, unsigned long min,
> +   unsigned long max, int top_down)
> +{
> + struct purgatory_info *pi = >purgatory_info;
> + unsigned long first_offset;
> + unsigned long orig_load_addr = 0;
> + const void *src;
> + int i, ret;
> + const Elf_Phdr *phdrs = (const void *) pi->ehdr + pi->ehdr->e_phoff;
> + const Elf_Phdr *phdr;
> + const Elf_Shdr *sechdrs_c;
> + Elf_Shdr *sechdr;
> + Elf_Shdr *sechdrs = NULL;
> + struct kexec_buf kbuf = { .image = image, .bufsz = 0, .buf_align = 1,
> +   .buf_min = min, .buf_max = max,
> +   .top_down = top_down };
> +
> + /*
> +  * sechdrs_c points to section headers in purgatory and are read
> +  * only. No modifications allowed.
> +  */
> + sechdrs_c = (void *) pi->ehdr + pi->ehdr->e_shoff;
> +
> + /*
> +  * We can not modify sechdrs_c[] and its fields. It is read only.
> +  * Copy it over to a local copy where one can store some temporary
> +  * data and free it at the end. We need to modify ->sh_addr and
> +  * ->sh_offset fields to keep track of permanent and temporary
> +  * locations of sections.
> +  */
> + sechdrs = vzalloc(pi->ehdr->e_shnum * sizeof(Elf_Shdr));
> + if (!sechdrs)
> + return -ENOMEM;
> +
> + memcpy(sechdrs, sechdrs_c, pi->ehdr->e_shnum * sizeof(Elf_Shdr));
> +
> + /*
> +  * We seem to have multiple copies of sections. First copy is which
> +  * is embedded in kernel in read only section. Some of these sections
> +  * will be copied to a temporary buffer and 

[RFC] kexec_file: Add support for purgatory built as PIE

2016-11-02 Thread Thiago Jung Bauermann
Hello,

The kexec_file code currently builds the purgatory as a partially linked object 
(using ld -r). Is there a particular reason to use that instead of a position 
independent executable (PIE)?

I found a discussion from 2013 in the archives but from what I understood it 
was about the purgatory as a separate object vs having it linked into the 
kernel, which is different from what I'm asking:

http://lists.infradead.org/pipermail/kexec/2013-December/010535.html

Here is my motivation for this question:

 On ppc64 purgatory.ro has 12 relocation types when built as a partially 
linked object. This makes arch_kexec_apply_relocations_add duplicate a lot of 
code with module_64.c:apply_relocate_add to implement these relocations. The 
alternative is to do some refactoring so that both functions can share the 
implementation of the relocations. This is done in patches 5 and 6 of the 
kexec_file_load implementation for powerpc:

https://lists.ozlabs.org/pipermail/linuxppc-dev/2016-October/149984.html

Michael Ellerman would prefer if module_64.c didn't need to be changed, and 
suggested that the purgatory could be a position independent executable. 
Indeed, in that case there are only 4 relocation types in purgatory.ro (which 
aren't even implemented in module_64.c:apply_relocate_add), so the relocation 
code for the purgatory can leave that file alone and have its own relocation 
implementation.

Also, the purgatory is an executable and not an intermediary output from the 
compiler, so in my mind it makes sense conceptually that it is easier to build 
it as a PIE than as a partially linked object.

The patch below adds the support needed in kexec_file.c to allow powerpc-
specific code to load and relocate a purgatory binary built as PIE. This is WIP 
and can probably be refined a bit. Would you accept a change along these lines?

Signed-off-by: Thiago Jung Bauermann 
---
 arch/Kconfig|   3 +
 kernel/kexec_file.c | 159 ++--
 kernel/kexec_internal.h |  26 
 3 files changed, 183 insertions(+), 5 deletions(-)

diff --git a/arch/Kconfig b/arch/Kconfig
index 659bdd079277..7fd6879be222 100644
--- a/arch/Kconfig
+++ b/arch/Kconfig
@@ -5,6 +5,9 @@
 config KEXEC_CORE
bool
 
+config HAVE_KEXEC_FILE_PIE_PURGATORY
+   bool
+
 config OPROFILE
tristate "OProfile system profiling"
depends on PROFILING
diff --git a/kernel/kexec_file.c b/kernel/kexec_file.c
index 0c2df7f73792..dfc3e015160d 100644
--- a/kernel/kexec_file.c
+++ b/kernel/kexec_file.c
@@ -633,7 +633,149 @@ static int kexec_calculate_store_digests(struct kimage 
*image)
return ret;
 }
 
-/* Actually load purgatory. Lot of code taken from kexec-tools */
+#ifdef CONFIG_HAVE_KEXEC_FILE_PIE_PURGATORY
+/* Load PIE purgatory using the program header information. */
+static int __kexec_load_purgatory(struct kimage *image, unsigned long min,
+ unsigned long max, int top_down)
+{
+   struct purgatory_info *pi = >purgatory_info;
+   unsigned long first_offset;
+   unsigned long orig_load_addr = 0;
+   const void *src;
+   int i, ret;
+   const Elf_Phdr *phdrs = (const void *) pi->ehdr + pi->ehdr->e_phoff;
+   const Elf_Phdr *phdr;
+   const Elf_Shdr *sechdrs_c;
+   Elf_Shdr *sechdr;
+   Elf_Shdr *sechdrs = NULL;
+   struct kexec_buf kbuf = { .image = image, .bufsz = 0, .buf_align = 1,
+ .buf_min = min, .buf_max = max,
+ .top_down = top_down };
+
+   /*
+* sechdrs_c points to section headers in purgatory and are read
+* only. No modifications allowed.
+*/
+   sechdrs_c = (void *) pi->ehdr + pi->ehdr->e_shoff;
+
+   /*
+* We can not modify sechdrs_c[] and its fields. It is read only.
+* Copy it over to a local copy where one can store some temporary
+* data and free it at the end. We need to modify ->sh_addr and
+* ->sh_offset fields to keep track of permanent and temporary
+* locations of sections.
+*/
+   sechdrs = vzalloc(pi->ehdr->e_shnum * sizeof(Elf_Shdr));
+   if (!sechdrs)
+   return -ENOMEM;
+
+   memcpy(sechdrs, sechdrs_c, pi->ehdr->e_shnum * sizeof(Elf_Shdr));
+
+   /*
+* We seem to have multiple copies of sections. First copy is which
+* is embedded in kernel in read only section. Some of these sections
+* will be copied to a temporary buffer and relocated. And these
+* sections will finally be copied to their final destination at
+* segment load time.
+*
+* Use ->sh_offset to reflect section address in memory. It will
+* point to original read only copy if section is not allocatable.
+* Otherwise it will point to temporary copy which will be relocated.
+*
+* Use ->sh_addr to contain final address