Re: powerpc allyesconfig / allmodconfig linux-next next-20160729 - next-20160729 build failures

2016-08-11 Thread Nicholas Piggin
On Thu, 11 Aug 2016 15:04:00 +0200
Arnd Bergmann  wrote:

> On Thursday, August 11, 2016 10:43:20 PM CEST Nicholas Piggin wrote:
> > On Wed, 03 Aug 2016 22:13:28 +0200

> > Final ld time
> > inclink
> > real  0m0.378s
> > user  0m0.304s
> > sys   0m0.076s
> > 
> > thinarc
> > real  0m0.894s
> > user  0m0.684s
> > sys   0m0.200s
> 
> This also still seems fine.
> 
> > For both cases final link gets slower with thin archives. I guess there is
> > some per-file overhead but I thought with --whole-archive it should not be
> > that much slower. Still, overall time for main ar/ld phases comes out about
> > the same in the end so I don't think it's too much problem. Unless ARM blows
> > up significantly worse with a bigger config.
> 
> Unfortunately I think it does. I haven't tried your latest series yet,
> but I think the total time for removing built-in.o and relinking went
> up from around 4 minutes (already way too much) to 18 minutes for me.
> 
> > Linking with thin archives takes significantly more time in bfd hash lookup
> > code. I haven't dug much further yet.
> 
> Can you try the ARM allyesconfig with thin archives? I'll follow up with two
> patches: one to get ARM to link without thin archives, and one that I used
> to get --gc-sections to work.

Okay, send them over and I'll try digging into it. There is not much kbuild
code to maintain, so we don't have to switch every arch. It would be nice
to, though.

Thanks,
Nick


Re: powerpc allyesconfig / allmodconfig linux-next next-20160729 - next-20160729 build failures

2016-08-11 Thread Arnd Bergmann
On Thursday, August 11, 2016 10:43:20 PM CEST Nicholas Piggin wrote:
> On Wed, 03 Aug 2016 22:13:28 +0200
> Arnd Bergmann  wrote:
> 
> > On Wednesday, August 3, 2016 2:44:29 PM CEST Segher Boessenkool wrote:
> > > Hi Arnd,
> > > 
> > > On Wed, Aug 03, 2016 at 08:52:48PM +0200, Arnd Bergmann wrote:  
> > > > From my first look, it seems that all of lib/*.o is now getting linked
> > > > into vmlinux, while we traditionally leave out everything from lib/
> > > > that is not referenced.
> > > > 
> > > > I also see a noticeable overhead in link time, the numbers are for
> > > > a cache-hot rebuild after a successful allyesconfig build, using a
> > > > 24-way Opteron@2.5GHz, just relinking vmlinux:
> > > > 
> > > > $ time make skj30 vmlinux # before
> > > > real  2m8.092s
> > > > user  3m41.008s
> > > > sys   0m48.172s
> > > > 
> > > > $ time make skj30 vmlinux # after
> > > > real  4m10.189s
> > > > user  5m43.804s
> > > > sys   0m52.988s
> > > 
> > > Is it better when using rcT instead of rcsT?  
> > 
> > It seems to be noticeably better for the clean rebuild case, though
> > not as good as the original:
> > 
> > real  3m34.015s
> > user  5m7.104s
> > sys   0m49.172s
> > 
> > I've also tried now with my own patch applied as well (linking
> > each drivers/*/built-in.o into vmlinux rather than having them
> > linked into drivers/built-in.o first), but that makes no
> > difference.
> 
> I just want to come back to this: because I've submitted the thin
> archives kbuild patch, I wanted to make sure we're doing okay on
> ARM/ARM64. I cross-compiled on my laptop.
> 
> For ARM64 allyesconfig:
> 
> After building then removing all built-in.o then rebuilding vmlinux:
> inclink
> time make ARCH=arm64 CROSS_COMPILE=aarch64-linux-gnu- -j8 vmlinux
> real  1m18.977s
> user  2m14.512s
> sys   0m29.704s
> 
> thinarc
> time make ARCH=arm64 CROSS_COMPILE=aarch64-linux-gnu- -j8 vmlinux
> real  1m18.433s
> user  2m6.128s
> sys   0m28.372s
> 
> 
> Final ld time
> inclink
> real  0m4.005s
> user  0m3.464s
> sys   0m0.536s
> 
> thinarc
> real  0m5.841s
> user  0m4.916s
> sys   0m0.916s
> 
> 
> Build directory size is of course much better (3953MB vs 5519MB).

Ok, looks great. Some downsides and some upsides here, but overall
I think this is a win.

> 
> For ARM, defconfig
> 
> After building then removing all built-in.o then rebuilding vmlinux:
> inclink
> real  0m19.593s
> user  0m22.372s
> sys   0m6.428s
> 
> thinarc
> real  0m18.919s
> user  0m21.924s
> sys   0m6.400s
> 
> 
> Final ld time
> inclink
> real  0m0.378s
> user  0m0.304s
> sys   0m0.076s
> 
> thinarc
> real  0m0.894s
> user  0m0.684s
> sys   0m0.200s

This also still seems fine.

> For both cases final link gets slower with thin archives. I guess there is
> some per-file overhead but I thought with --whole-archive it should not be
> that much slower. Still, overall time for main ar/ld phases comes out about
> the same in the end so I don't think it's too much problem. Unless ARM blows
> up significantly worse with a bigger config.

Unfortunately I think it does. I haven't tried your latest series yet,
but I think the total time for removing built-in.o and relinking went
up from around 4 minutes (already way too much) to 18 minutes for me.

> Linking with thin archives takes significantly more time in bfd hash lookup
> code. I haven't dug much further yet.

Can you try the ARM allyesconfig with thin archives? I'll follow up with two
patches: one to get ARM to link without thin archives, and one that I used
to get --gc-sections to work.

Arnd


Re: powerpc allyesconfig / allmodconfig linux-next next-20160729 - next-20160729 build failures

2016-08-11 Thread Nicholas Piggin
On Wed, 03 Aug 2016 22:13:28 +0200
Arnd Bergmann  wrote:

> On Wednesday, August 3, 2016 2:44:29 PM CEST Segher Boessenkool wrote:
> > Hi Arnd,
> > 
> > On Wed, Aug 03, 2016 at 08:52:48PM +0200, Arnd Bergmann wrote:  
> > > From my first look, it seems that all of lib/*.o is now getting linked
> > > into vmlinux, while we traditionally leave out everything from lib/
> > > that is not referenced.
> > > 
> > > I also see a noticeable overhead in link time, the numbers are for
> > > a cache-hot rebuild after a successful allyesconfig build, using a
> > > 24-way Opteron@2.5GHz, just relinking vmlinux:
> > > 
> > > $ time make skj30 vmlinux # before
> > > real  2m8.092s
> > > user  3m41.008s
> > > sys   0m48.172s
> > > 
> > > $ time make skj30 vmlinux # after
> > > real  4m10.189s
> > > user  5m43.804s
> > > sys   0m52.988s  
> > 
> > Is it better when using rcT instead of rcsT?  
> 
> It seems to be noticeably better for the clean rebuild case, though
> not as good as the original:
> 
> real  3m34.015s
> user  5m7.104s
> sys   0m49.172s
> 
> I've also tried now with my own patch applied as well (linking
> each drivers/*/built-in.o into vmlinux rather than having them
> linked into drivers/built-in.o first), but that makes no
> difference.

I just want to come back to this: because I've submitted the thin
archives kbuild patch, I wanted to make sure we're doing okay on
ARM/ARM64. I cross-compiled on my laptop.

For ARM64 allyesconfig:

After building then removing all built-in.o then rebuilding vmlinux:
inclink
time make ARCH=arm64 CROSS_COMPILE=aarch64-linux-gnu- -j8 vmlinux
real  1m18.977s
user  2m14.512s
sys   0m29.704s

thinarc
time make ARCH=arm64 CROSS_COMPILE=aarch64-linux-gnu- -j8 vmlinux
real  1m18.433s
user  2m6.128s
sys   0m28.372s


Final ld time
inclink
real  0m4.005s
user  0m3.464s
sys   0m0.536s

thinarc
real  0m5.841s
user  0m4.916s
sys   0m0.916s


Build directory size is of course much better (3953MB vs 5519MB).


For ARM, defconfig

After building then removing all built-in.o then rebuilding vmlinux:
inclink
real  0m19.593s
user  0m22.372s
sys   0m6.428s

thinarc
real  0m18.919s
user  0m21.924s
sys   0m6.400s


Final ld time
inclink
real  0m0.378s
user  0m0.304s
sys   0m0.076s

thinarc
real  0m0.894s
user  0m0.684s
sys   0m0.200s

For both cases final link gets slower with thin archives. I guess there is some
per-file overhead but I thought with --whole-archive it should not be that much
slower. Still, overall time for main ar/ld phases comes out about the same in
the end so I don't think it's too much problem. Unless ARM blows up
significantly worse with a bigger config.

Linking with thin archives takes significantly more time in bfd hash lookup
code. I haven't dug much further yet.
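
To put the wall-clock numbers above side by side, here is a minimal Python
sketch. It assumes the `time` output format shown in this message
(`<minutes>m<seconds>s`); the helper names `parse_time` and `slowdown` are
hypothetical and only serve the illustration.

```python
import re

def parse_time(text):
    """Convert a `time`-style duration such as '1m18.977s' to seconds."""
    m = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", text)
    if m is None:
        raise ValueError(f"unrecognised duration: {text!r}")
    return int(m.group(1)) * 60 + float(m.group(2))

def slowdown(inclink, thinarc):
    """Ratio of thin-archive time to incremental-link time (>1 means slower)."""
    return parse_time(thinarc) / parse_time(inclink)

# "real" figures copied from the message: (inclink, thinarc)
measurements = {
    "arm64 allyesconfig, rebuild": ("1m18.977s", "1m18.433s"),
    "arm64 allyesconfig, final ld": ("0m4.005s", "0m5.841s"),
    "arm defconfig, rebuild": ("0m19.593s", "0m18.919s"),
    "arm defconfig, final ld": ("0m0.378s", "0m0.894s"),
}

for name, (inclink, thinarc) in measurements.items():
    print(f"{name}: thinarc/inclink = {slowdown(inclink, thinarc):.2f}x")
```

On these figures the final ld step comes out roughly 1.5x slower for ARM64
allyesconfig and roughly 2.4x slower for ARM defconfig, while the overall
rebuild ratio stays close to 1.0x in both cases.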

Thanks,
Nick


Re: powerpc allyesconfig / allmodconfig linux-next next-20160729 - next-20160729 build failures

2016-08-06 Thread Arnd Bergmann
On Saturday, August 6, 2016 2:17:16 PM CEST Nicholas Piggin wrote:
> On Fri, 05 Aug 2016 21:16:00 +0200
> Arnd Bergmann  wrote:
> 
> > On Saturday, August 6, 2016 2:16:42 AM CEST Nicholas Piggin wrote:
> > > > 
> > > > diff --git a/include/asm-generic/vmlinux.lds.h b/include/asm-generic/vmlinux.lds.h
> > > > index 0ec807d69f18..7a3ad269fa23 100644
> > > > --- a/include/asm-generic/vmlinux.lds.h
> > > > +++ b/include/asm-generic/vmlinux.lds.h
> > > > @@ -433,7 +433,7 @@
> > > >   * during second ld run in second ld pass when generating System.map */
> > > >  #define TEXT_TEXT \
> > > >   ALIGN_FUNCTION(); \
> > > > - *(.text.hot .text .text.fixup .text.unlikely) \
> > > > + *(.text.hot .text .text.* .text.fixup .text.unlikely) \
> > > >   *(.ref.text) \
> > > >   MEM_KEEP(init.text) \
> > > >   MEM_KEEP(exit.text) \
> > > > 
> > > > 
> > > > It also got much faster again, the link time for an allyesconfig
> > > > kernel is now 18 minutes instead of 10 hours, but it's still
> > > > much worse than the 2 minutes I had earlier or the four minutes
> > > > with the previous patch.  
> > > 
> > > Are you using the patches I just sent?  
> > 
> > Not yet, I was still busy with the older version, and trying to
> > figure out exactly what went wrong in ld.bfd. FWIW, I first tried
> > to see if the hash tables were just too small, but as it turned
> > out that was not the problem. When I tried to change the default
> > hash table sizes, making them bigger only made things slower.
> > 
> > I also found the --hash-size=xxx option, which has a significant
> > impact on runtime speed. Interestingly again, using sizes less
> > than the default made things faster in practice. If we can
> > work out the optimum size for the kernel build, that might
> > shave a few minutes off the total build time.
> > 
> > > Either way, you also need
> > > to do the same for data and bss sections as you are using
> > > -fdata-sections too.  
> > 
> > Right.
> > 
> > > I've found virtually no build time regression on powerpc or x86
> > > when those are taken care of properly (x86 numbers I sent are typo,
> > > it's not 5m20, it's 5m02).  
> > 
> > Interesting. I wonder if it's got something to do with the
> > generation of the branch trampolines on ARM, as we have a lot
> > of them on an allyesconfig.
> 
> Powerpc generates quite a few branch trampolines as well, so
> I'm not sure if that would be the issue. Can you get a profile
> of the link?


CPU: AMD64 family15h, speed 2600 MHz (estimated)
Counted CPU_CLK_UNHALTED events (CPU Clocks not Halted) with a unit mask of 0x00 (No unit mask) count 10
samples   %        image name   symbol name
1212556   63.6990  ld-new       bfd_hash_lookup
416050    21.8563  ld-new       bfd_hash_hash
64861      3.4073  no-vmlinux   /no-vmlinux
59038      3.1014  ld-new       bfd_hash_traverse
13873      0.7288  ld-new       bfd_get_next_section_by_name
9880       0.5190  ld-new       strrevcmp

I've manually marked bfd_hash_hash as __attribute__((noinline))
to see it separately from bfd_hash_lookup.

The vast majority of these calls seem to come from _bfd_elf_strtab_add
and from bfd_get_section_by_name/bfd_get_next_section_by_name.

While I first thought the hash tables were too slow, investigating
further showed that most of the hash tables are really small
(and appropriately sized), we just do a lot of lookups on them.
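
As a rough cross-check of how much of the link is spent in the hash code,
the percentages from the profile above can be summed directly. This is a
minimal sketch: the rows are copied from the report, and the helper name
`share` is hypothetical.

```python
# (samples, percent, image, symbol) rows copied from the oprofile report above.
profile = [
    (1212556, 63.6990, "ld-new", "bfd_hash_lookup"),
    (416050, 21.8563, "ld-new", "bfd_hash_hash"),
    (64861, 3.4073, "no-vmlinux", "/no-vmlinux"),
    (59038, 3.1014, "ld-new", "bfd_hash_traverse"),
    (13873, 0.7288, "ld-new", "bfd_get_next_section_by_name"),
    (9880, 0.5190, "ld-new", "strrevcmp"),
]

def share(rows, prefix):
    """Sum the percentage column for symbols whose name starts with `prefix`."""
    return sum(pct for _, pct, _, sym in rows if sym.startswith(prefix))

hash_share = share(profile, "bfd_hash_")
print(f"bfd_hash_* share of samples: {hash_share:.2f}%")
```

The three bfd_hash_* symbols together account for close to 89% of the
samples listed, which fits the observation that the cost comes from the
sheer number of lookups.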

> Are you linking with archives? Do your input archives have a
> symbol index built?

Yes, and I don't know. I've moved on to your new patches now, and will
see how that goes.

> > Is the 5m20 the total build time for the kernel, the time for
> > rebuilding after a trivial change, or the time to call 'ld.bfd'
> > once?
> 
> 5m02 was the total time for x86 defconfig. With the powerpc
> allyesconfig build, the final link:
> 
> $ time ld -EL -m elf64lppc -pie --emit-relocs --build-id --gc-sections -X \
>     -o vmlinux -T ./arch/powerpc/kernel/vmlinux.lds --whole-archive \
>     built-in.o .tmp_kallsyms2.o
> 
> real  0m15.556s
> user  0m13.288s
> sys   0m2.240s
> 
> $ ls -lh vmlinux
> -rwxrwxr-x 1 npiggin npiggin 279M Aug  6 14:02 vmlinux
> 
> Without -pie --emit-relocs it's 11.8s and 150M but I'm using
> emit-relocs for a post-link step.

Interesting, that does sound more like an ARM-specific bug in ld, then.

> > Are you using ld.bfd on x86 or ld.gold? For me ld.gold either
> > works and is really fast, or it crashes, depending on the
> > configuration. I also don't think it supports big-endian ARM
> > (which is what allyesconfig ends up using).
> 
> ld.bfd on both. Gold crashed on powerpc and I didn't try it on x86.


Ok.

Arnd

Re: powerpc allyesconfig / allmodconfig linux-next next-20160729 - next-20160729 build failures

2016-08-06 Thread Nicholas Piggin
On Fri, 05 Aug 2016 21:16:00 +0200
Arnd Bergmann  wrote:

> On Saturday, August 6, 2016 2:16:42 AM CEST Nicholas Piggin wrote:
> > > 
> > > diff --git a/include/asm-generic/vmlinux.lds.h b/include/asm-generic/vmlinux.lds.h
> > > index 0ec807d69f18..7a3ad269fa23 100644
> > > --- a/include/asm-generic/vmlinux.lds.h
> > > +++ b/include/asm-generic/vmlinux.lds.h
> > > @@ -433,7 +433,7 @@
> > >   * during second ld run in second ld pass when generating System.map */
> > >  #define TEXT_TEXT \
> > >   ALIGN_FUNCTION(); \
> > > - *(.text.hot .text .text.fixup .text.unlikely) \
> > > + *(.text.hot .text .text.* .text.fixup .text.unlikely) \
> > >   *(.ref.text) \
> > >   MEM_KEEP(init.text) \
> > >   MEM_KEEP(exit.text) \
> > > 
> > > 
> > > It also got much faster again, the link time for an allyesconfig
> > > kernel is now 18 minutes instead of 10 hours, but it's still
> > > much worse than the 2 minutes I had earlier or the four minutes
> > > with the previous patch.  
> > 
> > Are you using the patches I just sent?  
> 
> Not yet, I was still busy with the older version, and trying to
> figure out exactly what went wrong in ld.bfd. FWIW, I first tried
> to see if the hash tables were just too small, but as it turned
> out that was not the problem. When I tried to change the default
> hash table sizes, making them bigger only made things slower.
> 
> I also found the --hash-size=xxx option, which has a significant
> impact on runtime speed. Interestingly again, using sizes less
> than the default made things faster in practice. If we can
> work out the optimum size for the kernel build, that might
> shave a few minutes off the total build time.
> 
> > Either way, you also need
> > to do the same for data and bss sections as you are using
> > -fdata-sections too.  
> 
> Right.
> 
> > I've found virtually no build time regression on powerpc or x86
> > when those are taken care of properly (x86 numbers I sent are typo,
> > it's not 5m20, it's 5m02).  
> 
> Interesting. I wonder if it's got something to do with the
> generation of the branch trampolines on ARM, as we have a lot
> of them on an allyesconfig.

Powerpc generates quite a few branch trampolines as well, so
I'm not sure if that would be the issue. Can you get a profile
of the link?

Are you linking with archives? Do your input archives have a
symbol index built?


> Is the 5m20 the total build time for the kernel, the time for
> rebuilding after a trivial change, or the time to call 'ld.bfd'
> once?

5m02 was the total time for x86 defconfig. With the powerpc
allyesconfig build, the final link:

$ time ld -EL -m elf64lppc -pie --emit-relocs --build-id --gc-sections -X \
    -o vmlinux -T ./arch/powerpc/kernel/vmlinux.lds --whole-archive \
    built-in.o .tmp_kallsyms2.o

real  0m15.556s
user  0m13.288s
sys   0m2.240s

$ ls -lh vmlinux
-rwxrwxr-x 1 npiggin npiggin 279M Aug  6 14:02 vmlinux

Without -pie --emit-relocs it's 11.8s and 150M but I'm using
emit-relocs for a post-link step.


> Are you using ld.bfd on x86 or ld.gold? For me ld.gold either
> works and is really fast, or it crashes, depending on the
> configuration. I also don't think it supports big-endian ARM
> (which is what allyesconfig ends up using).

ld.bfd on both. Gold crashed on powerpc and I didn't try it on x86.

Thanks,
Nick


Re: powerpc allyesconfig / allmodconfig linux-next next-20160729 - next-20160729 build failures

2016-08-05 Thread Arnd Bergmann
On Saturday, August 6, 2016 2:16:42 AM CEST Nicholas Piggin wrote:
> > 
> > diff --git a/include/asm-generic/vmlinux.lds.h 
> > b/include/asm-generic/vmlinux.lds.h
> > index 0ec807d69f18..7a3ad269fa23 100644
> > --- a/include/asm-generic/vmlinux.lds.h
> > +++ b/include/asm-generic/vmlinux.lds.h
> > @@ -433,7 +433,7 @@
> >   * during second ld run in second ld pass when generating System.map */
> >  #define TEXT_TEXT\
> >   ALIGN_FUNCTION();   \
> > - *(.text.hot .text .text.fixup .text.unlikely)   \
> > + *(.text.hot .text .text.* .text.fixup .text.unlikely)   \
> >   *(.ref.text)\
> >   MEM_KEEP(init.text) \
> >   MEM_KEEP(exit.text) \
> > 
> > 
> > It also got much faster again, the link time for an allyesconfig
> > kernel is now 18 minutes instead of 10 hours, but it's still
> > much worse than the 2 minutes I had earlier or the four minutes
> > with the previous patch.
> 
> Are you using the patches I just sent?

Not yet, I was still busy with the older version, and trying to
figure out exactly what went wrong in ld.bfd. FWIW, I first tried
to see if the hash tables were just too small, but as it turned
out that was not the problem. When I tried to change the default
hash table sizes, making them bigger only made things slower.

I also found the --hash-size=xxx option, which has a significant
impact on runtime speed. Interestingly again, using sizes less
than the default made things faster in practice. If we can
work out the optimum size for the kernel build, that might
shave a few minutes off the total build time.
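
One way to look for that optimum would be to time the final link over a handful of --hash-size values. A rough Python sketch, assuming the link command is available as an argument list; sweep_hash_sizes, the candidate sizes and the injectable run callable are made up for illustration, only --hash-size= itself is an ld option:

```python
import subprocess
import time


def sweep_hash_sizes(base_cmd, sizes, run=subprocess.run):
    """Time one link per candidate --hash-size value.

    base_cmd is the ld command as a list of arguments. run is injectable
    (hypothetical convenience) so the sweep can be dry-run without
    invoking the linker. Returns (size, seconds) pairs, fastest first.
    """
    results = []
    for size in sizes:
        cmd = list(base_cmd) + ["--hash-size=%d" % size]
        start = time.perf_counter()
        run(cmd, check=True)
        results.append((size, time.perf_counter() - start))
    return sorted(results, key=lambda pair: pair[1])
```

Each measurement is a single run, so on a noisy machine the sweep would need repeating before drawing conclusions.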

> Either way, you also need
> to do the same for data and bss sections as you are using
> -fdata-sections too.

Right.

> I've found virtually no build time regression on powerpc or x86
> when those are taken care of properly (x86 numbers I sent are typo,
> it's not 5m20, it's 5m02).

Interesting. I wonder if it's got something to do with the
generation of the branch trampolines on ARM, as we have a lot
of them on an allyesconfig.

Is the 5m20 the total build time for the kernel, the time for
rebuilding after a trivial change, or the time to call 'ld.bfd'
once?

Are you using ld.bfd on x86 or ld.gold? For me ld.gold either
works and is really fast, or it crashes, depending on the
configuration. I also don't think it supports big-endian ARM
(which is what allyesconfig ends up using).

Arnd


Re: powerpc allyesconfig / allmodconfig linux-next next-20160729 - next-20160729 build failures

2016-08-05 Thread Nicholas Piggin
On Fri, 05 Aug 2016 18:01:13 +0200
Arnd Bergmann  wrote:

> On Friday, August 5, 2016 10:26:25 PM CEST Nicholas Piggin wrote:
> > On Fri, 05 Aug 2016 12:17:27 +0200
> > Arnd Bergmann  wrote:  
> 
> > > and I also get link errors for the .text.fixup section
> > > for any users of __put_user() in really large kernels:
> > > net/batman-adv/batman-adv.o:(.text.fixup+0x4): relocation truncated to 
> > > fit: R_ARM_JUMP24 against `.text.batadv_log_read'  
> > 
> > This may be fixed by fixing the linker script to bring in the new
> > sections properly (see new patchset).
> > 
> > If not, then if you can combine the sections rather than have them
> > consecutive in the output, e.g.,:
> > 
> > *(.text .text.fixup)
> > 
> > Rather than
> > 
> > *(.text)
> > *(.text.fixup)
> > 
> > Then the linker has more freedom to rearrange them. I realize it's
> > not that simple with ARM's .text.fixup, but maybe that helps you
> > get it to work.  
> 
> This did the trick:
> 
> diff --git a/include/asm-generic/vmlinux.lds.h 
> b/include/asm-generic/vmlinux.lds.h
> index 0ec807d69f18..7a3ad269fa23 100644
> --- a/include/asm-generic/vmlinux.lds.h
> +++ b/include/asm-generic/vmlinux.lds.h
> @@ -433,7 +433,7 @@
>   * during second ld run in second ld pass when generating System.map */
>  #define TEXT_TEXT\
>   ALIGN_FUNCTION();   \
> - *(.text.hot .text .text.fixup .text.unlikely)   \
> + *(.text.hot .text .text.* .text.fixup .text.unlikely)   \
>   *(.ref.text)\
>   MEM_KEEP(init.text) \
>   MEM_KEEP(exit.text) \
> 
> 
> It also got much faster again, the link time for an allyesconfig
> kernel is now 18 minutes instead of 10 hours, but it's still
> much worse than the 2 minutes I had earlier or the four minutes
> with the previous patch.

Are you using the patches I just sent? Either way, you also need
to do the same for data and bss sections as you are using
-fdata-sections too.

I've found virtually no build time regression on powerpc or x86
when those are taken care of properly (the x86 number I sent had a
typo: it's 5m02, not 5m20).

Thanks,
Nick


Re: powerpc allyesconfig / allmodconfig linux-next next-20160729 - next-20160729 build failures

2016-08-05 Thread Arnd Bergmann
On Friday, August 5, 2016 10:26:25 PM CEST Nicholas Piggin wrote:
> On Fri, 05 Aug 2016 12:17:27 +0200
> Arnd Bergmann  wrote:

> > and I also get link errors for the .text.fixup section
> > for any users of __put_user() in really large kernels:
> > net/batman-adv/batman-adv.o:(.text.fixup+0x4): relocation truncated to fit: 
> > R_ARM_JUMP24 against `.text.batadv_log_read'
> 
> This may be fixed by fixing the linker script to bring in the new
> sections properly (see new patchset).
> 
> If not, then if you can combine the sections rather than have them
> consecutive in the output, e.g.,:
> 
> *(.text .text.fixup)
> 
> Rather than
> 
> *(.text)
> *(.text.fixup)
> 
> Then the linker has more freedom to rearrange them. I realize it's
> not that simple with ARM's .text.fixup, but maybe that helps you
> get it to work.

This did the trick:

diff --git a/include/asm-generic/vmlinux.lds.h b/include/asm-generic/vmlinux.lds.h
index 0ec807d69f18..7a3ad269fa23 100644
--- a/include/asm-generic/vmlinux.lds.h
+++ b/include/asm-generic/vmlinux.lds.h
@@ -433,7 +433,7 @@
  * during second ld run in second ld pass when generating System.map */
 #define TEXT_TEXT                                                      \
                ALIGN_FUNCTION();                                       \
-               *(.text.hot .text .text.fixup .text.unlikely)           \
+               *(.text.hot .text .text.* .text.fixup .text.unlikely)   \
                *(.ref.text)                                            \
        MEM_KEEP(init.text)                                             \
        MEM_KEEP(exit.text)                                             \
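
To illustrate why the extra .text.* pattern matters with -ffunction-sections: every function lands in its own .text.<name> input section, and none of those match the old pattern list. A small Python sketch, using fnmatch as an approximation of ld's wildcard matching (an assumption, the two are not identical in every corner case); the section list is just taken from the error messages in this thread:

```python
from fnmatch import fnmatchcase

OLD_PATTERNS = [".text.hot", ".text", ".text.fixup", ".text.unlikely"]
NEW_PATTERNS = [".text.hot", ".text", ".text.*", ".text.fixup", ".text.unlikely"]


def matched(section, patterns):
    """True if an input section name matches any pattern in the list."""
    return any(fnmatchcase(section, p) for p in patterns)


def orphans(sections, patterns):
    """Input sections that no pattern in the list picks up."""
    return [s for s in sections if not matched(s, patterns)]


sections = [".text", ".text.fixup", ".text.batadv_log_read", ".text.sg_ioctl"]
print(orphans(sections, OLD_PATTERNS))  # the per-function sections are left over
print(orphans(sections, NEW_PATTERNS))  # nothing is left over
```

Sections that no pattern picks up are not merged into the main text output section, which would fit the very large section count and slow link seen before this change.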


It also got much faster again: the link time for an allyesconfig
kernel is now 18 minutes instead of 10 hours, but it's still
much worse than the 2 minutes I had earlier or the four minutes
with the previous patch.

Arnd


Re: powerpc allyesconfig / allmodconfig linux-next next-20160729 - next-20160729 build failures

2016-08-05 Thread Nicholas Piggin
On Fri, 05 Aug 2016 12:17:27 +0200
Arnd Bergmann  wrote:

> On Friday, August 5, 2016 6:41:08 PM CEST Nicholas Piggin wrote:
> > On Thu, 4 Aug 2016 12:06:41 -0500
> > Segher Boessenkool  wrote:
> >   
> > > On Thu, Aug 04, 2016 at 06:10:57PM +0200, Arnd Bergmann wrote:  
> > > > On Thursday, August 4, 2016 9:47:13 PM CEST Nicholas Piggin wrote:
> > > > 
> > > > > + __used  \
> > > > > + __attribute__((section("___kentry" "+" #sym ",\"a\",@note #"), 
> > > > > used)) \
> > > > 
> > > > 
> > > > I've just started testing this, but the first problem I ran into
> > > > is that @ and # are special characters that have an architecture
> > > > specific meaning to the assembler. On ARM, you need "%note @" instead
> > > > of "@note #".
> > > 
> > > That comment trick (I still feel guilty about it) causes more problems
> > > than it solves.  Please don't try to use it :-)  
> > 
> > Yeah that's a funny hack. I don't think it's required though, but I'm just
> > running through some more tests.
> > 
> > I think I found an improvement with the thin archives as well -- we were
> > still building symbol table after removing the s option (that only avoids
> > index). "S" is required to not build symbol table.
> > 
> > I'll send out an RFC on a slightly more polished patch series shortly.  
> 
> 
> I could not find Nico's patches, but based on the information in his
> presentation at
> 
> https://www.linuxplumbersconf.org/2015/ocw//system/presentations/3369/original/slides.html#(1)
> 
> I created a patch for ARM that mirrors what you have for powerpc, see
> below.

Great, thanks for jumping in. I posted another set, which is a lot
improved, that you should pick up.


> I have successfully built normal-sized kernels with this (not tried
> running them). Unfortunately, the build time for "allyesconfig"
> kernel explodes, the final link time is now in the hours instead of
> minutes (no exact numbers unfortunately, it takes too long to
> reproduce),

That's because we need to coalesce the new sections properly into the
output file. binutils does not cope with a vast number of sections in
the final linked file: it spends all its time in hash lookups and then
usually explodes.


> and I also get link errors for the .text.fixup section
> for any users of __put_user() in really large kernels:
> net/batman-adv/batman-adv.o:(.text.fixup+0x4): relocation truncated to fit: 
> R_ARM_JUMP24 against `.text.batadv_log_read'

This may be fixed by fixing the linker script to bring in the new
sections properly (see new patchset).

If not, then if you can combine the sections rather than have them
consecutive in the output, e.g.,:

*(.text .text.fixup)

Rather than

*(.text)
*(.text.fixup)

Then the linker has more freedom to rearrange them. I realize it's
not that simple with ARM's .text.fixup, but maybe that helps you
get it to work.
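
The difference between the two forms can be shown with a toy model in Python. This is only a sketch of the ordering rule as documented for ld (patterns in one statement are intermingled in link order, separate statements are laid out one after another), not ld's implementation; place() and the a.o/b.o inputs are made up for illustration:

```python
def place(inputs, statements):
    """Lay out input sections for a list of linker-script statements.

    inputs: (object file, section name) pairs in link order.
    statements: each statement is a list of section names, as in
    *(.text .text.fixup). Sections matched by one statement keep their
    link order; statements are processed one after another.
    """
    layout = []
    for names in statements:
        layout += [item for item in inputs if item[1] in names]
    return layout


inputs = [
    ("a.o", ".text"), ("a.o", ".text.fixup"),
    ("b.o", ".text"), ("b.o", ".text.fixup"),
]

combined = place(inputs, [[".text", ".text.fixup"]])    # *(.text .text.fixup)
separate = place(inputs, [[".text"], [".text.fixup"]])  # *(.text) *(.text.fixup)
```

In the combined layout each file's fixup code stays next to that file's text; in the separate layout all fixup code ends up after all text, which is where branch distances can grow too large in a big kernel.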

Thanks,
Nick


Re: powerpc allyesconfig / allmodconfig linux-next next-20160729 - next-20160729 build failures

2016-08-05 Thread Arnd Bergmann
On Friday, August 5, 2016 6:41:08 PM CEST Nicholas Piggin wrote:
> On Thu, 4 Aug 2016 12:06:41 -0500
> Segher Boessenkool  wrote:
> 
> > On Thu, Aug 04, 2016 at 06:10:57PM +0200, Arnd Bergmann wrote:
> > > On Thursday, August 4, 2016 9:47:13 PM CEST Nicholas Piggin wrote:
> > >   
> > > > +   __used  \
> > > > +   __attribute__((section("___kentry" "+" #sym ",\"a\",@note #"), 
> > > > used)) \  
> > > 
> > > 
> > > I've just started testing this, but the first problem I ran into
> > > is that @ and # are special characters that have an architecture
> > > specific meaning to the assembler. On ARM, you need "%note @" instead
> > > of "@note #".  
> > 
> > That comment trick (I still feel guilty about it) causes more problems
> > than it solves.  Please don't try to use it :-)
> 
> Yeah that's a funny hack. I don't think it's required though, but I'm just
> running through some more tests.
> 
> I think I found an improvement with the thin archives as well -- we were
> still building symbol table after removing the s option (that only avoids
> index). "S" is required to not build symbol table.
> 
> I'll send out an RFC on a slightly more polished patch series shortly.


I could not find Nico's patches, but based on the information in his
presentation at

https://www.linuxplumbersconf.org/2015/ocw//system/presentations/3369/original/slides.html#(1)

I created a patch for ARM that mirrors what you have for powerpc, see
below.

I have successfully built normal-sized kernels with this (not tried
running them). Unfortunately, the build time for "allyesconfig"
kernel explodes, the final link time is now in the hours instead of
minutes (no exact numbers unfortunately, it takes too long to
reproduce), and I also get link errors for the .text.fixup section
for any users of __put_user() in really large kernels:

net/batman-adv/batman-adv.o:(.text.fixup+0x4): relocation truncated to fit: R_ARM_JUMP24 against `.text.batadv_log_read'
...
drivers/scsi/sg.o:(.text.fixup+0x4): relocation truncated to fit: R_ARM_THM_JUMP24 against `.text.sg_ioctl'
drivers/scsi/sg.o:(.text.fixup+0xc): relocation truncated to fit: R_ARM_THM_JUMP24 against `.text.sg_ioctl'
drivers/scsi/sg.o:(.text.fixup+0x14): relocation truncated to fit: R_ARM_THM_JUMP24 against `.text.sg_ioctl'
...

This originates from

#define __put_user_asm(x, __pu_addr, err, instr)               \
        __asm__ __volatile__(                                   \
        "1: " TUSER(instr) " %1, [%2], #0\n"                    \
        "2:\n"                                                  \
        "   .pushsection .text.fixup,\"ax\"\n"                  \
        "   .align  2\n"                                        \
        "3: mov %0, %3\n"                                       \
        "   b   2b\n"                                           \
        "   .popsection\n"                                      \
        "   .pushsection __ex_table,\"a\"\n"                    \
        "   .align  3\n"                                        \
        "   .long   1b, 3b\n"                                   \
        "   .popsection"                                        \
        : "+r" (err)                                            \
        : "r" (x), "r" (__pu_addr), "i" (-EFAULT)               \
        : "cc")
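
The "b 2b" in the fixup stub is a direct branch from .text.fixup back into the function's own section, so the two have to stay within branch range of each other. A minimal Python sketch of that range check, using the architectural limits of the ARM B and Thumb-2 B.W encodings; branch_fits and the example distances are made up for illustration, and the displacement is taken as already relative to the PC value the instruction sees:

```python
# Reach of a direct branch for the two relocation types in the errors above.
# R_ARM_JUMP24: ARM-mode B, signed 24-bit word offset, about +/-32 MiB.
# R_ARM_THM_JUMP24: Thumb-2 B.W, signed 25-bit byte offset with bit 0
# clear, about +/-16 MiB.
BRANCH_RANGE = {
    "R_ARM_JUMP24": (-(1 << 25), (1 << 25) - 4),
    "R_ARM_THM_JUMP24": (-(1 << 24), (1 << 24) - 2),
}


def branch_fits(reloc, displacement):
    """True if a byte displacement is encodable for this relocation type."""
    lo, hi = BRANCH_RANGE[reloc]
    return lo <= displacement <= hi


# A fixup stub placed 20 MiB away from its function (hypothetical distance):
print(branch_fits("R_ARM_JUMP24", 20 << 20))      # within ARM-mode reach
print(branch_fits("R_ARM_THM_JUMP24", 20 << 20))  # beyond Thumb-2 reach
```

When the displacement does not fit and the linker cannot insert a veneer for it, the result is the "relocation truncated to fit" error.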

Arnd

diff --git a/arch/arm/Makefile b/arch/arm/Makefile
index 842f46af5b9d..b4fc91603429 100644
--- a/arch/arm/Makefile
+++ b/arch/arm/Makefile
@@ -362,6 +362,8 @@ archclean:
 # My testing targets (bypasses dependencies)
 bp:;   $(Q)$(MAKE) $(build)=$(boot) MACHINE=$(MACHINE) $(boot)/bootpImage
 
+KBUILD_CFLAGS  += -ffunction-sections -fdata-sections
+LDFLAGS_vmlinux+= --gc-sections
 
 define archhelp
   echo  '* zImage- Compressed kernel image (arch/$(ARCH)/boot/zImage)'
diff --git a/arch/arm/kernel/Makefile b/arch/arm/kernel/Makefile
index ad325a8c7e1e..f0eca9a96005 100644
--- a/arch/arm/kernel/Makefile
+++ b/arch/arm/kernel/Makefile
@@ -11,6 +11,9 @@ CFLAGS_REMOVE_insn.o = -pg
 CFLAGS_REMOVE_patch.o = -pg
 endif
 
+ccflags-y  += -fno-function-sections -fno-data-sections
+subdir-ccflags-y   += -fno-function-sections -fno-data-sections
+
 CFLAGS_REMOVE_return_address.o = -pg
 
 # Object file lists.
diff --git a/arch/arm/kernel/vmlinux-xip.lds.S b/arch/arm/kernel/vmlinux-xip.lds.S
index 56c8bdf776bd..ef7d8d7a997b 100644
--- a/arch/arm/kernel/vmlinux-xip.lds.S
+++ b/arch/arm/kernel/vmlinux-xip.lds.S
@@ -12,17 +12,17 @@
 #define PROC_INFO                                                      \
        . = ALIGN(4);                                                   \
        VMLINUX_SYMBOL(__proc_info_begin) = .;                          \
-       *(.proc.info.init)                                              \
+       KEEP(*(.proc.info.init))                                        \
        VMLINUX_SYMBOL(__proc_info_end) = .;
 
 #define IDMAP_TEXT   

Re: powerpc allyesconfig / allmodconfig linux-next next-20160729 - next-20160729 build failures

2016-08-05 Thread Nicholas Piggin
On Thu, 4 Aug 2016 12:06:41 -0500
Segher Boessenkool  wrote:

> On Thu, Aug 04, 2016 at 06:10:57PM +0200, Arnd Bergmann wrote:
> > On Thursday, August 4, 2016 9:47:13 PM CEST Nicholas Piggin wrote:
> >   
> > > + __used  \
> > > + __attribute__((section("___kentry" "+" #sym ",\"a\",@note #"), used)) \ 
> > >  
> > 
> > 
> > I've just started testing this, but the first problem I ran into
> > is that @ and # are special characters that have an architecture
> > specific meaning to the assembler. On ARM, you need "%note @" instead
> > of "@note #".  
> 
> That comment trick (I still feel guilty about it) causes more problems
> than it solves.  Please don't try to use it :-)

Yeah that's a funny hack. I don't think it's required though, but I'm just
running through some more tests.

I think I found an improvement with the thin archives as well -- we were
still building the symbol table after removing the s option (that only
avoids the index). "S" is required to not build the symbol table.

I'll send out an RFC on a slightly more polished patch series shortly.

Thanks,
Nick


Re: powerpc allyesconfig / allmodconfig linux-next next-20160729 - next-20160729 build failures

2016-08-04 Thread Segher Boessenkool
On Thu, Aug 04, 2016 at 06:10:57PM +0200, Arnd Bergmann wrote:
> On Thursday, August 4, 2016 9:47:13 PM CEST Nicholas Piggin wrote:
> 
> > +   __used  \
> > +   __attribute__((section("___kentry" "+" #sym ",\"a\",@note #"), used)) \
> 
> 
> I've just started testing this, but the first problem I ran into
> is that @ and # are special characters that have an architecture
> specific meaning to the assembler. On ARM, you need "%note @" instead
> of "@note #".

That comment trick (I still feel guilty about it) causes more problems
than it solves.  Please don't try to use it :-)


Segher


Re: powerpc allyesconfig / allmodconfig linux-next next-20160729 - next-20160729 build failures

2016-08-04 Thread Arnd Bergmann
On Thursday, August 4, 2016 9:47:13 PM CEST Nicholas Piggin wrote:

> + __used  \
> + __attribute__((section("___kentry" "+" #sym ",\"a\",@note #"), used)) \


I've just started testing this, but the first problem I ran into
is that @ and # are special characters that have an architecture
specific meaning to the assembler. On ARM, you need "%note @" instead
of "@note #".
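
To spell out what goes wrong: the compiler pastes the section string into a .section directive and appends its own flags, and the trailing comment character is there to hide those flags from the assembler. A rough Python sketch of that, assuming the compiler appends ,"aw" plus a progbits type (the exact flags depend on the object) and modelling only the comment and type-prefix characters of the two targets discussed here; the function names are made up:

```python
# Line-comment character and section-type prefix for GNU as on the two
# targets discussed in this thread.
ASM_SYNTAX = {
    "powerpc": {"comment": "#", "type_prefix": "@"},
    "arm": {"comment": "@", "type_prefix": "%"},
}


def emitted_directive(section_attr, arch):
    """Approximate the .section line the compiler writes for a variable."""
    prefix = ASM_SYNTAX[arch]["type_prefix"]
    return '.section %s,"aw",%sprogbits' % (section_attr, prefix)


def seen_by_assembler(line, arch):
    """Drop everything from the first line-comment character onwards."""
    return line.split(ASM_SYNTAX[arch]["comment"], 1)[0].rstrip()


ppc = seen_by_assembler(emitted_directive('___kentry+sym,"a",@note #', "powerpc"), "powerpc")
arm_bad = seen_by_assembler(emitted_directive('___kentry+sym,"a",@note #', "arm"), "arm")
arm_ok = seen_by_assembler(emitted_directive('___kentry+sym,"a",%note @', "arm"), "arm")
```

With the powerpc string on ARM, the "@" already starts the comment, so the assembler is left with a directive that ends after the flags and has no section type.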

Arnd


Re: powerpc allyesconfig / allmodconfig linux-next next-20160729 - next-20160729 build failures

2016-08-04 Thread Nicholas Piggin
On Thu, 4 Aug 2016 22:31:39 +1000
Nicholas Piggin  wrote:
> On Thu, 04 Aug 2016 14:09:02 +0200
> Arnd Bergmann  wrote:
> > Nicolas Pitre has done some related work, adding him to Cc. IIRC we have
> > actually had multiple implementations of -ffunction-sections/--gc-sections
> > in the past that people have used in production, but none of them
> > ever made it upstream.  

After some googling around it seems LTO has been difficult to
get in and it was agreed this gc-sections should be done first
anyway (although LTO may indeed provide a superset of DCE, but
it's always going to be more costly and complicated). LTO would
have the same issue with liveness of entry points, which is
really the only thing you need to change in the kernel as far as I
can see.

I didn't really see what problems people were having with it
though, so maybe it's architecture specific or something I
haven't run into yet.

Thanks,
Nick


Re: powerpc allyesconfig / allmodconfig linux-next next-20160729 - next-20160729 build failures

2016-08-04 Thread Arnd Bergmann
On Thursday, August 4, 2016 11:54:18 PM CEST Nicholas Piggin wrote:
> On Thu, 4 Aug 2016 22:31:39 +1000
> Nicholas Piggin  wrote:
> > On Thu, 04 Aug 2016 14:09:02 +0200
> > Arnd Bergmann  wrote:
> > > Nicolas Pitre has done some related work, adding him to Cc. IIRC we have
> > > actually had multiple implementations of -ffunction-sections/--gc-sections
> > > in the past that people have used in production, but none of them
> > > ever made it upstream.  
> 
> After some googling around it seems LTO has been difficult to
> get in and it was agreed this gc-sections should be done first
> anyway (although LTO may indeed provide a superset of DCE, but
> it's always going to be more costly and complicated). LTO would
> have the same issue with liveness of entry points, which is
> really the only thing you need to change in the kernel as far as I
> can see.

Ok, good.

> I didn't really see what problems people were having with it
> though, so maybe it's architecture specific or something I
> haven't run into yet.

I remember trying it a few years ago without success; it's possible
that old binutils versions were more problematic.

I'm happy to test your patches on ARM. With my randconfig builder
I tend to find obscure bugs in corner cases that you might not
normally find with just defconfig/allmodconfig builds.

Arnd


Re: powerpc allyesconfig / allmodconfig linux-next next-20160729 - next-20160729 build failures

2016-08-04 Thread Nicholas Piggin
On Thu, 04 Aug 2016 14:09:02 +0200
Arnd Bergmann  wrote:

> On Thursday, August 4, 2016 9:47:13 PM CEST Nicholas Piggin wrote:
> > On Thu, 04 Aug 2016 12:37:41 +0200 Arnd Bergmann  wrote:  
> > > On Thursday, August 4, 2016 11:00:49 AM CEST Arnd Bergmann wrote:  
> > > > I tried this
> > > > 
> > > > diff --git a/scripts/link-vmlinux.sh b/scripts/link-vmlinux.sh
> > > > index b5e40ed86e60..89bca1a25916 100755
> > > > --- a/scripts/link-vmlinux.sh
> > > > +++ b/scripts/link-vmlinux.sh
> > > > @@ -44,7 +44,7 @@ modpost_link()
> > > >          local objects
> > > >  
> > > >          if [ -n "${CONFIG_THIN_ARCHIVES}" ]; then
> > > > -                objects="--whole-archive ${KBUILD_VMLINUX_INIT} ${KBUILD_VMLINUX_MAIN} --no-whole-archive"
> > > > +                objects="${KBUILD_VMLINUX_INIT} ${KBUILD_VMLINUX_MAIN}"
> > > >          else
> > > >                  objects="${KBUILD_VMLINUX_INIT} --start-group ${KBUILD_VMLINUX_MAIN} --end-group"
> > > >          fi
> > > > 
> > > > but that did not seem to change anything, the extra symbols are
> > > > still there. I have not tried to understand what that actually
> > > > does, so maybe I misunderstood your suggestion.
> > > > 
> > > 
> > > On a second attempt, I did the same change for vmlinux instead of the
> > > module (d'oh), and got a link failure instead:
> > > 
> > > 
> > > arch/arm/mm/proc-xscale.o: In function `cpu_xscale_do_resume':
> > > (.text+0x3d4): undefined reference to `cpu_resume_mmu'
> > > arch/arm/kernel/setup.o: In function `setup_arch':
> > > ...
> > > 
> > > However, I also see a link failure in some rare configurations
> > > with just your patch:
> > > 
> > > arch/arm/lib/lib.a(io-acorn.o): In function `outsl':
> > > (.text+0x38): undefined reference to `printk'
> > > 
> > > The problem being a file in a library object that is not referenced,
> > > but that references another symbol that is not defined
> > > (CONFIG_PRINTK=n).  
> > 
> > The first problem is the existing link system is buggy. I think an
> > unconditional switch to --whole-archive (at least for modular kernels)
> > should probably be done anyway. For example, on powerpc when building
> > with --whole-archive, I have:
> > 
> > +dma_noop_alloc
> > +dma_noop_free
> > +dma_noop_map_page
> > +dma_noop_mapping_error
> > +dma_noop_map_sg
> > +dma_noop_ops
> > +dma_noop_supported
> > +fdt_add_reservemap_entry
> > +fdt_begin_node
> > +fdt_create
> > +fdt_create_empty_tree
> > +fdt_end_node
> > +fdt_errtable
> > +find_cpio_data
> > +ioremap_page_range
> > 
> > find_cpio_data is unnecessary and it's a codesize regression to link it.
> > But dma_noop_ops and ioremap_page_range are exported symbols. If I
> > reference dma_noop_ops from some random module with otherwise unpatched
> > kernel:
> > 
> > ERROR: "dma_noop_ops" [drivers/char/bsr.ko] undefined!  
> 
> Right, but only on s390, which is the one architecture using this.
> I think we should just have a Kconfig symbol for this file that
> gets selected by any architecture that needs it.

No, the problem is that the module is being selected and built
but it is missing from the vmlinux despite being exported.


> This is also what we have ended up doing for almost all other
> files in lib/
> 
> > The real problem is that our linkage requirements are like a shared
> > library when we build modular.
> > 
> > We could build a list of exports and make it link objects with those
> > symbols, to solve this, but IMO that's just wasting lipstick on a pig.
> > But I will propose a patch to always use --whole-archive, thin
> > archives or not, and transition all archs over to it in a few release
> > cycles. It just works by luck right now.
> >
> > Why is it a pig? Because having the linker notice no external
> > references and just skip the .o completely is trying to use a hammer
> > as a scalpel. It's just not a very effective way to eliminate dead code
> > --  I pulled in only a handful of unneeded functions by switching it.  
> 
> If we do that, we may just as well get rid of $(lib-y) in the process and
> always use $(obj-y).

Sure, after we switch everybody over.


> > I mean it is a quick simple feature that probably works well enough with
> > simple build systems. But not an advanced one that builds almost
> > everything on demand and also has loadable modules and must act like a
> > shared library.
> > 
> > Real linker DCE is a valid optimisation that can't be replaced by the
> > build system of course, but we need to do it properly. Here's what I'm
> > working on.
> > 
> > It applies on top of the previous patch I sent, plus some powerpc stuff
> > I'm working on that you should be able to just ignore for another arch.
> > It's a WIP, but if you can see if it works for arm that would be cool.
> > 
> > It doesn't actually build allyesconfig after this,
> > ld: .tmp_vmlinux1: Too many sections: 220655 (>= 65280)
> > 
> > But on a more reasonable configuration (ppc64le)
> > text 

Re: powerpc allyesconfig / allmodconfig linux-next next-20160729 - next-20160729 build failures

2016-08-04 Thread Arnd Bergmann
On Thursday, August 4, 2016 9:47:13 PM CEST Nicholas Piggin wrote:
> On Thu, 04 Aug 2016 12:37:41 +0200 Arnd Bergmann  wrote:
> > On Thursday, August 4, 2016 11:00:49 AM CEST Arnd Bergmann wrote:
> > > I tried this
> > > 
> > > diff --git a/scripts/link-vmlinux.sh b/scripts/link-vmlinux.sh
> > > index b5e40ed86e60..89bca1a25916 100755
> > > --- a/scripts/link-vmlinux.sh
> > > +++ b/scripts/link-vmlinux.sh
> > > @@ -44,7 +44,7 @@ modpost_link()
> > >          local objects
> > >  
> > >          if [ -n "${CONFIG_THIN_ARCHIVES}" ]; then
> > > -                objects="--whole-archive ${KBUILD_VMLINUX_INIT} ${KBUILD_VMLINUX_MAIN} --no-whole-archive"
> > > +                objects="${KBUILD_VMLINUX_INIT} ${KBUILD_VMLINUX_MAIN}"
> > >          else
> > >                  objects="${KBUILD_VMLINUX_INIT} --start-group ${KBUILD_VMLINUX_MAIN} --end-group"
> > >          fi
> > > 
> > > but that did not seem to change anything, the extra symbols are
> > > still there. I have not tried to understand what that actually
> > > does, so maybe I misunderstood your suggestion.
> > >   
> > 
> > On a second attempt, I did the same change for vmlinux instead of the
> > module (d'oh), and got a link failure instead:
> > 
> > 
> > arch/arm/mm/proc-xscale.o: In function `cpu_xscale_do_resume':
> > (.text+0x3d4): undefined reference to `cpu_resume_mmu'
> > arch/arm/kernel/setup.o: In function `setup_arch':
> > ...
> > 
> > However, I also see a link failure in some rare configurations
> > with just your patch:
> > 
> > arch/arm/lib/lib.a(io-acorn.o): In function `outsl':
> > (.text+0x38): undefined reference to `printk'
> > 
> > The problem being a file in a library object that is not referenced,
> > but that references another symbol that is not defined
> > (CONFIG_PRINTK=n).
> 
> The first problem is the existing link system is buggy. I think an
> unconditional switch to --whole-archive (at least for modular kernels)
> should probably be done anyway. For example, on powerpc when building
> with --whole-archive, I have:
> 
> +dma_noop_alloc
> +dma_noop_free
> +dma_noop_map_page
> +dma_noop_mapping_error
> +dma_noop_map_sg
> +dma_noop_ops
> +dma_noop_supported
> +fdt_add_reservemap_entry
> +fdt_begin_node
> +fdt_create
> +fdt_create_empty_tree
> +fdt_end_node
> +fdt_errtable
> +find_cpio_data
> +ioremap_page_range
> 
> find_cpio_data is unnecessary and it's a codesize regression to link it.
> But dma_noop_ops and ioremap_page_range are exported symbols. If I
> reference dma_noop_ops from some random module with otherwise unpatched
> kernel:
> 
> ERROR: "dma_noop_ops" [drivers/char/bsr.ko] undefined!

Right, but only on s390, which is the one architecture using this.
I think we should just have a Kconfig symbol for this file that
gets selected by any architecture that needs it.

This is also what we have ended up doing for almost all other
files in lib/

> The real problem is that our linkage requirements are like a shared
> library when we build modular.
> 
> We could build a list of exports and make it link objects with those
> symbols, to solve this, but IMO that's just wasting lipstick on a pig.
> But I will propose a patch to always use --whole-archive, thin
> archives or not, and transition all archs over to it in a few release
> cycles. It just works by luck right now.
>
> Why is it a pig? Because having the linker notice no external
> references and just skip the .o completely is trying to use a hammer
> as a scalpel. It's just not a very effective way to eliminate dead code
> --  I pulled in only a handful of unneeded functions by switching it.

If we do that, we may just as well get rid of $(lib-y) in the process and
always use $(obj-y).

> I mean it is a quick simple feature that probably works well enough with
> simple build systems. But not an advanced one that builds almost
> everything on demand and also has loadable modules and must act like a
> shared library.
> 
> Real linker DCE is a valid optimisation that can't be replaced by the
> build system of course, but we need to do it properly. Here's what I'm
> working on.
> 
> It applies on top of the previous patch I sent, plus some powerpc stuff
> I'm working on that you should be able to just ignore for another arch.
> It's a WIP, but if you can see if it works for arm that would be cool.
> 
> It doesn't actually build allyesconfig after this,
> ld: .tmp_vmlinux1: Too many sections: 220655 (>= 65280)
> 
> But on a more reasonable configuration (ppc64le)
>     text      data       bss        dec   filename
> 11191672   1183536   1923820   14299028   vmlinux
> 10625528    861895   1919707   13407130   vmlinux.thin+gc
> 
> 10M-552K   1M-314K         ~   13M-870K

Nice!
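
For reference, the deltas in the last line of the quoted table can be
re-derived from the two size rows. A minimal sketch in Python, with the
figures copied from the quote; the variable and function names are made up
for this example:

```python
# Figures copied from the quoted size output (ppc64le configuration).
before = {"text": 11191672, "data": 1183536, "bss": 1923820}  # vmlinux
after = {"text": 10625528, "data": 861895, "bss": 1919707}    # vmlinux.thin+gc


def saving(section):
    """Return (bytes saved, percent saved) for one section."""
    delta = before[section] - after[section]
    return delta, 100.0 * delta / before[section]


for name in before:
    delta, pct = saving(name)
    # Integer division truncates to whole KiB, as the quoted summary does.
    print(f"{name}: {delta} bytes ({delta // 1024}K, {pct:.1f}%)")

total_saved = sum(before.values()) - sum(after.values())
print(f"total: {total_saved} bytes ({total_saved // 1024}K)")
```

The per-section sums also match the dec column in both rows, which confirms
how the fused digits in the second row split into text and data.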

> And it actually boots too, which is fairly astounding considering that
> it lost half a meg of code and 1/3 of its data. I'm not completely sure
> I've not done something wrong...

Nicolas Pitre has done some related work, adding him to Cc. IIRC we have

Re: powerpc allyesconfig / allmodconfig linux-next next-20160729 - next-20160729 build failures

2016-08-04 Thread Nicholas Piggin
On Thu, 04 Aug 2016 12:37:41 +0200
Arnd Bergmann  wrote:

> On Thursday, August 4, 2016 11:00:49 AM CEST Arnd Bergmann wrote:
> > I tried this
> > 
> > diff --git a/scripts/link-vmlinux.sh b/scripts/link-vmlinux.sh
> > index b5e40ed86e60..89bca1a25916 100755
> > --- a/scripts/link-vmlinux.sh
> > +++ b/scripts/link-vmlinux.sh
> > @@ -44,7 +44,7 @@ modpost_link()
> >          local objects
> >  
> >          if [ -n "${CONFIG_THIN_ARCHIVES}" ]; then
> > -                objects="--whole-archive ${KBUILD_VMLINUX_INIT} ${KBUILD_VMLINUX_MAIN} --no-whole-archive"
> > +                objects="${KBUILD_VMLINUX_INIT} ${KBUILD_VMLINUX_MAIN}"
> >          else
> >                  objects="${KBUILD_VMLINUX_INIT} --start-group ${KBUILD_VMLINUX_MAIN} --end-group"
> >          fi
> > 
> > but that did not seem to change anything, the extra symbols are
> > still there. I have not tried to understand what that actually
> > does, so maybe I misunderstood your suggestion.
> >   
> 
> On a second attempt, I did the same change for vmlinux instead of the
> module (d'oh), and got a link failure instead:
> 
> 
> arch/arm/mm/proc-xscale.o: In function `cpu_xscale_do_resume':
> (.text+0x3d4): undefined reference to `cpu_resume_mmu'
> arch/arm/kernel/setup.o: In function `setup_arch':
> setup.c:(.init.text+0x910): undefined reference to `init_uts_ns'
> kernel/nsproxy.o:(.data+0x4): undefined reference to `init_uts_ns'
> kernel/sched/core.o: In function `update_rq_clock':
> core.c:(.text+0x6d8): undefined reference to `paravirt_steal_rq_enabled'
> core.c:(.text+0x6dc): undefined reference to `pv_time_ops'
> kernel/sched/cputime.o: In function `account_process_tick':
> cputime.c:(.text+0x794): undefined reference to `paravirt_steal_enabled'
> cputime.c:(.text+0x7a0): undefined reference to `pv_time_ops'
> kernel/locking/lockdep.o: In function `save_trace':
> lockdep.c:(.text+0xfe8): undefined reference to `save_stack_trace'
> kernel/module.o: In function `load_module':
> module.c:(.text+0x1b54): undefined reference to `elf_check_arch'
> module.c:(.text+0x2024): undefined reference to `apply_relocate'
> kernel/debug/debug_core.o: In function `kgdb_unregister_io_module':
> debug_core.c:(.text+0x2e4): undefined reference to `kgdb_arch_exit'
> kernel/debug/debug_core.o: In function `kgdb_arch_set_breakpoint':
> debug_core.c:(.text+0x3bc): undefined reference to `arch_kgdb_ops'
> kernel/debug/debug_core.o: In function `dbg_remove_all_break':
> debug_core.c:(.text+0x6d0): undefined reference to `arch_kgdb_ops'
> ...
> 
> However, I also see a link failure in some rare configurations
> with just your patch:
> 
> arch/arm/lib/lib.a(io-acorn.o): In function `outsl':
> (.text+0x38): undefined reference to `printk'
> 
> The problem being a file in a library object that is not referenced,
> but that references another symbol that is not defined
> (CONFIG_PRINTK=n).

The first problem is the existing link system is buggy. I think an
unconditional switch to --whole-archive (at least for modular kernels)
should probably be done anyway. For example, on powerpc when building
with --whole-archive, I have:

+dma_noop_alloc
+dma_noop_free
+dma_noop_map_page
+dma_noop_mapping_error
+dma_noop_map_sg
+dma_noop_ops
+dma_noop_supported
+fdt_add_reservemap_entry
+fdt_begin_node
+fdt_create
+fdt_create_empty_tree
+fdt_end_node
+fdt_errtable
+find_cpio_data
+ioremap_page_range

find_cpio_data is unnecessary and it's a codesize regression to link it.
But dma_noop_ops and ioremap_page_range are exported symbols. If I
reference dma_noop_ops from some random module with otherwise unpatched
kernel:

ERROR: "dma_noop_ops" [drivers/char/bsr.ko] undefined!

The real problem is that our linkage requirements are like a shared
library when we build modular.

We could build a list of exports and make it link objects with those
symbols, to solve this, but IMO that's just wasting lipstick on a pig.
But I will propose a patch to always use --whole-archive, thin
archives or not, and transition all archs over to it in a few release
cycles. It just works by luck right now.

Why is it a pig? Because having the linker notice no external
references and just skip the .o completely is trying to use a hammer
as a scalpel. It's just not a very effective way to eliminate dead code
--  I pulled in only a handful of unneeded functions by switching it.

I mean it is a quick simple feature that probably works well enough with
simple build systems. But not an advanced one that builds almost
everything on demand and also has loadable modules and must act like a
shared library.
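
The two behaviours discussed above can be illustrated with a toy model of
archive member selection. This is a sketch in Python under simplifying
assumptions, not how ld is implemented: the link() function, the object
names and the symbol sets are invented for the example, loosely based on
the symbols mentioned in this thread.

```python
def link(objects, archive, whole_archive=False):
    """Toy model: return (linked object names, unresolved symbols).

    Each object is a tuple (name, defined_symbols, undefined_symbols).
    Without whole_archive, an archive member is pulled in only if it
    defines a symbol that is currently undefined.
    """
    linked = list(objects)
    pending = list(archive)
    if whole_archive:
        linked.extend(pending)
        pending = []
    progress = True
    while progress and pending:
        progress = False
        defined = {s for _, defs, _ in linked for s in defs}
        wanted = {s for _, _, undefs in linked for s in undefs} - defined
        for member in pending:
            if wanted & member[1]:
                linked.append(member)
                pending.remove(member)
                progress = True
                break  # recompute the wanted set before looking further
    defined = {s for _, defs, _ in linked for s in defs}
    unresolved = {s for _, _, undefs in linked for s in undefs} - defined
    return [name for name, _, _ in linked], sorted(unresolved)


# Illustrative inputs only; a kernel built with CONFIG_PRINTK=n has no printk.
OBJECTS = [("init/main.o", {"start_kernel"}, {"memcpy"})]
ARCHIVE = [
    ("lib/string.o", {"memcpy"}, set()),
    ("lib/dma-noop.o", {"dma_noop_ops"}, set()),      # exported, unused by vmlinux
    ("arch/arm/lib/io-acorn.o", {"outsl"}, {"printk"}),
]

print(link(OBJECTS, ARCHIVE))
print(link(OBJECTS, ARCHIVE, whole_archive=True))
```

In the first call the member defining dma_noop_ops is silently dropped, so a
module referencing it would later fail to resolve the symbol. In the second
call everything is linked, including the unreferenced member that needs
printk, which is the kind of failure reported for CONFIG_PRINTK=n.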

Real linker DCE is a valid optimisation that can't be replaced by the
build system of course, but we need to do it properly. Here's what I'm
working on.

It applies on top of the previous patch I sent, plus some powerpc stuff
I'm working on that you should be able to just ignore for another arch.
It's a WIP, but if you can see if it works for arm that would be cool.

It doesn't actually 

Re: powerpc allyesconfig / allmodconfig linux-next next-20160729 - next-20160729 build failures

2016-08-04 Thread Arnd Bergmann
On Thursday, August 4, 2016 11:00:49 AM CEST Arnd Bergmann wrote:
> I tried this
> 
> diff --git a/scripts/link-vmlinux.sh b/scripts/link-vmlinux.sh
> index b5e40ed86e60..89bca1a25916 100755
> --- a/scripts/link-vmlinux.sh
> +++ b/scripts/link-vmlinux.sh
> @@ -44,7 +44,7 @@ modpost_link()
>      local objects
>  
>      if [ -n "${CONFIG_THIN_ARCHIVES}" ]; then
> -        objects="--whole-archive ${KBUILD_VMLINUX_INIT} ${KBUILD_VMLINUX_MAIN} --no-whole-archive"
> +        objects="${KBUILD_VMLINUX_INIT} ${KBUILD_VMLINUX_MAIN}"
>      else
>          objects="${KBUILD_VMLINUX_INIT} --start-group ${KBUILD_VMLINUX_MAIN} --end-group"
>      fi
> 
> but that did not seem to change anything, the extra symbols are
> still there. I have not tried to understand what that actually
> does, so maybe I misunderstood your suggestion.
> 

On a second attempt, I did the same change for vmlinux instead of the
module (d'oh), and got a link failure instead:


arch/arm/mm/proc-xscale.o: In function `cpu_xscale_do_resume':
(.text+0x3d4): undefined reference to `cpu_resume_mmu'
arch/arm/kernel/setup.o: In function `setup_arch':
setup.c:(.init.text+0x910): undefined reference to `init_uts_ns'
kernel/nsproxy.o:(.data+0x4): undefined reference to `init_uts_ns'
kernel/sched/core.o: In function `update_rq_clock':
core.c:(.text+0x6d8): undefined reference to `paravirt_steal_rq_enabled'
core.c:(.text+0x6dc): undefined reference to `pv_time_ops'
kernel/sched/cputime.o: In function `account_process_tick':
cputime.c:(.text+0x794): undefined reference to `paravirt_steal_enabled'
cputime.c:(.text+0x7a0): undefined reference to `pv_time_ops'
kernel/locking/lockdep.o: In function `save_trace':
lockdep.c:(.text+0xfe8): undefined reference to `save_stack_trace'
kernel/module.o: In function `load_module':
module.c:(.text+0x1b54): undefined reference to `elf_check_arch'
module.c:(.text+0x2024): undefined reference to `apply_relocate'
kernel/debug/debug_core.o: In function `kgdb_unregister_io_module':
debug_core.c:(.text+0x2e4): undefined reference to `kgdb_arch_exit'
kernel/debug/debug_core.o: In function `kgdb_arch_set_breakpoint':
debug_core.c:(.text+0x3bc): undefined reference to `arch_kgdb_ops'
kernel/debug/debug_core.o: In function `dbg_remove_all_break':
debug_core.c:(.text+0x6d0): undefined reference to `arch_kgdb_ops'
...

However, I also see a link failure in some rare configurations
with just your patch:

arch/arm/lib/lib.a(io-acorn.o): In function `outsl':
(.text+0x38): undefined reference to `printk'

The problem being a file in a library object that is not referenced,
but that references another symbol that is not defined
(CONFIG_PRINTK=n).
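
As a toy illustration of the rule involved (a member is normally pulled
out of an archive only when it defines a symbol that is still undefined,
while --whole-archive pulls every member), here is a small model in shell
and awk. The link_model helper, its input format and the memcpy.o member
are made up for this sketch, and it is not how ld is actually implemented:

```shell
# Toy model of archive member selection; not a real linker.
# $1: "lazy" (plain archive rule) or "whole" (--whole-archive)
# $2: comma-separated symbols that already-linked objects still need
# stdin: one member per line: NAME DEFINED UNDEFINED (comma lists, "-" = none)
link_model() {
    awk -v mode="$1" -v need="$2" '
    # Record a symbol as needed, remembering first-seen order for output.
    function want(sym) {
        if (sym != "-" && sym != "" && !(sym in wanted)) {
            wanted[sym] = 1
            order[++count] = sym
        }
    }
    { name[NR] = $1; defs[NR] = $2; undefs[NR] = $3 }
    END {
        n = split(need, parts, ",")
        for (i = 1; i <= n; i++) want(parts[i])
        # Rescan the members until a full pass pulls in nothing new.
        do {
            changed = 0
            for (m = 1; m <= NR; m++) {
                if (m in pulled) continue
                take = (mode == "whole")
                nd = split(defs[m], d, ",")
                for (i = 1; i <= nd; i++)
                    if ((d[i] in wanted) && !(d[i] in defined)) take = 1
                if (!take) continue
                pulled[m] = 1
                changed = 1
                for (i = 1; i <= nd; i++) defined[d[i]] = 1
                nu = split(undefs[m], u, ",")
                for (i = 1; i <= nu; i++) want(u[i])
            }
        } while (changed)
        out = "pulled:"
        for (m = 1; m <= NR; m++) if (m in pulled) out = out " " name[m]
        print out
        out = "unresolved:"
        for (i = 1; i <= count; i++)
            if (!(order[i] in defined)) out = out " " order[i]
        print out
    }'
}

# io-acorn.o is never referenced, but it needs printk, which nothing
# defines in this model (as with CONFIG_PRINTK=n).
members='io-acorn.o outsl printk
memcpy.o memcpy -'
printf '%s\n' "$members" | link_model lazy memcpy
printf '%s\n' "$members" | link_model whole memcpy
```

In the lazy case only memcpy.o is pulled and nothing is left unresolved;
in the whole case io-acorn.o comes along too and printk is reported as
unresolved, which is the shape of the failure above.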

Arnd


Re: powerpc allyesconfig / allmodconfig linux-next next-20160729 - next-20160729 build failures

2016-08-04 Thread Arnd Bergmann
On Thursday, August 4, 2016 10:10:51 AM CEST Stephen Rothwell wrote:
> Hi Arnd,
> 
> On Wed, 03 Aug 2016 20:52:48 +0200 Arnd Bergmann  wrote:
> >
> > Most of the difference appears to be in branch trampolines (634 added,
> > 559 removed, 14837 unchanged) as you suspect, but I also see a couple
> > of symbols show up in vmlinux that were not there before:
> > 
> > -A __crc_dma_noop_ops
> > -D dma_noop_ops
> > -R __clz_tab
> > -r fdt_errtable
> > -r __kcrctab_dma_noop_ops
> > -r __kstrtab_dma_noop_ops
> > -R __ksymtab_dma_noop_ops
> > -t dma_noop_alloc
> > -t dma_noop_free
> > -t dma_noop_map_page
> > -t dma_noop_mapping_error
> > -t dma_noop_map_sg
> > -t dma_noop_supported
> > -T fdt_add_reservemap_entry
> > -T fdt_begin_node
> > -T fdt_create
> > -T fdt_create_empty_tree
> > -T fdt_end_node
> > -T fdt_finish
> > -T fdt_finish_reservemap
> > -T fdt_property
> > -T fdt_resize
> > -T fdt_strerror
> > -T find_cpio_data
> > 
> > From my first look, it seems that all of lib/*.o is now getting linked
> > into vmlinux, while we traditionally leave out everything from lib/
> > that is not referenced.
> 
> You could try removing the --{,no-}whole-archive arguments to ld in
> scripts/link-vmlinux.sh.  Last time I did that, though, a whole lot of
> stuff failed to be linked in (especially stuff only referenced by
> EXPORT_SYMBOL()s, but that may have been fixed).

I tried this

diff --git a/scripts/link-vmlinux.sh b/scripts/link-vmlinux.sh
index b5e40ed86e60..89bca1a25916 100755
--- a/scripts/link-vmlinux.sh
+++ b/scripts/link-vmlinux.sh
@@ -44,7 +44,7 @@ modpost_link()
     local objects
 
     if [ -n "${CONFIG_THIN_ARCHIVES}" ]; then
-        objects="--whole-archive ${KBUILD_VMLINUX_INIT} ${KBUILD_VMLINUX_MAIN} --no-whole-archive"
+        objects="${KBUILD_VMLINUX_INIT} ${KBUILD_VMLINUX_MAIN}"
     else
         objects="${KBUILD_VMLINUX_INIT} --start-group ${KBUILD_VMLINUX_MAIN} --end-group"
     fi

but that did not seem to change anything, the extra symbols are
still there. I have not tried to understand what that actually
does, so maybe I misunderstood your suggestion.

> > I also see a noticeable overhead in link time, the numbers are for
> > a cache-hot rebuild after a successful allyesconfig build, using a
> > 24-way Opteron@2.5Ghz, just relinking vmlinux:
> 
> I was afraid of that, but it is offset by the time saved by not doing
> the "ld -r"s along the way?  It may also be that (for powerpc anyway)
> the linker is doing a better job.

At least on a big SMP system, it doesn't seem to make much difference,
as the "ld -r" steps are easily parallelized:

$ find build/ -name built-in.o | xargs rm ; time make -skj30 vmlinux
real    2m12.092s
user    3m52.932s
sys     0m51.248s

$ time make -skj30 vmlinux
real    2m12.162s
user    3m44.788s
sys     0m47.788s

I tried this twice with identical results: "user" time increases
by eight seconds today when we have to rebuild all "built-in.o"
files rather than just relinking vmlinux, but elapsed time
is unchanged.

After your patch that difference becomes smaller (three seconds
in one run, could be within the noise), but we still have the
extra two minutes for the total build time:

$ find build/ -name built-in.o | xargs rm ; time make -skj30 vmlinux
real    4m20.717s
user    5m47.556s
sys     0m54.128s

$ time make -skj30 vmlinux
real    4m18.835s
user    5m44.552s
sys     0m53.152s
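
The deltas quoted above can be re-derived from the time output with a
small helper. to_seconds and delta are names invented for this sketch;
only the timings themselves come from the runs above:

```shell
# Convert a bash `time` value such as 3m52.932s into seconds.
to_seconds() {
    echo "$1" | LC_ALL=C awk -F'[ms]' '{ printf "%.3f\n", $1 * 60 + $2 }'
}

# Print the difference in seconds between two such values.
delta() {
    LC_ALL=C awk -v a="$(to_seconds "$1")" -v b="$(to_seconds "$2")" \
        'BEGIN { printf "%.3f\n", a - b }'
}

delta 3m52.932s 3m44.788s   # user time, before the patch: 8.144
delta 5m47.556s 5m44.552s   # user time, with the patch:   3.004
delta 4m20.717s 2m12.092s   # elapsed, patched vs. before: 128.625
```

That matches the "eight seconds" and "three seconds" figures, and puts
the extra elapsed time for the full rebuild at a little over two minutes.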

FWIW, here is a sample build output I get on an allyesconfig build,
with timestamps added:

$ time make W= -kj30 vmlinux
make[1]: Entering directory '/git/arm-soc'
make[2]: Entering directory '/git/arm-soc/build/tmp'
10:46:12   CHK     include/config/kernel.release
10:46:13   GEN     ./Makefile
10:46:13   CHK     include/generated/uapi/linux/version.h
  Using /git/arm-soc as source for kernel
10:46:13   CHK     include/generated/utsrelease.h
10:46:13   CHK     include/generated/timeconst.h
10:46:13   CHK     include/generated/bounds.h
10:46:13   CHK     include/generated/asm-offsets.h
10:46:13   CALL    /git/arm-soc/scripts/checksyscalls.sh
10:46:14   CHK     include/generated/compile.h
10:46:18   CHK     kernel/config_data.h
10:46:20   CC      drivers/misc/lkdtm_rodata.o
10:46:20   OBJCOPY drivers/misc/lkdtm_rodata_objcopy.o
10:46:20   LD      drivers/misc/lkdtm.o
10:46:20   LD      drivers/misc/built-in.o
10:46:20   DTC     drivers/gpu/drm/tilcdc/tilcdc_slave_compat.dtb
10:46:20   DTB     drivers/gpu/drm/tilcdc/tilcdc_slave_compat.dtb.S
10:46:20   AS      drivers/gpu/drm/tilcdc/tilcdc_slave_compat.dtb.o
10:46:20   LD      drivers/gpu/drm/tilcdc/built-in.o
rm drivers/gpu/drm/tilcdc/tilcdc_slave_compat.dtb.S drivers/gpu/drm/tilcdc/tilcdc_slave_compat.dtb
10:46:33   LD      drivers/gpu/drm/built-in.o
10:46:33   LD      drivers/gpu/built-in.o
10:46:36   CHK     include/generated/uapi/linux/version.h
10:46:36   LINK    vmlinux
10:46:37   LD      vmlinux.o
10:47:14   MODPOST vmlinux.o
10:47:16   GEN     .version

Re: powerpc allyesconfig / allmodconfig linux-next next-20160729 - next-20160729 build failures

2016-08-03 Thread Stephen Rothwell
Hi Arnd,

On Wed, 03 Aug 2016 20:52:48 +0200 Arnd Bergmann  wrote:
>
> Most of the difference appears to be in branch trampolines (634 added,
> 559 removed, 14837 unchanged) as you suspect, but I also see a couple
> of symbols show up in vmlinux that were not there before:
> 
> -A __crc_dma_noop_ops
> -D dma_noop_ops
> -R __clz_tab
> -r fdt_errtable
> -r __kcrctab_dma_noop_ops
> -r __kstrtab_dma_noop_ops
> -R __ksymtab_dma_noop_ops
> -t dma_noop_alloc
> -t dma_noop_free
> -t dma_noop_map_page
> -t dma_noop_mapping_error
> -t dma_noop_map_sg
> -t dma_noop_supported
> -T fdt_add_reservemap_entry
> -T fdt_begin_node
> -T fdt_create
> -T fdt_create_empty_tree
> -T fdt_end_node
> -T fdt_finish
> -T fdt_finish_reservemap
> -T fdt_property
> -T fdt_resize
> -T fdt_strerror
> -T find_cpio_data
> 
> From my first look, it seems that all of lib/*.o is now getting linked
> into vmlinux, while we traditionally leave out everything from lib/
> that is not referenced.

You could try removing the --{,no-}whole-archive arguments to ld in
scripts/link-vmlinux.sh.  Last time I did that, though, a whole lot of
stuff failed to be linked in (especially stuff only referenced by
EXPORT_SYMBOL()s, but that may have been fixed).

> I also see a noticeable overhead in link time, the numbers are for
> a cache-hot rebuild after a successful allyesconfig build, using a
> 24-way Opteron@2.5Ghz, just relinking vmlinux:

I was afraid of that, but it is offset by the time saved by not doing
the "ld -r"s along the way?  It may also be that (for powerpc anyway)
the linker is doing a better job.

-- 
Cheers,
Stephen Rothwell


Re: powerpc allyesconfig / allmodconfig linux-next next-20160729 - next-20160729 build failures

2016-08-03 Thread Segher Boessenkool
Hi Arnd,

On Wed, Aug 03, 2016 at 08:52:48PM +0200, Arnd Bergmann wrote:
> From my first look, it seems that all of lib/*.o is now getting linked
> into vmlinux, while we traditionally leave out everything from lib/
> that is not referenced.
> 
> I also see a noticeable overhead in link time, the numbers are for
> a cache-hot rebuild after a successful allyesconfig build, using a
> 24-way Opteron@2.5Ghz, just relinking vmlinux:
> 
> $ time make -skj30 vmlinux # before
> real  2m8.092s
> user  3m41.008s
> sys   0m48.172s
> 
> $ time make -skj30 vmlinux # after
> real  4m10.189s
> user  5m43.804s
> sys   0m52.988s

Is it better when using rcT instead of rcsT?
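
For reference, the two letters that differ: with GNU ar, 's' asks for a
symbol index to be written into the archive, and 'T' makes the archive
thin, so that it records the paths of its members instead of copies of
them. A minimal sketch, assuming GNU ar is installed; the thin_magic
helper and the placeholder members are made up for the example:

```shell
# Create a thin archive from two placeholder members in a scratch
# directory and print the first 7 bytes of the archive file.
# Assumes GNU ar; other ar implementations give "T" a different meaning.
thin_magic() {
    (
        dir=$(mktemp -d) || exit 1
        cd "$dir" || exit 1
        printf 'a\n' > a.o
        printf 'b\n' > b.o
        # r: insert members, c: create quietly, T: thin, no "s" (no index)
        ar rcT built-in.o a.o b.o || exit 1
        head -c 7 built-in.o
        echo
        cd / && rm -rf "$dir"
    )
}

# Only run the demo when an ar binary is available.
if command -v ar >/dev/null 2>&1; then
    thin_magic
fi
```

With GNU ar this should print !<thin>, where a regular archive starts
with !<arch>.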


Segher


Re: powerpc allyesconfig / allmodconfig linux-next next-20160729 - next-20160729 build failures

2016-08-03 Thread Arnd Bergmann
On Wednesday, August 3, 2016 2:44:29 PM CEST Segher Boessenkool wrote:
> Hi Arnd,
> 
> On Wed, Aug 03, 2016 at 08:52:48PM +0200, Arnd Bergmann wrote:
> > From my first look, it seems that all of lib/*.o is now getting linked
> > into vmlinux, while we traditionally leave out everything from lib/
> > that is not referenced.
> > 
> > I also see a noticeable overhead in link time, the numbers are for
> > a cache-hot rebuild after a successful allyesconfig build, using a
> > 24-way Opteron@2.5Ghz, just relinking vmlinux:
> > 
> > $ time make -skj30 vmlinux # before
> > real    2m8.092s
> > user    3m41.008s
> > sys     0m48.172s
> > 
> > $ time make -skj30 vmlinux # after
> > real    4m10.189s
> > user    5m43.804s
> > sys     0m52.988s
> 
> Is it better when using rcT instead of rcsT?

It seems to be noticeably better for the clean rebuild case, though
not as good as the original:

real    3m34.015s
user    5m7.104s
sys     0m49.172s

I've also tried now with my own patch applied as well (linking
each drivers/*/built-in.o into vmlinux rather than having them
linked into drivers/built-in.o first), but that makes no
difference.

Arnd


Re: powerpc allyesconfig / allmodconfig linux-next next-20160729 - next-20160729 build failures

2016-08-03 Thread Arnd Bergmann
On Thursday, August 4, 2016 1:37:29 AM CEST Nicholas Piggin wrote:
> 
> I've attached what I'm using, which builds and runs for me without
> any work. Your arch obviously has to select the option to use it.
> 
> text  data bss  dec   hex filename
> 11196784  1185024  1923820  14305628  da495c  vmlinuxppc64.before
> 11187536  1181848  1923176  14292560  da1650  vmlinuxppc64.after
> 
> ~9K text saving, ~3K data saving. I assume this comes from fewer
> branch trampolines and toc entries, but haven't verified exactly.

The patch seems to work great, but for me it's getting bigger
(compared to my older patch, mainline allyesconfig doesn't build):

    text      data       bss        dec      hex  filename
51299868  42599559  23362148  117261575  6fd4507  vmlinuxarm.before
51302545  42595015  23361884  117259444  6fd3cb4  vmlinuxarm.after
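
Reading the columns of that size output as text, data, bss and dec, the
per-column change works out as below. size_delta is a helper name made up
here; the figures are the ones from the two rows above:

```shell
# Print per-column differences (after minus before) for two `size` rows.
# $1..$4: text data bss dec (before); $5..$8: the same columns (after)
size_delta() {
    echo "text $(( $5 - $1 )) data $(( $6 - $2 )) bss $(( $7 - $3 )) dec $(( $8 - $4 ))"
}

size_delta 51299868 42599559 23362148 117261575 \
           51302545 42595015 23361884 117259444
# text 2677 data -4544 bss -264 dec -2131
```

So text grows by about 2.6K while data and bss shrink, and the total ends
up slightly smaller.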

Most of the difference appears to be in branch trampolines (634 added,
559 removed, 14837 unchanged) as you suspect, but I also see a couple
of symbols show up in vmlinux that were not there before:

-A __crc_dma_noop_ops
-D dma_noop_ops
-R __clz_tab
-r fdt_errtable
-r __kcrctab_dma_noop_ops
-r __kstrtab_dma_noop_ops
-R __ksymtab_dma_noop_ops
-t dma_noop_alloc
-t dma_noop_free
-t dma_noop_map_page
-t dma_noop_mapping_error
-t dma_noop_map_sg
-t dma_noop_supported
-T fdt_add_reservemap_entry
-T fdt_begin_node
-T fdt_create
-T fdt_create_empty_tree
-T fdt_end_node
-T fdt_finish
-T fdt_finish_reservemap
-T fdt_property
-T fdt_resize
-T fdt_strerror
-T find_cpio_data

From my first look, it seems that all of lib/*.o is now getting linked
into vmlinux, while we traditionally leave out everything from lib/
that is not referenced.
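
A list like the one above can be produced by comparing nm output from the
two images with the addresses stripped, since nearly every address moves
between builds. A rough sketch; symdiff and the .nm file names are
hypothetical, and the inputs are expected to be saved nm dumps:

```shell
# Print symbols (type and name) found in only one of two saved `nm` dumps:
# "-" marks lines only in the first dump, "+" lines only in the second.
symdiff() {
    old=$(mktemp) || return 1
    new=$(mktemp) || return 1
    # Keep the last two fields (type, name) so that addresses are ignored.
    LC_ALL=C awk 'NF >= 2 { print $(NF-1), $NF }' "$1" | LC_ALL=C sort -u > "$old"
    LC_ALL=C awk 'NF >= 2 { print $(NF-1), $NF }' "$2" | LC_ALL=C sort -u > "$new"
    LC_ALL=C comm -23 "$old" "$new" | sed 's/^/-/'
    LC_ALL=C comm -13 "$old" "$new" | sed 's/^/+/'
    rm -f "$old" "$new"
}

# Usage (file names are placeholders):
#   nm vmlinuxarm.before > before.nm
#   nm vmlinuxarm.after  > after.nm
#   symdiff before.nm after.nm
```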

I also see a noticeable overhead in link time, the numbers are for
a cache-hot rebuild after a successful allyesconfig build, using a
24-way Opteron@2.5Ghz, just relinking vmlinux:

$ time make -skj30 vmlinux # before
real    2m8.092s
user    3m41.008s
sys     0m48.172s

$ time make -skj30 vmlinux # after
real    4m10.189s
user    5m43.804s
sys     0m52.988s

That is clearly a very sharp difference. Fortunately for the defconfig
build, the times are much lower, and I see no real difference other
than the noise between subsequent runs:

$ time make -skj30 vmlinux # before
real    0m5.415s
user    0m19.716s
sys     0m9.356s

$ time make -skj30 vmlinux # before
real    0m9.536s
user    0m21.320s
sys     0m9.224s

$ time make -skj30 vmlinux # after
real    0m5.539s
user    0m20.360s
sys     0m9.224s

$ time make -skj30 vmlinux # after
real    0m9.138s
user    0m21.932s
sys     0m8.988s

$ time make -skj30 vmlinux # after
real    0m5.659s
user    0m20.332s
sys     0m9.620s

Arnd


Re: powerpc allyesconfig / allmodconfig linux-next next-20160729 - next-20160729 build failures

2016-08-03 Thread Nicholas Piggin
On Wed, 03 Aug 2016 14:29:13 +0200
Arnd Bergmann  wrote:

> On Wednesday, August 3, 2016 10:19:11 PM CEST Stephen Rothwell wrote:
> > Hi Arnd,
> > 
> > On Wed, 03 Aug 2016 09:52:23 +0200 Arnd Bergmann  wrote:  
> > >
> > > Using a different way to link the kernel would also help us with
> > > the remaining allyesconfig problem on ARM, as the problem is only in
> > > 'ld -r' not producing trampolines for symbols that later cannot get
> > > them any more. It would probably also help building with ld.gold,
> > > which is currently not working.
> > > 
> > > What is your suggested alternative?  
> > 
> > I have a patch that makes the built-in.o files into thin archives (same
> > as archives, but the actual objects are replaced with the name of the
> > original object file).  That way the final link has all the original
> > objects.  I haven't checked to see what the overheads of doing it this
> > way are.
> > 
> > Nick Piggin has just today taken my old patch (it was last rebased to
> > v4.4-rc1) and tried it on a recent kernel and it still seems to mostly
> > work.  It probably needs some tidying up, but you are welcome to test
> > it if you want to.  
> 
> Sure, I'll certainly give it a try on ARM when you send me a copy.

I've attached what I'm using, which builds and runs for me without
any work. Your arch obviously has to select the option to use it.

    text     data      bss       dec     hex  filename
11196784  1185024  1923820  14305628  da495c  vmlinuxppc64.before
11187536  1181848  1923176  14292560  da1650  vmlinuxppc64.after

~9K text saving, ~3K data saving. I assume this comes from fewer
branch trampolines and toc entries, but haven't verified exactly.
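
The savings can be recomputed directly from the size table above. This is a minimal Python sketch; the dictionary layout and the savings helper are just one convenient way to hold the numbers.

```python
# Section sizes in bytes, copied from the size output above.
sizes = {
    "before": {"text": 11196784, "data": 1185024, "bss": 1923820},
    "after": {"text": 11187536, "data": 1181848, "bss": 1923176},
}


def savings(before, after):
    """Return bytes saved per section (positive means 'after' is smaller)."""
    return {section: before[section] - after[section] for section in before}


saved = savings(sizes["before"], sizes["after"])
for section, nbytes in saved.items():
    print(f"{section}: {nbytes} bytes saved")
print(f"total: {sum(saved.values())} bytes saved")
```

The per-section differences add up to the difference between the two dec columns.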



commit 8bc3ca4798c215e9a9107b6d44408f0af259f84f
Author: Stephen Rothwell 
Date:   Tue Oct 30 12:14:18 2012 +1100

kbuild: allow architectures to use thin archives instead of ld -r

Alan Modra has been trying to convince the kernel developers that ld -r
is "evil" for many years.  This is an alternative and means that the
linker has much more information available to it when it links the
kernel.

Signed-off-by: Stephen Rothwell 

diff --git a/arch/Kconfig b/arch/Kconfig
index d794384..1330bf4 100644
--- a/arch/Kconfig
+++ b/arch/Kconfig
@@ -424,6 +424,12 @@ config CC_STACKPROTECTOR_STRONG
 
 endchoice
 
+config THIN_ARCHIVES
+   bool
+   help
+ Select this if the architecture wants to use thin archives
+ instead of ld -r to create the built-in.o files.
+
 config HAVE_CONTEXT_TRACKING
bool
help
diff --git a/scripts/Makefile.build b/scripts/Makefile.build
index 0d1ca5b..bbf60b3 100644
--- a/scripts/Makefile.build
+++ b/scripts/Makefile.build
@@ -358,10 +358,15 @@ $(sort $(subdir-obj-y)): $(subdir-ym) ;
 # Rule to compile a set of .o files into one .o file
 #
 ifdef builtin-target
+ifdef CONFIG_THIN_ARCHIVES
+  cmd_make_builtin = rm -f $@; $(AR) rcsT$(KBUILD_ARFLAGS)
+else
+  cmd_make_builtin = $(LD) $(ld_flags) -r -o
+endif
 quiet_cmd_link_o_target = LD  $@
 # If the list of objects to link is empty, just create an empty built-in.o
 cmd_link_o_target = $(if $(strip $(obj-y)),\
- $(LD) $(ld_flags) -r -o $@ $(filter $(obj-y), $^) \
+ $(cmd_make_builtin) $@ $(filter $(obj-y), $^) \
  $(cmd_secanalysis),\
  rm -f $@; $(AR) rcs$(KBUILD_ARFLAGS) $@)
 
diff --git a/scripts/link-vmlinux.sh b/scripts/link-vmlinux.sh
index f0f6d9d..ef4658f 100755
--- a/scripts/link-vmlinux.sh
+++ b/scripts/link-vmlinux.sh
@@ -41,8 +41,14 @@ info()
 # ${1} output file
 modpost_link()
 {
-   ${LD} ${LDFLAGS} -r -o ${1} ${KBUILD_VMLINUX_INIT}   \
-   --start-group ${KBUILD_VMLINUX_MAIN} --end-group
+   local objects
+
+   if [ -n "${CONFIG_THIN_ARCHIVES}" ]; then
+       objects="--whole-archive ${KBUILD_VMLINUX_INIT} ${KBUILD_VMLINUX_MAIN} --no-whole-archive"
+   else
+       objects="${KBUILD_VMLINUX_INIT} --start-group ${KBUILD_VMLINUX_MAIN} --end-group"
+   fi
+   ${LD} ${LDFLAGS} -r -o ${1} ${objects}
 }
 
 # Link of vmlinux
@@ -51,11 +57,16 @@ modpost_link()
 vmlinux_link()
 {
local lds="${objtree}/${KBUILD_LDS}"
+   local objects
 
if [ "${SRCARCH}" != "um" ]; then
+   if [ -n "${CONFIG_THIN_ARCHIVES}" ]; then
+       objects="--whole-archive ${KBUILD_VMLINUX_INIT} ${KBUILD_VMLINUX_MAIN} --no-whole-archive"
+   else
+       objects="${KBUILD_VMLINUX_INIT} --start-group ${KBUILD_VMLINUX_MAIN} --end-group"
+   fi
${LD} ${LDFLAGS} ${LDFLAGS_vmlinux} -o ${2}  \
-   -T ${lds} ${KBUILD_VMLINUX_INIT} \
-   --start-group ${KBUILD_VMLINUX_MAIN} --end-group ${1}
+   -T ${lds} ${objects} ${1}
else
${CC} ${CFLAGS_vmlinux} -o 

Re: powerpc allyesconfig / allmodconfig linux-next next-20160729 - next-20160729 build failures

2016-08-03 Thread Arnd Bergmann
On Wednesday, August 3, 2016 10:19:11 PM CEST Stephen Rothwell wrote:
> Hi Arnd,
> 
> On Wed, 03 Aug 2016 09:52:23 +0200 Arnd Bergmann  wrote:
> >
> > Using a different way to link the kernel would also help us with
> > the remaining allyesconfig problem on ARM, as the problem is only in
> > 'ld -r' not producing trampolines for symbols that later cannot get
> > them any more. It would probably also help building with ld.gold,
> > which is currently not working.
> > 
> > What is your suggested alternative?
> 
> I have a patch that makes the built-in.o files into thin archives (same
> as archives, but the actual objects are replaced with the name of the
> original object file).  That way the final link has all the original
> objects.  I haven't checked to see what the overheads of doing it this
> way are.
> 
> Nick Piggin has just today taken my old patch (it was last rebased to
> v4.4-rc1) and tried it on a recent kernel and it still seems to mostly
> work.  It probably needs some tidying up, but you are welcome to test
> it if you want to.

Sure, I'll certainly give it a try on ARM when you send me a copy.

Arnd


Re: powerpc allyesconfig / allmodconfig linux-next next-20160729 - next-20160729 build failures

2016-08-03 Thread Stephen Rothwell
Hi Arnd,

On Wed, 03 Aug 2016 09:52:23 +0200 Arnd Bergmann  wrote:
>
> Using a different way to link the kernel would also help us with
> the remaining allyesconfig problem on ARM, as the problem is only in
> 'ld -r' not producing trampolines for symbols that later cannot get
> them any more. It would probably also help building with ld.gold,
> which is currently not working.
> 
> What is your suggested alternative?

I have a patch that makes the built-in.o files into thin archives (same
as archives, but the actual objects are replaced with the name of the
original object file).  That way the final link has all the original
objects.  I haven't checked to see what the overheads of doing it this
way are.
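
For anyone wanting to check which kind of built-in.o a tree has produced, the two archive flavours can be told apart by their first eight bytes: a regular ar archive starts with "!<arch>\n" and a GNU thin archive with "!<thin>\n". Below is a minimal Python sketch of that check. The function names are made up for this illustration, and it looks at the magic only, not at the member table.

```python
REGULAR_MAGIC = b"!<arch>\n"  # ordinary ar archive, members stored inline
THIN_MAGIC = b"!<thin>\n"     # GNU thin archive, members referenced by path


def archive_kind(header):
    """Classify a file from its leading bytes."""
    if header.startswith(THIN_MAGIC):
        return "thin"
    if header.startswith(REGULAR_MAGIC):
        return "regular"
    return "not an archive"


def archive_kind_of_file(path):
    """Read the first eight bytes of `path` and classify it."""
    with open(path, "rb") as handle:
        return archive_kind(handle.read(8))


print(archive_kind(b"!<thin>\n//              "))
print(archive_kind(b"\x7fELF\x02\x01\x01\x00"))
```

An object produced by ld -r is an ordinary ELF file, so it falls into the last category.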

Nick Piggin has just today taken my old patch (it was last rebased to
v4.4-rc1) and tried it on a recent kernel and it still seems to mostly
work.  It probably needs some tidying up, but you are welcome to test
it if you want to.

-- 
Cheers,
Stephen Rothwell


Re: powerpc allyesconfig / allmodconfig linux-next next-20160729 - next-20160729 build failures

2016-08-03 Thread Arnd Bergmann
On Wednesday, August 3, 2016 10:23:24 AM CEST Stephen Rothwell wrote:
> Hi Luis,
> 
> On Wed, 3 Aug 2016 00:02:43 +0200 "Luis R. Rodriguez" wrote:
> >
> > Thanks for the confirmation. How long has this been known to be broken?
> > Does anyone care about fixing these, or is this best effort?
> 
> This has been broken for many years 
> 
> I have a couple of times almost fixed it, but it requires that we
> change from using "ld -r" to build the built-in.o objects and some
> changes to the powerpc head.S code ... I will give it another shot now
> that the merge window is almost over (and linux-next goes into its
> quieter time).

Using a different way to link the kernel would also help us with
the remaining allyesconfig problem on ARM, as the problem is only in
'ld -r' not producing trampolines for symbols that later cannot get
them any more. It would probably also help building with ld.gold,
which is currently not working.

What is your suggested alternative?

Arnd


Re: powerpc allyesconfig / allmodconfig linux-next next-20160729 - next-20160729 build failures

2016-08-02 Thread Michael Ellerman
"Luis R. Rodriguez"  writes:

> Are linux-next builds being tested for powerpc with allyesconfig and
> allmodconfig ?

Yes, every single version:

  http://kisskb.ellerman.id.au/kisskb/target/2659/

> I have some changes I'm making and while debugging my
> build issues I decided to give a clean build a shot and see linux-next
> next-20160729 up to next-20160729 all have build failures without my
> changes. I get:
>
> /opt/gcc-4.9.0-nolibc/powerpc64-linux/bin/powerpc64-linux-ld:
> drivers/built-in.o: .opd is not a regular array of opd entries
>   MODPOST vmlinux.o
>   GEN .version
>   CHK include/generated/compile.h
>   UPD include/generated/compile.h
>   CC  init/version.o
>   LD  init/built-in.o
> /opt/gcc-4.9.0-nolibc/powerpc64-linux/bin/powerpc64-linux-ld:
> drivers/built-in.o: .opd is not a regular array of opd entries
> drivers/built-in.o: In function `.ipw2100_up':
> ipw2100.c:(.text+0x1ff9c90): relocation truncated to fit:
> R_PPC64_REL24 (stub) against symbol `.round_jiffies_relative' defined
> in .text section in kernel/built-in.o

And yes this is a known problem, there have been attempts to fix it, but
none that quite got working.

In fact it's our bug #1 :)

  https://github.com/linuxppc/linux/issues/1


Please use allmodconfig, which should build in general. Or one of our
other defconfigs, eg. ppc64/ppc64le defconfig.

cheers


Re: powerpc allyesconfig / allmodconfig linux-next next-20160729 - next-20160729 build failures

2016-08-02 Thread Stephen Rothwell
Hi Luis,

On Wed, 3 Aug 2016 00:02:43 +0200 "Luis R. Rodriguez"  wrote:
>
> Thanks for the confirmation. How long has this been known to be broken?
> Does anyone care about fixing these, or is this best effort?

This has been broken for many years :-(

I have a couple of times almost fixed it, but it requires that we
change from using "ld -r" to build the built-in.o objects and some
changes to the powerpc head.S code ... I will give it another shot now
that the merge window is almost over (and linux-next goes into its
quieter time).

-- 
Cheers,
Stephen Rothwell


Re: powerpc allyesconfig / allmodconfig linux-next next-20160729 - next-20160729 build failures

2016-08-02 Thread Arnd Bergmann
On Wednesday, August 3, 2016 12:02:43 AM CEST Luis R. Rodriguez wrote:
> On Tue, Aug 02, 2016 at 02:58:39PM -0700, Guenter Roeck wrote:
> > On Tue, Aug 02, 2016 at 01:07:09PM -0700, Luis R. Rodriguez wrote:
> > > Are linux-next builds being tested for powerpc with allyesconfig and
> > > allmodconfig ? I have some changes I'm making and while debugging my
> > > build issues I decided to give a clean build a shot and see linux-next
> > > next-20160729 up to next-20160729 all have build failures without my
> > > changes. I get:
> > > 
> > > /opt/gcc-4.9.0-nolibc/powerpc64-linux/bin/powerpc64-linux-ld:
> > > drivers/built-in.o: .opd is not a regular array of opd entries
> > >   MODPOST vmlinux.o
> > >   GEN .version
> > >   CHK include/generated/compile.h
> > >   UPD include/generated/compile.h
> > >   CC  init/version.o
> > >   LD  init/built-in.o
> > > /opt/gcc-4.9.0-nolibc/powerpc64-linux/bin/powerpc64-linux-ld:
> > > drivers/built-in.o: .opd is not a regular array of opd entries
> > > drivers/built-in.o: In function `.ipw2100_up':
> > > ipw2100.c:(.text+0x1ff9c90): relocation truncated to fit:
> > 
> > "relocation truncated to fit"  errors are typical for ppc:allyesconfig.
> 
> > Thanks for the confirmation. How long has this been known to be broken?
> > Does anyone care about fixing these, or is this best effort?

We used to have the same thing on ARM, but it's (mostly) fixed now.
In case of ARM, the solution was to ensure that all sections that
have long jumps or targets of long jumps are marked as executable
in the ELF headers, so the linker can insert trampolines.

The one remaining problem at the moment is related to recursive
linking of the drivers/ directory, which has a .text section that
is larger than 32MB by itself. There is a patch to solve this by
linking each drivers/*/built-in.o object directly into vmlinux,
but that is a rather drastic change.
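
A rough sketch of why the 32MB mark matters. It assumes (my reading, not something stated in this thread) that the branches in question carry a 24-bit signed word offset, as the ARM-mode bl instruction and the PowerPC branch behind R_PPC64_REL24 do. That gives a reach of about +/-32MB, so a caller and callee further apart than that need a trampoline. The function name is made up for the illustration.

```python
BRANCH_FIELD_BITS = 24  # signed word offset stored in the instruction
WORD_SHIFT = 2          # offsets count 4-byte words, so shift left by 2


def fits_rel24(displacement):
    """True if a byte displacement can be encoded in such a branch."""
    limit = 1 << (BRANCH_FIELD_BITS + WORD_SHIFT - 1)  # 2**25 bytes = 32 MiB
    return displacement % 4 == 0 and -limit <= displacement < limit


# 0x1ff9c90 is the .text offset from the first error in the original report;
# it is used here only as an example of a distance just under 32 MiB.
print(fits_rel24(0x1FF9C90))
print(fits_rel24(32 * 1024 * 1024))
```

Once a single partially linked object has more than 32MB of text, some calls out of it cannot reach their targets directly, whatever the final link does with the surrounding objects.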

Arnd



Re: powerpc allyesconfig / allmodconfig linux-next next-20160729 - next-20160729 build failures

2016-08-02 Thread Luis R. Rodriguez
On Tue, Aug 02, 2016 at 02:58:39PM -0700, Guenter Roeck wrote:
> On Tue, Aug 02, 2016 at 01:07:09PM -0700, Luis R. Rodriguez wrote:
> > Are linux-next builds being tested for powerpc with allyesconfig and
> > allmodconfig ? I have some changes I'm making and while debugging my
> > build issues I decided to give a clean build a shot and see linux-next
> > next-20160729 up to next-20160729 all have build failures without my
> > changes. I get:
> > 
> > /opt/gcc-4.9.0-nolibc/powerpc64-linux/bin/powerpc64-linux-ld:
> > drivers/built-in.o: .opd is not a regular array of opd entries
> >   MODPOST vmlinux.o
> >   GEN .version
> >   CHK include/generated/compile.h
> >   UPD include/generated/compile.h
> >   CC  init/version.o
> >   LD  init/built-in.o
> > /opt/gcc-4.9.0-nolibc/powerpc64-linux/bin/powerpc64-linux-ld:
> > drivers/built-in.o: .opd is not a regular array of opd entries
> > drivers/built-in.o: In function `.ipw2100_up':
> > ipw2100.c:(.text+0x1ff9c90): relocation truncated to fit:
> 
> "relocation truncated to fit"  errors are typical for ppc:allyesconfig.

Thanks for the confirmation. How long has this been known to be broken?
Does anyone care about fixing these, or is this best effort?

> allmodconfig should work, though.

OK thanks.

  Luis


Re: powerpc allyesconfig / allmodconfig linux-next next-20160729 - next-20160729 build failures

2016-08-02 Thread Guenter Roeck
On Tue, Aug 02, 2016 at 01:07:09PM -0700, Luis R. Rodriguez wrote:
> Are linux-next builds being tested for powerpc with allyesconfig and
> allmodconfig ? I have some changes I'm making and while debugging my
> build issues I decided to give a clean build a shot and see linux-next
> next-20160729 up to next-20160729 all have build failures without my
> changes. I get:
> 
> /opt/gcc-4.9.0-nolibc/powerpc64-linux/bin/powerpc64-linux-ld:
> drivers/built-in.o: .opd is not a regular array of opd entries
>   MODPOST vmlinux.o
>   GEN .version
>   CHK include/generated/compile.h
>   UPD include/generated/compile.h
>   CC  init/version.o
>   LD  init/built-in.o
> /opt/gcc-4.9.0-nolibc/powerpc64-linux/bin/powerpc64-linux-ld:
> drivers/built-in.o: .opd is not a regular array of opd entries
> drivers/built-in.o: In function `.ipw2100_up':
> ipw2100.c:(.text+0x1ff9c90): relocation truncated to fit:

"relocation truncated to fit"  errors are typical for ppc:allyesconfig.
allmodconfig should work, though.

Guenter


powerpc allyesconfig / allmodconfig linux-next next-20160729 - next-20160729 build failures

2016-08-02 Thread Luis R. Rodriguez
Are linux-next builds being tested for powerpc with allyesconfig and
allmodconfig ? I have some changes I'm making and while debugging my
build issues I decided to give a clean build a shot and see linux-next
next-20160729 up to next-20160729 all have build failures without my
changes. I get:

/opt/gcc-4.9.0-nolibc/powerpc64-linux/bin/powerpc64-linux-ld:
drivers/built-in.o: .opd is not a regular array of opd entries
  MODPOST vmlinux.o
  GEN .version
  CHK include/generated/compile.h
  UPD include/generated/compile.h
  CC  init/version.o
  LD  init/built-in.o
/opt/gcc-4.9.0-nolibc/powerpc64-linux/bin/powerpc64-linux-ld:
drivers/built-in.o: .opd is not a regular array of opd entries
drivers/built-in.o: In function `.ipw2100_up':
ipw2100.c:(.text+0x1ff9c90): relocation truncated to fit:
R_PPC64_REL24 (stub) against symbol `.round_jiffies_relative' defined
in .text section in kernel/built-in.o
drivers/built-in.o: In function `.ipw2100_reset_adapter':
ipw2100.c:(.text+0x1ffa500): relocation truncated to fit:
R_PPC64_REL24 (stub) against symbol `._raw_spin_lock_irqsave' defined
in .spinlock.text section in kernel/built-in.o
drivers/built-in.o: In function `.ipw2100_irq_tasklet':
ipw2100.c:(.text+0x1ffa7cc): relocation truncated to fit:
R_PPC64_REL24 (stub) against symbol `._raw_spin_lock_irqsave' defined
in .spinlock.text section in kernel/built-in.o
ipw2100.c:(.text+0x1ffb6c8): relocation truncated to fit:
R_PPC64_REL24 (stub) against symbol `.printk' defined in
.text.unlikely section in kernel/built-in.o
ipw2100.c:(.text+0x1ffb6d8): relocation truncated to fit:
R_PPC64_REL24 (stub) against symbol `.printk' defined in
.text.unlikely section in kernel/built-in.o
ipw2100.c:(.text+0x1ffb740): relocation truncated to fit:
R_PPC64_REL24 (stub) against symbol `.printk' defined in
.text.unlikely section in kernel/built-in.o
ipw2100.c:(.text+0x1ffb750): relocation truncated to fit:
R_PPC64_REL24 (stub) against symbol `.printk' defined in
.text.unlikely section in kernel/built-in.o
ipw2100.c:(.text+0x1ffb7ec): relocation truncated to fit:
R_PPC64_REL24 (stub) against symbol `.debug_dma_unmap_page' defined in
.text section in lib/built-in.o
ipw2100.c:(.text+0x1ffb88c): relocation truncated to fit:
R_PPC64_REL24 (stub) against symbol `.__dev_kfree_skb_any' defined in
.text section in net/built-in.o
ipw2100.c:(.text+0x1ffb8b8): relocation truncated to fit:
R_PPC64_REL24 (stub) against symbol `.printk' defined in
.text.unlikely section in kernel/built-in.o
ipw2100.c:(.text+0x1ffb8f4): additional relocation overflows omitted
from the output
scripts/link-vmlinux.sh: line 52: 14580 Segmentation fault  (core
dumped) ${LD} ${LDFLAGS} ${LDFLAGS_vmlinux} -o ${2} -T ${lds}
${KBUILD_VMLINUX_INIT} --start-group ${KBUILD_VMLINUX_MAIN}
--end-group ${1}
make: *** [Makefile:952: vmlinux] Error 139
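
For context on the "relocation truncated to fit: R_PPC64_REL24" lines above: a PowerPC relative branch carries a 24-bit field that is shifted left by two bits, so it encodes a signed 26-bit byte displacement, roughly +/- 32 MiB. The sketch below only works out that arithmetic and compares it with the first call-site offset printed in the log. The function names are made up for illustration, and the comparison is a hint, not a diagnosis: it shows that the failing call sites sit almost 32 MiB into the .text of drivers/built-in.o, which is consistent with the linker being unable to place a stub within reach.

```python
# Reach of a PowerPC relative branch that uses an R_PPC64_REL24 relocation.
# Helper names here are hypothetical; only the bit widths come from the ISA.
FIELD_BITS = 24   # width of the displacement field in the branch instruction
ALIGN_SHIFT = 2   # instructions are 4-byte aligned, so the field is shifted left by 2


def rel24_range():
    """Return (lowest, highest) byte displacement a REL24 branch can encode."""
    span_bits = FIELD_BITS + ALIGN_SHIFT          # signed 26-bit byte displacement
    lowest = -(1 << (span_bits - 1))
    highest = (1 << (span_bits - 1)) - 4          # largest 4-byte-aligned positive value
    return lowest, highest


def fits_rel24(displacement):
    """True if the byte displacement is aligned and inside the encodable range."""
    lowest, highest = rel24_range()
    return displacement % 4 == 0 and lowest <= displacement <= highest


lowest, highest = rel24_range()
print(f"REL24 reach: {lowest} .. {highest} bytes")

# Offset of the first failing call site inside the .text of drivers/built-in.o,
# copied from the log line "ipw2100.c:(.text+0x1ff9c90)".
call_site_offset = 0x1ff9c90
print(f"call site is {call_site_offset / (1 << 20):.2f} MiB into .text")
print(f"distance from the 32 MiB mark: {(1 << 25) - call_site_offset} bytes")
```
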

  Luis
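
A side note on the final "Error 139" in the log: POSIX shells report a command killed by signal N as exit status 128 + N, and make passes that status through. The helper below is a small illustrative sketch (its name is made up) that decodes such a status. It assumes the usual Linux signal numbering, where SIGSEGV is 11, which matches the "Segmentation fault" reported for the ld invocation.

```python
import signal


def describe_exit_status(status):
    """Interpret an exit status the way a POSIX shell reports it."""
    if status > 128:
        signum = status - 128
        try:
            return f"killed by {signal.Signals(signum).name}"
        except ValueError:
            # Number does not correspond to a signal known on this platform.
            return f"killed by unknown signal {signum}"
    return f"exited with code {status}"


print(describe_exit_status(139))
```
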

