Re: Rebuilding the entire Debian archive twice on arm64 hardware for fun and profit

2019-01-26 Thread Luke Kenneth Casson Leighton
On Mon, Jan 7, 2019 at 11:30 PM Mike Hommey  wrote:

> > it would be extremely useful to confirm that 32-bit builds can in fact
> > be completed, simply by adding "-Wl,--no-keep-memory" to any 32-bit
> > builds that are failing at the linker phase due to lack of memory.
>
> Note that Firefox is built with --no-keep-memory
> --reduce-memory-overheads, and that was still not enough for 32-bit
> builds. GNU gold instead of BFD ld was also given a shot. That didn't
> work either. Presently, to make things link at all on 32-bit platforms,
> debug info is entirely disabled. I still need to figure out what minimal
> debug info can be enabled without incurring too much memory usage
> during linking.

 hi mike, hi steve, i did not receive a response to the queries about
the additional recommended options [1], so rather than lose track i
raised a bug report and cross-referenced this discussion:

 https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=919882

 personally, after using the evil_linker_torture.py tool i do not
expect the recommended options to work on 32-bit: despite the options
claiming to disable mmap, the investigation i did provides some
empirical evidence that ld-gold still uses mmap, whereas ld-bfd does
*not*.
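
 (for anyone wanting to double-check that, a rough sketch - not what i
actually ran, and it assumes strace's -y option for decoding file
descriptors to paths is available - is to count how often each linker
mmap()s its input objects, hypothetical object list shown:)

$ strace -f -y -e trace=mmap gcc -fuse-ld=gold -Wl,--no-map-whole-files src*.o -o main 2>&1 | grep -c '\.o>'
$ strace -f -y -e trace=mmap gcc -fuse-ld=bfd src*.o -o main 2>&1 | grep -c '\.o>'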

 so, ironically, on ld-bfd you run into one bug, and on ld-gold you
run into another :)

l.

[1] https://sourceware.org/bugzilla/show_bug.cgi?id=22831#c25



Re: Rebuilding the entire Debian archive twice on arm64 hardware for fun and profit

2019-01-09 Thread Luke Kenneth Casson Leighton
https://sourceware.org/bugzilla/show_bug.cgi?id=22831 - sorry, using a
phone to type. mike, comment 25 shows some important options to ld-gold:
would it be possible to retry with those on 32-bit? Disabling mmap looks
really important, as a 4gb+ binary is clearly guaranteed to fail to fit
into a 32-bit mmap.
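
To be concrete, something along these lines - a sketch only, with a
hypothetical object list, and with the gold option names being the ones
I understand comment 25 to be pointing at:

$ gcc -fuse-ld=gold src*.o -o main -Wl,--no-map-whole-files -Wl,--no-mmap-output-file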

On Tuesday, January 8, 2019, Mike Hommey  wrote:

>
> Note that Firefox is built with --no-keep-memory
> --reduce-memory-overheads, and that was still not enough for 32-bit
> builds. GNU gold instead of BFD ld was also given a shot. That didn't
> work either. Presently, to make things link at all on 32-bit platforms,
> debug info is entirely disabled. I still need to figure out what minimal
> debug info can be enabled without incurring too much memory usage
> during linking.
>
> Mike
>


-- 
---
crowd-funded eco-conscious hardware: https://www.crowdsupply.com/eoma68


Re: Rebuilding the entire Debian archive twice on arm64 hardware for fun and profit

2019-01-08 Thread Luke Kenneth Casson Leighton
On Tue, Jan 8, 2019 at 7:26 AM Luke Kenneth Casson Leighton
 wrote:
>
> On Tue, Jan 8, 2019 at 7:01 AM Luke Kenneth Casson Leighton
>  wrote:

> trying this:
>
> $ python evil_linker_torture.py 3000 400 200 50
>
> running with "make -j4" is going to take a few hours.

 ok so that did the trick: got to 4.3gb total resident memory even
with --no-keep-memory tacked on to the link.  fortunately it bombed
out (below) before it could get to the (assumed) point where it would
double the amount of resident RAM (8.6GB) and cause my laptop to go
into complete thrashing meltdown.

hypothetically it should have created an 18 GB executable.  3000 times
500,000 static chars isn't the only reason this is failing, because
when restricted to only 100 functions and 100 random calls per
function, it worked.
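
 (side note on the bomb-out itself: R_X86_64_PLT32 is a 32-bit
pc-relative relocation, so a call can only reach +/- 2GiB; once the
output's .text grows past that, the link *has* to fail with
"relocation truncated to fit" regardless of how much memory is
available.  a quick way to see how oversized this test is - assuming
GNU binutils' size tool is installed - is to total up the objects:)

$ size -t *.o | tail -n1   # the (TOTALS) row: text/data/bss summed over all objects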

ok so i'm retrying without --no-keep-memory... and it's now gone
beyond the 5GB mark.  backgrounding it and letting it progress a few
seconds at a time... that's interesting up to 8GB...  9.5GB ok
that's enough: any more than that and i really will trash the laptop.

ok so the above settings will definitely do the job (and seem to have
thrown up a repro candidate for the issue you were experiencing with
firefox builds, mike).

i apologise that it takes about 3 hours to build all 3,000 6mb object
files, even with a quad-core 3.6ghz i7.  they're a bit monstrous.

 will find this post somewhere in the debian-devel archives and
cross-reference it here:
https://sourceware.org/bugzilla/show_bug.cgi?id=22831


ld: warning: cannot find entry symbol _start; defaulting to 00401000
ld: src9.o: in function `fn_9_0':
/home/lkcl/src/ld_torture/src9.c:3006:(.text+0x27): relocation
truncated to fit: R_X86_64_PLT32 against symbol `fn_1149_322' defined
in .text section in src1149.o
ld: /home/lkcl/src/ld_torture/src9.c:3008:(.text+0x41): relocation
truncated to fit: R_X86_64_PLT32 against symbol `fn_1387_379' defined
in .text section in src1387.o
ld: /home/lkcl/src/ld_torture/src9.c:3014:(.text+0x8f): relocation
truncated to fit: R_X86_64_PLT32 against symbol `fn_1821_295' defined
in .text section in src1821.o
ld: /home/lkcl/src/ld_torture/src9.c:3015:(.text+0x9c): relocation
truncated to fit: R_X86_64_PLT32 against symbol `fn_1082_189' defined
in .text section in src1082.o
ld: /home/lkcl/src/ld_torture/src9.c:3016:(.text+0xa9): relocation
truncated to fit: R_X86_64_PLT32 against symbol `fn_183_330' defined
in .text section in src183.o
ld: /home/lkcl/src/ld_torture/src9.c:3024:(.text+0x111): relocation
truncated to fit: R_X86_64_PLT32 against symbol `fn_162_394' defined
in .text section in src162.o
ld: /home/lkcl/src/ld_torture/src9.c:3026:(.text+0x12b): relocation
truncated to fit: R_X86_64_PLT32 against symbol `fn_132_235' defined
in .text section in src132.o
ld: /home/lkcl/src/ld_torture/src9.c:3028:(.text+0x145): relocation
truncated to fit: R_X86_64_PLT32 against symbol `fn_1528_316' defined
in .text section in src1528.o
ld: /home/lkcl/src/ld_torture/src9.c:3029:(.text+0x152): relocation
truncated to fit: R_X86_64_PLT32 against symbol `fn_1178_357' defined
in .text section in src1178.o
ld: /home/lkcl/src/ld_torture/src9.c:3031:(.text+0x16c): relocation
truncated to fit: R_X86_64_PLT32 against symbol `fn_1180_278' defined
in .text section in src1180.o
ld: /home/lkcl/src/ld_torture/src9.c:3035:(.text+0x1a0): additional
relocation overflows omitted from the output
^Cmake: *** Deleting file `main'
make: *** [main] Interrupt



Re: Rebuilding the entire Debian archive twice on arm64 hardware for fun and profit

2019-01-07 Thread Luke Kenneth Casson Leighton
On Tue, Jan 8, 2019 at 7:01 AM Luke Kenneth Casson Leighton
 wrote:

> i'm going to see if i can get above the 4GB mark by modifying the
> Makefile to do 3,000 shared libraries instead of 3,000 static object
> files.

 fail.  shared libraries link extremely quickly.  reverted to static,
trying this:

$ python evil_linker_torture.py 3000 400 200 50

so that's 4x the number of functions per file, and 2x the number of
calls *in* each function.

just the compile phase requires 1GB per object file (gcc 7.3.0-29),
which, with "make -j8", ratcheted up the loadavg to the point where...
well... *when* it recovered, it reported a loadavg of over 35, with 95%
usage of the 16GB swap space...

running with "make -j4" is going to take a few hours.

l.



Re: Rebuilding the entire Debian archive twice on arm64 hardware for fun and profit

2019-01-07 Thread Luke Kenneth Casson Leighton
$ python evil_linker_torture.py 3000 100 100 50

ok so that managed to get up to 1.8GB resident memory, paused for a
bit, then doubled it to 3.6GB, and a few seconds later successfully
outputted a binary.

i'm going to see if i can get above the 4GB mark by modifying the
Makefile to do 3,000 shared libraries instead of 3,000 static object
files.

l.



Re: Rebuilding the entire Debian archive twice on arm64 hardware for fun and profit

2019-01-07 Thread Luke Kenneth Casson Leighton
On Tue, Jan 8, 2019 at 6:27 AM Luke Kenneth Casson Leighton
 wrote:

> i'm just running the above, will hit "send" now in case i can't hit
> ctrl-c in time on the linker phase... goodbye world... :)

$ python evil_linker_torture.py 2000 50 100 200
$ make -j8

oh, err... whoopsie... is this normal? :)  it was only showing around
600mb during the linker phase anyway. will keep hunting. where is this
best discussed (i.e. not such a massive cc list)?

/usr/bin/ld: /usr/lib/gcc/x86_64-linux-gnu/7/crtbeginS.o: in function
`deregister_tm_clones':
crtstuff.c:(.text+0x3): relocation truncated to fit: R_X86_64_PC32
against `.tm_clone_table'
/usr/bin/ld: crtstuff.c:(.text+0xb): relocation truncated to fit:
R_X86_64_PC32 against symbol `__TMC_END__' defined in .data section in
main
/usr/bin/ld: /usr/lib/gcc/x86_64-linux-gnu/7/crtbeginS.o: in function
`register_tm_clones':
crtstuff.c:(.text+0x43): relocation truncated to fit: R_X86_64_PC32
against `.tm_clone_table'
/usr/bin/ld: crtstuff.c:(.text+0x4a): relocation truncated to fit:
R_X86_64_PC32 against symbol `__TMC_END__' defined in .data section in
main
/usr/bin/ld: /usr/lib/gcc/x86_64-linux-gnu/7/crtbeginS.o: in function
`__do_global_dtors_aux':
crtstuff.c:(.text+0x92): relocation truncated to fit: R_X86_64_PC32
against `.bss'
/usr/bin/ld: crtstuff.c:(.text+0xba): relocation truncated to fit:
R_X86_64_PC32 against `.bss'
collect2: error: ld returned 1 exit status
make: *** [main] Error 1



Re: Rebuilding the entire Debian archive twice on arm64 hardware for fun and profit

2019-01-07 Thread Luke Kenneth Casson Leighton
$ python evil_linker_torture.py 2000 50 100 200

ok so it's pretty basic, and arguments of "2000 50 10 100"
resulted in around a 10-15 second linker phase, which top showed to be
getting up to around the 2-3GB resident memory range.  "2000 50 100
200" should make even a system with 64GB RAM start to feel the pain.
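
(side note: rather than eyeballing top, the peak can be captured with
GNU time - assuming /usr/bin/time is the GNU version rather than the
shell builtin - since on linux the maximum resident set size it reports
also covers reaped child processes, i.e. the linker:)

$ /usr/bin/time -v make main 2>&1 | grep "Maximum resident set size"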

evil_linker_torture.py N M O P generates N files, each with M functions
making O randomly-selected calls to other functions, where each file
also contains a static char array of size P that is *deliberately*
forced into the data segment by being initialised with a non-zero
value - exactly and precisely what you should never do, because...
surpriiise! it adversely impacts the binary size.

i'm just running the above, will hit "send" now in case i can't hit
ctrl-c in time on the linker phase... goodbye world... :)

l.
#!/usr/bin/env python

import sys
import random

maketemplate = """\
CC := gcc
CFILES:=$(shell ls | grep "\.c")
OBJS:=$(CFILES:%.c=%.o)
DEPS := $(CFILES:%.c=%.d)
CFLAGS := -g -g -g
LDFLAGS := -g -g -g

%.d: %.c
	$(CC) $(CFLAGS) -MM -o $@ $<

%.o: %.c
	$(CC) $(CFLAGS) -o $@ -c $<

#	$(CC) $(CFLAGS) -include $(DEPS) -o $@ $<

main: $(OBJS)
	$(CC) $(OBJS) $(LDFLAGS) -o main
"""

def gen_makefile():
    with open("Makefile", "w") as f:
        f.write(maketemplate)

def gen_headers(num_files, num_fns):
    # one header per generated C file, declaring all of its functions
    for fnum in range(num_files):
        with open("hdr{}.h".format(fnum), "w") as f:
            for fn_num in range(num_fns):
                f.write("extern int fn_{}_{}(int arg1);\n".format(fnum, fn_num))

def gen_c_code(num_files, num_fns, num_calls, static_sz):
    for fnum in range(num_files):
        with open("src{}.c".format(fnum), "w") as f:
            # include every header so any function can call any other
            for hfnum in range(num_files):
                f.write('#include "hdr{}.h"\n'.format(hfnum))
            # deliberately non-zero initialiser: the array lands in .data, not .bss
            f.write('static char data[%d] = {1};\n' % static_sz)
            for fn_num in range(num_fns):
                f.write("int fn_%d_%d(int arg1)\n{\n" % (fnum, fn_num))
                f.write("\tint arg = arg1 + 1;\n")
                for nc in range(num_calls):
                    cnum = random.randint(0, num_fns-1)
                    cfile = random.randint(0, num_files-1)
                    f.write("\targ += fn_{}_{}(arg);\n".format(cfile, cnum))
                f.write("\treturn arg;\n")
                f.write("}\n")
            if fnum != 0:
                continue
            # main() only goes into src0.c
            f.write("int main(int argc, char *argv[])\n{\n")
            f.write("\tint arg = 0;\n")
            for nc in range(num_calls):
                cnum = random.randint(0, num_fns-1)
                cfile = random.randint(0, num_files-1)
                f.write("\targ += fn_{}_{}(arg);\n".format(cfile, cnum))
            f.write("\treturn 0;\n")
            f.write("}\n")

if __name__ == '__main__':
    num_files = int(sys.argv[1])
    num_fns = int(sys.argv[2])
    num_calls = int(sys.argv[3])
    static_sz = int(sys.argv[4])
    gen_makefile()
    gen_headers(num_files, num_fns)
    gen_c_code(num_files, num_fns, num_calls, static_sz)


Re: Rebuilding the entire Debian archive twice on arm64 hardware for fun and profit

2019-01-07 Thread Luke Kenneth Casson Leighton
On Tuesday, January 8, 2019, Mike Hommey  wrote:

> On Mon, Jan 07, 2019 at 11:46:41PM +, Luke Kenneth Casson Leighton
> wrote:
>
> > At some point apps are going to become so insanely large that not even
> > disabling debug info will help.
>
> That's less likely, I'd say. Debug info *is* getting incredibly more and
> more complex for the same amount of executable weight, and linking that
> is making things worse and worse. But having enough code to actually be
> a problem without debug info is probably not so close.
>
>
It's a slow-boil problem: it's taken 10 years to get bad, and another 10
years to get really bad. It needs strategic planning. Right now things are
not exactly being tackled except in a reactive way, which unfortunately
takes time as everyone is a volunteer. That exacerbates the problem and
leaves drastic "solutions" such as "drop all 32-bit support".


> There are solutions to still keep full debug info, but the Debian
> packaging side doesn't support that presently: using split-dwarf. It
> would probably be worth investing in supporting that.
>
>
Sounds very reasonable; I've always wondered why debug syms are not
separated out at build/link time. Would that buy maybe another decade?
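
For reference, roughly what that looks like from the command line - a
sketch with hypothetical file names, assuming gcc's -gsplit-dwarf and
the dwp tool shipped alongside gold are available:

$ gcc -g -gsplit-dwarf -c foo.c   # most of the DWARF lands in foo.dwo, not foo.o
$ gcc foo.o -o foo                # the link no longer carries the bulk of the debug info
$ dwp -e foo -o foo.dwp           # optionally bundle the .dwo files afterwards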



-- 
---
crowd-funded eco-conscious hardware: https://www.crowdsupply.com/eoma68


Re: Rebuilding the entire Debian archive twice on arm64 hardware for fun and profit

2019-01-07 Thread Mike Hommey
On Mon, Jan 07, 2019 at 11:46:41PM +, Luke Kenneth Casson Leighton wrote:
> On Tuesday, January 8, 2019, Mike Hommey  wrote:
> 
> > .
> >
> > Note that Firefox is built with --no-keep-memory
> > --reduce-memory-overheads, and that was still not enough for 32-bit
> > builds. GNU gold instead of BFD ld was also given a shot. That didn't
> > work either. Presently, to make things link at all on 32-bit platforms,
> > debug info is entirely disabled. I still need to figure out what minimal
> > debug info can be enabled without incurring too much memory usage
> > during linking.
> 
> 
> Dang. Yes, removing debug symbols was the only way I could get webkit to
> link without thrashing, it's a temporary fix though.
> 
> So the removal of the algorithm in ld that Dr Stallman wrote, dating back
> to the 1990s, has already resulted in a situation that's worse than I feared.
> 
> At some point apps are going to become so insanely large that not even
> disabling debug info will help.

That's less likely, I'd say. Debug info *is* getting incredibly more and
more complex for the same amount of executable weight, and linking that
is making things worse and worse. But having enough code to actually be
a problem without debug info is probably not so close.

There are solutions to still keep full debug info, but the Debian
packaging side doesn't support that presently: using split-dwarf. It
would probably be worth investing in supporting that.

Mike



Re: Rebuilding the entire Debian archive twice on arm64 hardware for fun and profit

2019-01-07 Thread Luke Kenneth Casson Leighton
On Tuesday, January 8, 2019, Mike Hommey  wrote:

> .
>
> Note that Firefox is built with --no-keep-memory
> --reduce-memory-overheads, and that was still not enough for 32-bit
> builds. GNU gold instead of BFD ld was also given a shot. That didn't
> work either. Presently, to make things link at all on 32-bit platforms,
> debug info is entirely disabled. I still need to figure out what minimal
> debug info can be enabled without incurring too much memory usage
> during linking.


Dang. Yes, removing debug symbols was the only way I could get webkit to
link without thrashing, it's a temporary fix though.

So the removal of the algorithm in ld that Dr Stallman wrote, dating back
to the 1990s, has already resulted in a situation that's worse than I feared.

At some point apps are going to become so insanely large that not even
disabling debug info will help.

At which point perhaps it is worth questioning the approach of having an
app be a single executable in the first place.  Even on a 64-bit system, if
an app doesn't fit into 4GB of RAM, something is going drastically awry.



-- 
---
crowd-funded eco-conscious hardware: https://www.crowdsupply.com/eoma68


Re: Rebuilding the entire Debian archive twice on arm64 hardware for fun and profit

2019-01-07 Thread Mike Hommey
On Mon, Jan 07, 2019 at 10:28:31AM +, Luke Kenneth Casson Leighton wrote:
> On Sun, Jan 6, 2019 at 11:46 PM Steve McIntyre  wrote:
> >
> > [ Please note the cross-post and respect the Reply-To... ]
> >
> > Hi folks,
> >
> > This has taken a while in coming, for which I apologise. There's a lot
> > of work involved in rebuilding the whole Debian archive, and many many
> > hours spent analysing the results. You learn quite a lot, too! :-)
> >
> > I promised way back before DC18 that I'd publish the results of the
> > rebuilds that I'd just started. Here they are, after a few false
> > starts. I've been rebuilding the archive *specifically* to check if we
> > would have any problems building our 32-bit Arm ports (armel and
> > armhf) using 64-bit arm64 hardware. I might have found other issues
> > too, but that was my goal.
> 
>  very cool.
> 
>  steve, this is probably as good a time as any to mention a very
> specific issue with binutils (ld) that has been slowly and inexorably
> creeping up on *all* distros - both 64 and 32 bit - where the 32-bit
> arches are beginning to hit the issue first.
> 
>  it's a 4GB variant of the "640k should be enough for anyone" problem,
> as applied to linking.
> 
>  i spoke with dr stallman a couple of weeks ago and confirmed that in
> the original version of ld that he wrote, he very very specifically
> made sure that it ONLY allocated memory up to the maximum *physical*
> resident available amount (i.e. only went into swap as an absolute
> last resort), and secondly that the number of object files loaded into
> memory was kept, again, to the minimum that the amount of spare
> resident RAM could handle.
> 
>  some... less-experienced people, somewhere in the late 1990s, ripped
> all of that code out ["what's all this crap? why are we not just
> relying on swap? 4GB swap will surely be enough for anybody"]
> 
>  by 2008 i experienced a complete melt-down on a 2GB system when
> compiling webkit.  i tracked it down to having accidentally enabled
> "-g -g -g" in the Makefile, which i had done specifically for one
> file, forgot about it, and accidentally recompiled everything.
> 
>  that resulted in an absolute thrashing meltdown that nearly took out
> the entire laptop.
> 
>  the problem is that the linker phase in any application is so heavy
> on cross-references that the moment the memory allocated by the linker
> goes outside of the boundary of the available resident RAM it is
> ABSOLUTELY GUARANTEED to go into permanent sustained thrashing.
> 
>  i cannot emphasise enough how absolutely critical it is for
> EVERY distribution to get this fixed.
> 
> resources world-wide are being completely wasted (power, time, and the
> destruction of HDDs and SSDs) because systems which should only really
> take an hour to do a link are instead often taking FIFTY times longer
> due to swap thrashing.
> 
> not only that, but the poor design of ld is beginning to stop certain
> packages from even *linking* on 32-bit systems!  firefox, i heard, now
> requires SEVEN GIGABYTES during the linker phase!
> 
> and it's down to this very short-sighted decision to remove code
> written by dr stallman, back in the late 1990s.
> 
> it would be extremely useful to confirm that 32-bit builds can in fact
> be completed, simply by adding "-Wl,--no-keep-memory" to any 32-bit
> builds that are failing at the linker phase due to lack of memory.

Note that Firefox is built with --no-keep-memory
--reduce-memory-overheads, and that was still not enough for 32-bit
builds. GNU gold instead of BFD ld was also given a shot. That didn't
work either. Presently, to make things link at all on 32-bit platforms,
debug info is entirely disabled. I still need to figure out what minimal
debug info can be enabled without incurring too much memory usage
during linking.

Mike



Re: Rebuilding the entire Debian archive twice on arm64 hardware for fun and profit

2019-01-07 Thread Luke Kenneth Casson Leighton
(hi edmund, i'm reinstating debian-devel on the cc list as this is not
a debian-arm problem, it's *everyone's* problem)

On Mon, Jan 7, 2019 at 12:40 PM Edmund Grimley Evans
 wrote:

> >  i spoke with dr stallman a couple of weeks ago and confirmed that in
> > the original version of ld that he wrote, he very very specifically
> > made sure that it ONLY allocated memory up to the maximum *physical*
> > resident available amount (i.e. only went into swap as an absolute
> > last resort), and secondly that the number of object files loaded into
> > memory was kept, again, to the minimum that the amount of spare
> > resident RAM could handle.
>
> How did ld back then determine how much physical memory was available,
> and how might a modern reimplementation do it?

 i don't know: i haven't investigated the code.  one clue: gcc does
exactly the same thing (or, used to: i believe that someone *may* have
tried removing the feature from recent versions of gcc).

 ... you know how gcc stays below the radar of available memory, never
going into swap-space except as a last resort?

> Perhaps you use sysconf(_SC_PHYS_PAGES) or sysconf(_SC_AVPHYS_PAGES).
> But which? I have often been annoyed by how "make -j" may attempt
> several huge linking phases in parallel.

 on my current laptop, which was one of the very early quad core i7
skylakes with 2400mhz DDR4 RAM, the PCIe bus actually shuts down if
too much data goes over it (too high a power draw occurs).

 consequently, if swap-thrashing occurs, it's extremely risky, as it
causes the NVMe SSD to go *offline*, re-initialise, and come back on
again after some delay.

 that means that i absolutely CANNOT allow the linker phase to go into
swap-thrashing, as it will result in the loadavg shooting up to over
120 within just a few seconds.


> Would it be possible to put together a small script that demonstrates
> ld's inefficient use of memory? It is easy enough to generate a big
> object file from a tiny source file, and there are no doubt easy ways
> of measuring how much memory a process used, so it may be possible to
> provide a more convenient test case than "please try building Firefox
> and watch/listen as your SSD/HDD gets t(h)rashed".
>
> extern void *a[], *b[];
> void *c[1000] = {  };
> void *d[1000] = {  };
>
> If we had an easy test case we could compare GNU ld, GNU gold, and LLD.

 a simple script that auto-generates tens of thousands of functions in
a couple of hundred c files - each function making tens to hundreds of
random cross-references (calls) to other functions across the entire
range of auto-generated c files - should be more than adequate to make
the linker phase go into near-total meltdown.

 the evil kid in me really *really* wants to give that a shot...
except it would be extremely risky to run on my laptop.

 i'll write something up. mwahahah :)

l.



Re: Rebuilding the entire Debian archive twice on arm64 hardware for fun and profit

2019-01-07 Thread Edmund Grimley Evans
>  i spoke with dr stallman a couple of weeks ago and confirmed that in
> the original version of ld that he wrote, he very very specifically
> made sure that it ONLY allocated memory up to the maximum *physical*
> resident available amount (i.e. only went into swap as an absolute
> last resort), and secondly that the number of object files loaded into
> memory was kept, again, to the minimum that the amount of spare
> resident RAM could handle.

How did ld back then determine how much physical memory was available,
and how might a modern reimplementation do it?

Perhaps you use sysconf(_SC_PHYS_PAGES) or sysconf(_SC_AVPHYS_PAGES).
But which? I have often been annoyed by how "make -j" may attempt
several huge linking phases in parallel.
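
For what it's worth, a rough sketch of both queries (assuming glibc's
getconf knows these names), plus the load-average cap that GNU make
already offers as a partial workaround for the parallel-link problem:

$ getconf _PHYS_PAGES     # total physical pages
$ getconf _AVPHYS_PAGES   # physical pages currently available
$ getconf PAGE_SIZE
$ make -j8 -l8            # do not start new jobs while the load average exceeds 8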

Would it be possible to put together a small script that demonstrates
ld's inefficient use of memory? It is easy enough to generate a big
object file from a tiny source file, and there are no doubt easy ways
of measuring how much memory a process used, so it may be possible to
provide a more convenient test case than "please try building Firefox
and watch/listen as your SSD/HDD gets t(h)rashed".

extern void *a[], *b[];
void *c[1000] = {  };
void *d[1000] = {  };

If we had an easy test case we could compare GNU ld, GNU gold, and LLD.

Edmund



Re: Rebuilding the entire Debian archive twice on arm64 hardware for fun and profit

2019-01-07 Thread Steve McIntyre
On Mon, Jan 07, 2019 at 09:54:32AM +, Edmund Grimley Evans wrote:
>The Haskell CP15 failures might be this:
>
>https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=864847
>
>Since it is claimed there that the CP15 instructions come from LLVM,
>the Mono failures might have a very similar cause and solution.

ACK, good call and thanks for the link! That looks like exactly the
problem, still there 18 months later with no response from the
maintainers. :-(

-- 
Steve McIntyre, Cambridge, UK.st...@einval.com
"I used to be the first kid on the block wanting a cranial implant,
 now I want to be the first with a cranial firewall. " -- Charlie Stross



Re: Rebuilding the entire Debian archive twice on arm64 hardware for fun and profit

2019-01-07 Thread Luke Kenneth Casson Leighton
On Sun, Jan 6, 2019 at 11:46 PM Steve McIntyre  wrote:
>
> [ Please note the cross-post and respect the Reply-To... ]
>
> Hi folks,
>
> This has taken a while in coming, for which I apologise. There's a lot
> of work involved in rebuilding the whole Debian archive, and many many
> hours spent analysing the results. You learn quite a lot, too! :-)
>
> I promised way back before DC18 that I'd publish the results of the
> rebuilds that I'd just started. Here they are, after a few false
> starts. I've been rebuilding the archive *specifically* to check if we
> would have any problems building our 32-bit Arm ports (armel and
> armhf) using 64-bit arm64 hardware. I might have found other issues
> too, but that was my goal.

 very cool.

 steve, this is probably as good a time as any to mention a very
specific issue with binutils (ld) that has been slowly and inexorably
creeping up on *all* distros - both 64 and 32 bit - where the 32-bit
arches are beginning to hit the issue first.

 it's a 4GB variant of the "640k should be enough for anyone" problem,
as applied to linking.

 i spoke with dr stallman a couple of weeks ago and confirmed that in
the original version of ld that he wrote, he very very specifically
made sure that it ONLY allocated memory up to the maximum *physical*
resident available amount (i.e. only went into swap as an absolute
last resort), and secondly that the number of object files loaded into
memory was kept, again, to the minimum that the amount of spare
resident RAM could handle.

 some... less-experienced people, somewhere in the late 1990s, ripped
all of that code out ["what's all this crap? why are we not just
relying on swap? 4GB swap will surely be enough for anybody"]

 by 2008 i experienced a complete melt-down on a 2GB system when
compiling webkit.  i tracked it down to having accidentally enabled
"-g -g -g" in the Makefile, which i had done specifically for one
file, forgot about it, and accidentally recompiled everything.

 that resulted in an absolute thrashing meltdown that nearly took out
the entire laptop.

 the problem is that the linker phase in any application is so heavy
on cross-references that the moment the memory allocated by the linker
goes outside of the boundary of the available resident RAM it is
ABSOLUTELY GUARANTEED to go into permanent sustained thrashing.

 i cannot emphasise enough how absolutely critical it is for
EVERY distribution to get this fixed.

resources world-wide are being completely wasted (power, time, and the
destruction of HDDs and SSDs) because systems which should only really
take an hour to do a link are instead often taking FIFTY times longer
due to swap thrashing.

not only that, but the poor design of ld is beginning to stop certain
packages from even *linking* on 32-bit systems!  firefox, i heard, now
requires SEVEN GIGABYTES during the linker phase!

and it's down to this very short-sighted decision to remove code
written by dr stallman, back in the late 1990s.

it would be extremely useful to confirm that 32-bit builds can in fact
be completed, simply by adding "-Wl,--no-keep-memory" to any 32-bit
builds that are failing at the linker phase due to lack of memory.
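
 (as a sketch of what i mean - hypothetical, and assuming the package
picks its flags up from dpkg-buildflags - it ought to be as simple as
the following before rebuilding:)

$ DEB_LDFLAGS_APPEND="-Wl,--no-keep-memory" dpkg-buildpackage -us -uc -b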

however *please do not make the mistake of thinking that this is
specifically a 32-bit problem*.  resources are being wasted on 64-bit
systems going into massive thrashing just as much as they are on
32-bit ones: it's just that when it happens on a 32-bit system, a hard
error occurs.

somebody needs to take responsibility for fixing binutils: the
maintainer of binutils needs help as he does not understand the
problem.  https://sourceware.org/bugzilla/show_bug.cgi?id=22831

l.



Re: Rebuilding the entire Debian archive twice on arm64 hardware for fun and profit

2019-01-07 Thread Edmund Grimley Evans
The Haskell CP15 failures might be this:

https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=864847

Since it is claimed there that the CP15 instructions come from LLVM,
the Mono failures might have a very similar cause and solution.



Re: Rebuilding the entire Debian archive twice on arm64 hardware for fun and profit

2019-01-06 Thread Steve McIntyre
On Mon, Jan 07, 2019 at 12:07:49AM +, peter green wrote:
>On 06/01/19 23:45, Steve McIntyre wrote:
>> In my initial testing for rebuilding armhf only, I did not enable
>> either of these. I was then finding *lots* of "Illegal Instruction"
>> crashes due to CP15 barrier usage in armhf Haskell and Mono
>> programs. This suggests that the baseline architecture in these
>> toolchains is incorrectly set to target ARMv6 rather than
>> ARMv7. That should be fixed and all those packages rebuilt at some
>> point.
>
>Haskell does appear to be configured for armv7 in Debian (in the
>sense that I had to patch the package in Raspbian to make it build
>for armv6). I would guess for haskell that this is an issue that
>needs digging into the upstream source, not just looking at the build
>system.

OK, fair enough. Something inside is using CP15 barriers for v7, then,
which is just Wrong. :-)

-- 
Steve McIntyre, Cambridge, UK.st...@einval.com
< Aardvark> I dislike C++ to start with. C++11 just seems to be
handing rope-creating factories for users to hang multiple
instances of themselves.



Re: Rebuilding the entire Debian archive twice on arm64 hardware for fun and profit

2019-01-06 Thread peter green

On 06/01/19 23:45, Steve McIntyre wrote:

In my initial testing for rebuilding armhf only, I did not enable
either of these. I was then finding *lots* of "Illegal Instruction"
crashes due to CP15 barrier usage in armhf Haskell and Mono
programs. This suggests that the baseline architecture in these
toolchains is incorrectly set to target ARMv6 rather than
ARMv7. That should be fixed and all those packages rebuilt at some
point.

Haskell does appear to be configured for armv7 in Debian (in the sense that I 
had to patch the package in Raspbian to make it build for armv6). I would guess 
for haskell that this is an issue that needs digging into the upstream source, 
not just looking at the build system.