[Bug ld/22831] ld causes massive thrashing if object files are not fully memory-resident: new algorithm needed
https://sourceware.org/bugzilla/show_bug.cgi?id=22831

Sam James changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |sam at gentoo dot org

-- 
You are receiving this mail because:
You are on the CC list for the bug.
[Bug ld/22831] ld causes massive thrashing if object files are not fully memory-resident: new algorithm needed
https://sourceware.org/bugzilla/show_bug.cgi?id=22831 --- Comment #38 from Giovanni Lostumbo --- I contacted Dr.(h.c) Richard Stallman the other day to inquire which of original GNU ld versions he wrote. He replied (spelling errors included and annotated with "[sic]" by me), "The original GNU ld was written ny[sic] me. I designed it to minimize total memory usage by reading all the object files and libraries twice in the right oder[sic]. Others wrote a different ld program in the early 1990s. I think that was to support additional features and output formats. However, machines' memory sizes were bigger and they did not preserve what I had done to reduce the total memory requirement." Upon checking the ld.c file in Binutils 1.9, Stallman wrote the first version of ld. "/* Written by Richard Stallman with some help from Eric Albert." He also wrote the 1988 and 1988 binutils files, which appear to be betas. The ld file(s) in Binutils 1.94-beta were written by Steve Chamberlain, and Binutils 2.1 is a completely different version onwards. Thus it appears that the only official version of binutils that contains the memory minimizing technique is 1.9. I have attached ld.c here along with the source (171KB). Link: http://www.mirrorservice.org/sites/sources.redhat.com/pub/binutils/old-releases/binutils-1.9.tar.bz2 What are the additional features? From an old GNU ld manual, "ld version 2 January 1994": "This version of ld uses the general purpose BFD libraries to operate on object files. This allows ld to read, combine, and write object files in many different formats--for example, COFF or a.out. Different formats may be linked together to produce any available kind of object file. See section BFD, for more information. Aside from its flexibility, the GNU linker is more helpful than other linkers in providing diagnostic information. 
Many linkers abandon execution immediately upon encountering an error; whenever possible, ld continues executing, allowing you to identify other errors (or, in some cases, to get an output file in spite of the error)." Source: https://ftp.gnu.org/old-gnu/Manuals/ld-2.9.1/html_node/ld_1.html#SEC1 Could the original version be adapted to retain memory minimizing techniques while supporting additional output formats? I don't know. It seems like most, if not all the new features result in additional memory usage (size of ld 1.9 is 171 KB uncompressed; size of ld folder in 2.1 is 696 KB-multiple ld files). Whether the original techniques were even tested to work with the new features is something worth exploring. -- You are receiving this mail because: You are on the CC list for the bug.
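To make the "reading all the object files and libraries twice" idea from the quote above concrete, here is a toy sketch in Python. It is NOT the ld 1.9 code: the object format (`ToyObject`), the `load` helper and `link_two_pass` are all hypothetical names made up for illustration. The point is only that pass 1 keeps nothing but a symbol table, and pass 2 holds a single object in memory at a time.

```python
from typing import Dict, List, Tuple

# Hypothetical toy "object file": (name, code, defined symbols, relocations).
#   defined: symbol name -> offset inside this object's code
#   relocs:  (offset inside code, symbol name) where a 4-byte address is patched
ToyObject = Tuple[str, bytes, Dict[str, int], List[Tuple[int, str]]]


def load(store: Dict[str, ToyObject], name: str) -> ToyObject:
    """Stand-in for reading one object file from disk."""
    return store[name]


def link_two_pass(store: Dict[str, ToyObject], order: List[str]) -> bytes:
    # Pass 1: read each object once, record final symbol addresses, drop it.
    symtab: Dict[str, int] = {}
    base = 0
    for name in order:
        _, code, defined, _ = load(store, name)
        for sym, off in defined.items():
            symtab[sym] = base + off
        base += len(code)

    # Pass 2: re-read each object, patch its relocations, emit it, drop it.
    out = bytearray()
    for name in order:
        _, code, _, relocs = load(store, name)
        patched = bytearray(code)
        for off, sym in relocs:
            patched[off:off + 4] = symtab[sym].to_bytes(4, "little")
        out += patched
    return bytes(out)
```

In this sketch the only state that grows with the size of the link is `symtab`; the price is that every input is read from disk twice.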
[Bug ld/22831] ld causes massive thrashing if object files are not fully memory-resident: new algorithm needed
https://sourceware.org/bugzilla/show_bug.cgi?id=22831 --- Comment #37 from Giovanni Lostumbo --- Created attachment 14339 --> https://sourceware.org/bugzilla/attachment.cgi?id=14339&action=edit ld.c 3-29-1991 reads object files and libraries 2x to minimize memory -- You are receiving this mail because: You are on the CC list for the bug.
[Bug ld/22831] ld causes massive thrashing if object files are not fully memory-resident: new algorithm needed
https://sourceware.org/bugzilla/show_bug.cgi?id=22831 Giovanni Lostumbo changed: What|Removed |Added CC||giovanni.lostumbo at gmail dot com --- Comment #36 from Giovanni Lostumbo --- Binutils Version 0.001-2.13 (1988-2002) http://www.mirrorservice.org/sites/sources.redhat.com/pub/binutils/old-releases/ Binutils 2.30 was released in 2018. As Luke mentions in comment #20, he spoke with Stallman back then, who confirmed that the code that helped ld stay within resident memory was removed in the late 1990s. If the source code is indeed in that University of Kent mirror ^ for all the legacy versions of binutils pre-2003, one should quickly be able to locate the last version of binutils with the original code that Stallman used. I cannot interpret code, but I'm entering this discussion from a hardware design viewpoint. Thrashing results in increased power consumption, and quickly depletes battery and disk life, if it is even successful at compiling. To help organize a solution to this bug, I propose that the original algorithmic code be identified and analyzed here or somewhere where it can be compared and contrasted to the swap mechanism that replaced it. Then, if the code can be reimplemented (not rewritten, since Luke claims it was already working code- no need to reinvent the wheel here), it can be tested in both 32 bit and 64 bit systems. While I do not understand programming languages, I do understand that there is a possibility that the legacy code had algorithms intrinsic to 32 bit, and may require some adapting for 64 bit. I'm guessing it could also extrapolate to 64 bit, independent of architecture, but that is more of a mathematical question beyond my capabilities. I can also imagine other use-cases where restoring the original ld algorithms could be immensely efficient/beneficial. Say one is testing an array of new builds, and is modifying a select number of lines in source to test a new functionality or performance. 
One might develop 20 copies of the source, identical save for a few lines of experimental code. Each compile may take 24-48 hrs if it goes into swap space. That's 3-7 weeks of running a laptop or desktop. But if it uses code that never runs into swap, it can complete the compile much faster and with much less power. Now multiply that by 1000x users, with a server running 20,000 virtual machines (e.g. Amazon EC2). The kWh can add up very quickly, especially for those who cannot test their device locally, and can't afford to rent a server with that many VMs for 24-48 hrs.

Swap space can also be less secure. Sensitive data written to swap could be left behind on disk, especially on a remote server, which could experience a power outage and be accessed at a later time. Data is relatively less vulnerable to theft in RAM.

If Stallman's code prevents the thrashing that arose out of the swap mechanism, then this bug report would NOT be an enhancement: it would be restoring the original, concisely operating functionality.
[Bug ld/22831] ld causes massive thrashing if object files are not fully memory-resident: new algorithm needed
https://sourceware.org/bugzilla/show_bug.cgi?id=22831 --- Comment #35 from Luke Kenneth Casson Leighton --- On Sat, Jul 23, 2022 at 3:04 PM amodra at gmail dot com wrote: > And "new algorithm needed" is really saying "rewrite the linker". i mention this very early on in this bugreport: back in the early 90s it was indeed rewritten, to remove Dr Stallman's algorithms, on the flawed assumption "640k^H^H^H^H 4GB should be enough for anybody". > That's low > priority. Also, there are other linkers, eg. gold and lld, that are much > newer > than ld.bfd. gold suffers from similar problems - i was able to make it keel over just as easily. i've not heard of lld before: if it likewise makes the same flawed assumption that going into swap is acceptable, it will likewise result in the exact same problem. > They don't do much better at memory usage, do they? if Dr Stallman's carefully-crafted original algorithms had been left in place, which, just as in gcc, made *really certain* to only use *resident* RAM, we would not be having this conversation as this bugreport would not need to be raised. the fundamental flawed assumption is that it's "ok to use swap". the sheer overwhelming amount of cross-referencing required in a linker *100% guarantee* that even 10 kbytes over resident RAM will result in thrashing. any rewrite or redesign that does not take that into account is 100% guaranteed to be problematic. this is just how it is: it's basic fundamental computer science that a linker *has* to jump around across the entirety of *all* of the objects it's trying to link. this makes the "Working Set" *equal* to 100% of the available Swap, which is unfortunately the very definition of "thrash conditions". -- You are receiving this mail because: You are on the CC list for the bug.
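One access pattern where "just over resident RAM" really does fall off a cliff is a repeated sweep over all pages under strict LRU replacement. The simulation below is only a sketch of that one case: `count_faults` is a made-up helper, real kernels do not use strict LRU, and a real linker's access pattern is not a pure sweep.

```python
from collections import OrderedDict


def count_faults(num_pages: int, resident_pages: int, sweeps: int) -> int:
    """Count page faults for repeated sequential sweeps under strict LRU."""
    resident: "OrderedDict[int, None]" = OrderedDict()
    faults = 0
    for _ in range(sweeps):
        for page in range(num_pages):
            if page in resident:
                resident.move_to_end(page)        # mark as most recently used
            else:
                faults += 1
                if len(resident) >= resident_pages:
                    resident.popitem(last=False)  # evict least recently used
                resident[page] = None
    return faults


if __name__ == "__main__":
    # Working set fits: only the first sweep faults.
    print(count_faults(num_pages=100, resident_pages=100, sweeps=10))
    # Working set is one page too big: every single access faults.
    print(count_faults(num_pages=101, resident_pages=100, sweeps=10))
```

With 100 resident pages, going from a 100-page to a 101-page working set takes the fault count from 100 to 1010 over ten sweeps, because LRU always evicts exactly the page that will be needed next.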
[Bug ld/22831] ld causes massive thrashing if object files are not fully memory-resident: new algorithm needed
https://sourceware.org/bugzilla/show_bug.cgi?id=22831 --- Comment #34 from Alan Modra --- I'll note that the priority and severity fields in bugzilla are primarily for the use of maintainers, or at least that should be the way they are treated. They are not for bug reporters to say "this bug is really, really important!" That said, I've experienced exactly the pain you ran into with a machine swapping like crazy, in fact it used to happen to me quite regularly. And some things we do, like trying to free memory before exit to pacify people crying "memory leak!" only make things worse when you run into swap. I had one link take 30 minutes extra just freeing memory.. In putting the severity at "enhancement" I'm merely reflecting reality. Using more memory than necessary is not a bug, at least not until you run out of memory. Even with ideal memory usage you will always be able to generate a workload that is just too big to handle. And "new algorithm needed" is really saying "rewrite the linker". That's low priority. Also, there are other linkers, eg. gold and lld, that are much newer than ld.bfd. They don't do much better at memory usage, do they? -- You are receiving this mail because: You are on the CC list for the bug.
[Bug ld/22831] ld causes massive thrashing if object files are not fully memory-resident: new algorithm needed
https://sourceware.org/bugzilla/show_bug.cgi?id=22831 --- Comment #33 from luke.leighton at gmail dot com --- that was supposed to be a private reply, bugzilla masked the email address "amodra ". the comment still stands though. i apologise for the toppost context. -- You are receiving this mail because: You are on the CC list for the bug.
[Bug ld/22831] ld causes massive thrashing if object files are not fully memory-resident: new algorithm needed
https://sourceware.org/bugzilla/show_bug.cgi?id=22831 --- Comment #32 from luke.leighton at gmail dot com --- (replying privately) dealing with this one was deeply unpleasant. i gave up as people were not listening. i refer people to it frequently whenever they encounter serious build problems. the torture demo is dead easy to autogenerate programs that crash both ld and gold, for both 32 and 64 bit. for 64 bit just keep increasing the parameters until programs exceed 16 gbytes in size and in some cases they won't even link at all. there are multiple complaints by distro builders that their 128 GB and 256 GB build farms actually kernel panic if they happen to accidentally have e.g. firefox, libreoffice and other massive linking occur simultaneously, due to thrashing. with 128 GB of RAM! i have had my very expensive laptop hit 1,200 loadavg due to this problem, it nearly lost me a year's work and took 25 minutes to get the cursor to move so i could hold down Ctrl-C and terminate the build. it's exacerbated significantly by debug builds. l. On July 23, 2022 3:37:03 AM GMT+01:00, amodra at gmail dot com wrote: >https://sourceware.org/bugzilla/show_bug.cgi?id=22831 > >Alan Modra changed: > > What|Removed |Added > > Severity|critical|enhancement > Status|WAITING |NEW > Priority|P1 |P3 > >--- Comment #31 from Alan Modra --- >Putting priority and severity back where they belong. > >-- >You are receiving this mail because: >You reported the bug. -- You are receiving this mail because: You are on the CC list for the bug.
[Bug ld/22831] ld causes massive thrashing if object files are not fully memory-resident: new algorithm needed
https://sourceware.org/bugzilla/show_bug.cgi?id=22831

Alan Modra changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Severity|critical                    |enhancement
             Status|WAITING                     |NEW
           Priority|P1                          |P3

--- Comment #31 from Alan Modra ---
Putting priority and severity back where they belong.
[Bug ld/22831] ld causes massive thrashing if object files are not fully memory-resident: new algorithm needed
https://sourceware.org/bugzilla/show_bug.cgi?id=22831

Dmitry Nezhevenko changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |dion at inhex dot net
[Bug ld/22831] ld causes massive thrashing if object files are not fully memory-resident: new algorithm needed
https://sourceware.org/bugzilla/show_bug.cgi?id=22831 --- Comment #30 from Luke Kenneth Casson Leighton --- cross-reference here, raised priority critical bug in the debian bugtracker as well: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=919882 -- You are receiving this mail because: You are on the CC list for the bug. ___ bug-binutils mailing list bug-binutils@gnu.org https://lists.gnu.org/mailman/listinfo/bug-binutils
[Bug ld/22831] ld causes massive thrashing if object files are not fully memory-resident: new algorithm needed
https://sourceware.org/bugzilla/show_bug.cgi?id=22831

--- Comment #29 from Luke Kenneth Casson Leighton ---
i tried the same massive 6GB link as was carried out under an i386 (32-bit) chroot. this time both of them succeeded.

ld-bfd with --no-keep-memory succeeded as before with a warning, using only 280mb during the linker phase (the number of functions called had been increased: python evil_linker_torture.py 800 400 20 800)

ld-gold *also* succeeded, once again requiring 6.5 GB of resident RAM to carry out the link [on a 64-bit system].

it would appear that the options recommended in comment #25 do not prevent ld-gold from mallocing memory equal to the full size of the target executable. consequently, attempting to link a 6 GB executable on a 32-bit system (with an obvious limit of 4GB) is guaranteed to fail.
[Bug ld/22831] ld causes massive thrashing if object files are not fully memory-resident: new algorithm needed
https://sourceware.org/bugzilla/show_bug.cgi?id=22831

--- Comment #28 from Luke Kenneth Casson Leighton ---
(In reply to Luke Kenneth Casson Leighton from comment #27)
> ld-bfd - with "--no-keep-memory" - only requires 750 MB of resident RAM, to
> link the exact same 6GB executable.

(and aside from a warning "i686-linux-gnu-ld.bfd: warning: cannot find entry symbol _start; defaulting to 08048094", succeeded.)

correction: i had added one too many "0s" onto the evil python command. however after correction, the results are exactly the same:

debian-i386-chroot$ python evil_linker_torture.py 80 40 20 8000

* make maing FAILs
* make main SUCCEEDs (except only using 85mb for the linker phase)

so this is far more complex and involved than it first appears.
[Bug ld/22831] ld causes massive thrashing if object files are not fully memory-resident: new algorithm needed
https://sourceware.org/bugzilla/show_bug.cgi?id=22831 Luke Kenneth Casson Leighton changed: What|Removed |Added Attachment #11522|0 |1 is obsolete|| --- Comment #27 from Luke Kenneth Casson Leighton --- Created attachment 11540 --> https://sourceware.org/bugzilla/attachment.cgi?id=11540&action=edit updated version of liinker torturer i decided to run an i386 debian chroot, using a variant of evil_linker_torture.py and running these options: $python evil_linker_torture.py 80 40 20 8 the results of the link (an error) are below: $ make -j8 maing i686-linux-gnu-ld.gold src0.o src1.o src10.o src11.o src12.o src13.o src14.o src15.o src16.o src17.o src18.o src19.o src2.o src20.o src21.o src22.o src23.o src24.o src25.o src26.o src27.o src28.o src29.o src3.o src30.o src31.o src32.o src33.o src34.o src35.o src36.o src37.o src38.o src39.o src4.o src40.o src41.o src42.o src43.o src44.o src45.o src46.o src47.o src48.o src49.o src5.o src50.o src51.o src52.o src53.o src54.o src55.o src56.o src57.o src58.o src59.o src6.o src60.o src61.o src62.o src63.o src64.o src65.o src66.o src67.o src68.o src69.o src7.o src70.o src71.o src72.o src73.o src74.o src75.o src76.o src77.o src78.o src79.o src8.o src9.o -g -g -g --no-mmap-output-file --no-map-whole-files --no-keep-files-mapped --no-keep-memory -o maing i686-linux-gnu-ld.gold: internal error in convert_types, at ../../gold/gold.h:192 this is with the following version: $ gold --version GNU gold (GNU Binutils for Debian 2.28) 1.14 the most likely reason is that the size of the executable is over 6GB, and a 32-bit version of gold cannot cope. when run on 64-bit it does fine, the only strange thing being that it still requires 7GB of resident RAM to link a 6GB executable. ld-bfd - with "--no-keep-memory" - only requires 750 MB of resident RAM, to link the exact same 6GB executable. -- You are receiving this mail because: You are on the CC list for the bug. 
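For anyone wanting to reproduce the resident-RAM figures quoted above (7GB for gold versus 750 MB for ld-bfd), one way to measure peak resident memory of a link step is sketched below. `peak_rss_kib` is a hypothetical helper; it assumes a POSIX system, and the KiB unit is what Linux reports (other systems use different units for `ru_maxrss`).

```python
import resource
import subprocess
import sys
from typing import List


def peak_rss_kib(argv: List[str]) -> int:
    """Run argv to completion and return the children's peak RSS.

    Note: RUSAGE_CHILDREN reports the largest resident set size over ALL
    waited-for children of this process so far, so call this from a fresh
    interpreter for each measurement. On Linux the unit is KiB.
    """
    subprocess.run(argv, check=False)
    return resource.getrusage(resource.RUSAGE_CHILDREN).ru_maxrss


if __name__ == "__main__":
    # Usage: python peak_rss.py make maing
    # With no arguments, measure a trivial child just to show the call works.
    command = sys.argv[1:] or [sys.executable, "-c", "pass"]
    print(f"peak RSS: {peak_rss_kib(command)} KiB")
```

Because the number is a maximum over child processes, a `make` that runs the compiler and the linker reports whichever of them peaked highest, so it is cleanest to run the link command on its own.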
[Bug ld/22831] ld causes massive thrashing if object files are not fully memory-resident: new algorithm needed
https://sourceware.org/bugzilla/show_bug.cgi?id=22831 --- Comment #26 from Luke Kenneth Casson Leighton --- (In reply to Ian Lance Taylor from comment #25) > When using gold the key options are --no-mmap-output-file > --no-map-whole-files --no-keep-files-mapped. Can you confirm that those > options--all of them together--were tried with gold? hi ian, as i mentioned to hj, i personally do not have safe resources (resource that would not be damaged by doing so). i am acting as a go-between, alerting various people to the nature of this bug. i will contact them and alert them to the options that you describe. -- You are receiving this mail because: You are on the CC list for the bug. ___ bug-binutils mailing list bug-binutils@gnu.org https://lists.gnu.org/mailman/listinfo/bug-binutils
[Bug ld/22831] ld causes massive thrashing if object files are not fully memory-resident: new algorithm needed
https://sourceware.org/bugzilla/show_bug.cgi?id=22831

Ian Lance Taylor changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |ian at airs dot com

--- Comment #25 from Ian Lance Taylor ---
When using gold the key options are --no-mmap-output-file --no-map-whole-files --no-keep-files-mapped. Can you confirm that those options--all of them together--were tried with gold?
[Bug ld/22831] ld causes massive thrashing if object files are not fully memory-resident: new algorithm needed
https://sourceware.org/bugzilla/show_bug.cgi?id=22831 --- Comment #24 from Luke Kenneth Casson Leighton --- hiya nick, thanks for trying out the torture program. basically the parameters there generate a 6.1mb object file (with gcc 7.3), and 3000x that equals an 18 gbytes executable. so, it's possible to work out what needs to be done: increase the 2nd or 3rd parameter directly proportionately so as to ensure that the object file increases to where the available RAM will be exceeded. regarding ld-gold: https://lists.debian.org/debian-devel/2019/01/msg00069.html so no, it doesn't work. mike hommey tried gnu gold for firefox on debian 32-bit: everything he's tried has failed. that leaves cross-compiling using a 64-bit system as literally *the* only option (which is completely unacceptable as a band-aid "solution") regarding "-g -g -g": it increases the amount of debug information, and consequently is a quick-hack way to increase the size of the output binary. regarding the evil idea of letting the limit be hit and weeding out applications that try it, on the basis that it's pretty insane to have such massive static executables: i really like it :) ... except... the first casualty is already being hit, and that's *all* 32-bit hardware. armhf, armel, i686, MIPS32 and a few more besides. all distros supporting 32-bit hardware are currently going through hell, and/or are *DROPPING* 32-bit support entirely, whilst 64-bit hardware continues to "accept" the insane inexorable increase in static executable size. so, perfectly good 32-bit hardware is being thrown into landfill because there's absolutely no way they can get hold of a modern distro that works on it... ... all because of this one bug that dates back to a short-sighted decision from the late 1990s. hence why i raised this to priority one critical level a couple of days ago. -- You are receiving this mail because: You are on the CC list for the bug. 
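The sizing arithmetic in the comment above (a 6.1mb object file, times 3000, giving roughly 18 gbytes) can be written out as below. The helper names are made up, decimal megabytes and gigabytes are an assumption, and treating the output size as a simple product of object size and count is only the rough estimate the comment itself uses.

```python
MB = 10**6  # assumption: decimal units
GB = 10**9


def estimated_link_bytes(object_bytes: float, copies: int) -> float:
    """Rough size of the linked output: object size times number of objects."""
    return object_bytes * copies


def copies_to_exceed(ram_bytes: float, object_bytes: float) -> int:
    """Smallest object count whose combined size is strictly above ram_bytes."""
    return int(ram_bytes // object_bytes) + 1


if __name__ == "__main__":
    total = estimated_link_bytes(6.1 * MB, 3000)
    print(f"{total / GB:.1f} GB")            # about 18 GB, as in the comment
    print(copies_to_exceed(16 * GB, 6.1 * MB))
```

The second helper turns the question around: given a machine's RAM, it says how many such objects are needed before the estimate passes it.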
[Bug ld/22831] ld causes massive thrashing if object files are not fully memory-resident: new algorithm needed
https://sourceware.org/bugzilla/show_bug.cgi?id=22831

Nick Clifton changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |nickc at redhat dot com

--- Comment #23 from Nick Clifton ---
(In reply to Luke Kenneth Casson Leighton from comment #22)

Hi Luke,

> $ python evil_linker_torture.py 3000 400 200 50

Actually this ran OK on my system. Admittedly it is a fairly big machine, and I am sure that you could suggest increased parameters that would bring it to its knees.

I was a little bit confused as to why the "-g" flag appears three times in both CFLAGS and LDFLAGS. Is this really necessary?

Anyway, my main question is: have you tried using the gold linker instead of the bfd linker? (i.e. adding "-fuse-ld=gold" to the final command line). The reason being that the bfd linker is very old, and it is not wholly surprising that it does not cope well with modern, very large, binaries. The gold linker on the other hand is new, it has been designed from the ground up with large ELF programs in mind, and it does not have any of the cruft that encumbers the bfd linker.

Cheers
Nick

PS. Waving my "devil's advocate" flag for a moment. It could be argued that not linking these gigantic binaries might actually be a good thing, as they are getting ridiculously large. Such binaries are going to take a huge amount of time (and resources) to link, and if linkers were to refuse to link them, then the programmers might have to rethink their monolithic approach and maybe come up with a more modular design. Which might not be a bad thing at all...
[Bug ld/22831] ld causes massive thrashing if object files are not fully memory-resident: new algorithm needed
https://sourceware.org/bugzilla/show_bug.cgi?id=22831 --- Comment #22 from Luke Kenneth Casson Leighton --- Created attachment 11522 --> https://sourceware.org/bugzilla/attachment.cgi?id=11522&action=edit repro test case attached is a test file that can generate a Makefile and associated header and c files that will easily exceed the capacity of a 64-bit system to cope with. here are arguments to the script that will cause GNU ld to attempt to create an EIGHTEEN GIGABYTE executable. $ python evil_linker_torture.py 3000 400 200 50 a 32-bit system will be completely unable to cope with this, as it hopelessly exceeds the 4GB resident limit by 450%. when compiled on a 64-bit system it was necessary to terminate it with prejudice, as by the time it got to 9.5GB resident memory it was in danger of putting the compile host into severe and irrecoverable swap thrashing. if the Makefile is modified to include the option "-Wl,--no-keep-memory", the following output is generated and the errors result in the link phase terminating unsuccessfully. 
ld: warning: cannot find entry symbol _start; defaulting to 00401000
ld: src9.o: in function `fn_9_0':
/home/lkcl/src/ld_torture/src9.c:3006:(.text+0x27): relocation truncated to fit: R_X86_64_PLT32 against symbol `fn_1149_322' defined in .text section in src1149.o
ld: /home/lkcl/src/ld_torture/src9.c:3008:(.text+0x41): relocation truncated to fit: R_X86_64_PLT32 against symbol `fn_1387_379' defined in .text section in src1387.o
ld: /home/lkcl/src/ld_torture/src9.c:3014:(.text+0x8f): relocation truncated to fit: R_X86_64_PLT32 against symbol `fn_1821_295' defined in .text section in src1821.o
ld: /home/lkcl/src/ld_torture/src9.c:3015:(.text+0x9c): relocation truncated to fit: R_X86_64_PLT32 against symbol `fn_1082_189' defined in .text section in src1082.o
ld: /home/lkcl/src/ld_torture/src9.c:3016:(.text+0xa9): relocation truncated to fit: R_X86_64_PLT32 against symbol `fn_183_330' defined in .text section in src183.o
ld: /home/lkcl/src/ld_torture/src9.c:3024:(.text+0x111): relocation truncated to fit: R_X86_64_PLT32 against symbol `fn_162_394' defined in .text section in src162.o
ld: /home/lkcl/src/ld_torture/src9.c:3026:(.text+0x12b): relocation truncated to fit: R_X86_64_PLT32 against symbol `fn_132_235' defined in .text section in src132.o
ld: /home/lkcl/src/ld_torture/src9.c:3028:(.text+0x145): relocation truncated to fit: R_X86_64_PLT32 against symbol `fn_1528_316' defined in .text section in src1528.o
ld: /home/lkcl/src/ld_torture/src9.c:3029:(.text+0x152): relocation truncated to fit: R_X86_64_PLT32 against symbol `fn_1178_357' defined in .text section in src1178.o
ld: /home/lkcl/src/ld_torture/src9.c:3031:(.text+0x16c): relocation truncated to fit: R_X86_64_PLT32 against symbol `fn_1180_278' defined in .text section in src1180.o
ld: /home/lkcl/src/ld_torture/src9.c:3035:(.text+0x1a0): additional relocation overflows omitted from the output
^Cmake: *** Deleting file `main'
make: *** [main] Interrupt
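A note on the "relocation truncated to fit: R_X86_64_PLT32" messages: that relocation stores a signed 32-bit PC-relative displacement, so a call can only reach targets within roughly 2 GiB of the call site. In a multi-gigabyte .text that range is easily exceeded. The check can be sketched as follows; `fits_pc32` and the addresses are made up for illustration, and the default addend of -4 is the value typically seen on x86-64 call instructions, assumed here rather than taken from the log.

```python
INT32_MIN = -(2**31)
INT32_MAX = 2**31 - 1


def fits_pc32(target: int, place: int, addend: int = -4) -> bool:
    """True if (target + addend - place) fits in a signed 32-bit field.

    target: final address of the called function
    place:  address of the 4-byte field being patched
    addend: relocation addend (assumed -4, typical for x86-64 calls)
    """
    value = target + addend - place
    return INT32_MIN <= value <= INT32_MAX


if __name__ == "__main__":
    call_site = 0x401000                      # hypothetical address
    one_gib, three_gib = 2**30, 3 * 2**30
    print(fits_pc32(call_site + one_gib, call_site))    # within reach
    print(fits_pc32(call_site + three_gib, call_site))  # would be truncated
```

This is separate from the memory problem: even with unlimited RAM, calls between functions more than about 2 GiB apart cannot be encoded with this relocation type.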
[Bug ld/22831] ld causes massive thrashing if object files are not fully memory-resident: new algorithm needed
https://sourceware.org/bugzilla/show_bug.cgi?id=22831

Luke Kenneth Casson Leighton changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Priority|P2                          |P1
           Severity|normal                      |critical
[Bug ld/22831] ld causes massive thrashing if object files are not fully memory-resident: new algorithm needed
https://sourceware.org/bugzilla/show_bug.cgi?id=22831 --- Comment #21 from Luke Kenneth Casson Leighton --- to emphasise that this is strategically becoming an absolutely critical bug: https://lists.debian.org/debian-devel/2019/01/msg00081.html here it has been reported that even when using -Wl,--no-keep-memory, firefox completely fails to build on a 32-bit system. the debian developers are presently testing cross-compiling 32-bit packages from 64-bit hosts. they report that ubuntu has *already* moved over to this procedure. 32 bit distributions are no longer self-hosting. this bug is now a priority 1 critical bug. -- You are receiving this mail because: You are on the CC list for the bug. ___ bug-binutils mailing list bug-binutils@gnu.org https://lists.gnu.org/mailman/listinfo/bug-binutils
[Bug ld/22831] ld causes massive thrashing if object files are not fully memory-resident: new algorithm needed
https://sourceware.org/bugzilla/show_bug.cgi?id=22831 --- Comment #20 from Luke Kenneth Casson Leighton --- ok so i spoke to dr stallman a couple of weeks ago, and he confirmed that code that is near-identical to that which i described in the very first comment of this bugreport was REMOVED some time in the late 1990s, by persons not familiar with the type of issues that linking has to deal with. the original code that dr stallman wrote did two things: (1) checked to make absolutely sure that it stayed within the bounds of RESIDENT available memory, if it could. (2) that it ONLY loaded into memory the maximum number of object files that would ensure that it remained within bounds of resident available memory, if it could. this code is essential to research and restore its functionality. this is NOT a 32-bit-only problem. -- You are receiving this mail because: You are on the CC list for the bug. ___ bug-binutils mailing list bug-binutils@gnu.org https://lists.gnu.org/mailman/listinfo/bug-binutils
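Point (2) above, loading only as many object files as fit in resident memory, can be sketched as a simple greedy batching step. This is an illustration of the idea only, not the removed code: `plan_batches` is a hypothetical helper, and it uses on-disk file size as a stand-in for the memory an object needs once loaded.

```python
from typing import Iterable, List, Tuple


def plan_batches(objects: Iterable[Tuple[str, int]],
                 budget_bytes: int) -> List[List[str]]:
    """Group (name, size) pairs, in order, so each batch fits the budget.

    A single object larger than the budget still gets a batch of its own,
    matching the "if it could" caveat: the limit is best effort.
    """
    batches: List[List[str]] = []
    current: List[str] = []
    used = 0
    for name, size in objects:
        if current and used + size > budget_bytes:
            batches.append(current)   # close the batch before it overflows
            current, used = [], 0
        current.append(name)
        used += size
    if current:
        batches.append(current)
    return batches


if __name__ == "__main__":
    objs = [("src0.o", 4), ("src1.o", 4), ("src2.o", 4)]
    print(plan_batches(objs, budget_bytes=8))
```

The budget passed in would have to come from a measurement of currently available resident memory rather than a constant for this to match point (1).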
[Bug ld/22831] ld causes massive thrashing if object files are not fully memory-resident: new algorithm needed
https://sourceware.org/bugzilla/show_bug.cgi?id=22831

--- Comment #19 from Luke Kenneth Casson Leighton ---
(In reply to H.J. Lu from comment #18)
> (In reply to Luke Kenneth Casson Leighton from comment #17)
> > https://issues.guix.info/issue/33676
> > 
> > so we have a successful report that the advised option helps.
> > 
> 
> Have you tried my users/hjl/pr18028 branch?

as i mentioned before, i (personally) do not have the resources to try anything out: i am acting as a go-between, to find people who *can* try out different branches.

i took a look at the diffs:
https://github.com/hjl-tools/binutils-gdb/compare/users/hjl/pr18028#diff-e65a96fc956244cba3a031705b7b737aR3484

some comments:

bfd/linker.c line 3492 - i see what's going on. this is great, it *in principle* makes sure that the amount of memory used is not exceeded.

bfd/linker.c line 3484 - this is completely arbitrary. this is NOT repeat NOT, as i have already said, and repeat, NOT limited to 32 bit. 64-bit systems ALSO HAVE THE EXACT SAME PROBLEM. this test needs to be removed.

ld/ldmain.c: line 275 - specifying half the memory is arbitrary. so, as i said: it is not enough. what if the amount of memory used by other programs exceeds half the available memory?

conditions where that will occur immediately: make -j2. one ld process will take half the memory, the other ld process will take half the memory. now BOTH processes will enter thrashing.

as i said, right in the original report: it is necessary to DYNAMICALLY check the amount of available memory, just like gcc does. in that way, ld will remain DYNAMICALLY under the limit, it will STAY in resident memory.

ld must be prevented from going into swap space, at all costs, basically.
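To illustrate the difference between a fixed "half of RAM" limit and a dynamic check, the sketch below reads the kernel's own estimate from /proc/meminfo on Linux. The function names are made up, and this is not what the pr18028 branch or gcc does; it only shows that "available right now" and "half of total" can be very different numbers.

```python
def mem_available_bytes(meminfo_text: str) -> int:
    """Extract MemAvailable (reported in kB) from /proc/meminfo contents."""
    for line in meminfo_text.splitlines():
        if line.startswith("MemAvailable:"):
            return int(line.split()[1]) * 1024
    raise ValueError("MemAvailable not found")


def read_mem_available() -> int:
    """Linux only: how much memory could be used right now without swapping."""
    with open("/proc/meminfo") as f:
        return mem_available_bytes(f.read())


def fixed_half_budget(total_bytes: int) -> int:
    """The static policy being criticised: always allow half of total RAM."""
    return total_bytes // 2


if __name__ == "__main__":
    GIB = 2**30
    total, available = 16 * GIB, 6 * GIB   # hypothetical machine, 10 GiB in use
    static = fixed_half_budget(total)
    print(f"static budget {static // GIB} GiB, available {available // GIB} GiB")
    print("static budget would push into swap:", static > available)
```

A linker using this approach would need to re-read the value during the link rather than once at startup, since a parallel `make -j2` neighbour can consume the headroom at any moment.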
[Bug ld/22831] ld causes massive thrashing if object files are not fully memory-resident: new algorithm needed
https://sourceware.org/bugzilla/show_bug.cgi?id=22831 --- Comment #18 from H.J. Lu --- (In reply to Luke Kenneth Casson Leighton from comment #17) > https://issues.guix.info/issue/33676 > > so we have a successful report that the advised option helps. > Have you tried my users/hjl/pr18028 branch? -- You are receiving this mail because: You are on the CC list for the bug. ___ bug-binutils mailing list bug-binutils@gnu.org https://lists.gnu.org/mailman/listinfo/bug-binutils
[Bug ld/22831] ld causes massive thrashing if object files are not fully memory-resident: new algorithm needed
https://sourceware.org/bugzilla/show_bug.cgi?id=22831 --- Comment #17 from Luke Kenneth Casson Leighton --- https://issues.guix.info/issue/33676 so we have a successful report that the advised option helps. please note: the advised option is **NOT** repeat **NOT** a solution. destroying all of the memory and throwing away useful information cannot possibly be called a "solution". ld *really does* need to make *optimal* use of memory, by restoring the techniques that were used decades ago, and to use dynamic analysis of the amount of available *RESIDENT* memory, so as to very very specifically avoid swapping. -- You are receiving this mail because: You are on the CC list for the bug. ___ bug-binutils mailing list bug-binutils@gnu.org https://lists.gnu.org/mailman/listinfo/bug-binutils
[Bug ld/22831] ld causes massive thrashing if object files are not fully memory-resident: new algorithm needed
https://sourceware.org/bugzilla/show_bug.cgi?id=22831

--- Comment #16 from Luke Kenneth Casson Leighton ---
the following came up in a debian discussion and is copied here:

Florian Weimer 8:31 PM (14 minutes ago)
to Luke, Steve, ARM, debian-release, debian-admin, team, debian-gcc, debian-glibc

* Luke Kenneth Casson Leighton:

> that is not a surprise to hear: the massive thrashing caused by the
> linker phase not being possible to be RAM-resident will be absolutely
> hammering the drives beyond reasonable wear-and-tear limits. which is
> why i'm recommending people try "-Wl,--no-keep-memory".

Note that ld will sometimes stuff everything into a single RWX segment as a result, which is not desirable.

Unfortunately, without significant investment into historic linker technologies (with external sorting and that kind of stuff), I don't think it is viable to build 32-bit software natively in the near future. Maybe next year only a few packages will need exceptions, but the number will grow with each month. Building on 64-bit kernels will delay the inevitable because more address space is available to user space, but that's probably 12 to 18 month extended life-time for native building.
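For readers unfamiliar with the "external sorting" mentioned in the quoted mail, the following is a minimal generic sketch of the technique: sort fixed-size chunks in memory, write each sorted run to a temporary file, then merge the runs while holding only one item per run. It is not code from any ld version; the function name `external_sort`, the `max_in_memory` parameter, and the use of symbol-like strings as input are assumptions made for illustration. Items must not contain newlines.

```python
import heapq
import tempfile


def external_sort(items, max_in_memory):
    """Yield the strings in `items` in sorted order.

    At most `max_in_memory` items are held in a list at once; sorted
    runs are spilled to temporary files and merged lazily.
    """
    runs = []
    chunk = []

    def flush():
        # Sort the current chunk and write it out as one run.
        chunk.sort()
        run = tempfile.TemporaryFile(mode="w+")
        run.writelines(s + "\n" for s in chunk)
        run.seek(0)
        runs.append(run)
        chunk.clear()

    for item in items:
        chunk.append(item)
        if len(chunk) >= max_in_memory:
            flush()
    if chunk:
        flush()

    try:
        # heapq.merge reads each run lazily, one line at a time.
        streams = [(line.rstrip("\n") for line in run) for run in runs]
        yield from heapq.merge(*streams)
    finally:
        for run in runs:
            run.close()


names = ["main", "printf", "_start", "memcpy", "abort", "exit", "zlib_init"]
print(list(external_sort(names, max_in_memory=3)))
```

The trade-off is the one the thread keeps returning to: extra sequential disk I/O under the program's own control, in exchange for a memory footprint that does not grow with the input.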
[Bug ld/22831] ld causes massive thrashing if object files are not fully memory-resident: new algorithm needed
https://sourceware.org/bugzilla/show_bug.cgi?id=22831

--- Comment #15 from Luke Kenneth Casson Leighton ---
(In reply to H.J. Lu from comment #14)
> (In reply to Luke Kenneth Casson Leighton from comment #13)
> > i have 16 GB of DDR4 2400 mhz RAM on my laptop... and because when
> > that system goes into swap (it has an NVMe) its loadavg goes over 120
> > and it is absolutely guaranteed to crash about 30 seconds later,
> > adding more RAM is *not* the solution.
> >
> > however much more RAM is added, there *will* be a piece of software
> > within 1-5 years which requires more RAM for the linker phase than any
> > system provides.
> >
>
> Please try if "-Wl,--no-keep-memory" works.

i'll alert some people and see if they are in a position to try that.
[Bug ld/22831] ld causes massive thrashing if object files are not fully memory-resident: new algorithm needed
https://sourceware.org/bugzilla/show_bug.cgi?id=22831

--- Comment #14 from H.J. Lu ---
(In reply to Luke Kenneth Casson Leighton from comment #13)
> i have 16 GB of DDR4 2400 mhz RAM on my laptop... and because when
> that system goes into swap (it has an NVMe) its loadavg goes over 120
> and it is absolutely guaranteed to crash about 30 seconds later,
> adding more RAM is *not* the solution.
>
> however much more RAM is added, there *will* be a piece of software
> within 1-5 years which requires more RAM for the linker phase than any
> system provides.
>

Please try if "-Wl,--no-keep-memory" works.
[Bug ld/22831] ld causes massive thrashing if object files are not fully memory-resident: new algorithm needed
https://sourceware.org/bugzilla/show_bug.cgi?id=22831

--- Comment #13 from Luke Kenneth Casson Leighton ---
On Wed, Mar 14, 2018 at 12:26 PM, hjl.tools at gmail dot com wrote:
> https://sourceware.org/bugzilla/show_bug.cgi?id=22831
>
> --- Comment #12 from H.J. Lu ---
> (In reply to Luke Kenneth Casson Leighton from comment #11)
>> (In reply to H.J. Lu from comment #10)
>> there are two issues:
>>
>> 1. 32-bit system
>> 2. 64-bit system
>>
>> both 32-bit and 64-bit are affected by this issue.
>>
>> the patch that you wrote however looks like it only addresses
>> 32-bit.
>
> True. My patch is a starting point. I'd like to know if it helps
> 32-bit system or not. If it doesn't address the issue for 32-bit
> system, my approach won't for 64-bit system.

unfortunately i cannot risk damaging my system by carrying out any tests (because any tests will result in a loadavg over 120 and 30 seconds later it is guaranteed to hard crash). so we will have to wait for someone else to test the patch.

>> that leaves 64-bit systems still affected.
>
> You can always and should get more RAM for 64-bit system.

i have 16 GB of DDR4 2400 mhz RAM on my laptop... and because when that system goes into swap (it has an NVMe) its loadavg goes over 120 and it is absolutely guaranteed to crash about 30 seconds later, adding more RAM is *not* the solution.

however much more RAM is added, there *will* be a piece of software within 1-5 years which requires more RAM for the linker phase than any system provides.

how does gcc do compilation? how does it stay within the bounds of available memory?

l.
[Bug ld/22831] ld causes massive thrashing if object files are not fully memory-resident: new algorithm needed
https://sourceware.org/bugzilla/show_bug.cgi?id=22831

--- Comment #12 from H.J. Lu ---
(In reply to Luke Kenneth Casson Leighton from comment #11)
> (In reply to H.J. Lu from comment #10)
> there are two issues:
>
> 1. 32-bit system
> 2. 64-bit system
>
> both 32-bit and 64-bit are affected by this issue.
>
> the patch that you wrote however looks like it only addresses
> 32-bit.

True. My patch is a starting point. I'd like to know if it helps 32-bit system or not. If it doesn't address the issue for 32-bit system, my approach won't for 64-bit system.

> that leaves 64-bit systems still affected.

You can always and should get more RAM for 64-bit system. But 32-bit system is limited by address space.
[Bug ld/22831] ld causes massive thrashing if object files are not fully memory-resident: new algorithm needed
https://sourceware.org/bugzilla/show_bug.cgi?id=22831

--- Comment #10 from H.J. Lu ---
(In reply to Luke Kenneth Casson Leighton from comment #9)
> (In reply to H.J. Lu from comment #8)
>
> > Have you tried users/hjl/pr18028 branch?
>
> no, hj, i have not, because it is a fix for a 32-bit system,
> not a 64-bit system. what do you need to know to make it
> clear that this is a problem that occurs on a 64-bit system
> as well as a 32-bit system?

What is your main issue?
32-bit system or 64-bit system?
[Bug ld/22831] ld causes massive thrashing if object files are not fully memory-resident: new algorithm needed
https://sourceware.org/bugzilla/show_bug.cgi?id=22831

--- Comment #11 from Luke Kenneth Casson Leighton ---
(In reply to H.J. Lu from comment #10)
> What is your main issue?

i do not (personally) have an issue, hj. this is a flaw that is independent of me (personally). do you mean to ask, "what is THE main issue?"

> 32-bit system or 64-bit system?

there are two issues:

1. 32-bit system
2. 64-bit system

both 32-bit and 64-bit are affected by this issue.

the patch that you wrote however looks like it only addresses 32-bit.

that leaves 64-bit systems still affected.
[Bug ld/22831] ld causes massive thrashing if object files are not fully memory-resident: new algorithm needed
https://sourceware.org/bugzilla/show_bug.cgi?id=22831

--- Comment #9 from Luke Kenneth Casson Leighton ---
(In reply to H.J. Lu from comment #8)

> Have you tried users/hjl/pr18028 branch?

no, hj, i have not, because it is a fix for a 32-bit system, not a 64-bit system. what do you need to know to make it clear that this is a problem that occurs on a 64-bit system as well as a 32-bit system?

is there a particular reason why you are not answering my questions? in particular can i refer you to my questions at the end of comment #6? i am trying to understand if there is anything unclear about my questions.
[Bug ld/22831] ld causes massive thrashing if object files are not fully memory-resident: new algorithm needed
https://sourceware.org/bugzilla/show_bug.cgi?id=22831

--- Comment #8 from H.J. Lu ---
(In reply to Luke Kenneth Casson Leighton from comment #7)
> hi hjl,
>
> so how are you getting on with analysing this problem? is there anything
> that is unclear that i can assist you with understanding?

Have you tried users/hjl/pr18028 branch?
[Bug ld/22831] ld causes massive thrashing if object files are not fully memory-resident: new algorithm needed
https://sourceware.org/bugzilla/show_bug.cgi?id=22831

--- Comment #7 from Luke Kenneth Casson Leighton ---
hi hjl,

so how are you getting on with analysing this problem? is there anything that is unclear that i can assist you with understanding?
[Bug ld/22831] ld causes massive thrashing if object files are not fully memory-resident: new algorithm needed
https://sourceware.org/bugzilla/show_bug.cgi?id=22831

--- Comment #6 from Luke Kenneth Casson Leighton ---
(In reply to H.J. Lu from comment #5)
> Please read my suggestion again and follow it to the letter.

sorry, hjl, i appreciate you're busy so are providing extremely short responses: please read again what i wrote. i am *not* the person installing or running this. i am acting merely as a *messenger* after seeing and experiencing reports from at least FIVE separate teams over the past SIX years of increasingly difficult build problems due to this increasingly-important bug.

i am NOT the person who will be running any of the suggestions that you are giving (because my laptop will potentially be damaged by doing so and i cannot risk that), i will be RELAYING the suggestions to various people across the internet, making them AWARE that you are willing to tackle this particular problem.

therefore i require and seek CLARITY on EXACTLY what it is that i am going to tell people, BEFORE suggesting to them that they come and look at this bugreport. is there anything that is unreasonable about that? if so, please let me know.

https://github.com/hjl-tools/binutils-gdb/commit/de060bbcc7cca9dce213dc6593887a8e

ok so after re-reading twice, i eventually spotted the (misordered) branch name. can i suggest in future, rather than refer to the main branch, to instead post people the link *directly* to the branch, like this?

https://github.com/hjl-tools/binutils-gdb/tree/users/hjl/pr18028

it was a simple mistake, much more helpful to say "you missed that i suggested trying a branch named xyz".

now, i took a quick look, and there is an assumption in the patch, that the problem will *exclusively* occur on 32-bit systems. this is not the case. there are actually *two* inter-related problems.

the FIRST is that the amount of memory used for linking e.g. firefox is so insane that it now requires 7 GIGABYTES of resident memory in order to avoid thrashing... this is simply impossible to do on a 32-bit system.

the SECOND is that the linker phase GOES INTO THRASHING IN THE FIRST PLACE and has done for many many years now INCLUDING on 64-BIT SYSTEMS.

if you read the original bug-report you will see that i said that one 64-bit x86 laptop that i had, 6 years ago, only had 2GB of RAM. the one that i have now has *16* GB of RAM but because it is an NVMe SSD and an ultra-expensive laptop (USD $2500) i cannot risk the NVMe drive getting damaged so swap is *DISABLED*. despite this, it still goes into total meltdown (loadavg over 100) whenever memory usage approaches 16GB.

for both these systems - both of them *64-bit* systems *NOT* 32-bit systems - going into swap-space is an absolute unmitigated disaster, but this is now considered to be NORMAL that a build should go from taking about 1 hour to link if it is below the 100% resident memory usage threshold to taking SEVERAL DAYS in some cases if it goes even the TINIEST FRACTION above the available resident memory... because distros *do not have any choice in the matter*.

this is why i suggested the algorithm above, because the algorithm above was part of an exercise set by extremely competent lecturers at Imperial College University during an era where available memory was a tiny fraction of what it is now, and running in virtual memory was simply flat-out inconceivable because most systems were still 16-bit let alone 32-bit.

so. questions:

1) would the proposed patch - which reduces virtual memory usage for 32-bit systems to half that of the available memory - *actually* fix the problem as described on a *64-bit* system?

2) what would happen if *more than half* of the available virtual memory is taken up by programs that happen to be running at the same time as the linker phase? consider the cases where, in complex builds, there may be a REALLY LARGE chain of applications that have spawned any given usage of the ld executable, such that the expectation that there will *be* half of the total amount of virtual memory space even *available* is not actually true.
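Question 2 can be made concrete with a little arithmetic. The sketch below is a toy model, not a description of the actual patch: it assumes each linker process is allowed to fill a fixed fraction of total RAM and really does so, and it ignores kernel overhead. The function name `overcommit_gib` and all figures are hypothetical.

```python
def overcommit_gib(total_gib, other_gib, n_linkers, fraction=0.5):
    """Return how many GiB demand exceeds physical RAM by, assuming each
    of n_linkers fills `fraction` of total RAM while other processes
    hold other_gib. A result of 0.0 means everything fits."""
    demand = other_gib + n_linkers * fraction * total_gib
    return max(0.0, demand - total_gib)


# 16 GiB machine, 2 GiB used by the rest of the build:
print(overcommit_gib(16, 2, n_linkers=1))  # one linker fits
print(overcommit_gib(16, 2, n_linkers=2))  # two linkers (make -j2) do not
print(overcommit_gib(16, 9, n_linkers=1))  # one linker, busy machine
```

Under this model a static "half of memory" cap only avoids swap when a single linker runs and everything else on the machine stays under the other half, which is the gap the question points at.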
[Bug ld/22831] ld causes massive thrashing if object files are not fully memory-resident: new algorithm needed
https://sourceware.org/bugzilla/show_bug.cgi?id=22831

--- Comment #5 from H.J. Lu ---
(In reply to Luke Kenneth Casson Leighton from comment #4)
> (In reply to H.J. Lu from comment #3)
> > Please try users/hjl/pr18028 branch at
> >
> > https://github.com/hjl-tools/binutils-gdb
>
> did i miss something? is there another patch in that branch which
> i did not see?

Please read my suggestion again and follow it to the letter.
[Bug ld/22831] ld causes massive thrashing if object files are not fully memory-resident: new algorithm needed
https://sourceware.org/bugzilla/show_bug.cgi?id=22831

--- Comment #4 from Luke Kenneth Casson Leighton ---
(In reply to H.J. Lu from comment #3)
> Please try users/hjl/pr18028 branch at
>
> https://github.com/hjl-tools/binutils-gdb

hi hjl, i will point some people at this, it may be some time as one of them is the debian-riscv team, they may be maintaining special patches so it might not be as straightforward as just cloning the above branch.

others severely affected include armhf systems (max 2GB RAM, 32-bit) and i note that the patch from a couple of days ago mentions "enabled by default on x86", is that correct? what options would be needed to try this out?

also i note from the patch commit message it says "change maximum page size", how would that stop severe / critical thrashing? how would it reduce memory usage to only that which is available on the actual system (like when gcc performs compiles, it only uses available memory)?

did i miss something? is there another patch in that branch which i did not see?
[Bug ld/22831] ld causes massive thrashing if object files are not fully memory-resident: new algorithm needed
https://sourceware.org/bugzilla/show_bug.cgi?id=22831

--- Comment #3 from H.J. Lu ---
Please try users/hjl/pr18028 branch at

https://github.com/hjl-tools/binutils-gdb
[Bug ld/22831] ld causes massive thrashing if object files are not fully memory-resident: new algorithm needed
https://sourceware.org/bugzilla/show_bug.cgi?id=22831

H.J. Lu changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |hjl.tools at gmail dot com
[Bug ld/22831] ld causes massive thrashing if object files are not fully memory-resident: new algorithm needed
https://sourceware.org/bugzilla/show_bug.cgi?id=22831

--- Comment #2 from Luke Kenneth Casson Leighton ---
hi HJ, thanks for that advice - bear in mind that i am not actually directly involved in any of the projects that are experiencing these insane levels of thrashing. we (you and i) are therefore talking "to the general wider internet".

so, for anyone *finding* this bugreport, HJ is recommending that, if you are a build maintainer and the linker phase is going into insane thrashing, that you try 2.30 and the option HJ recommends.

now. here's the thing, HJ: distros have to "fix" the version of binutils and make it the default / standard for sometimes up to 18 months. also, there's no *guarantee* that they will ever hear about this option.

can i recommend, if reports start coming in that it works, that this option be either enabled *by default*... or... that instead, there be an "auto-resident-memory detect" option similar to that used in gcc, where it detects available free resident RAM for compiling and uses that and that alone?

... and then when *that* is stable... make *that* the default.

what do you think?
[Bug ld/22831] ld causes massive thrashing if object files are not fully memory-resident: new algorithm needed
https://sourceware.org/bugzilla/show_bug.cgi?id=22831

H.J. Lu changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |WAITING
   Last reconfirmed|                            |2018-02-11
            Version|unspecified                 |2.31 (HEAD)
     Ever confirmed|0                           |1

--- Comment #1 from H.J. Lu ---
Please try binutils 2.30 with "-Wl,--no-keep-memory".
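For build maintainers wiring the suggested option into a build script, a minimal sketch of where it goes on the command line follows. `-Wl,` is the compiler driver's prefix for handing an option to the linker, so the linker itself receives `--no-keep-memory`. The helper name `link_command`, the `gcc` driver default, and the file names are hypothetical; the sketch only builds the argument list and does not run anything.

```python
def link_command(objects, output, driver="gcc", no_keep_memory=True):
    """Build the argument list for a link step, optionally asking the
    linker (via the driver's -Wl, prefix) not to keep memory caches."""
    cmd = [driver, "-o", output, *objects]
    if no_keep_memory:
        cmd.append("-Wl,--no-keep-memory")
    return cmd


cmd = link_command(["main.o", "util.o"], "app")
print(" ".join(cmd))
```

The resulting list can be passed to `subprocess.run` as-is. As comment #17 stresses, this is a workaround that trades link speed for a smaller footprint, not a fix for the underlying behaviour.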