Bug#900173: git-annex: internal error: evacuate: strange closure type 4325404
On 06/06/2018 08:54 PM, Joey Hess wrote:
> That actually makes some kind of sense, since this bug has something to do
> with garbage collection, and THP may result in different memory allocation
> patterns due to the changed page size.
>
> Except.. Isn't THP enabled by default on most systems?

Indeed, it appears you're right: it defaults to on nowadays. It didn't use to (e.g., I checked, and it's not enabled by default in Stretch).

Also, I left the VM repeatedly starting and stopping the assistant overnight; that one failure I got was the only one it saw, out of many thousands of attempts. (And unfortunately, "power cycling" the VM didn't produce another, so it wasn't just the first run.)

So back to the drawing board, I guess ☹
Bug#900173: git-annex: internal error: evacuate: strange closure type 4325404
Anthony DeRobertis wrote:
> I finally managed to reproduce once on my VM after turning on transparent
> hugepages (which both my workstations are running with). The crash rate is
> much lower (under 10%, vs. 80–90% on the workstations), but...
>
> Anyway, the next thing I plan to test is to turn off transparent hugepages
> on a workstation and see if that avoids the issue. I have *no idea* why it
> matters, and even less so why a rebuild fixes it. Might be a bit, as that
> requires rebooting.

That actually makes some kind of sense, since this bug has something to do with garbage collection, and THP may result in different memory allocation patterns due to the changed page size.

Except.. Isn't THP enabled by default on most systems?

-- 
see shy jo
Bug#900173: git-annex: internal error: evacuate: strange closure type 4325404
I finally managed to reproduce once on my VM after turning on transparent hugepages (which both my workstations are running with). The crash rate is much lower (under 10%, vs. 80–90% on the workstations), but...

Anyway, the next thing I plan to test is to turn off transparent hugepages on a workstation and see if that avoids the issue. I have *no idea* why it matters, and even less so why a rebuild fixes it. Might be a bit, as that requires rebooting.
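For anyone comparing machines, the current THP setting can be read from sysfs. This is a minimal sketch, assuming a Linux kernel that exposes /sys/kernel/mm/transparent_hugepage/enabled; the helper name thp_active_mode is made up for illustration and is not part of any tool mentioned in this thread.

```shell
#!/bin/sh
# Print the active mode from a THP "enabled" line such as
# "always [madvise] never"; the kernel marks the active choice with brackets.
# (thp_active_mode is a hypothetical helper name.)
thp_active_mode() {
    printf '%s\n' "$1" | sed -n 's/.*\[\([a-z]*\)].*/\1/p'
}

# Usual sysfs location on kernels built with THP support.
THP_FILE=/sys/kernel/mm/transparent_hugepage/enabled

if [ -r "$THP_FILE" ]; then
    echo "THP mode: $(thp_active_mode "$(cat "$THP_FILE")")"
else
    echo "THP setting not exposed on this system"
fi
```

Writing `never` to the same file as root should switch THP off for new allocations without a reboot, although memory already backed by huge pages is not affected; treat that as something to verify on the machine in question rather than a guarantee.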
Bug#900173: git-annex: internal error: evacuate: strange closure type 4325404
I've tried to reproduce on a newly installed Buster VM, but haven't been able to get it to crash. I guess that at least explains why I'm the only one complaining: probably something weird about my two workstations causes it. I'm going to see if I can find it, but I'd welcome any suggestions as to what it might be.
Bug#900173: git-annex: internal error: evacuate: strange closure type 4325404
I've managed to reproduce the crash with the Debian build on a different machine, with a new git-annex repository (containing only public data, so I can share it if it helps). This took a bit of playing around, so these steps may not be *entirely* accurate.

 1. Created a new git repository at /tmp/ohnoes
 2. git annex init in it
 3. started assistant (all good at this stage), add a few files (fortune -l > a, etc.)
 4. cd /tmp && git clone ssh://localhost/tmp/ohnoes ohyeses
 5. cd ohyeses && git-annex init ohyeses
 6. git annex sync --content
 7. cd /tmp/ohnoes && git remote add yes ssh://localhost/tmp/ohyeses
 8. start assistant in /tmp/ohnoes: still OK. (Tried a bunch of times.)
 9. add a bunch of files to /tmp/ohyeses:
      for f in `seq 1 100`; do fortune -l > $f && git-annex add $f && git commit -m "add $f" ; done
10. stop and start the assistant a bunch of times in /tmp/ohnoes: once crashed, but mostly worked.
11. add another bunch of files to /tmp/ohyeses (as above, through 150)
12. stop and start the assistant in /tmp/ohnoes: first try crashed. And second. Third worked. 4–8 crashed; 9 worked; 10 crashed.
13. installed my binNMU package (no source changes, just a rebuild; the same one I installed on the other machine), and the problem went away

BTW: I tried "git-annex test" on this machine before installing the rebuilt package, and it passed.

Going to try installing a buster VM and see if it's reproducible there. May not be until tomorrow, though, depending on how long it takes to set up.
Bug#900173: git-annex: internal error: evacuate: strange closure type 4325404
On May 30, 2018 1:57:23 PM UTC, Sean Whitton wrote:
>Hello,
>
>On Tue, May 29 2018, Anthony DeRobertis wrote:
>
>> ... and it turns out my build does not reproduce the problem.
>
>Just to be clear, you mean without Joey's patch?

Correct. Just rebuilding it, without modifying the source, fixes it for me.
Bug#900173: git-annex: internal error: evacuate: strange closure type 4325404
Hello,

On Tue, May 29 2018, Anthony DeRobertis wrote:

> ... and it turns out my build does not reproduce the problem.

Just to be clear, you mean without Joey's patch?

-- 
Sean Whitton
Bug#900173: git-annex: internal error: evacuate: strange closure type 4325404
On Tue, May 29, 2018 at 12:35:59PM -0400, Joey Hess wrote:
> Any chance you can build git-annex from source, so we can try a few
> modifications to try to narrow this down?
>
> 	sudo apt-get build-dep git-annex
> 	apt-get source git-annex
> 	cd git-annex-6.20180509
> 	make
> 	PATH=`pwd`:$PATH
> 	export PATH
>
> Then see if you can reproduce the problem,

... and it turns out my build does not reproduce the problem.

I ran "git annex assistant" ten times with the Debian package, and ten times with the source built as above. When it started up, I followed up with "git annex assistant --stop". Results are:

Debian package: try 1–4 fail, 5 ok, 6–10 fail
My build:       try 1–10 ok

So just to be sure, I rm -Rf'd that source tree, extracted it again (dpkg-source -x) and did a dch --bin-nmu to bump the version. I followed that with debuild -b -uc to make a new package, and installed it.

My package:     try 1–10 ok

I'm running diffoscope on the two packages, but it's been thinking long and hard on git-annex's .text segment, so I'm guessing it won't be useful.

I'm not sure where to go from here. I'm not sure if forcing a rebuild would fix it in Debian, or how much work it'd be for me to reproduce Debian's build environment from https://buildd.debian.org/status/fetch.php?pkg=git-annex&arch=amd64&ver=6.20180509-1&stamp=1525912069&raw=0 in a VM and see if I can then reproduce a broken build, and whether that'd really help us learn anything.

-- 
Democracy is a process by which the people are free to choose the man who will get the blame.
		-- Laurence J. Peter
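The ten-try comparison above is easy to script. What follows is a rough sketch, not the procedure actually used in this report. It assumes it is run from inside the affected repository, that the daemon log lives at .git/annex/daemon.log as shown elsewhere in this thread, and that five seconds is long enough for the assistant to either start or crash. The delay and the function names last_run, classify_log and run_trials are arbitrary choices for illustration.

```shell
#!/bin/sh
# Print only the part of a daemon log written by the most recent start,
# i.e. everything from the last "starting assistant version" line onward.
last_run() {
    awk '/main: starting assistant version/ { buf = "" }
         { buf = buf $0 "\n" }
         END { printf "%s", buf }' "$1"
}

# Report "fail" if the most recent start hit the RTS error quoted in this
# bug report, "ok" otherwise.
classify_log() {
    if last_run "$1" | grep -q 'internal error: evacuate: strange closure type'; then
        echo fail
    else
        echo ok
    fi
}

# Start and stop the assistant N times in the current repository,
# printing one result line per try.
run_trials() {
    n=$1
    log=.git/annex/daemon.log
    i=1
    while [ "$i" -le "$n" ]; do
        git annex assistant
        sleep 5    # assumed settle time; adjust as needed
        echo "try $i: $(classify_log "$log")"
        # --stop may not work after a crash (see elsewhere in this thread),
        # so its exit status is ignored here.
        git annex assistant --stop || true
        i=$((i + 1))
    done
}
```

To compare two builds, put the one under test first in PATH and call `run_trials 10` from the repository, once per build.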
Bug#900173: git-annex: internal error: evacuate: strange closure type 4325404
Any chance you can build git-annex from source, so we can try a few modifications to try to narrow this down?

	sudo apt-get build-dep git-annex
	apt-get source git-annex
	cd git-annex-6.20180509
	make
	PATH=`pwd`:$PATH
	export PATH

Then see if you can reproduce the problem, and then try applying the attached patch, and re-making and see if it avoids the problem.

The patch tries to avoid doing anything before forking when starting the assistant, so it should do only the same pre-fork operations as git-annex watch. (Other than some differences due to argument parsing I suppose.)

-- 
see shy jo

diff --git a/Command/Assistant.hs b/Command/Assistant.hs
index 70088674d..b148b2566 100644
--- a/Command/Assistant.hs
+++ b/Command/Assistant.hs
@@ -60,8 +60,8 @@ start o
 		liftIO autoStop
 		stop
 	| otherwise = do
-		liftIO ensureInstalled
-		ensureInitialized
+		--liftIO ensureInstalled
+		--ensureInitialized
 		Command.Watch.start True (daemonOptions o) (startDelayOption o)
 
 startNoRepo :: AssistantOptions -> IO ()
Bug#900173: git-annex: internal error: evacuate: strange closure type 4325404
On 05/27/2018 05:24 PM, Joey Hess wrote:
> Anthony DeRobertis wrote:
>> So right now, it's just refusing to run in the background :-/
>
> If you're able to reproduce the bug on demand that way, that could point
> to the way git-annex daemonizes itself with forkProcess. Which from its
> documentation:
>
>   forkProcess comes with a giant warning: since any other running threads
>   are not copied into the child process, it's easy to go wrong: e.g. by
>   accessing some shared resource that was held by another thread in the
>   parent.
>
> git-annex tries to use forkProcess in a safe way, but that's not
> especially well-defined or easy to check.
>
> You might try running git-annex watch instead of git-annex assistant,
> since they both daemonize but the former has a simpler code path.

I stopped the (foreground) assistant with "git annex assistant --stop", then ran "git annex assistant" (worked). Then stopped it again. Then started (failed). Tried another few times, all failed.

I tried a few "git annex watch" followed by "git annex watch --stop" (after getting the "(started...)" message in the log); all of those worked.

Tried a few more "git annex assistant", all failed.

So, absent the one "git annex assistant" which worked, those all failed. And all of the "git annex watch" worked. So does starting the assistant with "git annex assistant --foreground". (And maybe that's why the webapp was working for me earlier too: it doesn't fork...)

Honestly, if that code is a nightmare... you could just remove it. "git annex assistant --foreground &" is easy enough. And presumably entirely avoids the forkProcess issue there.
Bug#900173: git-annex: internal error: evacuate: strange closure type 4325404
Anthony DeRobertis wrote:
> So right now, it's just refusing to run in the background :-/

If you're able to reproduce the bug on demand that way, that could point to the way git-annex daemonizes itself with forkProcess. Which from its documentation:

  forkProcess comes with a giant warning: since any other running threads
  are not copied into the child process, it's easy to go wrong: e.g. by
  accessing some shared resource that was held by another thread in the
  parent.

git-annex tries to use forkProcess in a safe way, but that's not especially well-defined or easy to check.

You might try running git-annex watch instead of git-annex assistant, since they both daemonize but the former has a simpler code path.

-- 
see shy jo
Bug#900173: git-annex: internal error: evacuate: strange closure type 4325404
On 05/27/2018 01:20 PM, Joey Hess wrote:
> One person reported the same error message 7 years ago here:
> https://ghc.haskell.org/trac/ghc/ticket/5085
>
> They were using git-annex get, not the assistant when it crashed.
> They also were able to git bisect git-annex's code and found an utterly
> innocuous commit that triggered whatever the problem is.
> (commit 828a84ba3341d4b7a84292d8b9002a8095dd2382)
>
> It's probably a memory problem, or a ghc bug, or a bug in some library
> that is doing something memory related and messes up, such that ghc's
> garbage collector sees bad data.

First off, thank you for git-annex. It's really useful software. I can't imagine syncing all this data between several desktops, laptops, & a few tablets without it. And apologies in advance for this rambling message.

Just finished 2, almost 3 passes of memtest86+ with no errors. (Before I started using the machine, several years ago, it had over 24h of memtest.) It's been stable, and I haven't seen any random crashes, corruption, etc., so I doubt it's a hardware issue. It's also not overclocked or anything silly like that.

I'm not sure what bisecting this would entail; as far as I can tell... it's random. (BTW: I use the CLI interface too, quite a lot, and have never seen a weird error from it. Only from the assistant.)

Freshly after booting, I ran "git annex assistant"; it gave one of those errors in the log.
So I ran "git annex assistant --debug"; on the console it gave:

  [2018-05-27 15:56:53.403932801] read: uname ["-o"]
  [2018-05-27 15:56:53.405388769] process done ExitSuccess
  [2018-05-27 15:56:53.502601514] logging to .git/annex/daemon.log
  [2018-05-27 15:56:53.503385967] read: git ["--git-dir=.git","--work-tree=.","--literal-pathspecs","show-ref","git-annex"]
  [2018-05-27 15:56:53.505100304] process done ExitSuccess
  [2018-05-27 15:56:53.505256854] read: git ["--git-dir=.git","--work-tree=.","--literal-pathspecs","show-ref","--hash","refs/heads/git-annex"]
  [2018-05-27 15:56:53.50805146] process done ExitSuccess
  [2018-05-27 15:56:53.517155935] chat: git ["--git-dir=.git","--work-tree=.","--literal-pathspecs","cat-file","--batch"]
  [2018-05-27 15:56:53.517579777] chat: git ["--git-dir=.git","--work-tree=.","--literal-pathspecs","cat-file","--batch-check=%(objectname) %(objecttype) %(objectsize)"]
  [2018-05-27 15:56:53.522956836] logging to .git/annex/daemon.log

and in the log,

  [2018-05-27 15:56:40.385313411] main: starting assistant version 6.20180509
  [2018-05-27 15:56:40.39050583] read: git ["--git-dir=.git","--work-tree=.","--literal-pathspecs","show-ref","git-annex"]
  [2018-05-27 15:56:40.393284694] process done ExitSuccess
  [2018-05-27 15:56:40.393397832] read: git ["--git-dir=.git","--work-tree=.","--literal-pathspecs","show-ref","--hash","refs/heads/git-annex"]
  [2018-05-27 15:56:40.395763112] process done ExitSuccess
  [2018-05-27 15:56:40.396781131] chat: git ["--git-dir=.git","--work-tree=.","--literal-pathspecs","cat-file","--batch"]
  git-annex: internal error: evacuate: strange closure type 4325407
      (GHC version 8.2.2 for x86_64_unknown_linux)
      Please report this as a GHC bug: http://www.haskell.org/ghc/reportabug

I ran the same command again; got the same (promising!). Then I added "--foreground" hoping to get some more debug output... and instead, it decided to work.
The working log is (of course) longer; it looks like:

  [2018-05-27 15:57:10.648904305] main: starting assistant version 6.20180509
  [2018-05-27 15:57:10.652259294] read: git ["--git-dir=.git","--work-tree=.","--literal-pathspecs","show-ref","git-annex"]
  [2018-05-27 15:57:10.712408937] process done ExitSuccess
  [2018-05-27 15:57:10.712560137] read: git ["--git-dir=.git","--work-tree=.","--literal-pathspecs","show-ref","--hash","refs/heads/git-annex"]
  [2018-05-27 15:57:10.715542944] process done ExitSuccess
  [2018-05-27 15:57:10.716916651] chat: git ["--git-dir=.git","--work-tree=.","--literal-pathspecs","cat-file","--batch"]
  [2018-05-27 15:57:10.717421932] chat: git ["--git-dir=.git","--work-tree=.","--literal-pathspecs","cat-file","--batch-check=%(objectname) %(objecttype) %(objectsize)"]
  [2018-05-27 15:57:10.843564188] Cronner: waiting Seconds {fromSeconds = 43369} for next scheduled fsck self 15m every day at 4 AM
  [2018-05-27 15:57:10.958325684] read: git ["--git-dir=.git","--work-tree=.","--literal-pathspecs","ls-files","--stage","-z","--","."]
  [2018-05-27 15:57:11.034783419] process done ExitSuccess
  [2018-05-27 15:57:11.034973766] read: git ["--git-dir=.git","--work-tree=.","--literal-pathspecs","ls-files","--stage","-z","--","."]
  [2018-05-27 15:57:11.162696] process done ExitSuccess
  [2018-05-27 15:57:11.250761052] chat: nice ["ionice","-c3","nocache","/usr/bin/git-annex","remotedaemon","--foreground"]
  [2018-05-27 15:57:11.251914205] TransferScanner: Syncing with zia, einstein
  [2018-05-27 15:57:11.3242642] TransferWatcher: watching for transfers
  [2018-05-27 15:57:11.324391068] MountWatcher: Using running DBUS service
Bug#900173: git-annex: internal error: evacuate: strange closure type 4325404
One person reported the same error message 7 years ago here:
https://ghc.haskell.org/trac/ghc/ticket/5085

They were using git-annex get, not the assistant when it crashed. They also were able to git bisect git-annex's code and found an utterly innocuous commit that triggered whatever the problem is. (commit 828a84ba3341d4b7a84292d8b9002a8095dd2382)

It's probably a memory problem, or a ghc bug, or a bug in some library that is doing something memory related and messes up, such that ghc's garbage collector sees bad data.

-- 
see shy jo
Bug#900173: git-annex: internal error: evacuate: strange closure type 4325404
On 05/27/2018 10:55 AM, Sean Whitton wrote:
> This isn't enough information about how to reproduce this bug for me to
> be comfortable forwarding it upstream, but I've CCed upstream just in
> case he recognises the error message.

I hear you! But that's the entire log file ...

I also managed to produce a similar message, by using the "restart daemon" option in the webapp:

  [2018-05-27 09:50:55.085069462] NetWatcherFallback: Syncing with public, zia, einstein
  Everything up-to-date
  Everything up-to-date
  [2018-05-27 10:50:55.968806564] NetWatcherFallback: Syncing with public, zia, einstein
  Everything up-to-date
  Everything up-to-date
  tail: '/home/anthony/Westerley-Board/.git/annex/daemon.log' has been replaced; following new file
  [2018-05-27 11:01:09.077123603] main: starting assistant version 6.20180509
  git-annex: internal error: evacuate: strange closure type 1061
      (GHC version 8.2.2 for x86_64_unknown_linux)
      Please report this as a GHC bug: http://www.haskell.org/ghc/reportabug

At this point, git annex assistant --stop doesn't work and I have to kill the git-annex process. After that, git annex assistant produces another very similar message:

  tail: '/home/anthony/Westerley-Board/.git/annex/daemon.log' has appeared; following new file
  [2018-05-27 11:04:00.911450237] main: starting assistant version 6.20180509
  git-annex: internal error: evacuate: strange closure type 4325407
      (GHC version 8.2.2 for x86_64_unknown_linux)
      Please report this as a GHC bug: http://www.haskell.org/ghc/reportabug

Trying again a few times produces similar messages, just with different numbers.

This all works perfectly fine on a new test repository ... and it seems removing webapp.html wasn't actually what fixed it; I just need to wait long enough. Or something. It went away randomly one time I tried it. And now restarting the webapp is working. Confusing.

Just to be sure, going to reboot this machine into memtest (which was done before I first started using it, but you never know...)
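Since the number in the message varies between attempts, it may help to tally which values actually occur across the daemon logs. A small sketch, assuming the log lines have the same shape as the ones quoted above; closure_types is a hypothetical helper name, and the tally says nothing about what the numbers mean inside the GHC runtime.

```shell
#!/bin/sh
# Tally the numbers reported in "strange closure type N" messages across
# one or more log files. Prints "N count" per distinct value, sorted by N.
closure_types() {
    sed -n 's/.*strange closure type \([0-9][0-9]*\).*/\1/p' "$@" |
        sort -n | uniq -c | awk '{ print $2, $1 }'
}
```

For example, `closure_types .git/annex/daemon.log*` would also cover any older log files kept next to the current one, if there are any.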
Bug#900173: git-annex: internal error: evacuate: strange closure type 4325404
control: tag -1 +moreinfo

Dear Anthony,

On Sun, May 27 2018, Anthony DeRobertis wrote:

> Running "git annex assistant" in my repository to start the assistant
> gives a weird error in .git/annex/daemon.log:
>
> [2018-05-27 00:49:40.496075979] main: starting assistant version 6.20180509
> git-annex: internal error: evacuate: strange closure type 4325404
>     (GHC version 8.2.2 for x86_64_unknown_linux)
>     Please report this as a GHC bug: http://www.haskell.org/ghc/reportabug
>
> I fixed it (temporarily at least) via: rm .git/annex/webapp.html

This isn't enough information about how to reproduce this bug for me to be comfortable forwarding it upstream, but I've CCed upstream just in case he recognises the error message.

-- 
Sean Whitton
Bug#900173: git-annex: internal error: evacuate: strange closure type 4325404
Package: git-annex
Version: 6.20180509-1
Severity: important

Running "git annex assistant" in my repository to start the assistant gives a weird error in .git/annex/daemon.log:

  [2018-05-27 00:49:40.496075979] main: starting assistant version 6.20180509
  git-annex: internal error: evacuate: strange closure type 4325404
      (GHC version 8.2.2 for x86_64_unknown_linux)
      Please report this as a GHC bug: http://www.haskell.org/ghc/reportabug

I fixed it (temporarily at least) via: rm .git/annex/webapp.html

-- System Information:
Debian Release: buster/sid
  APT prefers testing-debug
  APT policy: (500, 'testing-debug'), (500, 'testing'), (500, 'stable'), (130, 'unstable-debug'), (130, 'unstable'), (120, 'experimental-debug'), (120, 'experimental')
Architecture: amd64 (x86_64)
Foreign Architectures: i386

Kernel: Linux 4.15.0-3-amd64 (SMP w/8 CPU cores)
Locale: LANG=en_US.UTF-8, LC_CTYPE=en_US.UTF-8 (charmap=UTF-8), LANGUAGE=en_US.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/bash
Init: systemd (via /run/systemd/system)

Versions of packages git-annex depends on:
ii  curl            7.60.0-1
ii  git             1:2.17.0-1
ii  libc6           2.27-3
ii  libffi6         3.2.1-8
ii  libgmp10        2:6.1.2+dfsg-3
ii  libmagic1       1:5.33-2
ii  libsqlite3-0    3.23.1-1
ii  libxml2         2.9.4+dfsg1-6.1
ii  openssh-client  1:7.7p1-2
ii  rsync           3.1.2-2.1
ii  zlib1g          1:1.2.11.dfsg-1

Versions of packages git-annex recommends:
ii  aria2              1.33.1-1
ii  bind9-host         1:9.11.3+dfsg-1
ii  git-remote-gcrypt  1.1-1
ii  gnupg              2.2.5-1
ii  lsof               4.89+dfsg-0.1
ii  nocache            1.0-1
ii  youtube-dl         2018.05.18-dmo1

Versions of packages git-annex suggests:
pn  adb             <none>
ii  bup             0.29-3
ii  libnss-mdns     0.14.1-1
pn  magic-wormhole  <none>
pn  tahoe-lafs      <none>
pn  tor             <none>
pn  uftp            <none>
ii  xdot            0.9-1

-- no debconf information