Bug#900173: git-annex: internal error: evacuate: strange closure type 4325404

2018-06-07 Thread Anthony DeRobertis

On 06/06/2018 08:54 PM, Joey Hess wrote:

> That actually makes some kind of sense, since this bug has something
> to do with garbage collection, and THP may result in different
> memory allocation patterns due to the changed page size.
>
> Except.. Isn't THP enabled by default on most systems?


Indeed it appears you're right, it defaults to on nowadays. It didn't use 
to (e.g., I checked and it's not enabled by default in Stretch).


Also, I left the VM repeatedly starting and stopping the assistant 
overnight; the one failure I got was the only one it saw, out of many 
thousands of attempts. (And unfortunately, "power cycling" the VM didn't 
produce another, so it wasn't just a first-run effect.)


So back to the drawing board, I guess ☹



Bug#900173: git-annex: internal error: evacuate: strange closure type 4325404

2018-06-06 Thread Joey Hess
Anthony DeRobertis wrote:
> I finally managed to reproduce once on my VM after turning on transparent
> hugepages (which both my workstations are running with). The crash rate is
> much lower — under 10%, vs. 80–90% on the workstations — but...
> 
> Anyway, the next thing I plan to test is to turn off transparent hugepages
> on a workstation and see if that avoids the issue. I have *no idea* why it
> matters, and even less so why a rebuild fixes it. Might be a bit, as that
> requires rebooting.

That actually makes some kind of sense, since this bug has something
to do with garbage collection, and THP may result in different
memory allocation patterns due to the changed page size.

Except.. Isn't THP enabled by default on most systems?

-- 
see shy jo




Bug#900173: git-annex: internal error: evacuate: strange closure type 4325404

2018-06-06 Thread Anthony DeRobertis
I finally managed to reproduce once on my VM after turning on 
transparent hugepages (which both my workstations are running with). The 
crash rate is much lower — under 10%, vs. 80–90% on the workstations — 
but...


Anyway, the next thing I plan to test is to turn off transparent 
hugepages on a workstation and see if that avoids the issue. I have *no 
idea* why it matters, and even less so why a rebuild fixes it. Might be 
a bit, as that requires rebooting.
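
The current THP mode can be read from sysfs before and after a change like the one planned above. A minimal sketch, assuming the usual Linux path /sys/kernel/mm/transparent_hugepage/enabled; the helper name thp_mode is made up for illustration and is not part of any package mentioned in this thread:

```shell
# Print the active THP mode, i.e. the bracketed word in a line such as
# "always [madvise] never". The default path is the usual Linux sysfs
# location (an assumption); pass another file as $1 to override it.
thp_mode() {
    file="${1:-/sys/kernel/mm/transparent_hugepage/enabled}"
    sed -n 's/.*\[\([a-z]*\)\].*/\1/p' "$file"
}
```

Calling thp_mode with no argument on each machine makes it easy to record which setting a given crash rate was observed under.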




Bug#900173: git-annex: internal error: evacuate: strange closure type 4325404

2018-05-31 Thread Anthony DeRobertis
I've tried to reproduce on a newly-installed Buster VM, but haven't
been able to get it to crash. I guess that at least explains why only
I'm complaining: probably something weird about my two workstations
that causes it.

Going to see if I can find it, but welcome any suggestions as to what it
might be.



Bug#900173: git-annex: internal error: evacuate: strange closure type 4325404

2018-05-30 Thread Anthony DeRobertis
I've managed to reproduce the crash with the Debian build on a different 
machine, with a new git-annex repository (containing only public data, 
so I can share it if it helps).


This took a bit of playing around — so these steps may not be *entirely* 
accurate.


1. Created a new git repository at /tmp/ohnoes
2. git annex init in it
3. started assistant (all good at this stage), add a few files (fortune
   -l > a, etc.)
4. cd /tmp && git clone ssh://localhost/tmp/ohnoes ohyeses
5. cd ohyeses && git-annex init ohyeses
6. git annex sync --content
7. cd /tmp/ohnoes && git remote add yes ssh://localhost/tmp/ohyeses
8. start assistant in /tmp/ohnoes: still OK. (Tried a bunch of times).
9. add a bunch of files to /tmp/ohyeses: for f in `seq 1 100`; do
   fortune -l > $f && git-annex add $f && git commit -m "add $f" ; done
10. stop and start the assistant a bunch of times in /tmp/ohnoes: Once
   crashed, but mostly worked.
11. add another bunch of files to /tmp/ohyeses (as above, through 150)
12. stop and start the assistant in /tmp/ohnoes: first try crashed. And
   second. Third worked. 4–8 crashed; 9 worked; 10 crashed.
13. installed my binNMU package (no source changes, just a rebuild—same
   one I installed on the other machine), and problem went away

BTW: I tried "git-annex test" on this machine before installing the 
rebuilt package, and it passed.


Going to try installing a buster VM and see if it's reproducible there. 
May not be until tomorrow, though, depending on how long it takes to set up.




Bug#900173: git-annex: internal error: evacuate: strange closure type 4325404

2018-05-30 Thread Anthony DeRobertis



On May 30, 2018 1:57:23 PM UTC, Sean Whitton wrote:
>Hello,
>
>On Tue, May 29 2018, Anthony DeRobertis wrote:
>
>> ... and it turns out my build does not reproduce the problem.
>
>Just to be clear, you mean without Joey's patch?

Correct. Just rebuilding it, without modifying the source, fixes it for me. 



Bug#900173: git-annex: internal error: evacuate: strange closure type 4325404

2018-05-30 Thread Sean Whitton
Hello,

On Tue, May 29 2018, Anthony DeRobertis wrote:

> ... and it turns out my build does not reproduce the problem.

Just to be clear, you mean without Joey's patch?

-- 
Sean Whitton



Bug#900173: git-annex: internal error: evacuate: strange closure type 4325404

2018-05-29 Thread Anthony DeRobertis
On Tue, May 29, 2018 at 12:35:59PM -0400, Joey Hess wrote:
> Any chance you can build git-annex from source, so we can try a few
> modifications to try to narrow this down?
> 
> sudo apt-get build-dep git-annex
> apt-get source git-annex
> cd git-annex-6.20180509
> make
> PATH=`pwd`:$PATH
> export PATH
> 
> Then see if you can reproduce the problem,

... and it turns out my build does not reproduce the problem.

I ran "git annex assistant" ten times with the Debian package, and 10
times with the source built as above. When it started up, I followed up
with "git annex assistant --stop". Results are:

Debian package: try 1–4 fail, 5 ok, 6–10 fail
My build: try 1–10 ok
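
The start/stop comparison described above can be scripted. A rough sketch, assuming a POSIX shell; the helper names log_has_crash and count_failures are hypothetical and not part of git-annex, and the match string is the error text from the daemon.log excerpts in this report:

```shell
# Succeed (exit 0) if the given log file contains the GHC runtime
# crash message quoted in this bug report.
log_has_crash() {
    grep -q 'internal error: evacuate: strange closure type' "$1"
}

# Run a command N times and print how many runs exited non-zero.
# Usage: count_failures N command [args...]
# Anything the command itself prints to stdout appears before the count.
count_failures() {
    tries="$1"
    shift
    failures=0
    i=1
    while [ "$i" -le "$tries" ]; do
        "$@" || failures=$((failures + 1))
        i=$((i + 1))
    done
    echo "$failures"
}
```

For example, `count_failures 10 ./one-attempt.sh`, where one-attempt.sh (also hypothetical) starts the assistant, waits for the log to settle, and exits non-zero when `log_has_crash .git/annex/daemon.log` succeeds.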

So just to be sure, I rm -Rf'd that source tree, extracted it again
(dpkg-source -x) and did a dch --bin-nmu to up the version. Followed by
debuild -b -uc to make a new package, and installed it.

My package: try 1–10 ok

I'm running diffoscope on the two packages, but it's been thinking long
and hard on git-annex's .text segment, so I'm guessing it won't be
useful.

I'm not sure where to go from here; I'm not sure if forcing a rebuild
would fix it in Debian. Or how much work it'd be for me to reproduce
Debian's build environment from
https://buildd.debian.org/status/fetch.php?pkg=git-annex&arch=amd64&ver=6.20180509-1&stamp=1525912069&raw=0
in a VM and see if then I can reproduce a broken build, and if that'd
really help us learn anything.



-- 
Democracy is a process by which the people are free to choose the man who
will get the blame.
-- Laurence J. Peter



Bug#900173: git-annex: internal error: evacuate: strange closure type 4325404

2018-05-29 Thread Joey Hess
Any chance you can build git-annex from source, so we can try a few
modifications to try to narrow this down?

sudo apt-get build-dep git-annex
apt-get source git-annex
cd git-annex-6.20180509
make
PATH=`pwd`:$PATH
export PATH

Then see if you can reproduce the problem, and then try applying the
attached patch, and re-making and see if it avoids the problem.
The patch tries to avoid doing anything before forking when starting
the assistant, so it should do only the same pre-fork operations as
git-annex watch. (Other than some differences due to argument parsing I
suppose.)

-- 
see shy jo
diff --git a/Command/Assistant.hs b/Command/Assistant.hs
index 70088674d..b148b2566 100644
--- a/Command/Assistant.hs
+++ b/Command/Assistant.hs
@@ -60,8 +60,8 @@ start o
 		liftIO autoStop
 		stop
 	| otherwise = do
-		liftIO ensureInstalled
-		ensureInitialized
+		--liftIO ensureInstalled
+		--ensureInitialized
 		Command.Watch.start True (daemonOptions o) (startDelayOption o)
 
 startNoRepo :: AssistantOptions -> IO ()




Bug#900173: git-annex: internal error: evacuate: strange closure type 4325404

2018-05-27 Thread Anthony DeRobertis

On 05/27/2018 05:24 PM, Joey Hess wrote:

> Anthony DeRobertis wrote:
>
>> So right now, it's just refusing to run in the background :-/
>
> If you're able to reproduce the bug on demand that way,
> that could point to the way git-annex daemonizes itself with
> forkProcess. Which from its documentation:
>
>  forkProcess comes with a giant warning: since any other running threads
>  are not copied into the child process, it's easy to go wrong: e.g. by
>  accessing some shared resource that was held by another thread in the
>  parent.
>
> git-annex tries to use forkProcess in a safe way, but that's not
> especially well-defined or easy to check.
>
> You might try running git-annex watch instead of git-annex assistant,
> since they both daemonize but the former has a simpler code path.

I stopped the (foreground) assistant with "git annex assistant --stop"; 
then ran "git annex assistant" (worked). Then stopped it again. Then 
started (failed). Tried another few times, all failed. I tried a few 
"git annex watch" followed by "git annex watch --stop" (after getting 
the "(started...)" message in the log); all of those worked. Tried a few 
more "git annex assistant", all failed.


So, absent the one "git annex assistant" which worked, those all failed. 
And all of the "git annex watch" worked. So does starting the assistant 
with "git annex assistant --foreground". (And maybe that's why the 
webapp was working for me earlier too — it doesn't fork...)


Honestly, if that code is a nightmare... you could just remove it. "git 
annex assistant --foreground &" is easy enough. And presumably entirely 
avoids the forkProcess issue there.




Bug#900173: git-annex: internal error: evacuate: strange closure type 4325404

2018-05-27 Thread Joey Hess
Anthony DeRobertis wrote:
> So right now, it's just refusing to run in the background :-/

If you're able to reproduce the bug on demand that way,
that could point to the way git-annex daemonizes itself with
forkProcess. Which from its documentation:

forkProcess comes with a giant warning: since any other running threads
are not copied into the child process, it's easy to go wrong: e.g. by
accessing some shared resource that was held by another thread in the
parent.

git-annex tries to use forkProcess in a safe way, but that's not
especially well-defined or easy to check.

You might try running git-annex watch instead of git-annex assistant,
since they both daemonize but the former has a simpler code path.

-- 
see shy jo




Bug#900173: git-annex: internal error: evacuate: strange closure type 4325404

2018-05-27 Thread Anthony DeRobertis

On 05/27/2018 01:20 PM, Joey Hess wrote:

> One person reported the same error message 7 years ago here:
> https://ghc.haskell.org/trac/ghc/ticket/5085
> They were using git-annex get, not the assistant when it crashed.
>
> They also were able to git bisect git-annex's code and found an utterly
> innocuous commit that triggered whatever the problem is.
> (commit 828a84ba3341d4b7a84292d8b9002a8095dd2382)
>
> It's probably a memory problem, or a ghc bug, or a bug in some library
> that is doing something memory related and messes up, such that ghc's
> garbage collector sees bad data.


First off, thank you for git-annex. It's really useful software. I can't 
imagine syncing all this data between several desktops, laptops, & a few 
tablets without it. And apologies in advance for this rambling message.


Just finished 2, almost 3 passes of memtest86+ with no errors. (Before 
I started using the machine, several years ago, it had over 24h of 
memtest.) It's been stable; I haven't seen any random 
crashes/corruption/etc., so I doubt it's a hardware issue. It's also 
not overclocked or anything silly like that.


I'm not sure what bisecting this would entail; as far as I can tell... 
it's random.


(BTW: I use the CLI interface too, quite a lot, and have never seen a 
weird error from it. Only from the assistant).


Freshly after booting, I ran "git annex assistant"; it gave one of those 
errors in the log. So I ran "git annex assistant --debug"; on the 
console it gave:


[2018-05-27 15:56:53.403932801] read: uname ["-o"]
[2018-05-27 15:56:53.405388769] process done ExitSuccess
[2018-05-27 15:56:53.502601514] logging to .git/annex/daemon.log
[2018-05-27 15:56:53.503385967] read: git 
["--git-dir=.git","--work-tree=.","--literal-pathspecs","show-ref","git-annex"]
[2018-05-27 15:56:53.505100304] process done ExitSuccess
[2018-05-27 15:56:53.505256854] read: git 
["--git-dir=.git","--work-tree=.","--literal-pathspecs","show-ref","--hash","refs/heads/git-annex"]
[2018-05-27 15:56:53.50805146] process done ExitSuccess
[2018-05-27 15:56:53.517155935] chat: git 
["--git-dir=.git","--work-tree=.","--literal-pathspecs","cat-file","--batch"]
[2018-05-27 15:56:53.517579777] chat: git 
["--git-dir=.git","--work-tree=.","--literal-pathspecs","cat-file","--batch-check=%(objectname)
 %(objecttype) %(objectsize)"]
[2018-05-27 15:56:53.522956836] logging to .git/annex/daemon.log

and in the log,

[2018-05-27 15:56:40.385313411] main: starting assistant version 6.20180509
[2018-05-27 15:56:40.39050583] read: git 
["--git-dir=.git","--work-tree=.","--literal-pathspecs","show-ref","git-annex"]
[2018-05-27 15:56:40.393284694] process done ExitSuccess
[2018-05-27 15:56:40.393397832] read: git 
["--git-dir=.git","--work-tree=.","--literal-pathspecs","show-ref","--hash","refs/heads/git-annex"]
[2018-05-27 15:56:40.395763112] process done ExitSuccess
[2018-05-27 15:56:40.396781131] chat: git 
["--git-dir=.git","--work-tree=.","--literal-pathspecs","cat-file","--batch"]
git-annex: internal error: evacuate: strange closure type 4325407
(GHC version 8.2.2 for x86_64_unknown_linux)
Please report this as a GHC bug:  http://www.haskell.org/ghc/reportabug

I ran the same command again; got the same (promising!). Then I added 
"--foreground" hoping to get some more debug output... and instead, it 
decided to work. The working log is (of course) longer; it looks like:


[2018-05-27 15:57:10.648904305] main: starting assistant version 6.20180509
[2018-05-27 15:57:10.652259294] read: git 
["--git-dir=.git","--work-tree=.","--literal-pathspecs","show-ref","git-annex"]
[2018-05-27 15:57:10.712408937] process done ExitSuccess
[2018-05-27 15:57:10.712560137] read: git 
["--git-dir=.git","--work-tree=.","--literal-pathspecs","show-ref","--hash","refs/heads/git-annex"]
[2018-05-27 15:57:10.715542944] process done ExitSuccess
[2018-05-27 15:57:10.716916651] chat: git 
["--git-dir=.git","--work-tree=.","--literal-pathspecs","cat-file","--batch"]
[2018-05-27 15:57:10.717421932] chat: git 
["--git-dir=.git","--work-tree=.","--literal-pathspecs","cat-file","--batch-check=%(objectname)
 %(objecttype) %(objectsize)"]
[2018-05-27 15:57:10.843564188] Cronner: waiting Seconds {fromSeconds = 43369} 
for next scheduled fsck self 15m every day at 4 AM
[2018-05-27 15:57:10.958325684] read: git 
["--git-dir=.git","--work-tree=.","--literal-pathspecs","ls-files","--stage","-z","--","."]
[2018-05-27 15:57:11.034783419] process done ExitSuccess
[2018-05-27 15:57:11.034973766] read: git 
["--git-dir=.git","--work-tree=.","--literal-pathspecs","ls-files","--stage","-z","--","."]
[2018-05-27 15:57:11.162696] process done ExitSuccess
[2018-05-27 15:57:11.250761052] chat: nice 
["ionice","-c3","nocache","/usr/bin/git-annex","remotedaemon","--foreground"]
[2018-05-27 15:57:11.251914205] TransferScanner: Syncing with zia, einstein
[2018-05-27 15:57:11.3242642] TransferWatcher: watching for transfers
[2018-05-27 15:57:11.324391068] MountWatcher: Using running DBUS service 

Bug#900173: git-annex: internal error: evacuate: strange closure type 4325404

2018-05-27 Thread Joey Hess
One person reported the same error message 7 years ago here:
https://ghc.haskell.org/trac/ghc/ticket/5085
They were using git-annex get, not the assistant when it crashed.

They also were able to git bisect git-annex's code and found an utterly
innocuous commit that triggered whatever the problem is.
(commit 828a84ba3341d4b7a84292d8b9002a8095dd2382)

It's probably a memory problem, or a ghc bug, or a bug in some library
that is doing something memory related and messes up, such that ghc's
garbage collector sees bad data.

-- 
see shy jo




Bug#900173: git-annex: internal error: evacuate: strange closure type 4325404

2018-05-27 Thread Anthony DeRobertis

On 05/27/2018 10:55 AM, Sean Whitton wrote:

> This isn't enough information about how to reproduce this bug for me to
> be comfortable forwarding it upstream, but I've CCed upstream just in
> case he recognises the error message.


I hear you! But that's the entire log file ...

I also managed to produce a similar message, by using the "restart 
daemon" option in the webapp:


[2018-05-27 09:50:55.085069462] NetWatcherFallback: Syncing with public, zia, 
einstein
Everything up-to-date
Everything up-to-date
[2018-05-27 10:50:55.968806564] NetWatcherFallback: Syncing with public, zia, 
einstein
Everything up-to-date
Everything up-to-date
tail: '/home/anthony/Westerley-Board/.git/annex/daemon.log' has been replaced;  
following new file
[2018-05-27 11:01:09.077123603] main: starting assistant version 6.20180509
git-annex: internal error: evacuate: strange closure type 1061
(GHC version 8.2.2 for x86_64_unknown_linux)
Please report this as a GHC bug:  http://www.haskell.org/ghc/reportabug

At this point, git annex assistant --stop doesn't work and I have to 
kill the git-annex process. After that, git annex assistant produces 
another very similar message:


tail: '/home/anthony/Westerley-Board/.git/annex/daemon.log' has appeared;  
following new file
[2018-05-27 11:04:00.911450237] main: starting assistant version 6.20180509
git-annex: internal error: evacuate: strange closure type 4325407
(GHC version 8.2.2 for x86_64_unknown_linux)
Please report this as a GHC bug:  http://www.haskell.org/ghc/reportabug


Trying again a few times produces similar messages, just with different 
numbers. This all works perfectly fine on a new test repository ... and 
it seems removing webapp.html wasn't actually what fixed it, I just need 
to wait long enough. Or something. It went away randomly one time I 
tried it. And now restarting the webapp is working.


Confusing. Just to be sure, going to reboot this machine into memtest 
(which was done before I first started using it, but you never know...)




Bug#900173: git-annex: internal error: evacuate: strange closure type 4325404

2018-05-27 Thread Sean Whitton
control: tag -1 +moreinfo

Dear Anthony,

On Sun, May 27 2018, Anthony DeRobertis wrote:

> Running "git annex assistant" in my repository to start the assistant
> gives a weird error in .git/annex/daemon.log:
>
> [2018-05-27 00:49:40.496075979] main: starting assistant version 6.20180509
> git-annex: internal error: evacuate: strange closure type 4325404
> (GHC version 8.2.2 for x86_64_unknown_linux)
> Please report this as a GHC bug:  http://www.haskell.org/ghc/reportabug
>
> I fixed it (temporarily at least) via: rm .git/annex/webapp.html

This isn't enough information about how to reproduce this bug for me to
be comfortable forwarding it upstream, but I've CCed upstream just in
case he recognises the error message.

-- 
Sean Whitton



Bug#900173: git-annex: internal error: evacuate: strange closure type 4325404

2018-05-26 Thread Anthony DeRobertis
Package: git-annex
Version: 6.20180509-1
Severity: important


Running "git annex assistant" in my repository to start the assistant
gives a weird error in .git/annex/daemon.log:

[2018-05-27 00:49:40.496075979] main: starting assistant version 6.20180509
git-annex: internal error: evacuate: strange closure type 4325404
(GHC version 8.2.2 for x86_64_unknown_linux)
Please report this as a GHC bug:  http://www.haskell.org/ghc/reportabug

I fixed it (temporarily at least) via: rm .git/annex/webapp.html

-- System Information:
Debian Release: buster/sid
  APT prefers testing-debug
  APT policy: (500, 'testing-debug'), (500, 'testing'), (500, 'stable'), (130, 
'unstable-debug'), (130, 'unstable'), (120, 'experimental-debug'), (120, 
'experimental')
Architecture: amd64 (x86_64)
Foreign Architectures: i386

Kernel: Linux 4.15.0-3-amd64 (SMP w/8 CPU cores)
Locale: LANG=en_US.UTF-8, LC_CTYPE=en_US.UTF-8 (charmap=UTF-8), 
LANGUAGE=en_US.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/bash
Init: systemd (via /run/systemd/system)

Versions of packages git-annex depends on:
ii  curl            7.60.0-1
ii  git             1:2.17.0-1
ii  libc6           2.27-3
ii  libffi6         3.2.1-8
ii  libgmp10        2:6.1.2+dfsg-3
ii  libmagic1       1:5.33-2
ii  libsqlite3-0    3.23.1-1
ii  libxml2         2.9.4+dfsg1-6.1
ii  openssh-client  1:7.7p1-2
ii  rsync           3.1.2-2.1
ii  zlib1g          1:1.2.11.dfsg-1

Versions of packages git-annex recommends:
ii  aria2              1.33.1-1
ii  bind9-host         1:9.11.3+dfsg-1
ii  git-remote-gcrypt  1.1-1
ii  gnupg              2.2.5-1
ii  lsof               4.89+dfsg-0.1
ii  nocache            1.0-1
ii  youtube-dl         2018.05.18-dmo1

Versions of packages git-annex suggests:
pn  adb             <none>
ii  bup             0.29-3
ii  libnss-mdns     0.14.1-1
pn  magic-wormhole  <none>
pn  tahoe-lafs      <none>
pn  tor             <none>
pn  uftp            <none>
ii  xdot            0.9-1

-- no debconf information
