Bug#841143: [pkg-gnupg-maint] Bug#841143: False assumptions about nPth (was: Bug#841143: Suspected race in gpg1 to gpg2 conversion or agent startup)

2017-01-18 Thread Ian Jackson
NIIBE Yutaka writes ("Re: [pkg-gnupg-maint] Bug#841143: False assumptions about 
nPth(was:   Bug#841143: Suspected race in gpg1 to gpg2 conversion   or 
agent startup)"):
> I don't know if this fix solves all the problems of Ian.  One step done,
> that's good.

I have been running with all my previously posted patches, some of
which I now think are irrelevant and shouldn't change the behaviour.

> Ian's approach is to change the line of
> 
>   active_connections--;
> 
> into:
> 
>   if (--active_connections == 0)
>     interrupt_main_thread_loop ();
> 
> There are two parts in gpg-agent.c to change.

If you like I can try just that change and report back.  I strongly
suspect that this will mostly-fix the problem, but leave the race I
mentioned in my other email just now.

Thanks very much for your help in disentangling this conversation.

Regards,
Ian.

-- 
Ian Jackson    These opinions are my own.

If I emailed you from an address @fyvzl.net or @evade.org.uk, that is
a private address which bypasses my fierce spamfilter.



Bug#841143: [pkg-gnupg-maint] Bug#841143: False assumptions about nPth (was: Bug#841143: Suspected race in gpg1 to gpg2 conversion or agent startup)

2017-01-18 Thread Ian Jackson
Daniel Kahn Gillmor writes ("Re: [pkg-gnupg-maint] Bug#841143: False 
assumptions about nPth (was:   Bug#841143: Suspected race in gpg1 to gpg2 
conversion   or agent startup)"):
> You're just talking about adding this one test, right?

That's my understanding.

> It seems to me like this shouldn't be necessary, since i'd have thought
> the child channel (chan_9 in your example) receiving eof would make the
> main thread wake back up.

AIUI this is supposed to happen, indeed.  So the timer tick is hiding
the race, by having gpg-agent wake up occasionally anyway and become
unstuck.

> can you explain why that wouldn't be the case?  is there some way to
> cause the main thread to trigger a loop when the child channel closes?

There's interrupt_main_thread_loop.  But it is not called.  I think it
should be called when active_connections becomes zero.

That's what my patch 4/4 does, and that patch fixes most of the
problem for me.

There is a remaining race with much lower probability.  I think it
happens when an incoming connection arrives just as gpg-agent is
finally deciding to actually exit.  The symptoms are that some gpg
invocation complains about getting EOF from the agent.

Thanks,
Ian.




Bug#841143: [pkg-gnupg-maint] Bug#841143: False assumptions about nPth (was: Bug#841143: Suspected race in gpg1 to gpg2 conversion or agent startup)

2017-01-18 Thread NIIBE Yutaka
Daniel Kahn Gillmor wrote:
> fwiw, i don't want the behavior to be exactly the same as upstream -- i
> don't want gpg-agent to wake up every few seconds on platforms where it
> shouldn't need to, for example :/

Yes, I understand your purpose.  +1 from me.  Actually, I was inspired
by your patches and I'd like to do something similar in scdaemon.

> but the change i think you're proposing might be OK -- if it does the
> frequent wakeups when it's trying to shut down...

Yes.  My intention was to minimize the change and to show the issue
clearly.  (And my preference is less change against upstream.)

My point was that there is a condition when the main thread keeps
blocking at npth_pselect, when shutdown_pending != 0.

Perhaps, you like Ian's approach better: when --active_connections == 0,
send SIGCONT to resume the main thread.

> You're just talking about adding this one test, right?

Exactly.

> It seems to me like this shouldn't be necessary, since i'd have thought
> the child channel (chan_9 in your example) receiving eof would make the
> main thread wake back up.

It is needed.  (With Ian's approach, the main thread is woken up.)

The main thread blocks at npth_pselect watching the sockets.  The
sockets are listening for connections from clients.  When a connection
comes in, the main thread accepts it (getting a new fd) and creates a
thread to handle commands from that fd.  After creating the thread, the
main thread does nothing with the new fd; it's up to the newly created
thread.  So when the client closes that fd, only the handler thread
sees it, and the main thread stays blocked in npth_pselect.

> Ideally, we don't want to wait a timer tick (up to 2 seconds) before
> shutting down.  if we know we're shutting down and the last client has
> closed, we should just fall into the main loop itself, right?

So, you like Ian's approach.

Ian's approach is to change the line of

  active_connections--;

into:

  if (--active_connections == 0)
    interrupt_main_thread_loop ();

There are two parts in gpg-agent.c to change.

I think that it does exactly what you described.
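
As a rough illustration of the decrement-and-wake idea, here is a minimal
C sketch.  It is not the gpg-agent code: the self-pipe, wake_init(),
wake_pending() and the "_sketch" suffix are assumptions made for the
example, whereas this thread describes the real mechanism as sending
SIGCONT to resume the main thread.

```c
#include <assert.h>
#include <sys/select.h>
#include <unistd.h>

/* Modelled on the counter discussed in the thread. */
static int active_connections;

/* Hypothetical self-pipe used only by this sketch to wake a select loop. */
static int wake_pipe[2] = { -1, -1 };

/* Create the wake-up pipe.  Returns 0 on success, -1 on error. */
int
wake_init (void)
{
  return pipe (wake_pipe);
}

/* Stand-in for interrupt_main_thread_loop: make the read end readable. */
static void
interrupt_main_thread_loop_sketch (void)
{
  char c = 0;

  if (write (wake_pipe[1], &c, 1) != 1)
    {
      /* A real program would log this; the sketch ignores it. */
    }
}

void
connection_started (void)
{
  active_connections++;
}

/* Called by a handler when its client goes away.  Only the transition
   to zero wakes the main loop. */
void
connection_finished (void)
{
  assert (active_connections > 0);
  if (--active_connections == 0)
    interrupt_main_thread_loop_sketch ();
}

/* Non-blocking check: 1 if a wake-up is waiting to be read, else 0. */
int
wake_pending (void)
{
  fd_set rfds;
  struct timeval tv = { 0, 0 };

  FD_ZERO (&rfds);
  FD_SET (wake_pipe[0], &rfds);
  return select (wake_pipe[0] + 1, &rfds, NULL, NULL, &tv) > 0;
}
```

A main loop that includes the read end of the pipe in its select set
would return as soon as the last connection finishes, instead of waiting
for the next timer tick.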

I'm glad that we communicate successfully, and we will have a fix soon.

I don't know if this fix solves all the problems of Ian.  One step done,
that's good.
-- 



Bug#841143: [pkg-gnupg-maint] Bug#841143: False assumptions about nPth (was: Bug#841143: Suspected race in gpg1 to gpg2 conversion or agent startup)

2017-01-17 Thread Daniel Kahn Gillmor
Hi gniibe--

Thanks for this code review, it's much appreciated!

On Mon 2017-01-16 23:21:15 -0500, NIIBE Yutaka wrote:
> For me, it is a bit difficult to apply the fourth patch only.  So, I
> seek the update of the patch:
>
> 0003-agent-Avoid-tight-timer-tick-when-possible.patch
>
> How about changing the need_tick function, instead?  My intention is to
> make the behavior of gpg-agent as similar as upstream version.

fwiw, i don't want the behavior to be exactly the same as upstream -- i
don't want gpg-agent to wake up every few seconds on platforms where it
shouldn't need to, for example :/

but the change i think you're proposing might be OK -- if it does the
frequent wakeups when it's trying to shut down...

> I mean, changing the first hunk of the patch of gnupg/agent/gpg-agent.c,
> like this (adding the check against shutdown_pending).
>
> --- gnupg.orig/agent/gpg-agent.c
> +++ gnupg/agent/gpg-agent.c
> @@ -2267,6 +2267,29 @@ create_directories (void)
>  }
>  
>  
> +static int
> +need_tick (void)
> +{
 […]
> +  /* if a shutdown was requested, we wait all connections closing.  */
> +  if (shutdown_pending)
> +    return 1;

You're just talking about adding this one test, right?

It seems to me like this shouldn't be necessary, since i'd have thought
the child channel (chan_9 in your example) receiving eof would make the
main thread wake back up.

can you explain why that wouldn't be the case?  is there some way to
cause the main thread to trigger a loop when the child channel closes?
Ideally, we don't want to wait a timer tick (up to 2 seconds) before
shutting down.  if we know we're shutting down and the last client has
closed, we should just fall into the main loop itself, right?  The
entire time between when the shutdown is requested and when we finally
shut down is a time that socket is locked and new clients can't
effectively connect, right?

i'm happy to apply your proposed change if there's no better way (it's
certainly better than the indefinite hang you've caught), but it still
feels sloppier than i'd want in general.

any thoughts?

  --dkg




Bug#841143: False assumptions about nPth (was: Bug#841143: Suspected race in gpg1 to gpg2 conversion or agent startup)

2017-01-17 Thread Ian Jackson
Control: tags -1 confirmed

NIIBE Yutaka writes ("Bug#841143: False assumptions about nPth (was:
Bug#841143: Suspected race in gpg1 to gpg2 conversion or agent startup)"):
> I confirmed the possibility where the main thread might block at
> npth_pselect forever.  There are connections, shutdown_pending is set by
> signal, npth_pselect is called, then connections are finished.  The main
> thread keeps staying at npth_pselect.

That's the symptoms I saw.  What version did you see this with ?

> For me, it is a bit difficult to apply the fourth patch only.  So, I
> seek the update of the patch:
> 
> 0003-agent-Avoid-tight-timer-tick-when-possible.patch

AFAICT from looking at the git history, some of these patches have
gone upstream quite recently.

If you like I could prepare you a source tree (in git form) or a
binary package containing my 4/4 patch on top of the version you have
right now, for you to test.

NB I am not the maintainer of gnupg2.  I am just another user.

Regards,
Ian.




Bug#841143: False assumptions about nPth (was: Bug#841143: Suspected race in gpg1 to gpg2 conversion or agent startup)

2017-01-16 Thread NIIBE Yutaka
Ian Jackson wrote:
> I think that at least my patch
>   [PATCH 4/4] gpg agent lockup fix: Interrupt main loop when 
> active_connections_value==0
> is very likely a fix to an actual race.
[...]
> I would like this bug fixed in stretch.

I think that this issue is a bug in the patches of
debian/patches/gpg-agent-idling/.

I confirmed the possibility where the main thread might block at
npth_pselect forever.  There are connections, shutdown_pending is set by
signal, npth_pselect is called, then connections are finished.  The main
thread keeps staying at npth_pselect.
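
The sequence above can be modelled with a small C sketch.  This is a
simplified model inferred from the symptoms reported in this thread, not
the gpg-agent source; the enum, its member names and next_action() are
hypothetical.  It assumes the loop decides what to do before each wait,
and that during shutdown it no longer watches the listening sockets
(elsewhere in the thread a stuck agent was seen selecting only on the
inotify fd).

```c
#include <assert.h>

enum loop_action
{
  LOOP_EXIT,            /* leave the main loop and terminate */
  LOOP_WAIT_NO_LISTEN,  /* shutting down: wait, but not on listening sockets */
  LOOP_WAIT_ALL         /* normal operation: wait on everything */
};

/* Decision taken at the top of each main-loop iteration, before the
   call that blocks.  If the last connection closes while the loop is
   blocked after LOOP_WAIT_NO_LISTEN, nothing re-runs this decision
   unless a timeout or an explicit wake-up ends the wait.  */
enum loop_action
next_action (int shutdown_pending, int active_connections)
{
  assert (active_connections >= 0);
  if (!shutdown_pending)
    return LOOP_WAIT_ALL;
  return active_connections == 0 ? LOOP_EXIT : LOOP_WAIT_NO_LISTEN;
}
```

In this model the hang corresponds to next_action (1, n) with n > 0
being evaluated once, the wait having no timeout, and n then dropping
to 0 without anything interrupting the wait.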

For me, it is a bit difficult to apply the fourth patch only.  So I
would rather update this patch:

0003-agent-Avoid-tight-timer-tick-when-possible.patch

How about changing the need_tick function instead?  My intention is to
make the behavior of gpg-agent as similar as possible to the upstream
version.

I mean changing the first hunk of the patch to gnupg/agent/gpg-agent.c
like this (adding the check against shutdown_pending).

--- gnupg.orig/agent/gpg-agent.c
+++ gnupg/agent/gpg-agent.c
@@ -2267,6 +2267,29 @@ create_directories (void)
 }
 
 
+static int
+need_tick (void)
+{
+#ifdef HAVE_W32_SYSTEM
+  /* We do not know how to interrupt the select loop on Windows, so we
+     always need a short tick there. */
+  return 1;
+#else
+  /* if we were invoked like "gpg-agent cmd arg1 arg2" then we need to
+     watch our parent. */
+  if (parent_pid != (pid_t)(-1))
+    return 1;
+  /* if scdaemon is running, we need to check that it's alive */
+  if (agent_scd_check_running ())
+    return 1;
+  /* if a shutdown was requested, we wait all connections closing.  */
+  if (shutdown_pending)
+    return 1;
+  /* otherwise, nothing fine-grained to do. */
+  return 0;
+#endif /*HAVE_W32_SYSTEM*/
+}
+
 
 /* This is the worker for the ticker.  It is called every few seconds
    and may only do fast operations. */
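
For illustration only, a hedged C sketch of how the result of a
need_tick-style check might be turned into the timeout argument of a
pselect-style call.  select_timeout() and TICK_SECONDS are hypothetical
names, and the two-second interval is taken from a figure mentioned
elsewhere in this thread, not from the patch itself.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Assumed tick length for the sketch. */
#define TICK_SECONDS 2

/* Return the timeout to hand to a pselect-style call: NULL means
   "block until an fd or a signal wakes us", otherwise BUF is filled
   with the tick interval and returned.  */
const struct timespec *
select_timeout (int tick_needed, struct timespec *buf)
{
  assert (buf != NULL);
  if (!tick_needed)
    return NULL;
  buf->tv_sec = TICK_SECONDS;
  buf->tv_nsec = 0;
  return buf;
}
```

With the shutdown_pending check added to need_tick, a pending shutdown
would select the timed branch, so the loop re-checks its exit condition
at least once per tick.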
-- 



Bug#841143: False assumptions about nPth (was: Bug#841143: Suspected race in gpg1 to gpg2 conversion or agent startup [and 1 more messages]) [and 1 more messages]

2017-01-16 Thread Ian Jackson
Ian Jackson writes ("Re: False assumptions about nPth (was: Bug#841143: 
Suspected race in gpg1 to gpg2 conversion or agent startup [and 1 more 
messages]) [and 1 more messages]"):
> I think that at least my patch
>   [PATCH 4/4] gpg agent lockup fix: Interrupt main loop when 
> active_connections_value==0
> is very likely a fix to an actual race.
...
> At the very least empirically that patch reduces the failure
> probability of a run of the complete dgit test suite on my laptop from
> about 100% (I guess that represents a failure probability of 0.1% per
> gnupg run) to about 5-10%.
...
> Do you intend to rework my patch(es) and apply the ones that make
> sense ?  Do you intend to fix the remaining bug ?
...
> PS: npth is also not bug-free.  For example, see #850686, just
> reported.

I see that gnupg2 2.1.17-4 was uploaded by Daniel on the 11th of
January.  You have not applied my patch 4/4 which as I say I still
think fixes a real bug.

Please confirm your intentions.  I would like this bug fixed in
stretch.

Ian.




Bug#841143: False assumptions about nPth (was: Bug#841143: Suspected race in gpg1 to gpg2 conversion or agent startup [and 1 more messages]) [and 1 more messages]

2017-01-09 Thread Ian Jackson
Werner Koch writes ("False assumptions about nPth (was: Bug#841143: Suspected 
race in gpg1 to gpg2 conversion or agent startup [and 1 more messages])"):
> Please point out a single threading bug in gpg-agent or any other part
> of GnuPG.  But before you point me to your patches please learn about
> nPth (and its predecessor GNU Pth) and understand why we are not using
> Posix threads directly.

You are right that I was confused about pth.  It would have been very
helpful if you had mentioned at some earlier point in this
conversation that npth is a non-preemptive threading library and that
that is why you thought there aren't threading bugs.  I thought it was
a simple wrapper around pthreads with some signal handling support.

Use of a non-concurrent threading library is part of the kind of
"systematic and effective way to avoid threading bugs" which I was
hoping to find.  Sorry for missing that.


I think that at least my patch
  [PATCH 4/4] gpg agent lockup fix: Interrupt main loop when 
active_connections_value==0
is very likely a fix to an actual race.

During debugging I several times had a gdb attached to a stuck
gpg-agent process.  I found the process stuck in select, selecting
only on the inotify fd, with `shutdown_pending' having the value 1 and
`active_connections' having the value 0.  Because of difficulties
collecting logging, and the fact that adding logging (once I figured
out how to do so) seemed to dramatically reduce the failure
probability, I can't be 100% sure of the history of those stuck
gpg-agents.

At the very least empirically that patch reduces the failure
probability of a run of the complete dgit test suite on my laptop from
about 100% (I guess that represents a failure probability of 0.1% per
gnupg run) to about 5-10%.


Thanks for your logging tips.  Unfortunately, however, they came
rather late.  Yesterday this problem got me completely blocked on dgit
development so I had to fight the bug alone.  It took me many hours
which could probably have been significantly shortened with your help.

Next time someone reports a bug like this, it would be better if you
mentioned the reasons why you think it's not a bug (npth's special
properties, in this case).  You could have linked to npth's
documentation.  Earlier instructions for collecting debug logs would
have been helpful.  Speculation as to where the bug might or might not
be, rather than blanket denials, would have been welcome.

I'm afraid this has made me somewhat tetchy as you can probably tell.


Do you intend to rework my patch(es) and apply the ones that make
sense ?  Do you intend to fix the remaining bug ?

Ian.

PS: npth is also not bug-free.  For example, see #850686, just
reported.




Bug#841143: [pkg-gnupg-maint] Bug#841143: Suspected race in gpg1 to gpg2 conversion or agent startup

2017-01-08 Thread Werner Koch
On Sun,  8 Jan 2017 18:47, ijack...@chiark.greenend.org.uk said:

> follow, but I am still stumped as to get debugging output from
> gpg-agent.  I tried making a stunt shell script to pass --debug-all

The best way to debug the system is to put

--8<---cut here---start->8---
log-file socket://
verbose
debug ipc
--8<---cut here---end--->8---

into the respective configuration files.  For gpg-agent.conf you may also
add "debug-pinentry".  Which debug flags you need depends on what you
want to debug.  ipc is a good start; the other debug flags are listed in
the man pages or use

  $ gpg-agent --debug help
  gpg-agent[7724]: reading options from '/home/wk/.gnupg/gpg-agent.conf'
  gpg-agent[7724]: available debug flags:
  gpg-agent[7724]:     2 mpi
  gpg-agent[7724]:     4 crypto
  gpg-agent[7724]:    32 memory
  gpg-agent[7724]:    64 cache
  gpg-agent[7724]:   128 memstat
  gpg-agent[7724]:   512 hashing
  gpg-agent[7724]:  1024 ipc
  gpg-agent[7724]: gpg-agent running and available
  
to get a list of supported debug flags.  They slightly differ between
tools and you most likely don't want "hashing" because that creates a
file per hash context.  The best way to view or collect the debug output
is to start 

  watchgnupg --force --time-only $(gpgconf --list-dirs socketdir)/S.log

in another xterm.

See also the watchgnupg man page.  The use of gpgconf to figure out the
log socket and the abbreviated "socket://" log-file is currently missing
from that man page.


Salam-Shalom,

   Werner

-- 
Die Gedanken sind frei.  Ausnahmen regelt ein Bundesgesetz.




Bug#841143: False assumptions about nPth (was: Bug#841143: Suspected race in gpg1 to gpg2 conversion or agent startup [and 1 more messages])

2017-01-08 Thread Werner Koch
On Sun,  8 Jan 2017 23:46, ijack...@chiark.greenend.org.uk said:

> gpg-agent is AIUI the main program which handles key material.  We
> cannot afford for it to be afflicted by threading bugs.

Please point out a single threading bug in gpg-agent or any other part
of GnuPG.  But before you point me to your patches please learn about
nPth (and its predecessor GNU Pth) and understand why we are not using
Posix threads directly.


Shalom-Salam,

   Werner



Bug#841143: Info received (Bug#841143: [pkg-gnupg-maint] Bug#841143: Suspected race in gpg1 to gpg2 conversion or agent startup [and 1 more messages])

2017-01-08 Thread Ian Jackson
With my patches I just saw this:

   signfile ../ruby-rails-3.2_3.2.6-2~dummy3_amd64.changes 39B13D8A
  gpg: WARNING: unsafe permissions on homedir 
'/home/ian/things/Dgit/2dgit/tests/tmp/gnupg/gnupg'
  gpg: WARNING: unsafe permissions on homedir 
'/home/ian/things/Dgit/2dgit/tests/tmp/gnupg/gnupg'
  gpg: can't connect to the agent: End of file
  gpg: skipped "39B13D8A": No secret key

From the error message I think that that test case was probably
connecting to an exiting agent.  I think that demonstrates that there
are still bugs remaining in this area.

The failure probability is much reduced, though.

Ian.




Bug#841143: [pkg-gnupg-maint] Bug#841143: Suspected race in gpg1 to gpg2 conversion or agent startup [and 1 more messages]

2017-01-08 Thread Ian Jackson
Ian Jackson writes ("Re: Bug#841143: [pkg-gnupg-maint] Bug#841143: Suspected 
race in gpg1 to gpg2 conversion or agent startup [and 1 more messages]"):
> I have patches to fix these bugs and add some debugging.  With these
> patches my test suite seems to reliably run successfully to
> completion.  Without them I can repro the failure nearly every time.
> I'll send the patches in a moment as a four-mail patchbomb.

There was also a two-mail patchbomb with the debugging diffs.

You can find all of these here:
  http://www.chiark.greenend.org.uk/ucgi/~ian/git/gnupg2.git/
  git://git.chiark.greenend.org.uk/~ian/gnupg2.git
in the branches
  841143-bugfix
  841143-extra-debug-messages

I hope this is helpful.

Regards,
Ian.




Bug#841143: [pkg-gnupg-maint] Bug#841143: Suspected race in gpg1 to gpg2 conversion or agent startup [and 1 more messages]

2017-01-08 Thread Ian Jackson
Ian Jackson writes ("Re: Bug#841143: [pkg-gnupg-maint] Bug#841143: Suspected 
race in gpg1 to gpg2 conversion or agent startup"):
> I have been digging in the code.  I found it very difficult to get any
> useful debug logging out.  Some patches to maybe help with that will
> follow, but I am still stumped as to get debugging output from
> gpg-agent.  I tried making a stunt shell script to pass --debug-all
> --no-detach and redirect stderr somewhere, but it is ineffective for
> some reason.

The reason was that gpg closes all the fds when spawning gpg-agent.

Ian Jackson writes ("Re: Bug#841143: [pkg-gnupg-maint] Bug#841143: Suspected 
race in gpg1 to gpg2 conversion or agent startup"):
> I fixed this but it didn't help.  I now have a gdb onto a stuck agent,
> which has shutdown_pending but is stuck in select.  I think
> shutdown_pending must have become 1 between the main loop test and the
> entry to select.

I have patches to fix these bugs and add some debugging.  With these
patches my test suite seems to reliably run successfully to
completion.  Without them I can repro the failure nearly every time.
I'll send the patches in a moment as a four-mail patchbomb.

> This approach to programming is a quite a rich seam of opportunities
> for threading bugs.
> 
> For example, I think the variables `check_own_socket_running' and
> `shutdown_pending' are both accessed willy-nilly on multiple threads
> without locking.

Even with my patches, bugs remain.  I only looked at the agent
startup/shutdown code.  I fear for the rest of the code.

I am very concerned that there doesn't seem to be any systematic and
effective way to avoid threading bugs in gpg-agent.

Trying to write multithreaded C code in the mutex lock/unlock style,
without either serious coding style, code structure, and build system
support, or extensive use of state-of-the-art static or dynamic
analysis tools, almost inevitably leads to threading bugs like the
ones I have discovered.

I found no evidence of the kind of coding style/structure approach
that would avoid introduction of threading bugs through human error.
And the bugs I found so far demonstrate that there has been no
effective tool-based audit.

gpg-agent is AIUI the main program which handles key material.  We
cannot afford for it to be afflicted by threading bugs.

Would you please consider a different approach ?  For example, I think
the existing code structure might support use of fork rather than
threads.

Thanks,
Ian.




Bug#841143: [pkg-gnupg-maint] Bug#841143: Suspected race in gpg1 to gpg2 conversion or agent startup

2017-01-08 Thread Ian Jackson
Ian Jackson writes ("Re: Bug#841143: [pkg-gnupg-maint] Bug#841143: Suspected 
race in gpg1 to gpg2 conversion or agent startup"):
> The variable `active_connections' in gpg-agent.c seems to be updated
> by multiple threads without any locking.  If it were to get corrupted,
> I think gpg-agent might get stuck trying to exit, with clients which
> had successfully connected at the syscall level.

I fixed this but it didn't help.  I now have a gdb onto a stuck agent,
which has shutdown_pending but is stuck in select.  I think
shutdown_pending must have become 1 between the main loop test and the
entry to select.
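
The window described here (a flag set by a signal handler between the
test and the blocking call) is the classic case that POSIX pselect is
meant to close.  Below is a minimal sketch of that generic technique,
not of gpg-agent or nPth: shutdown_setup(), wait_for_event() and the use
of SIGTERM are assumptions for the example, and shutdown_pending here is
a local stand-in for the variable of the same name.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <sys/select.h>
#include <time.h>

/* Stand-in for the flag discussed in the thread. */
static volatile sig_atomic_t shutdown_pending;

/* Signal mask to use while blocked in pselect: SIGTERM unblocked. */
static sigset_t wait_mask;

static void
on_sigterm (int signo)
{
  (void) signo;
  shutdown_pending = 1;
}

/* Install the handler and keep SIGTERM blocked during normal running,
   so it can only be delivered inside pselect.  Returns 0 on success. */
int
shutdown_setup (void)
{
  struct sigaction sa;
  sigset_t block;

  memset (&sa, 0, sizeof sa);
  sa.sa_handler = on_sigterm;
  sigemptyset (&sa.sa_mask);
  if (sigaction (SIGTERM, &sa, NULL) != 0)
    return -1;

  sigemptyset (&block);
  sigaddset (&block, SIGTERM);
  if (sigprocmask (SIG_BLOCK, &block, &wait_mask) != 0)
    return -1;
  sigdelset (&wait_mask, SIGTERM);
  assert (sigismember (&wait_mask, SIGTERM) == 0);
  return 0;
}

/* Test the flag, then wait.  Because SIGTERM is blocked between the
   test and pselect, and pselect swaps in wait_mask atomically, the
   signal cannot slip in unnoticed between the two steps.
   Returns 1 if shutdown was requested, 0 on timeout, -1 on error.  */
int
wait_for_event (const struct timespec *timeout)
{
  if (!shutdown_pending)
    {
      int rc = pselect (0, NULL, NULL, NULL, timeout, &wait_mask);

      if (rc < 0 && errno != EINTR)
        return -1;
    }
  return shutdown_pending ? 1 : 0;
}
```

A signal that arrives while it is blocked simply stays pending and
interrupts the next pselect call immediately, so the loop cannot go to
sleep with the request unnoticed.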

This approach to programming is quite a rich seam of opportunities
for threading bugs.

For example, I think the variables `check_own_socket_running' and
`shutdown_pending' are both accessed willy-nilly on multiple threads
without locking.

Ian.




Bug#841143: [pkg-gnupg-maint] Bug#841143: Suspected race in gpg1 to gpg2 conversion or agent startup

2017-01-08 Thread Ian Jackson
I have been digging in the code.  I found it very difficult to get any
useful debug logging out.  Some patches to maybe help with that will
follow, but I am still stumped as to how to get debugging output from
gpg-agent.  I tried making a stunt shell script to pass --debug-all
--no-detach and redirect stderr somewhere, but it is ineffective for
some reason.

Nevertheless, I have discovered a possible explanation for the bug.

The variable `active_connections' in gpg-agent.c seems to be updated
by multiple threads without any locking.  If it were to get corrupted,
I think gpg-agent might get stuck trying to exit, with clients which
had successfully connected at the syscall level.

Ian.




Bug#841143: [pkg-gnupg-maint] Bug#841143: Suspected race in gpg1 to gpg2 conversion or agent startup

2017-01-08 Thread Ian Jackson
Control: found -1 2.1.17-2

I tried upgrading and this has had no useful effect AFAICT.

Ian.




Bug#841143: [pkg-gnupg-maint] Bug#841143: Suspected race in gpg1 to gpg2 conversion or agent startup

2017-01-08 Thread Ian Jackson
Ian Jackson writes ("Re: Bug#841143: [pkg-gnupg-maint] Bug#841143: Suspected 
race in gpg1 to gpg2 conversion or agent startup"):
> I'm going to try adding some sleeps.

I added a sleep 1 before and after calling gpg.  Sadly this makes the
dgit test suite impractically slow and I got bored waiting for it to
finish.  I didn't see a repro of the total lockup.

However, I did notice several times that there were gpg processes
which seemed to take a long time (several seconds) to make signatures.

Ian.




Bug#841143: [pkg-gnupg-maint] Bug#841143: Suspected race in gpg1 to gpg2 conversion or agent startup

2017-01-08 Thread Ian Jackson
Ian Jackson writes ("Re: Bug#841143: [pkg-gnupg-maint] Bug#841143: Suspected 
race in gpg1 to gpg2 conversion or agent startup"):
> I'm going to try working around it by serialising all my calls to gpg.

Now I have:
 * each run of the test suite gets a fresh value of GNUPGHOME
   (ie, not a directory name that was previously used and
   has since been deleted)
 * all my calls to gpg are serialised

The problem persists, with very similar symptoms.  Of course now I
have only one stuck gpg process.  The other processes in the test
suite are blocking waiting for my gpg serialisation lock.
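
To make the serialisation concrete, here is a hedged C sketch of running
a critical section under an exclusive lock file.  It uses flock(2) as a
stand-in; with-lock-ex's own implementation may differ, and
lock_acquire() and lock_release() are hypothetical names.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/file.h>
#include <unistd.h>

/* Take an exclusive lock on PATH, creating the file if needed.
   With WAIT non-zero, block until the lock is free; otherwise fail
   at once if someone else holds it.  Returns an fd, or -1.  */
int
lock_acquire (const char *path, int wait)
{
  int fd;

  assert (path != NULL);
  fd = open (path, O_RDWR | O_CREAT, 0600);
  if (fd < 0)
    return -1;
  if (flock (fd, wait ? LOCK_EX : (LOCK_EX | LOCK_NB)) != 0)
    {
      close (fd);
      return -1;
    }
  return fd;
}

/* Drop the lock taken by lock_acquire and close its fd. */
void
lock_release (int fd)
{
  flock (fd, LOCK_UN);
  close (fd);
}
```

Each caller would take the lock in waiting mode, run its gpg invocation,
and then release it, so at most one gpg runs at a time.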

I'm going to try adding some sleeps.

Ian.

zealot:~> ps -efH | grep gpg
ian  10091 10090  0 13:44 pts/50   00:00:00 with-lock-ex -w /home/ian/things/Dgit/dgit/tests/tmp/gnupg/gnupg/1483882680.27251.2362246/dgit-gpg-serialisation-lock /usr/bin/gpg --status-fd=1 --keyid-format=long --verify /tmp/.git_vtag_tmp1Tb8EN -
ian   7897  7220  0 13:44 pts/50   00:00:00 /usr/bin/gpg --detach-sign --armor -u 39B13D8A .git/dgit/tag.tmp
ian   8639  8281  0 13:44 pts/50   00:00:00 with-lock-ex -w /home/ian/things/Dgit/dgit/tests/tmp/gnupg/gnupg/1483882680.27251.2362246/dgit-gpg-serialisation-lock /usr/bin/gpg --detach-sign --armor -u 39B13D8A tag.tmp
ian   8332  8245  0 13:44 pts/50   00:00:00 with-lock-ex -w /home/ian/things/Dgit/dgit/tests/tmp/gnupg/gnupg/1483882680.27251.2362246/dgit-gpg-serialisation-lock /usr/bin/gpg --detach-sign --armor -u 39B13D8A .git/dgit/tag.tmp
ian  10211 10180  0 13:44 pts/50   00:00:00 with-lock-ex -w /home/ian/things/Dgit/dgit/tests/tmp/gnupg/gnupg/1483882680.27251.2362246/dgit-gpg-serialisation-lock /usr/bin/gpg --detach-sign --armor -u 39B13D8A .git/dgit/tag.tmp
ian  11041  5436  0 13:52 pts/84   00:00:00   grep gpg
root 10869  5468  0 13:50 pts/94   00:00:00 gdb /usr/bin/gpg-agent 7079
ian   7079     1  0 13:44 ?        00:00:00   gpg-agent --homedir /home/ian/things/Dgit/dgit/tests/tmp/gnupg/gnupg/1483882680.27251.2362246 --use-standard-socket --daemon
zealot:~> 






Bug#841143: [pkg-gnupg-maint] Bug#841143: Suspected race in gpg1 to gpg2 conversion or agent startup

2017-01-08 Thread Ian Jackson
Werner Koch writes ("Re: Bug#841143: [pkg-gnupg-maint] Bug#841143: Suspected 
race in gpg1 to gpg2 conversion or agent startup"):
> You may want to read the top of gnupg/common/dotlock.c to see why we use
> this scheme. It is the only _portable_ way of doing advisory locks
> _across platforms_.  FWIW, GNOME uses the same code.

Well, I'm not *sure* it's startup that's the problem.  Right now I
have an instance of the dgit test suite which has hung.

As you can see in the transcripts below:

 * I have four gpg processes which are stuck.  I investigated one, and
   it is in the startup / agent connection code.  strace shows it
   selecting on the agent socket which netstat shows is CONNECTED.

 * The agent process is selecting only on an inotify fd.  Surely it
   should be selecting at least on a socket master fd.

I'm not sure I can match up all the sockets properly with this
information, but I think this is clearly a bug.  That it happens to me
in the dgit test suite, but rarely to anyone else, also suggests it's
a bug in the startup logic.

There may have been a previous agent with this GNUPGHOME, but if so
that directory was deleted and recreated before any of the current
crop of gpg were started, and the corresponding agent no longer
exists.

I have already arranged to do the conversion from gpg1 format data
(which is what the test suite starts with) once.  That is, I do this:

  some things which may leave an agent lying around, such
  as a previous run of the test suite

  rm -rf .../tests/tmp

  GNUPGHOME=.../tests/tmp/gnupg/gnupg gpg --list-secret

  many times in parallel, things which call
  GNUPGHOME=.../tests/tmp/gnupg/gnupg gpg something

This is all with 2.1.16-3.

I'm going to try working around it by serialising all my calls to gpg.

Thanks for your attention.

Regards,
Ian.

zealot:~> ps -efH | grep gpg
ian   5101  1467  0 12:20 pts/9    00:00:00   gpg --detach-sign --armor -u 39B13D8A .git/dgit/tag-dgit.tmp
ian   5099  2755  0 12:20 pts/9    00:00:00   gpg --detach-sign --armor -u 39B13D8A .git/dgit/tag-dgit.tmp
ian   5098  4213  0 12:20 pts/9    00:00:00   gpg --detach-sign --armor -u 39B13D8A .git/dgit/tag-maintview.tmp
ian   5392  5285  0 12:20 pts/9    00:00:00   gpg --detach-sign --armor -u 39B13D8A tag.tmp
ian   1961     1  0 12:20 ?        00:00:00   gpg-agent --homedir /home/ian/things/Dgit/2dgit/tests/tmp/gnupg/gnupg --use-standard-socket --daemon
ian   5502  5436  0 12:31 pts/84   00:00:00   grep gpg
zealot:~> 

root(ian)@zealot:~> strace -p5101
strace: Process 5101 attached
read(8, ^Cstrace: Process 5101 detached
 
root(ian)@zealot:~> ll /proc/5101/fd/8
lrwx------ 1 ian ian 64 Jan  8 12:34 /proc/5101/fd/8 -> socket:[5837034]
root(ian)@zealot:~> strace -p1961
strace: Process 1961 attached
pselect6(8, [7], NULL, NULL, NULL, {[], 8}^Cstrace: Process 1961 detached
 
root(ian)@zealot:~> ll /proc/1961/fd
total 0
dr-x------ 2 root root  0 Jan  8 12:28 ./
dr-xr-xr-x 9 ian  ian   0 Jan  8 12:20 ../
lr-x------ 1 root root 64 Jan  8 12:28 0 -> /dev/null
l-wx------ 1 root root 64 Jan  8 12:28 1 -> /dev/null
l-wx------ 1 root root 64 Jan  8 12:28 2 -> /dev/null
lrwx------ 1 root root 64 Jan  8 12:28 3 -> socket:[5827012]
lrwx------ 1 root root 64 Jan  8 12:28 4 -> socket:[5827013]
lrwx------ 1 root root 64 Jan  8 12:28 5 -> socket:[5827014]
lrwx------ 1 root root 64 Jan  8 12:28 6 -> socket:[5827015]
lr-x------ 1 root root 64 Jan  8 12:28 7 -> anon_inode:inotify
lr-x------ 1 root root 64 Jan  8 12:28 9 -> /dev/urandom
root(ian)@zealot:~>

root(ian)@zealot:~> gdb /usr/bin/gpg 5101
GNU gdb (Debian 7.11.1-2+b1) 7.11.1
Copyright (C) 2016 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later 
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
.
Find the GDB manual and other documentation resources online at:
.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /usr/bin/gpg...Reading symbols from /usr/lib/debug/.build-id/29/a11dd70c57cf6c90a9927b30c7f81aa3448f72.debug...done.
done.
Attaching to program: /usr/bin/gpg, process 5101
Reading symbols from /usr/lib/x86_64-linux-gnu/libgtk3-nocsd.so.0...(no debugging symbols found)...done.
Reading symbols from /lib/x86_64-linux-gnu/libz.so.1...(no debugging symbols found)...done.
Reading symbols from /lib/x86_64-linux-gnu/libbz2.so.1.0...(no debugging symbols found)...done.
Reading symbols from /usr/lib/x86_64-linux-gnu/libsqlite3.so.0...(no debugging 
symbols 

Bug#841143: [pkg-gnupg-maint] Bug#841143: Suspected race in gpg1 to gpg2 conversion or agent startup

2016-10-22 Thread Werner Koch
On Sat, 22 Oct 2016 17:15, ijack...@chiark.greenend.org.uk said:

> to be creating lockfiles with link().  It is quite difficult to make a
> reliable locking scheme with link().  I would have recommended flock

You may want to read the top of gnupg/common/dotlock.c to see why we use
this scheme. It is the only _portable_ way of doing advisory locks
_across platforms_.  FWIW, GNOME uses the same code.
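
For readers unfamiliar with the technique, here is a rough shell sketch
of link()-based advisory locking.  It is not the code in dotlock.c; the
function names dotlock_acquire and dotlock_release are hypothetical,
and the sketch assumes a local filesystem.

```shell
# dotlock_acquire LOCK: try once to take the lock file LOCK.
# Returns 0 if this caller now holds the lock, 1 otherwise.
dotlock_acquire() {
    lock=$1
    # Private, uniquely named file next to the lock (same filesystem).
    tmp=$(mktemp "$lock.XXXXXX") || return 1
    echo "$$" >"$tmp"
    # ln(1) uses link(2), which fails if LOCK already exists, so at
    # most one contender can create the lock name.
    if ln "$tmp" "$lock" 2>/dev/null; then
        rm -f "$tmp"
        return 0
    fi
    rm -f "$tmp"
    return 1
}

# dotlock_release LOCK: drop the lock by removing its name.
dotlock_release() {
    rm -f "$1"
}
```

More careful implementations usually also stat() the unique file and
check that its link count became 2, because the return value of link()
is not always trustworthy over NFS.  Stale-lock detection is left out
of this sketch entirely.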


Shalom-Salam,

   Werner

-- 
Thoughts are free.  Exceptions are regulated by a federal law.




Bug#841143: [pkg-gnupg-maint] Bug#841143: Suspected race in gpg1 to gpg2 conversion or agent startup

2016-10-22 Thread Ian Jackson
Ian Jackson writes ("Re: [pkg-gnupg-maint] Bug#841143: Suspected race in gpg1 
to gpg2 conversion or agent startup"):
> I haven't tried to narrow the test case.  I'm not 100% sure that
> concurrent execution of different gnupg instances is necessary.
> My replication is with the dgit test suite, which does run dgit but
> only in a self-contained way.

I straced a migration run in the hope that I might spot something
obvious.  I see an awful lot of very complicated activity which seems
to be creating lockfiles with link().  It is quite difficult to make a
reliable locking scheme with link().  I would have recommended flock
or fcntl.

I'm afraid I don't have time now to investigate the gnupg2 source
code.  For now I will arrange for my test suite to cause the migration
to happen once for the whole test suite.

Thanks,
Ian.




Bug#841143: [pkg-gnupg-maint] Bug#841143: Suspected race in gpg1 to gpg2 conversion or agent startup

2016-10-19 Thread Ian Jackson
Daniel Kahn Gillmor writes ("Re: [pkg-gnupg-maint] Bug#841143: Suspected race 
in gpg1 to gpg2 conversion or agent startup"):
> can you clarify the race?  i'm afraid we've been arguing about the gnupg
> upgrade in several places and i'm happy to re-focus this particular
> ticket.

Sorry about that.  I guess I must be coming across as quite grumpy.
Please don't be discouraged.  Yes, let's refocus this bug.

> i think you're saying that if two different instances of 2.1
> concurrently try to upgrade a given 1.4.x homedir, one of them may
> intermittently fail.
> 
> Is that correct?  Do you have a narrower replication example than
> running dgit repeatedly?

I haven't tried to narrow the test case.  I'm not 100% sure that
concurrent execution of different gnupg instances is necessary.
My replication is with the dgit test suite, which does run dgit but
only in a self-contained way.

> > Can you at least make the migration work every time ?
> 
> can you help me to replicate the migration failure?  from stretch, you
> can create a GNUPGHOME with gpg1 and try to trigger parallel upgrades.
> 
> I've done:
> 
> export GNUPGHOME=$(mktemp -d)
> gpg1 --gen-key
> for x in 1 2 3 4 ; do
>    echo test $x | gpg --output test$x.gpg --clearsign &
> done
> wait %1 %2 %3 %4
> 
> and it seems to work fine on my 4-core machine.
> 
> Is there a better way to replicate?

I don't know.  You could try

   sudo apt-get install dgit dgit-infrastructure devscripts debhelper
   dgit clone dgit sid
   cd dgit
   tests/using-intree tests/run-all

and then look in

   tests/tmp/NAME-OF-FAILED-TEST.log

Or you could give me a version of gnupg2 which prints a better error
message or instructions for making it produce debugging output.
Currently I see, when it fails:

  gpg: starting migration from earlier GnuPG versions
  gpg: can't connect to the agent: IPC connect call failed

This doesn't say what the errno was.  (And is "IPC connect call" a
reference to connect(2) ?)

> > This is a very broad definition of "co-installable".  In practice an
> > admonition not to use gnupg1 and gnupg2 with the same ~ is going to be
> > impractical to comply with.
> 
> That's why i'm trying to help consolidate debian to only use a single
> gpg, and to support 1.4.x only for people with unusual/antique use
> cases.

In fact the other things in your mail were much more reassuring.  For
example, given the behaviour you describe, I can convert the test
suite's $GNUPGHOME once and it will work just fine with both gnupg1
and gnupg2.  If I add private keys later with gnupg2 then those won't
be visible to gnupg1, but for me that's kind of expected.

But I would like to nail the intermittent failure before I cover it up
by making the conversion happen much less often (and probably covered
by some kind of outer lock of my own)...

Ian.




Bug#841143: [pkg-gnupg-maint] Bug#841143: Suspected race in gpg1 to gpg2 conversion or agent startup

2016-10-19 Thread Daniel Kahn Gillmor
On Wed 2016-10-19 09:39:28 -0400, Ian Jackson wrote:
> I think this bug #841143 is about a race in this upgrade path.  Do you
> intend to investigate or fix this race ?

can you clarify the race?  i'm afraid we've been arguing about the gnupg
upgrade in several places and i'm happy to re-focus this particular
ticket.

i think you're saying that if two different instances of 2.1
concurrently try to upgrade a given 1.4.x homedir, one of them may
intermittently fail.

Is that correct?  Do you have a narrower replication example than
running dgit repeatedly?

> No, I think you misunderstand.
>
> An schroot typically shares its /home with the "outside" system.
> People often use such chroots for running newer versions of things on
> an older system, or vice versa, whenever that's needed.
>
> If I have a jessie system with a stretch chroot, and I run `gnupg' in
> the stretch chroot, gnupg's conversion will mess up my ~/.gnupg so
> that my main system does not work any more.

That's not actually the case, fwiw.  gpg in the sid chroot will import
the keys from your secring.gpg into private-keys-v1.d/, and it will
create .gpg-v21-migrated, but it will not delete secring.gpg.

however, if you subsequently create new secret keys in secring.gpg from
jessie, those keys will not be visible to 2.1.x in future connections
(since it will think the migration is already done because of
.gpg-v21-migrated -- i've filed https://bugs.gnupg.org/gnupg/issue2811
as a minor improvement on this).
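
As an illustration only, a test harness could check for that marker
itself before deciding whether a homedir is about to be migrated.  The
helper name needs_migration is hypothetical, and the check relies only
on the two file names mentioned above, not on gpg's actual logic.

```shell
# needs_migration DIR: succeed when DIR has a 1.4-style secring.gpg
# but no .gpg-v21-migrated marker yet.
needs_migration() {
    [ -e "$1/secring.gpg" ] && [ ! -e "$1/.gpg-v21-migrated" ]
}

# Example (not run here):
#   if needs_migration "$GNUPGHOME"; then echo "migration pending"; fi
```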

if you use gpg 2.1 to modify ~/.gnupg/gpg.conf to include options that
1.4.x doesn't know about or can't handle, then all bets are off.  but
the same is true for ~/.ssh/known_hosts and any other comparable
software-maintained file, right?

Would you consider it a bug in ssh if an ecdsa entry added to
~/.ssh/known_hosts by a newer version of ssh wouldn't be read
successfully by an older version of ssh?

> Can you at least make the migration work every time ?

can you help me to replicate the migration failure?  from stretch, you
can create a GNUPGHOME with gpg1 and try to trigger parallel upgrades.

I've done:

export GNUPGHOME=$(mktemp -d)
gpg1 --gen-key
for x in 1 2 3 4 ; do
   echo test $x | gpg --output test$x.gpg --clearsign &
done
wait %1 %2 %3 %4

and it seems to work fine on my 4-core machine.

Is there a better way to replicate?
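
One possible variation, sketched under assumptions: with several job
specs, `wait %1 %2 %3 %4` reports only the status of the last job, so
an intermittent failure in one of the others could go unnoticed.  The
helper below (run_parallel is a made-up name) waits for each job by
PID and counts the failures.

```shell
# run_parallel N CMD...: start N copies of CMD in the background and
# return the number of copies that exited non-zero (keep N small,
# since exit statuses cannot exceed 255).
run_parallel() {
    n=$1; shift
    pids=
    i=0
    while [ "$i" -lt "$n" ]; do
        "$@" &
        pids="$pids $!"
        i=$((i + 1))
    done
    failures=0
    for pid in $pids; do
        # wait PID yields the exit status of that particular job.
        wait "$pid" || failures=$((failures + 1))
    done
    return "$failures"
}

# Example (not run here), against a fresh gpg1-created GNUPGHOME:
#   run_parallel 4 sh -c 'echo test | gpg --output "test$$.gpg" --clearsign'
#   echo "failed copies: $?"
```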

> This is a very broad definition of "co-installable".  In practice an
> admonition not to use gnupg1 and gnupg2 with the same ~ is going to be
> impractical to comply with.

That's why i'm trying to help consolidate debian to only use a single
gpg, and to support 1.4.x only for people with unusual/antique use
cases.

Thanks for helping make this change happen.

Regards,

   --dkg




Bug#841143: [pkg-gnupg-maint] Bug#841143: Suspected race in gpg1 to gpg2 conversion or agent startup

2016-10-19 Thread Ian Jackson
Daniel Kahn Gillmor writes ("Re: [pkg-gnupg-maint] Bug#841143: Suspected race 
in gpg1 to gpg2 conversion or agent startup"):
> On Wed 2016-10-19 05:47:02 -0400, Ian Jackson wrote:
> > If gnupg doesn't guarantee that v1's will work with v2 then you don't
> > have an upgrade path for your users.
> 
> We do have an upgrade path currently from v1.4.x to v2.0.x and v2.1.x.
> However, i don't know whether GnuPG upstream is willing to guarantee
> that v1 will work with v2.4.x.  If you want things to be arbitrarily
> portable, you should use the portable data formats.

I think this bug #841143 is about a race in this upgrade path.  Do you
intend to investigate or fix this race ?

> > I'll take your answer as a declaration that downgrading is not
> > supported.  Unfortunately I think this means you have a bug.
> >
> > For example, consider schroots, which typically contain /home.
> 
> an schroot will also work when upgraded across single debian versions.

No, I think you misunderstand.

An schroot typically shares its /home with the "outside" system.
People often use such chroots for running newer versions of things on
an older system, or vice versa, whenever that's needed.

If I have a jessie system with a stretch chroot, and I run `gnupg' in
the stretch chroot, gnupg's conversion will mess up my ~/.gnupg so
that my main system does not work any more.

I'm sorry to say that I think this all seems quite ill-advised.

> I'm afraid you're simply not going to get the fastest possible
> conversion if you do incur an upgrade during your test suite's
> migration.  sorry!

Can you at least make the migration work every time ?

> > Also there are institutions where all the home directories are on NFS.
> > Obviously one wouldn't recommend putting GNUPGHOME on NFS, but there
> > might be reasons why it's OK in context.
> >
> > In both of these situations the same ~ may be operated on by programs
> > from different Debian releases (or non-Debian operating systems) in
> > any arbitrary interleaved order.
> 
> I believe upstream is aware of this, which is why they've declared (for
> example) that gpg 2.0 and gpg 2.1 are not "co-installable".

This is a very broad definition of "co-installable".  In practice an
admonition not to use gnupg1 and gnupg2 with the same ~ is going to be
impractical to comply with.

Ian.




Bug#841143: [pkg-gnupg-maint] Bug#841143: Suspected race in gpg1 to gpg2 conversion or agent startup

2016-10-19 Thread Daniel Kahn Gillmor
On Wed 2016-10-19 05:47:02 -0400, Ian Jackson wrote:
> Daniel Kahn Gillmor writes ("Re: [pkg-gnupg-maint] Bug#841143: Suspected race 
> in gpg1 to gpg2 conversion or agent startup"):
>> If you have a test suite that intends to use secret key material, and
>> you want it to work across different versions of GnuPG, your test suite
>> should not ship what it thinks is a GNUPGHOME.  GnuPG doesn't guarantee
>> that one version will necessarily work with the other's.
>
> If gnupg doesn't guarantee that v1's will work with v2 then you don't
> have an upgrade path for your users.

We do have an upgrade path currently from v1.4.x to v2.0.x and v2.1.x.
However, i don't know whether GnuPG upstream is willing to guarantee
that v1 will work with v2.4.x.  If you want things to be arbitrarily
portable, you should use the portable data formats.

Similarly, if you wanted to use a mysql or postgresql database in your
test suite, you should ship the pre-populated database as a textual sql
file.

> I'll take your answer as a declaration that downgrading is not
> supported.  Unfortunately I think this means you have a bug.
>
> For example, consider schroots, which typically contain /home.

an schroot will also work when upgraded across single debian versions.
I'm afraid you're simply not going to get the fastest possible
conversion if you do incur an upgrade during your test suite's
migration.  sorry!

> Also there are institutions where all the home directories are on NFS.
> Obviously one wouldn't recommend putting GNUPGHOME on NFS, but there
> might be reasons why it's OK in context.
>
> In both of these situations the same ~ may be operated on by programs
> from different Debian releases (or non-Debian operating systems) in
> any arbitrary interleaved order.

I believe upstream is aware of this, which is why they've declared (for
example) that gpg 2.0 and gpg 2.1 are not "co-installable".

Upstream is supporting an upgrade path, but it's true that after
converting a homedir to 2.1, 1.4 cannot see the same key material any
more.

--dkg




Bug#841143: [pkg-gnupg-maint] Bug#841143: Suspected race in gpg1 to gpg2 conversion or agent startup

2016-10-19 Thread Ian Jackson
Daniel Kahn Gillmor writes ("Re: [pkg-gnupg-maint] Bug#841143: Suspected race 
in gpg1 to gpg2 conversion or agent startup"):
> If you have a test suite that intends to use secret key material, and
> you want it to work across different versions of GnuPG, your test suite
> should not ship what it thinks is a GNUPGHOME.  GnuPG doesn't guarantee
> that one version will necessarily work with the other's.

If gnupg doesn't guarantee that v1's will work with v2 then you don't
have an upgrade path for your users.

> Instead, if you have public or secret key material that you want any
> version of GnuPG to work with, you should ship that material in the
> standard OpenPGP transferable formats (e.g. the output of "gpg
> --export" and "gpg --export-secret-keys"), and then import it at
> the start of the test suite while building the GNUPGHOME for the test
> suite to use.

This seems likely to be slower than what I previously had with gnupg1.

> The contents of any particular GNUPGHOME is not a part of the GnuPG API
> contract.

All I'm relying on is that in Debian, `gnupg' works with a $HOME that
was created with a previous version of `gnupg', even perhaps from an
earlier Debian release.

I'll take your answer as a declaration that downgrading is not
supported.  Unfortunately I think this means you have a bug.

For example, consider schroots, which typically contain /home.

Also there are institutions where all the home directories are on NFS.
Obviously one wouldn't recommend putting GNUPGHOME on NFS, but there
might be reasons why it's OK in context.

In both of these situations the same ~ may be operated on by programs
from different Debian releases (or non-Debian operating systems) in
any arbitrary interleaved order.

Ian.




Bug#841143: [pkg-gnupg-maint] Bug#841143: Suspected race in gpg1 to gpg2 conversion or agent startup

2016-10-18 Thread Daniel Kahn Gillmor
On Tue 2016-10-18 14:11:21 -0400, Ian Jackson wrote:
> Ian Jackson writes ("Re: [pkg-gnupg-maint] Bug#841143: Suspected race in gpg1 
> to gpg2 conversion or agent startup"):
>> This makes it somewhat surprising that it should fail occasionally.
>> Each of the individual tests is largely single-threaded.
>
> Also, watching tests by hand shows gpg pausing for a noticeable time
> (~500ms?) even on my extremely fast laptop, perhaps when converting
> the test gpg1 keys etc. to gpg2 keys etc.
>
> If I do this conversion once in the source, will the result be useable
> by gpg1 ?  (Since I want the test suite to still work on earlier
> versions of Debian.)

If you have a test suite that intends to use secret key material, and
you want it to work across different versions of GnuPG, your test suite
should not ship what it thinks is a GNUPGHOME.  GnuPG doesn't guarantee
that one version will necessarily work with the other's.

Instead, if you have public or secret key material that you want any
version of GnuPG to work with, you should ship that material in the
standard OpenPGP transferable formats (e.g. the output of "gpg
--export" and "gpg --export-secret-keys"), and then import it at
the start of the test suite while building the GNUPGHOME for the test
suite to use.
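
A rough sketch of that workflow, assuming test keys without a
passphrase.  The function names, the GPG variable and the file names
public.asc and secret.asc are invented for illustration; only the gpg
options themselves are real.

```shell
# Which gpg binary to use; overridable, e.g. GPG=gpg1.
GPG=${GPG:-gpg}

# export_test_keys SRC_HOME OUTDIR: write the public and secret keys of
# an existing homedir as OpenPGP transferable key files in OUTDIR.
# (gpg's option for exporting public keys is spelled --export.)
export_test_keys() {
    "$GPG" --homedir "$1" --armor --export >"$2/public.asc" &&
    "$GPG" --homedir "$1" --armor --export-secret-keys >"$2/secret.asc"
}

# build_test_home KEYDIR: create a fresh private homedir, import the
# key files from KEYDIR into it, and print the new directory's path.
build_test_home() {
    home=$(mktemp -d) || return 1
    "$GPG" --homedir "$home" --batch --quiet --import \
        "$1/public.asc" "$1/secret.asc" >&2 || return 1
    echo "$home"
}

# Example (not run here):
#   export_test_keys old-gnupg-home keys
#   GNUPGHOME=$(build_test_home keys); export GNUPGHOME
```

The export step would be done once, when preparing the test data; only
the import runs at the start of each test suite run.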

The contents of any particular GNUPGHOME is not a part of the GnuPG API
contract.

  --dkg




Bug#841143: [pkg-gnupg-maint] Bug#841143: Suspected race in gpg1 to gpg2 conversion or agent startup

2016-10-18 Thread Ian Jackson
Ian Jackson writes ("Re: [pkg-gnupg-maint] Bug#841143: Suspected race in gpg1 
to gpg2 conversion or agent startup"):
> This makes it somewhat surprising that it should fail occasionally.
> Each of the individual tests is largely single-threaded.

Also, watching tests by hand shows gpg pausing for a noticeable time
(~500ms?) even on my extremely fast laptop, perhaps when converting
the test gpg1 keys etc. to gpg2 keys etc.

If I do this conversion once in the source, will the result be useable
by gpg1 ?  (Since I want the test suite to still work on earlier
versions of Debian.)

Ian.




Bug#841143: [pkg-gnupg-maint] Bug#841143: Suspected race in gpg1 to gpg2 conversion or agent startup

2016-10-18 Thread Ian Jackson
Ian Jackson writes ("Re: [pkg-gnupg-maint] Bug#841143: Suspected race in gpg1 
to gpg2 conversion or agent startup"):
> Daniel Kahn Gillmor writes ("Re: [pkg-gnupg-maint] Bug#841143: Suspected race 
> in gpg1 to gpg2 conversion or agent startup"):
> > Is it possible to create this homedir with mode 0700 ?  aiui, gpg-agent
> > doesn't want to create its sockets in a directory that other users have
> > read and write access to.
...
> I can try if doing that works around the problem, but I guess even if
> it does there's a real bug here ?  I'm happy to try to help find it.

I tried this and it doesn't seem to have made any difference.
(I haven't done enough tests to know the failure probability
before-and-after but I have seen a failure after.)

I saw in netstat that there were a lot of programs using AF_UNIX
sockets in the tests' temporary GNUPGHOME directories.

This makes it somewhat surprising that it should fail occasionally.
Each of the individual tests is largely single-threaded.

I also tried it with unsetting GPG_AGENT_INFO, as well, and that made
no difference.

Ian.




Bug#841143: [pkg-gnupg-maint] Bug#841143: Suspected race in gpg1 to gpg2 conversion or agent startup

2016-10-18 Thread Ian Jackson
Daniel Kahn Gillmor writes ("Re: [pkg-gnupg-maint] Bug#841143: Suspected race 
in gpg1 to gpg2 conversion or agent startup"):
> On Mon 2016-10-17 20:50:02 -0400, Ian Jackson wrote:
> > I have now, for the 2nd time, seen an unexplained failure while
> > running the dgit test suite, looking like this:
> >
> > + gpg --detach-sign --armor -u 39B13D8A .git/dgit/tag.tmp
> > gpg: WARNING: unsafe permissions on homedir 
> > '/home/ian/things/Dgit/dgit/tests/tmp/distropatches-reject/gnupg'
> 
> Is it possible to create this homedir with mode 0700 ?  aiui, gpg-agent
> doesn't want to create its sockets in a directory that other users have
> read and write access to.

It would be possible, but it would be somewhat undesirable (for no
other reason than that these are all test keys and test data and test
results, which people might want to share).

I can try if doing that works around the problem, but I guess even if
it does there's a real bug here ?  I'm happy to try to help find it.

Ian.




Bug#841143: [pkg-gnupg-maint] Bug#841143: Suspected race in gpg1 to gpg2 conversion or agent startup

2016-10-17 Thread Daniel Kahn Gillmor
On Mon 2016-10-17 20:50:02 -0400, Ian Jackson wrote:

> I have now, for the 2nd time, seen an unexplained failure while
> running the dgit test suite, looking like this:
>
> + gpg --detach-sign --armor -u 39B13D8A .git/dgit/tag.tmp
> gpg: WARNING: unsafe permissions on homedir 
> '/home/ian/things/Dgit/dgit/tests/tmp/distropatches-reject/gnupg'

Is it possible to create this homedir with mode 0700 ?  aiui, gpg-agent
doesn't want to create its sockets in a directory that other users have
read and write access to.
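
For example, something along these lines in the test setup would do
it (make_private_home is just an illustrative name):

```shell
# make_private_home DIR: create DIR if needed and restrict it so that
# only the owner can read, write, or search it (mode 0700).
make_private_home() {
    mkdir -p "$1" && chmod 700 "$1"
}

# Example (not run here):
#   make_private_home "$GNUPGHOME"
```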

 --dkg




Bug#841143: Suspected race in gpg1 to gpg2 conversion or agent startup [and 1 more messages]

2016-10-17 Thread Ian Jackson
Here's the log file.



distropatches-reject.log
Description: shell transcript (with set -x)




Bug#841143: Suspected race in gpg1 to gpg2 conversion or agent startup [and 1 more messages]

2016-10-17 Thread Ian Jackson
Ian Jackson writes ("Suspected race in gpg1 to gpg2 conversion or agent 
startup"):
> I have now, for the 2nd time, seen an unexplained failure while
> running the dgit test suite, looking like this:
> 
> + gpg --detach-sign --armor -u 39B13D8A .git/dgit/tag.tmp
> gpg: WARNING: unsafe permissions on homedir 
> '/home/ian/things/Dgit/dgit/tests/tmp/distropatches-reject/gnupg'
> gpg: starting migration from earlier GnuPG versions
> gpg: can't connect to the agent: IPC connect call failed
> gpg: error: GnuPG agent unusable. Please check that a GnuPG agent can be 
> started.

Perhaps this is due to me having updated my system and restarted my
session, with the result that now I seem to have a gpg-agent running.
At least, I have GPG_AGENT_INFO in my environment and I don't think I
did before.

I ran the whole test suite again and this time two tests failed the
same way.  I have added `unset GPG_AGENT_INFO' to the top of the test
suite library script just in case.  (Surely setting GNUPGHOME should
be enough to stop the test suite from using my actual gpg agent ?
With gnupg1, setting GNUPGHOME was sufficient to isolate a gnupg
instance.)  Sadly that seems not to have helped.
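
For reference, the isolation described above amounts to roughly the
following.  This is a sketch, not the actual test library code, and
the helper name with_test_gnupg is made up.

```shell
# with_test_gnupg HOME_DIR CMD...: run CMD with GNUPGHOME pointing at
# HOME_DIR and without any inherited GPG_AGENT_INFO.  The subshell
# keeps the caller's own environment untouched.
with_test_gnupg() {
    (
        unset GPG_AGENT_INFO
        GNUPGHOME=$1
        export GNUPGHOME
        shift
        "$@"
    )
}

# Example (not run here):
#   with_test_gnupg "$tmp/gnupg" gpg --detach-sign --armor -u 39B13D8A tag.tmp
```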

Ian.




Bug#841143: Suspected race in gpg1 to gpg2 conversion or agent startup

2016-10-17 Thread Ian Jackson
Package: gnupg
Version: 2.1.15-4

I have now, for the 2nd time, seen an unexplained failure while
running the dgit test suite, looking like this:

+ gpg --detach-sign --armor -u 39B13D8A .git/dgit/tag.tmp
gpg: WARNING: unsafe permissions on homedir 
'/home/ian/things/Dgit/dgit/tests/tmp/distropatches-reject/gnupg'
gpg: starting migration from earlier GnuPG versions
gpg: can't connect to the agent: IPC connect call failed
gpg: error: GnuPG agent unusable. Please check that a GnuPG agent can be 
started.
gpg: migration aborted
gpg: skipped "39B13D8A": No secret key
gpg: signing failed: No secret key
dgit: failed command: gpg --detach-sign --armor -u 39B13D8A .git/dgit/tag.tmp

This is with a private setting:
 GNUPGHOME=/home/ian/things/Dgit/dgit/tests/tmp/distropatches-reject/gnupg

I have attached a complete shell transcript from the particular dgit
test suite test.  (FTR the test failed with roughly but not exactly
dgit 2.2).  I will keep the tmp directory.  There doesn't seem, now,
to be an agent running.

Thanks,
Ian.
