Hi, Junio. Thanks for replying.

> I postponed it when I saw it the first time to see if anybody
> comments on it, and then it turns out nobody was interested, and it
> remained uninteresting to the list to this day.  
> 

True, that's what I was afraid of, but I wanted to give it some closure.

> Now, after looking at the message again, from the patch description,
> I would believe you that you experienced _some_ trouble when the
> gpg-agent that is auto-spawned by gpg gets left behind (as I do not
> see any hits from "git grep gpg-agent t/", we are not deliberately
> using the agent). 

True, this is what I could gather from asking people over at #gnupg. The
agent spawns a socket for a GNUPGHOME and leaves it open outside of the
home folder (and it caches the inode for the olddir or so).

> However, I could not convince myself that the
> solution is credible.  

I think you're right on this. I'd rather have more people reproduce the
issue (some of my colleagues were able to do so, but we all were running
the latest GPG, vanilla conf) and find the root of the issue too.

> but the current directory of this part is the $TRASH_DIRECTORY,
> which is always created anew from the beginning in a test.  What
> agent process could possibly be running there immedately after
> creating ./gpghome (which happens with "mkdir gpghome &&" without
> "-p" just before the context of this hunk---if the agent was running
> already, the directory would have been there, and mkdir would have
> failed, which would have caused "test_set_prereq GPG" at the end of
> the "&&" cascade to be skipped.  In other words, it is unclear to
> me, and your log message does not sufficiently explain, why this is
> the right solution (or what the exact cause of the symptom is, for
> that matter).

I see. What is causing this (as far as my current understanding goes)
is:

1) First iteration of the tests is run, the .gpghome is created and a
    unix socket is created too. The keychain is imported etc. Tests run
    normally.

2) The test ends, and the trash directory (along with the .gpghome) are
    flushed, but the agent is not aware of this. The socket is still
    open.

3) The second iteration of the tests is run, the new .gpghome is
    created, but when the keychain fails to import and the agent errors
    out with ENOENT. The and-chain fails and test_set_preqreq GPG is
    skipped (as you pointed out).

This last bit is apparently a result of the agent trying to be clever
with the paths. 

> 
> Or perhaps the gpg-agent socket is created somewhere outside the
> GNUPGHOME and stays behind even after a previous run of the same
> test finishes and $TRASH_DIRECTORY gets cleared (I am guessing the
> "what the exact cause is" part, as the log message did not explain
> it)?  If that is the case, it makes me wonder if either of the two
> alternative may be a more robust solution: (1) running gpg tests
> telling gpg never to use agent, or (2) telling gpg and gpg-agent to
> use a socket inside GNUPGHOME.

I agree. In hindsight this solution seems rather naive. I'll dig into
the root cause, as well as to try to isolate the issue from a
gpg-version and gpg-config viewpoint.

> After all, "kill"ing agent blindly like the above patch would mean
> you do not know what other party is relying on the proper operation
> of the thing you are killing.  That sounds more like a workaround
> that a solution (unless it is explained with a solid reason why that
> is the right way to run more than one instances of GPG).

I agree. It is probably better to seek the solutions that you suggested
above.

> 
> Perhaps everybody else is running these gpg tests without having to
> worry about gpg-agent already because their environment is more
> vanilla, but you have some configuration or environment that cause
> gpg to use agent, and that is the reason why nobody is interested
> (because they have never seen the symptom)?  It is possible that the
> part of t/test-lib.sh that tries to cleanse environment variables
> and other "external influence" to give us a predictable and
> repeatable test is unaware of such a knob that only some developers
> (including you) have and the rest of us were merely lucky.  Perhaps
> we need to throw GPG_AGENT_INFO SSH_AUTH_SOCK etc. into the list of
> envirionment variables to nuke there?
> 
> Combined with the unknown-ness of the root cause of the issue, I can
> only say that the patch may be raising an issue worth addressing,
> but it is too sketchy to tell if it is a right solution or what the
> exact problem being solved is.

I'll dig into this. This sounds a way more reasonable approach.

Thanks for the feedback!
-Santiago.

Attachment: signature.asc
Description: PGP signature

Reply via email to