[Bug 6939] spamc[7611]: connect(AF_UNIX) to spamd failed: Connection refused

bugzilla-daemon Fri, 31 May 2013 10:28:20 -0700

https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6939


--- Comment #88 from Kip <[email protected]> ---
(In reply to Sidney Markowitz from comment #85)
> I'm jumping in kind of late, after 84 comments, but since I was one of the
> people involved in the original coding of spamc and the decision to have
> spamc fail safe by returning a 0 exit code under what might normally be
> considered an error condition I have an interest in understanding if there
> is something that needs to be fixed here. But I'm afraid the above comments
> have left me very confused.

Hey Sidney. I'm sorry about that and I had hoped the discussion thus far had
been more constructive and I don't fault you for having skipped the bulk of it.

> First of all, is it the case that the original bug that was reported here
> turned out to be that spamc was configured to talk to a tcp port but spamd
> was listeneing on a unix socket or vice versa, i.e., 

No, sorry if that was not clear. The problem was in how it handled a bad unix
socket parameter under certain conditions. I don't know enough about the TCP
method to say if there's an issue with that transport method, but someone else
more knowledgeable might be interested in looking into that. Although that's
not an issue I'm having.

> Or is there a problem that was demonstrated by the 400 hanging spamc
> processes, not to be confused with the output of ps aux | grep "[spamc]" |
> wc -l which on one of my machines returns 291 even though spamc is not
> installed on that machine? 

The regexp was wrong, but I did not apply it until after running htop and
seeing the number of spamc processes scrolling well off the page. But anyways,
it may well have been related, but I don't know yet for sure until I can verify
at least how each process was being invoked.

> If hanging spamc processes is the bug being
> reported, under what circumstances does it hang? 

My issue originally was not the hanging, although that came up later. I knew my
SA was not working for some reason, but I did not know why. All I would see in
mail.err were many of the following:

     spamc[7611]: connect(AF_UNIX) to spamd  failed: Connection refused

I checked to make sure spamd was running and it looked fine. I reinstalled it,
restarted the service, rebooted, etc., and even verified permissions and it all
seemed fine.

The double space after spamd eventually caught my eye and I went to
investigate. It turned out the path to the UNIX socket belonging to the spamd
daemon was an empty string. I did some more digging and, for whatever reason,
Evolution was invoking spamc with an empty -U "" parameter. That's definitely
Evolution's fault as everyone agrees thus far, but spamc's handling of this
situation could have been more graceful.

> I can see some mention in
> comments that if you allow STDIN to sit without giving it any input then
> spamc will continue to wait for it even if the argument to -U is not a valid
> socket path. I don't see how Evolution could have been using spamc in a way
> that would trigger that kind of hang. 

I actually don't know definitively if or how Evolution was causing them to
hang, so I'm in agreement with you. Thankfully John's given me a useful tip to
track down how they were invoked so we can figure that out. I just need to wait
for it to happen again.

> Even if Evolution passed an empty
> string or other invalid socket path to the -U option, that would still
> terminate after reading in the input.

If -x was provided, then yes, it should have. But for whatever reason, it did
not output the following on my mail.err that I see when I manually invoked it
myself on the command line.

    spamc[8213]: invalid usage

That's fine as error messages go. However, in the case where Evolution was
invoking it, that was not present. Based on the debugging I did, not all code
paths that I examined relied on a transport socketpath variable checked to
ensure that it was non-empty. They did, however, check as far as I could see,
that the string was non-null. Given that those assertions were being made,
coupled with the malformed string contained in "spamd  failed", it was an
indication to me that there was a problem and this was not intentional.

>  echo foo | spamc -U "" -x ; echo $?
> 
> That has empty STDOUT output (does not fail safe) but does indicate the
> error with an exit code 69. The -x option is for people who would rather
> have spamc's caller check for the exit code and handle an error such as -U
> "" instead of having the failsafe behavior.

This is probably the closest analogue to what Evolution was trying to do.

I think there are two problems with this. As you know, most usage of SA is in a
daemonized environment where the user is not going to be able to see the
console output. It's true, the calling application could and should handle the
return code, but even in the case where it is, and Evolution appears to be
doing that, based on my reading of the source (though I'm not certain), it was
not immediately evident based on the mail.err log what the problem actually
was.

The second problem is it seems as though the -x makes spamc behave more as a
user might expect it to behave on a POSIX system. Granted, because of what it
is trying to do and what it must not lose, there can be a good exception to
this here. However, given that at least three or four people above may not have
immediately realized this, even though the behaviour is documented in the man
page, the message on mail.err could have been more helpful since there may be
many others with the same expectation. Karsten and I agree to disagree on how
it handles this, but that is not really the issue that I was trying to dwell on
anyways.

> It's one thing to fail completely if the command line syntax is
> totally wrong, such as no argument to -U. That is an error in setting up the
> mail system which will be caught before going into production. But an
> argument to -U may work fine one day and then no longer be a valid socket
> path the next. 

Agreed. And while the patch could be improved to address that, we can both
agree that an empty socket path will always be invalid on any day.

> The proposed patch could cause a working system to start
> rejecting mail (or losing it if the exit code is not properly checked) if
> something goes wrong and the socket path stops being valid. 

Yes, this is possible, and as I mentioned the patch can be improved to address
that. However, we can reduce the likelihood of it being used incorrectly, even
when supplemented with proper checking of exit codes, if developers and users
can identify wrong usage earlier through a more meaningful error message.

I would venture to guess that probably all higher level applications that
invoke spamc probably flank the -U parameter in quotations for safety reasons,
and it is possible that they too might under certain conditions provide nothing
within them. They should not do that, we agree, but that their code is not
within the purview of SA's.

> The current code
> will failsafe. If you really want the behavior that the patch provides, use
> the -x option, which not only will produce the same result of returning a
> non-zero exit code given a socket path that doesn't stat, it will do it even
> with a valid socket path that is not to a listening spamd, something the
> patch doesn't do. 

I understand what you are saying, but bear in mind that most everyday usage of
spamc is not through direct command line usage by the user, but through some
other higher level application, be it a mail client, a cron job, or what have
you. The standard approach that supplements what you are already trying to do
is to make an adequate note in /var/log/ (e.g. mail.err) (FHS 5.10.1).

> Well, maybe I did miss the description of the actual bug in the previous 84
> comments, so if I misunderstood something I would like to know what needs
> fixing, or how the proposed patch does something more useful that the
> existing code does not do.

I hope that was helpful.

(In reply to John Hardin from comment #87)
> (In reply to John Hardin from comment #86)
> > (In reply to Kevin A. McGrail from comment #13)
> > > Look at adding the patch to spamc.  It isn't a bad patch despite the
> > > volatility of the discussion surrounding it.
> > 
> > +1
> > 
> > Can we *please* put and end to this?
> 
> After reading Sidney's comments, I'd like to revise this: accept the part of
> the patch that provides meaningful error messages, but don't fail unsafe.

Thanks John. I'm fine with that. It would have been more useful to the user
with a more meaningful error message, and consequently might have helped an
Evolution developer, or any other mail client, identify an issue with spamc's
invocation before this became an issue in the first place.

-- 
You are receiving this mail because:
You are the assignee for the bug.

[Bug 6939] spamc[7611]: connect(AF_UNIX) to spamd failed: Connection refused

Reply via email to