* Pascal Stumpf on Wed, Oct 12, 2011 at 05:39:48PM +0200:
> > Check your /var/log/lpd.errs.

> Doesn't contain anything but "restarted" messages.

> > Also, ktracing lpd with the -i flag might give a clue to what the
> > lpd child is doing.

> Apparently, it segfaults:

> I remembered I had the "S" malloc flag set, so I removed
> /etc/malloc.conf, and ta-daaa, works. So this is a bug in the lpd code.
> I suspect it is somewhere in the "common" code for all lp programs, as
> I've also experienced SIGSEGVs in lpc. I'll see if I can hunt it down
> further if I have time ...

I had a very similar problem after my last upgrade to -current.
lpr would spool new jobs but complain about 'unable to start daemon'.
Restarting lpd, purging the queue and some other hocus-pocus eventually
got printing going again, but pretty much at random -- sometimes it'd
just work. (All that without the 'S' flag in malloc.conf, though.)

The patches from Otto and Todd (i.e., today's snapshot) made the problem
disappear -- many thanks! The rest of the message is just for the
archives (Googling for this kind of problem is an exercise in
frustration...).

The log was basically useless (the lpd master process _did_ see and log
the new jobs, but then apparently did nothing about them). After digging
through the code, it seems to be the same problem as Pascal's: the lpd
children were dying instead of working, and from then on the whole
system got out of sync.

What stymied me was that the whole lpr/lpd code hadn't been touched in
years (except for mandoc stuff); since I'd upgraded from 4.7, in theory
nothing should have changed, so everything should still have been
working -- until I stumbled over this thread.


Now that I've already waded through that code (and if my meagre C skills
allow it), I'll try to gently add a few lines of diagnostic messages for
the log, so that this kind of problem is easier to hunt down in the
future.

So in this regard, what's the established practice in these situations?
Is code for those kinds of base daemons expected to be correct or should
there be a degree of 'mistrust'? Or in other words: Should lpd assume
that its children will never segfault, or should it assume that
sometimes, something may happen and try to restart?
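
To illustrate the 'mistrust' variant, a rough sketch, assuming a parent
that simply waits for one child at a time. This is not how lpd is
structured; run_supervised(), flaky_job(), crashing_job() and the
MAX_RESTARTS limit are all made up for the example.

```c
#include <sys/types.h>
#include <sys/wait.h>

#include <signal.h>
#include <stdio.h>
#include <unistd.h>

#define MAX_RESTARTS 3	/* assumed limit, chosen only for illustration */

/*
 * Hypothetical supervisor, not lpd's actual logic: run job() in a child
 * and start it again if the child is killed by a signal. A normal exit,
 * even a failing one, is treated as final.
 * Returns the child's exit status, or -1 if it kept crashing or a
 * system call failed.
 */
static int
run_supervised(int (*job)(int attempt))
{
	int attempt, status;
	pid_t pid;

	for (attempt = 0; attempt <= MAX_RESTARTS; attempt++) {
		pid = fork();
		if (pid == -1)
			return -1;
		if (pid == 0)
			_exit(job(attempt));	/* child: do the work */

		if (waitpid(pid, &status, 0) == -1)
			return -1;
		if (WIFEXITED(status))
			return WEXITSTATUS(status);

		/* Child died from a signal: report it and try again. */
		fprintf(stderr, "child %ld killed by signal %d, attempt %d\n",
		    (long)pid, WTERMSIG(status), attempt);
	}
	return -1;
}

/* Example job: dies on the first two attempts, then succeeds. */
static int
flaky_job(int attempt)
{
	if (attempt < 2)
		raise(SIGTERM);
	return 0;
}

/* Example job: dies on every attempt. */
static int
crashing_job(int attempt)
{
	(void)attempt;
	raise(SIGTERM);
	return 0;
}
```

Here run_supervised(flaky_job) gives up on nothing and returns 0 after
two restarts, while run_supervised(crashing_job) returns -1 once the
limit is reached. The limit matters: restarting forever would only hide
a real bug behind a busy loop.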

Up until recently (I haven't taken a look at the new rc-scripting stuff
yet) the way daemons were started suggested the former.


Cheers,
    s//un

-- 
When I read about the evils of drinking, I gave up reading.
                -- Henny Youngman
