At 1:36 PM +0100 3/17/01, Oliver Fleischmann wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
>
>> At 12:28 PM +0100 3/16/01, Oliver Fleischmann wrote:
>
>> > Mar 16 09:35:45 chlothar popper[24129]: ba3760 at
>> > p3E9E2005.dip.t-dialin.net (62.158.32.5): -ERR [IN-USE]
>> > /var/spool/mail/.ba3760.pop lock busy! Is another session active? (11)
>> >
>> > Now he can't log in any more:
>
>> > The qpopper process (PID 24123) was still lying around at this time,
>> > as was the .ba3760.pop lock file. Killing it with -HUP does nothing;
>> > I need to use -TERM, which removes the process from the table but
>> > doesn't clean up the lock file, so I have to clean up myself.
>> >
>> > We are experiencing this every now and then, and only with some
>> > users. But those few users trigger the problem almost every day. At
>> > least one of them used Microsoft Outlook (don't know which version).
>> >
>> > Do I need to go for another POP3 server software? Is there still any
>> > ongoing development for the free version of qpopper?
>
>> Next time it happens, can you get a trace of system calls on the hung
>> process? This is done by running the system trace facility, which
>> differs depending on the flavor of Unix. On Solaris it is truss(1);
>> on Linux it is strace(1). Other platforms may have one of these or
>> something else. Usually a 'man -k trace' shows it. For example, on
>> Solaris, if the hung process is pid 1234, use 'truss -o truss-out
>> -p 1234'.
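As a rough sketch of that choice (the function name trace_cmd and the
output file names are only illustrative, not part of Qpopper or of either
tool), something like this prints the command to run for a given platform
and pid. It only prints the command; it does not attach to anything.

```shell
# Print the system-call trace command for a platform, pid, and output file.
# The platform string is what 'uname -s' reports (Linux, SunOS, ...).
trace_cmd() {
    os=$1
    pid=$2
    out=$3
    case "$os" in
        Linux)
            # strace attaches to a running process with -p
            printf 'strace -p %s -o %s\n' "$pid" "$out"
            ;;
        SunOS)
            # truss expects its options before -p
            printf 'truss -o %s -p %s\n' "$out" "$pid"
            ;;
        *)
            echo "no known tracer for $os; try 'man -k trace'" >&2
            return 1
            ;;
    esac
}
```

For example, 'trace_cmd "$(uname -s)" 1234 trace-out' prints the command
for the local machine. Attaching normally has to be done as root or as
the user that owns the process.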
>
> I ran strace on a hanging qpopper process in a very early state (still
> running as user root, so the user has not authenticated yet):
>
> read(0, 0xbfffd68c, 1) = ? ERESTARTSYS (To be restarted)
> --- SIGHUP (Hangup) ---
> rt_sigaction(SIGHUP, {0x80505a0, [], SA_RESTART|0x4000000}, {0x80505a0, [],
> SA_RESTART|0x4000000}, 8) = 0
> rt_sigaction(SIGPIPE, {0x80505a0, [], SA_RESTART|0x4000000}, {0x80505a0, [],
> SA_RESTART|0x4000000}, 8) = 0
> sigreturn() = ? (mask now [ALRM])
> read(0, 0xbfffd68c, 1) = ? ERESTARTSYS (To be restarted)
> --- SIGHUP (Hangup) ---
> rt_sigaction(SIGHUP, {0x80505a0, [], SA_RESTART|0x4000000}, {0x80505a0, [],
> SA_RESTART|0x4000000}, 8) = 0
> rt_sigaction(SIGPIPE, {0x80505a0, [], SA_RESTART|0x4000000}, {0x80505a0, [],
> SA_RESTART|0x4000000}, 8) = 0
> sigreturn() = ? (mask now [ALRM])
> read(0, 0xbfffd68c, 1) = ? ERESTARTSYS (To be restarted)
> --- SIGTERM (Terminated) ---
>
> If I just run strace on the process, it sits in "read(0, " forever. I then
> sent it "kill -HUP" twice and finally a plain "kill" while strace was
> running.
I am confused. Are you running Qpopper in standalone mode? If not,
a HUP should cause it to clean up and exit. Also, if Qpopper is
waiting on user input then strace(1) will of course show it
blocked in a 'read(0'. It should stay there until input arrives or
the timer expires.
>
> Processes hanging in a later state show the same strace output, as far as I
> remember.
What I'd like to see is the strace(1) output for a process that has
reported the error and yet has not exited.
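
One way to find such a process is sketched below, assuming syslog lines
shaped like the one you quoted ("popper[PID]: ... -ERR ..."). The
function name lingering_poppers is made up for illustration, and the log
file path is whatever your syslog uses for Qpopper.

```shell
# Print the pids of popper processes that logged an -ERR line in the
# given syslog file and that still exist. Run as root: for a process
# owned by another user, 'kill -0' fails for anyone else even when the
# process is alive. Pids can be reused, so check the result against ps.
lingering_poppers() {
    log=$1
    sed -n 's/.* popper\[\([0-9][0-9]*\)]: .*-ERR.*/\1/p' "$log" |
    sort -u |
    while read -r pid; do
        # kill -0 sends no signal; it only tests whether the pid exists
        if kill -0 "$pid" 2>/dev/null; then
            echo "$pid"
        fi
    done
}
```

Each pid it prints is a candidate for 'strace -p PID -o FILE'.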
>
>> It would also be helpful to enable debug tracing in Qpopper.
>>
>> To enable tracing in Qpopper:
>>
>> 1. Do a 'make clean'
>> 2. Re-run ./configure, adding '--enable-debugging'.
>> 3. Edit the inetd.conf line for Qpopper, adding '-d' or '-t tracefile'.
>> 4. Send inetd a HUP signal.
>>
>> This causes detailed tracing to be written to the syslog (if you used
>> '-d') or to the file specified as 'tracefile'.
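
For step 3, a minimal sketch follows. It assumes the Qpopper entry in
inetd.conf is the only non-comment line containing "popper";
add_trace_flag is a made-up helper name, and it only prints the edited
file to stdout rather than changing anything in place.

```shell
# Print an inetd.conf with "-t TRACEFILE" appended to each non-comment
# line that runs popper and does not already carry -d or -t.
add_trace_flag() {
    conf=$1
    tracefile=$2
    awk -v tf="$tracefile" '
        /^[^#]/ && /popper/ && !/ -[dt]( |$)/ { print $0 " -t " tf; next }
        { print }
    ' "$conf"
}
```

For example, 'add_trace_flag /etc/inetd.conf /var/log/popper.trace'
(both paths are assumptions) prints the candidate file for review before
it is installed and inetd is sent the HUP from step 4.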
>
> I have a qpopper with debugging compiled in running on a different port
> number and am trying to persuade the affected users to use that port.
That would be helpful, thank you.