On Fri, Dec 14, 2001 at 09:53:03PM -0800, Andrew A. Chen wrote:
> Hello.  Today is a double dose of bugs.
> 
> 1.) The HTTPD bug is still present in the POE CVS from two days 
> ago...  Here's the error from the program.

[...]

I'll get back to this one when I can run it.  I've got to be somewhere
soon.

> 2.) All due apologies to Mr. Bergman for stealing his bug... I seem to have 
> hit the "POE starts growing out of control, consuming all RAM and CPU, and 
> unresponsive to signals except -9" bug.  Incidentally, it's the same 
> program listed above, except TRACE_DEFAULT was set to 1.

Did Artur's bug include runaway memory growth?

> Unlike Mr. Bergman's problem, where it took weeks to produce the bug, I was 
> able to cause POE to do this in minutes.  I actually happen to have this 
> running through tee to a logfile... It's at 
> http://stupid.divo.net/~achen/blankimg.output.gz.  It's 600kB uncompressed, 
> about 20kB compressed.  What sucks is I can't seem to duplicate it 
> again.  =/  Sorry.

That log file doesn't show anything running away.  I'm looking at the
"Event times" lines:

  *** Event times: 328=1.00
  *** Event times: 328=0.00
  *** Event times: 329=1.00
  *** Event times: 329=0.00

Those are the due times in seconds, relative to time().  So event 328
was due in one second, and then it was due "now".  The same goes for
329, and for pretty much all of them.

Throwing in the "iterating" lines:

  *** Kernel::run() iterating.  now(326.99) timeout(1.00) then(327.99)
  *** Event times: 328=1.00
  *** Kernel::run() iterating.  now(327.99) timeout(0.00) then(328.00)
  *** Event times: 328=0.00

This shows that select() is timing out as it should.  now() is the
time in seconds since $^T, timeout() is the timeout used for select(),
and then() is the time the next alarm is due.
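
For illustration, here is a minimal sketch of how those three numbers
relate.  It is not POE's actual code: select_timeout() and the
0.2-second alarm are made up for the example, and $start stands in
for $^T.

```perl
use strict;
use warnings;
use Time::HiRes qw(time);

# Hypothetical helper, not part of POE: how long select() should wait
# before the next alarm is due.
sub select_timeout {
    my ($now, $next_due) = @_;
    my $timeout = $next_due - $now;
    return $timeout > 0 ? $timeout : 0;   # an overdue alarm means "don't wait"
}

my $start    = time();         # stands in for $^T
my $next_due = $start + 0.2;   # one pretend alarm, due in 0.2 seconds

my $timeout = select_timeout(time(), $next_due);
select(undef, undef, undef, $timeout);    # four-argument select() used as a sleep

# Report the values relative to $start, in the same shape as the trace lines.
printf "now(%.2f) timeout(%.2f) then(%.2f)\n",
    time() - $start, $timeout, $next_due - $start;
```

If select() returns once the timeout has passed and the next pass
computes a timeout of about zero, the loop is behaving the way the
log shows.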

....

The problem sounds like infinite recursion.  If that's the case, it
would happen just after the last log record.  Since the program would
be on a downward spiral into the bottom-most circle of malloc hell,
the log would never be updated again.

Unless you stopped the log at an arbitrary place, the last thing perl
did before dying was call select().  Artur suggested that signals were
the problem, and this backs up his idea.

So perl is possibly recursing in signal handlers.  Since there's
nothing to lose (in all probability, it's already dying horribly on
signals), I'd put warn() statements in the signal handlers
(POE::Kernel::Select's _substrate_signal_handler_xyz functions) to
track down which signal is doing it.
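
Something along these lines would do as a starting point.  This is a
stand-alone sketch, not the POE handlers themselves: traced_handler(),
%depth and %seen are names invented for the example, and it assumes a
Unix-like system where SIGUSR1 exists.

```perl
use strict;
use warnings;

my %depth;   # how many times each handler is currently active (hypothetical)
my %seen;    # total deliveries per signal (hypothetical)

sub traced_handler {
    my ($signal) = @_;   # perl passes the signal name as the first argument
    $seen{$signal}++;
    $depth{$signal}++;
    warn "caught SIG$signal (depth $depth{$signal}, total $seen{$signal})\n";
    # ... the handler's real work would go here ...
    $depth{$signal}--;
}

$SIG{USR1} = \&traced_handler;

kill 'USR1', $$;                          # send ourselves one signal
select(undef, undef, undef, 0.1);         # give perl a moment to deliver it
```

A depth that keeps climbing in the warn() output would suggest the
handler is being re-entered before it finishes, and the signal name
says which one to look at.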

Meanwhile, I'll read the source really carefully and try to re-create
the problem here.

-- Rocco Caputo / [EMAIL PROTECTED] / poe.perl.org / poe.sourceforge.net
