Thanks, this helps - what I'm seeing now is copied below. Basically, pth is 
aborting because there are no threads to schedule.

I don't understand why pth thinks there are no threads left to schedule, 
though. I have several different kinds of threads in my app, 30 or 40 in 
total, that all share a similar-looking main loop:

   while (!Done()) {
     HandleNextMessage();
     Yield();
   }
(Yield is just a wrapper for pth_yield(NULL).)
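
For concreteness, here is a minimal sketch of one such worker in C. The 
struct worker, its fields, and the HAVE_PTH build macro are hypothetical and 
only there for illustration; my real Done() and HandleNextMessage() are more 
involved. Without HAVE_PTH defined, Yield() only counts calls, so the loop can 
be run stand-alone.

```c
#include <stdbool.h>
#include <stddef.h>

#ifdef HAVE_PTH               /* hypothetical macro: set when building against Pth */
#include <pth.h>
#endif

/* Number of Yield() calls so far; kept only for illustration and testing. */
static unsigned long yield_count = 0;

/* The wrapper mentioned above: give the CPU back to the Pth scheduler. */
static void Yield(void)
{
    yield_count++;
#ifdef HAVE_PTH
    pth_yield(NULL);          /* NULL: let the scheduler pick the next thread */
#endif
}

/* Hypothetical per-thread state standing in for the app's real message queue. */
struct worker {
    int pending;              /* messages still waiting to be handled */
    int handled;              /* messages handled so far */
};

/* The loop ends once no messages are pending (an assumption of this sketch). */
static bool Done(const struct worker *w)
{
    return w->pending == 0;
}

/* Consume exactly one pending message. */
static void HandleNextMessage(struct worker *w)
{
    w->pending--;
    w->handled++;
}

/* Thread entry point, using the void *(*)(void *) signature pth_spawn() takes. */
static void *worker_main(void *arg)
{
    struct worker *w = arg;

    while (!Done(w)) {
        HandleNextMessage(w);
        Yield();
    }
    return NULL;
}
```

With Pth linked in, a thread like this would be started after pth_init() with 
something like pth_spawn(PTH_ATTR_DEFAULT, worker_main, &w).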

Because it's a server, the network input thread (and to a lesser degree the 
signal handling thread) drives everything; the network input thread has a 
main loop like:

   while (!Done()) {
     HandleNetworkInput();
     HandleAvailableMessages();
     Yield();
   }

Is there some constraint on how often threads can yield? In other words, if 
all the threads just keep doing trivial amounts of calculation and then 
yield, will pth bomb out?

Do I need to add an extra dummy thread that calculates Fibonacci numbers or 
something to prevent the abort() call? If so, how much calculation should 
be done before each call to yield?

Thanks in advance,

Brent

---details below---

#3  0x40053795 in __pth_scheduler (dummy=0x0) at pth_sched.c:204
204                 abort();
(gdb) list
199              */
200             pth_current = pth_pqueue_delmax(&pth_RQ);
201             if (pth_current == NULL) {
202                 fprintf(stderr, "**Pth** SCHEDULER INTERNAL ERROR: "
203                                 "no more thread(s) available to schedule!?!?\n");
204                 abort();
205             }
206             pth_debug4("pth_scheduler: thread \"%s\" selected (prio=%d, qprio=%d)",
207                        pth_current->name, pth_current->prio, pth_current->q_prio);
208

At 09:29 AM 6/12/2001 -0700, you wrote:
>Brent Phillips wrote:
> > I've been fighting a strange problem for a while now; I'm not 100% sure if
> > it's in pTh or somewhere else.
> >
> > Basically, my app keeps coredumping as a result of a SIGABRT (which I think
> > is coming from pTh) with the following stack trace:
> >
> > #0  0x400f2111 in __kill ()
> > #1  0x400f1d66 in raise (sig=6) at ../sysdeps/posix/raise.c:27
> > #2  0x400f3447 in abort () at ../sysdeps/generic/abort.c:88
> > #3  0x40053795 in __pth_scheduler ()
> > #4  0x400550ee in pth_spawn_trampoline ()
> > #5  0x40053248 in pth_mctx_set_bootstrap ()
> > #6  0x400531c6 in pth_mctx_set_trampoline ()
> > #7  <signal handler called>
> >
> > Is there any way to get a better idea of why the SIGABRT is being raised?
>
>Sure.. edit the generated Makefile and add "-g" to CFLAGS so that
>debugging information is preserved. Rebuild and reinstall the library
>and use file(1) to verify that it is "not stripped". Regenerate the above
>stacktrace which should show you the file and line number where abort()
>is being called (maybe within an assert() statement).
>
>-Archie
>
>__________________________________________________________________________
>Archie Cobbs     *     Packet Design     *     http://www.packetdesign.com
>______________________________________________________________________
>GNU Portable Threads (Pth)            http://www.gnu.org/software/pth/
>User Support Mailing List                            [EMAIL PROTECTED]
>Automated List Manager (Majordomo)           [EMAIL PROTECTED]

