On Tuesday 19 May 2009 12:52:43 James Harper wrote:
> > I think we need to move
> > num_workers++ to just before the pthread_create, and decrement it if
>
> the
>
> > pthread_create fails. I'll try that now...
>
> This patch against the current svn resolves the problem for me:
>
> Index: jobq.c
> ===================================================================
> --- jobq.c      (revision 8838)
> +++ jobq.c      (working copy)
> @@ -358,8 +358,10 @@
>        Dmsg0(2300, "Create worker thread\n");
>        /* No idle threads so create a new one */
>        set_thread_concurrency(jq->max_workers + 1);
> +      jq->num_workers++;
>        if ((stat = pthread_create(&id, &jq->attr, jobq_server, (void
> *)jq)) != 0) {
>           berrno be;
> +         jq->num_workers--;
>           Jmsg1(NULL, M_ERROR, 0, _("pthread_create: ERR=%s\n"),
> be.bstrerror(stat));
>           return stat;
>        }
> @@ -386,7 +388,6 @@
>     set_jcr_in_tsd(INVALID_JCR);
>     Dmsg0(2300, "Start jobq_server\n");
>     P(jq->mutex);
> -   jq->num_workers++;
>
>     for (;;) {
>        struct timeval tv;
>
> Kern: I can commit this if it's okay with you?

Thanks for finding this, but please do not commit the above patch at least for 
the moment.  The problem is indeed a race condition as you found.  Your fix 
does significantly reduce the window for it to happen but does not eliminate 
it.  

The correct solution is to put the jq mutex lock/unlock (i.e. P/V) around the 
test for max_workers at line 357 of jobq.c and to move the incrementing of 
the count as you have done.   However, it is even more complicated because 
you only removed one of the many decrements of num_workers from jobq_server, 
and it is not so easy to remove them all.  In addition for the jobq_server to 
work correctly, it must be able to release and reset the lock.  Some careful 
thought needs to go into this.  Possibly the solution is to just pull out the 
main P() and V()s from the jobq_server routine on entry and exit but to leave 
the 3 places in the code where the lock is temporarily released.

Unless I hear otherwise from you, I will assume that you want to create the 
patch.

>
> James
>
> ---------------------------------------------------------------------------
>--- Crystal Reports - New Free Runtime and 30 Day Trial
> Check out the new simplified licensing option that enables
> unlimited royalty-free distribution of the report engine
> for externally facing server and web deployment.
> http://p.sf.net/sfu/businessobjects
> _______________________________________________
> Bacula-devel mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/bacula-devel



------------------------------------------------------------------------------
Crystal Reports - New Free Runtime and 30 Day Trial
Check out the new simplified licensing option that enables 
unlimited royalty-free distribution of the report engine 
for externally facing server and web deployment. 
http://p.sf.net/sfu/businessobjects
_______________________________________________
Bacula-devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/bacula-devel

Reply via email to