Hello,
Please try the following patch (based on 2.2.8):
Index: jobq.c
===================================================================
--- jobq.c (revision 6531)
+++ jobq.c (working copy)
@@ -478,7 +478,7 @@
*/
if (jcr->acquired_resource_locks) {
if (jcr->rstore) {
- jcr->rstore->NumConcurrentJobs = 0;
+ jcr->rstore->NumConcurrentJobs--;
Dmsg1(200, "Dec rncj=%d\n", jcr->rstore->NumConcurrentJobs);
}
if (jcr->wstore) {
Best regards,
Kern
On Sunday 09 March 2008 01.34:06 Peter Much wrote:
> <[EMAIL PROTECTED]> aka Arno Lehmann wrote
> on Sun, 02 Mar 2008 12:50:17 +0100 in m2n.bacula.devel:
> |> 2. This is the thing that I have been worrying the most about. I
> |> have been following various theories about what might happen
> |> there, yet to no avail. The last of my theories was that it might
> |> have to do with the migrations, but currently I tend to dismiss
> |> this theory also. In fact, I am still clueless.
> |> What happens is that the Director puts all jobs (and all newly
> |> started jobs) into either "waiting on max Storage jobs" or
> |> "waiting execution", while there is no job running on any client
> |> and no job running on the SD. It just does nothing and has to
> |> be restarted.
> |
> |That definitely qualifies as a bug... have you tried looking at the
> |debug output, once the DIR is in this state?
>
> This was a good hint. The debug shows this:
> >BxDir: jcr.c:603-0 OnEntry JobStatus=s set=s
> >BxDir: jcr.c:623-0 OnExit JobStatus=s set=s
> >BxDir: jobq.c:701-0 Wstore=Files
> >BxDir: jobq.c:723-0 Fail wncj=-2
>
> And what I also have seen is rncj=-2, and rncj=3.
>
> Looking into jobq.c, I find that rncj is never supposed to take any
> value except 0 and 1 (maximum one read job per device).
> OTOH, I find that rncj is not a unique entity - it is just the
> NumConcurrentJobs of any Storage device.
>
> So, this seems not to be a migration issue, it seems to be a problem
> with multidrive autoloaders.
> According to the manual, since Bacula version 1.whatever an
> autoloader has to be defined as a single device in the DIR.
> So, if this autoloader has multiple drives, it is quite possible
> that these drives are used for reading AND writing at the same time.
>
> And this seems to break the rncj/wncj logic. My current most likely
> interpretation is this: suppose one restore is running, so rncj=1.
> Then two backups start on the same Storage, so wncj=rncj=3. Then the
> restore terminates and sets rncj=0. When the two backup jobs
> terminate, the counter goes to -2, and this is where the show ends.
>
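The scenario described above can be replayed in a few lines of C. This
is only a minimal sketch, not code from jobq.c: the struct and function
names (store, job, start_job, end_job, simulate) are hypothetical
stand-ins, and it assumes that each starting job increments the counter
of every Storage it uses and that the restore and both backups point at
the same Storage resource.

```c
/* Hypothetical stand-in for a Director Storage resource; only the
 * counter discussed in this thread is modeled. */
struct store {
    int NumConcurrentJobs;
};

/* Hypothetical stand-in for a job control record. */
struct job {
    struct store *rstore;   /* Storage the job reads from, or NULL */
    struct store *wstore;   /* Storage the job writes to, or NULL */
};

/* Assumption: a starting job bumps the counter of each Storage it uses. */
static void start_job(struct job *j)
{
    if (j->rstore) j->rstore->NumConcurrentJobs++;
    if (j->wstore) j->wstore->NumConcurrentJobs++;
}

/* reset_on_read_end != 0 mimics the old "NumConcurrentJobs = 0" line;
 * reset_on_read_end == 0 mimics the patched "NumConcurrentJobs--". */
static void end_job(struct job *j, int reset_on_read_end)
{
    if (j->rstore) {
        if (reset_on_read_end)
            j->rstore->NumConcurrentJobs = 0;
        else
            j->rstore->NumConcurrentJobs--;
    }
    if (j->wstore) j->wstore->NumConcurrentJobs--;
}

/* One restore and two backups share a single Storage resource.
 * Returns the counter value after all three jobs have finished. */
int simulate(int reset_on_read_end)
{
    struct store autochanger = { 0 };
    struct job restore = { &autochanger, 0 };
    struct job backup1 = { 0, &autochanger };
    struct job backup2 = { 0, &autochanger };

    start_job(&restore);                  /* counter: 1 */
    start_job(&backup1);                  /* counter: 2 */
    start_job(&backup2);                  /* counter: 3 */

    end_job(&restore, reset_on_read_end); /* 0 with reset, 2 with -- */
    end_job(&backup1, reset_on_read_end);
    end_job(&backup2, reset_on_read_end);

    return autochanger.NumConcurrentJobs;
}
```

Under these assumptions simulate(1) ends at -2, the value seen in the
debug output, while simulate(0) returns to 0.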
> I am now trying the following as a fix, and see if it helps.
>
> rgds,
> PMc
>
> --- src/dird/jobq.c.orig Mon Dec 10 18:54:41 2007
> +++ src/dird/jobq.c Sun Mar 9 00:27:02 2008
> @@ -478,7 +478,8 @@
> */
> if (jcr->acquired_resource_locks) {
> if (jcr->rstore) {
> - jcr->rstore->NumConcurrentJobs = 0;
> + if (jcr->rstore->NumConcurrentJobs > 0)
> + jcr->rstore->NumConcurrentJobs--;
> Dmsg1(200, "Dec rncj=%d\n", jcr->rstore->NumConcurrentJobs);
> }
> if (jcr->wstore) {
> @@ -738,7 +739,8 @@
> Dmsg1(200, "Dec wncj=%d\n", jcr->wstore->NumConcurrentJobs);
> }
> if (jcr->rstore) {
> - jcr->rstore->NumConcurrentJobs = 0;
> + if (jcr->rstore->NumConcurrentJobs > 0)
> + jcr->rstore->NumConcurrentJobs--;
> Dmsg1(200, "Dec rncj=%d\n", jcr->rstore->NumConcurrentJobs);
> }
> set_jcr_job_status(jcr, JS_WaitClientRes);
> @@ -753,7 +755,8 @@
> Dmsg1(200, "Dec wncj=%d\n", jcr->wstore->NumConcurrentJobs);
> }
> if (jcr->rstore) {
> - jcr->rstore->NumConcurrentJobs = 0;
> + if (jcr->rstore->NumConcurrentJobs > 0)
> + jcr->rstore->NumConcurrentJobs--;
> Dmsg1(200, "Dec rncj=%d\n", jcr->rstore->NumConcurrentJobs);
> }
> jcr->client->NumConcurrentJobs--;
>
>
> -------------------------------------------------------------------------
> This SF.net email is sponsored by: Microsoft
> Defy all challenges. Microsoft(R) Visual Studio 2008.
> http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/
> _______________________________________________
> Bacula-devel mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/bacula-devel