>>>>> On Mon, 18 Jan 2021 16:17:04 -0500, Phil Stracchino said:
> 
> On 1/18/21 2:08 PM, Martin Simmons wrote:
> > Looks reasonable to me.  (*)
> > 
> > If the INSERT failed for some other reason then the original mysql_query()
> > should have failed rather than mysql_affected_rows() returning 0.  I think
> > if a subsequent SELECT returns a PathId then it would be safe to assume it
> > is correct, regardless of any error in the INSERT.
> > 
> > (*) Except that there is still an outstanding mystery about why this only
> > seems to happen repeatably for you, and I think someone asked for more
> > debugging info about why you have concurrent updates to the db at all.
> 
> 
> That's a good question to which I don't know the answer.
> 
> Perhaps I misunderstand some of the workings:  Does not each job running
> in Bacula make its own updates to the Catalog?

Yes.


> If there is only one ... channel for DB updates, shall we say, i.e. only
> a single thread executes ALL connections to the DB serially with only
> one connection ever open at a time, then (a) I have a new understanding
> of why things stall for so long at job end with caching enabled, and (b)
> I'm newly mystified as to why this would be happening as well, since
> there is in that case no chance of DB contention between threads, which
> would seem to preclude race conditions between jobs, which was the only
> reason I could think of as to why this would be happening.

In theory jobs running without batch insert share a single BDB object in
memory, but it is used with a lock by a per-job thread (which is why the db
library has to be thread-safe).  See setup_job().

I say "in theory" because it looks like Bacula 9.6.4 broke this!

What is the output of "show catalog" in bconsole?  My guess is that you will
see db_driver=MySQ, i.e. missing the final "L" of MySQL.  The truncated name
will prevent it from reusing the BDB object, leading to unexpected concurrency.

The bug is caused by this change:

diff --git a/bacula/src/dird/dird.c b/bacula/src/dird/dird.c
index fdb1d97bf9..11c4406ea7 100644
--- a/bacula/src/dird/dird.c
+++ b/bacula/src/dird/dird.c
@@ -1265,7 +1265,7 @@ static bool check_catalog(cat_op mode)
            /* To copy dbdriver field into "CAT" catalog resource class (local)
             * from dbdriver in "BDB" catalog DB Interface class (global)
             */
-            strncpy(catalog->db_driver, BDB_db_driver, db_driver_len);
+            bstrncpy(catalog->db_driver, BDB_db_driver, db_driver_len);
          }
       }

which was part of 9.6.4.

To fix it, replace db_driver_len with db_driver_len+1 in this call to
bstrncpy.  Unlike strncpy, bstrncpy always leaves room for the terminating
NUL, so it copies one character fewer than the length it is given.  This has
been fixed in Bacula 11 because the seemingly minor problem with "status
catalog" was reported in https://bugs.bacula.org/view.php?id=2551 but it looks
like its significance wasn't realized.
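
For anyone who wants to see the truncation in isolation, here is a minimal
sketch.  It assumes that db_driver_len is the strlen() of the driver name and
that bstrncpy copies at most maxlen-1 characters before adding the NUL.
sketch_bstrncpy, copy_driver and show_driver_copies are made-up names for
illustration, not Bacula functions.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Stand-in for Bacula's bstrncpy() (an assumption, not the real source):
 * copy at most maxlen-1 characters and always NUL-terminate. */
char *sketch_bstrncpy(char *dest, const char *src, int maxlen)
{
    if (maxlen <= 0) {
        return dest;                /* no room at all, leave dest untouched */
    }
    strncpy(dest, src, maxlen - 1);
    dest[maxlen - 1] = 0;
    return dest;
}

/* Hypothetical helper: copy a driver name, passing strlen(driver) + extra
 * as the length.  extra = 0 mimics the 9.6.4 call, extra = 1 mimics the fix.
 * out must hold at least strlen(driver) + 1 bytes. */
const char *copy_driver(char *out, const char *driver, int extra)
{
    int db_driver_len = (int)strlen(driver);  /* assumed: excludes the NUL */
    return sketch_bstrncpy(out, driver, db_driver_len + extra);
}

/* Show both variants for "MySQL" and check the results. */
void show_driver_copies(void)
{
    char buf[16];

    copy_driver(buf, "MySQL", 0);
    printf("db_driver_len:   db_driver=%s\n", buf);
    assert(strcmp(buf, "MySQ") == 0);    /* final "L" is lost */

    copy_driver(buf, "MySQL", 1);
    printf("db_driver_len+1: db_driver=%s\n", buf);
    assert(strcmp(buf, "MySQL") == 0);   /* full name survives */
}
```

Calling show_driver_copies() from a main() should print the truncated name
for the first variant and the full name for the second.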

__Martin

