Re: [Bacula-users] [Bacula-devel] 2.1.10 beta: no appendable volume found problem with multidrive autochanger

Kern Sibbald Wed, 23 May 2007 14:02:57 -0700

On Wednesday 23 May 2007 21:13, John Drescher wrote:
> > > I am not sure if this functionality has changed from 1.38.X (as I use
> > > the drives directly now)
> > Multiple drive autochanger support has changed enormously since version
> > 1.38.
> 
> Thank you. I am very sorry for assuming that it was the same. I will
> update my configs as soon as I can.


Hopefully things will work better on a later version.  In particular once we 
shake any bugs out of 2.1.x, everything should work much better.

> 
> > > To minimize tape swapping,
> >
> > This is not currently possible due to current architecture constraints.
> I think what you described above was what I was looking for. I mean, if
> Bacula finds a tape in drive2 at the start of a new job, there is no
> good reason to move it to drive1 and run the job there instead.

I don't think it will do what you describe above.  However, it can be a bit 
tricky, because the tape can be in use when a job starts, then free up after 
it is scheduled but before it has a chance to open a drive.  So, there is a 
small window when a tape may be free and need to be moved from one drive to 
another.  

Hopefully, in the near future, Bacula will simply close the first drive and 
open the second one.  That is relatively high on my list of priorities.

> One other feature I would like to see: if there are any empty drives
> when starting a job and no volume from the correct pool is loaded,
> choose the empty drive instead of unloading any media.

I believe this is how it currently functions but am not 100% sure.
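
The preference order discussed in this exchange can be written down as a small
Python sketch. It is only an illustration of the requested policy, not Bacula's
code (Bacula itself is written in C++); the names Drive, loaded_pool, busy, and
pick_drive are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Drive:
    name: str
    loaded_pool: Optional[str] = None  # pool of the loaded volume, None if empty
    busy: bool = False                 # True while another job holds the drive


def pick_drive(drives: Sequence[Drive], wanted_pool: str) -> Optional[Drive]:
    """Return the free drive that needs the least tape movement, or None."""
    free = [d for d in drives if not d.busy]
    # 1. A free drive already holding a volume from the wanted pool: no swap.
    for d in free:
        if d.loaded_pool == wanted_pool:
            return d
    # 2. An empty free drive: one load, nothing to unload.
    for d in free:
        if d.loaded_pool is None:
            return d
    # 3. Any other free drive: its current volume must be unloaded first.
    return free[0] if free else None
```

With drive1 holding a Daily tape, drive2 a Weekly tape, and drive3 empty, a
Weekly job would go to drive2 and a Monthly job to the empty drive3, so nothing
is unloaded unless every free drive holds a tape from the wrong pool.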

> 
> >
> > > maximize the use of the drives
> >
> > The above is not well defined.
> >
> I meant not blocking a second job from a different pool when there
> is a job running on the first drive and there is a free drive. You
> explained that this should not happen.

No, it shouldn't.  In 1.38.x it was possible that Bacula would get into a 
deadlock and block a job (and possibly a drive) forever.  This happened to 
Arno, because there were two mutexes, and in one place they were acquired in 
a different order than elsewhere.  That was fixed a long time ago, and in 
2.1.x there is only one mutex in general.
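
For readers unfamiliar with this kind of bug, here is a minimal Python sketch
of the lock-ordering problem described above. It is not Bacula code, and the
lock names are hypothetical. If one thread takes volume_lock then device_lock
while another takes device_lock then volume_lock, each can end up holding one
lock and waiting forever for the other. The sketch shows the usual remedy of
acquiring both locks in one agreed order everywhere; collapsing to a single
mutex, as in 2.1.x, removes the ordering question altogether.

```python
import threading

volume_lock = threading.Lock()   # hypothetical: guards the volume list
device_lock = threading.Lock()   # hypothetical: guards drive state

# Every code path takes the locks in this one agreed order.
LOCK_ORDER = (volume_lock, device_lock)


def with_both_locks(action):
    """Run action() while holding both locks, always acquired in LOCK_ORDER."""
    for lock in LOCK_ORDER:
        lock.acquire()
    try:
        return action()
    finally:
        # Release in reverse order of acquisition.
        for lock in reversed(LOCK_ORDER):
            lock.release()


def run_jobs(n_threads=4, iterations=1000):
    """Simulate concurrent jobs that each need both locks; return the total."""
    counter = {"reserved": 0}

    def reserve():
        counter["reserved"] += 1  # safe: both locks are held here

    def worker():
        for _ in range(iterations):
            with_both_locks(reserve)

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter["reserved"]
```

Because no thread ever holds device_lock while waiting for volume_lock, the
circular wait that blocks a job forever cannot form.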

> 
> > > and to prefer to use tapes with the Append status over grabbing a new
> > > tape.
> >
> This was again a result of me seeing more than one tape from a pool in
> my "list media" output with the Append status.

I much prefer such a condition, particularly over one where there are no 
volumes with Append status :-)

> 
> > I think what you and others have been seeing is problems associated with
> > the evolution of Bacula.  There have been a whole series of "problems" to
> > be resolved that are particularly complicated when you are dealing with
> > multi-threaded programs, pruning/purging/recycling algorithms, and
> > multiple drive autochangers.  It is extremely complicated and difficult
> > to debug, and rather than sitting down and developing it in one shot
> > requiring two years, I have implemented it in little steps; each time,
> > Bacula has gotten smarter.
> >
> As a software developer, I understand how complex this
> type of problem is. And when you factor in the different hardware
> and different operating systems, the problem becomes much harder,
> especially when the developer does not have these systems to test on.

Yes, not only do I not have many different systems (though I do have access to 
FreeBSD and Solaris systems), but I no longer have the time :-(

> 
> I am very sorry for complaining about 1.38 problems I experienced,
> assuming that they are still 2.x problems. 

There was no harm done, and I didn't take offense because I realize that not 
everyone knows all the technical details.  In any case, I wanted to clarify a 
few points where a number of people on the lists seem to have some 
misconceptions -- in particular, that if it does something that seems wrong, 
you need to be able to reproduce it and clearly explain how I can reproduce 
it, or there is little chance of fixing it.  In addition, one needs to supply 
sufficient information. In general job output is a good start, but that alone 
is inadequate for diagnosing these types of problems (described by you very 
well below).  At a minimum, I need listings of the Volume states before and 
after the problem, output from "status storage=xxx", and quite possibly SD 
debug output (typically levels 100-400 are good).

> My intent on entering this
> thread was actually to tone down the idea that the multidrive
> autochanger code was buggy. In my opinion, based on the experience
> that I had with Bacula, the reported ill behavior was more a result
> of complexity and the interaction of different configuration settings
> and several different scheduling rules that were probably not designed
> specifically for multidrive autochangers.
> 

Well, unfortunately there are probably a few bugs still left, but I think your 
analysis of the situation is very accurate. Thanks.

Regards,

Kern
