Hello,

Bacula already attempts to acquire the needed devices in the SD and
backs them out if not all of the needed resources can be obtained.
This works quite nicely. Consequently, while the job is waiting, the
resources are released in the SD.
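
As an illustration only (this is not Bacula's actual code), the
"acquire everything or back out" behaviour described above can be
sketched as follows. The Device class and the reserve_all function are
hypothetical names invented for this example.

```python
import threading


class Device:
    """Hypothetical stand-in for a storage device such as a tape drive."""

    def __init__(self, name):
        self.name = name
        self._lock = threading.Lock()

    def try_reserve(self):
        # Non-blocking attempt: True if reserved, False if already in use.
        return self._lock.acquire(blocking=False)

    def release(self):
        self._lock.release()


def reserve_all(devices):
    """Reserve every device in the list, or none of them."""
    held = []
    for dev in devices:
        if dev.try_reserve():
            held.append(dev)
        else:
            # One device is busy: back out everything reserved so far,
            # so nothing stays tied up while the job waits.
            for h in reversed(held):
                h.release()
            return False
    return True
```

The point of the sketch is that a job which cannot get all of its
devices holds none of them while it waits.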

The problem occurs because the SD realizes that the resources are not
available, so it waits a short period of time and tries again to
acquire them, which is what one wants for virtually all jobs.
When it still cannot acquire the resources, the SD fails the job. The
underlying cause is that the user is overcommitting the SD resources.
The solution is to get more drives or to modify how you run jobs.
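
A minimal sketch of that "wait a short time, try again, then fail"
behaviour, again purely illustrative and not taken from the Bacula
source. The function name, the attempt count, and the wait time are
assumptions made for the example.

```python
import time


def reserve_with_retry(try_reserve, attempts=3, wait_seconds=0.01):
    """Call try_reserve() up to `attempts` times, pausing between tries.

    Returns True as soon as a try succeeds, and False if every try
    fails (the point at which the job would be failed).
    """
    for attempt in range(attempts):
        if try_reserve():
            return True
        if attempt < attempts - 1:
            # Resources were busy: wait briefly before the next try.
            time.sleep(wait_seconds)
    return False
```

With overcommitted resources, every try fails and the function returns
False, which corresponds to the failed job and the resulting message.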

From what I understand, in this case the user has a large number
of jobs that regularly fail, and thus the user explicitly overcommits
the resources. The consequence is that Bacula works as it should, but the
user gets lots of messages about the SD not being able to get resources.

Bacula was designed in a way where it expects to have the needed
resources available (i.e. the configuration should be optimized for the
available resources). It also handles the case where you overload the
SD (too many jobs for the available resources), but in that case it will
warn you, which is exactly what 99% of all users want.

One possible solution would be to add a new directive that suppresses
the reservation failure message. However, there is very likely a better
solution with the existing Bacula; I just do not know what it is at this
time. This is the first time in 19 years that this problem has come up,
so before changing anything in the code, the problem has to be very
clearly understood, which is not the case (at least for me).

Another solution is for the user to modify the source code and remove
the warning message.

Best regards,
Kern

On 9/25/19 10:50 AM, Andrea Venturoli wrote:
> On 2019-09-25 10:19, Radosław Korzeniewski wrote:
>> Hello,
>>
>> Sat, 21 Sep 2019 at 00:52 David Brodbeck <brodb...@math.ucsb.edu
>> <mailto:brodb...@math.ucsb.edu>> wrote:
>>
>>     I think this is a somewhat unfortunate design decision, to be
>>     honest. (...)
>>
>>
>> So what should be the best design in this case which should solve the
>> problem?
>
> I'm not so into the code to tell for sure.
> Maybe rescheduling should release the SD once the job first fails and
> reserve again when it starts the next time?
>
>  bye & Thanks
>     av.
>



_______________________________________________
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users
