Re: [Bacula-users] areas for improvement?

2020-06-13 Thread Kern Sibbald

  
  
Hello David,
Thanks for your confirmation of the problems.  I have a few
  suggestions for you:
1. Talk to Bacula Systems.  They give Universities very nice
  discounts, and they do have client initiated backup.  Bacula
  Systems has by default a subscription model, but certain customers
  prefer a one time purchase, which is possible.

2. If that is not an option, please let me know, and I will check
  with Bacula Systems for some pointers.  I have not personally used
  this feature but helped design it.  I believe it is already in
  Bacula community, but if it is not, then it surely will be
  available in the next version (probably around the end of the
  year).  

Best regards,
Kern

On 6/12/20 7:39 PM, David Brodbeck wrote:

> On Wed, Jun 10, 2020 at 8:41 AM Josh Fisher wrote:
>
>> I still feel that Bacula's design is correct. Yes, 802.3az changes the
>> always-on nature of a connection, allowing either side to temporarily
>> power down its transmitter to save energy, but the standard itself
>> doesn't change the original goal of a persistent connection. It is the
>> switch firmware and/or NIC device drivers that claim to support it, but
>> do not. It makes sense for Bacula to be as robust as possible, but this
>> is not a Bacula design flaw. It is a work-around for buggy hardware.
>
> I've also run into this when trying to back up over a VPN. The backup time
> can easily exceed the VPN's maximum session time.
>
> It's fair to argue that both NAT routers and VPNs are a corruption of
> TCP/IP's design intent, but it doesn't seem likely we'll be rid of them any
> time soon. Bacula doesn't work very well with either. Besides the
> connection drop issues, I haven't yet gotten client-initiated backups to
> work from behind a NAT, and I haven't found anyone who's confirmed they
> have it working, either.
>
> None of this, of course, is an issue when backing up always-on servers with
> static IPs -- which is Bacula's focus. The problems come in when it's used
> to back up endpoints. Unfortunately I haven't yet found anything else to
> use for that that lets me control my own data and isn't a subscription
> model. (In academic departments, it's much easier to find money for
> one-time expenses than it is to find a consistent source of it.)
>
> --
> David Brodbeck
> System Administrator, Department of Mathematics
> University of California, Santa Barbara

  ___
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users


  




Re: [Bacula-users] areas for improvement?

2020-06-12 Thread Dmitri Maziuk

On 6/12/2020 12:39 PM, David Brodbeck wrote:


None of this, of course, is an issue when backing up always-on servers with
static IPs -- which is Bacula's focus.


Not really: what's happening is if an intermediate node goes down, IP 
will find a different route -- *at the network layer*.


We normally run backups over non-routed links. "Always-on servers with 
static IPs" would normally be on the same (V)LAN. A VPN is a direct 
point-to-point link at Data Link Layer, even if it's tunneled over TCP/IP.


When a switch or NAT router or a virtual interface goes down at link 
layer, there's nothing IP can do about it.


The problem is time: as data size goes up, it takes longer to complete 
the backup, and with that the chances of a link-level disruption go up too.
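To put rough numbers on that, here is a minimal sketch. It assumes link drops arrive independently at a constant average rate (a Poisson process), which is a simplification, and the rate used is a made-up figure rather than a measurement.

```python
import math


def disruption_probability(hours: float, drops_per_hour: float) -> float:
    """Chance of at least one link drop during a backup of the given length.

    Assumes drops arrive independently at a constant average rate (a Poisson
    process); real links are burstier, so treat the result as a rough guide.
    """
    return 1.0 - math.exp(-drops_per_hour * hours)


# Hypothetical link that drops on average once every 100 hours.
for hours in (1, 10, 50):
    print(f"{hours:>3} h backup: {disruption_probability(hours, 0.01):.1%}")
```

With that assumed rate, a 1-hour job is hit about 1% of the time, a 10-hour job roughly 10%, and a 50-hour job close to 40%.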


Dima




Re: [Bacula-users] areas for improvement?

2020-06-12 Thread Heitor Faria
Hello David, 

> It's fair to argue that both NAT routers and VPNs are a corruption of TCP/IP's
> design intent, but it doesn't seem likely we'll be rid of them any time soon.
> Bacula doesn't work very well with either. Besides the connection drop issues,
> I haven't yet gotten client-initiated backups to work from behind a NAT, and I
> haven't found anyone who's confirmed they have it working, either.

It is painful and requires tray-monitor and console configurations on the
client side, but I was able to make it work.

However, as for firewall traversal, the Bacula Enterprise Edition now has a
proper, better solution that could eventually be ported to the Community
version, but I have no information on that.

> --
> David Brodbeck
> System Administrator, Department of Mathematics
> University of California, Santa Barbara

Regards, 
-- 

MSc Heitor Faria 
Bacula LATAM CEO 

mobile1: + 1 909 655-8971 
mobile2: + 55 61 98268-4220 
[ https://www.linkedin.com/in/msc-heitor-faria-5ba51b3 ] 
[ http://www.bacula.com.br/ ] 

América Latina 
[ http://bacula.lat/ | bacula.lat ] | [ http://www.bacula.com.br/ | 
bacula.com.br ] 


Re: [Bacula-users] areas for improvement?

2020-06-12 Thread David Brodbeck
On Wed, Jun 10, 2020 at 8:41 AM Josh Fisher  wrote:

> I still feel that Bacula's design is correct. Yes, 802.3az changes the
> always-on nature of a connection, allowing either side to temporarily
> power down its transmitter to save energy, but the standard itself
> doesn't change the original goal of a persistent connection. It is the
> switch firmware and/or NIC device drivers that claim to support it, but
> do not. It makes sense for Bacula to be as robust as possible, but this
> is not a Bacula design flaw. It is a work-around for buggy hardware.
>

I've also run into this when trying to back up over a VPN. The backup time
can easily exceed the VPN's maximum session time.

It's fair to argue that both NAT routers and VPNs are a corruption of
TCP/IP's design intent, but it doesn't seem likely we'll be rid of them any
time soon. Bacula doesn't work very well with either. Besides the
connection drop issues, I haven't yet gotten client-initiated backups to
work from behind a NAT, and I haven't found anyone who's confirmed they
have it working, either.

None of this, of course, is an issue when backing up always-on servers with
static IPs -- which is Bacula's focus. The problems come in when it's used
to back up endpoints. Unfortunately I haven't yet found anything else to
use for that that lets me control my own data and isn't a subscription
model. (In academic departments, it's much easier to find money for
one-time expenses than it is to find a consistent source of it.)

-- 
David Brodbeck
System Administrator, Department of Mathematics
University of California, Santa Barbara


Re: [Bacula-users] areas for improvement?

2020-06-11 Thread Kern Sibbald

Hello Josh,

Yes, you are correct, it was simply a design decision and reasonable at 
the time.  However, with the knowledge that I have today of how networks 
evolved that I did not have then, were I to do it over, I would add a 
Bacula native persistent connection (using reconnection when the line 
drops).  In fact, the original design contains a sequential block number 
that is not used, but was to be the basis for subsequent code that would 
reconnect.
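As an illustration only, the idea of resuming from a sequential block number could look roughly like the sketch below. The classes and names (Receiver, FlakyLink, send_all) are hypothetical and the "link" is an in-memory stand-in for a socket; this is not Bacula's wire protocol or code.

```python
class Receiver:
    """Stores blocks and remembers the next block number it expects."""

    def __init__(self):
        self.blocks = []

    @property
    def next_expected(self):
        return len(self.blocks)

    def accept(self, number, payload):
        if number != self.next_expected:
            raise ValueError(f"out of sequence: got {number}, want {self.next_expected}")
        self.blocks.append(payload)


class FlakyLink:
    """Hypothetical stand-in for a socket: fails on chosen send attempts."""

    def __init__(self, receiver, fail_on_sends):
        self.receiver = receiver
        self.fail_on_sends = set(fail_on_sends)
        self.sends = 0

    def send(self, number, payload):
        self.sends += 1
        if self.sends in self.fail_on_sends:
            raise ConnectionError("line dropped")
        self.receiver.accept(number, payload)


def send_all(blocks, link, max_reconnects=5):
    """Send every block, resuming after a drop from the receiver's position."""
    reconnects = 0
    number = link.receiver.next_expected
    while number < len(blocks):
        try:
            link.send(number, blocks[number])
            number += 1
        except ConnectionError:
            reconnects += 1
            if reconnects > max_reconnects:
                raise
            # "Reconnect": ask the receiver where to resume instead of starting over.
            number = link.receiver.next_expected
    return reconnects


receiver = Receiver()
link = FlakyLink(receiver, fail_on_sends={2, 4})
data = [b"block-%d" % i for i in range(5)]
drops = send_all(data, link)
print(drops, receiver.blocks == data)
```

The point of the block number is that neither side has to guess: after a drop, the receiver's count says exactly which block to send next, so nothing is duplicated or skipped.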


I appreciate your comments. Thanks.

Kern

On 6/10/20 5:41 PM, Josh Fisher wrote:


On 6/10/2020 8:04 AM, Kern Sibbald wrote:

Hello,

...

Now on the fact that line drops cancel jobs: First Bacula was 
designed with the concept that it would have a stable communications 
line as is supposed to be provided by TCP/IP, which Bacula uses.  
This was a correct design based on networks at the time, but in 
retrospect, I should have included comm line restarts in the original 
design.  In my opinion, the real problem is that modern switches for 
all sorts of good reasons do not really support the original design 
goals of TCP/IP.



I still feel that Bacula's design is correct. Yes, 802.3az changes the 
always-on nature of a connection, allowing either side to temporarily 
power down its transmitter to save energy, but the standard itself 
doesn't change the original goal of a persistent connection. It is the 
switch firmware and/or NIC device drivers that claim to support it, 
but do not. It makes sense for Bacula to be as robust as possible, but 
this is not a Bacula design flaw. It is a work-around for buggy hardware.






Re: [Bacula-users] areas for improvement?

2020-06-10 Thread Josh Fisher


On 6/10/2020 8:04 AM, Kern Sibbald wrote:

Hello,

...

Now on the fact that line drops cancel jobs: First Bacula was designed 
with the concept that it would have a stable communications line as is 
supposed to be provided by TCP/IP, which Bacula uses.  This was a 
correct design based on networks at the time, but in retrospect, I 
should have included comm line restarts in the original design.  In my 
opinion, the real problem is that modern switches for all sorts of 
good reasons do not really support the original design goals of TCP/IP.



I still feel that Bacula's design is correct. Yes, 802.3az changes the 
always-on nature of a connection, allowing either side to temporarily 
power down its transmitter to save energy, but the standard itself 
doesn't change the original goal of a persistent connection. It is the 
switch firmware and/or NIC device drivers that claim to support it, but 
do not. It makes sense for Bacula to be as robust as possible, but this 
is not a Bacula design flaw. It is a work-around for buggy hardware.






Re: [Bacula-users] areas for improvement?

2020-06-10 Thread Kern Sibbald

Hello,

For some reason I never received the original email (very odd). I think 
that Gary has done a very good job at responding.  I'll give you my take 
on this, but please excuse me if I duplicate what has already been said.


First on the SQL database, which is as has been pointed out not 
stateless: I have never seen database communications drops reported as a 
problem.  For me it has never been a problem because I run the Director 
and the database backend (Postgres for me) on the same machine, so it is 
as far as I understand not using the communications lines.


Now on the fact that line drops cancel jobs: First Bacula was designed 
with the concept that it would have a stable communications line as is 
supposed to be provided by TCP/IP, which Bacula uses.  This was a 
correct design based on networks at the time, but in retrospect, I 
should have included comm line restarts in the original design.  In my 
opinion, the real problem is that modern switches for all sorts of good 
reasons do not really support the original design goals of TCP/IP.


That said, I have been aware of the problem that Alan brings up, and 
Bacula does have the ability to restart jobs at the point where the job 
failed under certain conditions such as a comm line drop.  This feature 
seems to be rarely used, but is quite effective in the case where one 
has lots of comm line failures.


For some time, I have had in mind a project to make Bacula restart a 
comm line connection after a drop, however, as Gary points out, this is 
far from being a trivial project.  Bacula Systems currently has a 
project well along the way to implement this feature, and from what I 
have heard, it is now in the testing phase and will probably be in the 
next Bacula Enterprise release. When it will appear in the community 
version is not clear.


Concerning priorities of projects: to the best of my knowledge no one 
has submitted a bug report or a request for this feature other than Alan 
who submitted a request for this feature some time ago in the Enterprise 
version. For Bacula Systems, a lot of time and consideration is devoted 
twice a year to examine new feature requests and decide which to 
implement.  Every six months key managers and an outside consultant are 
requested to submit the most important feature requests.  They are then 
sorted by a number of conditions such as: difficulty of the project, 
number of users impacted, overall need for Bacula, ...  All that then 
works down to a Roadmap for the next release (in roughly 6 months) and 
the following release (in roughly 1 year).  The six month roadmap will 
then be approved by the company managers and reviewed at the bi-annual 
company meeting.  Generally the six month roadmaps do not change much 
(sometimes a feature is dropped or added).  The 1 year roadmap can 
change as you might imagine.


Bottom line: this is a very complex "feature" request, but it is now 
well along in development, and so will be available at some time in the 
not so distant future.


Gary: thanks for your insights :-)

Alan: I am not responding to all your comments, but will say that I 
believe that you have misunderstood certain things about Bacula Systems, 
how they decided what is important, etc.  One of the nice things about 
open source, is that if you are unhappy with what it does, you have all 
the source code, and you can either implement what you want, or hire 
someone to do it.  Having an Enterprise agreement does not necessarily 
mean that any feature request will be immediately implemented -- have 
you ever tried to get Microsoft to fix a bug or implement a new feature?


Best regards,

Kern

On 5/27/20 4:13 PM, Gary R. Schmidt wrote:

On 27/05/2020 23:17, Alan Brown wrote:


I've been running Bacula for ~15 years (community/enterprise) and have
identified a few areas which are in desperate need of improvement:

For an "enterprise" grade backup system, it's amazingly fragile in a few
areas (particularly in actual Enterprise networks!)


Bacula DOES NOT LIKE and does not handle network interruptions _at all_
if backups are in progress. This _will_ cause backups to abort - and
these aborted backups are _not_ resumable

Similarly, if there's any kind of disruption between the director and
database, the only fix is to restart the director


What that means is that Bacula _cannot_ be used with a High Availability
database because network interruptions (when switching servers) are part
of the HA paradigm.

It also means that operators have to be _extremely_ careful about
allowing automated or other system upgrades


In days of multi-TB backup sets, this is turning into a showstopping
problem.

As we are an Enterprise customer this has been raised with Bacula Systems
but been given _very_ low priority.   I'd like to hear opinions from the
wider community on this


Opinion: I know bugs aren't sexy to work on but these need fixing, not
being brushed off. This is the difference between LAN-quality and actual
Enterprise grade software.

Re: [Bacula-users] areas for improvement?

2020-05-29 Thread Radosław Korzeniewski
Hello,

On Fri, 29 May 2020 at 15:56, Josh Fisher wrote:

> On 5/29/2020 5:23 AM, Radosław Korzeniewski wrote:
>
> Hello Alan,
>
> On Wed, 27 May 2020 at 17:02, Alan Brown wrote:
>
>> Database connections are _supposed_ to be stateless.
>>
>
> I'm very surprised about the above statement as I cannot imagine such
> functionality already available in any SQL database I'm familiar with.
> If you drop a connection to the database in the middle of the transaction
> then your transaction will rollback. So, no it is not a
> stateless connection.
> Or I misunderstood your statement.
>
>
> It is stateful by definition, since information is stored. (The client is
> authenticated, etc.) Also, there are distinct states after connecting;
> waiting to issue, issue query, wait for answer, then back to waiting to
> issue. A query is atomic, from the client's perspective, so must fail if
> the connection is dropped prior to the answer.
>
Right. So I do not understand how there could be an expectation, like the
one above, of a stateless database connection.

> However, while in the 'waiting to issue' state a re-connect is certainly
> possible, so a dropped connection while in the wait state does not have to
> be fatal. Checking connection state before each query would result in
> painfully slow performance, but it should be done once just before the
> catalog updates for a job are made. If a job is acquiring a db connection
> at job start, then there may be quite some time between the start of the
> job and actually updating the catalog. It would be more robust for the time
> between the db connectivity check and the actual issuing of queries to be
> made as short as possible, because it lowers the probability of a dropped
> connection interrupting the job.
>
I never denied it and never defended the current state in this area.

best regards
-- 
Radosław Korzeniewski
rados...@korzeniewski.net


Re: [Bacula-users] areas for improvement?

2020-05-29 Thread Josh Fisher


On 5/29/2020 5:23 AM, Radosław Korzeniewski wrote:

Hello Alan,

On Wed, 27 May 2020 at 17:02, Alan Brown wrote:


Database connections are _supposed_ to be stateless.


I'm very surprised about the above statement as I cannot imagine such 
functionality already available in any SQL database I'm familiar with.
If you drop a connection to the database in the middle of the 
transaction then your transaction will rollback. So, no it is not a 
stateless connection.

Or I misunderstood your statement.



It is stateful by definition, since information is stored. (The client 
is authenticated, etc.) Also, there are distinct states after 
connecting; waiting to issue, issue query, wait for answer, then back to 
waiting to issue. A query is atomic, from the client's perspective, so 
must fail if the connection is dropped prior to the answer. However, 
while in the 'waiting to issue' state a re-connect is certainly 
possible, so a dropped connection while in the wait state does not have 
to be fatal. Checking connection state before each query would result in 
painfully slow performance, but it should be done once just before the 
catalog updates for a job are made. If a job is acquiring a db 
connection at job start, then there may be quite some time between the 
start of the job and actually updating the catalog. It would be more 
robust for the time between the db connectivity check and the actual 
issuing of queries to be made as short as possible, because it lowers 
the probability of a dropped connection interrupting the job.
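A minimal sketch of that "check once, then update right away" pattern follows. It uses Python's built-in sqlite3 purely as a stand-in for the catalog database; the table name, columns and helper names are hypothetical and have nothing to do with Bacula's actual catalog schema or code.

```python
import os
import sqlite3
import tempfile


def ensure_connected(conn, connect):
    """Return a usable connection, reconnecting once if the cheap probe fails."""
    try:
        conn.execute("SELECT 1")
        return conn
    except sqlite3.Error:
        return connect()


def write_catalog(conn, connect, rows):
    """Probe the connection once, then issue all catalog updates right away."""
    conn = ensure_connected(conn, connect)
    with conn:  # one transaction: commit on success, roll back on error
        conn.executemany("INSERT INTO file (job_id, path) VALUES (?, ?)", rows)
    return conn


path = os.path.join(tempfile.mkdtemp(), "catalog.db")
connect = lambda: sqlite3.connect(path)

conn = connect()
conn.execute("CREATE TABLE file (job_id INTEGER, path TEXT)")
conn.commit()
conn.close()  # simulate a connection that dropped while the job was running

conn = write_catalog(conn, connect, [(1, "/etc/hosts"), (1, "/etc/passwd")])
count = conn.execute("SELECT COUNT(*) FROM file").fetchone()[0]
print(count)
```

The probe costs one round trip per job rather than one per query, and because the inserts follow it immediately, the window in which a drop can still hurt is short. A drop in the middle of the transaction still fails the batch, as it must.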




Re: [Bacula-users] areas for improvement?

2020-05-29 Thread Radosław Korzeniewski
Hello Alan,

On Wed, 27 May 2020 at 17:02, Alan Brown wrote:

> Database connections are _supposed_ to be stateless.
>

I'm very surprised about the above statement as I cannot imagine such
functionality already available in any SQL database I'm familiar with.
If you drop a connection to the database in the middle of the transaction
then your transaction will rollback. So, no it is not a
stateless connection.
Or I misunderstood your statement.

best regards
-- 
Radosław Korzeniewski
rados...@korzeniewski.net


Re: [Bacula-users] areas for improvement?

2020-05-27 Thread Thomas Lohman

Bacula DOES NOT LIKE and does not handle network interruptions _at all_
if backups are in progress. This _will_ cause backups to abort - and
these aborted backups are _not_ resumable


Hi,

My feeble two cents is that this has been a bit of an Achilles heel for 
us even though we are a LAN backup environment (e.g. backups don't leave 
our local network).  We are still running an older "somewhat/slightly" 
customized/modified version of community bacula so I have not explored 
the restarting of stopped jobs option that has come with newer versions. 
Given that, I can recall when we initially deployed our "backups to 
disk" setup, I would see backups of large file systems/data (e.g. 1TB) 
write 3/4ths of their data to volumes and then error out due to some 
random network interruption.  I didn't like the idea that this meant 
e.g. 750GBs worth of our volume space was taken up by an 
errored/incomplete job that would never be used.  Because of this, I had 
to implement spooling which typically people would only do if their 
backups were then being written to sequential media (tape).  So, we now 
spool all jobs to dedicated spool disks and then bacula writes that data 
to the disk data volumes.  It fixed the "cruft" issue and made large 
backups more stable (along with other options).  But I can imagine a 
scenario where we would not have had to do this if Bacula could more 
easily recover from network glitches and automatically restart jobs 
where it last left off (thinking along the lines of the concept of 
checkpointing in a RDBMS).
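To illustrate why spooling avoids the "cruft", here is a small sketch of the idea: data lands in a spool file first and is appended to the volume only if the whole job arrives. The function and file names are hypothetical, and this is only the concept, not how Bacula's Storage Daemon implements spooling.

```python
import os
import shutil
import tempfile


def backup_via_spool(chunks, volume_path, spool_dir):
    """Spool a job's data first; append it to the volume only if it completes."""
    fd, spool_path = tempfile.mkstemp(dir=spool_dir)
    try:
        with os.fdopen(fd, "wb") as spool:
            for chunk in chunks:  # may raise ConnectionError mid-job
                spool.write(chunk)
        with open(spool_path, "rb") as spool, open(volume_path, "ab") as volume:
            shutil.copyfileobj(spool, volume)
        return True
    except ConnectionError:
        return False  # nothing from the failed job reached the volume
    finally:
        os.remove(spool_path)


def flaky_stream():
    """Hypothetical client stream that dies after the first chunk."""
    yield b"a" * 10
    raise ConnectionError("network glitch")


workdir = tempfile.mkdtemp()
volume = os.path.join(workdir, "volume.dat")
ok_failed = backup_via_spool(flaky_stream(), volume, workdir)
ok_good = backup_via_spool([b"abc", b"def"], volume, workdir)
with open(volume, "rb") as f:
    contents = f.read()
print(ok_failed, ok_good, contents)
```

The failed job costs spool space only for as long as it runs; the volume ends up holding just the data from the job that finished.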


As someone else said, this would require non-trivial changes to Bacula 
(i.e. I won't be making those changes to our version - :) ) and the 
devil would be in the details in practice.  Still, if it was put to a 
vote, I'd probably vote for this as "a nice feature to have."


cheers,


--tom





Re: [Bacula-users] areas for improvement?

2020-05-27 Thread Heitor Faria
Hello Alan,

>>> Bacula DOES NOT LIKE and does not handle network interruptions _at all_
>>> if backups are in progress. This _will_ cause backups to abort - and
>>> these aborted backups are _not_ resumable

I wonder if anyone has ever tried changing the tcp_retries values on Linux,
and how they would affect the Bacula clients' connections.
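For anyone who wants to look at the current values before experimenting, a small sketch that reads them from /proc. It assumes a Linux host; tcp_retries1 and tcp_retries2 are the standard kernel sysctls, and whether changing them actually helps Bacula jobs survive an outage is exactly the open question above.

```python
from pathlib import Path

SYSCTL_DIR = Path("/proc/sys/net/ipv4")


def read_tcp_retries():
    """Return the current tcp_retries1/tcp_retries2 values, or None if unreadable."""
    values = {}
    for name in ("tcp_retries1", "tcp_retries2"):
        try:
            values[name] = int((SYSCTL_DIR / name).read_text().strip())
        except (OSError, ValueError):
            values[name] = None  # not Linux, or /proc is not available
    return values


print(read_tcp_retries())
```

tcp_retries2 is the one that governs how many times the kernel retransmits on an established connection before giving up on it. It can be changed as root with something like "sysctl -w net.ipv4.tcp_retries2=<n>", but I would test that on a non-production client first.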

Regards,
-- 
MSc Heitor Faria 
CEO Bacula LATAM 
mobile1: + 1 909 655-8971 
mobile2: + 55 61 98268-4220 
[ https://www.linkedin.com/in/msc-heitor-faria-5ba51b3 ] 
[ http://www.bacula.com.br/ ] 

América Latina 
[ http://bacula.lat/ | bacula.lat ] | [ http://www.bacula.com.br/ | 
bacula.com.br ]




Re: [Bacula-users] areas for improvement?

2020-05-27 Thread Alan Brown
On 27/05/2020 15:13, Gary R. Schmidt wrote:
> On 27/05/2020 23:17, Alan Brown wrote:
>>
>>
>>
>> Bacula DOES NOT LIKE and does not handle network interruptions _at all_
>> if backups are in progress. This _will_ cause backups to abort - and
>> these aborted backups are _not_ resumable
>>
>> Similarly, if there's any kind of disruption between the director and
>> database, the only fix is to restart the director
>>

>> Opinion: I know bugs aren't sexy to work on but these need fixing, not
>> being brushed off. This is the difference between LAN-quality and actual
>> Enterprise grade software.
>>
> I do not consider these to be bugs - they aren't simple errors where
> someone made a mistake or used the wrong sized variable - they require
> a large amount of re-design and reimplementation of Bacula's
> communication modules, and the scheduler, and no doubt other bits to
> go away.


Nonetheless they need to be done. There are a lot of assumptions made
about networks that simply do not hold true or only work at SOHO/SMB scale.


>
> Bacula started life twenty years ago, and the environment has changed
> since then, and, while Bacula has kept up with some things, disk as
> a target rather than tape, frex, something like re-startable jobs is,
> as I have said, not just an extension or addition to what is there,
> but a big change to a large part of Bacula.


Restarting is there for stopped jobs already. The question is how much
work is needed to extend that to aborted or errored jobs


> And, from the commercial stand-point, that the changes could be made
> without interrupting the existing income stream.

There's a "cost of not implementing". I'm facing pressure to replace
Bacula and this is pointed to as one of the reasons - bear in mind we're
a paying customer who would go away if this isn't sorted


>
> Then there's the projected time-line before it could be released?

You can't project that if it's not even on your TODO list and right now
it keeps being swept into the "WON'T DO" basket.


> I don't want to think about that, Bacula is fragile as it is, ripping
> it apart and stitching it back together would be a massive task!


This is exactly why I _do_ want to think about it. This is _where_ it's
fragile and what most fundamentally needs fixing.

Enterprise software needs to be robust. Bacula is not - in extremely
critical areas


"If carpenters built buildings the way programmers write programs, the
first woodpecker that came along would destroy civilization."


> And Bacula does not have that capability, not in the OSS space nor in
> the Enterprise space.
>
> All the above said, I think that re-startable jobs would be a great
> enhancement for Bacula, but how often and for how long does it try by
> default before giving up?  :->
>

restartable, or reconnecting? (and why not just set defaults - then let
the users decide on #attempts/timeouts?)
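Something along these lines is what I mean by "set defaults and let the users decide". It is a sketch only: the ReconnectPolicy fields and their default values are hypothetical illustrations, not existing Bacula directives.

```python
import time
from dataclasses import dataclass


@dataclass
class ReconnectPolicy:
    """Hypothetical knobs; the defaults here are illustrative, not Bacula's."""
    max_attempts: int = 5
    initial_delay: float = 1.0  # seconds to wait before the first retry
    backoff: float = 2.0        # multiplier applied after each failed attempt
    max_delay: float = 60.0     # upper bound on any single wait


def reconnect(connect, policy=None, sleep=time.sleep):
    """Call connect() until it succeeds or the policy's attempts run out."""
    policy = policy or ReconnectPolicy()
    delay = policy.initial_delay
    for attempt in range(1, policy.max_attempts + 1):
        try:
            return connect()
        except ConnectionError:
            if attempt == policy.max_attempts:
                raise  # out of attempts: let the job fail as it does today
            sleep(min(delay, policy.max_delay))
            delay *= policy.backoff


attempts = []


def flaky_connect():
    """Hypothetical connection that only comes back on the third try."""
    attempts.append(1)
    if len(attempts) < 3:
        raise ConnectionError("still down")
    return "connected"


waits = []
result = reconnect(flaky_connect, ReconnectPolicy(initial_delay=0.5), sleep=waits.append)
print(result, waits)
```

A site with an HA database failover that takes 30 seconds and a site with a flaky VPN would pick very different numbers, which is why these belong in configuration rather than being hard-coded.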


The single most fragile part of Bacula:  If the database connection
glitches for _any_ reason the only solution is to restart the entire
program - and you lose _everything_ that was underway at the time.

As I said, that includes using a high availability database (postgresql,
etc). As soon as heads are switched there's a necessary glitch in the
connection.


Database connections are _supposed_ to be stateless. Bacula breaks that
and as such it's a fundamental bug, whether by design or not.








Re: [Bacula-users] areas for improvement?

2020-05-27 Thread Gary R. Schmidt

On 27/05/2020 23:17, Alan Brown wrote:


I've been running Bacula for ~15 years (community/enterprise) and have
identified a few areas which are in desperate need of improvement:

For an "enterprise" grade backup system, it's amazingly fragile in a few
areas (particularly in actual Enterprise networks!)


Bacula DOES NOT LIKE and does not handle network interruptions _at all_
if backups are in progress. This _will_ cause backups to abort - and
these aborted backups are _not_ resumable

Similarly, if there's any kind of disruption between the director and
database, the only fix is to restart the director


What that means is that Bacula _cannot_ be used with a High Availability
database because network interruptions (when switching servers) are part
of the HA paradigm.

It also means that operators have to be _extremely_ careful about
allowing automated or other system upgrades


In days of multi-TB backup sets, this is turning into a showstopping
problem.

As we are an Enterprise customer this has been raised with Bacula Systems
but been given _very_ low priority.   I'd like to hear opinions from the
wider community on this


Opinion: I know bugs aren't sexy to work on but these need fixing, not
being brushed off. This is the difference between LAN-quality and actual
Enterprise grade software.

I do not consider these to be bugs - they aren't simple errors where 
someone made a mistake or used the wrong sized variable - they require a 
large amount of re-design and reimplementation of Bacula's communication 
modules, and the scheduler, and no doubt other bits to go away.


Bacula started life twenty years ago, and the environment has changed 
since then, and, while Bacula has kept up with some things, disk as a 
target rather than tape, frex, something like re-startable jobs is, as I 
have said, not just an extension or addition to what is there, but a big 
change to a large part of Bacula.


And that's a massive risk, it's the sort of task I would be looking at 
having a whole team work on, a couple of designers, six to ten 
programmers, and a QA team with a nasty manager who was not restricted 
from saying, "No!" when things don't work quite right.


And the mob above all have a *really* good understanding of how the 
various bits of Bacula work, and interact, and are capable of and 
allowed to replace ancient groaning bits of code with newer versions 
that just aren't as wrong.  (First task - rename all files so the 
extensions represent the C++ code inside them, and for the really 
cruddy^Wannoying stuff, G++.)


And, from the commercial stand-point, that the changes could be made 
without interrupting the existing income stream.


Then there's the projected time-line before it could be released?
I don't want to think about that, Bacula is fragile as it is, ripping it 
apart and stitching it back together would be a massive task!


And Bacula does not have that capability, not in the OSS space nor in 
the Enterprise space.


All the above said, I think that re-startable jobs would be a great 
enhancement for Bacula, but how often and for how long does it try by 
default before giving up?  :->


Cheers,
Gary B-)




[Bacula-users] areas for improvement?

2020-05-27 Thread Alan Brown

I've been running Bacula for ~15 years (community/enterprise) and have
identified a few areas which are in desperate need of improvement:

For an "enterprise" grade backup system, it's amazingly fragile in a few
areas (particularly in actual Enterprise networks!)


Bacula DOES NOT LIKE and does not handle network interruptions _at all_
if backups are in progress. This _will_ cause backups to abort - and
these aborted backups are _not_ resumable

Similarly, if there's any kind of disruption between the director and
database, the only fix is to restart the director


What that means is that Bacula _cannot_ be used with a High Availability
database because network interruptions (when switching servers) are part
of the HA paradigm.

It also means that operators have to be _extremely_ careful about
allowing automated or other system upgrades


In days of multi-TB backup sets, this is turning into a showstopping
problem.

As we are an Enterprise customer this has been raised with Bacula Systems
but been given _very_ low priority.   I'd like to hear opinions from the
wider community on this


Opinion: I know bugs aren't sexy to work on but these need fixing, not
being brushed off. This is the difference between LAN-quality and actual
Enterprise grade software.




