On 23/06/2022 02:39, Marco Gaiarin wrote:

I'm setting up a Bacula installation for a set of servers, using the Debian
buster package (9.4.2-2+deb10u1) for the director and storage daemons.

Everything works as expected as long as we use file daemon versions 7+, but
some very old servers that run Bacula 5.0 or 5.2 do not work.


Communication works as expected; the backup starts but stalls after a few
kilobytes, for example:

  *status client=merlot-fd
  Connecting to Client merlot-fd at merlot.ps.lnf.it:9102
merlot-fd Version: 5.2.6 (21 February 2012) x86_64-pc-linux-gnu debian 7.0
  Daemon started 22-giu-22 15:53. Jobs: run=0 running=0.
   Heap: heap=724,992 smbytes=176,337 max_bytes=176,484 bufs=116 max_bufs=117
   Sizeof: boffset_t=8 size_t=8 debug=0 trace=0
  Running Jobs:
  JobId 56 Job Merlot.2022-06-22_15.56.40_19 is running.
      Full Backup Job started: 22-giu-22 15:56
      Files=3 Bytes=132,602 Bytes/sec=185 Errors=0
      Files Examined=3
      Processing file: /var/backups/aptitude.pkgstates.0
      SDReadSeqNo=5 fd=5
  Director connected at: 22-giu-22 16:08
  ====
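
One way to confirm that a job like this is really stalled, rather than just slow, is to capture the `status client` output twice, a few minutes apart, and compare the Files= and Bytes= counters. The following is a minimal sketch under that assumption; the function names `parse_progress` and `is_stalled` are hypothetical helpers, not part of Bacula, and the regular expression is based only on the progress line shown above.

```python
import re

# Matches the progress line printed by "status client", e.g.
#   Files=3 Bytes=132,602 Bytes/sec=185 Errors=0
PROGRESS_RE = re.compile(
    r"Files=([\d,]+)\s+Bytes=([\d,]+)\s+Bytes/sec=([\d,]+)\s+Errors=([\d,]+)"
)


def parse_progress(status_text):
    """Return (files, bytes, bytes_per_sec, errors), or None if no progress line."""
    match = PROGRESS_RE.search(status_text)
    if match is None:
        return None
    # Counters are printed with thousands separators, so strip the commas.
    return tuple(int(group.replace(",", "")) for group in match.groups())


def is_stalled(earlier_status, later_status):
    """True when two samples of a running job show no new files or bytes."""
    earlier = parse_progress(earlier_status)
    later = parse_progress(later_status)
    if earlier is None or later is None:
        # Without a progress line in both samples nothing can be concluded.
        return False
    return later[0] == earlier[0] and later[1] == earlier[1]
```

Feeding it two saved copies of the output above would report a stall if both still show Files=3 and Bytes=132,602.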

I've checked the communication between dir, sd and fd with Wireshark, and I
haven't found any problems (e.g. no filtered packets, retransmissions, ...)
that would suggest ''network troubles'' (my first thought).


I've also tried enabling spooling (not needed: the media are disks). With
spooling enabled I can see the fd send the whole spool chunk to the sd; then,
when the sd starts to write to media, the backup stalls and, after some
timeout, stops.


It feels to me like some 'metadata' trouble, which is why I'm asking about
compatibility between these Bacula versions.
But hints on how to troubleshoot this carefully are welcome!

Just some data points, probably not directly helpful...

I'm still running old FDs - 3.0.3 on some AIX boxes being the oldest - talking to a 9.4.2 DIR/SD on Solaris 11.3, and all is good.

One thing is that I build all components - except for Windows - myself; I prefer things to be done my way.

And as I use tape, spooling is turned on.

But this:
"> sd start to write to media, backup stall and after some timeout stop."
makes me wonder...

How much RAM is on your server, and how are the disks you use for storage connected?

It could be something to do with disk activity blocking network traffic, which I have seen before (not related to Bacula) on VM systems where things are over-subscribed up the wazoo.
Or when the storage is a NAS on the same network as everything else.
Or just a cheap-shit HBA.
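
A rough way to sanity-check the storage side is to time a synced write on the volume the SD writes to. The snippet below is only a sketch under stated assumptions: `write_throughput` is a hypothetical helper, the directory passed in would be wherever the SD keeps its disk volumes, and a single small write like this says nothing about behaviour under sustained load.

```python
import os
import tempfile
import time


def write_throughput(directory, total_mb=16, chunk_kb=1024):
    """Write total_mb of zeros to a temp file in directory; return MB/s."""
    chunk = b"\0" * (chunk_kb * 1024)
    n_chunks = max(1, (total_mb * 1024) // chunk_kb)
    with tempfile.NamedTemporaryFile(dir=directory) as handle:
        start = time.monotonic()
        for _ in range(n_chunks):
            handle.write(chunk)
        # Flush Python's buffer, then ask the OS to push data to the device,
        # so the page cache does not hide a slow disk or controller.
        handle.flush()
        os.fsync(handle.fileno())
        elapsed = time.monotonic() - start
    written_mb = n_chunks * chunk_kb / 1024
    return written_mb / max(elapsed, 1e-9)
```

A figure far below what the hardware should deliver, or one that collapses while a backup is running, would point towards the storage path rather than Bacula version compatibility.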

        Cheers,
                Gary    B-)


_______________________________________________
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users
