Hi all,
I now can say for certain that there is a problem in 2.2.9-p3
which was not present in 2.2.8, and it will hurt people who
use a (real) 2+ drive autochanger.
This is *not* one of my "highly-ambitions-migration stuff" - it
will give problems with plain normal backups.
I demonstrate it from the logfile:
1. This is my autochanger emulation as defined to the SD:
>Device status:
>Autochanger "FileSwap" with devices:
> "Files0" (/var/bxpool/files0)
> "Files1" (/var/bxpool/files1)
> "Files2" (/var/bxpool/files2)
> "Files3" (/var/bxpool/files3)
>Device "Files0" (/var/bxpool/files0) is not open.
> Drive 0 status unknown.
>Device "Files1" (/var/bxpool/files1) is not open.
> Drive 1 is not loaded.
>Device "Files2" (/var/bxpool/files2) is not open.
> Drive 2 status unknown.
>Device "Files3" (/var/bxpool/files3) is not open.
> Drive 3 status unknown.
2. This is my "undelete" Function: a plain incremental backup
of the homedirs running every 10 minutes, at xx:x4. We can see
that it currently writes to Volume DISK-004 (aka slot 4) mounted
to "Files0" (aka drive 0):
29-Mar 19:04 BxDir JobId 19445: Start Backup JobId 19445,
Job=HomeDirsEdge.2008-03-29_19.04.30
29-Mar 19:04 BxDir JobId 19445: Using Device "Files0"
29-Mar 19:04 BxSdGate JobId 19445: 3301 Issuing autochanger "loaded? drive 0"
command.
29-Mar 19:04 BxSdGate JobId 19445: 3302 Autochanger "loaded? drive 0", result
is Slot 4.
29-Mar 19:04 BxSdGate JobId 19445: 3301 Issuing autochanger "loaded? drive 0"
command.
29-Mar 19:04 BxSdGate JobId 19445: 3302 Autochanger "loaded? drive 0", result
is Slot 4.
29-Mar 19:04 BxSdGate JobId 19445: Volume "DISK-004" previously written, moving
to end of data.
29-Mar 19:04 BxSdGate JobId 19445: Ready to append to end of Volume "DISK-004"
size=89145203
29-Mar 19:04 BxSdGate JobId 19445: Job write elapsed time = 00:00:07, Transfer
rate = 162 bytes/second
29-Mar 19:04 BxDir JobId 19445: Bacula BxDir 2.2.9-b3 (27Mar08): 29-Mar-2008
19:04:08
Build OS: i386-portbld-freebsd6.3 freebsd 6.3-RELEASE-p1
JobId: 19445
Job: HomeDirsEdge.2008-03-29_19.04.30
Backup Level: Incremental, since=2008-03-29 18:54:01
Client: "Edge" 2.2.8 (26Jan08)
i386-portbld-freebsd6.3,freebsd,6.3-RELEASE-p1
FileSet: "HomeDirs" 2008-02-07 23:28:19
Pool: "DiskOnly" (From Run pool override)
Storage: "Files" (From Pool resource)
Scheduled time: 29-Mar-2008 19:04:00
Start time: 29-Mar-2008 19:04:01
End time: 29-Mar-2008 19:04:08
Elapsed time: 7 secs
Priority: 10
FD Files Written: 2
SD Files Written: 2
FD Bytes Written: 956 (956 B)
SD Bytes Written: 1,134 (1.134 KB)
Rate: 0.1 KB/s
Software Compression: 65.8 %
VSS: no
Storage Encryption: no
Volume name(s): DISK-004
Volume Session Id: 139
Volume Session Time: 1206772246
Last Volume Bytes: 89,358,477 (89.35 MB)
Non-fatal FD errors: 0
SD Errors: 0
FD termination status: OK
SD termination status: OK
Termination: Backup OK
3. This is the regular daily incremental backup running at 19:12.
It currently writes to DISK-009 (aka slot 9), and we can see
that it moves the DISK-004 out of slot 0 and mounts DISK-009
there (and later switches to DISK-026):
29-Mar 19:14 BxDir JobId 19448: Start Backup JobId 19448,
Job=BackupEdge.2008-03-29_19.12.33
29-Mar 19:14 BxDir JobId 19448: Using Device "Files0"
BxFdEdge JobId 19448: /usr is a different filesystem. Will not descend from /
into /usr
[etc.etc.]
29-Mar 19:21 BxSdGate JobId 19448: User defined maximum volume capacity
266,291,200 exceeded on device "Files0" (/var/bxpool/files0).
29-Mar 19:21 BxSdGate JobId 19448: End of medium on Volume "DISK-009"
Bytes=266,270,170 Blocks=4,130 at 29-Mar-2008 19:21.
29-Mar 19:21 BxDir JobId 19448: There are no more Jobs associated with Volume
"DISK-026". Marking it purged.
29-Mar 19:21 BxDir JobId 19448: All records pruned from Volume "DISK-026";
marking it "Purged"
29-Mar 19:21 BxDir JobId 19448: Recycled volume "DISK-026"
29-Mar 19:21 BxSdGate JobId 19448: 3301 Issuing autochanger "loaded? drive 0"
command.
29-Mar 19:21 BxSdGate JobId 19448: 3302 Autochanger "loaded? drive 0", result
is Slot 9.
29-Mar 19:21 BxSdGate JobId 19448: 3307 Issuing autochanger "unload slot 9,
drive 0" command.
29-Mar 19:21 BxSdGate JobId 19448: 3304 Issuing autochanger "load slot 26,
drive 0" command.
29-Mar 19:21 BxSdGate JobId 19448: 3305 Autochanger "load slot 26, drive 0",
status is OK.
29-Mar 19:21 BxSdGate JobId 19448: 3301 Issuing autochanger "loaded? drive 0"
command.
29-Mar 19:21 BxSdGate JobId 19448: 3302 Autochanger "loaded? drive 0", result
is Slot 26.
29-Mar 19:21 BxSdGate JobId 19448: Recycled volume "DISK-026" on device
"Files0" (/var/bxpool/files0), all previous data lost.
29-Mar 19:21 BxSdGate JobId 19448: New volume "DISK-026" mounted on device
"Files0" (/var/bxpool/files0) at 29-Mar-2008 19:21.
BxFdEdge JobId 19448: /j/serv/home is a different filesystem. Will not descend
from /j/serv into /j/serv/home
[etc.etc.]
29-Mar 19:23 BxSdGate JobId 19448: Job write elapsed time = 00:08:04, Transfer
rate = 79.71 K bytes/second
29-Mar 19:23 BxDir JobId 19448: Bacula BxDir 2.2.9-b3 (27Mar08): 29-Mar-2008
19:23:30 Build OS: i386-portbld-freebsd6.3 freebsd 6.3-RELEASE-p1
JobId: 19448
Job: BackupEdge.2008-03-29_19.12.33
Backup Level: Incremental, since=2008-03-28 19:13:20
Client: "Edge" 2.2.8 (26Jan08)
i386-portbld-freebsd6.3,freebsd,6.3-RELEASE-p1
FileSet: "SysBack" 2008-02-10 06:46:30
Pool: "SysBackDisk" (From Job resource)
Storage: "Files" (From Pool resource)
Scheduled time: 29-Mar-2008 19:12:00
Start time: 29-Mar-2008 19:14:56
End time: 29-Mar-2008 19:23:30
Elapsed time: 8 mins 34 secs
Priority: 10
FD Files Written: 2,602
SD Files Written: 2,602
FD Bytes Written: 38,215,529 (38.21 MB)
SD Bytes Written: 38,583,617 (38.58 MB)
Rate: 74.3 KB/s
Software Compression: 61.2 %
VSS: no
Storage Encryption: no
Volume name(s): DISK-009|DISK-026
Volume Session Id: 144
Volume Session Time: 1206772246
Last Volume Bytes: 14,063,002 (14.06 MB)
Non-fatal FD errors: 0
SD Errors: 0
FD termination status: OK
SD termination status: OK
Termination: Backup OK
4. This is the next "undelete" Backup at 19:14, it runs *while* the
regular daily is still running. We can see that now
-as drive 0 is occupied-
it uses drive 1 and mounts it's DISK-004 there. That's fine so
far:
29-Mar 19:14 BxDir JobId 19449: Start Backup JobId 19449,
Job=HomeDirsEdge.2008-03-29_19.14.34
29-Mar 19:14 BxDir JobId 19449: Using Device "Files1"
29-Mar 19:14 BxSdGate JobId 19449: 3301 Issuing autochanger "loaded? drive 1"
command.
29-Mar 19:14 BxSdGate JobId 19449: 3302 Autochanger "loaded? drive 1", result:
nothing loaded.
29-Mar 19:14 BxSdGate JobId 19449: 3304 Issuing autochanger "load slot 4, drive
1" command.
29-Mar 19:14 BxSdGate JobId 19449: 3305 Autochanger "load slot 4, drive 1",
status is OK.
29-Mar 19:14 BxSdGate JobId 19449: 3301 Issuing autochanger "loaded? drive 1"
command.
29-Mar 19:14 BxSdGate JobId 19449: 3302 Autochanger "loaded? drive 1", result
is Slot 4.
29-Mar 19:14 BxSdGate JobId 19449: Volume "DISK-004" previously written, moving
to end of data.
29-Mar 19:14 BxSdGate JobId 19449: Ready to append to end of Volume "DISK-004"
size=89358477
29-Mar 19:14 BxSdGate JobId 19449: Job write elapsed time = 00:00:30, Transfer
rate = 3.094 K bytes/second
29-Mar 19:14 BxDir JobId 19449: Bacula BxDir 2.2.9-b3 (27Mar08): 29-Mar-2008
19:14:41
Build OS: i386-portbld-freebsd6.3 freebsd 6.3-RELEASE-p1
JobId: 19449
Job: HomeDirsEdge.2008-03-29_19.14.34
Backup Level: Incremental, since=2008-03-29 19:04:01
Client: "Edge" 2.2.8 (26Jan08)
i386-portbld-freebsd6.3,freebsd,6.3-RELEASE-p1
FileSet: "HomeDirs" 2008-02-07 23:28:19
Pool: "DiskOnly" (From Run pool override)
Storage: "Files" (From Pool resource)
Scheduled time: 29-Mar-2008 19:14:00
Start time: 29-Mar-2008 19:14:02
End time: 29-Mar-2008 19:14:41
Elapsed time: 39 secs
Priority: 10
FD Files Written: 4
SD Files Written: 4
FD Bytes Written: 92,443 (92.44 KB)
SD Bytes Written: 92,823 (92.82 KB)
Rate: 2.4 KB/s
Software Compression: 74.8 %
VSS: no
Storage Encryption: no
Volume name(s): DISK-004
Volume Session Id: 142
Volume Session Time: 1206772246
Last Volume Bytes: 89,462,810 (89.46 MB)
Non-fatal FD errors: 0
SD Errors: 0
FD termination status: OK
SD termination status: OK
Termination: Backup OK
5. This is the next "undelete" Backup at 19:24, it runs *after* the
regular daily has terminated. Now the situation is:
in drive 0 there is still DISK-026 from the regular daily
in drive 1 is DISK-004 where this Backup should go to.
Now what happens is this:
29-Mar 19:24 BxDir JobId 19451: Start Backup JobId 19451,
Job=HomeDirsEdge.2008-03-29_19.24.36
29-Mar 19:24 BxDir JobId 19451: Using Device "Files0"
29-Mar 19:24 BxSdGate JobId 19451: 3301 Issuing autochanger "loaded? drive 0"
command.
29-Mar 19:24 BxSdGate JobId 19451: 3302 Autochanger "loaded? drive 0", result
is Slot 26.
29-Mar 19:24 BxSdGate JobId 19451: 3307 Issuing autochanger "unload slot 26,
drive 0" command.
29-Mar 19:24 BxSdGate JobId 19451: 3304 Issuing autochanger "load slot 4, drive
0" command.
29-Mar 19:24 BxSdGate JobId 19451: Fatal error: 3992 Bad autochanger "load slot
4, drive 0": ERR=Child exited with code 1.
Results=
29-Mar 19:24 BxFdEdge JobId 19451: Fatal error: job.c:1811 Bad response to
Append Data command. Wanted 3000 OK data
, got 3903 Error append data
29-Mar 19:24 BxDir JobId 19451: Error: Bacula BxDir 2.2.9-b3 (27Mar08):
29-Mar-2008 19:24:08
Build OS: i386-portbld-freebsd6.3 freebsd 6.3-RELEASE-p1
JobId: 19451
Job: HomeDirsEdge.2008-03-29_19.24.36
Backup Level: Incremental, since=2008-03-29 19:14:02
Client: "Edge" 2.2.8 (26Jan08)
i386-portbld-freebsd6.3,freebsd,6.3-RELEASE-p1
FileSet: "HomeDirs" 2008-02-07 23:28:19
Pool: "DiskOnly" (From Run pool override)
Storage: "Files" (From Pool resource)
Scheduled time: 29-Mar-2008 19:24:00
Start time: 29-Mar-2008 19:24:02
End time: 29-Mar-2008 19:24:08
Elapsed time: 6 secs
Priority: 10
FD Files Written: 0
SD Files Written: 0
FD Bytes Written: 0 (0 B)
SD Bytes Written: 0 (0 B)
Rate: 0.0 KB/s
Software Compression: None
VSS: no
Storage Encryption: no
Volume name(s):
Volume Session Id: 145
Volume Session Time: 1206772246
Last Volume Bytes: 89,462,810 (89.46 MB)
Non-fatal FD errors: 0
SD Errors: 0
FD termination status: Error
SD termination status: Error
Termination: *** Backup Error ***
We see: it unloads the DISK-026 from drive 0, and then wants to load DISK-004
_from_it's_slot_ - while DISK004 actually is in drive 1 where it was
written the last time!
This is reproducible, i.e. it seems to happen quite always under
such circumstances.
This problem will then persist until _either_
- you do a manual umount on drive 1 _or_
- some other backup job comes along and does that umount.
This did not happen with Release 2.2.8 - there the SD would remember
it's mounts (and then either move the Volume correctly from drive 1
to drive 0, or use drive 1 as write-target right away).
At least it did work for *writing* jobs. *Reading* jobs had exactly
this problem all the time - this was one of the issues on my list which
I were not yet talking about (due to my lack in understanding that there
is a lot of difference in handling reading and writing jobs).
rgds,
PMc
P.S. If you use the "disk-changer" script as it is in the
distribution, you will most likely *not* see this problem.
That is because the "disk-changer" can mount the same volume
into two (virtual) drives at the same time.
-------------------------------------------------------------------------
Check out the new SourceForge.net Marketplace.
It's the best place to buy or sell services for
just about anything Open Source.
http://ad.doubleclick.net/clk;164216239;13503038;w?http://sf.net/marketplace
_______________________________________________
Bacula-devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/bacula-devel