Re: [Bacula-users] bacula jobs use volumes from the wrong pool - bug?

2012-07-10 Thread Martin Simmons
 On Mon, 09 Jul 2012 12:55:14 -0700, Stephen Thompson said:
 
 On 07/09/12 11:37, Martin Simmons wrote:
  On Fri, 06 Jul 2012 11:12:35 -0700, Stephen Thompson said:
 
  On 07/06/2012 11:01 AM, Martin Simmons wrote:
  On Thu, 05 Jul 2012 11:35:15 -0700, Stephen Thompson said:
 
  Hello again,
 
  Here's something even stranger...  Another Full job logs that it's
  written to a volume in the Full pool (FB0956), but then the status
  output of the job lists a volume in the Incremental pool (IM0093).  This
  Incremental volume was never even mentioned in the log as a volume to
  which the job despooled.
 
  It could be a database problem (the volumes listed in the status output 
  come
  from a query).  What is the output of the sql commands below?
 
  SELECT VolumeName,JobMedia.* FROM JobMedia,Media WHERE 
  JobMedia.JobId=242323 AND JobMedia.MediaId=Media.MediaId;
 
  SELECT MediaId,VolumeName FROM Media WHERE Media.VolumeName in 
  ('IM0093','FB0956');
 
 
  Looks like it did in fact write to the Incremental tape IM0093 instead
  of the requested Full tape BUT logged that it wrote to a Full tape
  FB0956.  This begs the questions 1) Why is it writing to a tape in
  another pool? and 2) Why is logging that it wrote to a different tape
  than it did?
 
  You could verify that IM0093 contains the data by using bls -j with the tape
  loaded (but not mounted in Bacula).
 
  It looks like you have concurrent jobs (non-consecutive JobMediaId values).
  Was another job trying to use IM0093?  Maybe IM0093 was in another drive and
  Bacula mixed up the drives somehow?
 
 
 Yes, I believe that FB0956 was in one drive and IM0093 in the other, 
 though I do not understand how bacula 'mixed up' which volume to use, or 
 which drive a particular volume was in.
 
 Not sure how closely related this is, but I've seen cases occasionally 
 where bacula will say that it cannot mount a certain volume in Drive0 
 and requires user intervention, only to find that the volume requested 
 is already mounted and in use in Drive1 by other jobs.  So it is 
 possible for bacula either to lose track of which drive a volume is in 
 or to not be sure if a volume is already in use.
 
 I did a partial restore of the job and it did in fact load and read off 
 IM0093 successfully.  So in some sense I know what happened, I just 
 don't know why it happened or how to prevent it (other than isolating 
 jobs, but that defeats the point of concurrency).

You could try upgrading to 5.2.10.  If that doesn't fix it, then reporting it
in the bug tracker might be the next step
(http://www.bacula.org/en/?page=bugs).

__Martin

--
Live Security Virtual Conference
Exclusive live event will cover all the ways today's security and 
threat landscape has changed and how IT managers can respond. Discussions 
will include endpoint security, mobile security and the latest in malware 
threats. http://www.accelacomm.com/jaw/sfrnl04242012/114/50122263/
___
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users


Re: [Bacula-users] bacula jobs use volumes from the wrong pool - bug?

2012-07-10 Thread Stephen Thompson
On 07/10/2012 10:53 AM, Martin Simmons wrote:
 On Mon, 09 Jul 2012 12:55:14 -0700, Stephen Thompson said:

 On 07/09/12 11:37, Martin Simmons wrote:
 On Fri, 06 Jul 2012 11:12:35 -0700, Stephen Thompson said:

 On 07/06/2012 11:01 AM, Martin Simmons wrote:
 On Thu, 05 Jul 2012 11:35:15 -0700, Stephen Thompson said:

 Hello again,

 Here's something even stranger...  Another Full job logs that it's
 written to a volume in the Full pool (FB0956), but then the status
 output of the job lists a volume in the Incremental pool (IM0093).  This
 Incremental volume was never even mentioned in the log as a volume to
 which the job despooled.

 It could be a database problem (the volumes listed in the status output 
 come
 from a query).  What is the output of the sql commands below?

 SELECT VolumeName,JobMedia.* FROM JobMedia,Media WHERE 
 JobMedia.JobId=242323 AND JobMedia.MediaId=Media.MediaId;

 SELECT MediaId,VolumeName FROM Media WHERE Media.VolumeName in 
 ('IM0093','FB0956');


 Looks like it did in fact write to the Incremental tape IM0093 instead
 of the requested Full tape BUT logged that it wrote to a Full tape
 FB0956.  This begs the questions 1) Why is it writing to a tape in
 another pool? and 2) Why is logging that it wrote to a different tape
 than it did?

 You could verify that IM0093 contains the data by using bls -j with the tape
 loaded (but not mounted in Bacula).

 It looks like you have concurrent jobs (non-consecutive JobMediaId values).
 Was another job trying to use IM0093?  Maybe IM0093 was in another drive and
 Bacula mixed up the drives somehow?


 Yes, I believe that FB0956 was in one drive and IM0093 in the other,
 though I do not understand how bacula 'mixed up' which volume to use, or
 which drive a particular volume was in.

 Not sure how closely related this is, but I've seen cases occasionally
 where bacula will say that it cannot mount a certain volume in Drive0
 and requires user intervention, only to find that the volume requested
 is already mounted and in use in Drive1 by other jobs.  So it is
 possible for bacula either to lose track of which drive a volume is in
 or to not be sure if a volume is already in use.

 I did a partial restore of the job and it did in fact load and read off
 IM0093 successfully.  So in some sense I know what happened, I just
 don't know why it happened or how to prevent it (other than isolating
 jobs, but that defeats the point of concurrency).

 You could try upgrading to 5.2.10.  If that doesn't fix it, then reporting it
 in the bug tracker might be the next step
 (http://www.bacula.org/en/?page=bugs).


Already upgraded.  We'll see if it happens again.

thanks,
Stephen



 __Martin

 --
 Live Security Virtual Conference
 Exclusive live event will cover all the ways today's security and
 threat landscape has changed and how IT managers can respond. Discussions
 will include endpoint security, mobile security and the latest in malware
 threats. http://www.accelacomm.com/jaw/sfrnl04242012/114/50122263/
 ___
 Bacula-users mailing list
 Bacula-users@lists.sourceforge.net
 https://lists.sourceforge.net/lists/listinfo/bacula-users



-- 
Stephen Thompson   Berkeley Seismological Laboratory
step...@seismo.berkeley.edu215 McCone Hall # 4760
404.538.7077 (phone)   University of California, Berkeley
510.643.5811 (fax) Berkeley, CA 94720-4760



--
Live Security Virtual Conference
Exclusive live event will cover all the ways today's security and 
threat landscape has changed and how IT managers can respond. Discussions 
will include endpoint security, mobile security and the latest in malware 
threats. http://www.accelacomm.com/jaw/sfrnl04242012/114/50122263/
___
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users


Re: [Bacula-users] bacula jobs use volumes from the wrong pool - bug?

2012-07-09 Thread Martin Simmons
 On Fri, 06 Jul 2012 11:12:35 -0700, Stephen Thompson said:
 
 On 07/06/2012 11:01 AM, Martin Simmons wrote:
  On Thu, 05 Jul 2012 11:35:15 -0700, Stephen Thompson said:
 
  Hello again,
 
  Here's something even stranger...  Another Full job logs that it's
  written to a volume in the Full pool (FB0956), but then the status
  output of the job lists a volume in the Incremental pool (IM0093).  This
  Incremental volume was never even mentioned in the log as a volume to
  which the job despooled.
 
  It could be a database problem (the volumes listed in the status output come
  from a query).  What is the output of the sql commands below?
 
  SELECT VolumeName,JobMedia.* FROM JobMedia,Media WHERE 
  JobMedia.JobId=242323 AND JobMedia.MediaId=Media.MediaId;
 
  SELECT MediaId,VolumeName FROM Media WHERE Media.VolumeName in 
  ('IM0093','FB0956');
 
 
 Looks like it did in fact write to the Incremental tape IM0093 instead 
 of the requested Full tape BUT logged that it wrote to a Full tape 
 FB0956.  This begs the questions 1) Why is it writing to a tape in 
 another pool? and 2) Why is logging that it wrote to a different tape 
 than it did?

You could verify that IM0093 contains the data by using bls -j with the tape
loaded (but not mounted in Bacula).

It looks like you have concurrent jobs (non-consecutive JobMediaId values).
Was another job trying to use IM0093?  Maybe IM0093 was in another drive and
Bacula mixed up the drives somehow?

__Martin

--
Live Security Virtual Conference
Exclusive live event will cover all the ways today's security and 
threat landscape has changed and how IT managers can respond. Discussions 
will include endpoint security, mobile security and the latest in malware 
threats. http://www.accelacomm.com/jaw/sfrnl04242012/114/50122263/
___
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users


Re: [Bacula-users] bacula jobs use volumes from the wrong pool - bug?

2012-07-09 Thread Stephen Thompson
On 07/09/12 11:37, Martin Simmons wrote:
 On Fri, 06 Jul 2012 11:12:35 -0700, Stephen Thompson said:

 On 07/06/2012 11:01 AM, Martin Simmons wrote:
 On Thu, 05 Jul 2012 11:35:15 -0700, Stephen Thompson said:

 Hello again,

 Here's something even stranger...  Another Full job logs that it's
 written to a volume in the Full pool (FB0956), but then the status
 output of the job lists a volume in the Incremental pool (IM0093).  This
 Incremental volume was never even mentioned in the log as a volume to
 which the job despooled.

 It could be a database problem (the volumes listed in the status output come
 from a query).  What is the output of the sql commands below?

 SELECT VolumeName,JobMedia.* FROM JobMedia,Media WHERE 
 JobMedia.JobId=242323 AND JobMedia.MediaId=Media.MediaId;

 SELECT MediaId,VolumeName FROM Media WHERE Media.VolumeName in 
 ('IM0093','FB0956');


 Looks like it did in fact write to the Incremental tape IM0093 instead
 of the requested Full tape BUT logged that it wrote to a Full tape
 FB0956.  This begs the questions 1) Why is it writing to a tape in
 another pool? and 2) Why is logging that it wrote to a different tape
 than it did?

 You could verify that IM0093 contains the data by using bls -j with the tape
 loaded (but not mounted in Bacula).

 It looks like you have concurrent jobs (non-consecutive JobMediaId values).
 Was another job trying to use IM0093?  Maybe IM0093 was in another drive and
 Bacula mixed up the drives somehow?


Yes, I believe that FB0956 was in one drive and IM0093 in the other, 
though I do not understand how bacula 'mixed up' which volume to use, or 
which drive a particular volume was in.

Not sure how closely related this is, but I've seen cases occasionally 
where bacula will say that it cannot mount a certain volume in Drive0 
and requires user intervention, only to find that the volume requested 
is already mounted and in use in Drive1 by other jobs.  So it is 
possible for bacula either to lose track of which drive a volume is in 
or to not be sure if a volume is already in use.

I did a partial restore of the job and it did in fact load and read off 
IM0093 successfully.  So in some sense I know what happened, I just 
don't know why it happened or how to prevent it (other than isolating 
jobs, but that defeats the point of concurrency).

Stephen




 __Martin

 --
 Live Security Virtual Conference
 Exclusive live event will cover all the ways today's security and
 threat landscape has changed and how IT managers can respond. Discussions
 will include endpoint security, mobile security and the latest in malware
 threats. http://www.accelacomm.com/jaw/sfrnl04242012/114/50122263/
 ___
 Bacula-users mailing list
 Bacula-users@lists.sourceforge.net
 https://lists.sourceforge.net/lists/listinfo/bacula-users



-- 
Stephen Thompson   Berkeley Seismological Laboratory
step...@seismo.berkeley.edu215 McCone Hall # 4760
404.538.7077 (phone)   University of California, Berkeley
510.643.5811 (fax) Berkeley, CA 94720-4760



--
Live Security Virtual Conference
Exclusive live event will cover all the ways today's security and 
threat landscape has changed and how IT managers can respond. Discussions 
will include endpoint security, mobile security and the latest in malware 
threats. http://www.accelacomm.com/jaw/sfrnl04242012/114/50122263/
___
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users


Re: [Bacula-users] bacula jobs use volumes from the wrong pool - bug?

2012-07-06 Thread Martin Simmons
 On Thu, 05 Jul 2012 11:35:15 -0700, Stephen Thompson said:
 
 Hello again,
 
 Here's something even stranger...  Another Full job logs that it's 
 written to a volume in the Full pool (FB0956), but then the status 
 output of the job lists a volume in the Incremental pool (IM0093).  This 
 Incremental volume was never even mentioned in the log as a volume to 
 which the job despooled.

It could be a database problem (the volumes listed in the status output come
from a query).  What is the output of the sql commands below?

SELECT VolumeName,JobMedia.* FROM JobMedia,Media WHERE JobMedia.JobId=242323 
AND JobMedia.MediaId=Media.MediaId;

SELECT MediaId,VolumeName FROM Media WHERE Media.VolumeName in 
('IM0093','FB0956');

__Martin

--
Live Security Virtual Conference
Exclusive live event will cover all the ways today's security and 
threat landscape has changed and how IT managers can respond. Discussions 
will include endpoint security, mobile security and the latest in malware 
threats. http://www.accelacomm.com/jaw/sfrnl04242012/114/50122263/
___
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users


Re: [Bacula-users] bacula jobs use volumes from the wrong pool - bug?

2012-07-06 Thread Stephen Thompson
On 07/06/2012 11:01 AM, Martin Simmons wrote:
 On Thu, 05 Jul 2012 11:35:15 -0700, Stephen Thompson said:

 Hello again,

 Here's something even stranger...  Another Full job logs that it's
 written to a volume in the Full pool (FB0956), but then the status
 output of the job lists a volume in the Incremental pool (IM0093).  This
 Incremental volume was never even mentioned in the log as a volume to
 which the job despooled.

 It could be a database problem (the volumes listed in the status output come
 from a query).  What is the output of the sql commands below?

 SELECT VolumeName,JobMedia.* FROM JobMedia,Media WHERE JobMedia.JobId=242323 
 AND JobMedia.MediaId=Media.MediaId;

 SELECT MediaId,VolumeName FROM Media WHERE Media.VolumeName in 
 ('IM0093','FB0956');


Looks like it did in fact write to the Incremental tape IM0093 instead 
of the requested Full tape BUT logged that it wrote to a Full tape 
FB0956.  This begs the questions 1) Why is it writing to a tape in 
another pool? and 2) Why is logging that it wrote to a different tape 
than it did?




mysql SELECT VolumeName,JobMedia.* FROM JobMedia,Media WHERE 
JobMedia.JobId=242323 AND JobMedia.MediaId=Media.MediaId;
++++-++---+---+-++--+--+
| VolumeName | JobMediaId | JobId  | MediaId | FirstIndex | LastIndex | 
StartFile | EndFile | StartBlock | EndBlock | VolIndex |
++++-++---+---+-++--+--+
| IM0093 |1946327 | 242323 |1072 |  1 |   429 | 
   285 | 285 |   4851 | 7628 |1 |
| IM0093 |1946330 | 242323 |1072 |429 |   435 | 
   286 | 286 |  0 | 7628 |2 |
| IM0093 |1946332 | 242323 |1072 |435 |   438 | 
   287 | 287 |  0 | 7628 |3 |
| IM0093 |1946334 | 242323 |1072 |438 |   446 | 
   288 | 288 |  0 | 7628 |4 |
| IM0093 |1946338 | 242323 |1072 |446 |   446 | 
   289 | 289 |  0 | 7628 |5 |
| IM0093 |1946344 | 242323 |1072 |446 |   484 | 
   290 | 290 |  0 | 7628 |6 |
| IM0093 |1946347 | 242323 |1072 |484 |   727 | 
   291 | 291 |  0 | 7628 |7 |
| IM0093 |1946351 | 242323 |1072 |727 |   727 | 
   292 | 292 |  0 | 7628 |8 |
| IM0093 |1946357 | 242323 |1072 |727 |  6237 | 
   293 | 293 |  0 | 7628 |9 |
| IM0093 |1946360 | 242323 |1072 |   6237 |  9134 | 
   294 | 294 |  0 | 7628 |   10 |
| IM0093 |1946363 | 242323 |1072 |   9134 | 12816 | 
   295 | 295 |  0 | 7628 |   11 |
| IM0093 |1946368 | 242323 |1072 |  12816 | 12950 | 
   296 | 296 |  0 | 7628 |   12 |
| IM0093 |1946371 | 242323 |1072 |  12950 | 12985 | 
   297 | 297 |  0 | 7628 |   13 |
| IM0093 |1946374 | 242323 |1072 |  12985 | 13140 | 
   298 | 298 |  0 | 7628 |   14 |
| IM0093 |1946378 | 242323 |1072 |  13140 | 13181 | 
   299 | 299 |  0 | 7628 |   15 |
| IM0093 |1946383 | 242323 |1072 |  13181 | 13283 | 
   300 | 300 |  0 | 7628 |   16 |
| IM0093 |1946386 | 242323 |1072 |  13283 | 19855 | 
   301 | 301 |  0 | 7628 |   17 |
| IM0093 |1946390 | 242323 |1072 |  19855 | 26710 | 
   302 | 302 |  0 | 7628 |   18 |
| IM0093 |1946391 | 242323 |1072 |  26710 | 33532 | 
   303 | 303 |  0 | 7628 |   19 |
| IM0093 |1946394 | 242323 |1072 |  33532 | 40378 | 
   304 | 304 |  0 | 7628 |   20 |
| IM0093 |1946397 | 242323 |1072 |  40378 | 47275 | 
   305 | 305 |  0 | 7628 |   21 |
| IM0093 |1946400 | 242323 |1072 |  47275 | 54271 | 
   306 | 306 |  0 | 7628 |   22 |
| IM0093 |1946403 | 242323 |1072 |  54271 | 58872 | 
   307 | 307 |  0 | 7628 |   23 |
| IM0093 |1946406 | 242323 |1072 |  58872 | 58872 | 
   308 | 308 |  0 | 7628 |   24 |
| IM0093 |1946409 | 242323 |1072 |  58872 | 58873 | 
   309 | 309 |  0 | 7628 |   25 |
| IM0093 |1946412 | 242323 |1072 |  58873 | 58873 | 
   310 | 310 |  0 | 7628 |   26 |
| IM0093   

Re: [Bacula-users] bacula jobs use volumes from the wrong pool - bug?

2012-07-05 Thread Stephen Thompson
On 07/05/2012 10:46 AM, Stephen Thompson wrote:


 Hello,

 Running 5.2.9, though I believe we've seen this sporadically in earlier
 versions.  Jobs are using volumes that are in pools to which they have
 not been assigned.

 This is likely a bug as I don't see anything peculiar about our
 configuration.  We are using a tape library with 2 drives, both set to
 autoselect. The library contains volumes that are properly assigned
 (i.e. database entries for volumes look fine) to various pools,
 including a Full pool and an Incremental pool.

 Twice in the past week, Full jobs which specify the use of the Full pool
 using jobdefs are using volumes from the Incremental pool.  I haven't
 narrowed down all the details, but I believe it's if the Incremental
 volume is already loaded in a drive when the Full job in question is
 launched.


 Example:

 01-Jul 22:00 DIR JobId 244098: Fatal error: JobId 243957 already
 running. Duplicate job not allowed.
 01-Jul 20:03 DIR JobId 243957: Start Backup JobId 243957,
 Job=JOB.2012-07-01_20.00.04_03
 01-Jul 20:04 DIR JobId 243957: Using Device SL500-Drive-0
 01-Jul 20:42 SD JobId 243957: Spooling data ...
 02-Jul 14:01 SD JobId 243957: Job write elapsed time = 17:19:00,
 Transfer rate = 33.39 M Bytes/second
 02-Jul 14:01 SD JobId 243957: Committing spooled data to Volume
 FB0161. Despooling 2,082,572,994,002 bytes ...
 02-Jul 15:38 SD JobId 243957: End of Volume IM0094 at 462:1559 on
 device SL500-Drive-0 (/dev/SL500-Drive-0). Write of 262144 bytes got -1.
 02-Jul 15:38 SD JobId 243957: Re-read of last block succeeded.
 02-Jul 15:38 SD JobId 243957: End of medium on Volume IM0094
 Bytes=924,097,352,704 Blocks=3,525,155 at 02-Jul-2012 15:38.
 02-Jul 15:38 SD JobId 243957: 3307 Issuing autochanger unload slot
 148, drive 0 command.
 02-Jul 15:39 SD JobId 243957: 3301 Issuing autochanger loaded? drive
 0 command.
 02-Jul 15:39 SD JobId 243957: 3302 Autochanger loaded? drive 0,
 result: nothing loaded.
 02-Jul 15:39 SD JobId 243957: 3301 Issuing autochanger loaded? drive
 0 command.
 02-Jul 15:39 SD JobId 243957: 3302 Autochanger loaded? drive 0,
 result: nothing loaded.
 02-Jul 15:39 SD JobId 243957: 3301 Issuing autochanger loaded? drive
 1 command.
 02-Jul 15:39 SD JobId 243957: 3302 Autochanger loaded? drive 1,
 result: nothing loaded.
 02-Jul 15:39 SD JobId 243957: 3304 Issuing autochanger load slot 6,
 drive 0 command.
 02-Jul 15:40 SD JobId 243957: 3305 Autochanger load slot 6, drive 0,
 status is OK.
 02-Jul 15:40 SD JobId 243957: Volume FB0161 previously written,
 moving to end of data.
 02-Jul 15:40 SD JobId 243957: Ready to append to end of Volume
 FB0161 at file=7.
 02-Jul 15:40 SD JobId 243957: New volume FB0161 mounted on device
 SL500-Drive-0 (/dev/SL500-Drive-0) at 02-Jul-2012 15:40.
 02-Jul 18:19 SD JobId 243957: End of Volume FB0161 at 274:5411 on
 device SL500-Drive-0 (/dev/SL500-Drive-0). Write of 262144 bytes got -1.
 02-Jul 18:19 SD JobId 243957: Re-read of last block succeeded.
 02-Jul 18:19 SD JobId 243957: End of medium on Volume FB0161
 Bytes=535,390,861,312 Blocks=2,042,367 at 02-Jul-2012 18:19.
 02-Jul 18:19 SD JobId 243957: 3307 Issuing autochanger unload slot 6,
 drive 0 command.
 02-Jul 18:20 SD JobId 243957: 3301 Issuing autochanger loaded? drive
 0 command.
 02-Jul 18:20 SD JobId 243957: 3302 Autochanger loaded? drive 0,
 result: nothing loaded.
 02-Jul 18:20 SD JobId 243957: 3301 Issuing autochanger loaded? drive
 0 command.
 02-Jul 18:20 SD JobId 243957: 3302 Autochanger loaded? drive 0,
 result: nothing loaded.
 02-Jul 18:20 SD JobId 243957: 3301 Issuing autochanger loaded? drive
 1 command.
 02-Jul 18:20 SD JobId 243957: 3302 Autochanger loaded? drive 1,
 result: nothing loaded.
 02-Jul 18:20 SD JobId 243957: 3304 Issuing autochanger load slot 4,
 drive 0 command.
 02-Jul 18:21 SD JobId 243957: 3305 Autochanger load slot 4, drive 0,
 status is OK.
 02-Jul 18:21 SD JobId 243957: Volume FB0158 previously written,
 moving to end of data.
 02-Jul 18:21 SD JobId 243957: Ready to append to end of Volume
 FB0158 at file=1.
 02-Jul 18:21 SD JobId 243957: New volume FB0158 mounted on device
 SL500-Drive-0 (/dev/SL500-Drive-0) at 02-Jul-2012 18:21.
 02-Jul 21:14 SD JobId 243957: End of Volume FB0158 at 274:1785 on
 device SL500-Drive-0 (/dev/SL500-Drive-0). Write of 262144 bytes got -1.
 02-Jul 21:14 SD JobId 243957: Re-read of last block succeeded.
 02-Jul 21:14 SD JobId 243957: End of medium on Volume FB0158
 Bytes=546,439,698,432 Blocks=2,084,507 at 02-Jul-2012 21:14.

 02-Jul 21:14 SD JobId 243957: 3307 Issui02-Jul 22:00 DIR JobId
 244271: Fatal error: JobId 244133 already running. Duplicate job not
 allowed.

This line is btw as it is in the log file.  Looks a bit mangled, like 
two lines combined in one line 02-Jul 21:14 SD JobId 243957: 3307 
Issui and 02-Jul 22:00 DIR JobId 244271: Fatal error: JobId 244133 
already running. Duplicate job not allowed.


 02-Jul 22:00 DIR JobId 244268: Fatal error: JobId 243957 already
 running. Duplicate job 

Re: [Bacula-users] bacula jobs use volumes from the wrong pool - bug?

2012-07-05 Thread Stephen Thompson


Hello again,

Here's something even stranger...  Another Full job logs that it's 
written to a volume in the Full pool (FB0956), but then the status 
output of the job lists a volume in the Incremental pool (IM0093).  This 
Incremental volume was never even mentioned in the log as a volume to 
which the job despooled.


22-Jun 20:00 DIR JobId 242323: Start Backup JobId 242323, 
Job=JOB.2012-06-22_20.00.02_06
22-Jun 20:01 DIR JobId 242323: Using Device SL500-Drive-1
22-Jun 20:06 SD JobId 242323: 3301 Issuing autochanger loaded? drive 
1 command.
22-Jun 20:06 SD JobId 242323: 3302 Autochanger loaded? drive 1, 
result: nothing loaded.
22-Jun 20:06 SD JobId 242323: 3301 Issuing autochanger loaded? drive 
1 command.
22-Jun 20:06 SD JobId 242323: 3302 Autochanger loaded? drive 1, 
result: nothing loaded.
22-Jun 20:06 SD JobId 242323: 3304 Issuing autochanger load slot 138, 
drive 1 command.
22-Jun 20:07 SD JobId 242323: 3305 Autochanger load slot 138, drive 
1, status is OK.
22-Jun 20:07 SD JobId 242323: Volume FB0956 previously written, 
moving to end of data.
22-Jun 20:08 SD JobId 242323: Ready to append to end of Volume 
FB0956 at file=4.
22-Jun 20:08 SD JobId 242323: Spooling data ...
23-Jun 00:01 SD JobId 242323: Job write elapsed time = 03:53:01, 
Transfer rate = 10.80 M Bytes/second
23-Jun 00:01 SD JobId 242323: Committing spooled data to Volume 
FB0956. Despooling 151,089,481,092 bytes ...
23-Jun 01:28 SD JobId 242323: Despooling elapsed time = 01:27:07, 
Transfer rate = 28.90 M Bytes/second
23-Jun 01:28 SD JobId 242323: Sending spooled attrs to the Director. 
Despooling 99,242,108 bytes ...
23-Jun 01:30 DIR JobId 242323: Bacula DIR 5.2.9 (11Jun12):

   Build OS:   x86_64-unknown-linux-gnu redhat Enterprise 
release
   JobId:  242323
   Job:JOB.2012-06-22_20.00.02_06
   Backup Level:   Full
   Client: FD 5.2.6 (21Feb12) 
x86_64-unknown-linux-gnu,redhat,
   FileSet:FS 2012-01-22 20:00:03
   Pool:   Full-Pool-2012-06 (From Job resource)
   Catalog:MyCatalog (From Client resource)
   Storage:SL500-changer (From Job resource)
   Scheduled time: 22-Jun-2012 20:00:02
   Start time: 22-Jun-2012 20:01:29
   End time:   23-Jun-2012 01:30:15
   Elapsed time:   5 hours 28 mins 46 secs
   Priority:   10
   FD Files Written:   337,622
   SD Files Written:   337,622
   FD Bytes Written:   150,974,593,977 (150.9 GB)
   SD Bytes Written:   151,023,845,596 (151.0 GB)
   Rate:   7653.6 KB/s
   Software Compression:   None
   VSS:no
   Encryption: no
   Accurate:   yes
   Volume name(s): IM0093
   Volume Session Id:  9
   Volume Session Time:1340291913
   Last Volume Bytes:  723,076,936,704 (723.0 GB)
   Non-fatal FD errors:0
   SD Errors:  0
   FD termination status:  OK
   SD termination status:  OK
   Termination:Backup OK

23-Jun 01:30 DIR JobId 242323: Begin pruning Jobs older than 10 years .
23-Jun 01:30 DIR JobId 242323: No Jobs found to prune.
23-Jun 01:30 DIR JobId 242323: Begin pruning Files.
23-Jun 01:30 DIR JobId 242323: No Files found to prune.
23-Jun 01:30 DIR JobId 242323: End auto prune.



Note:  FB0956 is in the Full pool, IM0093 in the Incremental.


The vast majority of our jobs are being successful, but when something 
like this happens, I lost all faith that I even have the backups I think 
I have!!!

I just tried a test restore of this particular job, and it did in fact 
use IM0093 to restore from.  Ugh.

Stephen





On 07/05/2012 10:46 AM, Stephen Thompson wrote:


 Hello,

 Running 5.2.9, though I believe we've seen this sporadically in earlier
 versions.  Jobs are using volumes that are in pools to which they have
 not been assigned.

 This is likely a bug as I don't see anything peculiar about our
 configuration.  We are using a tape library with 2 drives, both set to
 autoselect. The library contains volumes that are properly assigned
 (i.e. database entries for volumes look fine) to various pools,
 including a Full pool and an Incremental pool.

 Twice in the past week, Full jobs which specify the use of the Full pool
 using jobdefs are using volumes from the Incremental pool.  I haven't
 narrowed down all the details, but I believe it's if the Incremental
 volume is already loaded in a drive when the Full job in question is
 launched.


 Example:

 01-Jul 22:00 DIR JobId 244098: Fatal error: JobId 243957 already
 running. Duplicate job not allowed.
 01-Jul 20:03 DIR JobId 243957: Start Backup JobId 243957,
 Job=JOB.2012-07-01_20.00.04_03
 01-Jul 20:04 DIR JobId 243957: Using Device SL500-Drive-0
 01-Jul 20:42 SD JobId 243957: Spooling data ...
 02-Jul 14:01 SD JobId 243957: Job write elapsed time = 17:19:00,
 Transfer rate = 33.39 M Bytes/second
 02-Jul 14:01 SD JobId 243957: