Re: AW: [Bacula-users] Bacula director freezing

2005-05-31 Thread Alan Brown

On Mon, 30 May 2005, Masopust Christian wrote:



maybe you could have a look at bug 331 on bugs.bacula.org. i had a similar
problem where bacula-dir randomly hangs. after applying kern's patch it hasn't
happened so far, but before closing this bug i would prefer to wait at least
one week ;-))


As there is a HIGH chance that there will be attempts to use different 
volumes on the same drive, this fix won't work for me.


AB



---
This SF.Net email is sponsored by Yahoo.
Introducing Yahoo! Search Developer Network - Create apps using Yahoo!
Search APIs Find out how you can build Yahoo! directly into your own
Applications - visit http://developer.yahoo.net/?fr=offad-ysdn-ostg-q22005
___
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users


Re: AW: [Bacula-users] Bacula director freezing

2005-05-31 Thread Kern Sibbald
On Tuesday 31 May 2005 16:14, Alan Brown wrote:
 On Tue, 31 May 2005, Kern Sibbald wrote:
  On version 1.37.20, providing you are using the new Autochanger resource
  in the SD, your second job will automatically select another drive if one
  is available, otherwise wait.

 Management won't let me test unstable versions.

Yes, I imagined so.  This is the case with most bigger operations ...


  Within a week or so, I hope to ensure that Bacula will never try to load
  the same tape simultaneously on two drives -- I received my 2 drive
  autochanger this morning and confirmed that it works a few minutes ago.
  My biggest problem is programming with it turned on, since it makes
  almost as much noise as a vacuum cleaner :-)

 Yes, this is always a problem with the Overland beasties. They are
 designed for use in a machine room - and they can pick up dust as
 effectively as a vacuum cleaner too, so a suitable enclosure wouldn't
 hurt.

At the moment, it is probably the worst of all situations -- a desk top unit 
sitting on my carpeted floor next to a window.   As soon as possible (when I 
get another SCSI card, and when a couple guys with strong arms are around), 
it will go down two flights of stairs into the basement with closed windows 
on a real desktop next to my servers.

-- 
Best regards,

Kern

  (
  /\
  V_V
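
The goal described above, that the same tape is never loaded on two drives at once, amounts to a small reservation table guarded by a lock. The following is only a minimal sketch of that idea in Python, not Bacula's implementation; the class name `VolumeReservations`, its methods, and the volume and drive names are all hypothetical.

```python
import threading


class VolumeReservations:
    """Hypothetical table mapping each volume to the one drive allowed to load it."""

    def __init__(self):
        self._lock = threading.Lock()
        self._drive_for_volume = {}

    def reserve(self, volume, drive):
        """Return True if `drive` may load `volume`, False if another drive holds it."""
        with self._lock:
            holder = self._drive_for_volume.get(volume)
            if holder is not None and holder != drive:
                return False  # the volume is already claimed by a different drive
            self._drive_for_volume[volume] = drive
            return True

    def release(self, volume, drive):
        """Drop the reservation, but only if `drive` is the current holder."""
        with self._lock:
            if self._drive_for_volume.get(volume) == drive:
                del self._drive_for_volume[volume]


reservations = VolumeReservations()
first = reservations.reserve("Vol-0001", "drive-0")   # first job claims the tape
second = reservations.reserve("Vol-0001", "drive-1")  # second drive is refused
reservations.release("Vol-0001", "drive-0")
third = reservations.reserve("Vol-0001", "drive-1")   # free again after release
```

A job that is refused would then either pick another drive or wait, which matches the behaviour described for the Autochanger resource earlier in this message.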




AW: [Bacula-users] Bacula director freezing

2005-05-29 Thread Masopust Christian

maybe you could have a look at bug 331 on bugs.bacula.org. i had a similar problem
where bacula-dir randomly hangs. after applying kern's patch it hasn't happened so
far, but before closing this bug i would prefer to wait at least one week ;-))



 -----Original Message-----
 From: [EMAIL PROTECTED]
 [mailto:[EMAIL PROTECTED]] On behalf of Alan Brown
 Sent: Thursday, 26 May 2005 17:32
 To: Arno Lehmann
 Cc: Ali Zaidi; bacula-users@lists.sourceforge.net
 Subject: Re: [Bacula-users] Bacula director freezing
 
 On Sat, 21 May 2005, Arno Lehmann wrote:
 
  However for the last two Friday nights the bacula
  director has been freezing after backing up the first
  seven clients.
 
  I did experience the same, couldn't find any reason, but after the upgrade
  to 1.36.2 that didn't happen again. So, I suggest you do an upgrade to that
  version or the current release version 1.36.3.
 
 It's happening for me on 1.36.2 and 1.36.3
 
  Another peculiar thing i saw was that the number of
  bacula-dir processes on the system increases from 4 to
  around 21.
 
  That's normal - that's all the worker threads for the different jobs and
  bookkeeping. I don't know how many I had, but when the director crashed I
  usually had four jobs running and saw about 10-20 jobs / threads.
 
 It's about the same number as the maximums set in the config files.
 
 AB
 
 
 
 





Re: AW: [Bacula-users] Bacula director freezing

2005-05-25 Thread Kern Sibbald
Please see bug report 331 (if I am not mistaken).  I've uploaded a correction 
that should fix the problem.

On Wednesday 25 May 2005 16:09, Jeffery P. Humes wrote:
 I am not going to be much help here, but just wanted to say that I am
 having the same issue with (I believe) the director freezing.

 It is seemingly random.  Sometimes it stops responding every other day,
 sometimes it will go 1-2 weeks.
 I have been running this version of bacula for about 2 months.

 Version:
 kninfratemp-dir Version: 1.36.2 (28 February 2005)
 (with Tape EOF restore patch applied)

 I will most likely upgrade to 1.36.3 in the near future.

 I just don't even know where to start troubleshooting this; I don't get a
 traceback at all when it freezes.

 -Jeff Humes

 Masopust Christian wrote:
  hi kern,
 
  all right, submitted this problem as a bug (331).
 
  i'm not sure if this is really a problem with timeout as i don't have any
  time limits configured in my config.  the freeze of the director occurred
  when trying to start the first job in the evening. the last job that ran
  before was at 2pm and it finished without problems.
 
  anyway, bug is submitted and thanks for your help!  (but first, please
  enjoy your holidays!!)
 
  chris
 
   -----Original Message-----
   From: Kern Sibbald [mailto:[EMAIL PROTECTED]]
   Sent: Tuesday, 24 May 2005 22:42
   To: bacula-users@lists.sourceforge.net
   Cc: Masopust Christian
   Subject: Re: [Bacula-users] Bacula director freezing
  
   Hello,
  
   This appears to be a deadlock situation, and seems to be triggered by a
   watchdog timeout, which means you have probably set some maximum time
   limit for a job.
  
   Though the deadlock could be related to version 1.36.3, I'd be a bit
   surprised. At this point, I cannot exclude a 1.36.3 specific problem, so
   I'll carefully check that after returning from vacation.
  
   I'd appreciate it if you would submit this traceback as a bug report along
   with your Director's conf file.
  
   On Tuesday 24 May 2005 13:25, Masopust Christian wrote:
Yesterday in the evening, just when starting some jobs my director
again freezes...

here's the output of btraceback (my system is Fedora Core 3, Bacula is
1.36.3):
   
From [EMAIL PROTECTED]  Mon May 23 22:01:32 2005
Return-Path: [EMAIL PROTECTED]
Received: from atpcc7fc.sie.siemens.at (atpcc7fc.sie.siemens.at
[127.0.0.1]) by atpcc7fc.sie.siemens.at (8.13.1/8.13.1) with SMTP id
j4NK1VFi027151
for [EMAIL PROTECTED]; Mon, 23 May 2005 22:01:31 +0200
Message-Id: [EMAIL PROTECTED]
From: [EMAIL PROTECTED]
Subject: Bacula GDB traceback of bacula-dir
Sender: [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Date: Mon, 23 May 2005 22:01:31 +0200
Status: R
   
Using host libthread_db library "/lib/libthread_db.so.1".
[Thread debugging using libthread_db enabled]
[New Thread 16384 (LWP 3346)]
[New Thread 32769 (LWP 3351)]
[Thread debugging using libthread_db enabled]
[New Thread 16384 (LWP 3346)]
[New Thread 32769 (LWP 3351)]
[Thread debugging using libthread_db enabled]
[New Thread 16384 (LWP 3346)]
[New Thread 32769 (LWP 3351)]
[New Thread 16386 (LWP 3352)]
[New Thread 32771 (LWP 3353)]
[New Thread 19726340 (LWP 26151)]
[New Thread 19742725 (LWP 26152)]
[New Thread 19759110 (LWP 26164)]
[New Thread 19775495 (LWP 26172)]
[New Thread 19791880 (LWP 26180)]
[New Thread 19808265 (LWP 26203)]
[New Thread 19824650 (LWP 26267)]
[New Thread 19841035 (LWP 26294)]
[New Thread 19857420 (LWP 26320)]
[New Thread 19873805 (LWP 26381)]
[New Thread 19890190 (LWP 26411)]
[New Thread 19906575 (LWP 26434)]
0x004c80d4 in __pthread_sigsuspend () from /lib/i686/libpthread.so.0
$1 = "atpcc7fc-dir", '\0' repeats 17 times
$2 = 0x80b5230 "bacula-dir"
$3 = 0x80b5dd0 "/opt/bacula/sbin/"
$4 = "MySQL"
$5 = 0x80a321c "1.36.3 (22 April 2005)"
$6 = 0x809bfb8 "i686-redhat-linux-gnu"
$7 = 0x809bfb1 "redhat"
$8 = 0x809bfa4 "(Heidelberg)"
#0  0x004c80d4 in __pthread_sigsuspend () from /lib/i686/libpthread.so.0
#1  0x004c7708 in __pthread_wait_for_restart_signal () from /lib/i686/libpthread.so.0
#2  0x004c9720 in __pthread_alt_lock () from /lib/i686/libpthread.so.0
#3  0x004c614e in pthread_mutex_lock () from /lib/i686/libpthread.so.0
#4  0x08057dab in jobq_add (jq=0x80b4300, jcr=0x80fc570) at jobq.c:240
#5  0x080566d8 in run_job (jcr=0x80fc570) at job.c:140
#6  0x0804c034 in main (argc=0, argv=0x8090b55) at dird.c:241
   
Thread 16 (Thread 19906575 (LWP 26434)):
#0  0x004c80d4 in __pthread_sigsuspend () from /lib/i686/libpthread.so.0
#1  0x004c7708 in __pthread_wait_for_restart_signal () from /lib/i686/libpthread.so.0
#2  0x004c3fab in [EMAIL PROTECTED] () from /lib/i686/libpthread.so.0
#3  0x08087a7a in rwl_writelock 
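
In the trace above the main thread is blocked in pthread_mutex_lock inside jobq_add, while Thread 16 is waiting inside rwl_writelock. One classic way for threads to end up waiting on each other like this is two locks taken in opposite order; whether that is the actual cause of bug 331 is not established in this thread. The sketch below is not Bacula code. It is a small Python illustration under that assumption, with hypothetical names (`queue_lock`, `res_lock`, `opposite_order_demo`, `same_order_demo`), and it uses timeouts so that it reports the stall instead of hanging.

```python
import threading


def opposite_order_demo(timeout=0.2):
    """Two threads take two locks in opposite order.

    Returns a dict saying whether each thread obtained its second lock.
    """
    queue_lock = threading.Lock()  # hypothetical stand-in for the job queue mutex
    res_lock = threading.Lock()    # hypothetical stand-in for the resource write lock
    barrier = threading.Barrier(2)
    got = {}

    def worker(name, first, second):
        with first:
            barrier.wait()  # both threads now hold their first lock
            ok = second.acquire(timeout=timeout)
            if ok:
                second.release()
            got[name] = ok
            barrier.wait()  # keep holding `first` until both attempts are over

    a = threading.Thread(target=worker, args=("a", queue_lock, res_lock))
    b = threading.Thread(target=worker, args=("b", res_lock, queue_lock))
    a.start()
    b.start()
    a.join()
    b.join()
    return got


def same_order_demo():
    """Both threads take the locks in the same order, so neither can block the other forever."""
    queue_lock = threading.Lock()
    res_lock = threading.Lock()
    got = {}

    def worker(name):
        with queue_lock:
            with res_lock:
                got[name] = True

    threads = [threading.Thread(target=worker, args=(n,)) for n in ("a", "b")]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return got


stalled = opposite_order_demo()
healthy = same_order_demo()
```

With real blocking locks and no timeout, the first variant would hang the way the Director does here, and a backtrace would show each thread parked in its lock call.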

Re: AW: [Bacula-users] Bacula director freezing

2005-05-25 Thread Jeffery P. Humes




I tried this with the version I have currently.

I got the below error:

g++ -c -I. -I.. -g -O2 -Wall jobq.c
jobq.c: In function `void* jobq_server(void*)':
jobq.c:489: error: `dird_free_jcr_pointers' undeclared (first use this function)
jobq.c:489: error: (Each undeclared identifier is reported only once for each function it appears in.)
make[1]: *** [jobq.o] Error 1
make[1]: Leaving directory `/usr/src/bacula-1.36.2/src/dird'


I will upgrade to 1.36.3.


-Jeff


Kern Sibbald wrote:

  Please see bug report 331 (if I am not mistaken).  I've uploaded a correction 
that should fix the problem.


Re: AW: [Bacula-users] Bacula director freezing

2005-05-25 Thread Kern Sibbald
Sorry, but I'm not too surprised it doesn't work on prior versions. 

I built and tested (regression) the fix (actually code from 1.37) on version 
1.36.3.

On Wednesday 25 May 2005 21:55, Jeffery P. Humes wrote:
 I tried this with the version I have currently.

 I got the below error:

 g++   -c   -I. -I..  -g -O2 -Wall  jobq.c
 jobq.c: In function `void* jobq_server(void*)':
 jobq.c:489: error: `dird_free_jcr_pointers' undeclared (first use this function)
 jobq.c:489: error: (Each undeclared identifier is reported only once for each function it appears in.)
 make[1]: *** [jobq.o] Error 1
 make[1]: Leaving directory `/usr/src/bacula-1.36.2/src/dird'


 I will upgrade to 1.36.3.


 -Jeff
