Re: [Bacula-users] bacula-dir 3.0.3 dies on second job run or manual reload.

2009-12-18 Thread Janusz Syrytczyk
On Monday 14 December 2009 09:27:40 Bruno Friedmann wrote:
 I don't know if this is your case.
 
 We have the same trouble here, with the Director hanging after running the
 first job. I've restarted it with -d100 to see what happens.
 In the meantime, on the Bacula server (which has been upgraded from
 openSUSE 11.1 to 11.2), I found that postfix was throttling (a missing
 relay.db file in /etc/postfix; running "postmap relay" and restarting
 postfix fixed it). After that, all emails went through.
 
 Since bsmtp in the messages section of my Director config connects to the
 internal postfix, bsmtp was hanging! And perhaps bacula-dir too.
 
 I have now run three scheduled jobs, and bacula-dir has done its job.
 
 What I suspect is that bsmtp has no timeout: if it cannot connect it
 returns, but if it connects and postfix then misbehaves (the throttling
 case), it waits indefinitely, and so does the Director.
 
 I will leave this configuration running for 2 to 3 days to be sure that
 was the cause.
 
 In the meantime, could you check on your side whether you are having
 trouble with bsmtp, to confirm or rule this out?
 
True, I've verified this too.

bsmtp turns into a zombie and bacula-dir waits on it. The solution is to use
another program for sending email, or to drop email notifications altogether.
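
One way to picture the "another program" option is a small wrapper that runs the mail command and kills it if it does not finish in time. This is only a sketch under assumptions: the function name run_with_timeout and the 60-second default are made up for illustration, and it assumes the mail program reads the message on standard input. It is not part of Bacula.

```python
import subprocess


def run_with_timeout(command, message, timeout_seconds=60):
    """Run a mail command, feeding it `message` on standard input.

    `command` is a list such as ["/path/to/mailer", "recipient"]
    (hypothetical; substitute whatever mailer is actually in use).
    Returns True if the command exits with status 0 in time, and
    False if it fails or has to be killed after `timeout_seconds`.
    """
    try:
        completed = subprocess.run(
            command,
            input=message.encode("utf-8"),
            timeout=timeout_seconds,  # the child is killed when this expires
            check=False,
        )
    except subprocess.TimeoutExpired:
        # The mailer hung (for example on a stalled SMTP server).
        return False
    return completed.returncode == 0
```

With a wrapper like this, a stalled mail server costs at most timeout_seconds per notification instead of blocking the caller indefinitely.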

I wonder whether this is a candidate for a bug report?

Thanks,
JS

___
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users


Re: [Bacula-users] bacula-dir 3.0.3 dies on second job run or manual reload.

2009-12-18 Thread Bruno Friedmann
On 12/18/2009 09:20 AM, Janusz Syrytczyk wrote:
 True, I've verified this too.
 
 bsmtp turns into a zombie and bacula-dir waits on it. The solution is to use
 another program for sending email, or to drop email notifications altogether.
 
 I wonder whether this is a candidate for a bug report?
 
 Thanks,
 JS
 

I think you should file a bug report for it
(post the bug number here so I can subscribe to it).
In fact, the Director or bsmtp needs a timeout somewhere to handle this kind of trap.



-- 

 Bruno Friedmann




Re: [Bacula-users] bacula-dir 3.0.3 dies on second job run or manual reload.

2009-12-14 Thread Bruno Friedmann
On 12/09/2009 12:11 AM, Janusz Syrytczyk wrote:
 Hi,
 
 I upgraded from 3.0.2 to 3.0.3 a while ago and I'm facing serious problems
 with bacula-dir stability.
 
 Right after it starts, the Director handles any request I make (backup,
 restore, reload, etc.). But once that first task is done, the Director
 stops responding to me: the second job does not start when requested. Then
 bconsole hangs and I have to exit with Ctrl+C. When I start bconsole again
 and type "status dir", it reports that the backup is running.
 
 The problem is that the backup is not running. The Director stays almost
 completely silent. When I try to reload through bconsole, the Director
 behaves like a zombie and I cannot connect. Debugging gives only this:
 
 atom-dir: bnet.c:670-0 who=client host=192.168.1.150 port=36131
 
 Interestingly, when I leave the Director alone it works fine: it schedules
 backups and performs them. I had previously suspected that something was
 wrong with the scheduler, because before this troubleshooting I couldn't
 even get the Director to schedule jobs, but for the last few days that has
 worked.
 
 This is the same issue as the one reported here, but he hasn't found the
 cause:
 
 http://www.mail-archive.com/bacula-users@lists.sourceforge.net/msg38279.html
 
 I've just moved the backups and the database, recompiled Bacula, recreated
 the database and started the backups, but the same thing happens. What
 could this be, anyone?
 

I don't know if this is your case.

We have the same trouble here, with the Director hanging after running the
first job. I've restarted it with -d100 to see what happens.
In the meantime, on the Bacula server (which has been upgraded from openSUSE
11.1 to 11.2), I found that postfix was throttling (a missing relay.db file
in /etc/postfix; running "postmap relay" and restarting postfix fixed it).
After that, all emails went through.

Since bsmtp in the messages section of my Director config connects to the
internal postfix, bsmtp was hanging! And perhaps bacula-dir too.

I have now run three scheduled jobs, and bacula-dir has done its job.

What I suspect is that bsmtp has no timeout: if it cannot connect it
returns, but if it connects and postfix then misbehaves (the throttling
case), it waits indefinitely, and so does the Director.

I will leave this configuration running for 2 to 3 days to be sure that was the cause.

In the meantime, could you check on your side whether you are having trouble
with bsmtp, to confirm or rule this out?
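
The two cases described above (cannot connect, versus connected but the server never answers) can be told apart with a short probe. This is only a sketch under assumptions: the function name probe_smtp_greeting, the return labels and the 10-second default are made up for illustration, and it only checks whether the server sends a greeting. It does not reproduce what bsmtp does internally.

```python
import socket


def probe_smtp_greeting(host, port, timeout_seconds=10.0):
    """Classify how a mail server answers a new TCP connection.

    Returns "unreachable" if the connection cannot be made,
    "timeout" if it is accepted but nothing arrives in time,
    "closed" if the server hangs up without sending anything,
    and "greeting" if the server sends any bytes.
    """
    try:
        sock = socket.create_connection((host, port), timeout=timeout_seconds)
    except OSError:
        # The easy case: the client can give up immediately.
        return "unreachable"
    with sock:
        try:
            data = sock.recv(512)  # recv uses the timeout set above
        except socket.timeout:
            # Connected, but the server stays silent. Without a timeout
            # this read would block forever.
            return "timeout"
        except OSError:
            return "closed"
    return "greeting" if data else "closed"
```

Running it against the local mail server, for example probe_smtp_greeting("localhost", 25), shows which case applies. A "timeout" result would match the throttling situation suspected here.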

-- 

 Bruno Friedmann



[Bacula-users] bacula-dir 3.0.3 dies on second job run or manual reload.

2009-12-08 Thread Janusz Syrytczyk
Hi,

I upgraded from 3.0.2 to 3.0.3 a while ago and I'm facing serious problems
with bacula-dir stability.

Right after it starts, the Director handles any request I make (backup,
restore, reload, etc.). But once that first task is done, the Director
stops responding to me: the second job does not start when requested. Then
bconsole hangs and I have to exit with Ctrl+C. When I start bconsole again
and type "status dir", it reports that the backup is running.

The problem is that the backup is not running. The Director stays almost
completely silent. When I try to reload through bconsole, the Director
behaves like a zombie and I cannot connect. Debugging gives only this:

atom-dir: bnet.c:670-0 who=client host=192.168.1.150 port=36131

Interestingly, when I leave the Director alone it works fine: it schedules
backups and performs them. I had previously suspected that something was
wrong with the scheduler, because before this troubleshooting I couldn't
even get the Director to schedule jobs, but for the last few days that has
worked.

This is the same issue as the one reported here, but he hasn't found the cause:

http://www.mail-archive.com/bacula-users@lists.sourceforge.net/msg38279.html

I've just moved the backups and the database, recompiled Bacula, recreated
the database and started the backups, but the same thing happens. What
could this be, anyone?
