Hello,

Just to close this: the current code for 3.1.4 seems to behave quite well. When the connection limit is reached, the job is canceled, and only the original error message and the cancel notification are printed -- at least on my system with my tests :-)
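For illustration only, here is a minimal Python sketch of the behaviour described above. It is not Bacula code: `FakeDatabase`, `ConnectionLimitError` and `start_jobs` are hypothetical names, and the cap of 2 connections against 10 jobs simply mirrors the debug scenario mentioned later in the thread. Each job that hits the limit gets exactly two messages, the original error and one cancel notice.

```python
class ConnectionLimitError(Exception):
    """Raised when the simulated database refuses a new connection."""


class FakeDatabase:
    """Hypothetical stand-in for a server with a hard cap on open connections."""

    def __init__(self, max_connections):
        self.max_connections = max_connections
        self.open_connections = 0

    def connect(self):
        # Refuse the connection once the cap is reached.
        if self.open_connections >= self.max_connections:
            raise ConnectionLimitError("too many connections")
        self.open_connections += 1


def start_jobs(db, job_count):
    """Start job_count jobs at once; return (running, canceled, messages)."""
    running, canceled, messages = [], [], []
    for job_id in range(1, job_count + 1):
        try:
            db.connect()
            running.append(job_id)
        except ConnectionLimitError as err:
            # Report the original error once, then a single cancel notice,
            # and do nothing further that could trigger a "false" error.
            messages.append(f"job {job_id}: error: {err}")
            messages.append(f"job {job_id}: canceled")
            canceled.append(job_id)
    return running, canceled, messages


running, canceled, messages = start_jobs(FakeDatabase(max_connections=2), 10)
print(len(running), "running,", len(canceled), "canceled")
print("\n".join(messages))
```

In this sketch the jobs never release their connections, which models all jobs being started at the same moment.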
Best regards,

Kern

On Wednesday 04 November 2009 17:03:30 Alex Bramley wrote:
> Hi Kern,
>
> 2009/11/4 Kern Sibbald <k...@sibbald.com>:
> > Yes, this is very useful. It is not often that I am able to see a series
> > of cascading errors generated by a real "database" error, so it gave me a
> > chance to see how many times an error message is repeated, and where it
> > gets distorted because the job thread must continue to the end but avoid
> > trying to do anything that will cause another "false" error message.
>
> I'm glad to hear it!
>
> > I think I have cleaned up a good part of these error messages, but what
> > is worrying me is that you say that Jobs still got stuck in the SD. So,
> > what would be most useful would be for you to tell me the *exact*
> > PostgreSQL config statement (including where it is) that I must change
> > to invoke this error for documentation purposes. I am going to add debug
> > code to Bacula to force the error by allowing a maximum of 2 and trying to
> > start 10 jobs. If I can duplicate the jobs getting "stuck", I can
> > probably completely resolve it.
>
> Steps to reproduce, then:
>
> 1) Set up bacula with a large number of clients that run jobs
>    concurrently.
> 2) Edit postgresql.conf, set "max_connections = <small number>".
>    Restart PostgreSQL completely.
> 3) Restart director (so it reconnects to PostgreSQL -- it seemed to
>    have problems with "status dir" on the console if this wasn't done).
> 4) Run all client jobs at once, so bacula-dir connects to PostgreSQL
>    and exceeds max_connections.
> 5) ???
> 6) Profit!
>
> > I'll be submitting some more patches to clean the error handling up a lot
> > more, but I wouldn't recommend at this point that you attempt to take
> > them.
>
> Awesome, thank you. I'll probably keep an eye on the git commit log but
> I won't have much time to work on this from my end for a week or so
> after this Friday due to other work commitments.
>
> > I recommend sticking with what you have and either going back to 3.0.3 or
> > preferably testing the patch carefully before putting it into
> > production.
>
> I've given it a bit of a beating, but I am pretty satisfied that when
> max_connections is set high enough I won't run into these problems
> anyway, so I'm going to stick with the patched 3.0.3 packages I have
> installed.
>
> > If I find out how to "unstick" the stuck jobs, I will let you know.
>
> Again, thanks for spending the time looking into this, it's really helpful!
>
> --Alex
>
> _______________________________________________
> Bacula-devel mailing list
> Bacula-devel@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/bacula-devel