Hello,

17.07.2007 09:59, Alfredo Marchini wrote:
> Hello,
>
> Arno Lehmann wrote:
>> Hello,
>>
>> 16.07.2007 13:46, Kern Sibbald wrote:
>>
>>> On Monday 16 July 2007 13:17, Arno Lehmann wrote:
>>>
>>>> Hello,
>>>>
>> ...
>>
>>>>> Yes, either a kernel problem or a hardware problem seem the most
>>>>> likely. We cannot exclude a Bacula bug, but the finger is
>>>>> pointing to the CPU/hardware.
>>>
>>>> Well, this is problematic... Alfredo gave good reasons to assume
>>>> that it's not purely hardware/OS related. Basically, the problem
>>>> occurs when he runs certain jobs.
>>>>
>>> I didn't see that, but then I am no longer receive any email from the
>>> bacula-users list.
>>>
>>
>> Yes. I know, but it's hard moderating a discussion across two separate
>> mailing list :-)
>>
>>
>>>> I guess that the interworking of DIR, SD, catalog database, and OS
>>>> might trigger some sort of resource exhaustion, but debugging this
>>>> is beyond my abilities :-)
>>>>
>>> Or as I mentioned, it could be that Bacula is self destructing ...
>>>
>>>
>>>>> I recommend shutting down your machine, rebooting it, running
>>>>> memtest, and if all is OK, restarting Bacula and see what happens.
>>>>>
>>>> Fortunately, that's not my machine :-)
>>>>
>>>> Unfortunately, my backup server is dying, but I know and understand
>>>> that problem :-(
>>>>
>>> If you and he *really* think it is a Bacula bug, I'd *strongly*
>>> recommend that he upgrade to the latest 2.1.26 beta version.
>>>
>>
>>
> I think is a bacula bug, because on my first mail I thought that problem
> was caused when bacula-dir stay alive for 2 weeks, but last week, on
> wednesday my server reboot cause ups and bacula-dir restarted.
> But on sunday he was blocked again. And my jobs, files and volumes
> retention is set to 14 days (casually 2 weeks).

If I understand you correctly, this indicates that the problem does not
occur only at the two-week interval, but probably every Sunday.

> For experience (small), I think that if there is a bug in 2.0.3 version,
> it's possible that bacula-dir 2.1.23 source code keeps the bug, so it
> can be more useful search now for the bug and, if present, correct it
> before developers publish final version 2.1.x.
> But this is only what I think, I don't know anything about the source
> code of bacula and what's changed in version 2.1.x from 2.0.
> Because if is my hw or cpu problem, why all works fine for 2 weeks?

It does look as though a certain job causes the problem. Well, not
exactly a job, but a distinct volume configuration (caused by pruning
et al.) together with a job.

> Another think: I'm not sure but before I catch this problem I setted job
> and file retentions different (file < job retention), and I had got no
> problems. Then my customer asks me to set job and file retention at the
> same value... and I setted it, and started problems.

I can confirm that this is not a general problem.

> If I reboot server (or more simply restarts bacula-dir) we'll loose for
> 2 weeks the chance to make tests on the system. So I would be better if
> someone tell me which tests should I do...

Try setting up a test job like the one that causes the problem, but one
that runs on a daily schedule. Use a small fileset and a new schedule,
but the same client and pools as when Bacula hangs. Run that job daily,
and observe the volume / pool status.

> I'm interested to help
> developers to resolve this problem, because I like very much this
> project and I want to continue to using it for my customers.
> But I don't like very much to install on a 16 servers backup system a
> beta version of bacula-dir, because can born new problems... but if it's
> the only solution...

Well, upgrading the DIR only should be enough.
You can even keep the existing installation intact: just create the
new-version DIR under a new name, and run that instead of the existing
one. AFAIK, the catalog database schema has not been changed. (But
check the changelog!)

And, as this is going to the devel list only now, repost your
configuration details - OS version, bacula version, catalog database,
client version affected, and so on.

Arno

-- 
Arno Lehmann
IT-Service Lehmann
www.its-lehmann.de

_______________________________________________
Bacula-devel mailing list
https://lists.sourceforge.net/lists/listinfo/bacula-devel
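
For reference, the daily test job suggested above could be sketched
roughly as follows in the Director configuration. This is only an
illustration under assumptions, not a tested setup: every resource name
(HangTestDaily, HangTestSchedule, HangTestSmallSet), the run time, the
backup level and the file path are hypothetical, and the Client,
Storage, Pool and Messages values are placeholders that must be
replaced with the resources used by the job that hangs.

```conf
# Sketch only: all names below are hypothetical placeholders.

# A new schedule that fires every day instead of waiting two weeks.
Schedule {
  Name = "HangTestSchedule"
  Run = Full sun-sat at 12:00        # assumed time and level
}

# A deliberately small fileset so each test run finishes quickly.
FileSet {
  Name = "HangTestSmallSet"
  Include {
    Options {
      signature = MD5
    }
    File = /etc/hosts                # assumed path; any small file will do
  }
}

# The test job: new schedule and fileset, but the SAME client and pool
# as the job that is running when Bacula hangs.
Job {
  Name = "HangTestDaily"
  Type = Backup
  Level = Full
  Client = "client-of-the-failing-job-fd"   # placeholder
  FileSet = "HangTestSmallSet"
  Schedule = "HangTestSchedule"
  Storage = "storage-of-the-failing-job"    # placeholder
  Pool = "pool-of-the-failing-job"          # placeholder
  Messages = Standard                       # assumes a Messages resource with this name
}
```

After reloading the Director, the volume and pool state can be watched
from bconsole between runs, for example with `list volumes` and
`status dir`, to see whether pruning or recycling coincides with the
hang.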
