On Sunday 05 August 2007 20:25, Alfredo Marchini wrote:
> Hi Kern, Arno,
>
> I resolved my problems with the Bacula 2.0.3 director!
> Now it doesn't block...
>
> I made these changes:
>
> Before, File, Job, and Volume retention were all set to 14 days.
>
> After the last crash (3 weeks ago) I set only Job Retention to 21
> days, and now, after 3 weeks, it works (I checked this five minutes ago)!
>
> So, after this, I think that there is a concurrency problem when
> bacula-dir does file, job and volume pruning at the same time...
>
> Have you ever tested this configuration?
No, but it seems to me to be *highly* unlikely that it is the cause of your
problems.

> I have only one server, and it is a production server, so I cannot start
> running tests on it.

If you can find a way to reproduce it, please open a bug report. Otherwise, we
will just cross our fingers and hope it is resolved.

> That's all.
> Thank you very much for your help
> Regards
>
> Kern Sibbald wrote:
> > On Tuesday 17 July 2007 13:00, Alfredo Marchini wrote:
> >> Hello,
> >> on Sunday I make the full backup of 16 servers, 8 at 09.00 and 8 at
> >> 13.00. On the other weekdays I make diff backups of 16 servers, 8 at
> >> 21.00 and 8 at 23.00.
> >> I use MySQL Server version 5.0.x.
> >
> > Hmmm. Well MySQL did have a rather big memory loss in version 2.0.3 or
> > lower, so let's hope that is the problem, and that it will be fixed.
> >
> >> The DIR never gets to 13.00; it blocks at the 09.00 backups.
> >> I talked with my customer and he prefers not to install the 2.1.26 beta,
> >> so first I will make another test:
> >> I will move the job retention to 21 days and keep file and volume
> >> retention at 14 days.
> >
> > Within the Bacula code changing the retention period would not increase
> > or decrease any risk of problems. However, depending on how much pruning
> > was going on, things could change a lot in terms of the memory loss
> > situations that were occurring in the SQL code. In addition, in
> > situations where the SD needs a tape and none are available, the new code
> > does *much* less pruning than the old code -- it limits the pruning to
> > only what is required to find a Volume rather than pruning the full pool.
> > This should give significant performance improvements during the backup
> > -- of course, at the end of the job, the same pruning must be done as
> > before ...
> >
> >> I think this was the original configuration and, if I remember well, it
> >> didn't give me any block.
> >> I know about the problem that the old jobs keep their old retention,
> >> so perhaps on 29-07 it will block again, but on 12-08 I will know if it
> >> really works (when it has pruned all old jobs with the 14-day job
> >> retention). If it blocks again (I think it will, but the server is not
> >> mine, it is the customer's), my customer has told me that I can install
> >> the beta version.
> >
> > Nice customer :-)
> >
> >> So I hope you'll have good holidays (I remain here to work), and I'll
> >> send more info in the next 2 weeks or next month.
> >> Thank you very much again
> >> Bye
> >>
> >> Kern Sibbald wrote:
> >>> On Tuesday 17 July 2007 10:47, Alfredo Marchini wrote:
> >>>> Hello, so at this point I will upgrade to version 2.1.26 beta
> >>>> and wait two weeks (14 days) to see whether the block occurs.
> >>>> The blocks always arrive every two weeks (14 days), on Sunday.
> >>>> But it does not depend on how long bacula-dir has been running,
> >>>> because, as I said, bacula-dir was restarted last Wednesday, 11-07,
> >>>> and blocked on Sunday 15-07, and the previous block was on Sunday
> >>>> 01-07.
> >>>> Now I will restart bacula-dir, and everything will start fine, and
> >>>> for 2 weeks it will be OK.
> >>>> I'll upgrade to version 2.1.26 and wait for 2 weeks. I hope that on
> >>>> 30-07 I will be able to say to you "all OK"!!!
> >>>
> >>> Yes, me too.
> >>>
> >>> In the meantime, think about anything special that happens on those
> >>> Sundays -- the problem is most likely related to that. Of course, it
> >>> may just be that you did a full save on those days, and it triggered
> >>> the memory loss in the SQL server, which in turn blocks Bacula. If
> >>> that is the case, then it is very likely the problem is already fixed.
> >>>
> >>> Regards,
> >>>
> >>> Kern
> >>>
> >>>> Thanks
> >>>> Bye
> >>>>
> >>>> Arno Lehmann wrote:
> >>>>> Hello,
> >>>>>
> >>>>> 17.07.2007 09:59,, Alfredo Marchini wrote::
> >>>>>> Hello,
> >>>>>>
> >>>>>> Arno Lehmann wrote:
> >>>>>>> Hello,
> >>>>>>>
> >>>>>>> 16.07.2007 13:46,, Kern Sibbald wrote::
> >>>>>>>> On Monday 16 July 2007 13:17, Arno Lehmann wrote:
> >>>>>>>>> Hello,
> >>>>>>>
> >>>>>>> ...
> >>>>>>>
> >>>>>>>>>> Yes, either a kernel problem or a hardware problem seems the most
> >>>>>>>>>> likely. We cannot exclude a Bacula bug, but the finger is
> >>>>>>>>>> pointing to the CPU/hardware.
> >>>>>>>>
> >>>>>>>>> Well, this is problematic... Alfredo gave good reasons to assume
> >>>>>>>>> that it's not purely hardware/OS related. Basically, the problem
> >>>>>>>>> occurs when he runs certain jobs.
> >>>>>>>>
> >>>>>>>> I didn't see that, but then I no longer receive any email from
> >>>>>>>> the bacula-users list.
> >>>>>>>
> >>>>>>> Yes. I know, but it's hard moderating a discussion across two
> >>>>>>> separate mailing lists :-)
> >>>>>>>
> >>>>>>>>> I guess that the interworking of DIR, SD, catalog database, and
> >>>>>>>>> OS might trigger some sort of resource exhaustion, but debugging
> >>>>>>>>> this is beyond my abilities :-)
> >>>>>>>>
> >>>>>>>> Or as I mentioned, it could be that Bacula is self destructing ...
> >>>>>>>>
> >>>>>>>>>> I recommend shutting down your machine, rebooting it, running
> >>>>>>>>>> memtest, and if all is OK, restarting Bacula and see what
> >>>>>>>>>> happens.
> >>>>>>>>>
> >>>>>>>>> Fortunately, that's not my machine :-)
> >>>>>>>>>
> >>>>>>>>> Unfortunately, my backup server is dying, but I know and
> >>>>>>>>> understand that problem :-(
> >>>>>>>>
> >>>>>>>> If you and he *really* think it is a Bacula bug, I'd *strongly*
> >>>>>>>> recommend that he upgrade to the latest 2.1.26 beta version.
> >>>>>>
> >>>>>> I think it is a Bacula bug, because in my first mail I thought that
> >>>>>> the problem was caused when bacula-dir stays alive for 2 weeks, but
> >>>>>> last week, on Wednesday, my server rebooted because of the UPS and
> >>>>>> bacula-dir restarted.
> >>>>>> But on Sunday it was blocked again. And my job, file and volume
> >>>>>> retention is set to 14 days (coincidentally 2 weeks).
> >>>>>
> >>>>> If I understand you correctly, this indicates that the problem does
> >>>>> not only occur in the two week interval, but probably every Sunday.
> >>>>>
> >>>>>> From my (limited) experience, I think that if there is a bug in
> >>>>>> version 2.0.3, it's possible that the bacula-dir 2.1.23 source code
> >>>>>> still has the bug, so it may be more useful to search for the bug
> >>>>>> now and, if present, correct it before the developers publish the
> >>>>>> final version 2.1.x.
> >>>>>> But this is only what I think; I don't know anything about the
> >>>>>> source code of Bacula or what has changed in version 2.1.x from 2.0.
> >>>>>> Because if it is a problem with my hardware or CPU, why does
> >>>>>> everything work fine for 2 weeks?
> >>>>>
> >>>>> It does look as though a certain job causes the problem. Well, not
> >>>>> exactly a job, but a distinct volume configuration (caused by pruning
> >>>>> et al.) together with a job.
> >>>>>
> >>>>>> Another thing: I'm not sure, but before I hit this problem I had set
> >>>>>> job and file retention to different values (file < job retention),
> >>>>>> and I had no problems. Then my customer asked me to set job and file
> >>>>>> retention to the same value... and I set it, and the problems
> >>>>>> started.
> >>>>>
> >>>>> I can confirm that this is not a general problem.
> >>>>>
> >>>>>> If I reboot the server (or more simply restart bacula-dir) we'll
> >>>>>> lose the chance to make tests on the system for 2 weeks. So it would
> >>>>>> be better if someone told me which tests I should do...
> >>>>>
> >>>>> Try setting up a test job like the one that causes the problem, but
> >>>>> that's running on a daily schedule. Use a small fileset, a new
> >>>>> schedule, but the same client and pools as when Bacula hangs.
> >>>>>
> >>>>> Run that job daily, and observe the volume / pool status.
> >>>>>
> >>>>>> I'm interested in helping the
> >>>>>> developers to resolve this problem, because I like this project
> >>>>>> very much and I want to continue using it for my customers.
> >>>>>> But I don't much like installing a beta version of bacula-dir on a
> >>>>>> backup system for 16 servers, because new problems may arise... but
> >>>>>> if it's the only solution...
> >>>>>
> >>>>> Well, upgrading the DIR only should be enough. You can even keep the
> >>>>> existing installation intact, just create the new version DIR under a
> >>>>> new name, and run that instead of the existing one. AFAIK, the
> >>>>> catalog database schema has not been changed. (But check the
> >>>>> changelog!)
> >>>>>
> >>>>> And, as this is going to the devel list only now, repost your
> >>>>> configuration details - OS version, bacula version, catalog database,
> >>>>> client version affected, and so on.
> >>>>>
> >>>>> Arno
> >>>>
> >>>> --
> >>>> Alfredo Marchini
> >>>> Consulente IT
> >>>> P.IVA: 05649240487
> >>>> CF: MRCLRD81R07D612B
> >>>> Via Imbriani, 66
> >>>> 50019 Sesto Fiorentino (FI)
> >>>> Tel. +39 393 9566375
> >>>> E-Mail: [EMAIL PROTECTED]
> >>>>
> >>>> _______________________________________________
> >>>> Bacula-devel mailing list
> >>>> [email protected]
> >>>> https://lists.sourceforge.net/lists/listinfo/bacula-devel
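
For readers following the thread, the retention change Alfredo describes and the daily test job Arno suggests might look roughly like this in bacula-dir.conf. This is only a sketch under stated assumptions: every resource name (client1-fd, Default, TestDaily, TestDailySchedule, SmallTestSet, File, Standard), the test file path, the backup level and the run time are hypothetical, and other required directives (Address, Password, Catalog, Pool Type and so on) are left out. Only the 14-day and 21-day values and the "small fileset, new schedule, same client and pools" advice come from the messages above.

```
# Sketch only: names are hypothetical, required directives are omitted.

Client {
  Name = client1-fd            # hypothetical client name
  File Retention = 14 days     # unchanged, as in the thread
  Job Retention = 21 days      # raised from 14 days, as Alfredo reports
  AutoPrune = yes              # assumption: automatic pruning is enabled
}

Pool {
  Name = Default               # hypothetical pool name
  Volume Retention = 14 days   # unchanged, as in the thread
  AutoPrune = yes              # assumption
  Recycle = yes                # assumption
}

# Daily test job along the lines Arno suggests: small fileset, new
# schedule, same client and pool as the job that hangs.
Schedule {
  Name = "TestDailySchedule"   # hypothetical
  Run = Full sun-sat at 12:00  # level and time are assumptions
}

FileSet {
  Name = "SmallTestSet"        # hypothetical
  Include {
    Options { signature = MD5 }
    File = /etc/hosts          # any small file will do
  }
}

Job {
  Name = "TestDaily"           # hypothetical
  Type = Backup
  Client = client1-fd
  FileSet = "SmallTestSet"
  Schedule = "TestDailySchedule"
  Pool = Default
  Storage = File               # must match an existing Storage resource
  Messages = Standard          # must match an existing Messages resource
}
```

After each run of the test job, the volume and pool status can be checked from bconsole (for example with "list volumes") to see how pruning is changing the pool, which is what Arno asks to be observed.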
