Hi Kern, Arno,

I resolved my problems with the Bacula 2.0.3 Director! It no longer blocks.
I made this change: before, File, Job, and Volume Retention were all set to 14 days. After the last crash (3 weeks ago) I changed only Job Retention to 21 days, and now, after 3 weeks, it works (I checked five minutes ago)!

So I think there is a concurrency problem when bacula-dir prunes files, jobs, and volumes at the same time. Have you ever tested this configuration? I have only one server, and it is a production server, so I cannot run tests on it.

That's all. Thank you very much for your help.

Regards

Kern Sibbald wrote:
> On Tuesday 17 July 2007 13:00, Alfredo Marchini wrote:
>
>> Hello,
>> on Sunday I make the full backup of 16 servers, 8 at 09.00 and 8 at 13.00.
>> On the other weekdays I make a diff backup of 16 servers, 8 at 21.00 and 8
>> at 23.00.
>> I use MySQL Server vers. 5.0.x.
>>
>
> Hmmm.  Well MySQL did have a rather big memory loss in version 2.0.3 or lower,
> so let's hope that is the problem, and that it will be fixed.
>
>> the DIR never arrives at 13.00, it blocks at the 09.00 backups.
>> I talked with my customer, he prefers not to install 2.1.26 beta, so first
>> I will make another test:
>> I move the job retention to 21 days and maintain file and volume
>> retention at 14 days.
>>
>
> Within the Bacula code changing the retention period would not increase or
> decrease any risk of problems.  However, depending on how much pruning was
> going on, things could change a lot in terms of the memory loss situations
> that were occurring in the SQL code.  In addition, in situations where the SD
> needs a tape and none are available, the new code does *much* less pruning
> than the old code -- it limits the pruning to only what is required to find a
> Volume rather than pruning the full pool.  This should give significant
> performance improvements during the backup -- of course, at the end of the
> job, the same pruning must be done as before ...
>
>> This situation I think was the first and If I remember well didn't give
>> me any block.
>> I know the problem about the old jobs retention that will be maintained,
>> so perhaps on 29-07 it will block again, but on 12-08 I will know if it
>> really works (when it has pruned all old jobs with 14 days job retention).
>> If It will block again (I think so but the server is not mine, is
>> customer's) my customer tell me that I can install beta version.
>>
>
> Nice customer :-)
>
>> So I hope you'll have good holidays (I remain here to work), and I'll
>> resend info on next 2 weeks or next month.
>> Thank you very much again
>> Bye
>>
>> Kern Sibbald wrote:
>>
>>> On Tuesday 17 July 2007 10:47, Alfredo Marchini wrote:
>>>
>>>> Hello, so I will upgrade, at this point, to version 2.1.26 beta.
>>>> And wait for the next two weeks (14 days) that the block occurs.
>>>> The blocks arrive always every two weeks (14 days), on Sunday.
>>>> But is not dependent of the time of live of the bacula-dir, because, as
>>>> I said, bacula-dir has been restarted on last wednesday 11-07, and
>>>> blocked on sunday 15-07, and last block was on sunday 01-07.
>>>> Now I will restart bacula-dir, and all will restart fine, and for 2
>>>> weeks it will be ok.
>>>> I'll upgrade to version 2.1.26 and wait for 2 weeks. I hope that on
>>>> 30-07 I will be able to say to you "all OK"!!!
>>>>
>>> Yes, me too.
>>>
>>> In the mean time, think about anything special that happens on those
>>> Sundays -- the problem is most likely related to that.  Of course, it may
>>> just be that you did a full save on those days, and it triggered the
>>> memory loss in the SQL server, which in turn blocks Bacula.  If that is
>>> the case, then it is very likely the problem is already fixed.
>>>
>>> Regards,
>>>
>>> Kern
>>>
>>>> Thanks
>>>> Bye
>>>>
>>>> Arno Lehmann wrote:
>>>>
>>>>> Hello,
>>>>>
>>>>> 17.07.2007 09:59,, Alfredo Marchini wrote::
>>>>>
>>>>>> Hello,
>>>>>>
>>>>>> Arno Lehmann wrote:
>>>>>>
>>>>>>> Hello,
>>>>>>>
>>>>>>> 16.07.2007 13:46,, Kern Sibbald wrote::
>>>>>>>
>>>>>>>> On Monday 16 July 2007 13:17, Arno Lehmann wrote:
>>>>>>>>
>>>>>>>>> Hello,
>>>>>>>>>
>>>>>>> ...
>>>>>>>
>>>>>>>>>> Yes, either a kernel problem or a hardware problem seem the most
>>>>>>>>>> likely.  We cannot exclude a Bacula bug, but the finger is
>>>>>>>>>> pointing to the CPU/hardware.
>>>>>>>>>>
>>>>>>>>> Well, this is problematic... Alfredo gave good reasons to assume
>>>>>>>>> that it's not purely hardware/OS related. Basically, the problem
>>>>>>>>> occurs when he runs certain jobs.
>>>>>>>>>
>>>>>>>> I didn't see that, but then I am no longer receive any email from the
>>>>>>>> bacula-users list.
>>>>>>>>
>>>>>>> Yes. I know, but it's hard moderating a discussion across two separate
>>>>>>> mailing list :-)
>>>>>>>
>>>>>>>>> I guess that the interworking of DIR, SD, catalog database, and OS
>>>>>>>>> might trigger some sort of resource exhaustion, but debugging this
>>>>>>>>> is beyond my abilities :-)
>>>>>>>>>
>>>>>>>> Or as I mentioned, it could be that Bacula is self destructing ...
>>>>>>>>
>>>>>>>>>> I recommend shutting down your machine, rebooting it, running
>>>>>>>>>> memtest, and if all is OK, restarting Bacula and see what happens.
>>>>>>>>>>
>>>>>>>>> Fortunately, that's not my machine :-)
>>>>>>>>>
>>>>>>>>> Unfortunately, my backup server is dying, but I know and understand
>>>>>>>>> that problem :-(
>>>>>>>>>
>>>>>>>> If you and he *really* think it is a Bacula bug, I'd *strongly*
>>>>>>>> recommend that he upgrade to the latest 2.1.26 beta version.
>>>>>>>>
>>>>>> I think it is a Bacula bug, because in my first mail I thought the problem
>>>>>> was caused when bacula-dir stays alive for 2 weeks, but last week, on
>>>>>> Wednesday, my server rebooted because of the UPS and bacula-dir restarted.
>>>>>> But on Sunday it blocked again. And my job, file and volume
>>>>>> retention is set to 14 days (coincidentally 2 weeks).
>>>>>>
>>>>> If I understand you correctly, this indicates that the problem does
>>>>> not only occur in the two week interval, but probably every sunday.
>>>>>
>>>>>> From my (small) experience, I think that if there is a bug in version
>>>>>> 2.0.3, it's possible that the bacula-dir 2.1.23 source code keeps the
>>>>>> bug, so it can be more useful to search for the bug now and, if
>>>>>> present, correct it before the developers publish the final version 2.1.x.
>>>>>> But this is only what I think, I don't know anything about the source
>>>>>> code of Bacula and what has changed in version 2.1.x from 2.0.
>>>>>> Because if it is my hw or cpu problem, why does all work fine for 2 weeks?
>>>>>>
>>>>> It does look as though a certain job causes the problem. Well, not
>>>>> exactly a job, but a distinct volume configuration (caused by pruning
>>>>> et al.) together with a job.
>>>>>
>>>>>> Another thing: I'm not sure, but before I hit this problem I had set job
>>>>>> and file retention differently (file < job retention), and I had no
>>>>>> problems. Then my customer asked me to set job and file retention to the
>>>>>> same value... I set it, and the problems started.
>>>>>>
>>>>> I can confirm that this is not a general problem.
>>>>>
>>>>>> If I reboot the server (or more simply restart bacula-dir) we'll lose for
>>>>>> 2 weeks the chance to make tests on the system. So it would be better if
>>>>>> someone told me which tests I should do...
>>>>>>
>>>>> Try setting up a test job like the one that causes the problem, but
>>>>> that's running on a daily schedule. Use a small fileset, a new
>>>>> schedule, but the same client and pools as when Bacula hangs.
>>>>>
>>>>> Run that job daily, and observe the volume / pool status.
>>>>>
>>>>>> I'm interested in helping the
>>>>>> developers to resolve this problem, because I like this project very
>>>>>> much and I want to continue using it for my customers.
>>>>>> But I don't much like installing a beta version of bacula-dir on a
>>>>>> 16-server backup system, because new problems can arise... but if it's
>>>>>> the only solution...
>>>>>>
>>>>> Well, upgrading the DIR only should be enough. You can even keep the
>>>>> existing installation intact, just create the new version DIR under a
>>>>> new name, and run that instead of the existing one. AFAIK, the catalog
>>>>> database schema has not been changed. (But check the changelog!)
>>>>>
>>>>> And, as this is going to the devel list only now, repost your
>>>>> configuration details - OS version, bacula version, catalog database,
>>>>> client version affected, and so on.
>>>>>
>>>>> Arno

--
Alfredo Marchini
Consulente IT
P.IVA: 05649240487
CF: MRCLRD81R07D612B
Via di Ripoli, 22
50126 Firenze (FI)
Tel. +39 393 9566375
E-Mail: [EMAIL PROTECTED]

_______________________________________________
Bacula-devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/bacula-devel
