Hi,

05.08.2007 20:25, Alfredo Marchini wrote:
> Hi Kern, Arno,
> 
> I resolved my problems with the Bacula 2.0.3 director!
> Now it no longer blocks...

Good to read that.

> I made these changes:
> 
> Before, File, Job, and Volume retention were all set to 14 days.
> 
> After the last crash (3 weeks ago) I set only Job Retention to 21 
> days, and now, after 3 weeks, it works (I checked this five minutes ago)!
> 
> So, after this, I think that there is a concurrency problem when 
> bacula-dir does file, job and volume pruning at the same time...

Well, it looks like that could be the problem then, but ...

> Have you ever made tests with this configuration?

... I do run parts of my normal production setup like this and never 
noticed the sort of problems you ran into. But in my setup, pruning 
is spread out much more over time because there are volumes and jobs 
with different retention times.
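
To make the change under discussion concrete, here is a rough sketch of 
where the three retention periods live in bacula-dir.conf. The resource 
names, address, password and catalog name are placeholders, not Alfredo's 
real configuration; only the retention values come from his description.

```
# Sketch only: names, address and password are hypothetical placeholders.
Client {
  Name = server01-fd
  Address = server01.example.com
  Catalog = MyCatalog
  Password = "changeme"
  # Unchanged.
  File Retention = 14 days
  # Was 14 days before the change.
  Job Retention = 21 days
  AutoPrune = yes
}

Pool {
  Name = Default
  Pool Type = Backup
  Recycle = yes
  AutoPrune = yes
  # Unchanged.
  Volume Retention = 14 days
}
```

With values like these, file and volume records still expire after two 
weeks while job records expire a week later, so presumably the three 
kinds of pruning no longer all come due on the same Sunday.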

It might be possible to set up a regression test to check for this, 
but to get good results, that test should collect a larger number of 
jobs and file entries, so it would need to run for very long. I don't 
think this is a sort of test Kern wants in the routine tests.

Perhaps some sort of an extra test, running a setup for a week or so, 
and working with one day retention times for a number of jobs...
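
A minimal sketch of what such an extra test setup might look like, 
assuming a dedicated test client and pool. All names here are 
hypothetical, and the Job resources that would use this schedule, client 
and pool are left out.

```
# Sketch only: all names below are hypothetical.
Schedule {
  Name = "PruneTestDaily"
  Run = Level=Full daily at 09:00
}

Client {
  Name = prunetest-fd
  Address = testhost.example.com
  Catalog = MyCatalog
  Password = "changeme"
  File Retention = 1 day
  Job Retention = 1 day
  AutoPrune = yes
}

Pool {
  Name = PruneTest
  Pool Type = Backup
  Recycle = yes
  AutoPrune = yes
  Volume Retention = 1 day
}
```

The idea is that with all three retention periods at one day, file, job 
and volume pruning would come due together on every run instead of once 
every two weeks.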

> I have only one server, and it is a production server, so I cannot 
> start running tests on it.

Quite understandable :-)

> 
> That's all.

Fine.

Arno

> Thank you very much for your help
> Regards
> 
> 
> Kern Sibbald wrote:
>> On Tuesday 17 July 2007 13:00, Alfredo Marchini wrote:
>>  
>>> Hello,
>>> On Sunday I run the full backup of 16 servers, 8 at 09.00 and 8 at 
>>> 13.00.
>>> On the other weekdays I run differential backups of the 16 servers, 
>>> 8 at 21.00 and 8 at 23.00.
>>> I use MySQL Server version 5.0.x.
>>>     
>>
>> Hmmm. Well MySQL did have a rather big memory loss in version 2.0.3 or 
>> lower, so let's hope that is the problem, and that it will be fixed.
>>
>>  
>>> The DIR never gets to 13.00; it blocks during the 09.00 backups.
>>> I talked with my customer, and he prefers not to install the 2.1.26 
>>> beta, so first I will make another test:
>>> I will change the job retention to 21 days and keep file and volume 
>>> retention at 14 days.
>>>     
>>
>> Within the Bacula code changing the retention period would not 
>> increase or decrease any risk of problems.  However, depending on how 
>> much pruning was going on, things could change a lot in terms of the 
>> memory loss situations that were occurring in the SQL code.  In 
>> addition, in situations where the SD needs a tape and none are 
>> available, the new code does *much* less pruning than the old code -- 
>> it limits the pruning to only what is required to find a Volume rather 
>> than pruning the full pool.  This should give significant performance 
>> improvements during the backup -- of course, at the end of the job, 
>> the same pruning must be done as before ...
>>
>>  
>>> I think this was my original configuration and, if I remember 
>>> correctly, it never caused a block.
>>> I know that the old jobs keep their old retention, so perhaps on 
>>> 29-07 it will block again, but on 12-08 I will know if it really 
>>> works (when all the old jobs with 14-day job retention have been 
>>> pruned).
>>> If it blocks again (I think so, but the server is not mine, it is the 
>>> customer's), my customer told me that I can install the beta version.
>>>     
>>
>> Nice customer :-)
>>
>>  
>>> So I hope you'll have good holidays (I remain here to work), and I'll 
>>> send more info in the next 2 weeks or next month.
>>> Thank you very much again
>>> Bye
>>>
>>> Kern Sibbald wrote:
>>>    
>>>> On Tuesday 17 July 2007 10:47, Alfredo Marchini wrote:
>>>>        
>>>>> Hello, so at this point I will upgrade to version 2.1.26 beta,
>>>>> and wait the next two weeks (14 days) to see whether the block occurs.
>>>>> The blocks always come every two weeks (14 days), on Sunday.
>>>>> But this does not depend on how long bacula-dir has been running, 
>>>>> because, as I said, bacula-dir was restarted last Wednesday, 
>>>>> 11-07, and blocked on Sunday 15-07, and the previous block was on 
>>>>> Sunday 01-07.
>>>>> Now I will restart bacula-dir, and everything will start fine, and 
>>>>> for 2 weeks it will be OK.
>>>>> I'll upgrade to version 2.1.26 and wait for 2 weeks. I hope that on 
>>>>> 30-07 I will be able to say to you "all OK"!
>>>>>             
>>>> Yes, me too.
>>>>
>>>> In the mean time, think about anything special that happens on those 
>>>> Sundays -- the problem is most likely related to that.  Of course, 
>>>> it may just be that you did a full save on those days, and it 
>>>> triggered the memory loss in the SQL server, which in turn blocks 
>>>> Bacula.  If that is the case, then it is very likely the problem is 
>>>> already fixed.
>>>>
>>>> Regards,
>>>>
>>>> Kern
>>>>         
>>>
>>>    
>>>>        
>>>>> Thanks
>>>>> Bye
>>>>>
>>>>>
>>>>> Arno Lehmann wrote:
>>>>>
>>>>>            
>>>>>> Hello,
>>>>>>
>>>>>> 17.07.2007 09:59, Alfredo Marchini wrote:
>>>>>>                  
>>>>>>> Hello,
>>>>>>>
>>>>>>> Arno Lehmann wrote:
>>>>>>>                        
>>>>>>>> Hello,
>>>>>>>>
>>>>>>>> 16.07.2007 13:46, Kern Sibbald wrote:
>>>>>>>>  
>>>>>>>>                              
>>>>>>>>> On Monday 16 July 2007 13:17, Arno Lehmann wrote:
>>>>>>>>>                                       
>>>>>>>>>> Hello,
>>>>>>>>>>                                                 
>>>>>>>> ...
>>>>>>>>  
>>>>>>>>                              
>>>>>>>>>>> Yes, either a kernel problem or a hardware problem seem the 
>>>>>>>>>>> most likely. We cannot exclude a Bacula bug, but the finger is 
>>>>>>>>>>> pointing to the CPU/hardware.
>>>>>>>>>                                       
>>>>>>>>>> Well, this is problematic... Alfredo gave good reasons to 
>>>>>>>>>> assume that it's not purely hardware/OS related. Basically, 
>>>>>>>>>> the problem occurs when he runs certain jobs.
>>>>>>>>>>                                                 
>>>>>>>>> I didn't see that, but then I no longer receive any email 
>>>>>>>>> from the bacula-users list.
>>>>>>>>>                                         
>>>>>>>> Yes. I know, but it's hard moderating a discussion across two 
>>>>>>>> separate mailing lists :-)
>>>>>>>>
>>>>>>>>  
>>>>>>>>                              
>>>>>>>>>> I guess that the interworking of DIR, SD, catalog database, 
>>>>>>>>>> and OS might trigger some sort of resource exhaustion, but 
>>>>>>>>>> debugging this is beyond my abilities :-)
>>>>>>>>>>                                                 
>>>>>>>>> Or as I mentioned, it could be that Bacula is self destructing ...
>>>>>>>>>
>>>>>>>>>                                       
>>>>>>>>>>> I recommend shutting down your machine, rebooting it, running 
>>>>>>>>>>> memtest, and if all is OK, restarting Bacula and seeing what 
>>>>>>>>>>> happens.
>>>>>>>>>>>                                                         
>>>>>>>>>> Fortunately, that's not my machine :-)
>>>>>>>>>>
>>>>>>>>>> Unfortunately, my backup server is dying, but I know and 
>>>>>>>>>> understand that problem :-(
>>>>>>>>>>                                                 
>>>>>>>>> If you and he *really* think it is a Bacula bug, I'd *strongly* 
>>>>>>>>> recommend that he upgrade to the latest 2.1.26 beta version.
>>>>>>>>>                                         
>>>>>>>>                                 
>>>>>>> I think it is a Bacula bug, because in my first mail I thought that 
>>>>>>> the problem was caused when bacula-dir stayed alive for 2 weeks, but 
>>>>>>> last week, on Wednesday, my server rebooted because of the UPS and 
>>>>>>> bacula-dir restarted. But on Sunday it was blocked again. And my 
>>>>>>> job, file and volume retention is set to 14 days (coincidentally 
>>>>>>> 2 weeks).
>>>>>>>                         
>>>>>> If I understand you correctly, this indicates that the problem 
>>>>>> does not only occur in the two week interval, but probably every 
>>>>>> sunday.
>>>>>>
>>>>>>                  
>>>>>>> From my (small) experience, I think that if there is a bug in 
>>>>>>> version 2.0.3, it's possible that the bacula-dir 2.1.23 source code 
>>>>>>> still has the bug, so it may be more useful to search for the bug 
>>>>>>> now and, if present, correct it before the developers publish the 
>>>>>>> final version 2.1.x.
>>>>>>> But this is only what I think; I don't know anything about the 
>>>>>>> source code of Bacula and what changed in version 2.1.x from 2.0.
>>>>>>> Because if it is a problem with my hardware or CPU, why does 
>>>>>>> everything work fine for 2 weeks?
>>>>>>>                         
>>>>>> It does look as though a certain job causes the problem. Well, not 
>>>>>> exactly a job, but a distinct volume configuration (caused by 
>>>>>> pruning et al.) together with a job.
>>>>>>
>>>>>>                  
>>>>>>> Another thing: I'm not sure, but before I ran into this problem I 
>>>>>>> had set the job and file retentions to different values (file < job 
>>>>>>> retention), and I had no problems. Then my customer asked me to set 
>>>>>>> job and file retention to the same value... I set it, and the 
>>>>>>> problems started.
>>>>>>>                         
>>>>>> I can confirm that this is not a general problem.
>>>>>>
>>>>>>                  
>>>>>>> If I reboot the server (or more simply restart bacula-dir) we'll 
>>>>>>> lose for 2 weeks the chance to run tests on the system. So it 
>>>>>>> would be better if someone told me which tests I should do...
>>>>>>>                         
>>>>>> Try setting up a test job like the one that causes the problem, 
>>>>>> but that's running on a daily schedule. Use a small fileset, a new 
>>>>>> schedule, but the same client and pools as when Bacula hangs.
>>>>>>
>>>>>> Run that job daily, and observe the volume / pool status.
>>>>>>
>>>>>>
>>>>>>                  
>>>>>>> I'm interested in helping the developers resolve this problem, 
>>>>>>> because I like this project very much and I want to continue 
>>>>>>> using it for my customers.
>>>>>>> But I don't much like installing a beta version of bacula-dir on a 
>>>>>>> 16-server backup system, because new problems can arise... but if 
>>>>>>> it's the only solution...
>>>>>>>                         
>>>>>> Well, upgrading the DIR only should be enough. You can even keep 
>>>>>> the existing installation intact, just create the new version DIR 
>>>>>> under a new name, and run that instead of the existing one. AFAIK, 
>>>>>> the catalog database schema has not been changed. (But check the 
>>>>>> changelog!)
>>>>>>
>>>>>> And, as this is going to the devel list only now, repost your 
>>>>>> configuration details - OS version, bacula version, catalog 
>>>>>> database, client version affected, and so on.
>>>>>>
>>>>>> Arno
>>>>>>
>>>>>>                   
>>>>> -- 
>>>>> Alfredo Marchini
>>>>> Consulente IT
>>>>> P.IVA: 05649240487
>>>>> CF: MRCLRD81R07D612B
>>>>> Via Imbriani, 66
>>>>> 50019 Sesto Fiorentino (FI)
>>>>> Tel. +39 393 9566375
>>>>> E-Mail: [EMAIL PROTECTED]
>>>>>
>>>>>
>>>>>
>>>>> ------------------------------------------------------------------------- 
>>>>>
>>>>> This SF.net email is sponsored by DB2 Express
>>>>> Download DB2 Express C - the FREE version of DB2 express and take
>>>>> control of your XML. No limits. Just data. Click to get it now.
>>>>> http://sourceforge.net/powerbar/db2/
>>>>> _______________________________________________
>>>>> Bacula-devel mailing list
>>>>> [email protected]
>>>>> https://lists.sourceforge.net/lists/listinfo/bacula-devel
>>>>>
>>>>>             
>>>>         
>>>
>>>     
>>
>>
>>   
> 
> 

-- 
Arno Lehmann
IT-Service Lehmann
www.its-lehmann.de

-------------------------------------------------------------------------
This SF.net email is sponsored by: Splunk Inc.
Still grepping through log files to find problems?  Stop.
Now Search log events and configuration files using AJAX and a browser.
Download your FREE copy of Splunk now >>  http://get.splunk.com/
_______________________________________________
Bacula-devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/bacula-devel
