Hi,

05.07.2007 12:07, Alfredo Marchini wrote:
> Hi Arno,
> I don't know if you know bacula source code,

A bit, but I usually look for problems in the configuration first 
because, in my experience, the source code is quite stable. Of course 
there are bugs, but these should be reproducible in other 
installations, too. Unless I find a setup that looks unique to me, I 
assume the source is ok and the problem lies in the configuration or 
the general system.

> so I am posting some 
> parameters and information from my configuration that I think may cause 
> this problem, or that may be misconfigured because I don't fully 
> understand the manual:
> 
> Arno Lehmann wrote:
>> Hi,
>>
>> 04.07.2007 17:40, Alfredo Marchini wrote:
>>   
>>> Hi,
>>> The system and db logs don't tell me anything about this problem; it is 
>>> as if all the director processes or threads were locked at the same time.
>>> If I restart only bacula-dir, without restarting bacula-sd and the 16 
>>> bacula-fd, the system starts working fine again.
>>>     
>> It might be possible that the DIR is busy working on the catalog (like 
>> pruning data) and just needs more time. You can check this using 
>> 'mysqladmin processlist', for example.
>>   
> 
>     OK, when it happens again I'll run this test as well. But if Bacula 
> prunes jobs and files only when all volumes are used and no appendable 
> volumes are left, I shouldn't have this problem, because I've used only 
> 10 volumes of 50 GB and have another 8 volumes available that are not 
> yet created.

Are you saying that there are always volumes available and thus no 
pruning happens?

That would indeed rule out the catalog as a bottleneck.
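
If it happens again, something along these lines could help to see 
whether the DIR is waiting on a long catalog query. This is only a 
rough sketch: it assumes that 'mysqladmin processlist' prints its usual 
ASCII table and that the credentials come from your ~/.my.cnf. The 
function names and the 60 second threshold are made up for illustration.

```python
import subprocess


def fetch_processlist():
    """Run 'mysqladmin processlist' and return its raw text output.

    Assumption: mysqladmin is on the PATH and can log in without
    extra options (for example through ~/.my.cnf).
    """
    result = subprocess.run(
        ["mysqladmin", "processlist"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


def parse_processlist(text):
    """Turn the ASCII table into a list of dicts keyed by column name.

    Note: a '|' inside the Info column would confuse this simple split.
    """
    header = None
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # skip the +----+ border lines and blank lines
        cells = [cell.strip() for cell in line[1:-1].split("|")]
        if header is None:
            header = cells  # first table row holds the column names
        else:
            rows.append(dict(zip(header, cells)))
    return rows


def long_running(rows, min_seconds=60):
    """Return non-sleeping connections that have been busy for a while."""
    return [
        row for row in rows
        if row.get("Command") != "Sleep"
        and row.get("Time", "").isdigit()
        and int(row["Time"]) >= min_seconds
    ]
```

Calling long_running(parse_processlist(fetch_processlist())) while the 
DIR hangs would list candidate queries. An empty list would point away 
from the catalog.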

>>   
>>> I have already restarted bacula-dir and everything works fine (I back up 
>>> 16 servers; I cannot take it offline or someone will kill me this 
>>> evening), so I won't be able to reproduce the error for about 10-15 days.
>>> The last time I had this problem I used top and didn't find 
>>> anything strange.
>>>     
>> Ok, so let's assume the hardware, OS and relevant applications are 
>> running ok.
>>
>>   
>     Yes, I think is the right way.
>>> But the test with the time command will be the first thing I do when it 
>>> happens again.
>>> I don't think the problem is with the database: when I connect to 
>>> the bacula db with the mysql command line, it works fine and quickly.
>>>     
>> Bacula uses its own, internal locking, so you won't necessarily notice 
>> anything from outside of Bacula.
>>
>>   
> Ok
...
>>> Another thing is the maximum concurrent jobs :
>>> On director = 30
>>> On storage side director configuration file = 60
>>> On storage = 60
>>>     
>> Quite a lot, I think. Running up to 30 jobs in parallel might load 
>> your backup server beyond its reasonable working maximum, but that 
>> depends on your hardware, software, and requirements.
>>
>>   
> 
> I've set these values because:
> director = 30 because I have 16 FDs that can connect concurrently (in 
> practice they don't), plus
> one job per FD to ask for the status (16 x 2 = 32, rounded to 30).

I don't understand why you reserve job slots for the FDs... the FDs 
don't connect to the DIR to ask for a status, as far as I know. Or are 
you referring to some sort of tray monitor?

> storage = 60 because when 16 FDs connect concurrently to the storage 
> I also get 16 connections from the director to the storage (when jobs 
> start).

The limit for the SD refers to running jobs, not to connections, as far 
as I know.

For example, I run four jobs concurrently, and even if these jobs are 
all running, the DIR can connect for status display and the monitoring 
application can ask for the SD status, too.

> I thought that the not-responding problem was caused by these parameters, 
> so I set high values because I don't know how (at the development level) 
> Bacula works with TCP connections (I thought the problem was caused by 
> too few concurrent threads).

I don't think so... the limits you set do not control how many threads 
can be created, or how many network connections can exist 
simultaneously. At least, that is my understanding.

> 
> Another thing:
> I've set the messages resource on all FDs to point to the director's 
> messages.
> Example:
> on the director named bacula-dir I've created a messages resource named 
> bacula-dir-messages
> on all FDs I've set a messages resource named bacula-dir-message that 
> points to director bacula-dir

I don't think this is relevant here, unless you have reason to believe 
that messages are not sent to the DIR.

> One last thing, and then I have nothing more:
> 
> If I go to the working directory of bacula-dir when it is not responding, 
> I find the mail files that should have been sent to the operators, 
> 2-3 days old, as if bacula-dir were blocked and could not send the e-mail 
> (when it is working fine, the mail is correctly sent to all the operators).

Obviously, when the DIR is blocked, it will not finish jobs and thus 
not send mail.

Does your statement above imply that your DIR stays stuck for several 
days when this happens? That would probably rule out catalog 
performance issues, as even an underpowered database server should 
finish the queries after a few days...
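
To answer that more precisely next time, you could check how old the 
oldest files in the working directory are. A minimal sketch, assuming 
the pending mail files sit directly in the DIR's working directory; the 
helper name and the one day threshold are hypothetical, and the path in 
the note below is only an example.

```python
import os
import time


def stale_files(directory, max_age_days=1.0, now=None):
    """List (name, age_in_days) for regular files older than max_age_days.

    The result is sorted with the oldest file first. 'now' can be passed
    in explicitly so that the check is repeatable.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400  # 86400 seconds per day
    stale = []
    for entry in os.scandir(directory):
        if not entry.is_file():
            continue  # ignore subdirectories and other non-regular entries
        mtime = entry.stat().st_mtime
        if mtime < cutoff:
            stale.append((entry.name, (now - mtime) / 86400))
    return sorted(stale, key=lambda item: item[1], reverse=True)
```

For example, stale_files("/var/bacula/working") (substitute your own 
working directory) would show how long the oldest leftover file has 
been waiting, which gives a lower bound on how long the DIR has been 
stuck.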

> I use a postfix smtp server configured for local and bsmtp to send email 
> to a smtp server
> installed in my LAN on another linux server.

That's not important here, either.

Arno

-- 
Arno Lehmann
IT-Service Lehmann
www.its-lehmann.de

_______________________________________________
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users
