Martin Simmons wrote:
>>>>>> On Tue, 29 Dec 2009 11:05:09 +0100, Jesper Krogh said:
>> Kern Sibbald wrote:
>>> The Kaboom chapter of the manual tells you how to run the Director under
>>> the debugger.  You can also attach to the Director while it is running,
>>> using:
>>>
>>>   cd <bacula-binary-directory>
>>>   gdb bacula-dir <pid-of-director>
>> Almost a month later, the problem is still present: it takes hours to
>> get from "done" to the point where the actual restore starts. I've
>> managed to get a backtrace:
>>
>> Thread 2 (Thread 0x42767950 (LWP 10832)):
>> #0  0x000000000040a4a8 in add_findex (bsr=0x6dd468, JobId=32927, findex=132808) at bsr.c:554
>> #1  0x0000000000432965 in restore_cmd (ua=0x6dcb08, cmd=<value optimized out>) at ua_restore.c:1094
>> #2  0x0000000000425e56 in do_a_command (ua=0x6dcb08, cmd=0x6d5b10 "1") at ua_cmds.c:180
>> #3  0x0000000000438781 in handle_UA_client_request (arg=<value optimized out>) at ua_server.c:147
>> #4  0x000000000046cb8b in workq_server (arg=<value optimized out>) at workq.c:357
>> #5  0x00007f233e9553f7 in start_thread () from /lib/libpthread.so.0
>> #6  0x00007f233db1db4d in clone () from /lib/libc.so.6
>> #7  0x0000000000000000 in ?? ()
>>
>> Repeating this with "continue/interrupt" gives the same trace but with
>> different findex= values.
>>
>> The restore block looks like this:
>>
>> +--------+-------+-----------+-----------------+---------------------+------------+
>> | JobId  | Level | JobFiles  | JobBytes        | StartTime           | VolumeName |
>> +--------+-------+-----------+-----------------+---------------------+------------+
>> | 32,927 | F     | 3,183,314 | 684,965,311,013 | 2009-12-12 17:31:41 | 000779L3   |
>> | 32,927 | F     | 3,183,314 | 684,965,311,013 | 2009-12-12 17:31:41 | 000789L3   |
>> | 32,927 | F     | 3,183,314 | 684,965,311,013 | 2009-12-12 17:31:41 | 000804L3   |
>> | 32,927 | F     | 3,183,314 | 684,965,311,013 | 2009-12-12 17:31:41 | 001805L3   |
>> | 32,927 | F     | 3,183,314 | 684,965,311,013 | 2009-12-12 17:31:41 | 001806L3   |
>> | 32,927 | F     | 3,183,314 | 684,965,311,013 | 2009-12-12 17:31:41 | 001807L3   |
>> | 33,446 | D     |   136,256 |  50,695,957,124 | 2009-12-28 08:01:50 | 004048L4   |
>> | 33,473 | I     |     1,224 |  16,023,974,683 | 2009-12-28 14:41:19 | 004059L4   |
>> | 33,501 | I     |    11,188 |  24,448,676,227 | 2009-12-29 01:40:23 | 004059L4   |
>> +--------+-------+-----------+-----------------+---------------------+------------+
>>
>> I'm on 2.4.3 and the bsr.c:554 is
>>
>>    /* Walk down fi chain and find where to insert insert new FileIndex */
>>    for ( ; fi; fi=fi->next) {
>>       if (findex == (fi->findex2 + 1)) {  /* extend up */
>>          RBSR_FINDEX *nfi;
>>          fi->findex2 = findex;
>>
>> If I get some more time I'll try to add debug information to find out
>> where it's actually looping. Suggestions are certainly welcome.
> 
> It might be a variant of this problem:
> 
> http://article.gmane.org/gmane.comp.bacula.user/54164/match=add%5ffindex

It looks very much like the same problem. But I diffed bsr.c from the
most recent version against the 2.4.3 one and there are no changes.
Since an upgrade is non-reversible, I would prefer not to be "forced"
into it, but rather do it at a time when I have enough time for testing.
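
As a rough illustration of why a loop of this shape could take hours, here
is a minimal sketch. It is not the bsr.c code: Range, RangeList, add_index
and walk_cost are hypothetical names, and it assumes the selected FileIndex
values are largely non-contiguous, which has not been confirmed for this
restore. Each insert walks the chain from the head, so contiguous indexes
cost about one step each, while scattered indexes cost a walk over every
range created so far.

```cpp
#include <cstdint>

// Hypothetical stand-in for a (findex, findex2) range node.
struct Range {
    int32_t first;
    int32_t last;
    Range *next;
};

struct RangeList {
    Range *head = nullptr;
    uint64_t steps = 0;   // total nodes visited across all inserts
};

// Add one index by walking the chain from the head every time.
void add_index(RangeList &list, int32_t idx) {
    Range *prev = nullptr;
    for (Range *r = list.head; r; prev = r, r = r->next) {
        list.steps++;
        if (idx >= r->first && idx <= r->last) {
            return;                       // already covered
        }
        if (idx == r->last + 1) {         // extend up
            r->last = idx;
            return;
        }
    }
    // No range matched: append a new single-index range at the tail.
    Range *n = new Range{idx, idx, nullptr};
    if (prev) {
        prev->next = n;
    } else {
        list.head = n;
    }
}

void free_list(RangeList &list) {
    while (list.head) {
        Range *next = list.head->next;
        delete list.head;
        list.head = next;
    }
}

// Insert `count` indexes spaced `stride` apart, starting at 1, and
// return how many nodes were visited in total.
uint64_t walk_cost(int32_t count, int32_t stride) {
    RangeList list;
    for (int32_t i = 0; i < count; i++) {
        add_index(list, 1 + i * stride);
    }
    uint64_t steps = list.steps;
    free_list(list);
    return steps;
}
```

With 1000 contiguous indexes, walk_cost(1000, 1) visits 999 nodes; with
every second index, walk_cost(1000, 2) visits 499500. If the real selection
behaves like the second case, a job with millions of files would need on
the order of 10^12 node visits, but that is an assumption about the data,
not something the backtrace proves.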

Can you point to the changes that are supposed to deal with the problem?

Thanks.
-- 
Jesper

_______________________________________________
Bacula-devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/bacula-devel
