With Randall's help we hooked the new node scanner up to the lost+found DB generator. It seems to work well enough for small DBs; for large DBs with lots of missing nodes, the O(N^2) complexity of the problem catches up with the code and generating the lost+found DB takes quite some time. Mikeal is running tests tonight. The algorithm appears to be CPU-bound, so a little parallelization may be warranted.
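If we do parallelize, the cheapest win is probably to fan the scan out over a few processes by byte range. A minimal sketch of that shape; scan_range/3 stands in for a hypothetical range-limited variant of the scanner, no such function exists on the branch yet:

    -module(repair_pscan).
    -export([pscan/3]).

    %% Sketch only: split the file into NumWorkers contiguous byte ranges
    %% and scan them concurrently, then combine the candidate node lists.
    pscan(Fd, FileSize, NumWorkers) ->
        ChunkSize = FileSize div NumWorkers + 1,
        Parent = self(),
        Pids = [spawn_link(fun() ->
                    Start = I * ChunkSize,
                    Len = erlang:max(0, erlang:min(ChunkSize, FileSize - Start)),
                    Parent ! {self(), scan_range(Fd, Start, Len)}
                end) || I <- lists:seq(0, NumWorkers - 1)],
        %% Collect in spawn order so the combined result is deterministic.
        lists:append([receive {Pid, Nodes} -> Nodes end || Pid <- Pids]).

    %% Placeholder so the module compiles; a real version would scan the
    %% given range for candidate btree nodes.
    scan_range(_Fd, _Start, _Len) ->
        [].

One caveat: the workers would each want their own file descriptor, since funneling them all through a single couch_file gen_server would just move the serialization point.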
http://github.com/kocolosk/couchdb/tree/db_repair

Adam

(I sent this previous update to myself instead of the list, so I'll forward it here ...)

On Aug 10, 2010, at 12:01 AM, Adam Kocoloski wrote:

> On Aug 9, 2010, at 10:10 PM, Adam Kocoloski wrote:
>
>> Right, make_lost_and_found still relies on code which reads through
>> couch_file one byte at a time; that's the cause of the slowness. The
>> newer scanner will improve that pretty dramatically, and we can tune it
>> further by increasing the length of the pattern that we match when
>> looking for kp/kv_node terms in the files, at the expense of some extra
>> complexity dealing with the block prefixes (currently it does a 1-byte
>> match, which as I understand it cannot be split across blocks).
>
> The scanner now looks for a 7-byte match, unless it is within 6 bytes of
> a block boundary, in which case it looks for the longest possible match
> at that position. The more specific match condition greatly reduces the
> number of calls to couch_file, and thus boosts the throughput. On my
> laptop it can scan the testwritesdb.couch from Mikeal's couchtest repo
> (52 MB) in 18 seconds.
>
>> Regarding the file_corruption error on the larger file, I think this is
>> something we will just naturally trigger when we take a guess that
>> random positions in a file are actually the beginning of a term. I
>> think our best recourse here is to return {error, file_corruption} from
>> couch_file but leave the gen_server up and running instead of
>> terminating it. That way the repair code can ignore the error and keep
>> moving without having to reopen the file.
>
> I committed this change (to my db_repair branch) after consulting with
> Chris. The longer match condition makes these spurious file_corruption
> triggers much less likely, but I think it's still a good thing not to
> crash the server when they happen.
>
>> Next steps as I understand them: Randall is working on integrating the
>> in-memory scanner into Volker's code that finds all the dangling by_id
>> nodes. I'm working on making sure that the scanner identifies btree
>> node candidates which span block prefixes, and on improving its
>> pattern-matching.
>
> Latest from my end:
> http://github.com/kocolosk/couchdb/tree/db_repair
>
>> Adam
>>
>> On Aug 9, 2010, at 9:50 PM, Mikeal Rogers wrote:
>>
>>> I pulled down the latest code from Adam's branch @
>>> 7080ff72baa329cf6c4be2a79e71a41f744ed93b.
>>>
>>> Running timer:tc(couch_db_repair, make_lost_and_found,
>>> ["multi_conflict"]). on a database with 200 lost updates spanning 200
>>> restarts (http://github.com/mikeal/couchtest/blob/master/multi_conflict.couch)
>>> took about 101 seconds.
>>>
>>> I tried running against a larger database
>>> (http://github.com/mikeal/couchtest/blob/master/testwritesdb.couch)
>>> and I got this exception:
>>>
>>> http://gist.github.com/516491
>>>
>>> -Mikeal
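That gist is the file_corruption case discussed above: a guessed offset that doesn't decode to a real term. With the committed change, couch_file hands back {error, file_corruption} instead of dying, so the repair loop can treat a bad candidate as a simple miss. A minimal sketch of that caller side; couch_file:pread_term/2 is the real API, while the module, helper name, and the 2-tuple kp_node/kv_node filter shape are illustrative, not the code on the branch:

    -module(repair_filter).
    -export([live_nodes/2]).

    %% Probe each candidate offset and keep only the ones that decode to
    %% btree nodes, treating {error, file_corruption} as a miss rather
    %% than a crash.
    live_nodes(Fd, CandidateOffsets) ->
        lists:foldl(fun(Pos, Acc) ->
            case couch_file:pread_term(Fd, Pos) of
                {ok, {kp_node, _} = Node} -> [{Pos, Node} | Acc];
                {ok, {kv_node, _} = Node} -> [{Pos, Node} | Acc];
                {ok, _SomeOtherTerm}      -> Acc;
                {error, file_corruption}  -> Acc
            end
        end, [], CandidateOffsets).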
>>> On Mon, Aug 9, 2010 at 6:09 PM, Randall Leeds <[email protected]> wrote:
>>>
>>>> Summing up what went on in IRC for those who were absent.
>>>>
>>>> The latest progress is on Adam's branch at
>>>> http://github.com/kocolosk/couchdb/tree/db_repair
>>>>
>>>> couch_db_repair:make_lost_and_found/1 attempts to create a new
>>>> lost+found/DbName database to which it merges all nodes not accessible
>>>> from anywhere (any other node found in a full file scan or any header
>>>> pointers).
>>>>
>>>> Currently, make_lost_and_found uses Volker's repair (from the
>>>> couch_db_repair_b module, also in Adam's branch).
>>>>
>>>> Adam found that the bottleneck was couch_file calls and that the
>>>> repair process was taking a very long time, so he added
>>>> couch_db_repair:find_nodes_quickly/1, which reads 1MB chunks as binary
>>>> and scans them in memory to find nodes instead of reading back one
>>>> byte at a time. It is currently not hooked up to the repair mechanism.
>>>>
>>>> Making progress. Go team.
>>>>
>>>> On Mon, Aug 9, 2010 at 13:52, Mikeal Rogers <[email protected]> wrote:
>>>>
>>>>> jchris suggested on IRC that I try a normal doc update and see if
>>>>> that fixes it.
>>>>>
>>>>> It does. After a new doc was created the dbinfo doc count was back to
>>>>> normal.
>>>>>
>>>>> -Mikeal
>>>>>
>>>>> On Mon, Aug 9, 2010 at 1:39 PM, Mikeal Rogers <[email protected]> wrote:
>>>>>
>>>>>> Ok, I pulled down this code and tested against a database with a ton
>>>>>> of missing writes right before a single restart.
>>>>>>
>>>>>> Before restart this was the database:
>>>>>>
>>>>>> {
>>>>>>   db_name: "testwritesdb"
>>>>>>   doc_count: 124969
>>>>>>   doc_del_count: 0
>>>>>>   update_seq: 124969
>>>>>>   purge_seq: 0
>>>>>>   compact_running: false
>>>>>>   disk_size: 54857478
>>>>>>   instance_start_time: "1281384140058211"
>>>>>>   disk_format_version: 5
>>>>>> }
>>>>>>
>>>>>> After restart it was this:
>>>>>>
>>>>>> {
>>>>>>   db_name: "testwritesdb"
>>>>>>   doc_count: 1
>>>>>>   doc_del_count: 0
>>>>>>   update_seq: 1
>>>>>>   purge_seq: 0
>>>>>>   compact_running: false
>>>>>>   disk_size: 54857478
>>>>>>   instance_start_time: "1281384593876026"
>>>>>>   disk_format_version: 5
>>>>>> }
>>>>>>
>>>>>> After repair, it's this:
>>>>>>
>>>>>> {
>>>>>>   db_name: "testwritesdb"
>>>>>>   doc_count: 1
>>>>>>   doc_del_count: 0
>>>>>>   update_seq: 124969
>>>>>>   purge_seq: 0
>>>>>>   compact_running: false
>>>>>>   disk_size: 54857820
>>>>>>   instance_start_time: "1281385990193289"
>>>>>>   disk_format_version: 5
>>>>>>   committed_update_seq: 124969
>>>>>> }
>>>>>>
>>>>>> All the sequences are there and hitting _all_docs shows all the
>>>>>> documents, so why is the doc_count only 1 in the dbinfo?
>>>>>>
>>>>>> -Mikeal
>>>>>>
>>>>>> On Mon, Aug 9, 2010 at 11:53 AM, Filipe David Manana
>>>>>> <[email protected]> wrote:
>>>>>>
>>>>>>> For the record (and people not on IRC), the code at:
>>>>>>>
>>>>>>> http://github.com/fdmanana/couchdb/commits/db_repair
>>>>>>>
>>>>>>> is working for at least simple cases. Use
>>>>>>> couch_db_repair:repair(DbNameAsString).
>>>>>>> There's one TODO: update the reduce values for the by_seq and by_id
>>>>>>> BTrees.
>>>>>>>
>>>>>>> If anyone wants to give some help on this, you're welcome.
>>>>>>>
>>>>>>> On Mon, Aug 9, 2010 at 6:12 PM, Mikeal Rogers <[email protected]> wrote:
>>>>>>>
>>>>>>>> I'm starting to create a bunch of test db files that expose this
>>>>>>>> bug under different conditions like multiple restarts, across
>>>>>>>> compaction, variances in updates that might cause conflicts, etc.
>>>>>>>>
>>>>>>>> http://github.com/mikeal/couchtest
>>>>>>>>
>>>>>>>> The README outlines what was done to the db's and what needs to be
>>>>>>>> recovered.
>>>>>>>>
>>>>>>>> -Mikeal
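Filipe's TODO about the reduce values is very likely also the answer to Mikeal's doc_count puzzle: dbinfo doesn't count documents by walking the tree, it reads the reduction stored in the by_id btree root, so a repaired tree with stale reductions reports a stale count until the next write recomputes it, which is exactly what jchris's normal-doc-update trick showed. A sketch of where the number comes from; the record field and the {NotDeleted, Deleted} reduction shape follow CouchDB 1.x internals and are assumptions here:

    -module(repair_info).
    -export([doc_count/1]).

    -include("couch_db.hrl").

    %% doc_count is read from the reduce value cached in the by_id btree
    %% root, not computed by a tree walk; stale reductions => stale count.
    doc_count(#db{fulldocinfo_by_id_btree = IdBt}) ->
        {ok, {NotDeleted, _Deleted}} = couch_btree:full_reduce(IdBt),
        NotDeleted.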
>>>>>>>> On Mon, Aug 9, 2010 at 9:33 AM, Filipe David Manana
>>>>>>>> <[email protected]> wrote:
>>>>>>>>
>>>>>>>>> On Mon, Aug 9, 2010 at 5:22 PM, Robert Newson
>>>>>>>>> <[email protected]> wrote:
>>>>>>>>>
>>>>>>>>>> Doesn't this bit;
>>>>>>>>>>
>>>>>>>>>> -    Db#db{waiting_delayed_commit=nil};
>>>>>>>>>> +    Db;
>>>>>>>>>> +    % Db#db{waiting_delayed_commit=nil};
>>>>>>>>>>
>>>>>>>>>> revert the bug fix?
>>>>>>>>>
>>>>>>>>> That's intentional, for my local testing.
>>>>>>>>> That patch obviously isn't anything close to final; it's still
>>>>>>>>> too experimental.
>>>>>>>>>
>>>>>>>>>> B.
>>>>>>>>>>
>>>>>>>>>> On Mon, Aug 9, 2010 at 5:09 PM, Jan Lehnardt <[email protected]> wrote:
>>>>>>>>>>
>>>>>>>>>>> Hi All,
>>>>>>>>>>>
>>>>>>>>>>> Filipe jumped in to start working on the recovery tool, but he
>>>>>>>>>>> isn't done yet.
>>>>>>>>>>>
>>>>>>>>>>> Here's the current patch:
>>>>>>>>>>>
>>>>>>>>>>> http://www.friendpaste.com/4uMngrym4r7Zz4R0ThSHbz
>>>>>>>>>>>
>>>>>>>>>>> It is not done and very early, but any help on this is greatly
>>>>>>>>>>> appreciated.
>>>>>>>>>>>
>>>>>>>>>>> The current state is (in Filipe's words):
>>>>>>>>>>> - I can detect that a file needs repair
>>>>>>>>>>> - and get the last btree roots from it
>>>>>>>>>>> - "only" missing: get the last db seq num
>>>>>>>>>>> - write a new header
>>>>>>>>>>> - and deal with the local docs btree (if it exists)
>>>>>>>>>>>
>>>>>>>>>>> Thanks!
>>>>>>>>>>> Jan
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> Filipe David Manana,
>>>>>>>>> [email protected]
>>>>>>>>>
>>>>>>>>> "Reasonable men adapt themselves to the world.
>>>>>>>>> Unreasonable men adapt the world to themselves.
>>>>>>>>> That's why all progress depends on unreasonable men."
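For completeness, here is the rough shape of the chunked scan Randall describes above: read 1MB at a time and pattern-match in memory instead of issuing a couch_file call per byte. The 7-byte magic below, the term_to_binary prefix shared by {kp_node, _} and {kv_node, _} tuples, is an educated guess at the sort of pattern the branch matches, and the sketch ignores the block-prefix complication Adam mentioned:

    -module(chunk_scan).
    -export([scan/2]).

    -define(CHUNK, 1024 * 1024).
    %% term_to_binary prefix common to {kp_node, _} and {kv_node, _}:
    %% 131 (version), 104 2 (2-tuple), 100 0 7 (7-char atom), $k.
    %% An assumption about the on-disk format, for illustration only.
    -define(MAGIC, <<131, 104, 2, 100, 0, 7, $k>>).

    scan(IoDev, FileSize) ->
        scan(IoDev, 0, FileSize, <<>>, []).

    scan(_IoDev, Pos, FileSize, _Tail, Acc) when Pos >= FileSize ->
        lists:append(lists:reverse(Acc));
    scan(IoDev, Pos, FileSize, Tail, Acc) ->
        ReadLen = erlang:min(?CHUNK, FileSize - Pos),
        {ok, Bin} = file:pread(IoDev, Pos, ReadLen),
        %% Prepend the last 6 bytes of the previous chunk so a match that
        %% straddles a chunk edge is still seen; 6 = byte_size(?MAGIC) - 1,
        %% short enough that no complete match is ever counted twice.
        Buffer = <<Tail/binary, Bin/binary>>,
        Base = Pos - byte_size(Tail),
        Found = [Base + O || {O, _L} <- binary:matches(Buffer, ?MAGIC)],
        TailLen = erlang:min(6, byte_size(Buffer)),
        Tail1 = binary:part(Buffer, byte_size(Buffer) - TailLen, TailLen),
        scan(IoDev, Pos + ReadLen, FileSize, Tail1, [Found | Acc]).

Open the file with {ok, IoDev} = file:open(Path, [read, raw, binary]) and take FileSize from filelib:file_size(Path); the result is a list of candidate offsets to hand to the repair code, which must still verify each one, since a matching prefix does not guarantee a live node.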
