Slight correction: this was with delayed_commits=false. My framework does a PUT to ensure that on every test run.

B.
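(For illustration, here is one way such a PUT could be done from an Erlang shell via CouchDB's _config API; the host, port, and use of OTP's inets/httpc client are assumptions, not details taken from this thread:)

    %% Sketch only: force delayed_commits=false before a test run through the
    %% _config API. URL/port and the httpc client are assumptions.
    inets:start(),
    Url = "http://127.0.0.1:5984/_config/couchdb/delayed_commits",
    {ok, {{_, 200, _}, _Hdrs, _OldValue}} =
        httpc:request(put, {Url, [], "application/json", "\"false\""}, [], []).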
On Tue, Aug 10, 2010 at 9:55 AM, Robert Newson <[email protected]> wrote:
> I ran the db_repair code on a healthy database produced with delayed_commits=true.
>
> The source db had 3218 docs. db_repair recovered 3120 and then returned with ok.
>
> I'm redoing that test, but this indicates we're not finding all roots.
>
> I note that the output file was 36 times the size of the input file, which is a consequence of folding all possible roots. I think that needs to be in the release notes for the repair tool if that behavior remains when it ships.
>
> B.
>
> On Tue, Aug 10, 2010 at 9:09 AM, Mikeal Rogers <[email protected]> wrote:
>> I think I found a bug in the current lost+found repair.
>>
>> I've been running it against the testwritesdb and it's in a state that is never finishing.
>>
>> It's still spitting out these lines:
>>
>> [info] [<0.32.0>] writing 1001 updates to lost+found/testwritesdb
>>
>> Most are 1001 but there are also other random values: 452, 866, etc.
>>
>> But the file size and dbinfo haven't budged in over 30 minutes. The size is stuck at 34300002, with the original db file being 54857478.
>>
>> This database only has one document in it that isn't "lost", so if it's finding *any* new docs it should be writing them.
>>
>> I also started another job to recover a production db that is quite large, 500 MB, with the missing data from a week or so back. This has been running for 2 hours and has still not output anything or created the lost+found db, so I can only assume that it is in the same state.
>>
>> Both machines are still churning at 100% CPU.
>>
>> -Mikeal
>>
>>
>> On Mon, Aug 9, 2010 at 11:26 PM, Adam Kocoloski <[email protected]> wrote:
>>
>>> With Randall's help we hooked the new node scanner up to the lost+found DB generator. It seems to work well enough for small DBs; for large DBs with lots of missing nodes the O(N^2) complexity of the problem catches up to the code and generating the lost+found DB takes quite some time. Mikeal is running tests tonight. The algorithm appears pretty CPU-limited, so a little parallelization may be warranted.
>>>
>>> http://github.com/kocolosk/couchdb/tree/db_repair
>>>
>>> Adam
>>>
>>> (I sent this previous update to myself instead of the list, so I'll forward it here ...)
>>>
>>> On Aug 10, 2010, at 12:01 AM, Adam Kocoloski wrote:
>>>
>>> > On Aug 9, 2010, at 10:10 PM, Adam Kocoloski wrote:
>>> >
>>> >> Right, make_lost_and_found still relies on code which reads through couch_file one byte at a time; that's the cause of the slowness. The newer scanner will improve that pretty dramatically, and we can tune it further by increasing the length of the pattern that we match when looking for kp/kv_node terms in the files, at the expense of some extra complexity dealing with the block prefixes (currently it does a 1-byte match, which as I understand it cannot be split across blocks).
>>> >
>>> > The scanner now looks for a 7-byte match, unless it is within 6 bytes of a block boundary, in which case it looks for the longest possible match at that position. The more specific match condition greatly reduces the number of calls to couch_file, and thus boosts the throughput. On my laptop it can scan the testwritesdb.couch from Mikeal's couchtest repo (52 MB) in 18 seconds.
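(To illustrate the chunked-scan idea described above: read the .couch file in large chunks and collect candidate offsets of a marker pattern, rather than issuing a couch_file call per byte. This is a rough sketch, not the actual couch_db_repair scanner; the module name is made up, and for simplicity it ignores matches that straddle a chunk or block boundary:)

    %% Rough sketch of a chunked forward scan; NOT the actual scanner in
    %% couch_db_repair. Matches straddling a chunk boundary are missed here.
    -module(scan_sketch).
    -export([scan/2]).

    -define(CHUNK_SIZE, 1024 * 1024).   % read 1 MB at a time, not 1 byte

    %% Return absolute file offsets at which Pattern occurs in the file.
    scan(Path, Pattern) when is_binary(Pattern) ->
        {ok, Fd} = file:open(Path, [read, raw, binary]),
        try scan_chunks(Fd, Pattern, 0, []) after file:close(Fd) end.

    scan_chunks(Fd, Pattern, Offset, Acc) ->
        case file:pread(Fd, Offset, ?CHUNK_SIZE) of
            {ok, Chunk} ->
                Hits = [Offset + Pos || {Pos, _Len} <- binary:matches(Chunk, Pattern)],
                scan_chunks(Fd, Pattern, Offset + byte_size(Chunk), Acc ++ Hits);
            eof ->
                Acc
        end.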
>>> >
>>> >> Regarding the file_corruption error on the larger file, I think this is something we will just naturally trigger when we take a guess that random positions in a file are actually the beginning of a term. I think our best recourse here is to return {error, file_corruption} from couch_file but leave the gen_server up and running instead of terminating it. That way the repair code can ignore the error and keep moving without having to reopen the file.
>>> >
>>> > I committed this change (to my db_repair branch) after consulting with Chris. The longer match condition makes these spurious file_corruption triggers much less likely, but I think it's still a good thing not to crash the server when they happen.
>>> >
>>> >> Next steps as I understand them - Randall is working on integrating the in-memory scanner into Volker's code that finds all the dangling by_id nodes. I'm working on making sure that the scanner identifies bt node candidates which span block prefixes, and on improving its pattern matching.
>>> >
>>> > Latest from my end:
>>> > http://github.com/kocolosk/couchdb/tree/db_repair
>>> >
>>> >>
>>> >> Adam
>>> >>
>>> >> On Aug 9, 2010, at 9:50 PM, Mikeal Rogers wrote:
>>> >>
>>> >>> I pulled down the latest code from Adam's branch @ 7080ff72baa329cf6c4be2a79e71a41f744ed93b.
>>> >>>
>>> >>> Running timer:tc(couch_db_repair, make_lost_and_found, ["multi_conflict"]). on a database with 200 lost updates spanning 200 restarts ( http://github.com/mikeal/couchtest/blob/master/multi_conflict.couch ) took about 101 seconds.
>>> >>>
>>> >>> I tried running against a larger database ( http://github.com/mikeal/couchtest/blob/master/testwritesdb.couch ) and I got this exception:
>>> >>>
>>> >>> http://gist.github.com/516491
>>> >>>
>>> >>> -Mikeal
>>> >>>
>>> >>>
>>> >>> On Mon, Aug 9, 2010 at 6:09 PM, Randall Leeds <[email protected]> wrote:
>>> >>>
>>> >>>> Summing up what went on in IRC for those who were absent.
>>> >>>>
>>> >>>> The latest progress is on Adam's branch at
>>> >>>> http://github.com/kocolosk/couchdb/tree/db_repair
>>> >>>>
>>> >>>> couch_db_repair:make_lost_and_found/1 attempts to create a new lost+found/DbName database into which it merges all nodes not accessible from anywhere (any other node found in a full file scan or any header pointers).
>>> >>>>
>>> >>>> Currently, make_lost_and_found uses Volker's repair (from the couch_db_repair_b module, also in Adam's branch). Adam found that the bottleneck was couch_file calls and that the repair process was taking a very long time, so he added couch_db_repair:find_nodes_quickly/1, which reads 1 MB chunks as binary and tries to process them to find nodes instead of scanning back one byte at a time. It is currently not hooked up to the repair mechanism.
>>> >>>>
>>> >>>> Making progress. Go team.
>>> >>>>
>>> >>>> On Mon, Aug 9, 2010 at 13:52, Mikeal Rogers <[email protected]> wrote:
>>> >>>>> jchris suggested on IRC that I try a normal doc update and see if that fixes it.
>>> >>>>>
>>> >>>>> It does. After a new doc was created the dbinfo doc count was back to normal.
>>> >>>>>
>>> >>>>> -Mikeal
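(A hedged sketch of the couch_file behavior change Adam describes further up: when a guessed offset turns out not to be a valid term, reply {error, file_corruption} to the caller and keep the gen_server alive, rather than stopping it and forcing repair to reopen the file. The module name, state, and on-disk term layout below are illustrative assumptions, not the real couch_file code:)

    %% Illustrative only; not the actual couch_file module. The point is the
    %% reply in handle_call: errors are returned, the server is never stopped.
    -module(file_reader_sketch).
    -behaviour(gen_server).
    -export([start_link/1, pread_term/2]).
    -export([init/1, handle_call/3, handle_cast/2, handle_info/2,
             terminate/2, code_change/3]).

    start_link(Path) -> gen_server:start_link(?MODULE, Path, []).

    %% Try to read an Erlang term at Pos; returns {ok, Term} or
    %% {error, file_corruption} without crashing the server.
    pread_term(Pid, Pos) -> gen_server:call(Pid, {pread_term, Pos}).

    init(Path) ->
        {ok, Fd} = file:open(Path, [read, raw, binary]),
        {ok, Fd}.

    handle_call({pread_term, Pos}, _From, Fd) ->
        %% Assumed layout for this sketch: a 32-bit length prefix followed by
        %% a term_to_binary payload. Any mismatch is treated as corruption.
        Reply =
            case file:pread(Fd, Pos, 4) of
                {ok, <<Len:32>>} ->
                    case file:pread(Fd, Pos + 4, Len) of
                        {ok, Bin} when byte_size(Bin) =:= Len ->
                            try {ok, binary_to_term(Bin)}
                            catch error:badarg -> {error, file_corruption}
                            end;
                        _ -> {error, file_corruption}
                    end;
                _ -> {error, file_corruption}
            end,
        %% Crucially, never {stop, file_corruption, ...} here: repair code can
        %% ignore the error and keep scanning without reopening the file.
        {reply, Reply, Fd}.

    handle_cast(_Msg, Fd) -> {noreply, Fd}.
    handle_info(_Msg, Fd) -> {noreply, Fd}.
    terminate(_Reason, Fd) -> file:close(Fd), ok.
    code_change(_OldVsn, Fd, _Extra) -> {ok, Fd}.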
>>> >>>>>
>>> >>>>> On Mon, Aug 9, 2010 at 1:39 PM, Mikeal Rogers <[email protected]> wrote:
>>> >>>>>
>>> >>>>>> Ok, I pulled down this code and tested against a database with a ton of missing writes right before a single restart.
>>> >>>>>>
>>> >>>>>> Before the restart this was the database:
>>> >>>>>>
>>> >>>>>> {
>>> >>>>>>   db_name: "testwritesdb"
>>> >>>>>>   doc_count: 124969
>>> >>>>>>   doc_del_count: 0
>>> >>>>>>   update_seq: 124969
>>> >>>>>>   purge_seq: 0
>>> >>>>>>   compact_running: false
>>> >>>>>>   disk_size: 54857478
>>> >>>>>>   instance_start_time: "1281384140058211"
>>> >>>>>>   disk_format_version: 5
>>> >>>>>> }
>>> >>>>>>
>>> >>>>>> After the restart it was this:
>>> >>>>>>
>>> >>>>>> {
>>> >>>>>>   db_name: "testwritesdb"
>>> >>>>>>   doc_count: 1
>>> >>>>>>   doc_del_count: 0
>>> >>>>>>   update_seq: 1
>>> >>>>>>   purge_seq: 0
>>> >>>>>>   compact_running: false
>>> >>>>>>   disk_size: 54857478
>>> >>>>>>   instance_start_time: "1281384593876026"
>>> >>>>>>   disk_format_version: 5
>>> >>>>>> }
>>> >>>>>>
>>> >>>>>> After repair, it's this:
>>> >>>>>>
>>> >>>>>> {
>>> >>>>>>   db_name: "testwritesdb"
>>> >>>>>>   doc_count: 1
>>> >>>>>>   doc_del_count: 0
>>> >>>>>>   update_seq: 124969
>>> >>>>>>   purge_seq: 0
>>> >>>>>>   compact_running: false
>>> >>>>>>   disk_size: 54857820
>>> >>>>>>   instance_start_time: "1281385990193289"
>>> >>>>>>   disk_format_version: 5
>>> >>>>>>   committed_update_seq: 124969
>>> >>>>>> }
>>> >>>>>>
>>> >>>>>> All the sequences are there and hitting _all_docs shows all the documents, so why is the doc_count only 1 in the dbinfo?
>>> >>>>>>
>>> >>>>>> -Mikeal
>>> >>>>>>
>>> >>>>>> On Mon, Aug 9, 2010 at 11:53 AM, Filipe David Manana <[email protected]> wrote:
>>> >>>>>>
>>> >>>>>>> For the record (and people not on IRC), the code at:
>>> >>>>>>>
>>> >>>>>>> http://github.com/fdmanana/couchdb/commits/db_repair
>>> >>>>>>>
>>> >>>>>>> is working for at least simple cases. Use couch_db_repair:repair(DbNameAsString).
>>> >>>>>>> There's one TODO: update the reduce values for the by_seq and by_id BTrees.
>>> >>>>>>>
>>> >>>>>>> If anyone wants to give some help on this, you're welcome.
>>> >>>>>>>
>>> >>>>>>> On Mon, Aug 9, 2010 at 6:12 PM, Mikeal Rogers <[email protected]> wrote:
>>> >>>>>>>
>>> >>>>>>>> I'm starting to create a bunch of test db files that expose this bug under different conditions like multiple restarts, across compaction, variances in updates that might cause conflicts, etc.
>>> >>>>>>>>
>>> >>>>>>>> http://github.com/mikeal/couchtest
>>> >>>>>>>>
>>> >>>>>>>> The README outlines what was done to the dbs and what needs to be recovered.
>>> >>>>>>>>
>>> >>>>>>>> -Mikeal
>>> >>>>>>>>
>>> >>>>>>>> On Mon, Aug 9, 2010 at 9:33 AM, Filipe David Manana <[email protected]> wrote:
>>> >>>>>>>>
>>> >>>>>>>>> On Mon, Aug 9, 2010 at 5:22 PM, Robert Newson <[email protected]> wrote:
>>> >>>>>>>>>
>>> >>>>>>>>>> Doesn't this bit:
>>> >>>>>>>>>>
>>> >>>>>>>>>> - Db#db{waiting_delayed_commit=nil};
>>> >>>>>>>>>> + Db;
>>> >>>>>>>>>> + % Db#db{waiting_delayed_commit=nil};
>>> >>>>>>>>>>
>>> >>>>>>>>>> revert the bug fix?
>>> >>>>>>>>>>
>>> >>>>>>>>>
>>> >>>>>>>>> That's intentional, for my local testing.
>>> >>>>>>>>> That patch obviously isn't anything close to final; it's still too experimental.
>>> >>>>>>>>>
>>> >>>>>>>>>>
>>> >>>>>>>>>> B.
>>> >>>>>>>>>>
>>> >>>>>>>>>> On Mon, Aug 9, 2010 at 5:09 PM, Jan Lehnardt <[email protected]> wrote:
>>> >>>>>>>>>>> Hi All,
>>> >>>>>>>>>>>
>>> >>>>>>>>>>> Filipe jumped in to start working on the recovery tool, but he isn't done yet.
>>> >>>>>>>>>>>
>>> >>>>>>>>>>> Here's the current patch:
>>> >>>>>>>>>>>
>>> >>>>>>>>>>> http://www.friendpaste.com/4uMngrym4r7Zz4R0ThSHbz
>>> >>>>>>>>>>>
>>> >>>>>>>>>>> It is not done and very early, but any help on this is greatly appreciated.
>>> >>>>>>>>>>>
>>> >>>>>>>>>>> The current state is (in Filipe's words):
>>> >>>>>>>>>>> - I can detect that a file needs repair
>>> >>>>>>>>>>> - and get the last btree roots from it
>>> >>>>>>>>>>> - "only" missing: get last db seq num
>>> >>>>>>>>>>> - write new header
>>> >>>>>>>>>>> - and deal with the local docs btree (if it exists)
>>> >>>>>>>>>>>
>>> >>>>>>>>>>> Thanks!
>>> >>>>>>>>>>> Jan
>>> >>>>>>>>>>>
>>> >>>>>>>>>
>>> >>>>>>>>> --
>>> >>>>>>>>> Filipe David Manana,
>>> >>>>>>>>> [email protected]
>>> >>>>>>>>>
>>> >>>>>>>>> "Reasonable men adapt themselves to the world.
>>> >>>>>>>>> Unreasonable men adapt the world to themselves.
>>> >>>>>>>>> That's why all progress depends on unreasonable men."
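(For reference, the repair entry points mentioned in this thread are invoked from a CouchDB node's Erlang shell roughly as below; the database name is a placeholder, and timer:tc is just an optional timing wrapper that reports the elapsed time in microseconds:)

    %% Entry points referenced in this thread (database name is a placeholder).
    couch_db_repair:repair("testwritesdb").                            % Filipe's branch
    couch_db_repair:make_lost_and_found("testwritesdb").               % Adam's branch
    timer:tc(couch_db_repair, make_lost_and_found, ["testwritesdb"]).  % timed run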
