Another update.  This morning I took a different tack and, rather than try to 
find root nodes, I just looked for all kv_nodes in the file and treated each of 
those as a separate virtual DB to be replicated.  This reduces the algorithmic 
complexity of the repair, and it looks like testwritesdb now repairs in about 
30 minutes.  A side effect of this method is that the lost+found DB ends up 
containing every document, not just the missing ones.
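
To make the idea concrete, here is a minimal sketch in Python rather than the Erlang of the actual branch. It is a toy model, not CouchDB's on-disk format: the node dictionaries, the function names, and the "keep the highest revision number" merge rule are all assumptions made for illustration (real replication also preserves conflicting revisions).

```python
# Toy model only: a "file" is a list of nodes, and each kv_node carries
# (doc_id, rev, body) tuples. None of these names come from the db_repair branch.

def find_kv_nodes(nodes):
    """Return every kv_node found in the file, ignoring tree structure."""
    return [n for n in nodes if n["type"] == "kv_node"]

def replicate(source_docs, target):
    """Copy docs into target, keeping the highest revision seen per doc id."""
    for doc_id, rev, body in source_docs:
        current = target.get(doc_id)
        if current is None or rev > current[0]:
            target[doc_id] = (rev, body)

def repair(nodes):
    """Treat each kv_node as its own virtual DB and replicate it into lost+found."""
    lost_and_found = {}
    for node in find_kv_nodes(nodes):
        replicate(node["docs"], lost_and_found)
    return lost_and_found

file_nodes = [
    {"type": "kv_node", "docs": [("a", 1, "a-v1"), ("b", 1, "b-v1")]},
    {"type": "kp_node", "children": [0]},  # interior node, skipped by the scan
    {"type": "kv_node", "docs": [("a", 2, "a-v2"), ("c", 1, "c-v1")]},
]
recovered = repair(file_nodes)
```

Because no root is consulted, nothing in this sketch distinguishes documents that are still reachable from ones that were lost, which is why the target ends up with every document.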

My branch does not currently include Randall's parallelization of the 
replications.  The repair is still CPU-limited, so that may be a worthwhile 
optimization.  On the other hand, I think we may be reaching a stage at which 
performance for this repair tool is 'good enough', and parallel maps (pmaps) 
can make error handling a bit dicey.
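
As a rough illustration of the error-handling concern, here is a sketch using Python threads as a stand-in for an Erlang pmap. It only illustrates the bookkeeping, not the performance: `replicate_one`, the source names, and the failure condition are hypothetical. In a sequential loop a failure surfaces exactly where it happens; in a parallel map each task's failure has to be caught and attributed to its source explicitly.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

def replicate_one(source):
    """Hypothetical stand-in for replicating one virtual DB."""
    if source == "bad_node":
        raise ValueError(f"cannot read {source}")
    return f"replicated {source}"

def parallel_replicate(sources, workers=4):
    """Run replications in parallel, collecting results and errors per source."""
    results, errors = {}, {}
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = {pool.submit(replicate_one, s): s for s in sources}
        for fut in as_completed(futures):
            src = futures[fut]
            try:
                results[src] = fut.result()
            except Exception as exc:
                # Without this, one failed task would abort the collection
                # loop and hide the outcome of the remaining tasks.
                errors[src] = exc
    return results, errors

results, errors = parallel_replicate(["node_1", "bad_node", "node_2"])
```

The caller then has to decide what a partial failure means for the repair as a whole, which is the part that gets dicey.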

In short, I think this tool is now in good shape.

http://github.com/kocolosk/couchdb/tree/db_repair
