Berkeley DB Java Edition is not sufficiently robust for our usage. By robust I 
mean that we need to be able to recover automatically from a corrupted 
datastore on startup. We cannot do this with BDBJE. At least, not without a 
lot of hacking...

My latest effort was to automatically apply the documented recovery procedure 
of DbDump -r followed by DbLoad. The problem is that this produces duplicate 
elements in the database, which, as far as I can see, cannot be cleaned up 
automatically. That in turn makes it impossible to reconstruct the secondary 
index for block numbers. Observe:

INFO   | jvm 1    | 2007/05/16 15:10:47 | Opening block db index
INFO   | jvm 1    | 2007/05/16 15:10:49 | Database chk-cache-CHK_blockNum does not exist deleting it
INFO   | jvm 1    | 2007/05/16 15:10:49 | Reconstructing block numbers index... (com.sleepycat.je.DatabaseNotFoundException: (JE 3.2.23) Database chk-cache-CHK_blockNum not found.)
INFO   | jvm 1    | 2007/05/16 15:10:49 | Creating new block DB index
INFO   | jvm 1    | 2007/05/16 15:10:58 | Error opening block nums db: com.sleepycat.je.DatabaseException: (JE 3.2.23) Could not insert secondary key in chk-cache-CHK_blockNum OperationStatus.KEYEXIST
INFO   | jvm 1    | 2007/05/16 15:10:58 | com.sleepycat.je.DatabaseException: (JE 3.2.23) Could not insert secondary key in chk-cache-CHK_blockNum OperationStatus.KEYEXIST
INFO   | jvm 1    | 2007/05/16 15:10:58 |       at com.sleepycat.je.SecondaryDatabase.insertKey(SecondaryDatabase.java:698)
INFO   | jvm 1    | 2007/05/16 15:10:58 |       at com.sleepycat.je.SecondaryDatabase.updateSecondary(SecondaryDatabase.java:567)
INFO   | jvm 1    | 2007/05/16 15:10:58 |       at com.sleepycat.je.SecondaryDatabase.init(SecondaryDatabase.java:181)
INFO   | jvm 1    | 2007/05/16 15:10:58 |       at com.sleepycat.je.SecondaryDatabase.initNew(SecondaryDatabase.java:118)
INFO   | jvm 1    | 2007/05/16 15:10:58 |       at com.sleepycat.je.Environment.openDb(Environment.java:473)
INFO   | jvm 1    | 2007/05/16 15:10:58 |       at com.sleepycat.je.Environment.openSecondaryDatabase(Environment.java:375)
INFO   | jvm 1    | 2007/05/16 15:10:58 |       at freenet.store.BerkeleyDBFreenetStore.<init>(BerkeleyDBFreenetStore.java:639)
INFO   | jvm 1    | 2007/05/16 15:10:58 |       at freenet.store.BerkeleyDBFreenetStore.openStore(BerkeleyDBFreenetStore.java:383)
INFO   | jvm 1    | 2007/05/16 15:10:58 |       at freenet.store.BerkeleyDBFreenetStore.construct(BerkeleyDBFreenetStore.java:150)
INFO   | jvm 1    | 2007/05/16 15:10:58 |       at freenet.node.Node.<init>(Node.java:1333)
INFO   | jvm 1    | 2007/05/16 15:10:58 |       at freenet.node.NodeStarter.start(NodeStarter.java:148)
INFO   | jvm 1    | 2007/05/16 15:10:58 |       at org.tanukisoftware.wrapper.WrapperManager$12.run(WrapperManager.java:2788)

What the code has done is dump the databases into .dump files using DbDump, 
delete the old logs, and restore them via DbLoad. The secondary indexes (on 
access time and on block number) then have to be rebuilt. On access time, 
duplicates are allowed; on block number they are not, because only one key can 
be stored in any given block. At the beginning of the log above, the block 
numbers secondary index does not exist:

Reconstructing block numbers index... (com.sleepycat.je.DatabaseNotFoundException: (JE 3.2.23) Database chk-cache-CHK_blockNum not found.)
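
For reference, the dump-and-reload step amounts to roughly the following. This 
is a sketch, not the code in BerkeleyDBFreenetStore: the environment path, the 
primary database name and the exact flags are my assumptions (the -r salvage 
flag is the one from the JE usage text).

import com.sleepycat.je.util.DbDump;
import com.sleepycat.je.util.DbLoad;

// Sketch only: the salvage-and-reload cycle driven through JE's command-line
// utilities. Path, database name and flag set are assumptions.
public class SalvageSketch {
    public static void main(String[] args) throws Exception {
        String envHome = "datastore/database";   // hypothetical path

        // 1. Salvage whatever records can still be read. In -r mode DbDump
        //    writes <dbName>.dump files; salvage output can contain the same
        //    record more than once, which is presumably where the duplicates
        //    come from.
        DbDump.main(new String[] { "-h", envHome, "-r" });

        // 2. The old je *.jdb log files are deleted here, leaving a fresh
        //    environment to load into.

        // 3. Reload a salvaged dump. "chk-cache-CHK" is a guess at the
        //    primary database name.
        DbLoad.main(new String[] {
                "-h", envHome, "-f", "chk-cache-CHK.dump", "-s", "chk-cache-CHK" });
    }
}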

So we open the secondary database with allowCreate and allowPopulate enabled 
(roughly the sketch after the two points below). It throws a DatabaseException 
as above, as far as I can see because the recovered main database contains 
more than one record with the same block number. Our code picks this up and 
takes the only option available to it: it deletes the database and 
reconstructs it from the contents of the store file. There are two main 
problems with this:
1) It doesn't work at all for SSKs. The SSK store is dropped on every 
reconstruction, because we don't have the key and can't regenerate it from the 
headers. The fix is to store the key somewhere; we will do this eventually.
2) It doesn't restore the LRU order of the stored data. A fix for this would 
be to store the access time counter somewhere on disk too.
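
For clarity, the open-and-populate step that blows up looks roughly like this. 
Again a sketch only: the key creator, the assumption that the block number can 
be pulled out of the primary record, and the class and method names are 
illustrative, not the actual BerkeleyDBFreenetStore code.

import com.sleepycat.je.*;

// Sketch: (re)building the blockNum secondary index via allowCreate +
// allowPopulate. The block-number extraction below is hypothetical.
public class BlockNumIndexSketch {

    static SecondaryDatabase openBlockNumIndex(Environment env, Transaction txn,
            Database keysDb) throws DatabaseException {
        SecondaryConfig secConf = new SecondaryConfig();
        secConf.setAllowCreate(true);       // create the index if it is missing
        secConf.setAllowPopulate(true);     // rebuild it from the primary on open
        secConf.setSortedDuplicates(false); // only one key per block number
        secConf.setKeyCreator(new SecondaryKeyCreator() {
            public boolean createSecondaryKey(SecondaryDatabase sec,
                    DatabaseEntry key, DatabaseEntry data, DatabaseEntry result)
                    throws DatabaseException {
                // Assumption: the block number can be derived from the primary
                // record; here we pretend it is the first 4 bytes of the data.
                result.setData(data.getData(), 0, 4);
                return true;
            }
        });
        // If the salvaged primary contains two records claiming the same block
        // number, the populate step fails with a DatabaseException wrapping
        // OperationStatus.KEYEXIST, exactly as in the log above.
        return env.openSecondaryDatabase(txn, "chk-cache-CHK_blockNum",
                keysDb, secConf);
    }
}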

So our options are:
1) Open the block numbers database with sorted duplicates enabled, scan 
through it for duplicates, keep the correct entry in each case, then close the 
database and re-open it with sorted duplicates disabled (roughly the sketch 
after this list).
2) Keep the block numbers index open with sorted duplicates enabled. When we 
need to look up a block number, deal with the fact that there may be multiple 
keys referring to it, deleting as appropriate, and deal with the fact that 
this may leave keys in the main database that don't exist in the store.
3) Improve the data stored on disk: store the LRU access time, and the key 
itself, on disk. This would probably need migration code.
4) Use a completely different database for the index.
5) Use a completely different database for the whole store, and trust it not 
to lose the data (from our experience with BDB, it probably will!).
6) Roll our own database code.
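
To make option 1 concrete, a rough sketch. Which duplicate is the "correct" 
one to keep is a policy decision the real code would still have to make; here 
it naively keeps the first entry for each block number.

import com.sleepycat.je.*;

// Sketch of option 1: with the index opened with sorted duplicates enabled,
// walk it and drop all but one primary record per block number. Afterwards
// the index can be closed, removed and re-opened with duplicates disabled,
// repopulating cleanly from the primary.
public class DedupeBlockNumsSketch {

    static void dedupe(SecondaryDatabase blockNums, Transaction txn)
            throws DatabaseException {
        DatabaseEntry blockNum = new DatabaseEntry();
        DatabaseEntry data = new DatabaseEntry();
        SecondaryCursor c = blockNums.openSecondaryCursor(txn, null);
        try {
            OperationStatus status = c.getFirst(blockNum, data, LockMode.DEFAULT);
            while (status == OperationStatus.SUCCESS) {
                // count() is the number of primary records sharing this block
                // number; everything beyond the first is a duplicate to drop.
                int dups = c.count();
                for (int i = 1; i < dups; i++) {
                    if (c.getNextDup(blockNum, data, LockMode.DEFAULT)
                            == OperationStatus.SUCCESS) {
                        // Deleting through a secondary cursor also removes the
                        // associated record from the primary database.
                        c.delete();
                    }
                }
                status = c.getNext(blockNum, data, LockMode.DEFAULT);
            }
        } finally {
            c.close();
        }
    }
}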