----- aku at zXHaWG+kLKkd+kWTIIPLcB3mpv4 ----- 2007.02.17 - 01:42:41GMT -----
The bdb library comes with some built in utilities which can be used to
examine the databases you have stored on disk.

Before getting started I'm going to explain something important. The data in
your store is not kept in the database. The database only contains metadata
which is used to find and manage the content itself which is saved in special
data files.

The actual store data is in these 6 files in the Freenet root directory:

    chk-12345.cache
    chk-12345.store
    ssk-12345.cache
    ssk-12345.store
    pubkey-12345.cache
    pubkey-12345.store

The database is in the directory '/database-12345' off of the Freenet root
directory and contains files which look like this:

    -rw-r--r-- 1 aku aku 9999979 Feb 15 23:06 00000054.jdb
    -rw-r--r-- 1 aku aku 9998161 Feb 16 03:01 00000055.jdb
    -rw-r--r-- 1 aku aku 9999139 Feb 16 17:12 00000056.jdb
    -rw-r--r-- 1 aku aku 9999996 Feb 16 20:54 00000057.jdb
    -rw-r--r-- 1 aku aku 3493292 Feb 16 21:59 00000058.jdb
    -rw-r--r-- 1 aku aku       0 Jan 18 13:54 je.lck

We will begin by using the 'DbDump' command to list the 'tables' stored in the
database. ('-l' flag to list them)

    $ java -cp freenet-ext.jar com.sleepycat.je.util.DbDump -h ./database-12345 -l
    chk-cache-CHK
    chk-cache-CHK_accessTime
    chk-cache-CHK_blockNum
    chk-store-CHK
    chk-store-CHK_accessTime
    chk-store-CHK_blockNum
    pubkey-cache-CHK
    pubkey-cache-CHK_accessTime
    pubkey-cache-CHK_blockNum
    pubkey-store-CHK
    pubkey-store-CHK_accessTime
    pubkey-store-CHK_blockNum
    ssk-cache-CHK
    ssk-cache-CHK_accessTime
    ssk-cache-CHK_blockNum
    ssk-store-CHK
    ssk-store-CHK_accessTime
    ssk-store-CHK_blockNum

Ok, that's 18 different 'tables' (BDB documentation calls them databases) in
the database (BDB calls the whole thing the environment). You can see from the
names that each store data file has 3 associated tables. The ones which end
with _accessTime store data about which records have been accessed most
recently and _blockNum stores an index which is ordered by data block number
(why? just don't even ask).
The main table is indexed by a 32 byte content key (the Freenet 'routing'
keys).

With the tool DbStat we can take a closer look at one of these. Let's start
with chk-cache-CHK, which is the main table of metadata for the CHK key cache.

    $ java -cp freenet-ext.jar com.sleepycat.je.util.DbStat -h ./database-12345 -s chk-cache-CHK
    numBottomInternalNodes=354
      level 1: count=354
    numInternalNodes=5
      level 2: count=4
      level 3: count=1
    numLeafNodes=15344
    numDeletedLeafNodes=0
    numDuplicateCountLeafNodes=0
    mainTreeMaxDepth=3
    duplicateTreeMaxDepth=0

The most interesting thing here is numLeafNodes, which is the count of records
in this table. Let's look at some stats in FProxy:

    Store size
      * Cached keys: 15,344 (479 MiB)
      * Stored keys: 15,344 (479 MiB)
      * Overall size: 30,688 / 30,688 (959 MiB / 959 MiB) (100%)

(Yes, I know what a stingy motherfucking network leech I am)

So that matches the information in the database: 15344 keys (for 15344 data
blocks) in the CHK cache and 15344 records to describe them.

How about those other tables, the access time and block ordering:

    $ java -cp freenet-ext.jar com.sleepycat.je.util.DbStat -h ./database-12345 \
        -s chk-cache-CHK_accessTime
    [ ... ]
    numLeafNodes=15344
    [ ... ]

    $ java -cp freenet-ext.jar com.sleepycat.je.util.DbStat -h ./database-12345 \
        -s chk-cache-CHK_blockNum
    [ ... ]
    numLeafNodes=15344
    [ ... ]

They are both the same size as the main table, which is no surprise since they
are indices on the main table.

Because of the way the store is allocated, each of the 6 stores and caches
uses exactly the same number of keys. For 1GB of data (my store size, the
default), about 15K keys are allocated to each one, but there are 6 of them,
and each one has 3 tables with 15K keys. So there is 18 times as much metadata
as there is in one table.

BDB comes with a utility called DbCacheSize which can make some calculations
based on database parameters you provide and try to guess how much cache
memory your database will require.
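The table counts above can be tallied with a short Python sketch. It is not part of the original post: the variable names are made up for illustration, and it assumes every table holds the 15344 records that DbStat reported for the three chk-cache tables.

```python
from itertools import product

# Components of the table names as they appear in the DbDump listing.
KEY_TYPES = ("chk", "pubkey", "ssk")
STORE_KINDS = ("cache", "store")
SUFFIXES = ("", "_accessTime", "_blockNum")  # main table plus two indices

# Assumption: every table has the record count DbStat showed for chk-cache.
RECORDS_PER_TABLE = 15344

# product() varies the last component fastest, which reproduces the
# order of the DbDump listing.
table_names = [
    f"{key}-{kind}-CHK{suffix}"
    for key, kind, suffix in product(KEY_TYPES, STORE_KINDS, SUFFIXES)
]

data_files = len(KEY_TYPES) * len(STORE_KINDS)
total_records = len(table_names) * RECORDS_PER_TABLE

print(f"data files:       {data_files}")
print(f"tables:           {len(table_names)}")
print(f"metadata records: {total_records}")
```

With these inputs the sketch counts 6 data files, 18 tables and 276192 metadata records for a default 1GB store.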
Here is an example DbCacheSize run using the Freenet data and a table size of
15344 records.

    $ mkdir testdb
    $ java -Xmx256m -cp freenet-ext.jar com.sleepycat.je.util.DbCacheSize -records 15344 \
        -key 32 -data 16 -measure testdb -measurerandom
    Inputs: records=15344 keySize=32 dataSize=16 nodeMax=128 density=80% overhead=10%

        Cache Size      Btree Size  Description
    --------------  --------------  -----------
         1,132,231       1,019,008  Minimum, internal nodes only
         1,434,880       1,291,392  Maximum, internal nodes only
         2,086,968       1,878,272  Minimum, internal nodes and leaf nodes
         2,389,617       2,150,656  Maximum, internal nodes and leaf nodes

    Btree levels: 3

    Measuring with cache size: 238,937,702
    ..
    Stats for internal and leaf nodes (after insert):
    CacheSize=5,036,404 BtreeSize=1,890,676

    Preloading with cache size: 238,937,702
    .
    Stats for internal nodes only (after preload):
    CacheSize=4,253,352 BtreeSize=1,107,624

So a few megs of memory here and there, in both the best and worst cases. But
don't forget to multiply by 18! :)

----

Most people are going to want a store a little bit larger than the default, so
let's run the numbers for a 25GB store.

    15344 * 25 = 383600 keys per table

    $ rm -rf testdb
    $ mkdir testdb
    $ java -Xmx256m -cp freenet-ext.jar com.sleepycat.je.util.DbCacheSize -records 383600 \
        -key 32 -data 16 -measure testdb -measurerandom
    Inputs: records=383600 keySize=32 dataSize=16 nodeMax=128 density=80% overhead=10%

        Cache Size      Btree Size  Description
    --------------  --------------  -----------
        28,231,288      25,408,160  Minimum, internal nodes only
        35,777,600      32,199,840  Maximum, internal nodes only
        52,099,733      46,889,760  Minimum, internal nodes and leaf nodes
        59,646,044      53,681,440  Maximum, internal nodes and leaf nodes

    Btree levels: 3

    Measuring with cache size: 238,937,702
    .......................................
    Stats for internal and leaf nodes (after insert):
    CacheSize=52,348,510 BtreeSize=49,202,782

    Preloading with cache size: 238,937,702
    .
    Stats for internal nodes only (after preload):
    CacheSize=30,774,600 BtreeSize=27,628,872

When you create 18 tables of that size and fill them up you need a cache size
of around 500mb - 900mb of memory!!

----

So, how much memory does the new store design I am building use to manage a
25GB store?

    25GB = 15344 * 6 * 25 = 2301600 keys total

    LRU queues      8.8 megs        (4 bytes per key)
    Bloom Filters   2.2 - 3.3 megs  (8 - 12 bits per key)
    B+Tree          24 kbytes       ( + 3 disk reads of 4k each to locate data)
      or
    B+Tree          2.4 megs        ( + 2 disk reads of 4k each to locate data)
    -----------------------------
                    11mb - 15mb

If I really wanted to, I could remove 2.2 megs from the LRU queue by using 24
bit indices.

----- toad at zceUWxlSaHLmvEMnbr4RHnVfehA ----- 2007.02.20 - 00:35:20GMT -----

Great work. Will repost to devl. How does BDB determine how much RAM is
required? What is the basic principle here?
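The memory figures in aku's post can be reproduced with a few lines of Python. This is only a cross-check added alongside the thread, not code from either author. The variable names are invented, 'megs' is assumed to mean MiB, and multiplying the measured DbCacheSize results by 18 follows the post's own simplification that all 18 tables have the same shape as the one measured.

```python
MIB = 1024 * 1024

KEYS_PER_TABLE_1GB = 15344   # records per table for the default 1GB store
DATA_FILES = 6               # chk/ssk/pubkey, each with a cache and a store
TABLES = 18                  # 3 BDB tables per data file
STORE_GB = 25

total_keys = KEYS_PER_TABLE_1GB * DATA_FILES * STORE_GB

# BDB side: measured CacheSize values for one 383600-record table,
# copied from the DbCacheSize output in the post.
bdb_internal_only = 30_774_600
bdb_internal_and_leaf = 52_348_510
bdb_low = TABLES * bdb_internal_only
bdb_high = TABLES * bdb_internal_and_leaf

# New design: per-key costs quoted in the post.
lru_bytes = total_keys * 4            # 4 bytes per key
lru_24bit_bytes = total_keys * 3      # 24 bit indices instead of 32 bit
bloom_low = total_keys * 8 // 8       # 8 bits per key
bloom_high = total_keys * 12 // 8     # 12 bits per key
btree_small = 24 * 1024               # 24 kbytes, 3 disk reads per lookup
btree_large = int(2.4 * MIB)          # 2.4 megs, 2 disk reads per lookup

new_low = lru_bytes + bloom_low + btree_small
new_high = lru_bytes + bloom_high + btree_large


def mib(n_bytes):
    """Convert a byte count to MiB."""
    return n_bytes / MIB


print(f"total keys:         {total_keys}")
print(f"BDB cache estimate: {mib(bdb_low):.0f} - {mib(bdb_high):.0f} MiB")
print(f"LRU queues:         {mib(lru_bytes):.1f} MiB")
print(f"Bloom filters:      {mib(bloom_low):.1f} - {mib(bloom_high):.1f} MiB")
print(f"new design total:   {mib(new_low):.1f} - {mib(new_high):.1f} MiB")
print(f"24 bit LRU saving:  {mib(lru_bytes - lru_24bit_bytes):.1f} MiB")
```

Under these assumptions the results line up with the post: roughly 500 to 900 MiB of cache for the 18 BDB tables against roughly 11 to 15 MiB for the proposed design.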