OK, that seems to be working. Thanks for the tip!
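
For the record, the check I'm doing before each put now looks roughly like
this (just a sketch: should_prune() is my own name, the 90% threshold is
arbitrary, and it only counts the main DB's pages, not the free-pages DB
that LMDB keeps internally):

#include <lmdb.h>

/* Return nonzero when roughly 90% of the map is occupied by the main
 * (unnamed) DB's pages. Placeholder logic only. */
static int should_prune(MDB_env *env)
{
    MDB_stat st;
    MDB_envinfo info;
    size_t used, total;

    if (mdb_env_stat(env, &st) != 0 || mdb_env_info(env, &info) != 0)
        return 0;

    used  = st.ms_branch_pages + st.ms_leaf_pages + st.ms_overflow_pages;
    total = info.me_mapsize / st.ms_psize;   /* pages in the whole map */

    return used >= (total * 9) / 10;
}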

One thing that hung me up along the way: for my pruning, I'm using a cursor to
iterate through the records and mdb_cursor_del() to delete the dead ones.
mdb_cursor_del() seems to leave the cursor positioned on the next record, so
I'm actually starting at the last record and moving backward through the
entries with mdb_cursor_get(…, MDB_PREV). So far so good.

The problem is that mdb_cursor_get(…, MDB_PREV) continues to return 0 (rather
than MDB_NOTFOUND) when there are no records left in the database. Obviously
this is easy to work around, but it seems like a bug to me.
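
Concretely, the prune loop now looks more or less like this (again just a
sketch: is_dead() stands in for my real liveness test, and grabbing the entry
count from mdb_stat() up front is the easy workaround for the MDB_PREV
behavior above):

#include <lmdb.h>

/* Walk backward from the last record inside an open write txn, deleting
 * whatever is_dead() flags. Stops after visiting ms_entries records so an
 * empty DB can't keep the loop spinning. */
static int prune_dead(MDB_txn *txn, MDB_dbi dbi,
                      int (*is_dead)(const MDB_val *key, const MDB_val *data))
{
    MDB_cursor *cur;
    MDB_stat st;
    MDB_val key, data;
    size_t remaining;
    int rc;

    rc = mdb_stat(txn, dbi, &st);
    if (rc != 0)
        return rc;
    remaining = st.ms_entries;

    rc = mdb_cursor_open(txn, dbi, &cur);
    if (rc != 0)
        return rc;

    /* Start at the last record and walk backward. */
    rc = mdb_cursor_get(cur, &key, &data, MDB_LAST);
    while (rc == 0 && remaining > 0) {
        if (is_dead(&key, &data))
            mdb_cursor_del(cur, 0);
        remaining--;
        rc = mdb_cursor_get(cur, &key, &data, MDB_PREV);
    }

    mdb_cursor_close(cur);
    return (rc == MDB_NOTFOUND) ? 0 : rc;
}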

Thanks so much for your assistance, happy to have this simplified version 
working.

Jeremy
 
On 11.06.2013, at 22:16, Howard Chu <[email protected]> wrote:

> Jeremy Bernstein wrote:
>> Thanks Howard,
>> 
>> OK, so I tried this again with a slightly more modest toy database (after 
>> reading the presentation, thanks), 1MB (256 pages). Blasting a bunch of 
>> records into it at once (with a transaction grain of 100 records), I am 
>> getting MDB_MAP_FULL with 1 branch, 115 leaf and 0 overflow pages. So I 
>> suppose that I can use 1/3 of the database size (85 leaf pages in this 
>> example) as a rough guideline as to when I should prune. My real databases 
>> are between 4 and 128MB (32MB being typical), and my real transactions are 
>> generally a bit smaller.
>> 
>> Does that seem reasonable to you, or do I need to be working on a different 
>> scale entirely?
> 
> I doubt that the cutover point will scale as linearly as that; you should 
> just experiment further with your real data.
>> 
>> Jeremy
>> 
>> On 11.06.2013, at 20:11, Howard Chu <[email protected]> wrote:
>> 
>>> Your entire mapsize was only 64K, 16 pages? That's not going to work well. 
>>> Please read the LMDB presentations to understand why not. Remember that in 
>>> addition to the main data pages, there is also a 2nd DB maintaining a list 
>>> of old pages, and since LMDB uses copy-on-write, every single write you make 
>>> is going to dirty multiple pages, and dirty pages cannot be reused until 2 
>>> transactions after they were freed. So you need enough free space in the 
>>> map to store ~3 copies of your largest transaction, in addition to the 
>>> static data.
>>> 
>>>> Thanks
>>>> Jeremy
>>>> 
>>>> On 11.06.2013, at 19:32, Howard Chu <[email protected]> wrote:
>>>> 
>>>>> Jeremy Bernstein wrote:
>>>>>> Although I didn't figure out a good way to do what I want, this is what
>>>>>> I am now doing:
>>>>>> 
>>>>>> if (MDB_MAP_FULL while putting) {
>>>>>>   abort txn, close the database
>>>>>>   reopen the database @ larger mapsize
>>>>>>   perform some pruning of dead records
>>>>>>   commit txn, close the database
>>>>>>   reopen the database @ old mapsize
>>>>>>   try to put again
>>>>>> }
>>>>>> 
>>>>>> At this point, the database is probably larger than the old mapsize. To
>>>>>> handle that, I make a copy of the DB, kill the original, open a new
>>>>>> database and copy the records from the old DB to the new one.
>>>>>> 
>>>>>> All of this is a lot more complicated and code-verbose than I want, but
>>>>>> it works and seems to be reliable.
>>>>>> 
>>>>>> Nevertheless, if there's an easier way, I'm all ears. Thanks for your
>>>>>> thoughts.
>>>>> 
>>>>> Use mdb_stat() before performing the _put(). If the total number of pages 
>>>>> in use is large (whatever threshold you choose, e.g. 90%) then start 
>>>>> pruning.
>>>>> 
>>>>> Look at the mdb_stat command's output to get an idea of what you're 
>>>>> looking for.
> 
> 
> -- 
>  -- Howard Chu
>  CTO, Symas Corp.           http://www.symas.com
>  Director, Highland Sun     http://highlandsun.com/hyc/
>  Chief Architect, OpenLDAP  http://www.openldap.org/project/

