OK, that seems to be working. Thanks for the tip! One thing that hung me up along the way: in my pruning, I am using a cursor to iterate through the records and mdb_cursor_del() to delete the dead ones. mdb_cursor_del() seems to set the cursor to the next record, so I'm actually starting at the last record and moving backward through the entries with mdb_cursor_get(…, MDB_PREV). So far so good.
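For reference, the loop is essentially the following (trimmed down; record_is_dead() is just a stand-in for my actual staleness check, and this is before the empty-database workaround I mention below):

    #include <lmdb.h>

    /* Placeholder: replace with the real test for a stale record. */
    static int record_is_dead(const MDB_val *key, const MDB_val *data)
    {
        (void)key; (void)data;
        return 0;  /* stub: never deletes anything as written */
    }

    /* Walk the DB backward and delete dead records. Backward because
     * mdb_cursor_del() leaves the cursor on the following record, so
     * MDB_PREV still makes progress after a delete. */
    static int prune_dead(MDB_txn *txn, MDB_dbi dbi)
    {
        MDB_cursor *cur;
        MDB_val key, data;
        int rc;

        rc = mdb_cursor_open(txn, dbi, &cur);
        if (rc != 0)
            return rc;

        rc = mdb_cursor_get(cur, &key, &data, MDB_LAST);
        while (rc == 0) {
            if (record_is_dead(&key, &data)) {
                rc = mdb_cursor_del(cur, 0);
                if (rc != 0)
                    break;
            }
            rc = mdb_cursor_get(cur, &key, &data, MDB_PREV);
        }
        mdb_cursor_close(cur);
        /* MDB_NOTFOUND here just means we walked past the first record. */
        return (rc == MDB_NOTFOUND) ? 0 : rc;
    }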
The problem is that mdb_cursor_get(…, MDB_PREV) will continue to return 0 when there are no records left in the database. Obviously this is easy to work around, but it seems like a bug to me.

Thanks so much for your assistance, happy to have this simplified version working.

Jeremy

On 11.06.2013 at 22:16, Howard Chu <[email protected]> wrote:

> Jeremy Bernstein wrote:
>> Thanks Howard,
>>
>> OK, so I tried this again with a slightly more modest toy database
>> (after reading the presentation, thanks), 1MB (256 pages). Blasting a
>> bunch of records into it at once (with a transaction grain of 100
>> records) I am getting MDB_MAP_FULL with 1 branch, 115 leaf and 0
>> overflow nodes. So I suppose that I can use 1/3 of the database size
>> (85 leaf pages in this example) as a rough guideline as to when I
>> should prune. My real databases are between 4 and 128MB, 32MB being
>> typical, and my real transactions are generally a bit smaller.
>>
>> Does that seem reasonable to you, or do I need to be working on a
>> different scale entirely?
>
> I doubt that the cutover point will scale as linearly as that; you
> should just experiment further with your real data.
>
>> Jeremy
>>
>> On 11.06.2013 at 20:11, Howard Chu <[email protected]> wrote:
>>
>>> Your entire mapsize was only 64K, 16 pages? That's not going to work
>>> well. Please read the LMDB presentations to understand why not.
>>> Remember that in addition to the main data pages, there is also a
>>> 2nd DB maintaining a list of old pages, and since LMDB uses
>>> copy-on-write, every single write you make is going to dirty
>>> multiple pages, and dirty pages cannot be reused until 2
>>> transactions after they were freed. So you need enough free space in
>>> the map to store ~3 copies of your largest transaction, in addition
>>> to the static data.
>>>
>>>> Thanks
>>>> Jeremy
>>>>
>>>> On 11.06.2013 at 19:32, Howard Chu <[email protected]> wrote:
>>>>
>>>>> Jeremy Bernstein wrote:
>>>>>> Although I didn't figure out a good way to do what I want, this
>>>>>> is what I am now doing:
>>>>>>
>>>>>> if (MDB_MAP_FULL while putting) {
>>>>>>     abort txn, close the database
>>>>>>     reopen the database @ larger mapsize
>>>>>>     perform some pruning of dead records
>>>>>>     commit txn, close the database
>>>>>>     reopen the database @ old mapsize
>>>>>>     try to put again
>>>>>> }
>>>>>>
>>>>>> At this point, the database is probably larger than the old
>>>>>> mapsize. To handle that, I make a copy of the DB, kill the
>>>>>> original, open a new database and copy the records from the old
>>>>>> DB to the new one.
>>>>>>
>>>>>> All of this is a lot more complicated and code-verbose than I
>>>>>> want, but it works and seems to be reliable.
>>>>>>
>>>>>> Nevertheless, if there's an easier way, I'm all ears. Thanks for
>>>>>> your thoughts.
>>>>>
>>>>> Use mdb_stat() before performing the _put(). If the total number
>>>>> of pages in use is large (whatever threshold you choose, e.g. 90%)
>>>>> then start pruning.
>>>>>
>>>>> Look at the mdb_stat command's output to get an idea of what
>>>>> you're looking for.
>
> --
>   -- Howard Chu
>   CTO, Symas Corp.           http://www.symas.com
>   Director, Highland Sun     http://highlandsun.com/hyc/
>   Chief Architect, OpenLDAP  http://www.openldap.org/project/
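PS: For the archives, a fill-ratio check along the lines Howard suggested can look roughly like this (a sketch rather than my exact code; the 90% threshold is just the example number from above, and the page count only covers the one DB, not the freelist):

    #include <lmdb.h>

    /* Rough "is it time to prune?" check: pages used by this DB
     * versus total pages in the map. Ignores the freelist DB, so it
     * understates real usage somewhat. */
    static int should_prune(MDB_env *env, MDB_txn *txn, MDB_dbi dbi,
                            double threshold /* e.g. 0.90 */)
    {
        MDB_stat st;
        MDB_envinfo info;

        if (mdb_stat(txn, dbi, &st) != 0 || mdb_env_info(env, &info) != 0)
            return 0;

        size_t used_pages = st.ms_branch_pages + st.ms_leaf_pages
                          + st.ms_overflow_pages;
        size_t map_pages  = info.me_mapsize / st.ms_psize;

        return map_pages > 0
            && (double)used_pages >= threshold * (double)map_pages;
    }

called before each put, something like:

    if (should_prune(env, txn, dbi, 0.90))
        prune_dead(txn, dbi);
    rc = mdb_put(txn, dbi, &key, &data, 0);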
