Ivan Unknown wrote:
> Hello!
>
> I have been looking at the LMDB project code trying to learn and understand
> how a database could be implemented; however, I have been struggling to
> answer the questions below for quite some time, partially due to my limited
> knowledge of C:
>
> - How does LMDB store keys and values in the page? I have learned that a page
> consists of the header, slots, and key-value pairs but how does the database
> handle keys and values that are too large to fit within a page?

That is already documented in the code and Doxygen.

> - I found that LMDB does not keep a fixed number of keys per page, so that
> would depend on the key-value pairs' sizes already inserted into the page.
> Is this correct?

Correct. This is a major difference from textbook Btree or B+tree
implementations, but it is essential for good storage utilization.

> Does it mean that B+tree pages (branch or leaf) could have a different number
> of keys depending on the key size?

Yes.

> How does it affect performance or implementation of the B+tree?

No particular impact.

> - Are there any limitations on the size of a key?

Yes.

> Can it be of an arbitrary length?

No. There is work underway to remove length limits on keys in LMDB 1.0 but
that feature isn't working yet.

> - What is the difference between IS_LEAF and IS_LEAF2 flags in the page
> header? What is the difference between these pages?

That is already documented in the code and Doxygen.

> - How do overflow pages work in LMDB? From what I could understand, if a key
> or a value does not fit in the page, it will be stored in the overflow page
> (the entire page is allocated for that specific key or value). Is this
> correct?

Yes for values. Not for keys since they have a max length smaller than a page.

> What happens when the key size is several times larger than the page size,
> e.g. 1MB value with 4KB pages?

That is already documented in the code and Doxygen.

> - What is a sub-page in LMDB (F_SUBDATA)? How does it work?

That is already documented in the code and Doxygen.

> I would greatly appreciate it if someone could share links to the
> documentation that covers internals of the database, online videos, research
> papers, mailing lists, or any notes you could share to help me understand
> the above. Thank you very much!

Doxygen docs are embedded in the source code already. You can format them
using the doxygen tool.
Other info is linked at https://www.symas.com/symas-lmdb-tech-info

> Cheers,
> Ivan
>

-- 
  -- Howard Chu
  CTO, Symas Corp.           http://www.symas.com
  Director, Highland Sun     http://highlandsun.com/hyc/
  Chief Architect, OpenLDAP  http://www.openldap.org/project/
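
To make two of the answers above concrete (a page holds a variable number
of entries, and an oversized value takes whole overflow pages), here is a
rough sketch in C. It is not LMDB's code. The constants (a 16-byte page
header, an 8-byte per-node header, a 2-byte slot) and the function names
are assumptions chosen for illustration, and the page-count formula is only
modeled on how the overflow calculation appears to work; check mdb.c and
the Doxygen output for the real definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed sizes for illustration only; not taken from the LMDB headers. */
enum {
    PAGE_HDR = 16, /* bytes of page header (assumption) */
    NODE_HDR = 8,  /* bytes of per-entry node header (assumption) */
    SLOT     = 2   /* bytes per slot in the page's offset array (assumption) */
};

/* Bytes one key/value entry consumes in a page: node header + key + value,
 * rounded up to an even size, plus one slot entry. */
size_t entry_cost(size_t key_len, size_t val_len)
{
    size_t node = NODE_HDR + key_len + val_len;
    node += node & 1; /* round up to an even number of bytes */
    return node + SLOT;
}

/* How many equally sized entries fit in one page. The count depends on the
 * entry sizes, which is why there is no fixed number of keys per page. */
size_t entries_per_page(size_t page_size, size_t key_len, size_t val_len)
{
    assert(page_size > PAGE_HDR);
    return (page_size - PAGE_HDR) / entry_cost(key_len, val_len);
}

/* Whole pages needed to hold one oversized value, assuming the first page
 * also carries a page header and the value is stored contiguously. */
size_t overflow_pages(size_t val_len, size_t page_size)
{
    return (PAGE_HDR - 1 + val_len) / page_size + 1;
}
```

Under these assumptions, a 4096-byte page holds 32 entries with 16-byte
keys and 100-byte values but 107 entries with 8-byte keys and 20-byte
values, and a 1 MB (1048576-byte) value needs 257 pages of 4096 bytes.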
