Here are some ideas on how I would like to structure the documentation.
For every topic, I'd like to create:

Introduction to concepts - A general introduction to the topic.
Derby implementation details - This will be the main substance of the document, where I will describe how Derby works.
References - These will point to published papers, books, etc. that discuss the topic in question.

In terms of topics, here is what I have come up with (let me know if things should be added):

Terms - this will be a glossary of Derby-specific terms so that I don't have to keep explaining the same terms in every document.
Row ID - how rows are identified - RecordId and SlotId.
Row management within a page - storage format of rows, slot table, etc.
Handling of large rows - how does Derby handle rows that won't fit into one page?
Container - what is a container?
Space management in Containers - how is space management implemented? How does Derby locate an empty page?
Latches - are latches used by Derby? How are they implemented?
Lock management - description of lock management, lock conflict resolution, deadlocks, lock escalation.
Buffer cache - how does Derby implement the buffer cache, and what is the interaction between the Buffer cache and the Log, and between the Buffer cache and the Transaction manager?
Write ahead log - description of how the log is implemented - this would mainly cover the format of log records, how log records are represented in memory and on disk, log archiving, checkpointing, etc.
Transactions - how is an Xid allocated? What is the representation of a transaction? How is a transaction related to a thread?
Transaction Manager - description of how Derby implements ARIES. What happens at system restart? How do rollbacks and commits work? The different types of log records used by the transaction manager, such as do, redo, undo, compensation, checkpoint, etc.
Row locking in tables - how are rows locked? What happens when a row spans pages?
Row recovery - Do/Redo/Undo actions for rows - inserts, updates, deletes.
BTree - page organisation and structure.
BTree - concurrency - how does Derby handle concurrent updates to the tree - inserts and deletes? How are structural changes serialised? Do updates block readers (not as in locking, but while the change is being made), or can they progress concurrently?
BTree - locking - data row locking or index row locking? Is next-key locking used for serializable reads?
BTree - recovery - Do/Redo/Undo actions for key inserts, updates, deletes.
Row scans on tables - is this handled by "store"?
Row scans in BTrees - is this handled by "store"?
Conglomerates - what is a Conglomerate?
