IO operations

Page edited by Emmanuel Lécharny
One of the major burdens and costs of managing a BTree efficiently comes from writing it to disk. We don't care too much whether the in-memory operations on a BTree are as fast as possible, considering that writing to a physical medium is about three orders of magnitude slower than writing to memory.

Physical support

There are different aspects to consider when we think about IO: the data will be written to physical storage, but depending on the kind of device we are using, performance may vary a lot. We basically have three kinds of physical support:

Spinning hard disk

There are a lot of different disks, but we will consider the rotation speed to be one of the most important factors. Rotation speed dictates the rotational latency (the time it takes for the right sector to rotate under the head). The following table shows the impact of various rotation speeds on this latency:
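
As a rough illustration of how rotation speed translates into latency: on average the disk has to turn half a revolution before the wanted sector passes under the head, so the average rotational latency is 60 / (2 × RPM) seconds. The sketch below applies this formula to a few rotation speeds. The list of speeds and the function names are assumptions chosen for the example, not values taken from this page.

```python
def average_rotational_latency_ms(rpm: float) -> float:
    """Average wait for the target sector: half a revolution, in milliseconds."""
    seconds_per_revolution = 60.0 / rpm
    return seconds_per_revolution / 2.0 * 1000.0


# Rotation speeds picked for illustration only (assumption, not from this page).
ILLUSTRATIVE_RPMS = [5400, 7200, 10000, 15000]


def latency_table(rpms=ILLUSTRATIVE_RPMS):
    """Return (rpm, average latency in ms rounded to 2 decimals) pairs."""
    return [(rpm, round(average_rotational_latency_ms(rpm), 2)) for rpm in rpms]


if __name__ == "__main__":
    for rpm, latency_ms in latency_table():
        print(f"{rpm:>6} RPM  ->  {latency_ms:.2f} ms")
```

Doubling the rotation speed halves this part of the access time, but it stays in the millisecond range, far above memory access times.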

Add to this latency the seek time (the time it takes to move the head to the right track) and the time it takes to transfer the data from the disk to memory. In any case, it is enough to know that in order to improve the performance of a BTree, we should minimize disk IO and keep the data contiguous on disk to avoid time-consuming seek operations.

SSD

SSDs work in a totally different way, and one other factor has to be taken into account: writes wear out an SSD in the long run, and when we modify something on an SSD, a whole block is rewritten, not just a page (a block can be quite big, something like 2 MB). So if we can defer the write until we have enough data to fill a block, that would be better.

NAS

We could also think about a solution where the data is pushed to a NAS, as it will have a different performance profile.

Relation between the in-memory B+Tree and physical support

An in-memory MVCC BTree is a good thing, but at some point we are limited by two factors:
There are two ways to mitigate those constraints:
We will describe the two strategies in the next paragraphs.

Persistent In-Memory BTree

The idea is to keep the whole BTree in memory, while saving the newly added or removed data on disk, so that we can reload the BTree at startup. This is done using a journal which is flushed to disk periodically by a separate thread. We may still lose some data, but once the journal is written to disk, we can restore the BTree from what we have on disk. To keep it simple, we don't modify a file that already contains data: we create a new one with the current tree content. When we load this file into the in-memory BTree, we then have to apply the journal to get back to the same state as when we stopped the process.
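
The snapshot-plus-journal scheme described above can be sketched as follows. This is a minimal illustration under several assumptions, not the actual implementation: a plain dictionary stands in for the in-memory BTree, the class name `JournaledStore`, the file names and the JSON encoding are all hypothetical, keys are strings, and the periodic flush is an explicit method call where a real system would run it from a separate thread.

```python
import json
import os
import tempfile


class JournaledStore:
    """In-memory map (stand-in for the BTree) made durable by a snapshot plus a journal."""

    def __init__(self, directory):
        self.snapshot_path = os.path.join(directory, "snapshot.json")  # hypothetical name
        self.journal_path = os.path.join(directory, "journal.log")     # hypothetical name
        self.tree = {}      # hypothetical stand-in for the in-memory BTree
        self.pending = []   # journal entries not yet written to disk
        self._recover()

    def put(self, key, value):
        self.tree[key] = value
        self.pending.append({"op": "put", "key": key, "value": value})

    def remove(self, key):
        self.tree.pop(key, None)
        self.pending.append({"op": "remove", "key": key})

    def flush_journal(self):
        """Append pending entries to the journal; a background thread would call this."""
        with open(self.journal_path, "a", encoding="utf-8") as journal:
            for entry in self.pending:
                journal.write(json.dumps(entry) + "\n")
            journal.flush()
            os.fsync(journal.fileno())
        self.pending.clear()

    def write_snapshot(self):
        """Write the current tree to a new file, swap it in, then reset the journal."""
        new_path = self.snapshot_path + ".new"
        with open(new_path, "w", encoding="utf-8") as snapshot:
            json.dump(self.tree, snapshot)
            snapshot.flush()
            os.fsync(snapshot.fileno())
        os.replace(new_path, self.snapshot_path)  # the old file is never edited in place
        open(self.journal_path, "w").close()      # its entries are now in the snapshot
        self.pending.clear()

    def _recover(self):
        """Load the last snapshot, then replay the journal on top of it."""
        if os.path.exists(self.snapshot_path):
            with open(self.snapshot_path, encoding="utf-8") as snapshot:
                self.tree = json.load(snapshot)
        if os.path.exists(self.journal_path):
            with open(self.journal_path, encoding="utf-8") as journal:
                for line in journal:
                    entry = json.loads(line)
                    if entry["op"] == "put":
                        self.tree[entry["key"]] = entry["value"]
                    else:
                        self.tree.pop(entry["key"], None)


def demo():
    directory = tempfile.mkdtemp()
    store = JournaledStore(directory)
    store.put("a", 1)
    store.put("b", 2)
    store.write_snapshot()
    store.put("c", 3)
    store.remove("a")
    store.flush_journal()
    store.put("lost", 9)                   # never flushed: lost on restart
    return JournaledStore(directory).tree  # simulates a restart
```

In `demo()`, the entry written after the last journal flush does not survive the simulated restart, which matches the trade-off stated above: only what has reached the journal on disk can be restored.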
