> Your suggestion was to reconstruct the index from original
> table data on a rollback. For a large transaction that touches
> most pages of the index, this can give (at best) a 2:1 speedup.
> But on the other side, the potential slowdown is extreme.
Yeah, there is that drawback. Other DBMSs avoid this problem with some
(decidedly NOT simple) mechanisms such as keeping before-images in
main memory, keeping semantic records of changes in main memory,
keeping page serial numbers (to aid recovery), etc.
A decision can be made dynamically to stop recording before-images
(say, because some threshold percentage of the index's pages are being
changed, or because your main memory buffer is full), and transition
to my suggested approach.
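In outline, that dynamic decision might look something like this (a
minimal Python sketch; the class and callback names are made up for
illustration, not taken from any real DBMS):

    # Journal before-images until a threshold percentage of the index's
    # pages has been touched; past that point, give up on the journal and
    # plan to rebuild the index from the original table data on rollback.

    class DynamicJournal:
        def __init__(self, total_index_pages, threshold_pct):
            self.limit = total_index_pages * threshold_pct // 100
            self.before_images = {}        # page number -> original bytes
            self.rebuild_on_rollback = False

        def record_write(self, page_no, original_bytes):
            # Called before a page is modified for the first time.
            if self.rebuild_on_rollback or page_no in self.before_images:
                return
            if len(self.before_images) >= self.limit:
                # Threshold reached: stop recording before-images.
                self.rebuild_on_rollback = True
            else:
                self.before_images[page_no] = original_bytes

        def rollback(self, restore_page, rebuild_index_from_table):
            if self.rebuild_on_rollback:
                rebuild_index_from_table()   # writes ~100% of the index
            else:
                for page_no, original in self.before_images.items():
                    restore_page(page_no, original)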
Presently, if P% of the pages are modified, 2P% of the pages are
written for a commit (each modified page plus its before-image) and
3P% for a rollback (those plus the restored pages). With the dynamic
scheme, where X is the threshold percentage, these become

    2P  if P < X, else P + X            (commit)
    3P  if P < X, else P + X + 100      (rollback)

percent; the extra 100 on the rollback side is the full rebuild of the
index from the original table data.
If the threshold were 25% you'd never write more than 125% of the
size of the index for a committed transaction, and never more than
225% for a rolled-back transaction. The present scheme, by contrast,
writes up to 200% for a committed and 300% for a rolled-back
transaction. Here is a table comparing the approaches for various
percentages of index pages modified. The left column is the
percentage of index pages modified; the remaining columns are the
percentage of index pages written under each approach: c and r for
the present scheme's commit and rollback, c' and r' for the dynamic
scheme at thresholds of 25% and 50%.
   %      c  c'(25)  c'(50)      r  r'(25)  r'(50)
  10     20      20      20     30      30      30
  20     40      40      40     60      60      60
  30     60      55      60     90     155      90
  40     80      65      80    120     165     120
  50    100      75     100    150     175     200
  60    120      85     110    180     185     210
  70    140      95     120    210     195     220
  80    160     105     130    240     205     230
  90    180     115     140    270     215     240
 100    200     125     150    300     225     250
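For reference, the table can be regenerated from the two formulas
directly; here is a quick Python sketch with P and X as defined above:

    def present_commit(P):   return 2 * P
    def present_rollback(P): return 3 * P

    def dynamic_commit(P, X):
        # Before-images are only journaled up to the threshold X.
        return 2 * P if P < X else P + X

    def dynamic_rollback(P, X):
        # Past the threshold, the whole index (100%) is rebuilt from the table.
        return 3 * P if P < X else P + X + 100

    for P in range(10, 101, 10):
        print(P, present_commit(P),
              dynamic_commit(P, 25), dynamic_commit(P, 50),
              present_rollback(P),
              dynamic_rollback(P, 25), dynamic_rollback(P, 50))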
Note that the dynamic approach is always better than or equivalent to
the present one for committed transactions. For rolled-back
transactions it is worse only when the percentage of pages modified is
between X and (X+100)/2, since P + X + 100 exceeds 3P exactly when
P < (X+100)/2. Finding an optimal value for X is
application-dependent, unfortunately.
Another approach...
One database I implemented used newly allocated pages for all updates
to indexes, relying on the original versions remaining intact, so
rollback was "free" (not counting the reclaiming of the free space).
Commit was implemented by rewriting the pointer to the root of the
index BTree so that it pointed to the new pages. The advantage of this
method is journaling for free. The drawbacks are that even small
changes require rewriting the BTree along the entire path from leaf to
root (so that a single disk write of the root pointer can commit the
index changes), and that a free-space recovery mechanism is needed.
Overall it was very reliable and effective.
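In outline, the copy-on-write idea looks something like this (a toy
Python sketch, not the actual code; the node layout is made up, and
page splits and free-space reclamation are omitted):

    # Updates never touch existing nodes: the leaf-to-root path is rewritten
    # into newly allocated nodes, and the transaction commits by a single
    # atomic update of the root pointer.

    class Node:
        def __init__(self, keys, children=None, values=None):
            self.keys = keys
            self.children = children      # None for a leaf
            self.values = values

    def cow_insert(node, key, value):
        # Return a NEW node reflecting the insert; the old tree is untouched.
        if node.children is None:                          # leaf
            keys, values = list(node.keys), list(node.values)
            i = 0
            while i < len(keys) and keys[i] < key:
                i += 1
            keys.insert(i, key)
            values.insert(i, value)
            return Node(keys, values=values)
        i = 0
        while i < len(node.keys) and key >= node.keys[i]:  # pick subtree
            i += 1
        children = list(node.children)
        children[i] = cow_insert(children[i], key, value)  # new child node
        return Node(list(node.keys), children=children)    # new interior node

    class Index:
        def __init__(self):
            self.root = Node([], values=[])    # the durable root pointer

        def insert_and_commit(self, key, value):
            new_root = cow_insert(self.root, key, value)
            # Up to this point rollback is free: self.root still names the
            # old, intact tree.  The single store below is the commit.
            self.root = new_root

Until the root pointer is rewritten, readers and rollback both see the
old tree, which is why the journaling comes for free.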
e