Thoughts about DIRSERVER-1663

Emmanuel Lecharny Mon, 03 Oct 2011 02:19:28 -0700

Hi guys,

This error is a pretty annoying one. Selcuk and I had a convo about it last Friday, which is summed up here.

Basically, what happens is that when multiple threads are doing a search while others are adding or deleting entries which are potentially part of the returned results, we get an NPE. This is because we use a cursor on an index which holds the IDs of entries that may have been removed by the time we try to read them.

The discussion we had led to the conclusion that we need to implement a transaction system to protect the client from this problem. This can probably be implemented on top of what we have, even if it kills performance.

OTOH, at some point, what we really need is to implement an MVCC (multiversion concurrency control) system on top of the backend.

MVCC is a system which keeps old versions of elements until they aren't needed anymore. For instance, when we do a search, we browse some entries using their IDs, provided by an index. When we start the search, we select the best possible index to browse the entries, and we get back a set of IDs. If we associate this operation with a unique transaction ID, we must guarantee that all the IDs from the set remain present until the cursor is fully read (or the search is cancelled). If one of the entries associated with those IDs is modified, we should still be able to access the previous version of the entry. Such a modification must create a copy of the entry itself, but also of all the tuples in the indexes, each associated with a revision number. The incoming transaction will use this revision number to get an immutable set of IDs.
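To make the idea concrete, here is a minimal sketch of revision-based reads. It is not ApacheDS code: the class MvccSketch and its methods (put, delete, beginRead, idsAt, get) are hypothetical names invented for illustration, entries are plain strings, and the "index" is reduced to the set of entry IDs. Writes never overwrite; they add a new version under a new revision, and a search pins the revision it started at.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentSkipListMap;

// Hypothetical sketch: none of these names come from ApacheDS.
class MvccSketch {
    // Entry ID -> (revision -> entry at that revision; empty Optional = deleted).
    private final ConcurrentSkipListMap<Long, ConcurrentSkipListMap<Long, Optional<String>>> versions =
            new ConcurrentSkipListMap<>();
    private long currentRevision = 0;

    // Every modification adds a new version instead of replacing the old one.
    private synchronized long write(long id, Optional<String> entry) {
        long rev = ++currentRevision;
        versions.computeIfAbsent(id, k -> new ConcurrentSkipListMap<>()).put(rev, entry);
        return rev;
    }

    long put(long id, String entry) { return write(id, Optional.of(entry)); }

    long delete(long id) { return write(id, Optional.empty()); }

    // A search starts by pinning the revision it will read at.
    synchronized long beginRead() { return currentRevision; }

    // Returns the entry as it was at the given revision, or null if it did not exist then.
    String get(long id, long revision) {
        ConcurrentSkipListMap<Long, Optional<String>> history = versions.get(id);
        if (history == null) return null;
        Map.Entry<Long, Optional<String>> e = history.floorEntry(revision);
        return e == null ? null : e.getValue().orElse(null);
    }

    // Stand-in for an index lookup: the IDs of all entries visible at the revision.
    List<Long> idsAt(long revision) {
        List<Long> ids = new ArrayList<>();
        for (Long id : versions.keySet()) {
            if (get(id, revision) != null) ids.add(id);
        }
        return ids;
    }

    public static void main(String[] args) {
        MvccSketch store = new MvccSketch();
        store.put(1L, "cn=alice");
        store.put(2L, "cn=bob");

        long rev = store.beginRead();        // the search starts here
        List<Long> ids = store.idsAt(rev);   // ID set handed to the cursor

        store.delete(2L);                    // a concurrent delete arrives

        for (long id : ids) {                // the cursor still sees both entries
            System.out.println(id + " -> " + store.get(id, rev));
        }
        System.out.println("new search sees entry 2: " + store.get(2L, store.beginRead()));
    }
}
```

The point of the sketch is only that the cursor reads through (id, revision) rather than id alone, so a delete that happens after the search started cannot turn into a missing entry for that cursor. Nothing is ever cleaned up here.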

Now, at some point, that will create a hell of a lot of new entries and tuples in the index tables. We must implement a system to clean up those duplicates once they are no longer in use. There are two ways to handle such a cleanup:

- keep all the duplicates in the backend, removing them when no operation is associated with the old revision
- or create a rollback table, where the old elements are stored, with a limited size

The second solution is what Oracle uses. It's efficient, as you don't have to update the main database, except when you have to grab old revisions. All the old elements are simply pushed into this rollback table (rollback segment), and remain available as long as they are not pushed out by newer elements (the table has a limited size).
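A rough sketch of the rollback-table idea, under simplifying assumptions: this is not how Oracle implements it, all names (RollbackSketch, Undo, Current, evictedUpTo) are hypothetical, deletes are left out, and the table is a small in-memory queue. The main table holds only current values; each overwrite pushes the previous value into the bounded table, and a reader whose revision may depend on an evicted version gets an error.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of a bounded rollback table; not Oracle's implementation.
class RollbackSketch {
    // An old version, valid for read revisions in [writtenAt, supersededAt).
    private static final class Undo {
        final long id, writtenAt, supersededAt;
        final String value;
        Undo(long id, long writtenAt, long supersededAt, String value) {
            this.id = id; this.writtenAt = writtenAt;
            this.supersededAt = supersededAt; this.value = value;
        }
    }

    private static final class Current {
        final long writtenAt;
        final String value;
        Current(long writtenAt, String value) { this.writtenAt = writtenAt; this.value = value; }
    }

    private final Map<Long, Current> mainTable = new HashMap<>();
    private final Deque<Undo> rollback = new ArrayDeque<>();
    private final int capacity;
    private long revision = 0;
    private long evictedUpTo = 0; // highest supersededAt among evicted versions

    RollbackSketch(int capacity) {
        if (capacity < 1) throw new IllegalArgumentException("capacity must be >= 1");
        this.capacity = capacity;
    }

    synchronized long beginRead() { return revision; }

    synchronized void put(long id, String value) {
        revision++;
        Current old = mainTable.put(id, new Current(revision, value));
        if (old == null) return;
        if (rollback.size() == capacity) {
            // The table is full: the oldest version is pushed out for good.
            evictedUpTo = Math.max(evictedUpTo, rollback.removeFirst().supersededAt);
        }
        rollback.addLast(new Undo(id, old.writtenAt, revision, old.value));
    }

    synchronized String get(long id, long readRevision) {
        Current cur = mainTable.get(id);
        if (cur != null && cur.writtenAt <= readRevision) return cur.value;
        for (Undo u : rollback) {
            if (u.id == id && u.writtenAt <= readRevision && readRevision < u.supersededAt) {
                return u.value;
            }
        }
        if (readRevision < evictedUpTo) {
            // The version this reader needs may have been pushed out.
            throw new IllegalStateException("snapshot too old: revision " + readRevision);
        }
        return null; // the entry did not exist at that revision
    }

    public static void main(String[] args) {
        RollbackSketch store = new RollbackSketch(1);
        store.put(1L, "v1");
        long rev = store.beginRead();
        store.put(1L, "v2");
        System.out.println("old reader sees: " + store.get(1L, rev));
        store.put(1L, "v3"); // pushes v1 out of the one-slot rollback table
        try {
            store.get(1L, rev);
        } catch (IllegalStateException e) {
            System.out.println("old reader fails: " + e.getMessage());
        }
    }
}
```

The failure check is deliberately conservative: a reader older than anything that has been evicted is rejected even if the specific entry it asks for never had a version pushed out.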

PostgreSQL has implemented the first solution. The biggest advantage is that a read can't fail because an old version has been pushed out, but the database may become huge. You also need a thread to do the cleanup.
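For comparison, a minimal sketch of the first approach, assuming the store tracks which revisions are still pinned by running searches. Again, this is illustrative only and not PostgreSQL's mechanism: VacuumSketch, endRead and vacuum are hypothetical names, deletes are omitted, and everything is guarded by a single lock. Old versions stay next to the current ones, and a cleanup pass drops whatever is older than the version visible to the oldest active reader.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch: old versions kept in place, removed by a cleanup pass.
class VacuumSketch {
    // Entry ID -> (revision -> value written at that revision).
    private final Map<Long, TreeMap<Long, String>> versions = new TreeMap<>();
    // Pinned revision -> number of searches currently reading at it.
    private final TreeMap<Long, Integer> activeReaders = new TreeMap<>();
    private long revision = 0;

    synchronized void put(long id, String value) {
        versions.computeIfAbsent(id, k -> new TreeMap<>()).put(++revision, value);
    }

    // A search registers the revision it reads at...
    synchronized long beginRead() {
        activeReaders.merge(revision, 1, Integer::sum);
        return revision;
    }

    // ...and releases it when the cursor is closed or the search is cancelled.
    synchronized void endRead(long readRevision) {
        activeReaders.computeIfPresent(readRevision, (k, n) -> n == 1 ? null : n - 1);
    }

    synchronized String get(long id, long readRevision) {
        TreeMap<Long, String> history = versions.get(id);
        if (history == null) return null;
        Map.Entry<Long, String> e = history.floorEntry(readRevision);
        return e == null ? null : e.getValue();
    }

    // Drops versions older than the one visible to the oldest active reader.
    // Returns the number of versions removed.
    synchronized int vacuum() {
        long horizon = activeReaders.isEmpty() ? revision : activeReaders.firstKey();
        int removed = 0;
        for (TreeMap<Long, String> history : versions.values()) {
            Long visible = history.floorKey(horizon);
            if (visible == null) continue;                      // nothing old enough
            Map<Long, String> dead = history.headMap(visible);  // strictly older versions
            removed += dead.size();
            dead.clear();                                       // clears the backing map too
        }
        return removed;
    }

    public static void main(String[] args) {
        VacuumSketch store = new VacuumSketch();
        store.put(1L, "v1");
        long rev = store.beginRead();
        store.put(1L, "v2");
        store.put(1L, "v3");
        System.out.println("removed while search is open: " + store.vacuum());
        System.out.println("search still reads: " + store.get(1L, rev));
        store.endRead(rev);
        System.out.println("removed after search ended: " + store.vacuum());
    }
}
```

In a real server vacuum() would presumably be called periodically from a background thread; the sketch calls it by hand to keep the behaviour easy to follow. A search that never calls endRead would hold back the cleanup indefinitely, which is one way the database can get huge.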

In any case, I just wanted to initiate a discussion about this problem and the potential solutions, so feel free to add your vision and knowledge in your response. It would be valuable to define a roadmap for such an implementation, and to discuss the different steps before diving into the code...


Thanks !

--
Regards,
Cordialement,
Emmanuel Lécharny
www.iktek.com
