https://bugs.openldap.org/show_bug.cgi?id=10395
Issue ID: 10395
Summary: Support multiple readers on uncommitted changes
Product: LMDB
Version: unspecified
Hardware: All
OS: All
Status: UNCONFIRMED
Keywords: needs_review
Severity: normal
Priority: ---
Component: liblmdb
Assignee: [email protected]
Reporter: [email protected]
Target Milestone: ---
Created attachment 1086
--> https://bugs.openldap.org/attachment.cgi?id=1086&action=edit
A patch to support multiple readers on uncommitted changes
Hello,
The attached patch is not meant to be merged immediately into LMDB. Still, it
demonstrates how I added a helpful feature to the key-value store: reading
uncommitted changes from multiple transactions. I am conscious that the patch
still requires some work and uses non-C99 features, i.e., atomics are C11,
which could be a blocker for it to be merged upstream. I would also be
delighted to merge these changes under a flag/define to ensure we don't impact
other users with non-C99 stuff.
The main feature we need at Meilisearch is to read uncommitted changes from
multiple threads, compute parallel post-processing of different data structures
[1], and speed up the search requests. We could have done the post-processing
in a following transaction by opening multiple read transactions, but this
would mean that the post-processed data structure would not include newly
inserted or modified document IDs. Both data structures would be desync.
Regarding the design choice, I decided to follow the same design as the nested
write transactions: use the parent argument of the mdb_txn_begin [2], and allow
the MDB_RDONLY flag, which was disallowed when the parent argument was non-NULL
[3]. I find it clear enough that, by calling the mdb_txn_begin function with
these arguments, you can call it multiple times (I need to update the doc) to
obtain nested read-only transactions from the parent write transaction.
ret = mdb_txn_begin(env, parent_txn, MDB_RDONLY, new_nested_rtxn);
Note that this early proposal lacks security and error handling. The generated
transactions are fake-read-only and actually write transactions that share the
underlying parent allocations and data structures. This is unsafe and must be
changed or reviewed carefully, but most importantly, we need to add read-only
capabilities to these transactions to disallow writes. Using a Rust wrapper on
top of LMDB, I wrapped the fake read-only transactions into ReadTxn, which
disallows any writes at compile time. However, I haven't checked the conflict
database creations or openings.
The main issues I encountered were concurrent free of the main shared data
structures when the different threads owning the transactions were dropping the
transactions simultaneously. So, I decided to implement the equivalent of an
ARC to free resources only when the last nested transaction was freed.
I can share numbers about how this feature improves the post-processing by
4x-9x or from 1200s to 120s [1]. You can look at this PR, which I would be
happy to merge once an improved version of this patch lands on LMDB upstream.
I would be very happy if you could guide me a bit on how I could improve this
patch to make it mergeable into LMDB. We want to contribute useful features
like this to LMDB and not keep a deviant fork. LMDB works great; we are happy
about it, and its performance is predictable.
Have a lovely week,
kero
[1]: https://github.com/meilisearch/meilisearch/pull/5307
[2]:
https://github.com/LMDB/lmdb/blob/14d6629bc8a9fe40d8a6bee1bf71c45afe7576b6/libraries/liblmdb/lmdb.h#L979-L1017
[3]:
https://github.com/LMDB/lmdb/blob/14d6629bc8a9fe40d8a6bee1bf71c45afe7576b6/libraries/liblmdb/mdb.c#L3225-L3226
--
You are receiving this mail because:
You are on the CC list for the issue.