Howard Chu via Cyrus-devel wrote:
Bron Gondwana wrote:
> A good example of what I want is the way that the xapianactive file works
> in Cyrus search at FastMail:
> https://blog.fastmail.com/2014/12/01/email-search-system/
> Because only the most recent database is writable (in this case on tmpfs,
> because we don't need 100% reliability for search, it only takes about 20
> minutes to scan every mailbox and reindex the stuff that was on tmpfs after
> a crash)

Also, since you're using tmpfs, this in-memory benchmark is relevant.
http://lmdb.tech/bench/inmem/

> Every other database is read-only - and you can compact multiple of them
> together into a single database and then atomically switch the old ones out
> and the new one in with a single very quick xapianactive rewrite - so it's
> acceptable to stop the world while doing that.

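The atomic switch described above can be sketched roughly as follows. This is only an illustration, not Cyrus code: it assumes the active list is a plain text file of whitespace-separated database names, and the helper names (read_active, write_active, switch_in_merged) and the example database names are hypothetical. The real xapianactive format is not given in this thread.

```python
import os
import tempfile


def read_active(path):
    """Return the database names listed in the active file, in order."""
    with open(path, "r", encoding="utf-8") as f:
        return f.read().split()


def write_active(path, names):
    """Replace the active file in one step: write a temp file, then rename."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temp file must live on the same filesystem for the rename to be atomic.
    fd, tmp = tempfile.mkstemp(dir=directory, prefix=".active-")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as f:
            f.write(" ".join(names) + "\n")
            f.flush()
            os.fsync(f.fileno())
        # Readers see either the old list or the new one, never a partial file.
        os.replace(tmp, path)
    except BaseException:
        if os.path.exists(tmp):
            os.unlink(tmp)
        raise


def switch_in_merged(path, merged_sources, merged_name):
    """Swap several read-only databases for the single merged one."""
    sources = set(merged_sources)
    new_names = []
    placed = False
    for name in read_active(path):
        if name in sources:
            # The merged database takes the position of the first source.
            if not placed:
                new_names.append(merged_name)
                placed = True
        else:
            new_names.append(name)
    write_active(path, new_names)
    return new_names
```

The merge itself can run for as long as it needs against the read-only sources; only the final rename has to happen while everything else is paused, which is why the pause stays short.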
This sounds like a lot of bother, particularly the bit about "checking if
tmpfs is full". It's also a bit confusing because you talk about "compacting"
which I interpret as "cleaning out empty/unused space inside a DB" but in
context it sounds like you really mean "merging" - combining multiple DBs into
a single DB.
If I were building this system with LMDB there would be no separate temp and
meta tiers. LMDB would just mmap the DB on the SSD and let the OS buffer cache
keep the hot pages in RAM. I'm not really sure I'd bother with multiple DBs
either, there's nothing to compact. The data tier would be no different from
the meta tier.
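
The "mmap it and let the OS cache the hot pages" approach can be illustrated without LMDB itself. The following is a minimal sketch using Python's standard mmap module, not LMDB's API; the fixed-width record layout and the function names are assumptions made up for the example.

```python
import mmap
import struct

# Hypothetical layout: each record is one little-endian 8-byte integer.
RECORD = struct.Struct("<Q")


def build_file(path, values):
    """Write fixed-width records to a file standing in for an on-disk DB."""
    with open(path, "wb") as f:
        for v in values:
            f.write(RECORD.pack(v))


def open_readonly(path):
    """Map the whole file read-only; no application-level cache is involved."""
    with open(path, "rb") as f:
        # Length 0 maps the entire file. The mapping stays valid after the
        # file object is closed.
        return mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)


def read_record(mapped, index):
    """Read one record straight out of the mapping."""
    # Touching this offset faults in only the page that holds it; pages that
    # are read often stay in the kernel page cache, cold ones can be evicted.
    return RECORD.unpack_from(mapped, index * RECORD.size)[0]
```

With this pattern there is no separate in-memory tier to size or to check for fullness: the file lives on the SSD and the kernel decides which pages stay in RAM.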
When you say you can reindex the stuff on tmpfs quickly, that means you're
only reindexing the most recent N emails?
--
-- Howard Chu
CTO, Symas Corp. http://www.symas.com
Director, Highland Sun http://highlandsun.com/hyc/
Chief Architect, OpenLDAP http://www.openldap.org/project/