Rich Pieri wrote on 2026-01-22 08:02:

You dropped the "arbitrary". When everything fits neatly into
tables or key/value stores then sure, a database might work. Email
messages are not neat. They are very much like medical records:
arbitrary in size and structure. Few databases can deal well with
this kind of data.

Quite a few actually *can* deal with this kind of data; it's been a
fairly mature field for a decade or more already.


You can read about databases designed to work with unstructured data at
your leisure:

https://en.wikipedia.org/wiki/NoSQL


You can read up on all of Oracle's failures in the EMR space, and the history of WinFS, at your own pace, as examples of this.

So, two companies that are bad at software had failures, so in this
field a database is doomed to failure?  Disagree.


What about Gmail? They use a database (Bigtable, whose design inspired LevelDB, from which RocksDB was forked) and it undeniably works:

Google Infrastructure Supporting Gmail

Technology      Purpose
----------      -------
MapReduce       Processes large volumes of data, such as email indexing
BigTable        Stores structured metadata and user preferences

https://www.dhiwise.com/post/understanding-gmail-architecture-a-comprehensive-guide

In this case, Stalwart is attempting, with funding from the European Commission’s Next Generation Internet programme¹ and GitHub's Secure Open Source Fund², to replace reliance on Gmail / Yahoo / Outlook / iCloud / etc.

With the goal of competing on a scale of

"[...] to host hundreds of millions of email accounts reliably? How do they
store petabytes of messages, survive hardware failures without losing data,
and keep spam at bay across billions of daily deliveries?"

the thought of billions (or millions) of Maildir files per day is laughable.


A *lot* of people use web mail, so searching those millions / billions of messages *must* be fast.
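To make the cost difference concrete, here is a minimal sketch using Python's standard `mailbox` module. It shows that Maildir keeps one file per message, and compares a search that opens every file with a lookup in a prebuilt inverted index. The helper names (`make_maildir`, `scan_search`, `build_index`) and the sample subjects are made up for illustration; this says nothing about how Gmail or Stalwart actually index mail.

```python
import mailbox
import re
import tempfile
from collections import defaultdict
from email.message import EmailMessage
from pathlib import Path


def tokenize(text):
    """Lowercase a string and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", (text or "").lower())


def make_maildir(root, subjects):
    """Create a Maildir at root (must not exist yet), one message per subject."""
    box = mailbox.Maildir(str(root), create=True)
    for i, subject in enumerate(subjects):
        msg = EmailMessage()
        msg["From"] = f"user{i}@example.com"  # hypothetical sender
        msg["Subject"] = subject
        msg.set_content(f"body of message {i}")
        box.add(msg)  # each add() writes one new file under new/
    return box


def scan_search(box, word):
    """Unindexed search: open and parse every message file."""
    hits = [key for key, msg in box.iteritems()
            if word in tokenize(msg["Subject"])]
    return sorted(hits)


def build_index(box):
    """Inverted index: word -> set of message keys. Built once, queried often."""
    index = defaultdict(set)
    for key, msg in box.iteritems():
        for token in tokenize(msg["Subject"]):
            index[token].add(key)
    return index


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp) / "Maildir"
        box = make_maildir(root, ["Invoice for January", "Lunch on Friday"])
        print("files on disk:", len(list((root / "new").iterdir())))
        index = build_index(box)
        print("scan and index agree:",
              scan_search(box, "january") == sorted(index["january"]))
```

The scan touches every file on every query, while the index pays that cost once and then answers each query with a dictionary lookup. A real system would also have to persist the index and keep it current as mail arrives.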

I'm not yet sold on converting to Stalwart, though it's extremely promising, and that feels uncommon these days.


But I've encountered some issues that may be its fault, or the reverse proxy's fault, or KDE's fault, or my fault... all undiagnosed so far.


And the schema for their data is all serialized key/value storage, even in PostgreSQL. That bothers me a bit and I'll have to evaluate some more.
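To illustrate the trade-off, here is a minimal sketch of the same data stored two ways: as serialized values in a generic key/value table, and as ordinary typed columns. This is NOT Stalwart's actual schema. The table names, key layout, and JSON encoding are all hypothetical, and `sqlite3` stands in for PostgreSQL only so the example is self-contained.

```python
import json
import sqlite3


def make_db():
    """Create both layouts side by side in an in-memory database."""
    conn = sqlite3.connect(":memory:")
    # Layout 1 (hypothetical): opaque key and opaque serialized value.
    conn.execute("CREATE TABLE kv (k BLOB PRIMARY KEY, v BLOB NOT NULL)")
    # Layout 2 (hypothetical): conventional relational columns.
    conn.execute(
        "CREATE TABLE messages ("
        " account_id INTEGER, message_id INTEGER, subject TEXT, size INTEGER,"
        " PRIMARY KEY (account_id, message_id))"
    )
    return conn


def kv_key(account_id, message_id):
    """Big-endian key so one account's messages sort contiguously."""
    return account_id.to_bytes(4, "big") + message_id.to_bytes(4, "big")


def store(conn, account_id, message_id, subject, size):
    """Write the same record into both layouts."""
    value = json.dumps({"subject": subject, "size": size}).encode("utf-8")
    conn.execute("INSERT INTO kv (k, v) VALUES (?, ?)",
                 (kv_key(account_id, message_id), value))
    conn.execute("INSERT INTO messages VALUES (?, ?, ?, ?)",
                 (account_id, message_id, subject, size))


def large_messages_kv(conn, account_id, min_size):
    """Key/value layout: range-scan the key prefix, then decode and filter
    in application code, because the database cannot see inside v."""
    lo = account_id.to_bytes(4, "big")
    hi = (account_id + 1).to_bytes(4, "big")
    rows = conn.execute(
        "SELECT k, v FROM kv WHERE k >= ? AND k < ? ORDER BY k", (lo, hi))
    return [int.from_bytes(k[4:], "big")
            for k, v in rows if json.loads(v)["size"] >= min_size]


def large_messages_sql(conn, account_id, min_size):
    """Relational layout: the database does the filtering itself."""
    rows = conn.execute(
        "SELECT message_id FROM messages"
        " WHERE account_id = ? AND size >= ? ORDER BY message_id",
        (account_id, min_size))
    return [row[0] for row in rows]
```

The key/value layout ports easily between backends (an embedded store, an SQL server, and so on), but ad hoc SQL queries, constraints, and secondary indexes on the fields inside the value are no longer available from the database itself.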


What these developers have created in a short time is so highly polished and feature complete that I have nothing but respect for their abilities and reasonable confidence in their skills & choices.

We'll see.

Their discussion / recommendations on data storage are here:

https://stalw.art/docs/install/store/


¹ https://stalw.art/blog/nlnet-grant-collaboration/

² https://stalw.art:8443/blog/github-ossf#about-githubs-ossf
