Not that anyone cares or doesn't know, but…
File systems seem simple (store blobs of arbitrary, opaque
data--files--that have arbitrary text strings of limited length for
names, and are usually organized in a hierarchy, plus a few other
features) but they are complicated, must be reliable, and are hard
to get right.
Databases care very much about the data they store, care deeply
about the "naming" of the data, usually offer complicated ways of
organizing data, offer a more complicated set of features than file
systems do, must be reliable, and are hard to get right.
They are different. A common case is to use them together. Say you have
a movie collection: store the metadata in a database (title, genre,
date, running time, director, actors, screenwriter, MPEG path), and
index most of that metadata to make it easy to search for things. But
the fundamental data itself, the stuff we really care about (in this
example MPEG data), is probably not going to be stored in the database.
It will be opaque blobs--files--hashed into a directory path and stored
in a file system. Use the database for the stuff it is good at, use the
file system for the stuff it is good at. Appreciate the difference.
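A rough sketch of that split, in Python, might look like the following.
Everything named here is made up for illustration: the table layout, the
two-level directory fan-out, and the helper names (open_catalog,
blob_path, store_movie, find_by_director) are assumptions, not anybody's
real system. SQLite stands in for "the database" only because it ships
with Python.

```python
import hashlib
import sqlite3
import tempfile
from pathlib import Path


def open_catalog() -> sqlite3.Connection:
    """Create a throwaway in-memory catalog (hypothetical schema)."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE movies (title TEXT, director TEXT, mpeg_path TEXT)")
    # Index the metadata we expect to search on.
    db.execute("CREATE INDEX movies_by_director ON movies (director)")
    return db


def blob_path(root: Path, digest: str) -> Path:
    """Hash the blob into a directory path: root/ab/cd/abcd...

    Fanning out on the leading hex digits keeps any one directory small.
    """
    return root / digest[:2] / digest[2:4] / digest


def store_movie(db, root: Path, title: str, director: str, data: bytes) -> Path:
    """Write the opaque bytes to a file, then record metadata plus path."""
    digest = hashlib.sha256(data).hexdigest()
    path = blob_path(root, digest)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_bytes(data)
    db.execute(
        "INSERT INTO movies (title, director, mpeg_path) VALUES (?, ?, ?)",
        (title, director, str(path)),
    )
    db.commit()
    return path


def find_by_director(db, director: str) -> list:
    """Search the metadata; the database never looks inside the blob."""
    rows = db.execute(
        "SELECT title, mpeg_path FROM movies WHERE director = ? ORDER BY title",
        (director,),
    )
    return rows.fetchall()
```

Something like store_movie(open_catalog(), Path(tempfile.mkdtemp()),
"Some Title", "Some Director", data) would leave the bytes in a file and
only a row of metadata in the database. Note that the file write and the
row insert are two separate steps with nothing tying them together if
one fails, which is part of why this is harder than it looks.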
Even if one gets into new "AI" stuff where the database might know much
more about the data and can search on lots of fuzzy internal details,
this extra knowledge is still just a kind of indexing, and the
fundamental data will still be stored out in files.
In the case of e-mail there is naturally a bunch of structured metadata,
and it is well suited to storing in a database. There might be
interesting indexing on the contents of e-mail, but the bodies of the
messages (which might range from several bytes to megabytes long and
might be any kind of text-represented data) are really well suited to
live in files. Maybe the e-mail system wants to digest various standards
for attachments, but those are even more suited to be stored in files.
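For the e-mail case, a minimal sketch (again Python, again with a
made-up schema) could pull the headers into a table and drop the body
into a file. The messages table, the spool directory, and the names
open_mailbox and file_message are all hypothetical; the parsing uses the
standard library's email package.

```python
import sqlite3
import tempfile
from email import message_from_bytes, policy
from pathlib import Path

# A tiny sample message, invented for this example.
RAW = (
    b"From: alice@example.com\r\n"
    b"To: bob@example.com\r\n"
    b"Subject: lunch\r\n"
    b"\r\n"
    b"Noon at the usual place?\r\n"
)


def open_mailbox() -> sqlite3.Connection:
    """Create a throwaway in-memory table for message metadata."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE messages ("
        "id INTEGER PRIMARY KEY, sender TEXT, recipient TEXT, "
        "subject TEXT, body_path TEXT)"
    )
    return db


def file_message(db, spool: Path, raw: bytes) -> Path:
    """Headers go in the database; the body goes out to a file."""
    msg = message_from_bytes(raw, policy=policy.default)
    cur = db.execute(
        "INSERT INTO messages (sender, recipient, subject) VALUES (?, ?, ?)",
        (str(msg["From"]), str(msg["To"]), str(msg["Subject"])),
    )
    # Name the body file after the row id so the two can be matched up.
    path = spool / f"{cur.lastrowid}.txt"
    part = msg.get_body(preferencelist=("plain",))
    body = part.get_content() if part is not None else ""
    path.write_text(body, encoding="utf-8")
    db.execute(
        "UPDATE messages SET body_path = ? WHERE id = ?",
        (str(path), cur.lastrowid),
    )
    db.commit()
    return path
```

Attachments would get the same treatment as the body here, each written
to its own file with only a path kept in the database. This sketch
skips them.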
Different file systems will have different features, and different
databases will, too. In some cases the two will blur into each other,
but they should be thought about differently as one chooses how to use
one or the other in any design.
Don't underestimate how hard a file system is to make. My wife's work
uses Google web apps but they might switch to MS's competing products
(horrors). One of the complaints is that Google can't reliably store
"files": they move around and get lost. Maybe Google stores the files
themselves as blobs out in a file system, but all the metadata about the
file, including the simulated "location" that is presented to the user,
is being stored in a database. And it is hard to get that right.
(Particularly in this world of continuous integration/continuous
deployment, which worships feature velocity, doesn't design things
before building them, and fired the QA department.)
-kb, the Kent who will shut up now.
_______________________________________________
Discuss mailing list
https://lists.blu.org/mailman/listinfo/discuss