On Fri, Oct 22, 2021 at 8:39 AM Miles Malone
<[email protected]> wrote:
>
> small files... (Certainly don't quote me here, but wasn't JFS the king
> of that back in the day?  I can't quite recall)
>

Deletion is lightning fast on lizardfs thanks to garbage collection,
but metadata on lizardfs is expensive, requiring RAM on the master
server for every inode.  I'd never use it for lots of small files.

My lizardfs master is using 609MiB for 1,111,394 files (the bulk of
which are in snapshots, which create a record for every file they
contain, so if you snapshot 100k files you end up with 200k records).
That works out to roughly 575 bytes per file; figure 1kB per file to
be safe.  Not a big deal if you're storing large files (which is what
I'm mostly doing).  Performance isn't eye-popping either - I have no
idea how well it would work for something like a build system where
IOPS matter.  For bulk storage of big stuff, though, it is
spectacular, and scales very well.
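
If you want to size a master ahead of time, here's a back-of-the-envelope
sketch in Python.  The ~1kB/record figure is just the conservative number
from above, not anything lizardfs guarantees, and the snapshot term is
the same doubling effect I described:

    # Rough lizardfs master RAM estimate.  Assumes ~1 KiB of master
    # RAM per metadata record (hedged from my own master: 609 MiB /
    # 1,111,394 records is roughly 575 bytes each).
    BYTES_PER_RECORD = 1024  # conservative assumption

    def master_ram_gib(files, snapshots=0):
        # Each snapshot adds a record for every file it contains.
        records = files * (1 + snapshots)
        return records * BYTES_PER_RECORD / 2**30

    # e.g. 10M files plus one snapshot -> ~19 GiB of master RAM
    print(master_ram_gib(10_000_000, snapshots=1))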

Cephfs also uses delayed deletion.  I have no idea how well it
performs, or what its metadata costs are, though I suspect it is a
lot smarter about RAM requirements on the metadata server.  Well,
maybe - in the past, at least, it wasn't all that smart about RAM
requirements on the object storage daemons.  I'd seriously look at it
if doing anything new.

Distributed filesystems tend to garbage-collect deletions simply
because of latency.  There are data integrity benefits to synchronous
writes, but there is rarely much benefit to blocking on deletions, so
why do it?  These filesystems already need all kinds of
synchronization machinery to cope with node failures, so deferring
deletions and reconciling them through that machinery is just a
logical design.
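
To make that concrete, here's a toy sketch of the deferred-deletion
pattern in Python (not how lizardfs or Ceph actually implement it;
free_chunk_on_node is a hypothetical stand-in for a data-node RPC):

    import queue
    import threading

    to_reclaim = queue.Queue()      # deletions deferred for GC

    def free_chunk_on_node(chunk):  # hypothetical data-node RPC
        print("reclaimed", chunk)

    def unlink(metadata, name):
        # Client-visible unlink: drop the metadata entry and return
        # immediately, without blocking on (possibly down) data nodes.
        to_reclaim.put(metadata.pop(name))

    def gc_worker():
        # Background collector reclaims chunks at its own pace; it can
        # retry around node failures using the same machinery the
        # filesystem already needs for writes.
        while True:
            inode = to_reclaim.get()
            for chunk in inode["chunks"]:
                free_chunk_on_node(chunk)
            to_reclaim.task_done()

    threading.Thread(target=gc_worker, daemon=True).start()
    meta = {"bigfile": {"chunks": ["c1", "c2"]}}
    unlink(meta, "bigfile")  # returns instantly
    to_reclaim.join()        # demo only: wait for the collector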

Among conventional filesystems, log-structured designs are naturally
garbage-collected, but those can have issues of their own.
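
The log-structured point in miniature (a toy model, not any real
on-disk format): deleting is just appending a tombstone record, and
the real reclamation cost is paid later by the cleaner:

    log = []  # the append-only log

    def write(name, data):
        log.append(("write", name, data))

    def delete(name):
        # Deletion is just another append (a tombstone); nothing is
        # freed in place, so it's trivially fast.
        log.append(("delete", name, None))

    def clean():
        # The cleaner replays the log, keeps only live data, and
        # rewrites it compactly - this is where the cost moved to.
        live = {}
        for op, name, data in log:
            if op == "write":
                live[name] = data
            else:
                live.pop(name, None)
        return [("write", n, d) for n, d in live.items()]

    write("a", b"x" * 10)
    delete("a")
    log[:] = clean()  # "a" and its tombstone are gone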

-- 
Rich
