Re: [sqlite] presentation about ordering and atomicity of filesystems

Nico Williams Mon, 15 Sep 2014 14:45:12 -0700

On Fri, Sep 12, 2014 at 6:47 PM, James K. Lowden
<jklow...@schemamania.org> wrote:
> On Fri, 12 Sep 2014 19:38:53 +0100
> Simon Slavin <slav...@bigfraud.org> wrote:
>
>> I don't think it can be done by trying to build it on top of an
>> existing file system.  I think we need a file system (volume format,
>> drivers, etc.) built from the ground up with
>> atomicity/ACID/transactions in mind.  Since the greatest of these is
>> transactions, I want a transactional file system.
>
> Funny you should mention that.  About 6 years ago Mike McKusick gave a
> presentation on then-recent updates to FFS in FreeBSD, including the
> birthdate.  Among other things, I remember he explored using a tree
> instead of an array for a directory file, but found that because the
> vast majority of directories hold a small number of names, the overall
> performance is better with a simple array.


ZFS uses a hash table for this.

> I asked your question: why not add transactions to FFS?
>
> His answer: that's the province of a database.

I agree, but the filesystem ought to provide a write barrier, and it
ought to provide an async fsync() with event completion notification.
That should be enough to implement high-performance ACID at the
application layer.
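
To illustrate what "async fsync() with event completion notification" could look like from an application's point of view, here is a minimal Python sketch. It only emulates the idea in user space by running the blocking os.fsync() on a worker thread. The helper name fsync_async and its callback signature are assumptions made for this illustration, not an existing interface.

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor

# Assumption: a small thread pool stands in for a kernel-level async fsync().
# A real interface would not need one thread per outstanding request.
_pool = ThreadPoolExecutor(max_workers=4)

def fsync_async(fd, on_complete):
    """Hypothetical helper: start fsync(fd) in the background and call
    on_complete(fd, error) once it has finished (error is None on success)."""
    def _work():
        error = None
        try:
            os.fsync(fd)
        except OSError as exc:
            error = exc
        on_complete(fd, error)
    return _pool.submit(_work)

def demo_async_fsync():
    events = []
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, b"journal record\n")
        future = fsync_async(fd, lambda f, err: events.append((f, err)))
        # The application is free to do other work here.
        future.result()  # block only at the point where durability is required
    finally:
        os.close(fd)
        os.unlink(path)
    return events
```

The point of the design is that the caller learns when its data is durable without stalling on every sync.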

ZFS provides a write barrier, but it's fsync(), so if you want it to
go fast you must either disable sync writes (oof) or use a fast intent
log device (oof).

With a first-class write barrier we could have both it and synchronous writes.

That's what the proposed osync() is all about, and I say godspeed to them!
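
As a rough sketch of where an ordering-only barrier would fit, the Python below shows a simple commit sequence that today has to use fsync() for ordering as well as durability. The function ordering_barrier is a hypothetical stand-in invented for this example; it falls back to os.fsync() because no portable ordering-only call is assumed to exist. The commit-marker format is likewise an assumption.

```python
import os
import tempfile

def ordering_barrier(fd):
    # Assumption: hypothetical helper. With an ordering-only primitive this
    # call would constrain write order without waiting for stable storage;
    # here it falls back to fsync(), which pays the full durability cost.
    os.fsync(fd)

def commit(fd, payload):
    """Append payload, then a commit marker, with a barrier in between."""
    os.write(fd, payload)
    ordering_barrier(fd)       # payload must reach disk before the marker
    os.write(fd, b"COMMIT\n")
    os.fsync(fd)               # durability point for the whole transaction

def demo_commit():
    fd, path = tempfile.mkstemp()
    try:
        commit(fd, b"update row 1\n")
        with open(path, "rb") as f:
            return f.read()
    finally:
        os.close(fd)
        os.unlink(path)
```

Only the first call is a candidate for replacement by a cheaper barrier; the final fsync() is still needed wherever the application must know the transaction is durable.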

Of course, the biggest problem with new filesystem interfaces is
adoption, but here a handful of apps (e.g., SQLite3) could adopt
osync() very quickly, vastly improving safety and performance for a
great many users.

Nico
--
_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
