On 18 Jun 2008, Juliusz Chroboczek outgrape:

>>> All modern file-systems use either B-trees or B*-trees for
>>> directories.
>
>> As an aside, ext[234]fs do not, nor will they ever as far as I know. FFS
>> does not. A lot of Polipo-using systems probably run atop those :/
>
> Indeed, and I cannot understand why filesystems still use linear
> search in directories.  I realise that system designers don't want to
> use bleeding-edge algorithms, but hashtables have been known since the
> late 1950s, B-tress since the early 1970s.

I think the constraints of NFS support (-> a 32-bit cookie must be able
to hold all state, persistently, across reboots), and the horrible and
almost totally unused seekdir()/telldir() syscalls have a nasty effect
on tree-based filesystems. :/ plus, say what you like about ext2, it's
simple and will work even on nanoscopic systems. (Of course, these
days such systems probably don't have any rotating storage at all,
making ext2 inappropriate...)

>> (ext3fs and ext4fs have a hack whereby filenames are reordered such
>> that they can be looked up by means of a hash lookup, which is just
>> about as good for most purposes.)
>
> And modern versions of Berklix will load the whole directory in
> memory, where it is stored as a hash table.  But I still don't
> understand why they're avoiding a proper data structure in the on-disk
> format.

Simplicity of recovery is part of it. ext[234] fsck is *very* good at
recovering data from horrendously mangled filesystems, and we still have
the horrible example of reiser3 fsck to tell us how not to do
it. Everyone agrees, these days, that the right way to do this is to
embed integrity management into the filesystem rather than needing
anything that sweeps over the whole fs at any time, but it's hard to
rejig existing fses that way.

I'm not aware of any *new* filesystem development that's not tree-based:
ext4 is an incremental development of ext3 rather than something new.
Among other things, doing flash-based storage robustly pretty much
requires tree-based filesystems working sort of like a cross between the
old trees-on-tape schemes from Knuth (so the representation is linear
and makes maximally efficient use of the relatively large eraseblock-
sized writes) and the way functional languages do mutable graphs, with
rewriting from the leaves up on changes (but I don't need to tell *you*
about that!)

(btrfs is especially interesting to me. Having unbounded numbers of
nearly-free filesystem/directory snapshots seems neat.)

>> That's the performance problem in Firefox, for what it's worth: the db
>> fsync()s fairly frequency,
>
> I don't quite agree with your analysis.  The performance problem with
> Firefox is that it's written by a bunch of people who, when they have
> a problem, think ``I know, I'll use a database.''  Now they have two
> problems.

Well, yeah, that too. I entirely agree with JWZ's opinions on this
subject. (I do relational DB stuff for a living. I *know* how much it
sucks.)

>   http://www.jwz.org/doc/mailsum.html

But it's better than what they were using before! Mind you, anything
else would be impossible. `mork' is the single most hairbrained file
format I have ever heard of, bar none. It even beats financial market
protocols, which generally start at `demented' and go up in aggravation
from there. Read and weep:

<http://jwz.livejournal.com/312657.html>


... anyway, this is way off topic for the list now and the hayfever
drugs are making me babble, so having horrified you with that I'll shut
up.

-------------------------------------------------------------------------
Check out the new SourceForge.net Marketplace.
It's the best place to buy or sell services for
just about anything Open Source.
http://sourceforge.net/services/buy/index.php
_______________________________________________
Polipo-users mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/polipo-users

Reply via email to