On 6/30/20 4:53 AM, D. Hugh Redelmeier via talk wrote:
> Warning: it is the middle of the night and I'm going to ramble.

[snip]

> The following are some random thoughts about filesystems.  I'm
> interested in any reactions to these.
>
> The UNIX model of a file being a randomly-accessed array of fixed-size
> blocks doesn't fit very well with compression.  Even if a large
> portion of files are accessed purely as a byte stream.  That's perhaps
> a flaw in UNIX but it is tough to change.
Fixed-size disk blocks are not just a UNIX thing.
Admittedly I have not seen all the hardware out there, but outside of some very old or very new gear I do not believe there has been a disk drive that was not formatted with a fixed block size.

Think of it from a hardware perspective: if you have variable-sized blocks you need to manage fragmentation, and then likely some form of free-space cleanup. That level of compute power was not available in a disk controller until fairly recently, by which time the standard design was so entrenched that other layouts could not gain enough traction to be worth designing and trying to sell.
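
To make the fragmentation point concrete, here is a toy Python sketch (not any real controller's logic) of a store that packs variable-sized records end to end: a first-fit allocator whose free list fragments as records of different sizes come and go, which is exactly the bookkeeping a fixed-block format avoids.

    class VariableBlockStore:
        """Toy flat 'disk' holding variable-length records."""
        def __init__(self, size):
            self.free = [(0, size)]     # (offset, length) of free extents
            self.used = {}              # record id -> (offset, length)

        def write(self, rec_id, data):
            # First-fit search; a real controller would eventually need
            # compaction ("free space cleanup") when no single extent is
            # large enough even though the total free space is.
            for i, (off, length) in enumerate(self.free):
                if length >= len(data):
                    self.used[rec_id] = (off, len(data))
                    if length > len(data):
                        self.free[i] = (off + len(data), length - len(data))
                    else:
                        del self.free[i]
                    return off
            raise IOError("free space too fragmented; compaction needed")

        def delete(self, rec_id):
            off, length = self.used.pop(rec_id)
            self.free.append((off, length))   # naive: no coalescing of extents

With fixed 512-byte sectors that free list collapses into a simple bitmap, which even a very modest controller can manage.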

For example, there are disk drives with Ethernet interfaces and a native object store; the Seagate Kinetic drives never seemed to get beyond the sampling phase.

> In modern systems, with all kinds of crazy containerization, I guess
> de-duplication might be very useful.  As well as COW, I think.  Is
> this something for the File System, or a layer below, like LVM?
I have a problem with de-duplication: I am not sure how well it actually works in practice.
At the file-system level it is just linking the two identical files together until one of them is changed, so you need COW.
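
Roughly what that file-level approach amounts to is the sketch below (a hypothetical script in the spirit of tools like fdupes/hardlink; a real tool would byte-compare candidates rather than trust the hash alone):

    import hashlib, os, sys
    from collections import defaultdict

    def dedup_tree(root):
        """Hash every file under root and hard-link byte-identical copies.
        Without COW underneath, a later in-place write to one name silently
        changes the other, which is exactly why COW is needed here."""
        by_hash = defaultdict(list)
        for dirpath, _dirs, files in os.walk(root):
            for name in files:
                path = os.path.join(dirpath, name)
                with open(path, 'rb') as f:
                    digest = hashlib.sha256(f.read()).hexdigest()
                by_hash[digest].append(path)
        for paths in by_hash.values():
            keep, dups = paths[0], paths[1:]
            for dup in dups:
                os.unlink(dup)
                os.link(keep, dup)

    if __name__ == '__main__':
        dedup_tree(sys.argv[1])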

At the block level you have to look at the overhead of the hash function and the storage of the hash data. The size of the hash key governs the likelihood of a false duplicate match, but the overhead depends on the block size. Candidate duplicate blocks still need to be compared byte for byte, which causes extra reads.
Let's say you use SHA-2 (SHA-256) for your hash: that is a 32-byte key, and with a 512-byte block size the hash table is about a 6% overhead. If you go for larger blocks you will get fewer hits, because filesystems want to allocate smaller blocks for small-file efficiency.
If you use LVM extents then the hit rate drops even more.
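
Back-of-the-envelope numbers for that overhead, assuming nothing in the index beyond one 32-byte digest per block (so this is a lower bound):

    # Overhead of one 32-byte SHA-256 digest per block, as a share of the data.
    DIGEST = 32
    for block in (512, 4096, 65536):
        print("%6d-byte blocks: %5.2f%%" % (block, 100.0 * DIGEST / block))
    # 512 -> 6.25%, 4096 -> 0.78%, 65536 -> 0.05%
    # Bigger blocks shrink the index, but two big blocks are far less likely
    # to be byte-identical, so the hit rate falls off at the same time.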

It may work well where you have a large number of VMs, where the disk images tend to start out all the same and where the OS will tend to stay static, leaving large parts of the disk untouched for a long time. It may also be possible to develop file systems that are amenable to de-duplication.



> There's something appealing about modularizing the FS code by
> composable layers.  But not if the overhead is observable.  Or the
> composability leaves rough edges.
>
> Here's a natural order for layers:
>         FS (UNIX semantics + ACLs etc, more than just POSIX)
>         de-duplication
>         compression
>         encryption
>         aggregation for efficient use of device?
This appears to be what Red Hat is pushing with their VDO (Virtual Data Optimizer).
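
The layering idea can be made concrete as objects that each wrap the layer below and expose the same block interface. A minimal sketch (invented interface, with zlib and SHA-256 standing in for whatever a real stack would use):

    import zlib, hashlib

    class RamStore:                       # bottom layer: the "device"
        def __init__(self): self.blocks = {}
        def put(self, n, data): self.blocks[n] = data
        def get(self, n): return self.blocks[n]

    class CompressLayer:                  # variable-sized output goes below
        def __init__(self, lower): self.lower = lower
        def put(self, n, data): self.lower.put(n, zlib.compress(data))
        def get(self, n): return zlib.decompress(self.lower.get(n))

    class ChecksumLayer:                  # catch corruption on the way back up
        def __init__(self, lower): self.lower = lower
        def put(self, n, data):
            self.lower.put(n, hashlib.sha256(data).digest() + data)
        def get(self, n):
            raw = self.lower.get(n)
            digest, data = raw[:32], raw[32:]
            if hashlib.sha256(data).digest() != digest:
                raise IOError("block %d failed its checksum" % n)
            return data

    # Compose top-down: checksums over compression over the raw device.
    backend = ChecksumLayer(CompressLayer(RamStore()))
    backend.put(0, b"hello" * 100)
    assert backend.get(0) == b"hello" * 100

The appeal is that reordering the stack is just reordering the constructors; the rough edges presumably show up when a layer needs information (block liveness, TRIM, keys) that the common interface does not carry.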

> I don't know where to fit in checksums.  Perhaps it's a natural part
> of encryption (encryption without integrity checking has interesting
> weaknesses).
Nothing beats dd if=/dev/zero of=your_secret_file for security ;)
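
More seriously, on "encryption without integrity checking": authenticated-encryption (AEAD) modes bundle the integrity check with the ciphertext, which is one answer to where checksums fit. A sketch using AES-GCM from the pyca/cryptography library (the key handling here is obviously nothing like what a real filesystem would do):

    import os
    from cryptography.hazmat.primitives.ciphers.aead import AESGCM

    key = AESGCM.generate_key(bit_length=256)
    aead = AESGCM(key)

    block_no = 42
    nonce = os.urandom(12)                   # must never repeat for a given key
    plaintext = b"contents of block 42"

    # Feeding the block number in as associated data means a ciphertext
    # block copied to a different location fails authentication there.
    ciphertext = aead.encrypt(nonce, plaintext, str(block_no).encode())

    # Any bit flip in the ciphertext (or a wrong block number) raises
    # InvalidTag on decrypt, so the integrity check comes with the encryption.
    assert aead.decrypt(nonce, ciphertext, str(block_no).encode()) == plaintext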


> I don't know how to deal with the variable-sized blocks that come out
> of compression.  Hardware has co-evolved with file-systems to expect
> blocks of 512 or 4096 bytes.  (I remember IBM/360 disk drives which
> supported a range of block sizes as if each track was a short piece of
> magnetic tape.)
Move from disks to object stores (key/value).
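
In key/value terms the variable size stops being a problem, because the store tracks each value's length for you. A sketch of the idea, with a plain dict standing in for something like a Kinetic drive's put/get interface:

    import zlib

    class ObjectStore:
        """Stand-in for a key/value device: values may be any length."""
        def __init__(self): self._kv = {}
        def put(self, key, value): self._kv[key] = value
        def get(self, key): return self._kv[key]

    def write_block(store, block_no, data):
        # The compressed blob is whatever size it happens to be; nothing
        # needs rounding up to 512 or 4096 bytes, and no padding to track.
        store.put(("block", block_no), zlib.compress(data))

    def read_block(store, block_no):
        return zlib.decompress(store.get(("block", block_no)))

    store = ObjectStore()
    write_block(store, 7, b"A" * 4096)       # compresses down to a few bytes
    assert read_block(store, 7) == b"A" * 4096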

> I don't know how to have file systems more respectfully reflect the
> underlying nature of SSDs and shingled HDDs
>
> I also am still waiting for translucent mounts like Plan 9.
How would translucent mounts compare to overlay mounts?
> I think that many or most drives do whole-volume encryption invisible
> to the OS.  This really isn't useful to the OS since the whole volume
> has a single key.
>
> The most secure encryption is end-to-end.  It tends to be less
> convenient.  Maybe my placement of encryption near the bottom of the
> stack isn't good enough.
I would argue that encryption should be as high in the stack as possible.
Encrypting the disk provides "at rest" security, so when the drives are sold off at the bankruptcy sale the buyer cannot use the data. It does not stop a hacker who has gained access to the running system from dumping the database of credit card info.
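
The "high in the stack" version of that looks less like a block-device feature and more like the application encrypting sensitive fields before they ever reach storage. A sketch with Fernet from pyca/cryptography (the field names are invented for the example, and the key would really live in a key service, not beside the data):

    from cryptography.fernet import Fernet

    app_key = Fernet.generate_key()          # held by the application/key service
    f = Fernet(app_key)

    def store_card(db, customer_id, card_number):
        # Only ciphertext ever reaches the database, filesystem or disk,
        # so dumping the table gets an attacker nothing without the key.
        db[customer_id] = f.encrypt(card_number.encode())

    def load_card(db, customer_id):
        return f.decrypt(db[customer_id]).decode()

    db = {}                                  # stand-in for the real database
    store_card(db, "cust-1", "4111111111111111")
    assert load_card(db, "cust-1") == "4111111111111111"

Of course this just moves the problem to key management.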

[snip]

--
Alvin Starr                   ||   land:  (647)478-6285
Netvel Inc.                   ||   Cell:  (416)806-0133
al...@netvel.net              ||

