On Mon, Jun 24, 2019 at 07:45:23AM +0930, Brett Lymn wrote:
> On Sat, Jun 22, 2019 at 03:13:21PM +0200, tlaro...@polynum.com wrote:
> >
> > Is there something like that existing? the idea being to combine
> > as much as possible existing facilities and just to insert a simple
> > client/server encapsulating "disk" data at the right place (the
> > pseudo-device) to make it work.
> >
>
> Have you looked at coda?  It is a disconnectable file system that will
> automatically cache files locally and allow modifications while the file
> server is unavailable, when the server is available again it will write
> the changes back to the server.

I have read the presentation of its features (I also have a copy of a
book on AFS that I still need to read), so it has features that I'm
after, but it also lacks one: taking into account the reality of disks
and the need for asymmetry. RAID1 puts an equal burden on both disks,
meaning that to serve one piece of data you stress two disks. Either
you use two high-end server disks, which is a waste, or you use two
desktop/terminal disks and spend your time replacing disks, praying
that they do not both die at the same moment.

I have in mind a simple block upon which one could imagine building
distributed data. This fundamental and elementary block is a couple of
storages (I say "storage" because the word does not say that these are
"disks", nor where they are). It is a "couple" because it is
asymmetric: (a,b) != (b,a) (I can already hear the politically correct
gangs shouting...).

A couple is thus composed as: ( (st1, /dev/null), (st2, /dev/null) ).
/dev/null is here because it is the most reliable and vast storage
ever: you can put in it whatever you personally don't care to keep. It
is here because on both members there can be filters (files you
replicate and files you don't).

The fundamental feature is that the couple is Write Always, Read
Perhaps. The first member always writes and reads (potentially to/from
/dev/null). The second member always writes, but reads only on failure
of the first member. Of course a write to /dev/null will always
succeed; a read from /dev/null will always fail. This means that,
depending on the filters passed, data can be duplicated or not (there
can also be temporary memory files that are not kept in any member).

One can see that distributed file systems and disconnection can be
handled with this element: since writes are (depending on filters)
always duplicated, if one member is a "local" storage, there is always
(if not discarded by a filter) a local copy of written data on the
node.
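
The Write Always, Read Perhaps rule described above can be sketched in
a few lines of user-space Python. This is only an illustration of the
semantics, not the pseudo-device implementation: the names `Member`,
`Couple`, `accept` and `failed` are hypothetical, storage is a plain
dictionary, and the choice to report a write as failed only when both
members fail is an assumption that the description leaves open.

```python
from typing import Callable, Dict


class Member:
    """One member of the couple: a storage plus a filter.

    Keys rejected by the filter go to "/dev/null": the write succeeds,
    nothing is kept, and a later read of that key fails.
    """

    def __init__(self, name: str,
                 accept: Callable[[str], bool] = lambda key: True) -> None:
        self.name = name
        self.accept = accept              # hypothetical filter predicate
        self.store: Dict[str, bytes] = {}
        self.failed = False               # set to True to simulate a dead device

    def write(self, key: str, data: bytes) -> None:
        if self.failed:
            raise OSError(f"{self.name}: device failure")
        if self.accept(key):
            self.store[key] = data
        # otherwise: discarded, like a write to /dev/null (always succeeds)

    def read(self, key: str) -> bytes:
        if self.failed:
            raise OSError(f"{self.name}: device failure")
        if key not in self.store:
            # like a read from /dev/null: always fails
            raise KeyError(key)
        return self.store[key]


class Couple:
    """Ordered pair (first, second): Write Always, Read Perhaps."""

    def __init__(self, first: Member, second: Member) -> None:
        self.first = first
        self.second = second

    def write(self, key: str, data: bytes) -> None:
        # Write Always: both members are asked to write.
        # Assumption: the write is reported as failed only if both fail.
        errors = []
        for member in (self.first, self.second):
            try:
                member.write(key, data)
            except OSError as exc:
                errors.append(exc)
        if len(errors) == 2:
            raise OSError("both members failed to write")

    def read(self, key: str) -> bytes:
        # Read Perhaps: the second member is read only if the first fails.
        try:
            return self.first.read(key)
        except (OSError, KeyError):
            return self.second.read(key)


# Usage: an unfiltered fast member and a filtered backup member.
fast = Member("fast")
backup = Member("backup", accept=lambda key: not key.endswith(".tmp"))
couple = Couple(fast, backup)
couple.write("report.txt", b"kept twice")
couple.write("scratch.tmp", b"kept once")

fast.failed = True                    # the first member dies
print(couple.read("report.txt"))      # served by the backup member
```

Swapping the two members changes the behaviour (the filtered member
would then be read first), which is the (a,b) != (b,a) property.
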

So if the other component is a remote file service, the disconnected
client can work. In the same spirit, the first component can hold a lot
of data, while the local storage will hold only what is written by the
node (another asymmetry, this time in size). In case of disaster on a
server holding "all" the files, the files are (if the disaster is
handled in a decent time) all in the local storages, spread around.
Not forever: in my mind the local storage is limited and can be
recycled, but when it is almost full the administrator has to be sure
the files are in the backups before deleting them and reclaiming space.

For my first application, there is no distributed file system. Such a
couple will be put upstream, on the file server, so that high-end disks
serve as the first member (write and read always, without a filter)
while the second member consists of decent but cheap desktop disks for
backup that write always (filtered; not everything is backed up) and
read only exceptionally, on failure.

The question is where to put this logic (I think at the pseudo-device
level) so that it can be implemented with a minimum of code and kernel
changes, more complicated things (distributed, replicated, fault
tolerant, etc.) being developed on top of this fundamental element, but
in user space.

Best,
-- 
Thierry Laronde <tlaronde +AT+ polynum +dot+ com>
http://www.kergis.com/
http://www.sbfa.fr/
Key fingerprint = 0FF7 E906 FBAF FE95 FD89 250D 52B1 AE95 6006 F40C