[reiserfs-list] First there were Files, then Attributes, then XML Storage and in the Far Future - Objects

Alexander G. M. Smith Sat, 05 Jan 2002 10:25:18 -0800

Philipp Gühring <[EMAIL PROTECTED]> wrote on Sat, 5 Jan 2002 03:35:23 +0100:
> Then I would suggest using XML instead, (perhaps a special
> form of it).  That would give the needed flexibility.


He also wrote in another message:
> I looked at MacOS and other ideas of attaching metadata to data,
> but I only found one real solution. Get the metadata inside.
> Use one structured Fileformat, that includes all the needed
> Metadata. For example XML. [...]

Then we could represent the whole disk volume as one big XML file,
and directory operations would become traversals of subchunks of
the XML?  It's just using XML to get the functionality of a file
system.  But under the hood, inserting and deleting things in the
XML blob corresponds to inserting and deleting files from a file
system.  Hmmmm.  You may be on to something here.  Let me expand
on your idea and take it to the next level...

The XML Object File System!

I can see that the next-next generation file system will be
dealing purely with objects.  Kind of an object database.  Besides
representing an object (which would be equivalent to today's files
plus attributes - the type info specifies which programs (methods)
can process it), you'd also want to represent objects being inside
objects, and represent object references (what you now know as
pointers, symbolic links, hard links, etc).  Plus it's lots more
fun if you have a way of finding objects other than doing
name space hierarchy traversals - the indexing system or
database aspect.

You could conceptually do all this with a big XML file.  Object
references would be serial numbers (GUIDs, inodes, URLs etc).  It
should allow cycles in the connection-by-reference graph.
Containment works with the usual nested XML hierarchy technique,
so no cycles are allowed there.  Actual object data is stored via
the usual XML tags to identify field names and corresponding values.
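
As a rough illustration of that layout, here is a minimal Python sketch.
The tag names (object, attr, ref) and the id/target attributes are made
up for this example; nothing in the thread fixes a schema.  Containment
is plain XML nesting, while references are serial numbers that are free
to form a cycle.

```python
import xml.etree.ElementTree as ET

# Hypothetical schema: <object> nests for containment, <attr> holds a
# field name/value pair, <ref> points at another object by serial number.
VOLUME_XML = """
<object id="1" type="volume">
  <attr name="label">demo</attr>
  <object id="2" type="text">
    <attr name="title">notes</attr>
    <ref name="see-also" target="3"/>
  </object>
  <object id="3" type="text">
    <attr name="title">todo</attr>
    <ref name="see-also" target="2"/>
  </object>
</object>
"""

root = ET.fromstring(VOLUME_XML)

# Map every serial number to its element (iter() includes the root itself).
by_id = {obj.get("id"): obj for obj in root.iter("object")}


def attrs(obj):
    """Return the field name/value pairs stored directly on one object."""
    return {a.get("name"): a.text for a in obj.findall("attr")}


def follow(obj, ref_name):
    """Resolve a named reference to the object it points at, or None."""
    for ref in obj.findall("ref"):
        if ref.get("name") == ref_name:
            return by_id[ref.get("target")]
    return None


notes = by_id["2"]
todo = follow(notes, "see-also")     # notes -> todo
back = follow(todo, "see-also")      # todo -> notes, a reference cycle
```

The nesting can never loop because it is a tree by construction; only
the ref elements can, which matches the split described above.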

Keep the same high level concept, but turn it into a file system.
If you copy the root node's XML view of its contents you get the
big XML blob.  Could make backups easier too - copy the big XML
stuff to the root node's XML virtual attribute to restore your
file system.
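
A small sketch of the backup idea, assuming a toy in-memory object tree
in place of a real file system.  The Obj class and the backup/restore
helpers are hypothetical names for this example only: reading the root's
XML view gives the blob, and writing the blob back rebuilds the tree.

```python
import xml.etree.ElementTree as ET


class Obj:
    """Toy object: a name, string attributes, and contained subobjects."""

    def __init__(self, name, attrs=None, children=None):
        self.name = name
        self.attrs = dict(attrs or {})
        self.children = list(children or [])


def to_xml(obj):
    """Build the XML view of one object, including everything inside it."""
    elem = ET.Element("object", name=obj.name)
    for key, value in sorted(obj.attrs.items()):
        ET.SubElement(elem, "attr", name=key).text = value
    for child in obj.children:
        elem.append(to_xml(child))
    return elem


def from_xml(elem):
    """Rebuild an object (and its subobjects) from its XML view."""
    return Obj(
        elem.get("name"),
        {a.get("name"): a.text for a in elem.findall("attr")},
        [from_xml(c) for c in elem.findall("object")],
    )


def backup(root):
    """Copy out the root's XML view: the 'big XML blob'."""
    return ET.tostring(to_xml(root), encoding="unicode")


def restore(blob):
    """Copy a blob back in, recreating the whole tree."""
    return from_xml(ET.fromstring(blob))


root = Obj("/", {"label": "demo"},
           [Obj("etc", {}, [Obj("motd", {"text": "hello"})])])
blob = backup(root)
copy = restore(blob)
```

Backing up the restored copy yields the same blob again, which is the
round trip you would want from a real implementation.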

Under the hood, subobjects become subdirectories or subfiles in the
root object.  Yup, looks like you have to treat everything as both
a file and a directory.  The key/value tags become attributes.
References to other objects become links.  One design decision is
how to implement the equivalent of directories.  One option is to
list subobjects as attributes of type "hard link", with one
attribute/link pair per item (the file name is the attribute name,
the value is an object reference); a directory listing would then
be generated by iterating through all attributes and printing out
the ones of type "link".  The other option is to have a single
attribute of type "dictionary of links" that contains an array of
name/link pairs in sorted order (or unsorted?).  I'd mash them all
together into one pool of attributes to save on redundant code.
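
The single-pool option can be sketched in a few lines of Python.  The
Node and Link classes here are invented for illustration: every object
has one dictionary of attributes, a link is just an attribute whose
value has type Link, and a directory listing is a filter over the pool.

```python
class Link:
    """Attribute value of type 'link': a reference to another object."""

    def __init__(self, target):
        self.target = target


class Node:
    """An object that is both file and directory: one pool of attributes."""

    def __init__(self):
        self.attributes = {}

    def listdir(self):
        """Directory listing: names of the attributes whose type is link."""
        return sorted(name for name, value in self.attributes.items()
                      if isinstance(value, Link))

    def lookup(self, name):
        """Return the linked object for a link, the plain value otherwise."""
        value = self.attributes[name]
        return value.target if isinstance(value, Link) else value


root = Node()
readme = Node()
readme.attributes["mime-type"] = "text/plain"    # ordinary key/value tag
root.attributes["owner"] = "alex"                # ordinary key/value tag
root.attributes["readme.txt"] = Link(readme)     # subobject, by link
root.attributes["home"] = Link(Node())           # another subobject
```

The cost of this choice is that every listing scans all attributes; the
"dictionary of links" variant would trade that scan for a second code
path.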

Finally, there's the indexing service.  Attributes with names of
interest (not everything needs to be indexed) are added to the
root's indices.  Perhaps as Hans suggested, these could be magic
directories that contain all things that have that attribute, the
names being the attribute values and the values being links
to the objects.  This implies sorted directories are needed.  As
new objects get created, they also get magically added to the
relevant index directories.  Automagically update the index
directories for renames, deletions and attribute value changes.
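
Here is a minimal sketch of such self-maintaining index directories,
assuming a toy in-memory volume.  The Volume and IndexDirectory classes
are hypothetical, and only attribute changes and deletions are covered,
not renames.  Each index is kept sorted by attribute value, which is
what makes range lookups cheap.

```python
import bisect


class IndexDirectory:
    """Magic directory for one attribute name: sorted (value, oid) pairs."""

    def __init__(self):
        self.entries = []

    def add(self, value, oid):
        bisect.insort(self.entries, (value, oid))

    def remove(self, value, oid):
        self.entries.remove((value, oid))

    def between(self, low, high):
        """Object ids whose indexed value v satisfies low <= v <= high."""
        # (low,) sorts before every (low, oid), so this finds the first hit.
        start = bisect.bisect_left(self.entries, (low,))
        result = []
        for value, oid in self.entries[start:]:
            if value > high:
                break
            result.append(oid)
        return result


class Volume:
    """Objects keyed by id, plus indices for the attribute names of interest."""

    def __init__(self, indexed_names):
        self.objects = {}
        self.indices = {name: IndexDirectory() for name in indexed_names}

    def set_attr(self, oid, name, value):
        attrs = self.objects.setdefault(oid, {})
        index = self.indices.get(name)
        if index is not None:
            if name in attrs:                 # value change: drop old entry
                index.remove(attrs[name], oid)
            index.add(value, oid)
        attrs[name] = value

    def delete(self, oid):
        for name, value in self.objects.pop(oid).items():
            if name in self.indices:
                self.indices[name].remove(value, oid)


vol = Volume(indexed_names=["size"])
vol.set_attr("a", "size", 10)
vol.set_attr("b", "size", 300)
vol.set_attr("c", "size", 42)
vol.set_attr("c", "comment", "not indexed")   # name of no interest
vol.set_attr("a", "size", 50)                 # moves a's index entry
vol.delete("b")                               # removes b's index entry
```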

For searching the index directories, you could either have a
standard library which will parse your query string and search
the indices for you, or you could have it built into the file
system.  Built-in permits you to do live update notification
when new objects that match a query appear or disappear.
Alternatively, if change notification is implemented then the
query library could monitor the index directories of interest
for changes.  Hmmm.  Change notification from the index directories
and an external query library would let you more easily change
your query language to whatever you like.
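
The external-library variant might look roughly like this in Python.
WatchedIndex and LiveQuery are invented names, and a plain callable
stands in for a parsed query string.  The file system side only has to
offer change notification on an index directory; the query logic lives
entirely outside it.

```python
class WatchedIndex:
    """Index directory that notifies watchers when an entry changes."""

    def __init__(self):
        self.entries = {}      # object id -> indexed attribute value
        self.watchers = []

    def watch(self, callback):
        self.watchers.append(callback)

    def put(self, oid, value):
        self.entries[oid] = value
        for callback in self.watchers:
            callback("put", oid, value)

    def drop(self, oid):
        value = self.entries.pop(oid)
        for callback in self.watchers:
            callback("drop", oid, value)


class LiveQuery:
    """Query library side: keeps a match set current via notifications."""

    def __init__(self, index, predicate):
        self.predicate = predicate
        # Initial scan, then rely on change notification for updates.
        self.matches = {oid for oid, value in index.entries.items()
                        if predicate(value)}
        index.watch(self._on_change)

    def _on_change(self, event, oid, value):
        if event == "put" and self.predicate(value):
            self.matches.add(oid)
        else:
            self.matches.discard(oid)


sizes = WatchedIndex()
sizes.put("a", 10)
big = LiveQuery(sizes, lambda value: value >= 100)
sizes.put("b", 500)                 # b starts matching
sizes.put("c", 200)                 # c starts matching
after_growth = set(big.matches)
sizes.put("b", 5)                   # b stops matching
sizes.drop("a")                     # a never matched; nothing changes
```

Swapping in a different query language then only means building a
different predicate; the index and its notifications stay untouched.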

Well.  Sounds like another file system project for me to experiment
with, once I've finished my BeOS RAM file system!

- Alex
