Hi... (see below)...

On 21/06/2011, at 3:42 AM, BGB wrote:

> On 6/20/2011 3:22 AM, Julian Leviston wrote:
>> On 20/06/2011, at 8:06 PM, BGB wrote:
>> 
>>> hmm... S-Expression database?...
>>> sort of like a hybrid between a database and a persistent store.
>>> 
>>> 
>>> or such...
>> 
>> I'd like to know if you think there's a difference between a filesystem and 
>> a database... conceptually...
>> 
>> or such...
> 
> interesting thought...
> 

Note that I asked if you think there's a difference not how they differ. I'd be 
surprised if there were any people on this list who didn't know how they 
differed.

I don't consider there to be much of a difference between the two, conceptually 
- they are both concerned with the retrieval and storage of data (I'm using the 
term 'data' here to mean any form of raw information at all, useful or 
otherwise, including programs).

> well, filesystems are generally hierarchical, don't generally support 
> user-defined fields, and store all data as large binary globs. some 
> filesystems (such as HFS/HFS+ and NTFS) allow multiple forks per file, others 
> don't (FAT32 / EXT* / ...).
> 
> generally, filesystems store data in terms of disk-blocks, and often make 
> little/no provision for tiny files (partial exception being EXT* and 
> ReiserFS, the former supporting inlining the contents of small files into the 
> inode, the latter supporting the storage of sub-block files into B+Tree 
> leaves).
> 
> 
> databases may come in any number of varieties, and generally store structured 
> or semi-structured information;
> the contents of a database are generally individual datums, rather than 
> binary globs;
> typically, databases store contents according to user-defined structures;
> typically, databases are stored using structures such as B+Trees;
> ...
> 
> 
> for example, the HDB used for my VM framework has a structure like:
> each node is identified by a path, generally with a POSIX-style name 
> "foo/bar/baz";
> each node may contain a number of key/value pairs, with a special 
> null/default key internally named '_', and the key is identified following 
> ':', and sometimes key indices are used following a '.', ...
> stored values are generally untyped, and treated as if they were strings.
> 
> "foo/bar/baz:name"
>    node: "foo/bar/baz"
>    key: "name"
> "foo/bar/baz:item.3"
>    node: "foo/bar/baz"
>    key: "item"
>    index: 3
> 
> the index allows treating the key like a small array.
> 
> possible, but not currently supported (at least not efficiently) would be to 
> apply an index to a node, allowing treating the node as a makeshift indexed 
> table:
> "foo/bar/baz.59:name"
> 
> the optimization would be to transform the above into a key-index:
> "foo/bar/baz:name.59", allowing for storing the values as arrays.
> 
> however, in this case, table-like nodes would not allow for indexed keys:
> "foo/bar/baz.59:item.5"
> since this would require double-indexing, but the keys are presently only 
> defined as 1D arrays.
> 
> at present, this will work, as the node index would just store multiple nodes 
> with dotted names ("baz.59" will simply be interpreted as a nodes' name).
> 
> as-is, node names may also begin with '.', however these nodes are treated as 
> "volatile" and effectively will not be saved back into the database (this 
> mechanism is generally used for storing "volatile" information, namely 
> information about things like environment variables, which libraries or 
> modules are currently loaded, ...).
> 
> so ".var/foo:name" is volatile, as neither ".var" nor any of its children 
> will be saved to disk.
> 
> 
> note:
> the present system does not allow for non-trivial queries (besides "?next", 
> "?prev", ...), but interestingly, for most uses of this system, one doesn't 
> generally need to perform queries (much like, for the most part, application 
> code doesn't really need "readdir" either, since its main use is mostly for 
> presenting directory listings to the user).
> 
> it is worth noting that the design of this system was originally partly 
> inspired by the Windows Registry, but there are differences (the nodes being 
> untyped, and the use of '/' rather than '\', and the present lack of "hives" 
> albeit a similar mechanism may be needed for other reasons, and may need to 
> be retrofitted on, that or supporting in-database "mounts" and "links"...).
> 
> the external serialized database format somewhat resembles '.reg' files 
> (albeit, technically if ".reg" and ".ini" files hybridized).
> 
> unlike more proper databases, the current implementation uses AVL trees 
> rather than B+Trees internally. the main reason for AVL trees was that they 
> were less effort to implement (actually, it is a tree of AVL trees, as each 
> node-level starts a new AVL tree).
> the internal structure of the code for managing the HDB is a bit nasty, 
> luckily, nearly all access is via named path-strings (the path-key then is 
> its canonical structure, and all of the mucking around with AVL trees/... is 
> hidden behind the API).
> 
> as I think noted previously, the main reason I chose an HDB style design 
> (rather than an RDB / Relational Database) was because an HDB is generally 
> much easier to work with via a programmatic interface, whereas working with 
> relational databases is a bit more painful, as nearly every operation will 
> involve queries and having to work with tables or similar...
> 
> 
> or such...
> 


_______________________________________________
fonc mailing list
[email protected]
http://vpri.org/mailman/listinfo/fonc

Reply via email to