Re: Simplicity

Tod Harter Mon, 04 Feb 2002 16:22:26 -0800

> No, sorry. This is stupid. Your OS already has clever knowledge about how
> to look up documents fast, why defeat that? Storing lots of very small
> fragments is rubbish, too. Rather, I agree with the way the two OSS
> XML-database projects I know do: Store medium-sized documents, each on its
> own, with a sophisticated index that encompasses a complete collection of
> documents. This way you can do quite fast queries (depending on indexing
> scheme), and still have the power (flexibility) of XPath and/or XQL. Have a
> look at xml.apache.org; one of these DBs is part of Apache XML, just like
> AxKit.


This is what Hans Reiser developed reiserfs for: storing MILLIONS of tiny 
bits of information efficiently. reiserfs basically IS the on-disk 
implementation of this. Put a reiserfs partition on your drive and store your 
XML tree as a directory structure, with files holding only the textual 
data. The remaining hurdle is that the OS file system interface doesn't 
support attributes, and it doesn't support a container that holds both 
content and other containers (though you could get around that easily 
enough). The other slight killer is system call overhead: opening 100 levels 
of directories means a lot of permissions checking.
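
To make the idea concrete, here is a rough sketch in Python of one way the 
mapping could look. Everything in it is my own assumption, not how reiserfs 
or any existing project does it: the store_element name, the NNNN_tag 
directory naming, the "_text" file for an element's text, and the "@name" 
files standing in for attributes. It also assumes no XML namespaces and 
ignores text that follows a child element.

```python
import os
import tempfile
import xml.etree.ElementTree as ET


def store_element(elem, parent_dir, index=0):
    """Write one element as a directory and return that directory's path.

    Hypothetical layout (an assumption for illustration only):
      NNNN_tag/   one directory per element, prefixed with its sibling index
      _text       the element's own text, if it has any
      @name       one small file per attribute, holding the attribute value
    """
    # The index prefix keeps sibling order and allows repeated tag names.
    path = os.path.join(parent_dir, f"{index:04d}_{elem.tag}")
    os.mkdir(path)

    # A directory cannot hold content directly, so the text goes into a
    # reserved file next to the child directories.
    if elem.text and elem.text.strip():
        with open(os.path.join(path, "_text"), "w", encoding="utf-8") as f:
            f.write(elem.text)

    # The file system interface has no attributes, so fake them with files.
    for name, value in elem.attrib.items():
        with open(os.path.join(path, "@" + name), "w", encoding="utf-8") as f:
            f.write(value)

    # Recurse into child elements, one subdirectory each.
    for i, child in enumerate(elem):
        store_element(child, path, i)
    return path


if __name__ == "__main__":
    doc = ET.fromstring(
        '<book id="42"><title>Simplicity</title>'
        "<chapter><para>Hello</para></chapter></book>"
    )
    with tempfile.TemporaryDirectory() as tmp:
        root = store_element(doc, tmp)
        # Print the resulting tree relative to the temporary directory.
        for dirpath, _dirnames, filenames in os.walk(root):
            print(os.path.relpath(dirpath, tmp))
            for filename in sorted(filenames):
                print("   ", filename)
```

Note that every element costs at least one mkdir and every text node or 
attribute costs an open/write/close, so the system call overhead shows up 
even in a toy like this.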

There is a SourceForge project I saw the other day that aims to fix those 
problems. I forgot its name, though :(.

