> No, sorry. This is stupid. Your OS already has clever knowledge about how
> to look up documents fast, why defeat that? Storing lots of very small
> fragments is rubbish, too. Rather, I agree with the way the two OSS
> XML-database projects I know do: Store medium-sized documents, each on its
> own, with a sophisticated index that encompasses a complete collection of
> documents. This way you can do quite fast queries (depending on indexing
> scheme), and still have the power (flexibility) of XPath and/or XQL. Have a
> look at xml.apache.org, one of these DBs is part of Apache XML, just like
> AxKit.

This is what Hans Reiser developed reiserfs for: storing MILLIONS of tiny bits of information efficiently. reiserfs basically IS the on-disk implementation of this. Put a reiserfs partition on your drive and store your XML tree as a directory structure, with files holding only the textual data.

The remaining hurdle is that the OS file system interface doesn't support attributes, and it doesn't support a container that holds both content and other containers (though you could get around that easily enough). The other slight killer is system call overhead: opening 100 levels of directories means a lot of permission checking.

There is a SourceForge project I saw the other day that aims to fix those problems. I forgot the name of it, though :(.
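
To make the directory idea concrete, here is a minimal sketch in Python of one way the mapping could look. Everything in it is an assumption for illustration, not something reiserfs or any existing project prescribes: the function name store_element, the "@name" files used to fake attributes, the "#text" file used to let a directory hold content as well as child containers, and the numeric prefix that keeps repeated sibling tags distinct and in order. It ignores namespaces and any text that follows a child element.

```python
import os
import tempfile
import xml.etree.ElementTree as ET


def store_element(elem, parent_dir, index=0):
    """Write one XML element as a directory and return the directory path.

    Hypothetical layout (an assumption, not a standard):
      NNNN-tag/   one directory per element, numbered by sibling position
      @name       one small file per attribute
      #text       the element's leading text, if any
    """
    path = os.path.join(parent_dir, f"{index:04d}-{elem.tag}")
    os.mkdir(path)

    # The file system has no attributes, so each one becomes a tiny file.
    for name, value in elem.attrib.items():
        with open(os.path.join(path, "@" + name), "w", encoding="utf-8") as f:
            f.write(value)

    # A directory cannot carry content itself, so text goes in a reserved file.
    if elem.text and elem.text.strip():
        with open(os.path.join(path, "#text"), "w", encoding="utf-8") as f:
            f.write(elem.text)

    # Child elements become subdirectories, one level deeper per XML level.
    for i, child in enumerate(elem):
        store_element(child, path, i)

    return path
```

Calling store_element(ET.fromstring(xml_string), some_directory) writes the whole tree below some_directory. Note that every level of XML nesting becomes one more directory to open, which is exactly the system call overhead mentioned above.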
