Sivakatirswami, I sent your message on to a colleague who is an expert in text markup schemes and this was his reply. It may imply a slightly different direction from what you have started with.
HTH

Devin

Begin forwarded message:

> From: Jarom McDonald
> Date: May 11, 2010 8:42:40 AM MDT
> To: Devin Asay <[email protected]>
> Subject: Re: OT: Resources for Data Base Design
>
> Hi Devin,
>
> At least in the world of academia, what he's looking for just isn't done.
> Whether for philosophical reasons, common practice reasons, or whatever,
> there is very little work done in decomposing texts for relational models.
> Rather, texts are kept whole and marked up in XML, which to most people
> preserves the complexity of the text and facilitates publishing and
> dissemination.
>
> This isn't to say that relational models can't be useful; I have seen
> products where texts are marked up in the TEI schema (the standard for XML
> encoding of text) and then elements are chopped up and put into a DB;
> however, you can achieve similar levels of performance with an XML database.
> The most used is called eXist; there are plenty of scripts you can find by
> googling TEI + eXist that can help in storing XML docs in the XML database,
> querying with XQuery and XPath to find documents, creating indices, etc.
>
> Of course, this probably doesn't help much, as Revolution has native support
> for RDBMSs but not for XML databases. But for full texts, the relational route
> just isn't used in academia on any sort of wide scale.
>
> Jarom
>
> On Mon, May 10, 2010 at 9:36 AM, Devin Asay <[email protected]> wrote:
> Jarom,
>
> This came over the Revolution mail list. Any recommendations I could point
> him to? The last long paragraph details what he's looking for.
>
> Devin
>
>
> Begin forwarded message:
>
>> From: Sivakatirswami <[email protected]>
>> Date: May 8, 2010 7:53:08 PM MDT
>> To: How to use Revolution <[email protected]>
>> Subject: OT: Resources for Data Base Design
>> Reply-To: How to use Revolution <[email protected]>
>>
>> I'm working on a content management database based on the Dublin Core
>> and the Media Annotation Initiative.
>> Much of the whole mode of discourse and terms translates well into a
>> database scheme, but when the discourse starts talking about fine-tuning
>> and switches to an RDF framework, it is difficult to grok in terms of
>> translating some of the principles into actual table-field structures in
>> a PostgreSQL database. The Dublin Core seems in some respects a very
>> abstract realm... but things are different where rubber hits the road.
>>
>> I've looked pretty closely at the databases generated by XOOPS, Drupal
>> and WordPress and frankly, they are freaky scary. I see a hodgepodge
>> of strategies, each differing depending on who designed the module
>> whose tables you are looking at. That's why I want to stay with Dublin
>> Core, where the "human readability" principle is kept in the forefront of
>> design. I'm pretty close to designing a schema that I think can contain
>> pretty much all the metadata for any video, text or audio, translations,
>> pamphlets, FAQs, etc. that we have. I suppose we are re-inventing the
>> wheel a bit, but in the end we will get something that is a good match
>> for our needs and we will not be boxed into the framework of a monster CMS
>> that we cannot customize without spending huge $ on PHP-module
>> consultants... (been there, done that, nightmare)
>>
>> Metadata for a video or a sound file or an image is simple enough....
>>
>> The part of the database I'm unable to finish off is that which deals
>> with text fragments. I think I posted this before on this list but got
>> no responses. If anyone knows what would be the best list or group I
>> should go to, to get help, let me know. What I'm interested in should be
>> pretty standard stuff in the world of academia: e.g.
>> if you want a database to contain the most atomic elements of a text
>> resource (one record for every single verse of every single poem from a
>> book where the poems are divided into chapters and the chapters into
>> sections and the sections into parts of a book, and the book is one
>> volume in a series...), what is the best schema that allows you to query
>> the database to re-aggregate all those elements into its original source
>> document, at run time (or on a cron, or periodically after
>> modifications), and/or what other approaches might better serve the end
>> game (be able to query for a single verse with complete citation; be able
>> to query for an entire poem with citation; be able to query for a
>> complete chapter of poems with a citation... etc.)? I have some solutions
>> in mind, and I may just proceed with those, and refactor later if
>> something better comes along... but I would love to hear from some
>> experts and see some existing models.
>>
>> Any ideas of where to go looking for mangos?
>>
>

Devin Asay
Humanities Technology and Research Support Center
Brigham Young University

_______________________________________________
use-revolution mailing list
[email protected]
Please visit this url to subscribe, unsubscribe and manage your subscription preferences:
http://lists.runrev.com/mailman/listinfo/use-revolution
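
P.S. To make the "keep the text whole and mark it up" idea a little more concrete, here is a minimal sketch in Python, not anything Jarom supplied. It assumes a simplified, TEI-like markup (real TEI element names div, lg, l and head, but without the TEI namespace), an invented sample text, and hypothetical helper names (citation, verse, poem, chapter_text). It uses only the limited XPath subset in Python's standard xml.etree.ElementTree, not eXist or XQuery. The point is just that one marked-up document can answer the three queries in the original question (verse, poem, chapter, each with a citation) without being chopped into records.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample text in simplified, TEI-like markup (no namespace).
SAMPLE = """
<text>
  <body>
    <div type="book" n="1">
      <head>Sample Book</head>
      <div type="chapter" n="1">
        <lg type="poem" n="1">
          <l n="1">First verse of poem one</l>
          <l n="2">Second verse of poem one</l>
        </lg>
        <lg type="poem" n="2">
          <l n="1">First verse of poem two</l>
        </lg>
      </div>
    </div>
  </body>
</text>
""".strip()

root = ET.fromstring(SAMPLE)

# ElementTree has no parent pointers, so build a child -> parent map once.
parent_of = {child: parent for parent in root.iter() for child in parent}


def citation(elem):
    """Walk up from elem, collecting every numbered ancestor into a citation."""
    parts = []
    node = elem
    while node is not None:
        n = node.get("n")
        if n is not None:
            # <l> carries no type attribute in this sketch, so call it "verse".
            label = node.get("type", "verse" if node.tag == "l" else node.tag)
            parts.append(f"{label} {n}")
        node = parent_of.get(node)
    return ", ".join(reversed(parts))


def verse(chapter, poem_n, line):
    """Return (text, citation) for a single verse."""
    elem = root.find(
        f".//div[@type='chapter'][@n='{chapter}']"
        f"/lg[@type='poem'][@n='{poem_n}']/l[@n='{line}']"
    )
    return elem.text, citation(elem)


def poem(chapter, poem_n):
    """Return (list of verse texts, citation) for a whole poem."""
    elem = root.find(
        f".//div[@type='chapter'][@n='{chapter}']/lg[@type='poem'][@n='{poem_n}']"
    )
    return [l.text for l in elem.findall("l")], citation(elem)


def chapter_text(chapter):
    """Return (all verse texts in document order, citation) for a chapter."""
    elem = root.find(f".//div[@type='chapter'][@n='{chapter}']")
    return [l.text for l in elem.iter("l")], citation(elem)


if __name__ == "__main__":
    print(verse(1, 1, 2))
    print(poem(1, 2))
    print(chapter_text(1))
```

The helpers do no error handling: a reference that does not exist makes find() return None and the call fails. In an XML database the same path expressions would be run by the server against stored documents rather than against an in-memory tree.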
