By the way, it occurred to me while re-reading Patrick's page that he probably said the same thing I just did, only he said so much, and in such a way, that it's hard to absorb.... ;o)
Dave

At 11:55 PM 10/11/2001 -0700, David Burry wrote:
>I've been thinking for a long time about how to provide a reasonable
>storage/index mechanism, and still give the end user interface designer
>access to the complete Bible in a variety of XML ways depending on the
>needs of the application.  There has been previous discussion on this list
>regarding this; I called it looking at the data in different "slices" and
>Patrick Durusau called it "concurrent markup"
>(http://www.sbl-site2.org/Extreme2001/Concur.html).
>
>However!!!  <light goes on>  I just thought of a great idea today about
>this (I think, you tell me)....  What if the Bible were stored in
>compressed and/or indexed form on disk, yet "virtually"
>available/queryable as a large repetitious XML type object, from which you
>could extract just the portion/format you need, with say, an XPath or
>XQuery statement?
>
>What I mean is that, suppose the Bible were stored in a binary/text
>compressed and/or indexed format, but available for query _as_if_ it were
>in this kind of format:
>
><version name="kjv">
>   <book name="genesis">
>     <chapter>
>       <verse><paragraphmarker/>contents of verse 1</verse>
>       <verse>contents of verse 2</verse>
>       <verse><paragraphmarker/>etc</verse>
>       ...
>     </chapter>
>     ...
>     <paragraph><chaptermarker/><versemarker/>contents of verse 1<versemarker/>contents of verse 2</paragraph>
>     <paragraph><versemarker/>etc</paragraph>
>     ...
>   </book>
></version>
>
>(Notice I didn't put paragraphs inside chapters because in fact paragraphs
>can occasionally straddle chapter boundaries.)
>
>You can see I'm proposing that the entire thing be duplicated 2 times for
>the simple example above, but it only has to be "virtually" duplicated,
>not actually recorded twice anywhere on disk nor in memory.  It allows you
>to specify an XPath of
>"/version[@name='kjv']/book[@name='genesis']/chapter/verse" to grab the
>contents of all the verses in genesis in a verse-by-verse fashion with
>paragraph markers, but
>"/version[@name='kjv']/book[@name='genesis']/paragraph" to grab the same
>contents in a paragraph-by-paragraph fashion with chapter/verse
>markers.  It's great because a properly extended thing like this could
>allow you to query the Bible and get your results in many different
>chapter/verse/paragraph/sentence/word/etc forms!
>
>This would mean that we'd have to glue an XPath or XQuery parser into our
>data store in a way it probably wasn't originally designed for, so that we
>can interpret the query first and then reconstitute the requested XML from
>our data store without building the entire extended duplicated XML tree.
>But it's certainly possible, and more and more of this kind of software is
>becoming modular like this, so it can only get easier to do in
>time... perhaps someone else has even already thought of and done stuff
>like this.  Anyone know of any?
>
>thoughts?  comments?
>
>Dave
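
To make the idea in the quoted message a bit more concrete, here is a rough sketch in Python of what "virtual" duplication could look like. Everything in it is my own assumption for illustration, not an existing tool: the VERSES list stands in for the real compressed/indexed store, and build_verse_view, build_paragraph_view and query are hypothetical names. It uses the standard xml.etree.ElementTree module, which only understands a small subset of XPath, and the "query interpreter" is nothing more than a substring check that decides which view to build. A real implementation would also build only the matching nodes rather than a whole book.

```python
import xml.etree.ElementTree as ET

# Hypothetical single store: each verse is recorded exactly once as
# (chapter, verse, starts_paragraph, text). A real store would be the
# compressed/indexed format discussed above.
VERSES = [
    (1, 1, True, "contents of verse 1"),
    (1, 2, False, "contents of verse 2"),
    (1, 3, True, "etc"),
]


def build_verse_view(book_name, verses):
    """Build <book>/<chapter>/<verse> with <paragraphmarker/> milestones."""
    book = ET.Element("book", name=book_name)
    chapters = {}
    for chap, _num, starts_para, text in verses:
        chapter = chapters.get(chap)
        if chapter is None:
            chapter = ET.SubElement(book, "chapter")
            chapters[chap] = chapter
        verse = ET.SubElement(chapter, "verse")
        if starts_para:
            # The verse text follows the empty marker element.
            ET.SubElement(verse, "paragraphmarker").tail = text
        else:
            verse.text = text
    return book


def build_paragraph_view(book_name, verses):
    """Build <book>/<paragraph> with chapter and verse milestones."""
    book = ET.Element("book", name=book_name)
    para = None
    last_chapter = None
    for chap, _num, starts_para, text in verses:
        if starts_para or para is None:
            para = ET.SubElement(book, "paragraph")
        if chap != last_chapter:
            ET.SubElement(para, "chaptermarker")
            last_chapter = chap
        ET.SubElement(para, "versemarker").tail = text
    return book


def query(path):
    """Look at the path first, then build only the view it asks for."""
    version = ET.Element("version", name="kjv")
    if "/paragraph" in path:
        version.append(build_paragraph_view("genesis", VERSES))
    else:
        version.append(build_verse_view("genesis", VERSES))
    # ElementTree wants relative paths, so hang the tree under a dummy
    # root and turn the leading "/" into "./".
    root = ET.Element("root")
    root.append(version)
    return root.findall("." + path)


if __name__ == "__main__":
    for p in ("/version[@name='kjv']/book[@name='genesis']/chapter/verse",
              "/version[@name='kjv']/book[@name='genesis']/paragraph"):
        print(p)
        for node in query(p):
            print("   ", ET.tostring(node, encoding="unicode"))
```

The point of the sketch is only that the two XML shapes never exist at the same time: each query builds just the view it names from the one underlying list of verses.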
