OT: Metadata for Text Fragments - Help with Dbase Schema

Sivakatirswami Sun, 28 Mar 2010 15:07:42 -0700

Aloha, as usual here I come begging..

I'm working on a schema for a new database that will "hold anything and everything" utilizing Dublin Core modified/extended with the Media Annotation specification, where field names are very "generic" and allow you to maintain metadata and text content in the dbase for a wide variety of "just about anything" (i.e. if you wanted to put into the same table images of species of flowers, or YouTube videos you have uploaded, or chapters of a book, they will all "fit"), all to be accessed later by RunRev desktop apps, Revlets, iRev, iPhone apps etc.

Of course even the Dublin Core Metadata Initiative says that when it gets down to doing your "application profile" things will start getting customized pretty quickly... I've drafted a dozen input data sets and as many output requirement scenarios. I have pretty much sorted almost all possible input and output requirements and use cases for most resources. But I am stumped by one, which I thought would be obvious, because in the world of Academia this would seem to be a common requirement:

metadata for text fragments, otherwise known as "the citation", but in this case publisher, author, source are already present fields in the Dublin Core. The problem comes with how to store metadata where the text fragment is part of a whole.

OK, what are we talking about? Let's use a real and specific example. The Hindu Vedas are vast and voluminous. We have selected many verses for specific use. These need to be a) discrete b) re-assemble-able.

For example, here are some fun thoughts from the Rig Veda about gambling (dice have been around for millennia!):


---------

Downward they roll, then jump in the air! Though handless themselves, they can keep the upper hand over those who have! On the board, like magic coals, they consume, though cold, the player's heart to ashes.


Rig Veda X, 34, 9

Abandoned, the wife of the gambler grieves. Grieved, too, is his mother as he wanders vaguely. Afraid and in debt, ever greedy for money, he steals in the night to the home of another.


Rig Veda X, 34, 10

He is seized by remorse when he sees his wife's lot, beside that of her neighbor with well-ordered home. In the morning, however, he yokes the brown steeds and at evening falls stupid before the cold embers.


Rig Veda X, 34, 11

-----------

How best to keep the citation string, such that later one could aggregate these three verses into a unit such as we have above, where they each have their own record in the dbase? (Also think of "quotes", "jokes", "sayings", "maxims" etc. in the same category of "text fragments".)
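To make the question concrete, here is a minimal Python sketch of one way the citation string could be split into sortable parts. It assumes every citation looks like "Work Roman-numeral, number, number" as in the three verses above; the names `Citation`, `parse_citation` and `roman_to_int` are hypothetical, and real citations would surely need more cases than this.

```python
from typing import NamedTuple, Tuple

class Citation(NamedTuple):
    work: str                 # e.g. "Rig Veda"
    levels: Tuple[int, ...]   # numeric path inside the work, outermost first

ROMAN = {"I": 1, "V": 5, "X": 10, "L": 50, "C": 100}

def roman_to_int(text: str) -> int:
    """Convert a simple Roman numeral such as 'X' or 'IV' to an integer."""
    total = 0
    for ch, nxt in zip(text, text[1:] + " "):
        value = ROMAN[ch]
        # A smaller numeral placed before a larger one is subtracted (IV = 4).
        total += -value if nxt in ROMAN and ROMAN[nxt] > value else value
    return total

def to_int(part: str) -> int:
    """Accept either Arabic digits or a Roman numeral."""
    return int(part) if part.isdigit() else roman_to_int(part)

def parse_citation(text: str) -> Citation:
    """Split 'Rig Veda X, 34, 9' into the work name and its numeric levels."""
    head, *rest = [p.strip() for p in text.split(",")]
    work, first = head.rsplit(None, 1)   # last word of the head is the first level
    return Citation(work, tuple(to_int(p) for p in [first, *rest]))

# Records stored in any order can be put back into reading order by sorting.
cited = ["Rig Veda X, 34, 11", "Rig Veda X, 34, 9", "Rig Veda X, 34, 10"]
in_order = sorted(parse_citation(c) for c in cited)
```

Because the levels are kept as numbers rather than as one display string, verse 9 sorts before verse 10, which a plain text sort of the citation strings would not guarantee.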

An exhaustive generic bibliographic citation is pretty well understood to be comprised of the following (where "collection, author, publisher, date, title etc." are already present in the Dublin Core spec and my schema):


Series
Volume
Part
Section
Chapter
Paragraph-verse

Now... what is the best way to handle the above in terms of a schema? This is where I get stumped. The DCMI use of RDF/XML-style notation is a different universe and does not translate well to a relational PostgreSQL schema... If I study the back-end MySQL databases for boxed LAMP apps (Drupal, WordPress, XOOPS etc.) I see various strategies, depending on who developed the module which uses a specific table (a snake pit of tables!)
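As a starting point for discussion, here is a minimal sketch of the most literal option: one nullable integer column per citation level in the list above. It uses Python's built-in sqlite3 only as a stand-in for PostgreSQL. The table name `fragment`, the helper `assemble`, and the mapping of "X, 34, 9" onto part / chapter / paragraph_verse are all my assumptions for illustration, and the verse text is abbreviated.

```python
import sqlite3

conn = sqlite3.connect(":memory:")   # in-memory stand-in for the real dbase
conn.execute("""
    CREATE TABLE fragment (
        id              INTEGER PRIMARY KEY,
        title           TEXT NOT NULL,   -- stands in for the Dublin Core title
        series          INTEGER,         -- unused levels simply stay NULL
        volume          INTEGER,
        part            INTEGER,
        section         INTEGER,
        chapter         INTEGER,
        paragraph_verse INTEGER,
        content         TEXT NOT NULL
    )
""")

# Deliberately inserted out of order: each verse is its own record.
verses = [
    ("Rig Veda", 10, 34, 11, "He is seized by remorse when he sees his wife's lot"),
    ("Rig Veda", 10, 34, 9,  "Downward they roll, then jump in the air!"),
    ("Rig Veda", 10, 34, 10, "Abandoned, the wife of the gambler grieves."),
]
conn.executemany(
    "INSERT INTO fragment (title, part, chapter, paragraph_verse, content) "
    "VALUES (?, ?, ?, ?, ?)",
    verses,
)

def assemble(conn, title, part, chapter):
    """Return (verse number, text) pairs for one chapter, in reading order."""
    cur = conn.execute(
        "SELECT paragraph_verse, content FROM fragment "
        "WHERE title = ? AND part = ? AND chapter = ? "
        "ORDER BY paragraph_verse",
        (title, part, chapter),
    )
    return cur.fetchall()

for number, text in assemble(conn, "Rig Veda", 10, 34):
    print(f"Rig Veda X, 34, {number}: {text}")
```

Each fragment stays a discrete record, and re-assembly is just a WHERE plus ORDER BY on the level columns. The obvious cost is that works whose structure does not match these six levels leave many columns empty or need the levels re-labelled.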

We see fields that hold discrete data values mapped with relation tables to other data; and we also see fields that seem to be used to hold an array of metadata. These are scary!

varChar(255) SomeData value: "a:23;isT:45;bv:$1;...." etc... some quite long and completely opaque from a human readability point of view, which goes against basic DCMI principles.
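For contrast, here is a minimal sketch of the "relation table" style, where each piece of metadata is its own readable row instead of being packed into one string. Again this is Python's sqlite3 standing in for the real dbase; the table names `resource` and `resource_meta`, the helper `metadata_for`, and the example keys are hypothetical and are not a decoding of the opaque value quoted above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE resource (
        id    INTEGER PRIMARY KEY,
        title TEXT NOT NULL
    );
    -- One row per metadata item: readable with a plain SELECT.
    CREATE TABLE resource_meta (
        resource_id INTEGER NOT NULL REFERENCES resource(id),
        name        TEXT NOT NULL,
        value       TEXT NOT NULL,
        PRIMARY KEY (resource_id, name)
    );
""")

conn.execute("INSERT INTO resource (id, title) VALUES (1, 'Rig Veda')")
conn.executemany(
    "INSERT INTO resource_meta (resource_id, name, value) VALUES (1, ?, ?)",
    [("chapter", "34"), ("verse", "9")],
)

def metadata_for(conn, resource_id):
    """Collect all metadata rows for one resource into a dict."""
    cur = conn.execute(
        "SELECT name, value FROM resource_meta WHERE resource_id = ? ORDER BY name",
        (resource_id,),
    )
    return dict(cur.fetchall())

print(metadata_for(conn, 1))
```

The trade-off is that every value is stored as text and each lookup needs a join or a second query, but a later programmer or export tool can read the rows without knowing any private encoding.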

The whole name of the game being: how can you keep the metadata clear enough and simple enough that it can live into the future and be easily extracted and transformed? The known (and well documented) problem is that schemas which are too opaque are basically cast in stone, with any second-generation agents (programmer, application, export tools etc.) being locked out, requiring a complete refactoring of the entire framework later (very expensive), such that many companies simply a) cannot upgrade and b) suffer the consequences. I'm sure this issue is also present in a lot of business frameworks.

I searched the web for any models, and will continue to do so... as one would expect to see a lot of information from the academy, where citations for text fragments are a "mission critical" component for any published document (PHP, scientific research, book reviews, teaching texts etc.)

But I want to put this out on this list: if anyone has experience with dbase schemas for metadata for text fragments, otherwise called "bibliographic citations", please email me off list. If you have any advice, pointers, URLs, models or resources, I would be deeply grateful.

Or if you feel this is a subject of general interest, then shout "Please keep this thread on the list!" and we will.


TIA!

Sivakatirswami






