Aloha, as usual here I come begging..

I'm working on a schema for a new database that will "hold anything and everything", using Dublin Core modified/extended with the Media Annotation specification. The field names are very "generic" and let you maintain metadata and text content in the database for a wide variety of "just about anything" (i.e., if you wanted to put into the same table images of species of flowers, YouTube videos you have uploaded, or chapters of a book, they would all "fit"), all to be accessed later by RunRev desktop apps, Revlets, iRev, iPhone apps etc.

Of course even the Dublin Core Metadata Initiative says that when it gets down to doing your "application profile", things will start getting customized pretty quickly... I've drafted a dozen input data sets and as many output requirement scenarios, and I have pretty much sorted almost all possible input and output requirements and use cases for most resources. But I am stumped by one, which I thought would be obvious, because in the world of academia this would seem to be a common requirement:

Metadata for text fragments, otherwise known as "the citation". In this case publisher, author and source are already present as fields in the Dublin Core. The problem comes with how to store metadata where the text fragment is part of a whole.

OK, what are we talking about? Let's use a real and specific example. The Hindu Vedas are vast and voluminous. We have selected many verses for specific use. These need to be a) discrete b) re-assemblable

For example, here are some fun thoughts from the Rig Veda about gambling (dice have been around for millennia!):

---------

Downward they roll, then jump in the air! Though handless themselves, they can keep the upper hand over those who have! On the board, like magic coals, they consume, though cold, the player's heart to ashes.

Rig Veda X, 34, 9

Abandoned, the wife of the gambler grieves. Grieved, too, is his mother as he wanders vaguely. Afraid and in debt, ever greedy for money, he steals in the night to the home of another.

Rig Veda X, 34, 10

He is seized by remorse when he sees his wife's lot, beside that of her neighbor with well-ordered home. In the morning, however, he yokes the brown steeds and at evening falls stupid before the cold embers.

Rig Veda X, 34, 11

-----------

How best to keep the citation string, such that later one could aggregate these three verses into a unit such as we have above, where they each have their own record in the database? (Also think of "quotes", "jokes", "sayings", "maxims" etc. in the same category of "text fragments".)
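
To make the requirement concrete, here is a minimal Python sketch of one way a citation string like the ones above could be split into sortable parts, so that separately stored verses can be put back in order. The names `parse_citation` and `roman_to_int`, and the assumed pattern "work, Roman numeral, number, number", are hypothetical illustrations only; they are not part of Dublin Core or of any existing schema.

```python
import re

# Assumed citation shape: "<work> <roman numeral>, <number>, <number>",
# e.g. "Rig Veda X, 34, 9". Other citation styles would need other patterns.
CITATION_RE = re.compile(
    r"^(?P<work>.+?)\s+(?P<book>[IVXLCDM]+),\s*(?P<hymn>\d+),\s*(?P<verse>\d+)$"
)

ROMAN = {"I": 1, "V": 5, "X": 10, "L": 50, "C": 100, "D": 500, "M": 1000}


def roman_to_int(numeral):
    """Convert a Roman numeral such as 'X' or 'XIV' to an integer."""
    total = 0
    for i, ch in enumerate(numeral):
        value = ROMAN[ch]
        # Subtractive notation: a smaller symbol before a larger one is subtracted.
        if i + 1 < len(numeral) and ROMAN[numeral[i + 1]] > value:
            total -= value
        else:
            total += value
    return total


def parse_citation(text):
    """Split a citation string into (work, book, hymn, verse)."""
    match = CITATION_RE.match(text.strip())
    if match is None:
        raise ValueError("unrecognized citation: " + repr(text))
    return (
        match["work"],
        roman_to_int(match["book"]),
        int(match["hymn"]),
        int(match["verse"]),
    )


# Records may come back from the database in any order;
# sorting on the parsed parts restores the reading order.
verses = ["Rig Veda X, 34, 11", "Rig Veda X, 34, 9", "Rig Veda X, 34, 10"]
ordered = sorted(verses, key=parse_citation)
```

Sorting on the parsed tuple rather than on the raw string matters here, because a plain text sort would place verse 10 and verse 11 before verse 9.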


An exhaustive generic bibliographic citation is pretty well understood to comprise the following (where "collection, author, publisher, date, title etc." are already present in the Dublin Core spec and my schema):

Series
Volume
Part
Section
Chapter
Paragraph-verse

Now... what is the best way to handle the above in terms of a schema? This is where I get stumped. The DCMI use of RDF/XML-style notation is a different universe and does not translate well to a relational PostgreSQL schema... If I study the back-end MySQL databases for boxed LAMP apps (Drupal, WordPress, XOOPS etc.), I see various strategies depending on who developed the module which uses a specific table (a snake pit of tables!).
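
As a rough illustration of the question, and not a recommendation, the sketch below stores each level of the citation list above in its own plain column and reassembles a passage with an ordinary ORDER BY. Everything named here is an assumption made for the example: the table `text_fragment`, its column names, the helper `fetch_passage`, and the mapping of "Rig Veda X, 34, 9" onto section, chapter and paragraph_verse. Python's built-in sqlite3 module stands in for PostgreSQL so that the example runs on its own, and the verse bodies are shortened to their first sentence.

```python
import sqlite3

# In-memory SQLite database as a stand-in for the real PostgreSQL server.
conn = sqlite3.connect(":memory:")

# One discrete, human-readable column per citation level; unused levels stay NULL.
conn.execute(
    """
    CREATE TABLE text_fragment (
        id              INTEGER PRIMARY KEY,
        title           TEXT NOT NULL,
        series          INTEGER,
        volume          INTEGER,
        part            INTEGER,
        section         INTEGER,
        chapter         INTEGER,
        paragraph_verse INTEGER,
        body            TEXT NOT NULL
    )
    """
)

# Each verse is its own record, inserted deliberately out of order.
rows = [
    ("Rig Veda", 10, 34, 11,
     "He is seized by remorse when he sees his wife's lot, "
     "beside that of her neighbor with well-ordered home."),
    ("Rig Veda", 10, 34, 9, "Downward they roll, then jump in the air!"),
    ("Rig Veda", 10, 34, 10, "Abandoned, the wife of the gambler grieves."),
]
conn.executemany(
    "INSERT INTO text_fragment (title, section, chapter, paragraph_verse, body) "
    "VALUES (?, ?, ?, ?, ?)",
    rows,
)


def fetch_passage(connection, title, section, chapter):
    """Return (verse number, body) pairs for one chapter, in reading order."""
    cursor = connection.execute(
        "SELECT paragraph_verse, body FROM text_fragment "
        "WHERE title = ? AND section = ? AND chapter = ? "
        "ORDER BY paragraph_verse",
        (title, section, chapter),
    )
    return cursor.fetchall()


passage = fetch_passage(conn, "Rig Veda", 10, 34)
for verse_number, body in passage:
    print(verse_number, body)
```

The trade-off in this shape is a fixed number of levels with several NULL columns per record, in exchange for values that stay readable and sortable without any application code.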

We see fields that hold discrete data values mapped with relation tables to other data; and we also see fields that seem to be used to hold an array of metadata. These are scary!

varChar(255) SomeData value: "a:23;isT:45;bv:$1;...." etc... Some are quite long and completely opaque from a human readability point of view, which goes against basic DCMI principles.

The whole name of the game being: how can you keep the metadata clear enough and simple enough that it can live into the future and be easily extracted and transformed? The known (and well documented) problem is that schemas which are too opaque are basically cast in stone: any second-generation agents (programmers, applications, export tools etc.) are locked out, and a complete refactoring of the entire framework is required later (very expensive), such that many companies simply a) cannot upgrade and b) suffer the consequences. I'm sure this issue is also present in a lot of business frameworks.

I searched the web for any models, and will continue to do so... as one would expect to see a lot of information from the academy, where citations for text fragments are a "mission critical" component of any published document (PhD theses, scientific research, book reviews, teaching texts etc.).

But I want to put this out on this list: if anyone has experience with database schemas for metadata for text fragments, otherwise called "bibliographic citations", please email me off list. If you have any advice, pointers, URLs, models or resources, I would be deeply grateful.

Or if you feel this is a subject of general interest, then shout "Please keep this thread on the list!" and we will.

TIA!

Sivakatirswami






