In looking at the 3.0 design document at http://www.jspwiki.org/wiki/JSPWiki3Design,
I have some comments on the metadata plans. These comments are only tentative.

----

! Metadata Meta API

Now, before getting into this too deeply, it occurs to me that we might consider a pluggable meta API rather than a single metadata schema. There are likely a variety of different applications that JSPWiki may be used within (simple wikis, embedded apps, hives, part of document management systems, etc.), and we likely also want scalability (i.e., in terms of both simplicity/complexity and factors like page and revision count) in our metadata just as we do in other areas.

I don't think this sounds particularly difficult if we're using a JSR-170 compliant repository: there'd be a core set of metadata fields whose actual descriptors would be assigned by the API implementation. If an application needed more than that (e.g., because the documents will be used within a more complex framework or document management system having an existing schema), it'd be up to the implementation to define and handle the additional fields. We'd simply be creating the API and reference implementation.

----

! Recommendation

I agree that the schema for JSPWiki should use standards wherever possible, and would advocate basing the reference implementation of a metadata API on Dublin Core, given that it is the predominant document metadata schema in use on the Web (schemas there either use it directly or are heavily informed by it). Due to its origins in OCLC (publishers of WorldCat), Dublin Core is used in almost the entirety of the world's libraries for lightweight interchangeable metadata and is compatible with and/or the basis of the designs used by the W3C and its "semantic web".

When these terms don't suffice there are a variety of ways to extend the set. An accepted way to do this is to create and publish (i.e., post on the web) an "application profile" for the local customisations made. I am willing to both design and create the necessary documents for a Dublin Core application profile for JSPWiki.
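To make the pluggable idea a little more concrete, here is a minimal sketch of what a meta API with a Dublin Core reference implementation might look like. Everything in it is an assumption for illustration: the names `MetaApiSketch`, `CoreField`, `MetadataSchema` and `DublinCoreSchema` are hypothetical, the choice of six core fields is mine, and nothing here reflects existing JSPWiki or JCR code.

```java
import java.util.EnumMap;
import java.util.Locale;
import java.util.Map;

public class MetaApiSketch {

    /** Hypothetical core fields that every schema implementation must be able to name. */
    enum CoreField { CREATED, MODIFIED, CREATOR, CONTRIBUTOR, IDENTIFIER, FORMAT }

    /** Hypothetical pluggable API: the implementation assigns the actual descriptor. */
    interface MetadataSchema {
        String descriptorFor(CoreField field);
    }

    /** Sketch of a reference implementation that answers with Dublin Core term names. */
    static final class DublinCoreSchema implements MetadataSchema {
        private final Map<CoreField, String> names = new EnumMap<>(CoreField.class);

        DublinCoreSchema() {
            // Each core field above happens to share its name with a DCTERMS term.
            for (CoreField f : CoreField.values()) {
                names.put(f, "DCTERMS." + f.name().toLowerCase(Locale.ROOT));
            }
        }

        @Override
        public String descriptorFor(CoreField field) {
            return names.get(field);
        }
    }

    public static void main(String[] args) {
        MetadataSchema schema = new DublinCoreSchema();
        for (CoreField f : CoreField.values()) {
            System.out.println(f + " -> " + schema.descriptorFor(f));
        }
    }
}
```

An application with an existing schema would supply its own `MetadataSchema` and return its own descriptors for the same core fields; the rest of the wiki would only ever ask the API for a field, never for a literal property name.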
Examples of application profile documents (each backed by an RDF document) are at

http://dublincore.org/documents/2004/09/10/library-application-profile/
http://www.natlib.govt.nz/dr/drterms.html
http://www.natlib.govt.nz/dr/terms# (RDF document)

Note that there is no requirement that an application profile be either submitted to or approved by the DCMI. It's just playing nicely by the rules to do so. For our purposes it'd just be a published web page plus a static RDF document.

Below is the name and online comments for each proposed term, followed by my comments and recommendation for the term to be used in 3.0. I've sorted the list to begin with those terms that can be supported directly by the existing Dublin Core terms, followed by a set of terms to be defined within a JSPWiki application profile. Within the profile would be references to equivalent terms in other schemas where available and appropriate.

I note that many of the proposed field names come from Atom. While this is perhaps an appropriate usage, Atom is a syndication schema, not a content repository schema. There's not a huge difference, and Atom is in large part (semantically) compatible with and influenced by Dublin Core (e.g., the choice of atom:contributor). For documents stored in a repository I believe Dublin Core is likely more appropriate.

----

! Historical Note

Historically, there are two Dublin Core schemas, DC.* and DCTERMS.*. The original core set of fifteen Dublin Core Metadata Elements (DC.*) has been grandfathered into the set of DC Terms (DCTERMS, see footnote). For our purposes below, we can consider DC.* and DCTERMS.* as identical namespaces (they by definition now are). There used to be a qualification scheme whereby, e.g., DC.date could be qualified as DC.date.modified, but this has been dropped in favour of having most of these qualified terms become full terms in their own right within the DCTERMS namespace.
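As a rough illustration of how the older names line up with the newer ones, the sketch below rewrites a legacy name into DCTERMS form. It rests on an assumption that I want to flag loudly: that a qualified name "DC.element.qualifier" maps to "DCTERMS.qualifier" and an unqualified "DC.element" maps to "DCTERMS.element". That holds for the names used in this note (e.g., DC.date.modified), but since only most qualified terms were promoted it should not be read as a complete rule. The class and method names are hypothetical.

```java
public class LegacyDcNames {

    /**
     * Rewrites a legacy DC.* name to DCTERMS.* form.
     * Assumption (not a DCMI rule): the last dot-separated part is the term name,
     * so "DC.date.modified" becomes "DCTERMS.modified" and "DC.creator"
     * becomes "DCTERMS.creator". Anything else is returned unchanged.
     */
    static String toDcterms(String name) {
        String[] parts = name.split("\\.");
        boolean legacy = (parts.length == 2 || parts.length == 3)
                && parts[0].equalsIgnoreCase("DC");
        if (!legacy) {
            return name; // not a legacy DC name; leave it untouched
        }
        return "DCTERMS." + parts[parts.length - 1];
    }

    public static void main(String[] args) {
        String[] samples = { "DC.date.modified", "DC.date.created", "DC.creator", "wiki:acl" };
        for (String s : samples) {
            System.out.println(s + " -> " + toDcterms(s));
        }
    }
}
```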
Where they exist, I've included the DC.* term or qualified term in square brackets below.

----

* atom:updated
  As in RFC 4287. This is a DATE.
  Recommendation: Use DCTERMS.modified. [DC.date or DC.date.modified] DATE.

* atom:published
  As in RFC 4287. As JSPWiki does not yet support "draft" pages, this is essentially a creation date. NB: This cannot be checked from page version #1, because that might be deleted. This is a DATE.
  Recommendation: Use DCTERMS.created. Agreed: this must be carried through all revisions since it provides a canonical container for the origin date of the document. [DC.date or DC.date.created] DATE.

* atom:id
  As in RFC 4287. This has some advantages, and can easily be tied to the JCR jcr:uuid. This is a STRING.
  Recommendation: Use DCTERMS.identifier. [DC.identifier] STRING (URI?)

* wiki:creator
  As in atom:published, the creator probably needs to be stored separately. Though on wikipages it might not be that useful. This is TBD.
  Recommendation: Use DCTERMS.creator. The Atom specification seems to borrow extensively from DC, with atom:author identical with the concept of DC.creator (they apparently just didn't like the term 'creator' and changed it to 'author'), but it does use 'contributor' in the same manner (again, paraphrasing the terminology from DC). This will need to occur in all revisions since we need to maintain the original author ID regardless of the existence of a given revision. [DC.creator] STRING.

* wiki:author
  Denotes the Identity of the user who saved this version of the page. This should probably be a reference to the user identity. It should also have a useful value in case the modification is done by the system automatically. This value should never be anything meaningless - in fact, I think that PageManager should throw an Exception if there is a missing attribute when saved. This is TBD.
  Recommendation: Use DCTERMS.contributor. The idea with DC.creator and DC.contributor is that the former is the original creator (author) of a resource, and any subsequent contributions (editing, translation, etc.) are considered as being done by a 'contributor'. For the original author, see wiki:creator (DC.creator) above. [DC.contributor] STRING.

* wiki:ipaddr
  The IP address where the last change occurred. The SpamFilter might then add some additional tags (in its own namespace). This is a STRING.
  Recommendation: Use new application profile wiki:ipaddr. STRING.

* wiki:content
  The actual content as a binary stream (BINARY).
  Recommendation: Use new application profile wiki:content. Question: why a binary stream and not a STRING?

* wiki:contentType
  The MIME type of the content. JSPWiki markup shall be denoted as "text/x-wiki.jspwiki". Creole as "text/x-wiki.creole". Other types are also allowed, e.g. "text/html" or "image/jpeg".
  Recommendation: Use DCTERMS.format. This is the term used to contain a format identifier. While I recognise that these discussions tend to devolve rather quickly, I would highly recommend considering the MIME or Internet Media Type as "application/*" instead of "text/*", e.g., "application/x-wiki+jspwiki". The history of "text/html" vs. "application/html" suggests that text formats requiring a significant amount of processing to render generally move towards being considered more an application than a text format (i.e., while they may be largely human readable, they quickly become indecipherable or largely unreadable in practice when used with plugins and other complex syntax, e.g., many if not most pages on Wikipedia). [DC.format] STRING.

* wiki:acl
  The access control list for this page. Format TBD.
  Recommendation: Use new application profile wiki:acl. TBD.

* wiki:changenote
  A simple, text/plain description of the note of the change. STRING.
  Recommendation: Use new application profile wiki:changenote. The way to do this in Dublin Core would likely be considered too complicated for this application. The change note needs to be considered as metadata of the revision, not the document. STRING.

* wiki:state
  Essentially an Enum defining the state of the page. Can be EXISTS or DELETED. Format TBD.
  Recommendation: Use new application profile wiki:state. Enumerated value set. STRING.

* wiki:minorchange
  This change is minor, and should not be shown in the changelog, though an actual change has been made.
  Recommendation: Use new application profile wiki:minorchange. Not labeled as BOOLEAN but seems to be.

----

In summary, while I see Atom as interesting and in large part semantically compatible with Dublin Core, I think it'd be better to incorporate a schema that was designed more specifically for resources than for feeds; the definitions fit more closely with our usage. As I mentioned above, I consider these merely comments-on-the-path.

Murray

DCTERMS: The set of Dublin Core terms is found at http://dublincore.org/documents/dcmi-terms/

...........................................................................
Murray Altheim <murray07 at altheim.com>
http://www.altheim.com/murray/
SGML Grease Monkey, Banjo Player, Wantanabe Zen Monk

Boundless wind and moon - the eye within eyes,
Inexhaustible heaven and earth - the light beyond light,
The willow dark, the flower bright - ten thousand houses,
Knock at any door - there's one who will respond.
-- The Blue Cliff Record
