John,

There are several issues here.

1. How To Represent Formatted Text

Many applications require the ability to use formatted text as the value of properties. This is especially true in the Requirements space, where people often use textual formatting to signify meaning. Many formats are used and have their advocates, e.g. RTF, HTML, and wiki text. In order to promote interchange, we decided to use one format for formatted text, namely XHTML. The reasons for choosing XHTML are that it's a W3C standard (like RDF), it's relatively easy to parse (unlike HTML), and it can be conveniently represented in RDF using the XMLLiteral datatype.

2. <div> versus <span>

Some properties, like dcterms:description, are intended to contain multi-line formatted text. They should contain valid <div> content. Other properties, like dcterms:title, are intended to contain single-line text. They should contain valid <span> content. The OSLC wiki unfortunately has some errors caused by careless copy and paste. I've reported this and Steve has committed to fixing the errors.

3. Formatted versus Plain Text

Plain text is a subset of formatted text. However, if the plain text contains special XML characters, they need to be replaced by character entities when used in XMLLiteral values. An application that only handles plain text can easily escape and unescape the special characters.

The problem comes when a client POSTs or PUTs formatted text. In this case, I think it is acceptable to discard the markup if the loss of the formatting is not harmful. If the app does not natively support formatting then it's hard to see where discarding formatting would be harmful. Discarding formatting can easily be done using standard XML parsing libs.

The inability to handle formatting is like other inherent limitations in the app. For example, there may be limits to string length, integer size, or precision of floating point numbers. The app has to decide which types of truncation are harmless and which might cause harm.
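
To make the escaping and markup-discarding steps in point 3 concrete, here is a minimal sketch using only the Python standard library. The function names and the wrapper element are hypothetical, chosen for illustration; nothing here is defined by OSLC. It assumes the incoming value is well-formed XML that uses only the predefined XML entities.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape


def plain_to_xml_literal(text: str) -> str:
    """Escape &, < and > so plain text can be used as an XMLLiteral value."""
    return escape(text)


def xml_literal_to_plain(fragment: str) -> str:
    """Discard all markup from an XMLLiteral value and return only its text.

    The fragment is wrapped in a throwaway element (hypothetical name) so
    that both bare escaped text and <div>/<span> content parse the same way.
    Parsing also turns the predefined entities back into characters.
    """
    root = ET.fromstring("<wrapper>" + fragment + "</wrapper>")
    return "".join(root.itertext())


if __name__ == "__main__":
    print(plain_to_xml_literal("a < b & c"))
    print(xml_literal_to_plain(
        '<div xmlns="http://www.w3.org/1999/xhtml">'
        "Fix <b>login</b> &amp; logout</div>"
    ))
```

A server that stores only plain strings could apply something like xml_literal_to_plain to incoming values and plain_to_xml_literal to outgoing ones. A value that is not well-formed raises xml.etree.ElementTree.ParseError, which the app could map to an error response.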
If the truncation is harmful then the app should reject the request and provide some useful error message.

Regards,
___________________________________________________________________________
Arthur Ryman
DE, PPM & Reporting Chief Architect
IBM Software, Rational
Toronto Lab | +1-905-413-3077 (office) | +1-416-939-5063 (mobile)

From: John Arwe <[email protected]>
To: [email protected]
Cc: Joe Ross <[email protected]>, Anamitra Bhattacharyya <[email protected]>, Robert Uthe <[email protected]>, Ken Parzygnat <[email protected]>
Date: 11/15/2011 05:40 PM
Subject: [oslc-core] XHTML vs simple text in OSLC Core's common properties
Sent by: [email protected]

A subset of common properties has value-type = XMLLiteral and a description that says the content [should] include valid XHTML-within-<div> [1]; description and title are examples. Most of Tivoli's existing product set expects "just straight strings" (no rich text/XHTML permitted; markup would be treated as strings, i.e. not recognized as markup).

I'm trying to avoid re-inventing new while still enabling existing products that would not, as servers processing a POST(Create)/PUT/PATCH, tolerate XHTML coming in. I see oslc:name (within Resource Shapes) which is potentially re-usable (RS is part of Common as a resource definition, but oslc:name is not listed in common properties).

I would be interested in thoughts from Core on how best to accomplish the goal of enabling apps not yet ready to change to accept XHTML as described above.

(1) One possibility would be to define a Common Property whose value is simply "String"; one might re-use oslc:name for that purpose (to avoid defining new) or simply define new.

(2) I wonder out loud about an alternative of using the existing common properties with an explicit type of ^^String on the [RDF/XML at least... :-( want JSON too though] serialized representations.
But I find that not so appealing, since (at a bare minimum) it imposes requirements on client implementations to use serialization rules more restrictive than those defined in Core in order for one of my servers to accept the data.

(3) Becoming even more of a language lawyer than usual and noting that the existing descriptions of the relevant properties use a conditional (SHOULD), and the domain specs (like CM 2.0) only impose normative requirements on implementations (not representations). So compliant service providers (should? must? not clear!) tolerate sans-XHTML values (good for me), and compliant clients (should, by my reading) provide with-XHTML values which my providers (may? should? must? not clear!) accept... I choose the "should" reading, but do not implement that, so my provider is compliant but less useful than ones that would accept with-XHTML values.

The reason I assert "not clear!" is: specs like CM 2.0 [2] based on Core say "OSLC CM consumers and service providers MUST be compliant with both the core specification and this CM specification, and SHOULD follow all the guidelines and recommendations in both these specifications." I.e. they talk about compliance only in terms of consumers and providers, not resources. In a case like [1]'s oslc:shortTitle, whose entire description is "Shorter form of dcterms:title for the resource represented as rich text in XHTML content. SHOULD include only content that is valid inside an XHTML <div> element.", it is left to the reader to decide the effects of the SHOULD. There is no clear statement of responsibilities for service providers or consumers.
While my reading would be that compliant service providers MUST tolerate sans-XHTML values (good for me), compliant clients should provide with-XHTML values, and compliant service providers MUST accept with-XHTML values, if my evil twin read the last MUST as a SHOULD and challenged me to show which normative statement was violated then I would be hard pressed to find one. If I change my goal to practical interop rather than trying to minimize the cost of shoe-horning my existing implementation within the letter of the spec, I have a reasonable case to argue for MUST.

[Aside and fair disclosure: [1] does in at least one place appear to attempt to place normative restrictions on a resource - foaf:person. But I find no place in Core that defines compliance, so we revert to domain specs like CM 2.0 and the identical problem.]

(4) Clarify the meaning of "... SHOULD include only content that is valid inside an XHTML <div> element." with respect to implementations, and then see where I stand. The preceding seems ample evidence that the current text is ambiguous.

(5) Define an extension property(ies) that lack the XHTML restriction and use those until my implementations learn to recognize it as markup when present. Which, in the case of a CM 2.0 ChangeRequest, means that it would be a gating factor in becoming compliant (dcterms:title = 1:1) as a service provider.

(6) Accept the with-XHTML values but do not render them in my UI. Seems within my power at least, although not perfect. The horse/water meme :-)

One could draw the conclusion OSLC assumes a Web-based UI when it requires these XHTML-enabled fields. Is that an explicit intent of OSLC? If it is UNintentional, 1:1 on XHTML-enabled strings would appear to be an anti-pattern. Requiring a value and encouraging that value to contain XHTML but then saying "well you don't have to display them ever" seems incoherent - if they're not for display, why XHTML?
[1] http://open-services.net/bin/view/Main/OSLCCoreSpecAppendixA?sortcol=table;table=up#OSLC_Properties
[2] http://open-services.net/bin/view/Main/CmSpecificationV2?sortcol=table;table=up#Compliance

Best Regards, John

Voice US 845-435-9470
BluePages
Tivoli OSLC Lead - Show me the Scenario

_______________________________________________
Oslc-Core mailing list
[email protected]
http://open-services.net/mailman/listinfo/oslc-core_open-services.net
