Re: [oslc-core] XHTML vs simple text in OSLC Core's common properties

Arthur Ryman Thu, 24 Nov 2011 16:40:38 -0500

John,

There are several issues here.


1. How To Represent Formatted Text

Many applications require the ability to use formatted text as the value 
of properties. This is especially true in the Requirements space where 
people often use textual formatting to signify meaning. Many formats are 
used and have their advocates, e.g. RTF, HTML, and wiki text. In order to 
promote interchange, we decided to use one format for formatted text, 
namely XHTML. The reasons for choosing XHTML are that it's a W3C standard 
(like RDF), it's relatively easy to parse (unlike HTML), and it can be 
conveniently represented in RDF using the XMLLiteral datatype.

2. <div> versus <span>

Some properties, like dcterms:description, are intended to contain 
multi-line formatted text. They should contain valid <div> content. Other 
properties, like dcterms:title, are intended to contain single-line text. 
They should contain valid <span> content. The OSLC wiki unfortunately has 
some errors caused by careless copy and paste. I've reported this and 
Steve has committed to fix the errors.

3. Formatted versus Plain Text

Plain text is a subset of formatted text. However, if the plain text 
contains special XML characters, they need to be replaced by character 
entities when used in XMLLiteral values.
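
As a minimal sketch of that escaping step, here it is in Python using only the standard library. The sample string is made up for illustration, and nothing here is prescribed by OSLC Core:

```python
from xml.sax.saxutils import escape, unescape

# Hypothetical plain-text value containing special XML characters.
plain = "Speed < 10 m/s & load > 5 kg"

# escape() replaces &, < and > with &amp;, &lt; and &gt;.
literal = escape(plain)
print(literal)

# unescape() reverses the replacement, so the round trip is lossless.
assert unescape(literal) == plain
```

The escaped string is what would be placed inside the XMLLiteral value; the unescaped string is what a plain-text app would store.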

An application that only handles plain text can easily escape and unescape 
the special characters. The problem comes when a client POSTs or PUTs 
formatted text. In this case, I think it is acceptable to discard the 
markup if the loss of the formatting is not harmful. If the app does not 
natively support formatting then it's hard to see where discarding 
formatting would be harmful. Discarding formatting can easily be done 
using standard XML parsing libs.
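
One way to discard the markup, sketched in Python with the standard library parser. The function name discard_markup and the choice to collapse whitespace are my assumptions, not anything the spec requires:

```python
import xml.etree.ElementTree as ET


def discard_markup(xhtml_fragment: str) -> str:
    """Return only the text content of a well-formed XHTML fragment."""
    # Wrap the fragment in a dummy root (hypothetical name) so that input
    # with several top-level nodes, or none at all, still parses.
    root = ET.fromstring(f"<root>{xhtml_fragment}</root>")
    # itertext() yields the text of every nested element in document order;
    # the parser has already turned &lt;, &gt; and &amp; back into characters.
    text = "".join(root.itertext())
    # Collapse runs of whitespace left behind by the removed elements.
    return " ".join(text.split())


value = '<div xmlns="http://www.w3.org/1999/xhtml">Fix the <b>login</b> page &amp; retest</div>'
print(discard_markup(value))
```

Note that this parser rejects input that is not well-formed XML, and it does not know HTML named entities such as &nbsp; unless they are declared, so a real server would need to decide how to handle those cases.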

The inability to handle formatting is like other inherent limitations in 
the app. For example, there may be limits to string length, integer size, 
or precision of floating point numbers. The app has to decide which types 
of truncation are harmless and which might cause harm. If the truncation 
is harmful then the app should reject the request and provide some useful 
error message.
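
To illustrate rejecting rather than silently truncating, here is a sketch in Python. The limit of 255 characters, the exception name and the function name are all hypothetical; how the error is mapped to an HTTP response is left to the server:

```python
MAX_TITLE_LENGTH = 255  # hypothetical storage limit of the app


class TitleTooLong(ValueError):
    """Raised when truncating the value would lose information."""


def check_title(text: str) -> str:
    # This sketch treats any truncation of a title as harmful, so it
    # rejects the value with a message that tells the client what to fix.
    if len(text) > MAX_TITLE_LENGTH:
        raise TitleTooLong(
            f"title has {len(text)} characters; "
            f"this server stores at most {MAX_TITLE_LENGTH}"
        )
    return text
```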

Regards, 
___________________________________________________________________________ 

Arthur Ryman 

DE, PPM & Reporting Chief Architect
IBM Software, Rational 
Toronto Lab | +1-905-413-3077 (office) | +1-416-939-5063 (mobile) 





From: John Arwe <[email protected]>
To: [email protected]
Cc: Joe Ross <[email protected]>, Anamitra Bhattacharyya <[email protected]>, Robert Uthe <[email protected]>, Ken Parzygnat <[email protected]>
Date: 11/15/2011 05:40 PM
Subject: [oslc-core] XHTML vs simple text in OSLC Core's common properties
Sent by: [email protected]



A subset of common properties has value-type = XMLLiteral and a 
description that says the content [should] include valid 
XHTML-within-<div> [1]; description and title are examples. 
Most of Tivoli's existing product set expects "just straight strings" (no 
rich text/XHTML permitted; markup would be treated as strings, i.e. not 
recognized as markup).  I'm trying to avoid re-inventing new while still 
enabling existing products that would not, as servers processing a 
POST(Create)/PUT/PATCH, tolerate XHTML coming in.  I see oslc:name (within 
Resource Shapes) which is potentially re-usable (RS is part of Common as a 
resource definition, but oslc:name is not listed in common properties). 
I would be interested in thoughts from Core on how best to accomplish the 
goal of enabling apps not yet ready to change to accept XHTML as described 
above.   
(1) One possibility would be to define a Common Property whose value is 
simply "String"; one might re-use oslc:name for that purpose (to avoid 
defining new) or simply define new.   
(2) I wonder out loud about an alternative of using the existing common 
properties with an explicit type of ^^String on the [RDF/XML at least... 
:-( want JSON too though] serialized representations.  But I find that not 
so appealing, since (at a bare minimum) it imposes requirements on client 
implementations to use serialization rules more restrictive than those 
defined in Core in order for one of my servers to accept the data. 
(3) Becoming even more of a language lawyer than usual and noting that the 
existing descriptions of the relevant properties use a conditional 
(SHOULD), and the domain specs (like CM 2.0) only impose normative 
requirements on implementations (not representations).  So compliant 
service providers (should? must? not clear!) tolerate sans-XHTML values 
(good for me), and compliant clients (should, by my reading) provide 
with-XHTML values which my providers (may? should? must? not clear!) 
accept... I choose the "should" reading, but do not implement that, so my 
provider is compliant but less useful than ones that would accept 
with-XHTML values.  The reason I assert "not clear!" is: specs like CM 2.0 
[2] based on Core say "OSLC CM consumers and service providers MUST be 
compliant with both the core specification and this CM specification, and 
SHOULD follow all the guidelines and recommendations in both these 
specifications. "  I.e. they talk about compliance only in terms of 
consumers and providers, not resources.  In a case like [1]'s 
oslc:shortTitle, whose entire description is "Shorter form of 
dcterms:title for the resource represented as rich text in XHTML content. 
SHOULD include only content that is valid inside an XHTML <div> element. 
", it is left to the reader to decide the effects of the SHOULD.  There is 
no clear statement of responsibilities for service providers or consumers. 
 While my reading would be that compliant service providers MUST tolerate 
sans-XHTML values (good for me), compliant clients should provide 
with-XHTML values, and compliant service providers MUST accept with-XHTML 
values, if my evil twin read the last MUST as a SHOULD and challenged me 
to show which normative statement was violated then I would be hard 
pressed to find one.  If I change my goal to practical interop rather than 
trying to minimize the cost of shoe-horning my existing implementation 
within the letter of the spec, I have a reasonable case to argue for MUST. 
 [Aside and fair disclosure: [1] does in at least one place appear to 
attempt to place normative restrictions on a resource - foaf:person.  But 
I find no place in Core that defines compliance, so we revert to domain 
specs like CM 2.0 and the identical problem.] 
(4) Clarify the meaning of "... SHOULD include only content that is valid 
inside an XHTML <div> element. " with respect to implementations, and then 
see where I stand.  The preceding seems ample evidence that the current 
text is ambiguous. 
(5) Define an extension property(ies) that lack the XHTML restriction and 
use those until my implementations learn to recognize it as markup when 
present.  Which, in the case of a CM 2.0 ChangeRequest, means that it 
would be a gating factor in becoming compliant (dcterms:title = 1:1) as a 
service provider. 
(6) Accept the with-XHTML values but do not render them in my UI.  Seems 
within my power at least, although not perfect.  The horse/water meme :-) 
One could draw the conclusion OSLC assumes a Web-based UI when it requires 
these XHTML-enabled fields.  Is that an explicit intent of OSLC?  If it is 
UNintentional, 1:1 on XHTML-enabled strings would appear to be an 
anti-pattern.  Requiring a value and encouraging that value to contain 
XHTML but then saying "well you don't have to display them ever" seems 
incoherent - if they're not for display, why XHTML? 
[1] http://open-services.net/bin/view/Main/OSLCCoreSpecAppendixA?sortcol=table;table=up#OSLC_Properties
[2] http://open-services.net/bin/view/Main/CmSpecificationV2?sortcol=table;table=up#Compliance


Best Regards, John

Voice US 845-435-9470  BluePages 
Tivoli OSLC Lead - Show me the Scenario 
_______________________________________________
Oslc-Core mailing list
[email protected]
http://open-services.net/mailman/listinfo/oslc-core_open-services.net
