Arthur, I understand that OSLC is taking a larger problem domain into account when defining syntax and data types, but I'm not arguing about that. I'm arguing that in the specific case of the compact spec, it makes it much less consumable. Generalizing is not always a good thing if it makes life more difficult for some of the cases. Perhaps what you are suggesting with the oslc:htmlEncodedTitle idea is the right way to address this, but all I'm advocating is that we not make this more complicated for consumers than it has been for the last 3 years. If that means a change to the spec, so be it.
Adam Archer
Jazz Developer
IBM Toronto Lab

From: Arthur Ryman/Toronto/IBM
To: Adam Archer/Toronto/IBM@IBMCA
Cc: Randy Hudson <[email protected]>, "[email protected]" <[email protected]>, [email protected], Samuel Padgett <[email protected]>
Date: 08/23/2011 10:40 AM
Subject: Re: [oslc-core] OSLC Compact representation, titles with markup

Adam,

Your argument for escaping HTML is based purely on the use case of compact rendering. However, the OSLC specs are designed to enable many other use cases and to promote general interoperability among different tools.

The OSLC core spec provides guidelines for the use of some common RDF properties, including those from the Dublin Core. [1] dcterms:title and dcterms:description in general contain text that may include formatting information. There were several candidates for syntax - HTML, Wiki, RTF. The RDF spec contains the means for including XML literals, and XHTML is the W3C standard for formatted text. All the other text formats are convertible to XHTML. It was therefore chosen as the recommended way to package rich text.

The historical design of Jazz compact rendering is optimized for consumption by web browsers. However, the RDF plain text datatype does NOT imply that the text is HTML encoded. Another processor would be justified in displaying the verbatim encoding.

However, for the case of OSLC preview, I see no reason why the text couldn't be encoded, but it should be put in a property that makes that clear, e.g. oslc:htmlEncodedTitle.

FYI, I work on Focal Point and its REST API provides both a plain text and a marked up text version of some attributes.
[1] http://open-services.net/bin/view/Main/OSLCCoreSpecAppendixA?sortcol=table;up=#Dublin_Core_Properties

Regards,
___________________________________________________________________________
Arthur Ryman
DE, PPM Chief Architect
IBM Software, Rational
Toronto Lab | +1-905-413-3077
Twitter | Facebook | YouTube

From: Adam Archer/Toronto/IBM
To: Arthur Ryman/Toronto/IBM@IBMCA
Cc: Samuel Padgett <[email protected]>, Randy Hudson <[email protected]>, "[email protected]" <[email protected]>, [email protected]
Date: 08/22/2011 05:00 PM
Subject: Re: [oslc-core] OSLC Compact representation, titles with markup

The big concern to me is not the ability to process the RDF/XML with XPath; it's the ability to do so in a browser environment. Currently all implementations of all rich hovers in all Jazz-based products encode any HTML tags in their dcterms:title attributes (and doubly encode special characters). For the consumer on the browser side, this means simply taking the content of the attribute, decoding it (which browsers are very good at) and slapping the result into the DOM (which browsers are also very good at).

The alternative would be a total consumability nightmare from the point of view of a browser (which is the most important consumer of this entire spec). If the tags are actually child nodes in the XML representation, the document we get back from the XMLHttpRequest will contain child elements, which means we will have to traverse a DOM tree and recreate a structure that could easily be represented as an escaped string, like everyone is doing today.

I realize that implementation is not supposed to lead the spec, but I don't even think that would be the case here.
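To make the consumer-side difference concrete, here is a minimal sketch in Python, using the standard library's ElementTree in place of a browser DOM. The two sample documents and the helper names (`markup_from_escaped`, `markup_from_literal`) are illustrative assumptions, not taken from any spec or product, and a real XML literal would also carry rdf:parseType="Literal", which is omitted here for brevity.

```python
import xml.etree.ElementTree as ET

# Form 1 (hypothetical sample): markup carried as child elements.
LITERAL_FORM = (
    '<dcterms:title xmlns:dcterms="http://purl.org/dc/terms/">'
    "12345: <s>Null pointer exception during startup</s>"
    "</dcterms:title>"
)

# Form 2 (hypothetical sample): markup escaped into the text content.
ESCAPED_FORM = (
    '<dcterms:title xmlns:dcterms="http://purl.org/dc/terms/">'
    "12345: &lt;s&gt;Null pointer exception during startup&lt;/s&gt;"
    "</dcterms:title>"
)


def markup_from_escaped(title):
    """Escaped form: the XML parser has already decoded the text,
    so the markup string is simply the element's text content."""
    return title.text or ""


def markup_from_literal(title):
    """Child-element form: the consumer has to walk the children and
    re-serialize each one to rebuild the markup string."""
    parts = [title.text or ""]
    for child in title:
        # tostring() emits the child element followed by its tail text.
        parts.append(ET.tostring(child, encoding="unicode"))
    return "".join(parts)


escaped = markup_from_escaped(ET.fromstring(ESCAPED_FORM))
literal = markup_from_literal(ET.fromstring(LITERAL_FORM))
print(escaped == literal)
```

Both helpers end up with the same markup string; the difference is that the escaped form needs only a single property read, while the child-element form needs a traversal.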
The OSLC Compact spec grew organically out of the old Jazz compact rendering spec, which can be found here:

https://jazz.net/wiki/bin/view/Sandbox/CompactRenderingV1P1

If we look at the semantic description of dc:title and jp:abbreviation, it states explicitly that the content MUST be escaped:

> The HTML markup MUST be escaped; for example, "<b>" as "&lt;b&gt;".

This decision was made consciously for very well defined technical reasons (discussed above) in the original spec. If that decision was reversed in the creation of the OSLC Compact spec, then I believe that to have been a huge mistake. I would like to see the spec fixed rather than have all providers change how their compact documents are served and all consumers go to the trouble of walking the DOM to determine what the provider is actually trying to show.

Adam Archer
Jazz Developer
IBM Toronto Lab

From: Arthur Ryman/Toronto/IBM
To: Samuel Padgett <[email protected]>
Cc: Adam Archer/Toronto/IBM@IBMCA, Randy Hudson <[email protected]>, "[email protected]" <[email protected]>, [email protected]
Date: 08/22/2011 04:40 PM
Subject: Re: [oslc-core] OSLC Compact representation, titles with markup

Sam,

You wrote:

> It's very difficult to parse the former using XPath. For instance, the expression "/oslc:Compact/dcterms:title" takes out the "<s>" and "</s>"

I don't think problems using XPath are a valid reason to encode markup, since RDF/XML itself is very difficult to process using XPath. At one point we tried to define an OSLC variant of RDF/XML that looked like "normal" XML. However, we abandoned that and now require support for generic RDF/XML. There are many equivalent ways to represent a given set of triples in RDF/XML. It would therefore be very problematic to use XPath, XSLT, or XQuery to process RDF/XML. The safe way to process RDF/XML is to use an RDF toolkit like Jena.
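As a rough illustration of the point about equivalent serializations, the sketch below writes the same two triples (a type and a title) in two valid RDF/XML forms and runs one fixed path expression against both. The resource URI under example.com and the helper `find_title` are made up for the example, and Python's ElementTree path support stands in for a full XPath engine.

```python
import xml.etree.ElementTree as ET

NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "dcterms": "http://purl.org/dc/terms/",
    "oslc": "http://open-services.net/ns/core#",
}
XMLNS = " ".join(f'xmlns:{prefix}="{uri}"' for prefix, uri in NS.items())

# Serialization A: typed node element with a property element.
TYPED_NODE = f"""<rdf:RDF {XMLNS}>
  <oslc:Compact rdf:about="http://example.com/bugs/12345">
    <dcterms:title>12345: Null pointer exception during startup</dcterms:title>
  </oslc:Compact>
</rdf:RDF>"""

# Serialization B: the same triples, written with rdf:Description,
# an explicit rdf:type, and the title as a property attribute.
PROPERTY_ATTRIBUTE = f"""<rdf:RDF {XMLNS}>
  <rdf:Description rdf:about="http://example.com/bugs/12345"
      dcterms:title="12345: Null pointer exception during startup">
    <rdf:type rdf:resource="http://open-services.net/ns/core#Compact"/>
  </rdf:Description>
</rdf:RDF>"""


def find_title(document):
    """Look up the title with one hard-coded path, the way an
    XPath-based consumer would. Returns None when the path misses."""
    root = ET.fromstring(document)
    element = root.find("oslc:Compact/dcterms:title", NS)
    return None if element is None else element.text


print(find_title(TYPED_NODE))
print(find_title(PROPERTY_ATTRIBUTE))
```

The path matches serialization A and misses serialization B, even though an RDF parser would report the same triples for both. That is the kind of fragility an RDF toolkit avoids.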
Regards,
___________________________________________________________________________
Arthur Ryman
DE, PPM Chief Architect
IBM Software, Rational
Toronto Lab | +1-905-413-3077
Twitter | Facebook | YouTube

From: Samuel Padgett <[email protected]>
To: "[email protected]" <[email protected]>
Cc: Adam Archer/Toronto/IBM@IBMCA, Randy Hudson <[email protected]>
Date: 08/07/2011 01:01 PM
Subject: [oslc-core] OSLC Compact representation, titles with markup
Sent by: [email protected]

I believe the spec is a bit confusing when it comes to titles with markup for UI Preview. The Compact representation has a dcterms:title property. It's defined as an XML Literal that can contain XHTML markup [1]. My understanding of XML Literals as discussed in the RDF Primer [2] means a title with markup would look like this,

    <dcterms:title>12345: <s>Null pointer exception during startup</s></dcterms:title>

The example [3] of this resource has a title like this, however,

    <dcterms:title>
      12345: &lt;s&gt;Null pointer exception during startup&lt;/s&gt;
    </dcterms:title>

The example doesn't seem to fit with the description.

It's very difficult to parse the former using XPath. For instance, the expression "/oslc:Compact/dcterms:title" takes out the "<s>" and "</s>"

Most implementations I'm aware of also follow the example, where markup is encoded. It means special characters need to be "double encoded." For instance, "12345: Values > 1000 incorrectly calculated" would be,

    <dcterms:title>12345: Values &amp;gt; 1000 incorrectly calculated</dcterms:title>

I think we should add more clarity to the spec here, as getting this wrong can open up consumers to cross-site scripting attacks. I'd also suggest we say that providers MUST NOT use any markup with a <script> tag and consumers MUST NOT display any markup with a <script> tag to guard against this problem.
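The "double encoding" described above can be sketched as two separate steps: first the plain text becomes an HTML string, then that HTML string becomes XML text content. The following is only an illustration using Python's standard library; the helper names `encode_title` and `decode_title` are hypothetical and do not come from the spec or from any implementation.

```python
import html
import xml.etree.ElementTree as ET

DCTERMS = "http://purl.org/dc/terms/"
ET.register_namespace("dcterms", DCTERMS)


def encode_title(html_title):
    """Provider side: place an HTML string into dcterms:title as text.
    ElementTree escapes &, < and > when it serializes the element."""
    element = ET.Element(f"{{{DCTERMS}}}title")
    element.text = html_title
    return ET.tostring(element, encoding="unicode")


def decode_title(xml_text):
    """Consumer side: parsing the XML undoes the outer encoding and
    yields the HTML string that the provider started from."""
    return ET.fromstring(xml_text).text or ""


plain = "12345: Values > 1000 incorrectly calculated"
as_html = html.escape(plain, quote=False)  # first encoding: text to HTML
as_xml = encode_title(as_html)             # second encoding: HTML to XML text

print(as_xml)
print(html.unescape(decode_title(as_xml)) == plain)
```

A title that really contains markup goes through the same second step, so `<s>` is written as `&lt;s&gt;` in the document, while a literal ">" in the text ends up as `&amp;gt;`.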
Best Regards,
Sam

[1] http://open-services.net/bin/view/Main/OslcCoreUiPreview?sortcol=table;up=#Representation_Compact
[2] http://www.w3.org/TR/rdf-syntax/#xmlliterals
[3] http://open-services.net/bin/view/Main/OslcCoreUiPreview?sortcol=table;up=#XML_Representation_Format

_______________________________________________
Oslc-Core mailing list
[email protected]
http://open-services.net/mailman/listinfo/oslc-core_open-services.net
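For the <script> restriction suggested in the original message, a consumer-side check might look roughly like the sketch below. It assumes the title has already been decoded to an HTML string, and the function name `contains_script_tag` is made up for illustration. It covers only the <script> rule proposed in the thread; it is not a complete defence against cross-site scripting (it ignores event-handler attributes, for example).

```python
from html.parser import HTMLParser


class _ScriptTagFinder(HTMLParser):
    """Records whether any <script> start tag appears in the input."""

    def __init__(self):
        super().__init__()
        self.found = False

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag names before calling this method.
        if tag == "script":
            self.found = True


def contains_script_tag(markup):
    """Return True if the decoded title markup contains a <script> tag."""
    finder = _ScriptTagFinder()
    finder.feed(markup)
    finder.close()
    return finder.found


print(contains_script_tag("12345: <s>Null pointer exception during startup</s>"))
print(contains_script_tag("12345: <SCRIPT>alert(1)</SCRIPT>"))
```

A consumer following the proposed rule would refuse to display, or fall back to plain text for, any title where this check returns True.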
