Hi Arthur, Thanks for the engagement, for seeing both sides, and for figuring out what was going on with the W3C Validator (and submitting a problem report).
Regarding XHTML vs. HTML in general, I still think it would have been pragmatic to look at who is actually consuming/producing marked up text and where it's coming from/what's being done with it, to choose a format that minimizes the amount of conversion required. That said, I do see your reasons for favouring XHTML from the outset, and of course I recognize that the decision was made long ago and revisiting would have been difficult. Also, I do appreciate that you considered my point of view. On the particular issue of compact rendering, I would strongly advocate for option 2, defining a new datatype for HTML and using it together with the existing dcterms:title property. Defining such a type places no greater practical burden on providers or consumers than defining a new property. In either case, it's one new resource in the vocabulary to recognize, and they can handle values in exactly the same way (either by leaving the content as a string and leaving it to a browser to render, or by parsing and interpreting it themselves). However, using a new type separates the expression of the concerns in the standard RDF way: the property identifies the characteristic of the subject that the statement specifies, and the type suggests how to interpret the lexical form of the statement's object. Moreover, if we define a type, it can be reused with other properties, like dcterms:description, if that is ever needed. I would also suggest that the spec should explicitly provide guidance on typed vs. plain literals, hopefully in favour of the former. Cheers, Dave -- Dave Steinberg IBM Rational Software [email protected] From: Arthur Ryman/Toronto/IBM To: Dave Steinberg/Toronto/IBM@IBMCA Cc: [email protected], [email protected] Date: 08/31/2011 09:17 AM Subject: Re: [oslc-core] OSLC Compact representation, titles with markup Dave/Randy, Thx for persisting on this point. It turns out that the W3C RDF Validator is in fact displaying markup characters in strings wrong. 
It is escaping them. You can see the correct, unescaped, results by turning on the Advanced option of N-Triples output.

This discussion has made me realize that my suggested name for a new oslc:htmlEncodedTitle property is misleading. Encoding is only required when you put the triple in an XML document, e.g. the OSLC compact rendering resource. The encoding is removed by the parser and you end up with the unescaped string. Since we are defining RDF predicates, the reference to encoding is inappropriate because there is no encoding at the RDF value level.

We therefore have the following alternatives for markup in the title:
1. Use XML Literal datatype and XHTML content.
2. Define a new datatype for HTML.
3. Define a new predicate for HTML titles, e.g. oslc:htmlTitle.

Using HTML within the context of the UI preview is OK since the UI is expected to be a web UI and you'd just copy the string. However, I think using HTML in RDF is not a good idea because all readers of the data would then have to cope with it. I mentioned that Tidy could be used by the writer of the data to convert it to XHTML. That does not mean this is practical for all readers of the data. In general, when you are designing a format for interoperability, you should convert diverse formats into one common format. We should therefore adopt XHTML as the one common format for marked up text interchange. Recall that HTML is only one alternate format. We also have sources that produce rich text (RTF), and wiki text. Agreeing on XHTML is a useful simplification.
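The point that encoding exists only in the XML document, not in the value, can be checked with any XML parser. The following is a minimal sketch using Python's standard library; the bare `title` element is a simplification chosen for illustration, not the actual OSLC compact representation.

```python
import xml.etree.ElementTree as ET

# Inside an XML document, markup characters that belong to a string value
# must be escaped.
document = (
    "<title>12345: &lt;s&gt;Null pointer exception during startup&lt;/s&gt;"
    "</title>"
)

# The parser removes that escaping, so the value it hands back is the plain
# string with no trace of the encoding.
value = ET.fromstring(document).text
print(value)
```

The printed value contains a literal `<s>` and `</s>`, which is why a property name that mentions encoding would describe the serialization rather than the RDF value.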
Regards, ___________________________________________________________________________ Arthur Ryman DE, PPM & Reporting Chief Architect IBM Software, Rational Toronto Lab | +1-905-413-3077 (office) | +1-416-939-5063 (mobile) From: Arthur Ryman/Toronto/IBM@IBMCA To: Dave Steinberg/Toronto/IBM@IBMCA Cc: [email protected], [email protected] Date: 08/30/2011 05:49 PM Subject: Re: [oslc-core] OSLC Compact representation, titles with markup Sent by: [email protected] Dave, My point was that when you use rdf:datatype, the content of the element must be a string, not XML. When you use rdf:parseType="Literal" the content is expected to be XML. In the RDF data model, the lexical space of XML consists of well-formed XML fragments, i.e. there is no escaping other than that required by XML. You managed to get the rdf:datatype case to validate by escaping the XML markup, i.e. turning it into a string, which seems like unnecessary work if you already have an XML fragment. BTW, I don't understand why the W3C RDF Validation service is displaying the XML content as escaped. That means the data is actually double-escaped. I'd be happier seeing plain text N-Triples or Turtle. It seems to me that since RDF/XML is well-formed XML, then the natural way to include XML literals is as XML, not as a string that contains escaped XML markup. However, I concede your point that in principle we don't need rdf:parseType="Literal" if you are sure that we get exactly the same set of triples using just rdf:datatype. If so, you are correct in saying that rdf:parseType="Literal" is just syntactic sugar. I see where you are going with this. You want OSLC to create a new datatype for HTML and you are demonstrating that rdf:datatype gives you the mechanism to do this. As I said before, creating a new datatype will limit interoperability since other processors will not know how to process the new datatype. There is no standard way to define the meaning of a new RDF datatype. 
Regards, ___________________________________________________________________________ Arthur Ryman DE, PPM & Reporting Chief Architect IBM Software, Rational Toronto Lab | +1-905-413-3077 (office) | +1-416-939-5063 (mobile) From: Dave Steinberg/Toronto/IBM@IBMCA To: [email protected] Date: 08/26/2011 05:31 PM Subject: Re: [oslc-core] OSLC Compact representation, titles with markup Sent by: [email protected] Hi Arthur, Sorry, but I just don't agree. The two links you gave are both to the RDF/XML spec, and they describe a special syntax for XMLLiteral-typed literals and a general syntax for typed literals. They do not state that the general syntax cannot be used for the case of XMLLiteral, and they don't say anything that contradicts my understanding of the RDF abstract data model. Indeed, if you follow the "XML literals" link in Section 2.8, the RDF Concepts spec defines XMLLiteral, like any other datatype, with a lexical space, a value space and a mapping between the two. So, given any XML value, what is to prevent you from using that mapping to compute a corresponding lexical form, combining it with the datatype URI, and using the ordinary literal notation (in any RDF concrete syntax)? 
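The question above, whether a lexical form can be combined with the datatype URI in ordinary literal notation, can be made concrete with N-Triples. This is a minimal sketch, assuming an ASCII-only lexical form; the helper names `ntriples_literal` and `ntriples_statement` are hypothetical and do not come from any RDF toolkit.

```python
XML_LITERAL = "http://www.w3.org/1999/02/22-rdf-syntax-ns#XMLLiteral"


def ntriples_literal(lexical_form: str, datatype_uri: str) -> str:
    """Write a typed literal in N-Triples notation: "lexical form"^^<datatype>."""
    escaped = (
        lexical_form.replace("\\", "\\\\")  # backslash first, so later escapes survive
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )
    return f'"{escaped}"^^<{datatype_uri}>'


def ntriples_statement(subject: str, predicate: str,
                       lexical_form: str, datatype_uri: str) -> str:
    """Write one statement whose object is a typed literal."""
    literal = ntriples_literal(lexical_form, datatype_uri)
    return f"<{subject}> <{predicate}> {literal} ."


title = ('12345: <s xmlns="http://www.w3.org/1999/xhtml">'
         'Null pointer exception during startup</s>')
line = ntriples_statement(
    "http://example.com/bugs/2314",
    "http://purl.org/dc/terms/title",
    title,
    XML_LITERAL,
)
print(line)
```

Only the quotes need escaping here; the angle brackets of the markup pass through unchanged, because N-Triples has no XML-style encoding of its own.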
I just tried entering the following two RDF/XML documents into the validation service:

<rdf:RDF xmlns:dcterms="http://purl.org/dc/terms/"
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <rdf:Description rdf:about="http://example.com/bugs/2314">
    <dcterms:title rdf:parseType="Literal" xmlns="http://www.w3.org/1999/xhtml"> 12345: <s>Null pointer exception during startup</s></dcterms:title>
  </rdf:Description>
</rdf:RDF>

<rdf:RDF xmlns:dcterms="http://purl.org/dc/terms/"
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <rdf:Description rdf:about="http://example.com/bugs/2314">
    <dcterms:title rdf:datatype="http://www.w3.org/1999/02/22-rdf-syntax-ns#XMLLiteral"> 12345: &lt;s xmlns="http://www.w3.org/1999/xhtml"&gt;Null pointer exception during startup&lt;/s&gt;</dcterms:title>
  </rdf:Description>
</rdf:RDF>

It yielded exactly the same result in both cases. I can also confirm Steve's claim that Jena can be configured to write out exactly the same triples using either syntax.

Cheers,
Dave
--
Dave Steinberg
IBM Rational Software
[email protected]

From: Arthur Ryman/Toronto/IBM
To: Dave Steinberg/Toronto/IBM@IBMCA
Cc: [email protected], [email protected]
Date: 08/26/2011 03:58 PM
Subject: Re: [oslc-core] OSLC Compact representation, titles with markup

Dave,

No, it's not just syntactic sugar. You need rdf:parseType="Literal" if you include element content. If you use rdf:datatype then only character content is allowed. This is explained in the spec at [1] and [2]. rdf:parseType="Literal" allows XML Literal content. rdf:datatype="whatever" allows string content. However, since specs are hard to understand, I suggest you convince yourself of this, as I did, by using the W3C RDF Validation service.
[3]

[1] http://www.w3.org/TR/REC-rdf-syntax/#section-Syntax-XML-literals
[2] http://www.w3.org/TR/REC-rdf-syntax/#section-Syntax-datatyped-literals
[3] http://www.w3.org/RDF/Validator/

Regards,
___________________________________________________________________________
Arthur Ryman
DE, PPM & Reporting Chief Architect
IBM Software, Rational
Toronto Lab | +1-905-413-3077 (office) | +1-416-939-5063 (mobile)

From: Dave Steinberg/Toronto/IBM@IBMCA
To: [email protected]
Date: 08/26/2011 03:22 PM
Subject: Re: [oslc-core] OSLC Compact representation, titles with markup
Sent by: [email protected]

Arthur,

I believe you're mistaken. I think that parseType="Literal" is just syntactic sugar (RDF Primer: "RDF/XML provides a special notation to make it easy to write literals of this kind"). Either way you write it, you end up with the same statement. Two statements with the same subject, the same predicate and a typed literal with the same type (<http://www.w3.org/1999/02/22-rdf-syntax-ns#XMLLiteral>) and the same lexical form are indistinguishable. Also, if you were correct, parseType="Literal" would provide RDF/XML with some sort of privileged XMLLiteral representation that couldn't be written out using any other RDF notation.

Cheers,
Dave
--
Dave Steinberg
IBM Rational Software
905-413-3705
[email protected]

From: Arthur Ryman/Toronto/IBM
To: Randy Hudson/Raleigh/IBM@IBMUS
Cc: Dave Steinberg <[email protected]>, [email protected], [email protected]
Date: 08/26/2011 02:22 PM
Subject: Re: [oslc-core] OSLC Compact representation, titles with markup

Randy,

Your example makes the content a string that looks like XHTML, i.e. the content contains no XHTML elements since all the markup characters are encoded.
A string is simply parsed character data and is valid XML. The correct way to include the XHTML elements is:

<dcterms:title rdf:parseType="Literal"> 12345: <s xmlns="http://www.w3.org/1999/xhtml">Null pointer exception during startup</s></dcterms:title>

The OSLC Guidelines about escaping are for the case where you need to include characters that might get misinterpreted as XML markup. For example, consider a math statement like "1 < 2". When you put that in an XML element, you need to encode it as "1 &lt; 2".

Regards,
___________________________________________________________________________
Arthur Ryman
DE, PPM & Reporting Chief Architect
IBM Software, Rational
Toronto Lab | +1-905-413-3077 (office) | +1-416-939-5063 (mobile)

From: Randy Hudson/Raleigh/IBM@IBMUS
To: Arthur Ryman <[email protected]>
Cc: Dave Steinberg <[email protected]>, [email protected], [email protected]
Date: 08/25/2011 07:06 PM
Subject: Re: [oslc-core] OSLC Compact representation, titles with markup

The following input is also equivalent:

<dcterms:title rdf:datatype="http://www.w3.org/1999/02/22-rdf-syntax-ns#XMLLiteral"> 12345: &lt;s xmlns="http://www.w3.org/1999/xhtml"&gt;Null pointer exception during startup&lt;/s&gt;</dcterms:title>

So there are (at least) two different ways to serialize a property value of type XML literal. But, the OSLC guidelines state:

1.2 If property value is a Literal value-type
1.2.1 Inside the XML element add the value as a string with any required escaping

That would seem to suggest that the above form should be used.

-Randy

From: Arthur Ryman <[email protected]>
To: Dave Steinberg <[email protected]>
Cc: [email protected], [email protected]
Date: 08/25/2011 04:34 PM
Subject: Re: [oslc-core] OSLC Compact representation, titles with markup
Sent by: [email protected]

Dave,

1. XML Namespaces. RDF/XML is well-formed XML so it must support namespaces correctly.
For triples whose datatype is XML Literal, the value of this literal is a well-formed XML fragment, and therefore the namespaces should be present in the content. If there is an enclosing <span> element, then the namespace should be there. Otherwise, each element in the content should have the namespace. The spec doesn't say "for XHTML, you need to insert an xmlns attribute for http://www.w3.org/1999/xhtml" because that is part of the XHTML standard, i.e. it's not XHTML unless the elements are in the XHTML namespace. 2. Jena I loaded the sample RDF/XML into Fuseki which uses Jena and it produced the correct result. I assume the Jena API lets you get an XML DOM from the literal value. The input contained: <dcterms:title rdf:parseType="Literal" xmlns=" http://www.w3.org/1999/xhtml"> 12345: <s>Null pointer exception during startup</s> </dcterms:title> The output value is: " 12345: <s xmlns="http://www.w3.org/1999/xhtml ">Null pointer exception during startup</s> "^^< http://www.w3.org/1999/02/22-rdf-syntax-ns#XMLLiteral> 3. XHTML versus HTML The primary reason is that RDF supports XHTML via the XMLLiteral datatype. There is no parsing support for HTML built into RDF. Another strong reason is that the syntax of HTML is very irregular and hard to parse correctly - that is one of the reasons XML was invented. This is very important from a security viewpoint. To guard against script injection attacks, you really should parse the input and remove any <script> elements or Javascript attributes. Doing that correctly for HTML requires a full HTML parser. On the other hand, the XHTML is given to you as a DOM which you can easily traverse or process using XSLT or XPATH. 4. Datatypes The specs do specify the datatypes for some properties. Look at the Value-Type column of the tables, e.g. [1]. You need to include the datatype explicitly for ints, dates, XML. etc. You specify that using rdf:datatype in RDF/XML, or using ^^ in Turtle. I don't know what the state of adoption is. 
We really should get some test suites written for the specs. 5. Inventing new Datatypes The RDF spec defines the XSD datatypes and the XMLLiteral datatype. RDF parsers know how to parse those. If someone introduces a new datatype URI, it could break parsers since they won't know how to parse the contents. There is no standard way to define new datatypes. Try it with the RDF Validation service [2] [1] http://open-services.net/bin/view/Main/OSLCCoreSpecAppendixA [2] http://www.w3.org/RDF/Validator/ Regards, ___________________________________________________________________________ Arthur Ryman DE, PPM & Reporting Chief Architect IBM Software, Rational Toronto Lab | +1-905-413-3077 (office) | +1-416-939-5063 (mobile) From: Dave Steinberg/Toronto/IBM@IBMCA To: [email protected] Date: 08/24/2011 03:05 PM Subject: Re: [oslc-core] OSLC Compact representation, titles with markup Sent by: [email protected] Hi Arthur, Thanks for the response. Apologies for being slow in replying; I've been out sick for the last day and a half. I agree that putting the XML namespace on the enclosing element would be a convenience, but only if tools supported that. As far as I could find, Jena provides no fine-grained access to namespace declarations (i.e. other than at the model level), so I believe that one couldn't use it to produce or consume the fragment that you suggested. Moreover, the other RDF representations offer no such convenience, even in theory. So, it seems to me that the suggestion to use a namespace was actually a pretty significant one, and not one that's reflected in the specs, since you'd always need an enclosing element for your XML content. Thanks for the suggestion of using Tidy to convert from HTML to XHTML. That was very helpful for me. But I must admit, I'm still left wondering what makes XHTML superior to HTML for interchanging formatted text, especially in light of the compact representation example and my own experiences, where the opposite seems to be true. 
One last thing that I'll emphasize is that I mentioned a lack of guidance in the OSLC specs specifically about plain vs. typed literals. It seems so odd to me that plain literals seem to be favoured everywhere, except when it comes to using XMLLiteral with rdf:parseType="Literal", but none of this is acknowledged or explained anywhere. It looks like using a typed literal in this one case is accepted merely as a requirement to benefit from the prettier RDF/XML syntax for XML content. However, I view things completely in the opposite light. To me, typed literals are a powerful benefit of RDF. You can use a typed literal to decide how to handle a literal value, without looking at the value itself, but that advantage is lost without a sufficiently specific type. Thus, I don't understand how defining and using a new RDF datatype to identify something as widely recognized and understood as HTML would impair interoperability. I think it would do the opposite.

Cheers,
Dave
--
Dave Steinberg
IBM Rational Software
[email protected]

From: Arthur Ryman/Toronto/IBM
To: Dave Steinberg/Toronto/IBM@IBMCA
Cc: [email protected], [email protected]
Date: 08/23/2011 10:09 AM
Subject: Re: [oslc-core] OSLC Compact representation, titles with markup

Dave,

Thx for the comments. I agree that the guidance on using XMLLiteral is not very clear in the spec. There was a lot of discussion about this at the time the spec was under development, but not much of that discussion survived the editorial process. The only place I see it is in the appendix on standard properties - dcterms:title and dcterms:description. [1] The guidance was that dcterms:title should be valid XHTML <span> content and dcterms:description valid XHTML <div> content. This means that the RDF datatype should be XMLLiteral and that appropriate namespaces should be used for XHTML content. Putting the XHTML namespace on the enclosing element is a convenience.
The parser should propagate that to the content, i.e. when you look at the triples, the XML literal node should have the inherited namespace. If you wanted the namespace directly in the content then you could enclose the content in a <div> or <span> and put the namespace there.

Using XHTML is the best way to achieve interchange of formatted text. There are converters from HTML to XHTML, e.g. Tidy. However, in the case of preview, why would conversion be needed? Shouldn't we be defining content that is XHTML?

In another use case, people wanted to use native Wiki text as the content. However, that would cause a big interop problem since there are many Wiki syntaxes. All of these are convertible to XHTML since that is what the Wikis do to display the formatted result. In another use case, people wanted to include Rich Text.

The general theme is that developers want to use whatever native format their tool supports, e.g. HTML, wiki text, and Rich Text, since it avoids conversions. However, this would couple the resource to the tool. OSLC is trying to achieve interoperability among heterogeneous tools. Therefore a common rich text format is needed. The alternative of defining new RDF datatypes for HTML, wiki text, RTF etc. would mean that OSLC resources would not be understood by other applications. In general, the creation of new RDF datatypes is discouraged since it impairs interoperability.
[1] http://open-services.net/bin/view/Main/OSLCCoreSpecAppendixA?sortcol=table;up=#Dublin_Core_Properties Regards, ___________________________________________________________________________ Arthur Ryman DE, PPM Chief Architect IBM Software, Rational Toronto Lab | +1-905-413-3077 Twitter | Facebook | YouTube Dave Steinberg---08/23/2011 12:06:32 AM---Hi all, I've been following this thread with interest, as it touches on some of the From: Dave Steinberg/Toronto/IBM@IBMCA To: [email protected] Date: 08/23/2011 12:06 AM Subject: Re: [oslc-core] OSLC Compact representation, titles with markup Sent by: [email protected] Hi all, I've been following this thread with interest, as it touches on some of the more general confusion/discomfort I've been developing over the past several weeks or months about the use of XMLLiteral with rdf:parseType="Literal" for HTML content. Adam's comments below are particularly interesting. In general, it's not clear to me who benefits from the use of the unescaped literal representation, or in what scenario. And that approach, then, requires the use of the XMLLiteral type, which I also wonder about (as I'll explain further). If there is some benefit that I don't know about, perhaps it derails this whole line of thought. But if there isn't, could this be a case of the concrete representation tail wagging the abstract syntax dog? One thing that always struck me as odd was that rdf:parseType="Literal" examples were the only ones I could find anywhere in OSLC that use typed literals (the XMLLiteral type is implicit with this special RDF/XML syntax). Moreover, I couldn't find any guidance in the specs about the use of plain vs. typed literals at all. From the perspective of a client, anyway, it would seem a very nice thing if a particular provider would use a typed literal to tell you that a title, for example, should be treated as a simple string or as HTML content. And that's the very thing that typed literals do. 
It could be that the presence of an XMLLiteral type is supposed to signal the use of XHTML content, and the absence of any type is supposed to signal plain text. But I couldn't find that spelled out anywhere -- if it is, perhaps it's hard to find, or perhaps I just did a poor job of looking -- and I'd argue it would be better to include types in both cases. [1]

It's this line of thinking that leads me to question the use of XMLLiteral in the first place. I saw in some old discussions that the intention in OSLC was not for XMLLiteral to imply XHTML necessarily. Using it for other XML languages was considered and endorsed, in principle. But where does that leave XHTML? With a type that doesn't really say what it is or what you can do with it. We have specs that communicate the XHTML intent in words, but we also have a mechanism built into RDF that could communicate this, and we're not really using it fully. Thus, I think it would be preferable to define and use a type that specifically represents HTML. And note, I suggest HTML, not XHTML, since using any type other than XMLLiteral eliminates the "benefit" of the special rdf:parseType="Literal" syntax. And without that, I don't see a particular benefit in the stricter XHTML syntax.

One other possibility that I've considered, which Arthur suggested previously, is using a namespace to identify that the XML is XHTML, in particular, instead of doing it directly in the literal type. And I believe that, strictly, the XHTML namespace is required for the elements to be valid XHTML. But I found no hint of this in the spec or any examples, and certainly RTC doesn't do this (I haven't checked other providers). Moreover, I believe it's also a worse approach, since there's no guarantee that your RDF runtime of choice will give you access to namespaces declared on the property element (I don't believe Jena does), and detecting a namespace inside the element content would require actually parsing the value as XML.
If all you want to do is pass markup along for display in a browser, it would be unfortunate to have to actually parse the content to determine that it's XHTML. And this is where I close the loop on my thinking, by coming back to how a consumer might actually want to make use of HTML content. Even outside of the compact rendering scenario, ultimately it's probably going to get displayed by a browser, whether as part of a larger Web page or in a browser-backed widget in a rich client. And for that, HTML is probably just as good as, if not better than, XHTML. Rather than worrying about whether the content is well-formed XML, it's probably sufficient to just give it to the browser and see what it can do with it. I would assert that "something a browser can render" has been the working definition of HTML for a good number of years now, while XHTML has largely faded in importance. Going the other way, the appeal of HTML really shows. If a provider natively deals with HTML (without concern for XML well-formedness), it would be attractive to not have to convert that into XHTML to expose it via OSLC. Likewise, a consumer may use a rich text control that yields HTML. Generalized parsing of HTML for conversion to XHTML is non-trivial, and it seems unfortunate to impose that conversion task onto everyone, just so that we can use rdf:parseType="Literal" in RDF/XML and avoid applying normal XML encoding to markup content (of course, some encoding will likely be required for other RDF syntaxes anyway). So, those are my thoughts on this (admittedly enlarged) topic. Even if they all do make perfect sense (and I'm not necessarily claiming they do), I realize we may be well past the point of being able to act on them. Still, I thought I'd put them out there and see what others make of them. Cheers, Dave [1] In fact, I think that the consistent use of typed literals in general would be beneficial. 
You could even imagine exploiting them as a compatibility measure, if it was decided that the type of a property needed to change. This is a related, but separate, topic, which I'd be thrilled to discuss further, but I don't want to open too many cans of worms at once. [2] Or, perhaps, a less kind way of putting that is that the XHTML namespace is required for the elements to -- Dave Steinberg IBM Rational Software [email protected] Adam Archer---08/22/2011 06:20:05 PM---The big concern to me is not the ability to process the RDF/XML with XPath, it's the ability to do From: Adam Archer/Toronto/IBM@IBMCA To: Arthur Ryman/Toronto/IBM@IBMCA Cc: "[email protected]" <[email protected]>, Randy Hudson <[email protected]>, [email protected] Date: 08/22/2011 06:20 PM Subject: Re: [oslc-core] OSLC Compact representation, titles with markup Sent by: [email protected] The big concern to me is not the ability to process the RDF/XML with XPath, it's the ability to do so in a browser environment. Currently all implementations of all rich hovers in all Jazz based products encode any html tags in their dcterms:title attributes (and doubly encode special characters). For the consumer on the browser side, this means simply taking the content of the attribute, decoding it (which browsers are very good at) and slapping the result into the dom (which browsers are also very good at). The alternative would be a total consumability nightmare from the point of view of a browser (which is the most important consumer of this entire spec). If the tags are actually child nodes in the xml representation, it means we will have child elements in the resulting document that we get back from the xml http request which means we will have to traverse a dom tree and recreate a structure which could easily be represented as an escaped string, like everyone is doing today. I realize that implementation is not supposed to lead the spec, but I don't even think that would be the case here. 
The oslc compact spec grew organically out of the old jazz compact rendering spec which can be found here: https://jazz.net/wiki/bin/view/Sandbox/CompactRenderingV1P1

If we look at the semantic description of the dc:title and jp:abbreviation it states explicitly that the content MUST be escaped:

> The HTML markup MUST be escaped; for example, "<b>" as "&lt;b&gt;".

This decision was made consciously for very well defined technical reasons (discussed above) in the original spec. If that decision was reversed in the creation of the OSLC compact spec then I believe that to have been a huge mistake and would like to see the spec fixed rather than for all providers to have to change how their compact documents are served and all consumers to have to go to the trouble of walking the dom to determine what the provider is actually trying to show.

Adam Archer
Jazz Developer
IBM Toronto Lab

From: Arthur Ryman/Toronto/IBM
To: Samuel Padgett <[email protected]>
Cc: Adam Archer/Toronto/IBM@IBMCA, Randy Hudson <[email protected]>, "[email protected]" <[email protected]>, [email protected]
Date: 08/22/2011 04:40 PM
Subject: Re: [oslc-core] OSLC Compact representation, titles with markup

Sam,

You wrote:

> It's very difficult to parse the former using XPath. For instance, the expression "/oslc:Compact/dcterms:title" takes out the "<s>" and "</s>"

I don't think problems using XPath are a valid reason to encode markup since RDF/XML itself is very difficult to process using XPath. At one point we tried to define an OSLC-variant of RDF/XML that looked like "normal" XML. However, we abandoned that and now require support for generic RDF/XML. There are many equivalent ways to represent a given set of triples in RDF/XML. It would therefore be very problematic to use XPath, XSLT, or XQuery to process RDF/XML. The safe way to process RDF/XML is to use an RDF toolkit like Jena.
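The claim that equivalent RDF/XML serializations defeat fixed path expressions can be illustrated with a short sketch. The two documents below are written for this illustration and are intended to describe the same two triples; the `oslc` namespace URI is assumed to be the OSLC Core namespace, and `title_via_fixed_path` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "dcterms": "http://purl.org/dc/terms/",
    "oslc": "http://open-services.net/ns/core#",  # assumed OSLC Core namespace
}

# Serialization A: the resource type is written as a typed node element.
typed_node = """<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:dcterms="http://purl.org/dc/terms/"
    xmlns:oslc="http://open-services.net/ns/core#">
  <oslc:Compact rdf:about="http://example.com/bugs/2314">
    <dcterms:title>12345: Null pointer exception during startup</dcterms:title>
  </oslc:Compact>
</rdf:RDF>"""

# Serialization B: the same type written as an explicit rdf:type property.
description = """<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:dcterms="http://purl.org/dc/terms/">
  <rdf:Description rdf:about="http://example.com/bugs/2314">
    <rdf:type rdf:resource="http://open-services.net/ns/core#Compact"/>
    <dcterms:title>12345: Null pointer exception during startup</dcterms:title>
  </rdf:Description>
</rdf:RDF>"""


def title_via_fixed_path(document: str):
    """Look up the title with one hard-coded path, as an XPath consumer would."""
    root = ET.fromstring(document)
    node = root.find("oslc:Compact/dcterms:title", NS)
    return None if node is None else node.text


print(title_via_fixed_path(typed_node))   # finds the title
print(title_via_fixed_path(description))  # finds nothing
```

An RDF toolkit would report the same title for both inputs, since it works on the triples rather than on the element structure.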
Regards,
___________________________________________________________________________
Arthur Ryman
DE, PPM Chief Architect
IBM Software, Rational
Toronto Lab | +1-905-413-3077

From: Samuel Padgett <[email protected]>
To: "[email protected]" <[email protected]>
Cc: Adam Archer/Toronto/IBM@IBMCA, Randy Hudson <[email protected]>
Date: 08/07/2011 01:01 PM
Subject: [oslc-core] OSLC Compact representation, titles with markup
Sent by: [email protected]

I believe the spec is a bit confusing when it comes to titles with markup for UI Preview. The Compact representation has a dcterms:title property. It's defined as an XML Literal that can contain XHTML markup [1]. My understanding of XML Literals as discussed in the RDF Primer [2] means a title with markup would look like this,

<dcterms:title>12345: <s>Null pointer exception during startup</s></dcterms:title>

The example [3] of this resource has a title like this, however,

<dcterms:title> 12345: &lt;s&gt;Null pointer exception during startup&lt;/s&gt; </dcterms:title>

The example doesn't seem to fit with the description.

It's very difficult to parse the former using XPath. For instance, the expression "/oslc:Compact/dcterms:title" takes out the "<s>" and "</s>"

Most implementations I'm aware of also follow the example where markup is encoded. It means special characters need to be "double encoded." For instance, "12345: Values > 1000 incorrectly calculated" would be,

<dcterms:title>12345: Values &amp;gt; 1000 incorrectly calculated</dcterms:title>

I think we should add more clarity to the spec here, as getting this wrong can open up consumers to cross-site scripting attacks. I'd also suggest we say that providers MUST NOT use any markup with a <script> tag and consumers MUST NOT display any markup with a <script> tag to guard against this problem.
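The double encoding and the proposed <script> rule can be sketched together with Python's standard library. This is a simplified illustration under stated assumptions, not a complete sanitizer: the bare `title` element stands in for dcterms:title, `contains_script` is a hypothetical helper, and the check ignores other injection vectors such as `javascript:` URLs.

```python
import html
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

plain = "12345: Values > 1000 incorrectly calculated"

# First encoding: turn the plain text into HTML markup.
markup = html.escape(plain, quote=False)
# Second encoding: escape that markup so it can sit inside an XML element.
element = "<title>" + html.escape(markup, quote=False) + "</title>"

# The consumer's XML parser undoes the second encoding only, leaving markup
# that a browser can decode and render.
decoded = ET.fromstring(element).text


class ScriptDetector(HTMLParser):
    """Flags <script> elements and event-handler style attributes."""

    def __init__(self):
        super().__init__()
        self.found = False

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names. Treating every
        # attribute that starts with "on" as suspicious is deliberately strict.
        if tag == "script" or any(name.startswith("on") for name, _ in attrs):
            self.found = True


def contains_script(fragment: str) -> bool:
    detector = ScriptDetector()
    detector.feed(fragment)
    detector.close()
    return detector.found


print(element)
print(decoded)
print(contains_script(decoded))
```

A consumer following the proposed rule would refuse to display any title for which `contains_script` returns true, rather than trying to repair the markup.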
Best Regards,
Sam

[1] http://open-services.net/bin/view/Main/OslcCoreUiPreview?sortcol=table;up=#Representation_Compact
[2] http://www.w3.org/TR/rdf-syntax/#xmlliterals
[3] http://open-services.net/bin/view/Main/OslcCoreUiPreview?sortcol=table;up=#XML_Representation_Format

_______________________________________________
Oslc-Core mailing list
[email protected]
http://open-services.net/mailman/listinfo/oslc-core_open-services.net
