On 20/07/15 20:29, Gary King wrote:
Suppose I have a triple-store containing a triple like

    <http://a> <http://b> "Hi \u001A is control-Z" .

What should the SPARQL/XML output be for this query:

     SELECT ?o { ?s ?p ?o }

If I use Apache Jena 2.13.0 and ask for JSON, I get:

{
   "head": {
     "vars": [ "o" ]
   } ,
   "results": {
     "bindings": [
       {
         "o": { "type": "literal" , "value": "Hi \u001A is control-Z" }
       }
     ]
   }
}

Asking for XML, however, gives me:

<?xml version="1.0"?>
<sparql xmlns="http://www.w3.org/2005/sparql-results#">
   <head>
     <variable name="o"/>
   </head>
   <results>
     <result>
       <binding name="o">
         <literal>Hi  is control-Z</literal>

There is a real raw control-Z in that line (which is illegal in XML 1.0). It just displays as a space character in some fonts. If I cut and paste the line into Emacs, it displays as ^Z.
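
To make the "illegal in XML 1.0" point concrete, here is a minimal Python sketch that checks a string against the Char production of XML 1.0. The names is_xml10_char and find_illegal_xml10 are made up for illustration; this is not how Jena checks its output.

```python
def is_xml10_char(ch: str) -> bool:
    """Return True if ch matches the XML 1.0 Char production."""
    cp = ord(ch)
    return (
        cp in (0x9, 0xA, 0xD)          # tab, line feed, carriage return
        or 0x20 <= cp <= 0xD7FF
        or 0xE000 <= cp <= 0xFFFD
        or 0x10000 <= cp <= 0x10FFFF
    )


def find_illegal_xml10(text: str) -> list:
    """List (index, 'U+XXXX') for every character XML 1.0 cannot contain."""
    return [
        (i, f"U+{ord(ch):04X}")
        for i, ch in enumerate(text)
        if not is_xml10_char(ch)
    ]


# The literal from the example triple: control-Z sits at index 3.
print(find_illegal_xml10("Hi \u001a is control-Z"))  # [(3, 'U+001A')]
```

A serializer could run a check like this before writing a literal and then decide whether to fail, drop the character, or switch formats.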

Unfortunately, as far as I know, you can't conneg for XML 1.0 vs XML 1.1, which makes the whole thing a bit of a "no win" situation. The SPARQL Results in XML spec happens to say "XML 1.0".

Historically, an app couldn't (spec-wise) get the character in the first place (RDF/XML in XML 1.0). Nowadays, Turtle,

BTW: The state of XML 1.1 for Java is iffy:
https://bugs.openjdk.java.net/browse/JDK-8029437

       </binding>
     </result>
   </results>
</sparql>

Where the control-Z character has disappeared.

AFAIK, XML 1.0 cannot encode these control characters, whereas an XML 1.1 output 
could use &#x1a;. I also see that the RDF validator 
(http://www.w3.org/RDF/Validator/) is perfectly happy with the results whereas it 
seems as if it should not be?

thoughts?

Use JSON (and it parses faster), or SPARQL results in TSV.
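
For the JSON route, the character travels as a JSON \u001A escape, and any JSON parser turns it back into the real character. A small Python sketch using only the standard json module, with the result document from the original message pasted in as a string:

```python
import json

# The SPARQL JSON results from the original message, kept as raw text so
# that the \u001A escape reaches the JSON parser untouched.
RESULTS = r'''
{
  "head": { "vars": [ "o" ] },
  "results": {
    "bindings": [
      { "o": { "type": "literal", "value": "Hi \u001A is control-Z" } }
    ]
  }
}
'''

doc = json.loads(RESULTS)
values = [b["o"]["value"] for b in doc["results"]["bindings"]]

# repr() makes the control character visible instead of printing it raw.
print(repr(values[0]))  # 'Hi \x1a is control-Z'
```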

Emit XML 1.1 if you need to.

        Andy


thanks,
--
Gary Warren King, metabang.com
Cell: (413) 559 8738
Fax: (206) 338-4052
gwkkwg on Skype * garethsan on AIM * gwking on twitter



