On 03/31/2013 02:13 PM, Andy Seaborne wrote:
On 31/03/13 11:47, lou1se m1ch3l wrote:
I'm working with input models like this one:

<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
     xmlns="http://www.test.org/">

     <rdf:Description rdf:about="http://www.test.org/test">
         <sometext xml:lang="en" rdf:parseType="Literal">some text</sometext>
         <sometext xml:lang="fr" rdf:parseType="Literal">texte</sometext>
     </rdf:Description>

</rdf:RDF>

Hi there,

In RDF/XML, if both a datatype and a language are given, the datatype takes precedence and the language, or the enclosing language (it may be declared further out), is ignored.

 <sometext xml:lang="en"
           rdf:datatype="http://www.w3.org/2001/XMLSchema#string"
 >foo</sometext>

==> "foo"^^xsd:string.  No language.

rdf:parseType="Literal" is like writing rdf:datatype=rdf:XMLLiteral, except that it also tells the parser to use the XML content as the literal's lexical form.

So the language is ignored: XMLLiterals are supposed to be self-contained XML fragments, independent of the XML document context, as if wrapped in <div></div>.

To put a language in the content:

<sometext rdf:parseType="Literal"><span xml:lang="fr">texte</span></sometext>

The language is then not part of the RDF literal, and SPARQL LANG() will not return it.

Your SPARQL query is right - it is the data that does not contain the language information.
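
As a rough illustration of the <span xml:lang="fr"> suggestion above: a language placed inside the XML content lives only in the literal's lexical form, so a client would have to parse that fragment itself to recover it. The sketch below uses Python's standard library; the helper name langs_in_xml_literal and the <wrapper> element are made up for this example, and it assumes the lexical form is a well-formed, self-contained fragment.

```python
import xml.etree.ElementTree as ET

# Clark-notation name that ElementTree uses for the xml:lang attribute.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def langs_in_xml_literal(lexical_form):
    """Return every xml:lang value found inside an XMLLiteral lexical form.

    Hypothetical helper: the fragment is wrapped in a dummy element so that
    plain text or several top-level nodes still parse as one document.
    """
    wrapper = ET.fromstring("<wrapper>" + lexical_form + "</wrapper>")
    return [el.get(XML_LANG) for el in wrapper.iter() if el.get(XML_LANG)]


# The fragment from the example above carries its language inside the content.
print(langs_in_xml_literal('<span xml:lang="fr">texte</span>'))
```

For the original input, where xml:lang sits on the property element rather than inside the content, the lexical form is just the text and this returns an empty list.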

    Andy


Thank you for the detailed and clear answer.

I didn't know (as you can guess) that "In RDF/XML if both a datatype and a language are given, the datatype takes precedence and the (enclosing) language is ignored", though my intuition was that this use of parseType="Literal" was inappropriate. As I have no influence over the incoming data, I "solved" my problem by pre-processing the input to strip the problematic parseType attributes. Note that I could not do this at the RDF level (within the Jena API), because the language information appears to be lost for good once the graph is built (at least I couldn't retrieve it): I had to do the pre-processing at the text/XML level.
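
For anyone landing here with the same problem, a pre-processing step of this kind might look roughly like the sketch below. It is not the exact script I used, only a minimal Python illustration with the standard library; the function name strip_parse_type_literal and the SAMPLE document are made up for the example. It assumes the affected elements contain plain text only, and it leaves elements with child markup untouched, since dropping parseType there would change how the content is parsed.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
PARSE_TYPE = "{" + RDF_NS + "}parseType"

# Small stand-in for the incoming data (same shape as the input shown in this thread).
SAMPLE = (
    '<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" '
    'xmlns="http://www.test.org/">'
    '<rdf:Description rdf:about="http://www.test.org/test">'
    '<sometext xml:lang="en" rdf:parseType="Literal">some text</sometext>'
    '<sometext xml:lang="fr" rdf:parseType="Literal">texte</sometext>'
    '</rdf:Description>'
    '</rdf:RDF>'
)


def strip_parse_type_literal(rdf_xml):
    """Drop rdf:parseType="Literal" from text-only elements.

    Without the attribute, the element becomes a plain literal and the
    xml:lang on it (or on an ancestor) applies again.
    """
    root = ET.fromstring(rdf_xml)
    for elem in root.iter():
        # len(elem) == 0 means no child elements, i.e. plain text content.
        if elem.get(PARSE_TYPE) == "Literal" and len(elem) == 0:
            del elem.attrib[PARSE_TYPE]
    return ET.tostring(root, encoding="unicode")


print(strip_parse_type_literal(SAMPLE))
```

ElementTree rewrites the default namespace with a generated prefix (such as ns0) on output; the RDF meaning is unchanged, but the text will not be byte-identical to the input.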

Regards.


I'm trying to load the text for a given language, using some SPARQL like
this:

SELECT ?sometext
WHERE {
     ?x <http://www.test.org/sometext> ?sometext .
     FILTER (LANG(?sometext) = 'en')
}

As you can see below, I'm having trouble filtering the result by
language (en):
$ sparql --data=data.rdf.xml  --query=test.rq
------------
| sometext |
============
------------

If I comment out the filtering as
# FILTER (LANG(?sometext) = 'en')

the result is as below:
------------------------------------------------------------------------
| sometext |
========================================================================
| "texte"^^<http://www.w3.org/1999/02/22-rdf-syntax-ns#XMLLiteral>|
| "some text"^^<http://www.w3.org/1999/02/22-rdf-syntax-ns#XMLLiteral>|
------------------------------------------------------------------------

If I remove all the rdf:parseType="Literal" attributes from the model
and run the initial query, the result is:
------------------
| sometext       |
==================
| "some text"@en |
------------------

So, it seems that the xml:lang attribute applies to the sometext element
itself, but _not_ to the enclosed literal denoted by
rdf:parseType="Literal".
I'm quite sure it's not a bug in the Jena framework, but rather a
consequence of my ignorance: how should I write the SPARQL query to
filter the results according to the value of the xml:lang attribute,
given that I have to work with the input model as it is?

Thanks for any advice or interesting pointer.
Regards.



