Hi there,
The rules for RDF XMLLiterals require canonical XML - this is quite a
high barrier, and much XML does not meet it.
'
<OMOBJ xmlns=\"http://www.openmath.org/OpenMath\"
version=\"2.0\"
cdbase=\"http://www.openmath.org/cd\">
</OMOBJ>
'
1/ The extra newlines before and after probably don't help.
2/ The attributes must be in alphabetical order. xmlns= is not an
attribute (namespace declarations come before attributes), but this
has "version" before "cdbase".
Andy
(I'm somewhat tempted to suggest we drop all RDF XMLLiteral checking)
http://www.w3.org/TR/rdf-concepts/#section-XMLLiteral
==>
http://www.w3.org/TR/2002/REC-xml-exc-c14n-20020718/#def-exclusive-canonical-XML
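The attribute-ordering point in 2/ can be illustrated with a small, self-contained sketch. This is not Jena's check and not a canonicalizer: the class and method names are hypothetical, and it assumes unprefixed attributes only, comparing names with plain String ordering. It only tests that namespace declarations precede ordinary attributes and that the ordinary attributes are sorted; it does not check the order of namespace declarations among themselves.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class C14nAttributeOrder {
    // Hypothetical helper: given the attribute names in the order they are
    // written in a start tag, report whether they follow a simplified form
    // of the canonical XML ordering rule (assumption: no prefixed attributes).
    public static boolean isCanonicalOrder( List<String> names ) {
        List<String> attrs = new ArrayList<>();
        boolean seenAttribute = false;
        for ( String name : names ) {
            boolean isNamespaceDecl =
                name.equals( "xmlns" ) || name.startsWith( "xmlns:" );
            if ( isNamespaceDecl ) {
                // A namespace declaration after an ordinary attribute
                // is out of order.
                if ( seenAttribute ) {
                    return false;
                }
            } else {
                seenAttribute = true;
                attrs.add( name );
            }
        }
        // Ordinary attributes must be in ascending order.
        for ( int i = 1; i < attrs.size(); i++ ) {
            if ( attrs.get( i - 1 ).compareTo( attrs.get( i ) ) > 0 ) {
                return false;
            }
        }
        return true;
    }

    public static void main( String[] args ) {
        // Order used in the literal above: "version" before "cdbase".
        System.out.println(
            isCanonicalOrder( Arrays.asList( "xmlns", "version", "cdbase" ) ) ); // false
        // Same attributes, sorted.
        System.out.println(
            isCanonicalOrder( Arrays.asList( "xmlns", "cdbase", "version" ) ) ); // true
    }
}
```

Passing this check does not make a lexical form canonical; whitespace, quoting, and empty-element handling are also covered by the specification linked above.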
On 10/07/13 04:10, Joshua TAYLOR wrote:
On Tue, Jul 9, 2013 at 8:01 PM, Joshua TAYLOR <[email protected]> wrote:
Hi all, I'm trying to create an XMLLiteral and get the value from it.
(This is a reduced version of a larger problem, where I'm reading a
model that has some XMLLiterals, and I need to do some work with
them.) It appears that getValue() throws a DatatypeFormatException
when this happens, though. Here's code that illustrates the problem:
import com.hp.hpl.jena.datatypes.xsd.impl.XMLLiteralType;
import com.hp.hpl.jena.rdf.model.Literal;
import com.hp.hpl.jena.rdf.model.Model;
import com.hp.hpl.jena.rdf.model.ModelFactory;
import com.hp.hpl.jena.vocabulary.RDF;
public class ReadXMLLiterals {
    public static void main( String[] args ) {
        // Some XML content with newlines.
        final String xmlContent =
            "<OMOBJ xmlns='http://www.openmath.org/OpenMath'\n" +
            "       cdbase='http://www.openmath.org/cd'>\n" +
            "</OMOBJ>";

        // Create a model, and an XMLLiteral with the given content, and
        // add a triple [] rdf:value xmlliteral so that something can be
        // shown in the model, and to demonstrate that the literal can be
        // created without error. Print the model.
        System.out.println( "=============" );
        final Model model = ModelFactory.createDefaultModel();
        final Literal xmlLiteral = model.createTypedLiteral(
            xmlContent,
            XMLLiteralType.theXMLLiteralType );
        model.createResource().addProperty( RDF.value, xmlLiteral );
        model.write( System.out, "N3" );

        // Try to get the value of the literal and watch things fall apart.
        // The error is that "Lexical form ... is not a legal instance of
        // Datatype[http://www.w3.org/1999/02/22-rdf-syntax-ns#XMLLiteral]
        // Bad rdf:XMLLiteral". How come there was no complaint when the
        // literal was created? And what's the problem with the lexical form?
        System.out.println( "-------------" );
        try {
            xmlLiteral.getValue();
        }
        catch ( Exception e ) {
            e.printStackTrace( System.out );
        }
    }
}
The output shows that the model can be created (and thus that the
XMLLiteral can be created) and serialized (showing the proper
datatype). However, when trying to use Literal#getValue to get a
value from the XMLLiteral, things explode:
=============
[] <http://www.w3.org/1999/02/22-rdf-syntax-ns#value>
"""<OMOBJ xmlns='http://www.openmath.org/OpenMath'
cdbase='http://www.openmath.org/cd'>
</OMOBJ>"""^^<http://www.w3.org/1999/02/22-rdf-syntax-ns#XMLLiteral> .
-------------
com.hp.hpl.jena.datatypes.DatatypeFormatException: Lexical form
'<OMOBJ xmlns='http://www.openmath.org/OpenMath'
cdbase='http://www.openmath.org/cd'>
</OMOBJ>' is not a legal instance of
Datatype[http://www.w3.org/1999/02/22-rdf-syntax-ns#XMLLiteral] Bad
rdf:XMLLiteral
at
com.hp.hpl.jena.graph.impl.LiteralLabelImpl.getValue(LiteralLabelImpl.java:324)
at
com.hp.hpl.jena.graph.Node_Literal.getLiteralValue(Node_Literal.java:39)
at
com.hp.hpl.jena.rdf.model.impl.LiteralImpl.getValue(LiteralImpl.java:98)
at ReadXMLLiterals.main(ReadXMLLiterals.java:33)
Is something actually malformed with the lexical form?
And if it is, should it have been caught when the literal was created?
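To separate "not well-formed XML" from "well-formed but not accepted as an rdf:XMLLiteral", the lexical form can be run through a plain XML parser. The following is a minimal sketch using only the JDK parser, not Jena; the class and method names are hypothetical, and it assumes the lexical form has a single root element so that it can be parsed as a document.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class WellFormedCheck {
    // Hypothetical helper: true if the string parses as a namespace-aware
    // XML document. This says nothing about whether it is canonical XML.
    public static boolean isWellFormed( String xml ) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware( true );
            DocumentBuilder builder = factory.newDocumentBuilder();
            // DefaultHandler rethrows fatal errors without printing them.
            builder.setErrorHandler( new DefaultHandler() );
            builder.parse( new InputSource( new StringReader( xml ) ) );
            return true;
        }
        catch ( SAXException e ) {
            // The parser rejected the input: not well-formed.
            return false;
        }
        catch ( Exception e ) {
            // Parser configuration or I/O problems are not expected here.
            throw new IllegalStateException( e );
        }
    }

    public static void main( String[] args ) {
        final String xmlContent =
            "<OMOBJ xmlns='http://www.openmath.org/OpenMath'\n" +
            "       cdbase='http://www.openmath.org/cd'>\n" +
            "</OMOBJ>";
        System.out.println( isWellFormed( xmlContent ) ); // true
        System.out.println( isWellFormed( "<OMOBJ>" ) );  // false
    }
}
```

If the content parses here but getValue() still fails, the problem is more likely the stricter requirements on the lexical form than malformed XML.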
--
Joshua Taylor, http://www.cs.rpi.edu/~tayloj/
OK, the last post was a first step toward trying to track down a
mysterious RIOT warning that I was getting in some circumstances
involving an XMLLiteral. It seems that the warning is only triggered
if certain things have happened with the model beforehand. Here's a
minimal working example:
import java.io.ByteArrayInputStream;
import com.hp.hpl.jena.datatypes.xsd.impl.XMLLiteralType;
import com.hp.hpl.jena.rdf.model.Literal;
import com.hp.hpl.jena.rdf.model.Model;
import com.hp.hpl.jena.rdf.model.ModelFactory;
import com.hp.hpl.jena.sparql.expr.NodeValue;
public class ReadXMLLiterals2 {
    // Some N3 content that contains a resource :test with an
    // rdf:value that is an XMLLiteral.
    final static String n3Content =
        "@prefix : <http://example.org/> .\n" +
        "@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .\n" +
        ":test rdf:value \"\"\"\n" +
        "<OMOBJ xmlns=\"http://www.openmath.org/OpenMath\"\n" +
        " version=\"2.0\"\n" +
        " cdbase=\"http://www.openmath.org/cd\">\n" +
        "</OMOBJ>\n\"\"\"^^rdf:XMLLiteral .";

    // Demo creates a model and reads the N3 content into it, and then
    // prints the model. If produceWarning is true, then the model is
    // used to create an XMLLiteral first with very simple content
    // (<foo></foo>) and then do a bit of manipulation with it.
    // Particularly, we get a node from the literal, then make a node
    // value from that. In the case that the literal is created and a
    // node value made from its node, a RIOT warning, "Lexical form
    // '...' not valid for datatype rdf:XMLLiteral", is printed *for the
    // XMLLiteral appearing in the n3Content* that is read during the
    // call to Model#read.
    public static void demo( final boolean produceWarning ) {
        System.out.println( "============" );
        final Model model = ModelFactory.createDefaultModel();
        if ( produceWarning ) {
            final Literal xmlLiteral = model.createTypedLiteral(
                "<foo></foo>",
                XMLLiteralType.theXMLLiteralType );
            NodeValue.makeNode( xmlLiteral.asNode() );
        }
        model.read( new ByteArrayInputStream( n3Content.getBytes() ), null, "N3" );
        System.out.println( "------------");
        model.write( System.out, "N3" );
    }

    // Run the demo both with and without the warning.
    public static void main( final String[] args ) {
        demo( false );
        demo( true );
    }
}
The output from this follows. In the first case, where nothing
happens with the model before reading the N3 content, the content is
read without any warnings, and the model is printed without a problem.
In the second case, where we create an XMLLiteral first and manipulate
it, a warning occurs when reading the N3 content.
============
------------
@prefix : <http://example.org/> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
:test
rdf:value """
<OMOBJ xmlns=\"http://www.openmath.org/OpenMath\"
version=\"2.0\"
cdbase=\"http://www.openmath.org/cd\">
</OMOBJ>
"""^^rdf:XMLLiteral .
============
23:06:11,042 WARN riot:68 - [line: 3, col: 17] Lexical form '
<OMOBJ xmlns="http://www.openmath.org/OpenMath"
version="2.0"
cdbase="http://www.openmath.org/cd">
</OMOBJ>
' not valid for datatype http://www.w3.org/1999/02/22-rdf-syntax-ns#XMLLiteral
------------
@prefix : <http://example.org/> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
:test
rdf:value """
<OMOBJ xmlns=\"http://www.openmath.org/OpenMath\"
version=\"2.0\"
cdbase=\"http://www.openmath.org/cd\">
</OMOBJ>
"""^^rdf:XMLLiteral .
This is only a warning, so it's not a showstopper, but with bigger
XMLLiterals it makes test output much harder to read. (Yes, I'm sure I
could go and change the log levels for RIOT, but in general I'm
interested in seeing any warnings that Jena produces.) I'm also quite
puzzled that reading the *same content* could cause a warning in one
case, but not in another.
Any help is much appreciated, and thanks in advance!
//JT