On 11/10/14 02:54, Rouquette, Nicolas F (313D) wrote:

Firstly,

BananaRDF is a separate, independent project. Have you asked them about this? I guess they don't read this list as they haven't replied.

By the way, that project cares a lot more about Turtle and related formats. For them, RDF/XML is from a bygone era.

I think the problem is the following:

- it is possible to construct a Jena Graph where some URIs are
 relative.

Yes - and such a graph is not RDF -- the RDF data model requires absolute URIs. If you push the system outside that, there are no guarantees. Only RDF/XML will have problems. Certain features

Checking every string for being an absolute URI is way too expensive. (The project has been through this before!)

- some of Jena's "write" APIs do not provide a way to specify the
base URI
for converting relative URIs in the graph into absolute URIs in the
output syntax (whether it is using a "@base" or not)

No - that's not what the base URI is used for on output.
That's how it is used for reading.

The base URI on output is used to abbreviate:

<http://www.w3.org/2001/sw/RDFCore/ntriples/> and a baseURI of <http://www.w3.org/2001/sw/RDFCore/> puts "ntriples/" into the output.

It does not take "ntriples/" and produce "http://www.w3.org/2001/sw/RDFCore/ntriples/".
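A minimal sketch of the two directions, using plain java.net.URI as a stand-in for the writer and the parser. This is not Jena's own code; the class and method names are made up for illustration.

```java
import java.net.URI;

// Illustration only: java.net.URI stands in for the abbreviation step of a
// writer and the resolution step of a parser. Names here are hypothetical.
final class BaseUriDirection {

    // Writing direction: shorten an absolute URI against the base.
    static String abbreviate(String base, String uri) {
        return URI.create(base).relativize(URI.create(uri)).toString();
    }

    // Reading direction: turn a relative reference into an absolute URI.
    static String resolve(String base, String reference) {
        return URI.create(base).resolve(reference).toString();
    }

    public static void main(String[] args) {
        String base = "http://www.w3.org/2001/sw/RDFCore/";
        // Absolute URI under the base: abbreviated on output.
        System.out.println(abbreviate(base, base + "ntriples/"));
        // Already-relative input: returned unchanged, never made absolute.
        System.out.println(abbreviate(base, "ntriples/"));
        // Only the reading direction produces an absolute URI.
        System.out.println(resolve(base, "ntriples/"));
    }
}
```

The second call is the case discussed in this thread: abbreviating against a base leaves a relative string relative.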

However, the Jena API -- specifically RDFDataMgr.write() -- seems to
expect that all URIs in the input graph are absolute since it does
not provide support for specifying a base URI for relative URIs in
the graph.

Correct - that is because RDF is defined that way.

A base URI on writing is NOT used to make URIs absolute. It is used to abbreviate absolute URIs in relative form in the syntax.

Try this:

    String x = "<http://example/x> <http://example/p> <o>." ;
    Model m = ModelFactory.createDefaultModel() ;
    RDFDataMgr.read(m, new StringReader(x), null, Lang.NTRIPLES) ;
    m.write(System.out, "RDF/XML-ABBREV", "http://example/") ;

and get

<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:j.0="http://example/">
  <rdf:Description rdf:about="x">
    <j.0:p rdf:resource="o"/>
  </rdf:Description>
</rdf:RDF>

Note that <http://example/x> in the data becomes rdf:about="x" -- it is then the application's responsibility to ensure that the correct base is supplied when the output is read back.

Absolute in, relative out.
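To see why the base supplied on reading matters, here is a small sketch of the round trip with plain java.net.URI (an assumption for illustration; no Jena classes are involved and the names are hypothetical).

```java
import java.net.URI;

// Illustration only: models "abbreviate on write, resolve on read".
final class BaseRoundTrip {

    // Abbreviate against writeBase (the writer's job), resolve against
    // readBase (the reader's job), and return the URI the reader ends up with.
    static String roundTrip(String writeBase, String readBase, String absolute) {
        URI relative = URI.create(writeBase).relativize(URI.create(absolute));
        return URI.create(readBase).resolve(relative).toString();
    }

    public static void main(String[] args) {
        // Same base on both sides: the original URI comes back.
        System.out.println(roundTrip("http://example/", "http://example/", "http://example/x"));
        // Different base on reading: the data now names a different resource.
        System.out.println(roundTrip("http://example/", "http://other.example/data/", "http://example/x"));
    }
}
```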

Banana-rdf provides a generic API for constructing RDF graphs --
I.e., the same code can be compiled by binding the parameter to, e.g.
The Jena API or the Sesame API.
Similarly, constructing a graph with the banana-rdf API and
serializing with different libraries -- e.g., Jena, Sesame -- should
result in 2 API-specific serializations of graphs that should be
isomorphic to each other as far as the W3 RDF spec is concerned.

If you wish to submit a patch to Jena that would be great.

Maybe one of:

1/ Patch to the XML writer (at a glance "relativize" but I haven't written tests)
2/ Patch for RDFDataMgr to allow the base URI to be passed in.

Workarounds:
A/ Don't use RDF/XML, use Turtle.
B/ Call the writer in this style : model.write(....baseURI....)

        Andy




On 10/9/14 3:39 AM, "Andy Seaborne" <[email protected]> wrote:

Nicolas,

Which version of Jena are you referring to? (the line number for
RDFDataMgr does not seem to line up).

This is from the 2.11.2 version of jena-arq
In the git master branch, the corresponding code is at line 1331:


     private static void write$(OutputStream out, DatasetGraph dataset, RDFFormat serialization)
     {
         WriterDatasetRIOT w = createDatasetWriter$(serialization) ;
         w.write(out, dataset, RiotLib.prefixMap(dataset), null, null) ; // line 1331
     }


And what's the data being serialized?  There is some BananaRDF
processing applied to the data read in the example.  Could you provide
(N-Triples is quite forgiving) the data or a short, standalone program
that can produce it?

The test was originally written like this:

   "write simple graph as TURTLE string" in {
     val turtleString = writer.asString(referenceGraph, "http://www.w3.org/2001/sw/RDFCore/").get
     turtleString should not be ('empty)
     val graph = Await.result(reader.read(turtleString, rdfCore), Duration(1, SECONDS))
     assert(referenceGraph isIsomorphicWith graph)
   }


At this point, "referenceGraph" is well-formed: every URI is absolute

  {http://www.w3.org/2001/sw/RDFCore/ntriples/ @http://purl.org/dc/elements/1.1/publisher http://www.w3.org/;
   http://www.w3.org/2001/sw/RDFCore/ntriples/ @http://purl.org/dc/elements/1.1/creator "Art Barstow";
   http://www.w3.org/2001/sw/RDFCore/ntriples/ @http://purl.org/dc/elements/1.1/creator "Dave Beckett"}



When writer.asString() executes, it calls this:

     def asString(graph: Jena#Graph, base: String): Try[String] = Try {
       val result = new StringWriter()
       import org.w3.banana.jena.Jena.ops._
       val relativeGraph : Jena#Graph = graph.relativize(URI(base))
       RDFDataMgr.write(result, relativeGraph, lang)
       result.toString()
     }


graph.relativize() is a banana-rdf API -- it constructs a new Jena Graph:

  {ntriples/ @http://purl.org/dc/elements/1.1/creator "Dave Beckett";
   ntriples/ @http://purl.org/dc/elements/1.1/creator "Art Barstow";
   ntriples/ @http://purl.org/dc/elements/1.1/publisher http://www.w3.org/}


As I mentioned above, the Jena API RDFDataMgr.write() does not provide a
way to pass a base URI for the graph.
So given a relative Jena Graph as input, we get the serialization of that
graph, which is also relative (as the value of the "result" string writer):

<ntriples/>       <http://purl.org/dc/elements/1.1/creator> "Dave Beckett" , "Art Barstow" ;
                <http://purl.org/dc/elements/1.1/publisher> <http://www.w3.org/> .



There is some confusion about relative URIs here.

Possibly


The RDF data model is defined in terms of absolute URIs.  A relative URI
should never occur in an RDF graph.  They can occur in RDF syntax.

However, the Jena API -- specifically RDFDataMgr.write() -- seems to
expect that all URIs in the input graph are absolute since it does not
provide support for specifying a base URI for relative URIs in the graph.



A base URI on writing is used to convert an absolute URI to a relative
one in the output syntax.  As the base URI is also written out, the
whole RDF Graph still has absolute URIs when read back in again.

I think the problem is the following:

- it is possible to construct a Jena Graph where some URIs are relative.
- some of Jena's "write" APIs do not provide a way to specify the base URI
for converting relative URIs in the graph into absolute URIs in the output
syntax (whether it is using a "@base" or not)


The report you reference was caused by "" as a subject URI.  That's not
legal RDF. <> is not "".

Sorry -- I don't understand what you're referring to.


Supplying the baseURI on writing only affects the abbreviation of URIs
in the data. They must be absolute in the data in the first place.

I agree with you except that, as I explained above, it *is* possible to
construct Jena Graphs where some URIs are relative.
Perhaps writing such graphs (regardless of the output format) should throw
an exception unless one is supplying a base URI.
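A rough sketch of that idea, written against plain strings rather than a Jena Graph. Everything here is hypothetical (class name, method name, the choice of IllegalArgumentException), and java.net.URI is only an approximation of real IRI handling:

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.List;

// Hypothetical guard, not part of Jena or banana-rdf.
final class RelativeUriGuard {

    // Returns the URIs to serialize. A relative URI is resolved against
    // baseOrNull when a base is supplied; without a base it is rejected.
    static List<String> check(List<String> uris, String baseOrNull) {
        List<String> result = new ArrayList<>();
        for (String s : uris) {
            URI uri = URI.create(s);
            if (uri.isAbsolute()) {
                result.add(s);
            } else if (baseOrNull != null) {
                result.add(URI.create(baseOrNull).resolve(uri).toString());
            } else {
                throw new IllegalArgumentException("Relative URI and no base URI: <" + s + ">");
            }
        }
        return result;
    }
}
```

Called with the relativized graph's subject "ntriples/" and a null base, this throws instead of silently producing output that cannot be read back without outside knowledge.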


BananaRDF has some way to use Jena so that it creates the output it
wants.  I thought BananaRDF achieved this by adding some sort of
identifiable marker to URIs as the initial part of the URI.

No.

Here's my understanding of the design & goal of the banana-rdf API (I'm a
recent user)

banana-rdf is effectively a parameterized API for RDF.
The API parameter "binds" the generic banana-rdf API types & operations to
the types & operations of a particular API library (e.g. Jena, Sesame,
etc.)

So normally, it shouldn't matter which library one uses -- Jena, Sesame,
etc. -- the results should be consistent.

For example, reading a graph with one API and reading the same graph with
another API should result in 2 API-specific graphs
that should be isomorphic to each other as far as the W3 RDF spec is
concerned.

Banana-rdf provides a generic API for constructing RDF graphs -- I.e., the
same code can be compiled by binding the parameter to, e.g. The Jena API
or the Sesame API.
Similarly, constructing a graph with the banana-rdf API and serializing
with different libraries -- e.g., Jena, Sesame -- should result in 2
API-specific serializations of graphs that should be isomorphic to each
other as far as the W3 RDF spec is concerned.


- Nicolas.


        Andy

On 09/10/14 00:26, Rouquette, Nicolas F (313D) wrote:
I understand that several folks have had this exception in conjunction
with Fuseki/TDB:

http://jena.markmail.org/search/?q=BadURIException#query:BadURIException+page:1+mid:fur2joez3ny5ibvw+state:results

I've tracked down this exception in a Scala example from the w3c/banana-rdf
project -- specifically:

https://github.com/w3c/banana-rdf/blob/master/examples/src/main/scala/org/w3/banana/examples/IOExample.scala

This exception happens during the RDF/XML serialization:

Unparser.wObjStar() line: 358   
Unparser.wRDF() line: 345       
Unparser.write() line: 247      
Abbreviated.writeBody(Model, PrintWriter, String, Boolean) line: 142    
BaseXMLWriter.writeXMLBody(Model, PrintWriter, String) line: 492        
BaseXMLWriter.write(Model, Writer, String) line: 464    
Abbreviated.write(Model, Writer, String) line: 127      
BaseXMLWriter.write(Model, OutputStream, String) line: 450      
AdapterRDFWriter.write(OutputStream, Graph, PrefixMap, String, Context) line: 52
RDFDataMgr.write$(OutputStream, Graph, RDFFormat) line: 1262    
RDFDataMgr.write(OutputStream, Graph, RDFFormat) line: 1028     
RDFDataMgr.write(OutputStream, Graph, Lang) line: 1018  
JenaRDFWriter$$anon$1$$anonfun$write$1.apply$mcV$sp() line: 20  
JenaRDFWriter$$anon$1$$anonfun$write$1.apply() line: 17 
JenaRDFWriter$$anon$1$$anonfun$write$1.apply() line: 17 
Try$.apply(Function0) line: 191 
JenaRDFWriter$$anon$1.write(Graph, OutputStream, String) line: 17       
JenaRDFWriter$$anon$1.write(Object, OutputStream, String) line: 15      
IOExample$class.main(IOExample, Array[String]) line: 44 
IOExampleWithJena$.main(Array[String]) line: 64 
IOExampleWithJena.main(Array[String]) line: not available       


I believe the problem originates, in part, here:

RDFDataMgr.write$(OutputStream, Graph, RDFFormat) line: 1262


      private static void write$(OutputStream out, Graph graph, RDFFormat serialization)
      {
          WriterGraphRIOT w = createGraphWriter$(serialization) ;
          w.write(out, graph, RiotLib.prefixMap(graph), null, null) ; // line 1262
      }

The last 2 null arguments are the values for the "baseURI" and "context"
parameters.
Are all writers able to cope with a null base URI?





It seems that the Turtle writer can but not the RDF/XML writer.
The BadURIException happens during Unparser.wObjStar() when one of the
Resources has a null URI:

BaseXMLWriter.checkURI(String) line: 820        
BaseXMLWriter.relativize(String) line: 797      
Unparser.wURIreference(String) line: 918        
Unparser.wURIreference(Resource) line: 922      
Unparser.wAboutAttr(Resource) line: 913 
Unparser.wIdAboutAttrOpt(Resource) line: 869    
Unparser.wTypedNodeOrDescriptionLong(Unparser$WType, Resource, Resource, List) line: 830
Unparser.wTypedNodeOrDescription(Unparser$WType, Resource, Resource) line: 764
Unparser.wTypedNode(Resource) line: 737 
Unparser.wObj(Resource, Boolean) line: 677      
Unparser.wObjStar() line: 364   

Because the writer has a null baseURI, BaseXMLWriter.relativize(null) returns null.
This then fails the URI check; hence the BadURIException.

To avoid this problem, I refactored the call to RDFDataMgr.write(),
originally:

def write(graph: Jena#Graph, os: OutputStream, base: String): Try[Unit] = Try {
        import org.w3.banana.jena.Jena.ops._
        val relativeGraph : Jena#Graph = graph.relativize(URI(base))
        RDFDataMgr.write(os, relativeGraph, lang)
      }

To the following:

def write(graph: Jena#Graph, os: OutputStream, base: String): Try[Unit] = Try {
        import org.w3.banana.jena.Jena.ops._
        val relativeGraph : Jena#Graph = graph.relativize(URI(base))
        val serialization: RDFFormat = RDFWriterRegistry.defaultSerialization(lang)
        val wf: WriterGraphRIOTFactory = RDFWriterRegistry.getWriterGraphFactory(serialization)
        if ( wf == null )
              throw new RiotException("No graph writer for " + serialization)
        val w: WriterGraphRIOT = wf.create(serialization)
        w.write(os, relativeGraph, system.RiotLib.prefixMap(relativeGraph), base, null)
      }

This is basically a copy/paste adaptation of what the Jena API does anyway,
with the difference that I pass the graph's base URI to the writer.

Well, the refactored version of the test works, the original doesn't.

- Nicolas.



