paulmillar opened a new issue, #1663: URL: https://github.com/apache/jena/issues/1663
### Version

4.6.0

### Feature

I don't think this is a bug per se, but I seem to have hit a limitation in what `riot` can do.

Since v1.4, PDF has supported embedding an RDF graph as metadata. This has been standardised as [XMP](https://en.wikipedia.org/wiki/Extensible_Metadata_Platform).

I believe that, by convention, XMP uses the empty IRI to indicate that the subject of the triples is the PDF file itself. [The Wikipedia example](https://en.wikipedia.org/wiki/Extensible_Metadata_Platform#Example) suggests this; however, I haven't verified it against the XMP specification.

I wrote some simple metadata in Turtle to illustrate the problem/limitation:

```turtle
@prefix dc: <http://purl.org/dc/elements/1.1/> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>.

<>
    dc:description "An example that demonstrates a problem."@en;
    dc:title "An example title"@en;
    dc:creator "Jane Doe";
    dc:date "2022-12-04";
    dc:language "en-GB";
    .
```

I am able to use the `riot` command to convert this Turtle data into a corresponding RDF/XML file, as needed by XMP.

```console
paul@sprocket:~/Riot problem$ riot --formatted=RDF/XML example.ttl
<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description rdf:about="file:///home/paul/Riot%20problem/example.ttl">
    <dc:language>en-GB</dc:language>
    <dc:date>2022-12-04</dc:date>
    <dc:creator>Jane Doe</dc:creator>
    <dc:title xml:lang="en">An example title</dc:title>
    <dc:description xml:lang="en">An example that demonstrates a problem.</dc:description>
  </rdf:Description>
</rdf:RDF>
paul@sprocket:~/Riot problem$
```

The problem here is that `riot` "helpfully" expands the empty IRI into a corresponding `file:` IRI. Note that the `rdf:Description` element carries an `rdf:about` attribute with the value `file:///home/paul/Riot%20problem/example.ttl`.

This is a problem because

1. the resource is the Turtle file rather than the PDF file,
2. the `file:` IRI is absolute, whereas the PDF file may be renamed or copied onto a different system.

I was hoping for `riot` to generate the following XML:

```xml
<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description rdf:about="">
    <dc:language>en-GB</dc:language>
    <dc:date>2022-12-04</dc:date>
    <dc:creator>Jane Doe</dc:creator>
    <dc:title xml:lang="en">An example title</dc:title>
    <dc:description xml:lang="en">An example that demonstrates a problem.</dc:description>
  </rdf:Description>
</rdf:RDF>
```

As far as I'm aware, the output from `riot` is correct, as the empty IRI is equivalent to the expanded resource (again, I haven't checked this against the RDF specification). Therefore, I wouldn't classify this as a bug.

However, the output isn't what I need, and I haven't found a `riot` option that produces the desired output, i.e. with `rdf:about=""`.

A simple solution might be to add an option that stops riot/Jena from expanding an empty IRI.

A more sophisticated solution would identify IRIs that refer to the input file itself and replace them with the empty IRI.

Just as a side note: embedding the above RDF/XML infoset under a `<x:xmpmeta xmlns:x="adobe:ns:meta/">` element allows [`podofoxmp`](https://github.com/podofo/podofo) to create a new PDF file that includes the desired RDF graph.

### Are you interested in contributing a solution yourself?

None

-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
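
One possible workaround, independent of `riot`, is to post-process the RDF/XML it emits and blank out any `rdf:about` value that equals the input file's IRI. The following is only a minimal sketch of that idea in Python, using the standard library: the function name `blank_out_subject` and the `SAMPLE` constant are hypothetical, it is not part of Jena, and it assumes the file IRI is known in advance and appears only in `rdf:about` attributes (it does not touch `rdf:resource` or other attributes).

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC_NS = "http://purl.org/dc/elements/1.1/"
ABOUT = f"{{{RDF_NS}}}about"

# Keep the familiar prefixes when the tree is serialised again.
ET.register_namespace("rdf", RDF_NS)
ET.register_namespace("dc", DC_NS)


def blank_out_subject(rdf_xml: str, file_iri: str) -> str:
    """Hypothetical helper: replace rdf:about="<file_iri>" with rdf:about=""."""
    root = ET.fromstring(rdf_xml)
    for node in root.iter():
        # Only exact matches are rewritten; other subjects are left alone.
        if node.get(ABOUT) == file_iri:
            node.set(ABOUT, "")
    return ET.tostring(root, encoding="unicode")


# Shortened copy of the riot output shown in the issue.
SAMPLE = """<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description rdf:about="file:///home/paul/Riot%20problem/example.ttl">
    <dc:creator>Jane Doe</dc:creator>
    <dc:title xml:lang="en">An example title</dc:title>
  </rdf:Description>
</rdf:RDF>"""

if __name__ == "__main__":
    print(blank_out_subject(SAMPLE, "file:///home/paul/Riot%20problem/example.ttl"))
```

Because the match is on the exact IRI string, the caller has to supply the same percent-encoded `file:` IRI that `riot` printed. The whitespace and attribute layout of the re-serialised XML may differ from what `riot` produced.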
