On 08/01/15 12:57, Martynas Jusevičius wrote:
Thanks.

Couldn't Jena's IRI be made to extend java.net.URI, override the
methods that differ, and still be accepted where URI is expected?

java.net.URI is a final class.
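For anyone who wants to confirm this without opening the JDK source, a small reflection check is enough. This is only a sketch; the class name UriIsFinalCheck is made up for the example.

```java
import java.lang.reflect.Modifier;
import java.net.URI;

// Hypothetical helper class, named for this example only.
class UriIsFinalCheck {
    // True when the given class is declared final and so cannot be subclassed.
    static boolean isFinal(Class<?> type) {
        return Modifier.isFinal(type.getModifiers());
    }

    public static void main(String[] args) {
        System.out.println("java.net.URI final? " + isFinal(URI.class));
    }
}
```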

Probably other issues as well (IRIs are Unicode, URIs are ASCII; RFC 2396 vs RFC 3986), but that's a blocker in itself.

IRIs are not particularly cheap - the resolver has caching to mitigate this.

The parsing rules, such as the scheme-specific checks that IRI adds over j.n.URI, can have a noticeable cost. Another cost is the use of regexes; using java.net.URI to parse the structure (it has a hand-written URI parser) might partly help.

Slightly bizarrely, if you pass in the components of a java.net.URI, Java builds a string and parses it to get the components again.
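The round trip is visible from the outside. A rough sketch, using the four-argument java.net.URI constructor; the class name UriFromComponents and the example.org values are illustrative only. The path goes in with a raw space and comes back out of the reparsed string in quoted form.

```java
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical helper class, named for this example only.
class UriFromComponents {
    // Builds a URI from separate components. The JDK assembles a string from
    // them (quoting illegal characters) and then parses that string.
    static URI build(String scheme, String host, String path, String fragment)
            throws URISyntaxException {
        return new URI(scheme, host, path, fragment);
    }

    public static void main(String[] args) throws URISyntaxException {
        URI u = build("http", "example.org", "/a b", "frag");
        System.out.println(u);              // http://example.org/a%20b#frag
        System.out.println(u.getRawPath()); // /a%20b
        System.out.println(u.getPath());    // /a b
    }
}
```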

Adding an IRI.toURI operation might help, cf. IRI.toURL. Contributions welcome.
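As a very rough idea of what such a conversion could look like, here is a sketch that uses only java.net.URI and works on the IRI string. It is not jena-iri code: the class IriToUriSketch and its toURI method are hypothetical, and it assumes the IRI string is already acceptable to java.net.URI's parser, which will not hold for every IRI.

```java
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical stand-in for an IRI.toURI operation; not part of jena-iri.
class IriToUriSketch {
    // Parses the IRI string with java.net.URI, then percent-encodes any
    // non-ASCII characters (as UTF-8) so the result is a plain ASCII URI.
    static URI toURI(String iri) throws URISyntaxException {
        URI parsed = new URI(iri);
        return new URI(parsed.toASCIIString());
    }

    public static void main(String[] args) throws URISyntaxException {
        // \u00e9 is e-acute, written as an escape to avoid source-encoding issues.
        System.out.println(toURI("http://example.org/caf\u00e9"));
        // prints http://example.org/caf%C3%A9
    }
}
```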

        Andy


On Tue, Jan 6, 2015 at 10:27 PM, Andy Seaborne <[email protected]> wrote:
On 06/01/15 16:22, Martynas Jusevičius wrote:

Hey,

I'm reading URIs from request input that will end up in an RDF Model.
They can be relative, in which case they need to be resolved, and they
can be invalid, in which case they need to be rejected.

What I'm looking for is to replace new URI/URI.create() and
URI.resolve(uri) usages with a more RDF-compliant solution.
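For reference, the java.net.URI pattern in question looks roughly like the sketch below: resolve relative input against a base and reject input that does not parse. The class name ResolveOrReject and the example.org base are made up for illustration.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.util.Optional;

// Hypothetical helper class, named for this example only.
class ResolveOrReject {
    // Resolves the input against the base; an empty result means the input
    // was rejected because java.net.URI could not parse it.
    static Optional<URI> resolve(URI base, String input) {
        try {
            return Optional.of(base.resolve(new URI(input)));
        } catch (URISyntaxException e) {
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        URI base = URI.create("http://example.org/data/");
        System.out.println(resolve(base, "item/1"));    // Optional[http://example.org/data/item/1]
        System.out.println(resolve(base, "not a uri")); // Optional.empty
    }
}
```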

I wanted to check if IRIResolver is the right class for this purpose?
Are there any examples?


org.apache.jena.riot.system.IRIResolver (not the legacy one in the old N3
parser).

This uses the jena-iri library which is quite, err, "precise".

See also the parsing pipeline that uses CheckerIRI for checking.  In fact,
you might want to use StreamRDF (where all parsers send things).

         Andy



Thanks.

Martynas


