On 30/03/15 14:21, Reto Gmür wrote:
> The current code uses an interface
> IRI (different from URL and URI in the java core library for which I fail
> to understand the justifying use cases)
java.net.URL has inappropriate operations (.openConnection(), .openStream()).
The stumbling block is the desire for a typed interface
and the fact that subjects are "BlankNodeOrIRI".
java.net.URI is a class and is final.
That can be worked around although IMO it's not pretty to have, say, a
union for BlankNodeOrIRI + variations on all method calls mentioning
BlankNodeOrIRI as arguments.
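
To make the typing point concrete, here is a rough sketch, not the
actual commons-rdf interfaces: all type and method names below
(UriBackedIRI, SimpleBlankNode, label(), describeSubject()) are made up
for illustration. Because java.net.URI is final it cannot implement
BlankNodeOrIRI itself, so it has to sit behind a wrapper that does.

```java
import java.net.URI;

// Hypothetical names throughout; this is not the commons-rdf API.
class BlankNodeOrIriSketch {
    interface RDFTerm {}
    interface BlankNodeOrIRI extends RDFTerm {}
    interface IRI extends BlankNodeOrIRI { String getIRIString(); }
    interface BlankNode extends BlankNodeOrIRI { String label(); }

    // java.net.URI is final, so it cannot implement IRI directly;
    // a wrapper class carries the interface on its behalf.
    static final class UriBackedIRI implements IRI {
        private final URI uri;
        UriBackedIRI(URI uri) { this.uri = uri; }
        @Override public String getIRIString() { return uri.toString(); }
    }

    static final class SimpleBlankNode implements BlankNode {
        private final String label;
        SimpleBlankNode(String label) { this.label = label; }
        @Override public String label() { return label; }
    }

    // One method signature covers both kinds of subject, with no
    // per-type overloads and no hand-rolled union class.
    static String describeSubject(BlankNodeOrIRI subject) {
        if (subject instanceof IRI) {
            return "<" + ((IRI) subject).getIRIString() + ">";
        }
        return "_:" + ((BlankNode) subject).label();
    }

    public static void main(String[] args) {
        System.out.println(describeSubject(
                new UriBackedIRI(URI.create("http://example.org/s"))));
        System.out.println(describeSubject(new SimpleBlankNode("b0")));
    }
}
```

Without the shared supertype, describeSubject() would need one overload
per subject kind, or a separate union holder passed everywhere.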
java.net.URI is not bad: it accepts non-ASCII characters (escaped as
UTF-8) and IPv6 address literals, so it is not limited to strict
US-ASCII.
The constructor has an implicit parser behind it so it is not a simple
wrapper of little cost. The parser is good for the syntax. It does not
do punycode in toASCIIString().
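
A small sketch of both points, using only java.net.URI and
java.net.IDN from the JDK. The class and helper names (UriAsciiSketch,
asciiForm, parses, punycodeHost) and the example hosts are made up for
illustration. The constructor rejects a raw space because it parses the
whole string, and toASCIIString() percent-encodes the non-ASCII host
rather than producing an "xn--" label; punycode has to be requested
separately.

```java
import java.net.IDN;
import java.net.URI;
import java.net.URISyntaxException;

// Illustrative helper names; only URI and IDN are real JDK classes.
class UriAsciiSketch {
    // The constructor runs a full syntax parse before returning.
    static String asciiForm(String iri) {
        try {
            return new URI(iri).toASCIIString();
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }

    // True if the single-argument constructor accepts the string.
    static boolean parses(String iri) {
        try {
            new URI(iri);
            return true;
        } catch (URISyntaxException e) {
            return false;
        }
    }

    // Punycode (IDNA ToASCII) is a separate step via java.net.IDN.
    static String punycodeHost(String host) {
        return IDN.toASCII(host);
    }

    public static void main(String[] args) {
        String iri = "http://b\u00FCcher.example/";
        System.out.println(asciiForm(iri));                       // no "xn--" label
        System.out.println(punycodeHost("b\u00FCcher.example"));  // starts with "xn--"
        System.out.println(parses("http://example.org/a b"));     // raw space is rejected
    }
}
```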
There is a question of how to treat bad data - whether to allow bad IRIs
at all so that an application can use the API and then clean the data up
by processing the RDF or whether it needs to clean it beforehand.
That's a style issue, not a purely technical one.
Check early vs be as permissive as possible.
(Example: other people's data is often fit for their purpose but may be
strictly "bad". An ETL pipeline might want to just get the data in and
then fix it while working in RDFTerms, e.g. replace " " with "%20" or
apply NFC normalization.)
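
As a sketch of that kind of after-the-fact cleanup, operating on the
IRI string only: the class and method names (IriCleanupSketch,
cleanLexicalForm) are made up, and a real pipeline would likely need
more rules than these two.

```java
import java.text.Normalizer;

// Hypothetical cleanup step; only Normalizer is a real JDK class.
class IriCleanupSketch {
    // Apply Unicode NFC, then escape raw spaces as %20.
    static String cleanLexicalForm(String raw) {
        String nfc = Normalizer.normalize(raw, Normalizer.Form.NFC);
        return nfc.replace(" ", "%20");
    }

    public static void main(String[] args) {
        // "e" followed by a combining acute accent, plus a raw space.
        System.out.println(cleanLexicalForm("http://example.org/caf\u0065\u0301 menu"));
    }
}
```

This only works if the API let the bad IRI in to begin with, which is
the permissive side of the trade-off.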
As commons-rdf is supposed to be neutral to underlying providers, it is
not right for it to make that call.
Andy