afs commented on issue #2052:
URL: https://github.com/apache/jena/issues/2052#issuecomment-1773725617

   It's a bit messier than that. There is a long history and a 
very slow, careful evolution in the standards and implementations on the web.
   
   Browsers handle all this for the user so often the user does not notice. 
Toolkits nowadays often seem to do it as well (the newer `java.net.http`, for 
example) so applications rarely see this. But if you push on detail and 
correctness, it can show through.
   
   Nowadays, I see "URI" beginning to be used generally and interchangeably 
with "IRI", and "IRI" fading away.
   The extreme case of this is https://url.spec.whatwg.org/ where "URL" is used 
for everything within the HTML area.
   
   URI = [RFC3986](https://datatracker.ietf.org/doc/html/rfc3986)
   IRI = [RFC3987](https://datatracker.ietf.org/doc/html/rfc3987)
   
   IRIs (Unicode) are abstract and have to be mapped to URIs to be used.
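
   As a rough illustration of that mapping, here is a minimal sketch using only the JDK. The class name `IriToUri` and the helper `toAsciiUri` are made up for this example; `java.net.URI.toASCIIString()` percent-encodes non-ASCII characters as UTF-8 octets. It is not a full RFC3987 conversion (it does nothing special for host names, for instance).

```java
import java.net.URI;

public class IriToUri {
    // Hypothetical helper: percent-encode the non-ASCII characters of an
    // IRI-like string so the result contains only US-ASCII characters.
    static String toAsciiUri(String iri) {
        return URI.create(iri).toASCIIString();
    }

    public static void main(String[] args) {
        // "caf\u00e9" is "café"; é is U+00E9, which is C3 A9 in UTF-8.
        System.out.println(toAsciiUri("http://example.org/caf\u00e9"));
        // prints http://example.org/caf%C3%A9
    }
}
```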
   
   The HTTP infrastructure grew up around old URLs which have an ALPHA 
production `[A-Za-z]` of ABNF 
[RFC2234](https://www.rfc-editor.org/rfc/rfc2234). When using someone else's 
server, you can't rely on the server understanding, e.g., UTF-8.
   
   In parallel, there is also 
[punycode](https://datatracker.ietf.org/doc/rfc3492/) to be able to write 
non-ASCII host names in strict ASCII URLs so that DNS can be used for the host 
name.
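
   For the host name part, the JDK's `java.net.IDN` class does the ASCII-compatible encoding. A small sketch, where the class name `PunycodeHost` and its two helpers are invented for illustration and the host name is just an example:

```java
import java.net.IDN;

public class PunycodeHost {
    // Hypothetical helper: convert a Unicode host name to its ASCII
    // ("xn--") form so it can be looked up in DNS.
    static String toAsciiHost(String host) {
        return IDN.toASCII(host);
    }

    // Hypothetical helper: convert the ASCII form back for display.
    static String toUnicodeHost(String host) {
        return IDN.toUnicode(host);
    }

    public static void main(String[] args) {
        // "b\u00fccher" is "bücher".
        String ascii = toAsciiHost("b\u00fccher.example");
        System.out.println(ascii);                 // prints xn--bcher-kva.example
        System.out.println(toUnicodeHost(ascii));  // prints bücher.example
    }
}
```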
   
   Java's `java.net.URI` isn't even RFC3986 - it's closest to the preceding 
[RFC2396](https://datatracker.ietf.org/doc/html/rfc2396) but I don't think it's 
ever going to be updated.
   
   `jena-iri` started because compliance to the standards was not great.
   
   In Jena, we now have an abstraction `IRIx` and a test suite of expectations 
(in `jena-core`) so that the URI/IRI library can be switched. We'll replace 
use of `jena-iri` with the new code sometime; `jena-iri` will remain for 
testing and for a part of the legacy code which isn't normally used by apps but 
is available.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

