At ("@") and colon (":") in path segments should not be escaped by
IRI.normalize()
----------------------------------------------------------------------------------
Key: ABDERA-225
URL: https://issues.apache.org/jira/browse/ABDERA-225
Project: Abdera
Issue Type: Bug
Affects Versions: 0.4.0
Reporter: Niklas Lindström
The `normalize` method on IRI objects escapes ":" and "@" in path segments,
which doesn't seem to comply with the ABNF rules given in the IRI RFC (RFC 3987
section 2.2. <http://tools.ietf.org/html/rfc3987#section-2.2>).
It seems that `isegment` explicitly *allows* for ":" and "@", according to:
isegment = *ipchar
...
ipchar = iunreserved / pct-encoded / sub-delims / ":" / "@"
The `normalize` method encodes each segment by filtering on
`CharUtils.Profile.IPATHNODELIMS`, via `CharUtils.is_ipathnodelims`, which uses
`is_ipath` && `!isGenDelim`.
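To illustrate why that combination rejects ":" and "@", here is a minimal, ASCII-only sketch. It is not Abdera's code: the class `PathCharSketch` and its methods are hypothetical, and `isIpath` is only an assumed stand-in for what `is_ipath` accepts. The gen-delims set is the one from RFC 3986 section 2.2, and it happens to contain both ":" and "@".

```java
final class PathCharSketch {
    private PathCharSketch() {}

    // gen-delims = ":" / "/" / "?" / "#" / "[" / "]" / "@"   (RFC 3986, section 2.2)
    static boolean isGenDelim(char c) {
        return ":/?#[]@".indexOf(c) >= 0;
    }

    // ASCII part of ipchar, leaving out pct-encoded and the non-ASCII ucschar range:
    // unreserved / sub-delims / ":" / "@"
    static boolean isIpcharAscii(char c) {
        boolean alnum = (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z')
                || (c >= '0' && c <= '9');
        return alnum
                || "-._~".indexOf(c) >= 0          // rest of unreserved
                || "!$&'()*+,;=".indexOf(c) >= 0   // sub-delims
                || c == ':' || c == '@';
    }

    // Assumed stand-in for is_ipath: an ipchar or the "/" segment separator.
    static boolean isIpath(char c) {
        return isIpcharAscii(c) || c == '/';
    }

    // Mirrors the combination described above: is_ipath && !isGenDelim.
    static boolean isIpathNoDelims(char c) {
        return isIpath(c) && !isGenDelim(c);
    }
}
```

Under these assumptions `isIpcharAscii(':')` is true while `isIpathNoDelims(':')` is false, so a filter built on the second predicate would percent-encode a character that the segment grammar allows.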
The effect is that this (groovy):
import org.apache.abdera.i18n.iri.IRI
println new IRI("http://example.org/publ/1999:175").normalize()
println new IRI("http://example.org/people/admin@example.org").normalize()
prints out:
http://example.org/publ/1999%3A175
http://example.org/people/admin%40example.org
whereas I would expect no escaping at all. Note that:
println new java.net.URI("http://example.org/publ/1999:175").normalize()
println new java.net.URI("http://example.org/people/admin@example.org").normalize()
results in the expected:
http://example.org/publ/1999:175
http://example.org/people/admin@example.org
Since an ipath is composed of "/"-separated isegments (with some specifics
regarding absolute, rootless and noscheme), perhaps a solution would be to
rework `is_ipath` into calls fully mimicking these ABNF rules, and use
`UrlEncoding.encode(..., Profile.IPATH.filter())` in `IRI.normalize(path)`?
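As a rough sketch of the behaviour I have in mind (not a patch against Abdera, and not using `UrlEncoding` or `Profile`), the following encodes a path so that "/" separators and every ASCII `ipchar` survive untouched. The class `SegmentEncodeSketch` and its methods are hypothetical. Two simplifications are assumed: non-ASCII characters are passed through unchanged rather than checked against `ucschar`, and an existing "%XX" triplet is kept as-is.

```java
final class SegmentEncodeSketch {
    private SegmentEncodeSketch() {}

    // Punctuation allowed in a segment: unreserved, sub-delims, ":" and "@".
    private static final String ALLOWED_PUNCT = "-._~!$&'()*+,;=:@";

    static boolean isAllowedAscii(char c) {
        return (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z')
                || (c >= '0' && c <= '9') || ALLOWED_PUNCT.indexOf(c) >= 0;
    }

    static boolean isHex(char c) {
        return (c >= '0' && c <= '9') || (c >= 'A' && c <= 'F')
                || (c >= 'a' && c <= 'f');
    }

    static String encodePath(String path) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < path.length(); i++) {
            char c = path.charAt(i);
            if (c == '/' || c >= 0x80 || isAllowedAscii(c)) {
                // Separator, non-ASCII (simplification), or ASCII ipchar: keep.
                out.append(c);
            } else if (c == '%' && i + 2 < path.length()
                    && isHex(path.charAt(i + 1)) && isHex(path.charAt(i + 2))) {
                // Start of an existing pct-encoded triplet; the two hex
                // digits are copied by the following iterations.
                out.append(c);
            } else {
                // Any other ASCII character is percent-encoded.
                out.append('%').append(String.format("%02X", (int) c));
            }
        }
        return out.toString();
    }
}
```

With this sketch, `encodePath("/publ/1999:175")` and `encodePath("/people/admin@example.org")` return their input unchanged, while a space in a segment still becomes "%20".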