On 7/9/2012 10:59 PM, Paul Sandoz wrote:
Hi Joe,
What happens when someone logs a bug for system IDs containing IPv6 addresses
and non-percent encoded international characters?
An exception would be expected, just as if Xerces were used.
On Jul 10, 2012, at 3:42 AM, Joe Wang wrote:
Hi Paul,
I'm back from vacation.
You're right, but such an error is also expected. The original design never
tried to outdo java.net.URL: if a system ID is rejected by java.net.URL, it
results in an exception.
The patch that supplied the extra encoding was provided to both Sun and Apache,
and applied to Sun sources. However, it never went into the Apache code base
(refer to https://issues.apache.org/jira/browse/XERCESJ-1156). I thought of
removing the patch, bringing our source in sync with that of Apache. But then I
feared that we might get a regression since the patch has been in the source
for so many years.
Hence this ugly solution (removing the patch would be prettier): leave the old
change as is, but use java.net.URL in all other cases.
java.net.URL is being used in all cases:
Except that an encoded URL is the input when escapeNonUSAscii is used.
    if (reader == null) {
        stream = xmlInputSource.getByteStream();
        if (stream == null) {
            URL location = new URL(escapeNonUSAscii(expandedSystemId));
            URLConnection connect = location.openConnection();
            if (!(connect instanceof HttpURLConnection)) {
                stream = connect.getInputStream();
            }
If this is really about supporting non-percent encoded international characters
in the system ID, then you can make a simple fix to support IPv6-based URLs in
general: do not percent-encode *any* ASCII characters.
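
A minimal sketch of that suggestion, for illustration only. The class
SystemIdEscaper and the method escapeNonAsciiOnly are hypothetical names and
are not the escapeNonUSAscii method in the JDK or Xerces sources. It assumes
non-ASCII characters should be encoded as percent-escaped UTF-8 bytes, and it
copies every ASCII character through unchanged, so the brackets of an IPv6
literal survive:

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration; not the JDK's escapeNonUSAscii.
class SystemIdEscaper {

    // Percent-encodes only non-ASCII code points (as UTF-8 bytes).
    // Every ASCII character, including '[', ']' and ':', is left as is.
    static String escapeNonAsciiOnly(String systemId) {
        StringBuilder out = new StringBuilder(systemId.length());
        int i = 0;
        while (i < systemId.length()) {
            int cp = systemId.codePointAt(i);
            int len = Character.charCount(cp);
            if (cp < 0x80) {
                // ASCII: copy through untouched.
                out.append((char) cp);
            } else {
                // Non-ASCII: emit each UTF-8 byte as %XX.
                byte[] utf8 = systemId.substring(i, i + len)
                        .getBytes(StandardCharsets.UTF_8);
                for (byte b : utf8) {
                    out.append('%').append(String.format("%02X", b & 0xFF));
                }
            }
            i += len;
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // The IPv6 literal is preserved; only the accented letter is escaped.
        System.out.println(
                escapeNonAsciiOnly("http://[::1]:8080/dtd/caf\u00e9.dtd"));
    }
}
```

Whether this matches what the existing patch intends for other ASCII
characters (spaces, for example) is a separate question that the sketch does
not try to answer.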
When encoding a URL, aren't reserved characters supposed to be encoded
as well?
Joe
Paul.
By the way, we can only consider this one for 7u8 now.
Thanks,
Joe