On 7/9/2012 10:59 PM, Paul Sandoz wrote:
Hi Joe,

What happens when someone logs a bug for system IDs containing IPv6 addresses 
and non-percent encoded international characters?

An exception would be expected, just as if Xerces were used.



On Jul 10, 2012, at 3:42 AM, Joe Wang wrote:
Hi Paul,

I'm back from vacation.

You're right. But such an error is also expected.  The original design never 
tried to outdo java.net.URL.  If a system ID is rejected by java.net.URL, it 
shall result in an exception.
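
To make the "fails URL" criterion concrete, here is a minimal sketch. The class and method names (SystemIdCheck, isAcceptedByUrl) are hypothetical and not part of the JDK or Xerces sources. It only shows what the java.net.URL constructor itself rejects, such as a missing protocol; it makes no claim about which system IDs the parser rejects later, when the connection is opened.

```java
import java.net.MalformedURLException;
import java.net.URL;

// Hypothetical helper, for illustration only.
class SystemIdCheck {
    // Returns true if java.net.URL can parse the system ID,
    // false if the constructor throws MalformedURLException.
    static boolean isAcceptedByUrl(String systemId) {
        try {
            new URL(systemId);
            return true;
        } catch (MalformedURLException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // An IPv6 literal in brackets is valid URL syntax.
        System.out.println(isAcceptedByUrl("http://[::1]/catalog.dtd"));
        // No protocol at all: the constructor throws.
        System.out.println(isAcceptedByUrl("catalog.dtd"));
    }
}
```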

The patch that supplied the extra encoding was provided to both Sun and Apache, 
and applied to Sun sources. However, it never went into the Apache code base 
(refer to https://issues.apache.org/jira/browse/XERCESJ-1156).  I thought of 
removing the patch, bringing our source in sync with that of Apache. But then I 
feared that we might get a regression since the patch has been in the source 
for so many years.

Thus this ugly solution (removing the patch would be prettier): leave the old 
change as is, but use java.net.URL in all other cases.

java.net.URL is being used in all cases:

Except that an encoded URL is the input when escapeNonUSAscii is used.


        if (reader == null) {
            stream = xmlInputSource.getByteStream();
            if (stream == null) {
                URL location = new URL(escapeNonUSAscii(expandedSystemId));
                URLConnection connect = location.openConnection();
                if (!(connect instanceof HttpURLConnection)) {
                    stream = connect.getInputStream();
                }

If this is really about supporting non-percent encoded international characters 
in the system ID, then you can make a simple fix to support IPv6-based URLs in 
general: do not percent-encode *any* ASCII characters.
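
A minimal sketch of that suggestion follows. It assumes a hypothetical helper named escapeNonAscii, which is not the actual escapeNonUSAscii from the patch, and it assumes UTF-8 as the encoding for the escaped bytes. Every ASCII character, including the '[', ']' and ':' of an IPv6 literal, passes through unchanged; only characters above 0x7F are percent-encoded.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, for illustration only.
class SystemIdEscaper {
    // Percent-encode only non-ASCII characters (as UTF-8 bytes);
    // leave every ASCII character untouched.
    static String escapeNonAscii(String systemId) {
        StringBuilder out = new StringBuilder(systemId.length());
        int i = 0;
        while (i < systemId.length()) {
            int cp = systemId.codePointAt(i);
            int len = Character.charCount(cp);
            if (cp < 0x80) {
                out.append((char) cp);
            } else {
                byte[] utf8 = systemId.substring(i, i + len)
                                      .getBytes(StandardCharsets.UTF_8);
                for (byte b : utf8) {
                    out.append('%').append(String.format("%02X", b & 0xFF));
                }
            }
            i += len;
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // The IPv6 host is preserved; only the accented character is escaped.
        System.out.println(escapeNonAscii("http://[::1]:8080/dtd/caf\u00e9.dtd"));
    }
}
```

With this approach the string handed to java.net.URL keeps any IPv6 literal intact. Whether reserved ASCII characters should also be escaped is a separate question that this sketch deliberately leaves alone.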

When encoding a URL, aren't reserved characters supposed to be encoded as well?

Joe


Paul.


By the way, we can only consider this one for 7u8 now.

Thanks,
Joe
