It turns out the recent failures on Solaris x64 hudson are due to an illegal character in the URI host name string:

Testcase: testCrossPlatformNormalise(org.apache.river.impl.net.UriStringTest): Caused an ERROR
Illegal character in hostname at index 13: http://hudson_solaris:9081/nonactivatablegroup-dl.jar
java.net.URISyntaxException: Illegal character in hostname at index 13: http://hudson_solaris:9081/nonactivatablegroup-dl.jar

When the string is passed to the constructor URI(String str), the hostname is incorrectly parsed only as an authority component rather than as a host, which leads to the errors seen in the failing hudson tests.
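
A minimal sketch of how this can be reproduced with plain java.net.URI on a stock JDK.  The class name UriHostCheck is made up for illustration and is not part of River.  The single-argument parse accepts the string, but getHost() returns null; asking explicitly for a server-based authority raises the exception seen in the test.

```java
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical demo class, not part of River.
final class UriHostCheck {

    /** Returns the parsed host, or null if the authority is not a valid host. */
    static String hostOf(String s) {
        // URI.create wraps new URI(String) and throws IllegalArgumentException
        // only if the string cannot be parsed at all.
        return URI.create(s).getHost();
    }

    /** Returns the index of the offending character, or -1 if the host is valid. */
    static int serverAuthorityErrorIndex(String s) {
        URI uri = URI.create(s); // parses even with an underscore in the name
        try {
            uri.parseServerAuthority(); // insists on a valid host[:port]
            return -1;
        } catch (URISyntaxException e) {
            return e.getIndex();
        }
    }

    public static void main(String[] args) {
        String s = "http://hudson_solaris:9081/nonactivatablegroup-dl.jar";
        System.out.println("host = " + hostOf(s));
        System.out.println("error index = " + serverAuthorityErrorIndex(s));
    }
}
```

Any code that goes on to call getHost() on such a URI has to be prepared for null.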

So the good news is, it isn't a problem with Solaris x64.

But it does raise some important questions.

But first some background...

The summary of semantic changes:

   * Previous releases of the Jini & River PreferredClassProvider have
     relied upon URL and the calling thread's context ClassLoader
     (called the parent loader) to determine the correct
     PreferredClassLoader.  Basically, URL resolves to an IP address, so
     the ClassLoader was determined by the IP address of the codebase
     and the calling thread's context ClassLoader.
   * We could use URI and the calling thread's context ClassLoader
     instead.  This means that the PreferredClassLoader would be
     determined by the normalised form of the URI and the context
     ClassLoader.
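
To make the second option concrete, here is a rough sketch of a lookup key built from a URI and the parent loader.  CodebaseKey is a hypothetical name and this is not River's actual implementation; it assumes a single codebase URI that has already been normalised.

```java
import java.net.URI;
import java.util.Objects;

// Hypothetical lookup key, for illustration only.
final class CodebaseKey {
    private final URI codebase;       // assumed to be already normalised
    private final ClassLoader parent; // the caller's context ClassLoader, may be null

    CodebaseKey(URI codebase, ClassLoader parent) {
        this.codebase = Objects.requireNonNull(codebase, "codebase");
        this.parent = parent;
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof CodebaseKey)) return false;
        CodebaseKey k = (CodebaseKey) o;
        // URI.equals compares the URI's components textually; no DNS lookup.
        // Parent loaders are compared by identity.
        return codebase.equals(k.codebase) && parent == k.parent;
    }

    @Override
    public int hashCode() {
        return 31 * codebase.hashCode() + System.identityHashCode(parent);
    }

    public static void main(String[] args) {
        ClassLoader ctx = Thread.currentThread().getContextClassLoader();
        CodebaseKey a = new CodebaseKey(URI.create("http://example.com/a-dl.jar"), ctx);
        CodebaseKey b = new CodebaseKey(URI.create("http://example.com/a-dl.jar"), ctx);
        System.out.println("same loader key: " + a.equals(b));
    }
}
```

A key like this could index a map of class loaders, so two requests with the same URI and the same parent loader would resolve to the same loader.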

The nitty gritty:

   * Relying on the IP address probably made sense in the 90's, but
     today there are issues with virtual hosts, dynamic IP addresses,
     maintaining a fixed IP address over time, and failover codebase
     replication, where the codebase must always appear as the same IP
     address to clients.  You might also imagine that if Jini / River
     hits the internet, NAT and routing would cause some big
     problems.  This doesn't mean we couldn't continue to use URL and
     provide a URL Handler for some new protocol that solves these
     issues.
   * Changing to URI brings some big benefits, but it comes at a
     price.  The benefits are an added layer of indirection and a
     documented standard that can be used to predict ClassLoader
     selection reliably, regardless of protocol, plus cheaper codebase
     replication, backup, regional redirection and hosting of
     codebases.  (Obviously signing jar files will be an important step
     to prevent unwanted codebase mixing, e.g. a remote codebase attack
     with DNS poisoning.)  Dynamic IP addresses, virtual hosts and
     failover hosting will work too.  What's the price, you may ask?
     For proper comparison, URIs must have a strictly restricted
     character set and be normalised, e.g. legal but escaped characters
     must be unescaped, the scheme must be in lower case, the host must
     also be in lower case, and the path is case sensitive (file URL
     paths on Windows must be converted to upper case).  Only after
     normalisation is complete can we accurately call hashCode() and
     equals() on URI instances, since this eliminates false negatives.

       * Avoid using the underscore (_) character in machine names.
         Internet standards dictate that domain names conform to the
         host name requirements described in Internet Official Protocol
         Standards RFC 952 and RFC 1123. Domain names must contain only
         letters (upper or lower case) and digits. Domain names can
         also contain dash characters ( - ) as long as the dashes are
         not on the ends of the name. Underscore characters ( _ ) are
         not supported in the host name.

   * The huge benefit is that we can now perform URI string based
     comparison without relying on URL Handlers for identity, so
     future URL Handlers will also have the same expected behaviour and
     must conform to standards.  This will also have huge performance
     benefits: class resolution will no longer block on DNS calls, nor
     will equals and hashCode comparison rely on DNS resolution.
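
The normalisation steps above can be sketched roughly as follows.  This is not the River implementation: UriNormaliser is a hypothetical name, the sketch assumes hierarchical URIs with a valid host, it leaves out the Windows file path step, and it decodes every escape via the java.net.URI getters, which is a simplification (an escaped "/" or "&" would change meaning).

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.util.Locale;

// Hypothetical helper, for illustration only.
final class UriNormaliser {

    static URI normalise(URI uri) {
        // Reject opaque URIs and authorities that did not parse as a host
        // (for example a name containing an underscore).
        if (uri.isOpaque() || (uri.getHost() == null && uri.getAuthority() != null)) {
            throw new IllegalArgumentException("unsupported URI: " + uri);
        }
        URI n = uri.normalize(); // removes "." and ".." path segments
        try {
            // The getters return decoded components; the multi-argument
            // constructor re-escapes only characters that are not legal.
            return new URI(lower(n.getScheme()), n.getUserInfo(), lower(n.getHost()),
                    n.getPort(), n.getPath(), n.getQuery(), n.getFragment());
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }

    private static String lower(String s) {
        return s == null ? null : s.toLowerCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        URI raw = URI.create("HTTP://Example.COM:9081/a/./b/../%7Efoo/x.jar");
        System.out.println(normalise(raw));
    }
}
```

Without this step, "http://example.com/%7Efoo" and "http://example.com/~foo" compare as unequal with URI.equals(), which is the kind of false negative mentioned above.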

Currently, non-conforming URLs such as "http://hudson_solaris:9081/nonactivatablegroup-dl.jar" can be loaded; these will no longer be supportable with URI.
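
A simple check along the lines of the host name rules quoted above (letters, digits, and dashes that are not at the ends of a label) could flag such names up front.  HostNameRules is a hypothetical helper written for this mail, and it does not check length limits.

```java
import java.util.regex.Pattern;

// Hypothetical helper, for illustration only.
final class HostNameRules {
    // One label: starts and ends with a letter or digit, dashes allowed inside.
    private static final String LABEL = "[A-Za-z0-9](?:[A-Za-z0-9-]*[A-Za-z0-9])?";
    // A host name: one or more labels separated by dots.
    private static final Pattern HOST =
            Pattern.compile(LABEL + "(?:\\." + LABEL + ")*");

    static boolean isConforming(String host) {
        return host != null && HOST.matcher(host).matches();
    }

    public static void main(String[] args) {
        System.out.println(isConforming("hudson_solaris")); // underscore is rejected
        System.out.println(isConforming("hudson-solaris")); // inner dash is accepted
    }
}
```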

What do you say? Do I revert to tried and tested URL (the devil we know) or do I report a bug against the use of an underscore in the hudson_solaris hostname and continue refining the use of URI?

Regards,

Peter.




