Thanks Chris,

The dev list has been quiet, but we still appear to be making good progress. My efforts look like they complement Greg's work and the work Gregg's planning, so the next release will hopefully fix a number of long-standing issues. With fundamental semantic changes to PreferredClassProvider, it might be worth incrementing River to version 3.0. The change from URL to URI may cause problems in some deployment environments; it isn't 100% backward compatible, although on a positive note, no code needs to be recompiled.

Cheers,

Peter.



On 6/08/2012 11:19 PM, Christopher Dolan wrote:
I say file a bug against the underscore. I've fought this battle before and 
lost...
http://domainkeys.sourceforge.net/underscore.html
Chris

-----Original Message-----
From: Peter Firmstone [mailto:[email protected]]
Sent: Monday, August 06, 2012 7:39 AM
To:[email protected]
Subject: PreferredClassProvider: URL and URI

It turns out the recent failures on Solaris x64 hudson are due to an
illegal character in the URI host name string:

Testcase:
testCrossPlatformNormalise(org.apache.river.impl.net.UriStringTest):
Caused an ERROR
Illegal character in hostname at index 13:
http://hudson_solaris:9081/nonactivatablegroup-dl.jar
java.net.URISyntaxException: Illegal character in hostname at index 13:
http://hudson_solaris:9081/nonactivatablegroup-dl.jar

The hostname is incorrectly parsed only as an authority component when
passed to the constructor URI(String str), leading to the errors seen in
failing hudson tests.
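
A minimal sketch of the behaviour described above, assuming the documented
java.net.URI rules: as far as I understand them, URI(String) does not reject
the underscore outright but falls back to treating "hudson_solaris:9081" as a
registry-based authority (so getHost() is null), and the "Illegal character
in hostname" error only surfaces when a server-based authority is required,
for example via parseServerAuthority(). The class and method names below are
hypothetical and are not part of River.

```java
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical helper, not River code: reports whether a URI string
// yields a usable server-based host name.
final class ServerAuthorityCheck {

    /**
     * Returns the host of a server-based authority, or null when the
     * string is not a valid URI or its host cannot be parsed.
     */
    static String serverHostOrNull(String uri) {
        try {
            // parseServerAuthority() insists on user-info, host and port;
            // it throws URISyntaxException for hosts such as hudson_solaris.
            return new URI(uri).parseServerAuthority().getHost();
        } catch (URISyntaxException e) {
            return null;
        }
    }
}
```

Called with the failing codebase string from the test, this returns null,
whereas the same string with a dash in place of the underscore returns the
host name.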

So the good news is, it isn't a problem with Solaris x64.

But it does raise some important questions.

But first some background...

The summary of semantic changes:

     * Previous releases of Jini & River PreferredClassProvider have
       relied upon URL and the calling thread's context ClassLoader
       (called the parent loader) to determine the correct
       PreferredClassLoader.  Basically URL resolves to an IP address, so
       the ClassLoader was determined by the IP address of the codebase
       and the calling thread's context ClassLoader.
     * We could use URI and the calling thread's context ClassLoader
       instead.  This means that the PreferredClassLoader would be
       determined by the normalised form of the URI and the context
       ClassLoader.
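
To make the second option concrete, here is a rough sketch of a lookup key
built from a normalised URI string plus the parent loader.  It is only an
illustration under my own assumptions: the class name LoaderKeySketch is
hypothetical, it handles a single codebase URI, and it is not the key
PreferredClassProvider actually uses.

```java
import java.net.URI;

// Hypothetical illustration, not River code: identity is the URI's
// string form plus the parent ClassLoader, so no DNS lookup is needed.
final class LoaderKeySketch {
    private final String codebase;     // assumed to be already normalised
    private final ClassLoader parent;  // the calling thread's context loader

    LoaderKeySketch(URI normalisedCodebase, ClassLoader parent) {
        this.codebase = normalisedCodebase.toString();
        this.parent = parent;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof LoaderKeySketch)) return false;
        LoaderKeySketch other = (LoaderKeySketch) o;
        // Plain string comparison for the codebase, reference identity
        // for the parent loader.
        return codebase.equals(other.codebase) && parent == other.parent;
    }

    @Override
    public int hashCode() {
        return 31 * codebase.hashCode() + System.identityHashCode(parent);
    }
}
```

Two keys compare equal only when the codebase strings match exactly and the
parent loader is the same instance, which is why normalisation has to happen
before the key is built.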

The nitty gritty:

     * Relying on the IP address probably made sense in the 90's; today
       there are issues with virtual hosts, dynamic IP addresses,
       maintaining a fixed IP address over time, and failover codebase
       replication so that the codebase always appears as the same IP
       address to clients.   You might also imagine that if Jini / River
       hits the internet, NAT and routing would cause some big
       problems.  This doesn't mean we couldn't continue to use URL and
       provide a URL Handler for some new protocol that solves these issues.
     * Changing to URI brings some big benefits, but it comes at a
       price.  The benefits are an added layer of indirection and a
       documented standard that can be used to predict ClassLoader
       selection reliably, regardless of protocol.  It also allows cheaper
       codebase replication, backup, regional redirection and hosting of
       codebases.  (Obviously signing jar files will be an important step
       to prevent unwanted codebase mixing, e.g. a remote codebase attack
       with DNS poisoning).  Dynamic IP addresses, virtual hosts and
       failover hosting will work too.  What's the price, you may ask?  For
       proper comparison, URIs must have a strictly restricted character
       set and be normalised, e.g. legal but escaped characters must be
       unescaped, the scheme must be in lower case, the host must also be
       in lower case, the path is case sensitive (file URL paths on
       Windows must be converted to upper case).  Only after
       normalisation is complete can we accurately call hashCode() and
       equals() on URI instances, since this eliminates false negatives.

         * Avoid using the underscore (_) character in machine names.
           Internet standards dictate that domain names conform to the
           host name requirements described in Internet Official Protocol
           Standards RFC 952 and RFC 1123. Domain names must contain only
           letters (upper or lower case) and digits. Domain names can
           also contain dash characters ( - ) as long as the dashes are
           not on the ends of the name. Underscore characters ( _ ) are
           not supported in the host name.

     * The huge benefit is that we can now perform URI string based
       comparison, without relying on URL Handlers for identity, so
       future URL Handlers will also have the same expected behaviour and
       must conform to standards.  This will also have huge performance
       benefits: no longer will class resolution block on DNS calls, nor
       will equals and hashCode comparison rely on DNS resolution.
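
The normalisation steps listed above could be sketched roughly as follows.
This is a partial illustration only, using standard java.net.URI calls: it
lower-cases the scheme and host and removes "." and ".." path segments.  The
class name is hypothetical, the Windows file path upper-casing is left out,
and rebuilding the URI from decoded components is a simplification that
would mishandle an escaped "/" (%2F) in the path.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.util.Locale;

// Hypothetical sketch, not River code: partial URI normalisation so that
// string comparison does not give false negatives on case or dot segments.
final class UriNormaliseSketch {

    static URI normalise(String s) {
        try {
            // Require a server-based authority, then drop "." and ".." segments.
            URI u = new URI(s).parseServerAuthority().normalize();
            String scheme = u.getScheme() == null
                    ? null : u.getScheme().toLowerCase(Locale.ROOT);
            String host = u.getHost() == null
                    ? null : u.getHost().toLowerCase(Locale.ROOT);
            // Rebuild from decoded components; the constructor re-escapes
            // only the characters that are illegal in each component.
            return new URI(scheme, u.getUserInfo(), host, u.getPort(),
                           u.getPath(), u.getQuery(), u.getFragment());
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException("Not a conforming URI: " + s, e);
        }
    }
}
```

After this step two codebase strings that differ only in scheme or host case,
or in redundant path segments, produce identical toString() values, so they
can be compared and hashed as plain strings without any DNS lookup.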

Currently non-conforming URLs can be loaded, such as
"http://hudson_solaris:9081/nonactivatablegroup-dl.jar"; these will no
longer be supportable with URI.
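
If we go the URI route, a deployment could check its machine names up front
against the host name rules quoted above (letters, digits, and dashes that
are not at either end of a label).  The following is a small sketch of such
a check; the class and method names are hypothetical, and it deliberately
ignores length limits and IP address literals.

```java
import java.util.regex.Pattern;

// Hypothetical pre-flight check, not River code: tests a host name against
// the letter/digit/dash rule from RFC 952 and RFC 1123.
final class HostNameCheck {

    // One label: starts and ends with a letter or digit, dashes only inside.
    private static final Pattern LABEL =
            Pattern.compile("[A-Za-z0-9]([A-Za-z0-9-]*[A-Za-z0-9])?");

    static boolean isConformingHostName(String host) {
        if (host == null || host.isEmpty()) {
            return false;
        }
        // The -1 limit keeps empty trailing labels, so "a." is rejected.
        for (String label : host.split("\\.", -1)) {
            if (!LABEL.matcher(label).matches()) {
                return false;
            }
        }
        return true;
    }
}
```

With this check "hudson_solaris" is reported as non-conforming, while
"hudson-solaris" passes.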

What do you say?  Do I revert to tried and tested URL (the devil we
know) or do I report a bug against the use of an underscore in the
hudson_solaris hostname and continue refining the use of URI?

Regards,

Peter.





