I say file a bug against the underscore. I've fought this battle before and lost...

http://domainkeys.sourceforge.net/underscore.html

Chris

-----Original Message-----
From: Peter Firmstone [mailto:[email protected]]
Sent: Monday, August 06, 2012 7:39 AM
To: [email protected]
Subject: PreferredClassProvider: URL and URI

It turns out the recent failures on Solaris x64 hudson are due to an illegal character in the URI host name string:

Testcase: testCrossPlatformNormalise(org.apache.river.impl.net.UriStringTest): Caused an ERROR
Illegal character in hostname at index 13: http://hudson_solaris:9081/nonactivatablegroup-dl.jar
java.net.URISyntaxException: Illegal character in hostname at index 13: http://hudson_solaris:9081/nonactivatablegroup-dl.jar

When the string is passed to the constructor URI(String str), the hostname is not recognised as a host and is parsed only as an authority component, which leads to the errors seen in the failing hudson tests.

So the good news is, it isn't a problem with Solaris x64. But it does raise some important questions. First, some background...

The summary of semantic changes:

* Previous releases of Jini & River PreferredClassProvider have relied upon URL and the calling thread's context ClassLoader (called the parent loader) to determine the correct PreferredClassLoader. Basically URL resolves to an IP address, so the ClassLoader was determined by the IP address of the codebase and the calling thread's context ClassLoader.
* We could use URI and the calling thread's context ClassLoader instead. This means that the PreferredClassLoader would be determined by the normalised form of the URI and the context ClassLoader.

The nitty gritty:

* Relying on the IP address probably made sense in the '90s. Today there are issues with virtual hosts, dynamic IP addresses, and the need to maintain a fixed IP address over time and across failover codebase replication, so that the codebase always appears at the same IP address to clients. You might also imagine that if Jini / River hits the internet, NAT and routing would cause some big problems.
  This doesn't mean we couldn't continue to use URL and provide a URL Handler for some new protocol that solves these issues.
* Changing to URI brings some big benefits, but it comes at a price. The benefits are an added layer of indirection; a documented standard that can be used to predict ClassLoader selection reliably, regardless of protocol; and cheaper codebase replication, backup, regional redirection and hosting of codebases. (Obviously signing jar files will be an important step to prevent unwanted codebase mixing, e.g. a remote codebase attack with DNS poisoning.) Dynamic IP addresses, virtual hosts and failover hosting will work too. What's the price, you may ask? For proper comparison, URIs must have a strictly restricted character set and be normalised, e.g. legal but escaped characters must be unescaped, the scheme must be in lower case, the host must also be in lower case, and the path is case sensitive (file URL paths on Windows must be converted to upper case). Only after normalisation is complete can we accurately call hashCode() and equals() on URI instances, since this eliminates false negatives.
* Avoid using the underscore (_) character in machine names. Internet standards dictate that domain names conform to the host name requirements described in Internet Official Protocol Standards RFC 952 and RFC 1123. Domain names must contain only letters (upper or lower case) and digits. Domain names can also contain dash characters ( - ) as long as the dashes are not on the ends of the name. Underscore characters ( _ ) are not supported in the host name.
* The huge benefit is that we can now perform URI string based comparison, without relying on URL Handlers for identity, so future URL Handlers will also have the same expected behaviour and must conform to standards. This will also have huge performance benefits: class resolution will no longer block on DNS calls, nor will equals and hashCode comparison rely on DNS resolution.

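The host name rule and part of the normalisation described in the list above can be sketched roughly as follows. This is only an illustration under stated assumptions, not River's actual code: the class name UriNormaliseSketch and the helpers isLegalHostName and normalise are hypothetical, the host check ignores label length limits, and the normalisation covers only lower-casing the scheme and host and removing dot segments (unescaping of legal escaped characters and the Windows file path rule are left out).

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.util.Locale;
import java.util.regex.Pattern;

class UriNormaliseSketch {

    // One RFC 952 / RFC 1123 style label: letters, digits and dashes,
    // with no dash at either end. Underscore is deliberately not allowed.
    private static final Pattern LABEL =
            Pattern.compile("[A-Za-z0-9]([A-Za-z0-9-]*[A-Za-z0-9])?");

    /** Hypothetical helper: true if every dot-separated label is legal. */
    static boolean isLegalHostName(String host) {
        if (host == null || host.isEmpty()) {
            return false;
        }
        for (String label : host.split("\\.", -1)) {
            if (!LABEL.matcher(label).matches()) {
                return false;
            }
        }
        return true;
    }

    /**
     * Hypothetical helper: partial normalisation so that two spellings of
     * the same codebase yield the same string. Assumes a server-based URI.
     */
    static URI normalise(URI uri) throws URISyntaxException {
        URI u = uri.normalize(); // removes "." and ".." path segments
        if (u.getHost() == null) {
            // e.g. an underscore in the host leaves getHost() null
            throw new URISyntaxException(uri.toString(), "Expected a server-based host");
        }
        String scheme = u.getScheme() == null
                ? null : u.getScheme().toLowerCase(Locale.ROOT);
        String host = u.getHost().toLowerCase(Locale.ROOT);
        return new URI(scheme, u.getUserInfo(), host, u.getPort(),
                u.getPath(), u.getQuery(), u.getFragment());
    }

    public static void main(String[] args) throws URISyntaxException {
        URI a = normalise(new URI("HTTP://Example.COM:9081/a/./b/../c.jar"));
        URI b = normalise(new URI("http://example.com:9081/a/c.jar"));
        // After normalisation the comparison is a plain string comparison,
        // with no DNS lookup involved.
        System.out.println(a + " same as " + b + ": "
                + a.toString().equals(b.toString()));
        System.out.println("hudson_solaris legal: " + isLegalHostName("hudson_solaris"));
        System.out.println("hudson-solaris legal: " + isLegalHostName("hudson-solaris"));
    }
}
```

The example host example.com and the path in main are made up for the demonstration; only the hudson_solaris name comes from the failing test.
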
Currently, non-conforming URLs such as "http://hudson_solaris:9081/nonactivatablegroup-dl.jar" can be loaded; these will no longer be supportable with URI.

What do you say? Do I revert to tried and tested URL (the devil we know), or do I report a bug against the use of an underscore in the hudson_solaris hostname and continue refining the use of URI?

Regards,

Peter.
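
For anyone who wants to reproduce the parsing behaviour behind the test failure quoted in this message, here is a small sketch using only java.net.URI. The class name UnderscoreHostDemo and the helper serverAuthorityErrorIndex are hypothetical. It relies on the documented behaviour that the single-argument constructor falls back to a registry-based authority when the host cannot be parsed, and that parseServerAuthority() then reports the offending character; it is worth checking on your own JDK.

```java
import java.net.URI;
import java.net.URISyntaxException;

class UnderscoreHostDemo {

    // The codebase URL from the failing hudson test.
    static final String CODEBASE =
            "http://hudson_solaris:9081/nonactivatablegroup-dl.jar";

    /**
     * Hypothetical helper: returns the index reported when the authority
     * cannot be parsed as a server (host and port), or -1 if it can.
     */
    static int serverAuthorityErrorIndex(String s) throws URISyntaxException {
        // The single-argument constructor accepts the string, treating the
        // authority as registry-based when it is not a legal host.
        URI uri = new URI(s);
        try {
            uri.parseServerAuthority();
            return -1;
        } catch (URISyntaxException e) {
            return e.getIndex();
        }
    }

    public static void main(String[] args) throws URISyntaxException {
        URI uri = new URI(CODEBASE);
        System.out.println("authority   = " + uri.getAuthority());
        System.out.println("host        = " + uri.getHost());
        System.out.println("error index = " + serverAuthorityErrorIndex(CODEBASE));
    }
}
```

The point of the sketch is that the string itself is accepted as a URI, but no host is available from it, which is why host-based normalisation and comparison cannot work for such a codebase.
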
