Thanks Chris,
The dev list has been quiet, but we still appear to be making good
progress. My efforts look like complementing Greg's work and the work
Gregg's planning, so the next release will hopefully fix a number of
long-standing issues. With fundamental semantic changes to
PreferredClassProvider, it might be worth incrementing River to version
3.0. The change from URL to URI may cause problems in some deployment
environments because it isn't 100% backward compatible, although on a
positive note, no code needs to be recompiled.
Cheers,
Peter.
On 6/08/2012 11:19 PM, Christopher Dolan wrote:
I say file a bug against the underscore. I've fought this battle before and
lost...
http://domainkeys.sourceforge.net/underscore.html
Chris
-----Original Message-----
From: Peter Firmstone [mailto:[email protected]]
Sent: Monday, August 06, 2012 7:39 AM
To: [email protected]
Subject: PreferredClassProvider: URL and URI
It turns out the recent failures on Solaris x64 hudson are due to an
illegal character in the URI host name string:
Testcase:
testCrossPlatformNormalise(org.apache.river.impl.net.UriStringTest):
Caused an ERROR
Illegal character in hostname at index 13:
http://hudson_solaris:9081/nonactivatablegroup-dl.jar
java.net.URISyntaxException: Illegal character in hostname at index 13:
http://hudson_solaris:9081/nonactivatablegroup-dl.jar
The hostname is incorrectly parsed only as an authority component when
passed to the constructor URI(String str), leading to the errors seen in
failing hudson tests.
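
To make the parsing behaviour concrete, here is a minimal sketch using only
java.net.URI. The class and method names (UnderscoreHostSketch,
hostErrorIndex) are made up for illustration and are not part of River. It
assumes the behaviour of current JDKs, where URI(String) falls back to a
registry-based authority (so getHost() returns null) and
parseServerAuthority() is what reports the illegal hostname character.

```java
import java.net.URI;
import java.net.URISyntaxException;

public class UnderscoreHostSketch {

    /**
     * Hypothetical helper: returns the index of the offending character
     * when the authority cannot be parsed as a server-based host, or -1
     * when the host name is acceptable to java.net.URI.
     */
    static int hostErrorIndex(String s) {
        try {
            // URI(String) alone may accept the string as a registry-based
            // authority; parseServerAuthority() forces strict host parsing.
            new URI(s).parseServerAuthority();
            return -1;
        } catch (URISyntaxException e) {
            return e.getIndex();
        }
    }

    public static void main(String[] args) {
        String bad = "http://hudson_solaris:9081/nonactivatablegroup-dl.jar";
        // "hudson-solaris" is an invented, RFC-conformant alternative name.
        String good = "http://hudson-solaris:9081/nonactivatablegroup-dl.jar";

        System.out.println("host (underscore): " + URI.create(bad).getHost());
        System.out.println("error index (underscore): " + hostErrorIndex(bad));
        System.out.println("host (dash): " + URI.create(good).getHost());
        System.out.println("error index (dash): " + hostErrorIndex(good));
    }
}
```

With the underscore the host comes back as null, which is why code that
expects a host name fails later on.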
So the good news is, it isn't a problem with Solaris x64.
But it does raise some important questions.
But first some background...
The summary of semantic changes:
* Previous releases of Jini & River PreferredClassProvider have
relied upon URL and the calling thread's context ClassLoader
(called the parent loader) to determine the correct
PreferredClassLoader. Basically URL resolves to an IP address, so
the ClassLoader was determined by the IP address of the codebase
and the calling thread's context ClassLoader.
* We could use URI and the calling thread's context ClassLoader
instead. This means that the PreferredClassLoader would be
determined by the normalised form of the URI and the context
ClassLoader.
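
As a rough illustration of the second option, the lookup key could pair the
codebase URI with the parent loader. This is only a sketch under
assumptions: LoaderKeySketch and LoaderKey are invented names, not the
actual PreferredClassProvider internals, and URI.normalize() stands in for
whatever fuller normalisation is finally used.

```java
import java.net.URI;

public class LoaderKeySketch {

    /** Hypothetical lookup key: codebase URI plus parent loader identity. */
    static final class LoaderKey {
        private final URI codebase;
        private final ClassLoader parent;

        LoaderKey(URI codebase, ClassLoader parent) {
            // normalize() only removes "." and ".." path segments;
            // a real implementation would need to do more than this.
            this.codebase = codebase.normalize();
            this.parent = parent;
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof LoaderKey)) return false;
            LoaderKey k = (LoaderKey) o;
            // URI.equals() compares components and never performs DNS lookups.
            return codebase.equals(k.codebase) && parent == k.parent;
        }

        @Override
        public int hashCode() {
            return 31 * codebase.hashCode() + System.identityHashCode(parent);
        }
    }

    public static void main(String[] args) {
        ClassLoader parent = Thread.currentThread().getContextClassLoader();
        LoaderKey a = new LoaderKey(URI.create("http://example.com:9081/lib/../a-dl.jar"), parent);
        LoaderKey b = new LoaderKey(URI.create("http://example.com:9081/a-dl.jar"), parent);
        System.out.println(a.equals(b));
    }
}
```

The point of the design is that equality depends only on the URI text and
the parent loader, never on what the host name resolves to.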
The nitty gritty:
* Relying on the IP address probably made sense in the '90s. Today
there are issues with virtual hosts, dynamic IP addresses,
maintaining a fixed IP address over time, and failover codebase
replication so that the codebase always appears as the same IP
address to clients. You might also imagine that if Jini / River
hits the internet, NAT and routing would cause some big
problems. This doesn't mean we couldn't continue to use URL and
provide a URL Handler for some new protocol that solves these issues.
* Changing to URI brings some big benefits, but it comes at a
price. The benefits are an added layer of indirection and a
documented standard that can be used to predict ClassLoader
selection reliably, regardless of protocol, as well as cheaper
codebase replication, backup, regional redirection and hosting of
codebases. (Obviously signing jar files will be an important step
to prevent unwanted codebase mixing, e.g. a remote codebase attack
with DNS poisoning.) Dynamic IP addresses, virtual hosts and
failover hosting will work too. What's the price, you may ask? For
proper comparison, URIs must have a strictly restricted character
set and be normalised, e.g. legal but escaped characters must be
unescaped, the scheme must be in lower case, the host must also be
in lower case, and the path is case sensitive (file URL paths on
Windows must be converted to upper case). Only after
normalisation is complete can we accurately call hashCode() and
equals() on URI instances, since this eliminates false negatives.
* Avoid using the underscore (_) character in machine names.
Internet standards dictate that domain names conform to the
host name requirements described in Internet Official Protocol
Standards RFC 952 and RFC 1123. Domain names must contain only
letters (upper or lower case) and digits. Domain names can
also contain dash characters ( - ) as long as the dashes are
not on the ends of the name. Underscore characters ( _ ) are
not supported in the host name.
* The huge benefit is that we can now perform URI string based
comparison without relying on URL Handlers for identity, so
future URL Handlers will also have the same expected behaviour and
must conform to standards. This will also have huge performance
benefits: class resolution will no longer block on DNS calls, nor
will equals and hashCode comparison rely on DNS resolution.
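
The normalisation steps listed above can be sketched roughly as follows.
This is an assumption-laden illustration, not the UriString code under
test: UriNormaliseSketch and normalise are invented names, and only scheme
and host lower-casing plus removal of "." and ".." path segments are
covered. Unescaping of legal characters and the Windows file path rule are
left out.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.util.Locale;

public class UriNormaliseSketch {

    /**
     * Hypothetical helper: returns a string form suitable for plain
     * string comparison. Scheme and host are lower-cased and dot
     * segments are removed from the path; the path's case is preserved.
     */
    static String normalise(String s) {
        URI n = URI.create(s).normalize();
        String scheme = n.getScheme();
        String host = n.getHost();
        if (scheme == null || host == null) {
            // Relative, opaque, or registry-based: leave as is in this sketch.
            return n.toString();
        }
        try {
            return new URI(scheme.toLowerCase(Locale.ROOT), n.getUserInfo(),
                    host.toLowerCase(Locale.ROOT), n.getPort(),
                    n.getPath(), n.getQuery(), n.getFragment()).toString();
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        String a = normalise("HTTP://Example.COM:9081/lib/../nonactivatablegroup-dl.jar");
        String b = normalise("http://example.com:9081/nonactivatablegroup-dl.jar");
        System.out.println(a);
        // String equality: no URL Handler and no DNS lookup involved.
        System.out.println(a.equals(b));
    }
}
```

Because the comparison is on strings, the result is the same regardless of
protocol handler and does not depend on what the host resolves to.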
Currently non-conforming URLs can be loaded, such as
"http://hudson_solaris:9081/nonactivatablegroup-dl.jar". These will no
longer be supportable with URI.
What do you say? Do I revert to tried and tested URL (the devil we
know) or do I report a bug against the use of an underscore in the
hudson_solaris hostname and continue refining the use of URI?
Regards,
Peter.