Thanks Dan.

Cheers,

Peter.

On 8/08/2012 6:09 PM, Dan Creswell wrote:
Yeah, sorry very busy with work and other bits and pieces but I agree with
Chris, file the bug cos the RFCs say you're right.

On 8 August 2012 08:32, Peter Firmstone<[email protected]>  wrote:

Thanks Chris,

The dev list has been quiet, but we still appear to be making good
progress. My efforts look like they complement Greg's work and the work
Gregg's planning, so the next release will hopefully fix a number of
long-standing issues. With fundamental semantic changes to
PreferredClassProvider, it might be worth incrementing River to version
3.0. The change to URI from URL may cause problems in some deployment
environments; it isn't 100% backward compatible, although on a positive
note, no code needs to be recompiled.

Cheers,

Peter.



On 6/08/2012 11:19 PM, Christopher Dolan wrote:

I say file a bug against the underscore. I've fought this battle before
and lost...
http://domainkeys.sourceforge.net/underscore.html
Chris

-----Original Message-----
From: Peter Firmstone [mailto:[email protected]]
Sent: Monday, August 06, 2012 7:39 AM
To:[email protected]
Subject: PreferredClassProvider: URL and URI

It turns out the recent failures on Solaris x64 hudson are due to an
illegal character in the URI host name string:

Testcase:
testCrossPlatformNormalise(org.apache.river.impl.net.UriStringTest):
Caused an ERROR
Illegal character in hostname at index 13:
http://hudson_solaris:9081/nonactivatablegroup-dl.jar
java.net.URISyntaxException: Illegal character in hostname at index 13:
http://hudson_solaris:9081/nonactivatablegroup-dl.jar

The hostname is incorrectly parsed only as an authority component when
passed to the constructor URI(String str), leading to the errors seen in
failing hudson tests.
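
The behaviour described above can be reproduced with a few lines of plain java.net.URI code. This is a minimal sketch, not the River test itself: the class name UnderscoreHostDemo and the method authorityParsesAsServer are made up for illustration. It assumes the usual java.net.URI behaviour, where the single-string constructor falls back to a registry-based authority (so getHost() is null) and parseServerAuthority() insists on a strict host name.

```java
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical demo class, not part of River.
public class UnderscoreHostDemo {

    /**
     * Returns true if the authority of the given URI string can be parsed
     * as a server-based authority (user-info, host, port) under the strict
     * host name rules applied by java.net.URI.
     */
    public static boolean authorityParsesAsServer(String uri) {
        try {
            // URI.create accepts the string even when the host name is not
            // strictly legal; it then keeps only a registry-based authority.
            URI u = URI.create(uri);
            // parseServerAuthority() re-parses strictly and throws if the
            // authority is not a legal server-based one.
            u.parseServerAuthority();
            return true;
        } catch (URISyntaxException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String s = "http://hudson_solaris:9081/nonactivatablegroup-dl.jar";
        URI u = URI.create(s);
        System.out.println("authority     = " + u.getAuthority());
        System.out.println("host          = " + u.getHost());
        System.out.println("server-based? = " + authorityParsesAsServer(s));
    }
}
```

Renaming the host to something like hudson-solaris (a made-up replacement name) lets the same URI parse as a server-based authority.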

So the good news is, it isn't a problem with Solaris x64.

But it does raise some important questions.

But first some background...

The summary of semantic changes:

      * Previous releases of Jini & River PreferredClassProvider have
        relied upon URL and the calling thread's context ClassLoader
        (called the parent loader) to determine the correct
        PreferredClassLoader.  Basically URL resolves to an IP address, so
        the ClassLoader was determined by the IP address of the codebase
        and the calling thread's context ClassLoader.
      * We could use URI and the calling thread's context ClassLoader
        instead.  This means that the PreferredClassLoader would be
        determined by the normalised form of the URI and the context
        ClassLoader.
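
To make the second option concrete, here is a rough sketch of a lookup key built from the codebase URIs and the parent loader. LoaderKey is a hypothetical class invented for this illustration and is not the key type PreferredClassProvider actually uses; it also assumes the URIs handed to it have already been normalised.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical cache key: normalised codebase URIs plus the parent loader. */
public final class LoaderKey {
    private final List<URI> codebase;
    private final ClassLoader parent;

    public LoaderKey(List<URI> codebase, ClassLoader parent) {
        // Defensive copy so the key cannot change after construction.
        this.codebase = Collections.unmodifiableList(new ArrayList<URI>(codebase));
        this.parent = parent;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof LoaderKey)) return false;
        LoaderKey other = (LoaderKey) o;
        // URI.equals compares components as strings; no DNS lookup happens.
        // The parent loader is compared by identity.
        return parent == other.parent && codebase.equals(other.codebase);
    }

    @Override
    public int hashCode() {
        return 31 * codebase.hashCode() + System.identityHashCode(parent);
    }
}
```

Two keys are equal only when both the URI list and the parent loader match, so the same codebase seen from two different context ClassLoaders still maps to two different entries.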

The nitty gritty:

      * Relying on the IP address probably made sense in the 90s.  Today
        there are issues with virtual hosts and dynamic IP addresses,
        with maintaining a fixed IP address over time, and with failover
        codebase replication arranged so that the codebase always appears
        as the same IP address to clients.  You might also imagine that
        if Jini / River hits the internet, NAT and routing would cause
        some big problems.  This doesn't mean we couldn't continue to use
        URL and provide a URL Handler for some new protocol that solves
        these issues.
      * Changing to URI brings some big benefits, but it comes at a
        price.  The benefits are an added layer of indirection and a
        documented standard that can be used to predict ClassLoader
        selection reliably, regardless of protocol, plus cheaper codebase
        replication, backup, regional redirection and hosting of
        codebases.  (Obviously signing jar files will be an important step
        to prevent unwanted codebase mixing, e.g. a remote codebase attack
        with DNS poisoning.)  Dynamic IP addresses, virtual hosts and
        failover hosting will work too.  What's the price, you may ask?  For
        proper comparison, URIs must have a strictly restricted character
        set and be normalised, e.g. legal but escaped characters must be
        unescaped, the scheme must be in lower case, the host must also be
        in lower case, and the path is case sensitive (file URL paths on
        Windows must be converted to upper case).  Only after
        normalisation is complete can we accurately call hashCode() and
        equals() on URI instances, since this eliminates false negatives.

          * Avoid using the underscore (_) character in machine names.
            Internet standards dictate that domain names conform to the
            host name requirements described in Internet Official Protocol
            Standards RFC 952 and RFC 1123. Domain names must contain only
            letters (upper or lower case) and digits. Domain names can
            also contain dash characters ( - ) as long as the dashes are
            not on the ends of the name. Underscore characters ( _ ) are
            not supported in the host name.

      * The huge benefit is that we can now perform URI string based
        comparison without relying on URL Handlers for identity, so
        future URL Handlers will also have the same expected behaviour and
        must conform to standards.  This will also have huge performance
        benefits: class resolution will no longer block on DNS calls, nor
        will equals and hashCode comparison rely on DNS resolution.
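
As an illustration of the normalisation step mentioned in the list above, the following is a simplified sketch. UriNormaliser is a hypothetical helper, not River's UriString code. It only lower-cases the scheme and host and removes "." and ".." path segments; unescaping of legal characters and the Windows file path handling are deliberately left out, and it only accepts URIs with a scheme and a strictly valid host.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.util.Locale;

// Hypothetical helper for illustration only; not River's implementation.
public final class UriNormaliser {

    /** Normalises a server-based URI so that equals() and hashCode() are string based. */
    public static URI normalise(String s) {
        try {
            // Strict parse (rejects hosts such as hudson_solaris), then
            // remove "." and ".." segments from the path.
            URI u = new URI(s).parseServerAuthority().normalize();
            if (u.getScheme() == null || u.getHost() == null) {
                throw new IllegalArgumentException("expected scheme and host: " + s);
            }
            StringBuilder sb = new StringBuilder();
            sb.append(u.getScheme().toLowerCase(Locale.ROOT)).append("://");
            if (u.getRawUserInfo() != null) {
                sb.append(u.getRawUserInfo()).append('@');
            }
            sb.append(u.getHost().toLowerCase(Locale.ROOT));
            if (u.getPort() != -1) {
                sb.append(':').append(u.getPort());
            }
            sb.append(u.getRawPath()); // path stays case sensitive
            if (u.getRawQuery() != null) {
                sb.append('?').append(u.getRawQuery());
            }
            if (u.getRawFragment() != null) {
                sb.append('#').append(u.getRawFragment());
            }
            return new URI(sb.toString());
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e.getMessage(), e);
        }
    }
}
```

With this sketch, normalise("HTTP://Example.COM:9081/a/./b/../c.jar") and normalise("http://example.com:9081/a/c.jar") give equal URI instances without any DNS lookup, while the hudson_solaris URL is rejected with an IllegalArgumentException.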

Currently non-conforming URLs can be loaded, such as
"http://hudson_solaris:9081/nonactivatablegroup-dl.jar",
but these will no longer be supportable with URI.
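
One way to catch such names early would be a simple check of the host name against the rule quoted above (letters, digits, and dashes that are not at the ends of a name). The sketch below is only an illustration: HostNameCheck is a hypothetical class, and the 63 character label limit and 253 character overall limit are taken from the DNS RFCs rather than from the quoted passage.

```java
import java.util.regex.Pattern;

// Hypothetical pre-flight check; not part of River.
public final class HostNameCheck {

    // One label: letters and digits, with dashes allowed only in the middle.
    // The length bound (at most 63 characters) is an added assumption.
    private static final Pattern LABEL =
        Pattern.compile("[A-Za-z0-9]([A-Za-z0-9-]{0,61}[A-Za-z0-9])?");

    /** Returns true if every dot-separated label follows the quoted rule. */
    public static boolean isValidHostName(String host) {
        if (host == null || host.isEmpty() || host.length() > 253) {
            return false;
        }
        // The -1 limit keeps empty trailing labels, so "host." is rejected.
        for (String label : host.split("\\.", -1)) {
            if (!LABEL.matcher(label).matches()) {
                return false;
            }
        }
        return true;
    }
}
```

Here isValidHostName("hudson_solaris") returns false, while a dash-separated name such as hudson-solaris passes. Note that java.net.URI applies its own host name grammar, so passing this check is not a guarantee that URI will accept the name.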

What do you say?  Do I revert to tried and tested URL (the devil we
know) or do I report a bug against the use of an underscore in the
hudson_solaris hostname and continue refining the use of URI?

Regards,

Peter.






