As you probably know, I've been working on a branch of httpclient to address 
several issues that I believe prohibit a public 1.0 release, including 
several critical bugs, and several design issues that complicate or prohibit 
fixing those bugs and adding new features in line with the httpclient 
charter and scope.  Since that refactoring is nearly complete, and Vincent's 
recent postings are beginning to press the issue, I'll open up my changes 
for discussion (not that they've been hidden up to this point).  While 
personally I believe every one of these changes is a postive step, I'm not 
trying to ram them down anyone's throat either, I'm just trying to open up 
the discussion

To obtain this branch of httpclient, simply run "cvs -q update -P -d -C -r 
rlwrefactoring" from the root of httpclient.

As discussed below, there are docs, javadocs and tests that should 
demonstrate and document the use of this package.

Below is an enumeration of the major changes I've made to the package, 
although there are probably other changes that I can't recall right now.  
I'd be more than happy to defend or discuss individual changes, although 
here my intention is simply to enumerate them.

A few comments though:

* Two decision points that I can't seem to make up my mind about are:

(a) whether or not HttpConnection should be an interface (currently it is)

(b) whether or not Cookie should be an interface (currently it is not)

(My basic rationale is that generally speaking in components I think it is 
appropriate to make types into interfaces whenever it is reasonable to 
expect alternative implementations, as we certainly can with HttpMethod, 
probably with HttpState, but I'm more on the fence with HttpConnection and 
Cookie.)

* As should be abundantly clear by now, but apparently it isn't, I have no 
issues with the concept of StreamInterceptor and ConnectionInterceptor, but 
looking at the current interface the contract is not clear enough to be able 
to add support for them. (And I would contend that the contract isn't clear 
enough in the main branch either, given the inconsistencies and downright 
omissions I mentioned last week).  If someone can clarify how precisely 
those are supposed to work, we can and should certainly work them back in, 
but absent that clarification, I don't see why we should maintain 
half-implemented support for them.

* It has been (and still is) my intention to prepare the necessary patches 
for Slide to make it fully compatible with this refactoring.  It should be 
relatively straightforward to do so.

* Regarding a release plan, I'd suggest the following (more or less in this 
order):

(0. As I've always maintained, if we want to release the current main branch 
or any previous revision of the main branch as anything short of a 1.0 
release (say 0.9), I'm not opposed to that.  I am opposed to releasing 
releasing the current main branch as a 1.0 release.  Why? Because we know it 
has fairly critical problems, and we know that much of the API is likely to 
change to address those problems, so why pretend that it is a stable 
release?  (Not that my personal opinion can, should or *has* stopped anyone 
from calling for or voting for a 1.0 release with the current main branch.))

1. Let's review these changes, make additional changes (or perhaps undo 
changes) as warranted, and merge them back into the main branch.

2. Let's resolve the logging stuff one way or another. Personally, I'd like 
to see us accept the logging proposal and migrate HTTPClient over to using 
it.  I found the logging extremely useful though, and I expect I will 
continue to, so I for one would be opposed to removing all logging from 
httpclient outright.  I suppose conditional compilation is one other option.

3. Let's make sure Slide *can* work with the revised HTTP Client (i.e., that 
it supports all of the functionality strictly required by slide), and leave 
whether or not they choose to up to them.

4. Let's release the resulting code base as the 1.0 release.

Bugs Fixed
----------

* Headers
  * When two response headers with the same name are sent, one of them is 
ignored. (HTTP/1.0 and HTTP/1.1 say we should treat "a: one" and "a: two" 
and "a: one, two". This is actually a pretty common occurrence for headers 
like Set-Cookie.)

* HTTP Redirects
  * Query string changes are silently ignored during redirect.
  * Host/port changes are silently ignored during redirect.
  * Protocol changes are silently ignored during redirect
  * Redirects fail for all HTTPS requests

* Cookies
  * isToBeDiscarded() returns wrong value (false for true/true for false)
  * Path is ignored when accepting cookies
  * Secure cookies are accepted over insecure connections
  * Secure cookies are transmitted over insecure connections
  * When only name/value is supplied, path defaults to "/".  When domain or 
other parameter is supplied, path defaults to null.
  * Secure parameter being inapproriately sent in "cookie" header.
  * Expires attribute doesn't work for some date formats accepted by commons 
browsers and sent by common servers.
  * "Deleting" cookies doesn't fully work--cookies remain in State (but are 
not transmitted)
  * When cookies exist but no cookies match the given request, "Cookie: 
$Version=1" is submitted, rather than no Cookie header at all.
  * When not specified, cookie path should default to (directory containing 
the) request path, not "/"

* NameValuePair
  * equals method throws NullPointerException when name or value is null

* Proxying
  * When connecting via a proxy, HTTPS parameter is silently ignored (i.e., 
we speak HTTP to the proxy and from the proxy to the destination server)


Functional Enhancements
-----------------------

* Track Content-Length response header value versus number of bytes actually 
read in HttpMethodBase and GetMethod, so that we can stop reading when 
Content-Length has been reached, and warn when the server tries to overflow 
it. (Arguably this was a bug per the HTTP spec.)

* GetMethod (and PostMethod) delete temp files on VM exit

* HttpClient.getState never returns null (so we don't have to do any "get, 
if null then new and set" logic when trying to "manually" manipulate the 
HttpState)

* Query string and request parameter handling more robust with respect to 
null names and values.

* Query string and request parameters now support multiple values for a 
single name (e.g., ?foo=bar&foo=bar2)

* Query string and request parameters now distinguish between null and empty 
values (e.g., "?foo" versus "?foo="

* During method processing, now detects HTTP redirect loops and returns 
immediately

* During method processing, now detects authentication failure and returns 
immediately (previously would retry with bad credentials several times)

* Added hashCode method to NameValuePair per the Object contract

* Multiple authentication realms are now supported.

* HttpMethodBase now supports response bodies when returned (hence now 
PutMethod, OptionsMethod, etc. can read and return response bodies)

* HttpMethod/HttpMethodBase now supports addRequestHeader, 
removeRequestHeader methods.

Other Enhancements
------------------

* Cleaned up and added a fair amount of JavaDoc comments

* Added ~70 or so more tests, exercising most of the public facing 
functionality.

* Added a web app for providing the server side of testing

* Added docs/web-site

* Made error/exception messages more explicit

* Added more detailed logging

Refactorings
------------

* Reduced inter-class couplings:

   + RequestOutputStream no longer requires HttpMethod

   + ResponseInputStream no longer requires HttpMethod (although I preserved 
the constructor for the time being as a convenience method for passing the 
Transfer-Encoding, Content-Length and related values)

   + HttpMethodBase (and children) no longer requires HttpClient

   + Removed instance variable dependencies between HttpMethodBase and its 
children. The contract between HttpMethodBase and its children is now fully 
specified by protected methods.

* Clearly distinguish HttpMethod and HttpMethodBase methods (i.e., separate 
interface concerns from implementation concerns).  It should now be much 
simpler to create new implementations of HttpMethod with or without using 
HttpMethodBase.

* Clean up HttpMethodBase methods, workflow and contract between base and 
children, making it much easier/simpler to create new subclasses, and 
removing several unnecessary methods.

* Added HttpConnection abstraction, which acts as a container for the 
streams and all connection-related state variables 
(host/port/proxyhost/proxyport/secure/etc.), most of which were previously 
stored in HttpClient

* HTTP protocol concerns (e.g., header writing, response parsing, etc) no 
longer split over multiple classes (previously split fairly arbitrarily 
between HttpClient and HttpMethodBase), now consolidated in HttpMethodBase.

* Made several method names more uniform (e.g., toExternalForm) and more 
specific (e.g., HttpMethodBase.getHeader() previously returned either a 
response or a request header, depending upon when it was called, now there 
is a getRequestHeader() and getResponseHeader()).

* "Authentication challenge token" no longer stored within State (this was a 
request level state stored temporarily in State)

* Authentication Credentials no longer stored in HttpClient, but in 
HttpState

* Moved setParameter methods down to PostMethod, because in all cases except 
PostMethod, setParameter really means setQueryString.

* Many of these refactorings help us to clarify the roles for the key types:

   + HttpConnection is a socket connection together with connection related 
state variables (host/port/etc.)

   + HttpState is a container for attributes that may persist from request 
to request, and from connection to connection (Cookies, authentication 
Credentials, etc.)

   + HttpMethod is an action to be applied to some HttpConnection with a 
given HttpState.  HttpMethod also acts as a container for the response to 
that action.

   + HttpClient is a container for HttpStates and one or more 
HttpConnections, to which HttpMethods may be applied. (Alternative drivers 
may also be used.)

   + (HttpMethodBase is just one implementation of HttpMethod that may be a 
useful starting point for typical HTTP/1.0 and HTTP/1.1 interactions, but 
the full protocol may be controlled by alternative implementations.)

Other Changes (I see them as improvements, but you may disagree)
----------------------------------------------------------------

* Combined functionality of URIUtil and URLUtil classes into URIUtil, 
removed URLUtil

* Changed URIUtil semantics slightly.  The same functionality (actually more 
functionality) that was available before is still available, although the 
way in which you get to it has changed to be a bit more explicit/clear.

* URIUtil no longer "hides" UnsupporedEncodingExceptions.

* In GetMethod and PostMethod useDisk defaults to false instead of true. 
Automatically set to true when file or temp dir is set.

* In GetMethod and PostMethod, the constructors taking both a useDisk and 
temp file/dir param have been removed.  If you don't want to use disk, you 
are unlikely to want to pass in a file.  Those removed methods would also 
only conditionally set file/dir, based upon useDisk flag, which seems a bit 
confusing.

* Renamed State to HttpState for consistency

* Made HttpState an interface, with HttpStateImpl as the base implementation

* Cookie.isToBeDiscarded renamed Cookie.isPersistent (a more 
intention-revealing name)

* Removed ConnectionInterceptor and StreamInterceptor since (a) the 
contract/intent isn't clear to me, (b) to the degree that the contract is 
clear, it's not being followed.  We can add it back once the use is 
clarified, but for now I'm not sure how.

* Removed the unused MD5Encoder.  We can add it back once we're using it. 
(The same functionality can be easily created out of core JRE classes 
anyway.)

* Old Credentials class moved to UsernamePasswordCredentials. Credentials 
made into an interface (supporting additional types of credentials)

* The primary public interfaces/classes have been limited to JDK 1.1 or 
earlier types, to allow for implementations compatible with older VMs.

- Rod

_________________________________________________________________
Get your FREE download of MSN Explorer at http://explorer.msn.com/intl.asp

Reply via email to