Great! I'm glad it seems to be working out. I'm all in favor of taking a bunch of smaller refactoring steps to get there, to lessen the chances we break things in there.
I've been thinking about relative ordering of the layers. I think the caching layer has to be pretty close to the bottom, perhaps the first decorator around BasicHttpClient, primarily because variant-related headers(e.g. decompression), authentication, and cookie handling all have an impact on the correct operation of the cache. I think redirect handling has an impact too, because it is not transparent and may hide the actual origin of a response on a cache hit (where it wouldn't get populated correctly into the HttpContext). I haven't gotten any further than that at this point--I'm not familiar with the relative interactions of authentication and redirects, for example. Looking forward to seeing where you get with the code. Jon On Wed, Jun 6, 2012 at 4:25 PM, Oleg Kalnichevski <[email protected]> wrote: > On Sat, 2012-06-02 at 15:15 +0200, Oleg Kalnichevski wrote: > > On Fri, 2012-06-01 at 10:07 -0400, Jon Moore wrote: > > > As we've seen, we've run into a couple of problems with some of our > > > decorators for the DefaultHttpClient where underlying backends do > > > non-transparent things (ranging from decompressing content bodies > without > > > updating response headers, to following redirects). I do think the > > > decorators provide a nice way to layer the overall functionality from a > > > codebase point of view, but I think they are brittle because they rely > on > > > unenforceable assumptions (e.g. that backend clients that get injected > > > behave transparently). > > > > > > I think the solution to this is to continue using the decorators, but > hide > > > them from most users--in other words, reimplementing the > DefaultHttpClient > > > as a stack of decorators that get enabled/disabled by configuration. In > > > other words, a user should be able to "new DefaultHttpClient()" and get > > > something that is capable of caching, following redirects, and handling > > > compression (per configuration). > > > > > > I think this looks like: > > > (1) generate a decorator for redirect following > > > (2) create a new class (BasicHttpClient?) that is essentially > > > DefaultHttpClient minus the redirect logic > > > (3) reimplement DefaultHttpClient so that it generates and configures a > > > stack of RedirectFollowingHttpClient, DecompressingHttpClient, > > > CachingHttpClient, BasicHttpClient (this might mean needing to pull the > > > httpclient-cache module into the httpclient module) > > > (4) add warning documentation on the DecompressingHttpClient and > > > CachingHttpClient about needing to wire them up correctly (this should > > > probably happen first, actually) > > > > > > I suspect there are actually other decorators lurking in there too > > > (authentication? cookie handling?). > > > > > > I think this potentially gives us a clean way to layer in > > > functionality--implement it as a decorator and then figure out where > in the > > > stack is the right/safe way to add it. Opting into the new > functionality is > > > a matter of turning it on in the config for DefaultHttpClient, which > is far > > > simpler for users to do. > > > > > > Thoughts? > > > > > > Jon > > > > Hi Jon > > > > Absolutely, having a number of smaller specialized components combined > > together into a processing chain (based on the chain of responsibility > > pattern0 would make for a much better design and clear API. There is, > > unfortunately, a reason why redirect and authentication handling logic > > is currently crammed into one class (DefaultRequestDirector). There can > > be really complex scenarios where a redirect can cause re-authentication > > with the site and successful authentication can trigger a new redirect, > > which in its turn causes another re-authentication. It took _literally_ > > 10 years to get all those scenarios working properly, especially those > > involving NTLM. The last known issue related to HTTP authentication was > > resolved just recently in the 4.2 release. Speaking from experience > > troubleshooting such complex HTTP exchanges I suspect redirect and > > authentication handling logic are simply inseparable and must be > > executed in a single request execution loop. To make matters worse, many > > of those scenarios cannot be easily simulated (mainly thanks to NTLM) > > and therefore are not covered in our unit and integration tests. > > > > Having said all that please do not feel discouraged from pursuing the > > idea of splitting DefaultRequestDirector into smaller chunks of code and > > moving redirect handling code into a separate HttpClient decorator. We > > probably can still settle for a middle ground approach: leave redirect / > > auth logic in DefaultRequestDirector but disable it when using > > HttpClient decorators. In any way I really like the idea of developing a > > builder class to instantiate HttpClient for complex scenarios involving > > caching / transparent decompression / authentication and transparent > > redirect handing. Feel free to create a branch for 4.2.x releases and > > start working on the trunk or on an experimental branch. > > > > cheers > > > > Oleg > > > > Hi Jon > > I started experimenting with your idea on a branch [1] and now think it > is doable. It is a much nicer design which should result in much cleaner > code. However, a successful implementation of the new design will > effectively amount to a complete rewrite of DefaultRequestDirector, > which is probably the single most complex piece of code in HttpClient. > I'll try to get something you could look at toward the end of the week > > Cheers > > Oleg > > [1] > > https://svn.apache.org/repos/asf/httpcomponents/httpclient/branches/decorator-refactoring/ > > > > > > > --------------------------------------------------------------------- > > To unsubscribe, e-mail: [email protected] > > For additional commands, e-mail: [email protected] > > > > > > --------------------------------------------------------------------- > To unsubscribe, e-mail: [email protected] > For additional commands, e-mail: [email protected] > >
