On Wed, 2014-06-04 at 15:38 -0400, Craig Skinfill wrote:

> HTTP Components team - I have some approved time this summer to work on
> an open source project, and I'd like to work on improving the caching
> support in the async http client. Currently, the requests to the origin
> are non-blocking, but the requests to the cache are blocking. The async
> caching support appears to be implemented as a decorator of the http
> client, while in the blocking client case it is implemented by decorating
> the internal ClientExecChain instance.
>
> My initial idea was to follow the same pattern in the async client as
> with the blocking client, and use an internal ExecutorService to submit
> requests to the cache, and then block (with a timeout) on the returned
> Future with the cache lookup result. This is of course still blocking,
> but at least provides a potentially configurable timeout when checking
> the cache.
>
> How should I approach this? I see a comment in
> https://issues.apache.org/jira/browse/HTTPASYNC-76 regarding the likely
> need to make changes to the existing blocking http client caching
> implementation along with changes to the core async http client protocol
> pipeline processing. Are there any existing ideas, plans, etc., for
> making the caching non-blocking for the async client? Or what changes
> would be needed in the blocking client's caching implementation?
>
> Is there enough need to make this improvement?
>
> Thanks.
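A minimal sketch of the ExecutorService-plus-timeout idea Craig describes above. The CacheStorage and CacheEntry types are hypothetical stand-ins, not the actual HttpClient cache API; a timeout or lookup failure is simply treated as a cache miss so the request falls through to the origin.

    import java.util.concurrent.Callable;
    import java.util.concurrent.ExecutionException;
    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;
    import java.util.concurrent.Future;
    import java.util.concurrent.TimeUnit;
    import java.util.concurrent.TimeoutException;

    /** Hypothetical stand-in for the cache backend; lookups may block. */
    interface CacheStorage {
        CacheEntry lookup(String key);
    }

    /** Hypothetical stand-in for a cached response. */
    class CacheEntry {
    }

    class BoundedCacheLookup {
        private final ExecutorService executor = Executors.newFixedThreadPool(4);
        private final CacheStorage storage;
        private final long timeoutMillis;

        BoundedCacheLookup(CacheStorage storage, long timeoutMillis) {
            this.storage = storage;
            this.timeoutMillis = timeoutMillis;
        }

        /** Returns the cached entry, or null on a miss, failure or timeout. */
        CacheEntry get(final String key) throws InterruptedException {
            Future<CacheEntry> future = executor.submit(new Callable<CacheEntry>() {
                public CacheEntry call() {
                    return storage.lookup(key);
                }
            });
            try {
                return future.get(timeoutMillis, TimeUnit.MILLISECONDS);
            } catch (TimeoutException ex) {
                future.cancel(true);   // give up on the cache, fall through to the origin
                return null;
            } catch (ExecutionException ex) {
                return null;           // lookup failure is treated as a cache miss
            }
        }
    }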
Hi Craig

Async HTTP caching is a much neglected area in HC. Any contribution there would be enormously welcome. I, for one, am very happy to have you on board.

Async HTTP caching is a difficult task from a purely design perspective and is likely to require several iterations to get things right. In general, non-blocking I/O makes certain things easier, but it also makes other things much more complex. Content (data) streaming is one of those things. The standard Java InputStream / OutputStream API is simple and effective, but it is inherently blocking and simply does not work well with event-driven designs. For non-blocking transports we use a consumer / producer based model that enables a reactive programming style and works well for data-intensive applications. The problem is that it is damn hard to organize those consumers and producers into a pipeline based on the chain of responsibility pattern. The ability to model protocol processing logic as a sequence of related and interdependent elements is what makes integration of caching aspects into the blocking client seamless and efficient. Ideally, we should be able to do the same for the non-blocking client.

Another major issue is that presently the HTTP cache components are tightly coupled with InputStream, and the whole design of the caching APIs is effectively blocking.

I must confess that I do not see an easy solution to those design issues. No matter what we do, we are likely to end up breaking existing APIs, which is also a problem. So, I can also well imagine that we make the decision to _not_ support data streaming with caching at all (at least initially). If we always buffer messages in memory, it would make it much easier to come up with a reasonable processing pipeline design, which is asynchronous but only at the HTTP message level. This would also enable us to fully re-use the blocking caching elements without having to alter them. It might be an unpleasant but necessary compromise.

If this all does not sound too depressing, this issue might be a good starting point. It would also give you good exposure to the existing code base and API design.

Cheers

Oleg
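A minimal sketch of the buffer-everything-in-memory compromise outlined above. The ChunkConsumer and BufferedMessage types are illustrative stand-ins, not part of the HttpClient / HttpAsyncClient API; the point is that the event-driven transport feeds data in chunks, while the blocking, InputStream-based cache components only ever see a fully buffered message.

    import java.io.ByteArrayInputStream;
    import java.io.ByteArrayOutputStream;
    import java.io.InputStream;
    import java.nio.ByteBuffer;

    /** Callback driven by the non-blocking I/O layer as data arrives. */
    interface ChunkConsumer {
        void consume(ByteBuffer chunk);
        void completed();
    }

    /** A message whose body is held entirely in memory. */
    class BufferedMessage {
        private final byte[] body;

        BufferedMessage(byte[] body) {
            this.body = body;
        }

        /** Blocking-style view for the existing InputStream-based cache code. */
        InputStream getContent() {
            return new ByteArrayInputStream(body);
        }
    }

    /** Accumulates all chunks so the cache only ever sees a complete message. */
    class BufferingConsumer implements ChunkConsumer {
        private final ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        private volatile BufferedMessage result;

        public void consume(ByteBuffer chunk) {
            while (chunk.hasRemaining()) {
                buffer.write(chunk.get());
            }
        }

        public void completed() {
            // Only now is the message handed over in its buffered, blocking-friendly form.
            result = new BufferedMessage(buffer.toByteArray());
        }

        BufferedMessage getResult() {
            return result;
        }
    }

Because each cacheable message is held entirely in memory, the existing blocking caching elements could be reused unchanged; the cost is the buffering itself, which is the compromise described above.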
