On Wed, 2006-01-25 at 17:00 +0100, Roland Weber wrote: 
> Hi Oleg,
> 

Hi Roland 

See my comments at the bottom. They are quite verbose, which is not like
me at all ;-)

> here are my thoughts on this, still preliminary.
> 
> I think we need a way to layer output streams (when sending) and
> input streams (when receiving). That's for chunked encoding, for
> transport compression and decompression, content compression and
> decompression (not supported by us, but added by applications),
> for progress monitoring, for statistical analysis, for dumping
> copies of the data that is sent or received, or other things
> developers will come up with.
> 
> The standard approach to directly take an output stream and wrap
> others around it has some limitations. For example, http-async
> performs request preprocessing before allocating a connection,
> so there is no stream yet. Of course that can be solved by using
> a dummy stream that can be directed to another stream later.
> A second drawback is that the order in which the streams are
> created is fixed. There is no way to insert a debug or progress
> monitoring stream inbetween.
> These limitations can be overcome (on the sending side) if the
> request preprocessing does not create actual streams, but only
> defines a list of streams to be layered later on. Elements can
> be inserted into the list at any position. The actual creation
> of the streams is deferred to the connection. As a first idea,
> the elements in the list could be:
> - Class objects for stream classes that have a constructor which
>   accepts a stream as the only argument. The classes will be
>   instantiated via reflections.
> - Instances of a factory interface, for streams that need more
>   parameters than just the underlying stream:
> 
>     interface OutputStreamFactory {
>         public OutputStream create(OutputStream) throws ...;
>     }
> 
> This would move the logic for choosing the various streams into
> the preprocessing classes, while actual streams are still created
> in the connection. The connection would not have to expose the
> output stream, though the arbitrarily layered streams can of course
> screw up the entity that is sent. The current code for choosing a
> chunked or plain stream can remain in place as a default, so the
> connections can still be used directly. Or else that code is moved
> to an OutputStreamFactory, which becomes the default.
> 
> On the receiving side, a symmetric approach can be used. Instead
> of returning a BasicHttpEntity with fully initialized streams,
> the connection returns a ProtoHttpEntity:
> 
>     interface ProtoHttpEntity extends BasicHttpEntity {
>         public void setStreamStack(List) throws ...;
>     }
> 
> The ProtoHttpEntity object does not return an input stream until
> the list of streams to be layered is provided. The response
> interceptors can build that list, and only at the end will the
> entity content be accessible. The socket input stream remains
> protected, except from a custom stream implementation that is
> inserted at the bottom of the stream stack.
> 
> 
> Is that in line with your ideas? Am I over-designing ;-?
> 

This is what I can say at this point. To me this looks fancy but not
very practical. 

All the features you mentioned are perfectly doable right now.

> The chunked encoding, transport compression and decompression, 

'Chunked' and 'identity' are the only transfer codings in use nowadays,
and this is unlikely to change in our lifetime. I have never seen
transfer compression used anywhere, because content compression is
enough.

Another point: the chunked coding MUST be the last one applied; it
cannot be put somewhere in the middle of the coding stack, or left out
altogether. So what is the point of building a whole new level of
abstraction for no practical gain? If one needed transport compression /
decompression really badly, writing a custom entity writer and generator
seems a fair price to pay.
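
Just to illustrate what I mean by "a fair price to pay": something as
simple as the sketch below would do. The class and method names are made
up for the example; this is not the actual HttpEntityWriter contract.

    import java.io.IOException;
    import java.io.InputStream;
    import java.io.OutputStream;
    import java.util.zip.GZIPOutputStream;

    // Rough sketch only: copies the entity content through a
    // GZIPOutputStream so the gzip coding is applied just before
    // the data hits the connection's output stream.
    class GzipEntityWriter {

        public void write(final InputStream content, final OutputStream out)
                throws IOException {
            GZIPOutputStream gzipped = new GZIPOutputStream(out);
            byte[] buffer = new byte[2048];
            int l;
            while ((l = content.read(buffer)) != -1) {
                gzipped.write(buffer, 0, l);
            }
            gzipped.finish();
        }
    }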

> content compression and decompression 

In my opinion content coding can be implemented much more cleanly by
chaining HttpEntity instances, akin to input/output stream chaining.
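
Roughly along these lines (a quick sketch; the entity interface is
deliberately cut down and the method names are assumptions, not the real
HttpEntity API):

    import java.io.IOException;
    import java.io.InputStream;
    import java.util.zip.GZIPInputStream;

    // Cut-down stand-in for the entity interface, just to show the
    // shape of the chaining.
    interface Entity {
        InputStream getContent() throws IOException;
        long getContentLength();
    }

    // Decorator that strips the gzip content coding off the wrapped
    // entity, the same way input streams are chained.
    class GzipDecompressingEntity implements Entity {

        private final Entity wrapped;

        GzipDecompressingEntity(final Entity wrapped) {
            this.wrapped = wrapped;
        }

        public InputStream getContent() throws IOException {
            return new GZIPInputStream(wrapped.getContent());
        }

        public long getContentLength() {
            return -1; // length of the decoded content is not known
        }
    }

One would simply wrap the entity received from the connection and hand
the wrapper to the application.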

> progress monitoring, for statistical analysis, for dumping 
> copies of the data that is sent or received, or other things
> developers will come up with.

This can be done much better by layering HttpDataReceiver and
HttpDataTransmitter wrappers around the standard ones. This is how I am
planning to implement wire logging for HttpClient 4.0.
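
Something like the sketch below, say (the receiver interface here is cut
down to a single method and its signature is an assumption, just to show
the decoration):

    import java.io.IOException;

    // Cut-down stand-in for HttpDataReceiver, for illustration only.
    interface DataReceiver {
        int read(byte[] b, int off, int len) throws IOException;
    }

    // Wrapper that logs whatever has been read from the underlying
    // receiver before handing it back to the caller.
    class LoggingDataReceiver implements DataReceiver {

        private final DataReceiver wrapped;

        LoggingDataReceiver(final DataReceiver wrapped) {
            this.wrapped = wrapped;
        }

        public int read(final byte[] b, final int off, final int len)
                throws IOException {
            int l = wrapped.read(b, off, len);
            if (l > 0) {
                System.out.println("<< " + new String(b, off, l, "US-ASCII"));
            }
            return l;
        }
    }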

> The standard approach to directly take an output stream and wrap 
> others around it has some limitations. For example, http-async
> performs request preprocessing before allocating a connection,
> so there is no stream yet. Of course that can be solved by using
> a dummy stream that can be directed to another stream later.
> 

I am not sure I entirely understand the problem you are having with
http-async. You want to preprocess an entity before the connection is
allocated? What for?

> A second drawback is that the order in which the streams are
> created is fixed.

The whole point is that the order IS fixed. One cannot put the chunked
codec in an arbitrary place; it MUST be the last coding applied.

>  There is no way to insert a debug or progress
> monitoring stream inbetween.
> 

In my opinion, the debug or progress monitoring code should precede any
data streams. All these functions certainly do not belong in request /
response interceptors; they belong in the HTTP data receiver /
transmitter stack.

To sum things up, we have

* HttpDataTransmitter / HttpDataReceiver to take care of things at the
byte and line level

* HttpEntityGenerator / HttpEntityWriter to take care of transfer coding

* HttpRequestInterceptor / HttpResponseInterceptor to take care of
content coding

In my humble opinion this is quite logical. We should not attempt to
shove all these quite diverse aspects into request / response
interceptors.

The only *real* problem that I see at the moment is that the
HttpEntityGenerator and HttpEntityWriter interfaces are not symmetric
from a design purism standpoint. But they serve their purpose of
ensuring the integrity of the low-level transport layer and are easy to
replace or extend.

More to the point, I think we should put our efforts elsewhere at this
point in time: HttpAsync, HttpNIO, HttpCookie and HttpAuth. We can
redesign HttpEntityGenerator and HttpEntityWriter at any moment before
the API freeze (which is nowhere in sight) once we see a real need for
that.

Oleg

