Re: Counting actual input size [was: svn commit: r1088435]

sebb Sat, 09 Apr 2011 04:41:16 -0700

On 8 April 2011 14:23, sebb <seb...@gmail.com> wrote:
> On 8 April 2011 07:45, Milamber <milam...@apache.org> wrote:
> [...]
>
>>>> With my last submission (r1088748), I have tried to address your feedback.
>>>> Please tell me if anything else needs improving.
>>>>
>>> The problem of chunked responses still exists - such responses don't
>>> have a Content-Length header.
>>>
>>> One way round this would be to wrap the input stream with an
>>> org.apache.commons.io.input.CountingInputStream.
>>> I don't think this will affect performance adversely.
>>>
>>> Does that make sense?
>>>
>>
>> Yes, that may be a good idea.
>> Since your last commit on HC4Impl, entity.getContentLength() returns -1
>> (unknown size), even though the HTTP response defines a Content-Length.
>> I think the ResponseContentEncoding class, which decompresses the stream,
>> is the cause.
>
> That seems likely.
>
>> On HC4, I tried using a CountingInputStream on instream, but the size it
>> returns is the uncompressed size.
>>                // InputStream instream = entity.getContent();
>>                InputStream instream =
>>                        new CountingInputStream(entity.getContent());
>>                res.setResponseData(
>>                        readResponse(res, instream, (int) entity.getContentLength()));
>>                int cnt = ((CountingInputStream) instream).getCount();
>>                log.debug("CNT=" + cnt);
>>
>>
>> I think the CountingInputStream must sit deeper in the code, directly
>> in HttpClient, or inside the gzip/deflate input stream?
>
> Yes, you're right - the streams we currently use are somewhat removed
> from the actual input.
>
> For HC3 and Java, we decompress the input stream directly, so we could
> wrap that with a CountingInputStream first.
> However, the stream contains the de-chunked data, so the chunking
> overhead would not be seen.
> But it would be closer to the true size, and might be acceptable.
>
> Ideally, one would like to intercept the input stream before
> de-chunking, but I'm not sure that's possible with HC3 and Java.
>
> However, with HC3 and HC4 one can provide custom sockets, so it would
> be possible to count the actual input.
>
> One could even detect the end of the header by looking for CRLF CRLF -
> but that might add an unacceptable overhead, in which case we could
> use the current header calculation which would be reasonably close.
>
> It's not possible to provide a custom socket implementation for Java
> HTTP, only Java HTTPS, so this approach would not work there and we
> would have to use the CountingInputStream.
>
> I suggest we use the simple approach of CountingInputStream (CIS) for
> Java and HC3; it's easy to do and fairly accurate. No point spending
> lots of time on those implementations as HC4 is better.
>
> I'll have a look at HC4 to see what can be done - would you like to
> see if CIS works OK for HC3 and Java?
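
To illustrate why the counter has to sit below the decompressing stream,
here is a minimal JDK-only sketch. RawByteCounter is a hypothetical
stand-in for commons-io's CountingInputStream, and RawVsDecodedSize is an
illustrative name, not JMeter code. It only shows that a counter wrapped
directly around the source sees the compressed size, while the bytes read
from the gzip stream add up to the decoded size.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

/** Hypothetical stand-in for commons-io CountingInputStream (skip() is not counted). */
class RawByteCounter extends FilterInputStream {
    private long count;

    RawByteCounter(InputStream in) {
        super(in);
    }

    @Override
    public int read() throws IOException {
        int b = super.read();
        if (b != -1) {
            count++;
        }
        return b;
    }

    @Override
    public int read(byte[] buf, int off, int len) throws IOException {
        int n = super.read(buf, off, len);
        if (n > 0) {
            count += n;
        }
        return n;
    }

    long getCount() {
        return count;
    }
}

class RawVsDecodedSize {
    /** Compresses a payload so the sketch has some "wire" bytes to read. */
    static byte[] gzip(byte[] plain) {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(bos)) {
            gz.write(plain);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bos.toByteArray();
    }

    /** Returns {raw bytes seen below the gzip stream, decoded bytes read above it}. */
    static long[] measure(byte[] wire) {
        // The counter wraps the source first; the decompressor wraps the counter.
        RawByteCounter raw = new RawByteCounter(new ByteArrayInputStream(wire));
        long decoded = 0;
        try (InputStream in = new GZIPInputStream(raw)) {
            byte[] buf = new byte[4096];
            int n;
            while ((n = in.read(buf)) != -1) {
                decoded += n;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new long[] { raw.getCount(), decoded };
    }
}
```

Wrapping in the opposite order (counter outside the gzip stream) would
report the decoded size, which matches what Milamber saw with HC4.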


HC4 keeps metrics on the connection, so it's very easy to find the
actual byte counts for header and body.
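
For implementations without such metrics, the CRLF CRLF idea mentioned
above could look roughly like the sketch below. HeaderBodySplit and
headerSize are hypothetical names, and the sketch assumes the raw response
bytes are already in a buffer, which is not how the samplers read data
today.

```java
import java.nio.charset.StandardCharsets;

class HeaderBodySplit {
    /**
     * Returns the number of bytes up to and including the first CRLF CRLF,
     * or -1 if the header terminator is not present in the buffer.
     */
    static int headerSize(byte[] raw) {
        for (int i = 0; i + 3 < raw.length; i++) {
            if (raw[i] == '\r' && raw[i + 1] == '\n'
                    && raw[i + 2] == '\r' && raw[i + 3] == '\n') {
                return i + 4;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        byte[] raw = "HTTP/1.1 200 OK\r\nContent-Length: 5\r\n\r\nhello"
                .getBytes(StandardCharsets.US_ASCII);
        int header = headerSize(raw);
        // Everything after the terminator is counted as raw body.
        System.out.println("header=" + header + " body=" + (raw.length - header));
    }
}
```

The scan is linear in the header length, so the overhead concern is
mostly about having to buffer or inspect the bytes at all.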

==

I think we should consider changing the default to be the total
network response size. However, this may affect some size assertions.

HTTPSampleResult (v2.4) stores the decoded body size only. Maybe we
should store the header and raw body sizes separately, rather than
combining some of them. This would give the most flexibility.

Also, consider adding the fields to SampleResult rather than
HTTPSampleResult. For non-HTTP responses, the headerSize would
normally be zero and raw body size would be the same as decoded body
size, but e.g. for POP3 perhaps it would make sense to implement
header size.
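
A rough sketch of what storing the sizes separately might look like. The
class, field, and method names here are purely illustrative and are not
existing SampleResult API.

```java
/** Hypothetical holder for the three sizes; not part of JMeter. */
class ResponseSizes {
    final long headerSize;      // status line plus headers, as received
    final long rawBodySize;     // body bytes as received (before decompression)
    final long decodedBodySize; // body bytes after decompression

    ResponseSizes(long headerSize, long rawBodySize, long decodedBodySize) {
        this.headerSize = headerSize;
        this.rawBodySize = rawBodySize;
        this.decodedBodySize = decodedBodySize;
    }

    /** Non-HTTP default: no header, and the raw size equals the decoded size. */
    static ResponseSizes bodyOnly(long bodySize) {
        return new ResponseSizes(0, bodySize, bodySize);
    }

    /** Total network response size: header plus raw body. */
    long totalNetworkSize() {
        return headerSize + rawBodySize;
    }
}
```

Keeping the three values apart means a size assertion could still pick
the decoded body size, while a listener could show the network total.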

Adding the fields to SampleResult would also make it easier to display
them in the Tree View Listener (HTTPSampleResult is currently defined
in a different jar which is built later - perhaps that's a mistake).

