Re: Counting actual input size [was: svn commit: r1088435]

Milamber Tue, 12 Apr 2011 23:54:21 -0700

I've updated the patch on bug 43363 following your last commit on HC4:

https://issues.apache.org/bugzilla/show_bug.cgi?id=43363


With your last commit on HC4Impl, the header size and body size are wrong
for a gzip stream or a chunked response.
For example, with a chunked response, they are:
HC4:
Size in bytes: 8199
Headers size in bytes: 8192  (=> looks like a read buffer size?)
Body size in bytes: 7

Java & HC3 (correct values, verified with Wireshark):
Size in bytes: 10505
Headers size in bytes: 581
Body size in bytes: 9924


For a gzip response:
HC4:
Size in bytes: 14025 (good)
Headers size in bytes: 1440
Body size in bytes: 12585

Java & HC3:
Size in bytes: 14025
Headers size in bytes: 291
Body size in bytes: 13734

Is it a bug in HttpClient 4.1 too?

Milamber


On 10/04/2011 16:09, Milamber wrote:
>
> On 09/04/2011 12:40, sebb wrote:
>   
>> [snip]
>>     
>>>> On HC4, I tried to use a CountingInputStream on instream, but the size
>>>> returned is the uncompressed size.
>>>>     // InputStream instream = entity.getContent();
>>>>     InputStream instream = new CountingInputStream(entity.getContent());
>>>>     res.setResponseData(readResponse(res, instream, (int) entity.getContentLength()));
>>>>     int cnt = ((CountingInputStream) instream).getCount();
>>>>     log.debug("CNT=" + cnt);
>>>>
>>>>
>>>> I think the CountingInputStream must sit deeper in the code: directly
>>>> in HttpClient, or inside the gzip/deflate input stream?
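
To illustrate the point about where the counter sits, here is a minimal
sketch using only the JDK. It is not JMeter or HttpClient code: ByteCounter
is a hypothetical stand-in for CountingInputStream, and the "wire" is just a
byte array. The counter is wrapped by the GZIPInputStream, so it sees the
compressed bytes rather than the decoded ones.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

class RawByteCountSketch {

    /** Hypothetical stand-in for a counting stream: counts bytes as they are read. */
    static class ByteCounter extends FilterInputStream {
        long count;

        ByteCounter(InputStream in) {
            super(in);
        }

        @Override
        public int read() throws IOException {
            int b = super.read();
            if (b != -1) count++;
            return b;
        }

        @Override
        public int read(byte[] buf, int off, int len) throws IOException {
            int n = super.read(buf, off, len);
            if (n > 0) count += n;
            return n;
        }
    }

    /** Compresses the given bytes so the sketch has some "wire" data to read. */
    static byte[] gzip(byte[] plain) {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(bos)) {
            gz.write(plain);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bos.toByteArray();
    }

    /** Returns {bytes read below the gzip layer, bytes after decompression}. */
    static long[] measure(byte[] wireBytes) {
        // The counter is UNDER the decompressing stream, so it counts compressed bytes.
        ByteCounter raw = new ByteCounter(new ByteArrayInputStream(wireBytes));
        long decoded = 0;
        try (InputStream body = new GZIPInputStream(raw)) {
            byte[] buf = new byte[4096];
            int n;
            while ((n = body.read(buf)) != -1) {
                decoded += n;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new long[] { raw.count, decoded };
    }

    public static void main(String[] args) {
        byte[] plain = "hello hello hello hello hello hello hello hello"
                .getBytes(StandardCharsets.UTF_8);
        long[] sizes = measure(gzip(plain));
        System.out.println("raw=" + sizes[0] + " decoded=" + sizes[1]);
    }
}
```

Wrapping the counter around the GZIPInputStream instead would report the
decoded size, which matches what I saw with entity.getContent().
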
>>>>       
>>>>         
>>> Yes, you're right - the streams we currently use are somewhat removed
>>> from the actual input.
>>>
>>> For HC3 and Java, we decompress the inputstream directly, so could
>>> wrap that with a CountingInputStream first.
>>> However, the stream contains the de-chunked data, so the chunking
>>> overhead would not be seen.
>>> But it would be closer to the true size, and might be acceptable.
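
For reference, a rough sketch of what the chunking overhead amounts to. It
walks a complete chunked body held in memory and compares the wire size with
the de-chunked payload size. It assumes no trailer headers; the class and
method names are made up for this example and are not part of JMeter or
HttpClient.

```java
import java.nio.charset.StandardCharsets;

class ChunkOverheadSketch {

    /** Returns the number of payload bytes in a complete chunked body (no trailers assumed). */
    static int payloadSize(byte[] wire) {
        int pos = 0;
        int payload = 0;
        while (true) {
            int eol = indexOfCrlf(wire, pos);
            if (eol < 0) {
                throw new IllegalArgumentException("missing chunk-size line");
            }
            String sizeLine = new String(wire, pos, eol - pos, StandardCharsets.US_ASCII);
            int semi = sizeLine.indexOf(';');          // drop chunk extensions, if any
            if (semi >= 0) {
                sizeLine = sizeLine.substring(0, semi);
            }
            int size = Integer.parseInt(sizeLine.trim(), 16);
            pos = eol + 2;                              // skip the CRLF after the size line
            if (size == 0) {
                return payload;                         // last chunk reached
            }
            payload += size;
            pos += size + 2;                            // skip chunk data and its trailing CRLF
        }
    }

    /** Index of the next CRLF at or after 'from', or -1 if there is none. */
    static int indexOfCrlf(byte[] b, int from) {
        for (int i = from; i + 1 < b.length; i++) {
            if (b[i] == '\r' && b[i + 1] == '\n') {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        byte[] wire = "4\r\nWiki\r\n5\r\npedia\r\n0\r\n\r\n".getBytes(StandardCharsets.US_ASCII);
        int payload = payloadSize(wire);
        System.out.println("wire=" + wire.length + " payload=" + payload
                + " overhead=" + (wire.length - payload));
    }
}
```

A counter placed after de-chunking only ever sees the payload figure; the
size lines and CRLFs are the part that goes missing.
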
>>>
>>> Ideally, one would like to intercept the input stream before
>>> de-chunking, but I'm not sure that's possible with HC3 and Java.
>>>
>>> However with HC3 and HC4 one can provide custom sockets, so it would
>>> be possible to count the actual input.
>>>
>>> One could even detect the end of the header by looking for CRLF CRLF -
>>> but that might add an unacceptable overhead, in which case we could
>>> use the current header calculation which would be reasonably close.
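
A minimal sketch of the CRLF CRLF idea, assuming the start of the raw
response is available as a byte array (in a custom socket it would have to be
done incrementally on the stream). The class and method names are
hypothetical.

```java
import java.nio.charset.StandardCharsets;

class HeaderEndSketch {

    /**
     * Returns the header size in bytes, including the terminating CRLF CRLF,
     * or -1 if the end of the header is not present in the buffer.
     */
    static int headerSize(byte[] response) {
        for (int i = 0; i + 3 < response.length; i++) {
            if (response[i] == '\r' && response[i + 1] == '\n'
                    && response[i + 2] == '\r' && response[i + 3] == '\n') {
                return i + 4;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        byte[] response = "HTTP/1.1 200 OK\r\nContent-Length: 2\r\n\r\nhi"
                .getBytes(StandardCharsets.US_ASCII);
        int headers = headerSize(response);
        System.out.println("headers=" + headers + " body=" + (response.length - headers));
    }
}
```

The scan is linear in the header length, so the cost mostly depends on how
often it would run per sample.
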
>>>
>>> It's not possible to provide a custom socket implementation for Java
>>> HTTP, only Java HTTPS, so this approach would not work there and we
>>> would have to use the CountingInputStream.
>>>
>>> I suggest we use the simple approach of CountingInputStream (CIS) for
>>> Java and HC3; it's easy to do and fairly accurate. No point spending
>>> lots of time on those implementations as HC4 is better.
>>>
>>> I'll have a look at HC4 to see what can be done - would you like to
>>> see if CIS works OK for HC3 and Java?
>>>     
>>>       
> Yes, it works fine (plain, gzip and chunked responses).
>
>
>   
>> HC4 keeps metrics on the connection, so it's very easy to find the
>> actual byte counts for header and body.
>>   
>>     
> The byte count works (with an HttpConnectionMetrics). It's the full
> response size including the headers. To get the raw body size, I must
> subtract the headers size (after calculating it).
> I found an issue with chunked responses: for example, with the Google
> homepage and the Header Manager disabled (so gzip responses are not
> accepted), the response size is incorrect. I don't know why yet; I need
> to investigate.
>
>   
>> ==
>>
>> I think we should consider changing the default to be the total
>> network response size. However, this may affect some size assertions.
>>   
>>     
> Yes, that seems better. JMeter will then report the real network
> size and throughput.
> For the Size Assertion element, I added a scope selector like the one in
> the Response Assertion element (but with network size and without the URL
> scope). Which other elements need this scope?
>
>   
>> HTTPSampleResult (v2.4) stores the decoded body size only. Maybe we
>> should store the header and raw body sizes separately, rather than
>> combining some of them. This would give the most flexibility.
>>
>> Also, consider adding the fields to SampleResult rather than
>> HTTPSampleResult. For non-HTTP responses, the headerSize would
>> normally be zero and raw body size would be the same as decoded body
>> size, but e.g. for POP3 perhaps it would make sense to implement
>> header size.
>>   
>>     
> OK, I've added these fields to SampleResult.
>   
>> Adding the fields to SampleResult would also make it easier to display
>> them in the Tree View Listener (HTTPSampleResult is currently defined
>> in a different jar which is built later - perhaps that's a mistake).
>>   
>>     
> OK, done. In the View Results Tree (VRT), I propose this:
> Size in bytes: X+Y
> Headers size in bytes: X
> Response data size in bytes: Y
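
As a rough illustration of the idea (not the code in the patch), the two
sizes can be stored separately and the total derived from them. The class,
field and method names below are hypothetical; the values in main are the
Java/HC3 figures from the chunked example at the top of this mail.

```java
class SizeFieldsSketch {

    private final int headersSize; // X: response headers, in bytes
    private final int bodySize;    // Y: raw response body as received, in bytes

    SizeFieldsSketch(int headersSize, int bodySize) {
        this.headersSize = headersSize;
        this.bodySize = bodySize;
    }

    int getHeadersSize() {
        return headersSize;
    }

    int getBodySize() {
        return bodySize;
    }

    /** Total network size, derived rather than stored: X + Y. */
    int getTotalSize() {
        return headersSize + bodySize;
    }

    public static void main(String[] args) {
        SizeFieldsSketch chunked = new SizeFieldsSketch(581, 9924);
        System.out.println("Size in bytes: " + chunked.getTotalSize());
        System.out.println("Headers size in bytes: " + chunked.getHeadersSize());
        System.out.println("Response data size in bytes: " + chunked.getBodySize());
    }
}
```

Keeping X and Y separate means a listener or assertion can pick whichever
figure it needs without recomputing anything.
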
>
> I've submitted a patch with this enhancement on bug 43363:
> https://issues.apache.org/bugzilla/show_bug.cgi?id=43363
>
> Sebb, please apply it to your trunk and send your feedback (especially on
> HC4 with chunked responses).
>
> Milamber
>
>   
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: [email protected]
>> For additional commands, e-mail: [email protected]

