On Fri, 2013-11-29 at 17:20 +0400, Alexey Ermakov wrote:
> Hi,
> 
> One of the services I'm using serves realtime data via HTTP using a 
> Twitter-like streaming scheme (chunked transfer encoding, JSON messages 
> separated by \r\n). While trying to consume this data using HttpClient (4.3.1 
> if that matters) I quickly ran into a problem. When the data is sent 
> uncompressed (either by requesting it directly from the backend or by 
> disabling compression) the read() on the response entity's input stream 
> completes as soon as the server sends me another chunk. However, if gzip is 
> enabled (which is what we use in production by compressing the data via 
> nginx), the read() gets stuck for a rather long time, presumably until some 
> buffer inside HC fills. Since the messages themselves are very small and 
> similar, they tend to compress very well, which results in severe lag.
> The problem doesn't occur when the same data is consumed using 
> AsyncHttpClient or plain old curl --compressed -N, so it must be an issue with 
> Apache HttpClient. I couldn't find any relevant RequestConfig/ConnectionConfig 
> settings that would help. Is there anything I'm missing?
> 
> Scala code and an nginx config that can be used to reproduce the issue are here: 
> <https://gist.github.com/technocoreai/9b2fb194b236f773c8a4>. I can rewrite it in 
> Java if that will help.

Alexey,

HttpClient uses the standard Java GZIPInputStream class to decompress
GZIP-encoded content entities. Whatever buffering goes on inside that
class is something we, as HttpClient developers, have little to no
control over.
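
To see how GZIPInputStream behaves on its own, outside HttpClient, a
minimal sketch along the following lines may help. It assumes the server
flushes the compressor after every message, which is emulated here with
the syncFlush mode of GZIPOutputStream. The class and method names are
made up for illustration and are not part of HttpClient.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper class, for illustration only.
final class GzipStreamingSketch {

    // Compresses one message, sync-flushes it, and returns the bytes
    // produced so far. The gzip stream is deliberately left unfinished
    // (no trailer) to mimic a long-lived streaming response.
    static byte[] firstFlushedBytes(String message) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        GZIPOutputStream gzip = new GZIPOutputStream(sink, true); // syncFlush = true
        gzip.write(message.getBytes(StandardCharsets.UTF_8));
        gzip.flush(); // pushes the pending compressed data into the sink
        return sink.toByteArray();
    }

    // Performs a single read() on a GZIPInputStream over the partial
    // stream and returns whatever that one call decoded.
    static String readAvailable(byte[] partial) throws IOException {
        GZIPInputStream in = new GZIPInputStream(new ByteArrayInputStream(partial));
        byte[] buf = new byte[1024];
        int n = in.read(buf, 0, buf.length);
        return n < 0 ? "" : new String(buf, 0, n, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        String message = "{\"id\":1}\r\n";
        String decoded = readAvailable(firstFlushedBytes(message));
        System.out.println("decoded before end of stream: " + decoded.equals(message));
    }
}
```

If the message comes back from that single read(), the decompressor can
decode a flushed message before the stream ends. Any remaining delay
would then more likely come from how the compressed bytes reach it than
from the gzip format itself.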

GZIPInputStream is known to have been a trouble-maker in the past [1].
Give the patch attached to the issue a try and let me know if that fixes
the problem for you.

Oleg

[1] https://issues.apache.org/jira/browse/HTTPCLIENT-1403



