Hello,
I can't help you with the code, but make sure you also consider an 
additional option: to be friendly to servers, you can request a partial 
response with a Range header in the first place (though I'm not sure about 
the interop impact).

Regards
Bernd
-- 
http://bernd.eckenfels.net

On Tue, Dec 6, 2016 at 5:21 PM +0100, "Joseph Naegele" 
<jnaeg...@grierforensics.com> wrote:

Hi folks,

How can I limit the amount of data downloaded for a request executed by the 
HttpAsyncClient and still process the response as "completed" in the registered 
FutureCallback? The use case is a large scale web crawler that truncates 
resources deemed too large.

I started by limiting the amount of data read from the response entity's 
InputStream, however this doesn't work with the default 
BasicAsyncResponseConsumer, because it uses the dynamically expanding 
SimpleInputBuffer to download the entire response entity.

I implemented my own HttpAsyncResponseConsumer, similar to the 
BasicAsyncResponseConsumer, and tried using IOControl to signal shutdown once 
I've read the maximum desired number of bytes, however this triggers a 
ConnectionClosedException. This is undesirable because I can't distinguish it 
from other causes of ConnectionClosedExceptions, and I want to treat 
"truncated" responses as completed in the registered FutureCallback (where I 
post-process the response).
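To make the byte-limit part concrete, here is a rough, library-independent sketch of the capped buffer such a consumer could fill instead of the expanding SimpleInputBuffer. TruncatingBuffer and its methods are hypothetical names, not HttpAsyncClient API. The sketch only covers the counting and truncation; how the consumer then reports the response as completed without a ConnectionClosedException is exactly the open question and is not addressed here.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;

/** Hypothetical buffer that keeps at most maxBytes and drops the rest. */
final class TruncatingBuffer {

    private final int maxBytes;
    private final ByteArrayOutputStream out = new ByteArrayOutputStream();
    private boolean truncated;

    TruncatingBuffer(int maxBytes) {
        if (maxBytes < 0) {
            throw new IllegalArgumentException("maxBytes must not be negative");
        }
        this.maxBytes = maxBytes;
    }

    /**
     * Copies as much of src as still fits and discards any excess.
     * Returns true once the limit has been reached, meaning the caller
     * should stop asking for more input.
     */
    boolean append(ByteBuffer src) {
        int n = Math.min(maxBytes - out.size(), src.remaining());
        byte[] chunk = new byte[n];
        src.get(chunk);
        out.write(chunk, 0, n);
        if (src.hasRemaining()) {
            // Excess bytes were actually seen, so the content is truncated.
            truncated = true;
            src.position(src.limit());
        }
        return out.size() >= maxBytes;
    }

    /** True only if bytes beyond the limit were offered and dropped. */
    boolean isTruncated() {
        return truncated;
    }

    byte[] toByteArray() {
        return out.toByteArray();
    }
}
```

A consumer would call append() with each chunk of decoded content and stop reading once it returns true. Note that a body of exactly maxBytes fills the buffer without setting the truncated flag.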

Is there another method of implementing my desired functionality?

Thanks,
Joe Naegele

