On Tue, 2012-12-18 at 15:36 -0800, vigna wrote:
> We are trying to use DefaultHttpAsyncClient in our new crawler. We need to
> handle a few hundred connections per thread asynchronously, and it seems the
> right candidate.
>
> In the last few weeks we experimented with many DefaultHttpClient instances
> on a few thousand threads and it worked well (actually, we found a couple of
> bugs, as our crawls are very wide and meet every kind of server
> configuration error). Consider that we crawl URLs from different sites
> continuously, so we need to change the cookie store at each request, which
> we do by managing the store directly.
>
> After digging through the (little) documentation, I really couldn't figure
> out how to manage cookies with HttpAsyncClient. Any suggestion or code
> snippet would be really welcome. What we need to do, basically, is:
>
> - keep a few hundred GET requests open in parallel.
> - for each request, use an AsyncByteConsumer to accumulate the content in a
> buffer, and the headers, cookies, etc. in some data structure.
> - on completion, schedule the received data for analysis.
>
> All of this, however, requires managing cookies, and I could not work out
> how to set the cookie store for each async request, or how to get at the
> cookie store in onResponseReceived().
>
> Any help appreciated!
>
> seba
>
---
static class MyResponseConsumer extends AsyncCharConsumer<Boolean> {

    @Override
    protected void onResponseReceived(final HttpResponse response) {
    }

    @Override
    protected void onCharReceived(final CharBuffer buf, final IOControl ioctrl)
            throws IOException {
        while (buf.hasRemaining()) {
            System.out.print(buf.get());
        }
    }

    @Override
    protected void releaseResources() {
    }

    @Override
    protected Boolean buildResult(final HttpContext context) {
        // The cookie store used for this request is available from the context
        CookieStore cookieStore = (CookieStore) context.getAttribute(
                ClientContext.COOKIE_STORE);
        List<Cookie> cookies = cookieStore.getCookies();
        for (Cookie cookie: cookies) {
            System.out.println(cookie);
        }
        return Boolean.TRUE;
    }

}
public static void main(String[] args) throws Exception {
    HttpAsyncClient httpclient = new DefaultHttpAsyncClient();
    httpclient.start();
    try {
        // Per-request cookie store, bound to a per-request context
        BasicCookieStore cookieStore = new BasicCookieStore();
        BasicHttpContext context = new BasicHttpContext();
        context.setAttribute(ClientContext.COOKIE_STORE, cookieStore);
        // Pass the context to execute(); otherwise the custom store is ignored
        Future<Boolean> future = httpclient.execute(
                HttpAsyncMethods.createGet("http://www.google.com/"),
                new MyResponseConsumer(), context, null);
        Boolean result = future.get();
        if (result != null && result.booleanValue()) {
            System.out.println("Request successfully executed");
        } else {
            System.out.println("Request failed");
        }
        System.out.println("Shutting down");
    } finally {
        httpclient.shutdown();
    }
    System.out.println("Done");
}
---
Oleg