I think thread context switching consumes a lot of CPU. If I can use an async approach (perhaps built on Java NIO, which uses epoll on Linux, or on NIO.2, which provides asynchronous channels), it should perform better. I have hundreds (even thousands) of threads running. Some websites are slow and take half a minute to return a single page. I am currently using 500 threads and they only use 200 KB/s of bandwidth. If I add more threads, they will use more memory (for thread stacks) and more CPU.
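
To make the two-queue idea concrete, here is a rough sketch of what I have in mind. It is only an illustration, not HttpAsyncClient code: the class name BoundedAsyncCrawler and the fetcher function are made up for this example, and fetcher stands in for whatever async HTTP call is used (I assume it could wrap a callback-based client and complete a CompletableFuture from the callback). A Semaphore caps the number of in-flight requests at maxConcurrent, and a single dispatcher thread replaces the 500 blocking threads.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Semaphore;
import java.util.function.Function;

/** Hypothetical helper: drains a URL queue with at most maxConcurrent requests in flight. */
final class BoundedAsyncCrawler {
    // Assumed async fetch: returns a future that completes with the page body.
    private final Function<String, CompletableFuture<String>> fetcher;
    private final int maxConcurrent;
    private final Semaphore permits;

    BoundedAsyncCrawler(Function<String, CompletableFuture<String>> fetcher, int maxConcurrent) {
        this.fetcher = fetcher;
        this.maxConcurrent = maxConcurrent;
        this.permits = new Semaphore(maxConcurrent);
    }

    /**
     * Fetches every URL currently in needCrawlQueue and adds {url, html} to htmlQueue.
     * html is null when the request failed. Assumes htmlQueue is unbounded; offer()
     * would silently drop results on a full bounded queue.
     */
    void crawl(BlockingQueue<String> needCrawlQueue, BlockingQueue<String[]> htmlQueue)
            throws InterruptedException {
        String next;
        while ((next = needCrawlQueue.poll()) != null) {
            final String url = next;
            // Blocks only this dispatcher thread while maxConcurrent requests are pending.
            permits.acquire();
            CompletableFuture<String> pending;
            try {
                pending = fetcher.apply(url);
            } catch (RuntimeException e) {
                // Treat a synchronous failure like an asynchronous one so the permit is released.
                pending = new CompletableFuture<>();
                pending.completeExceptionally(e);
            }
            pending.whenComplete((html, error) -> {
                try {
                    htmlQueue.offer(new String[] {url, error == null ? html : null});
                } finally {
                    permits.release(); // frees a slot for the next URL
                }
            });
        }
        // Wait for the remaining in-flight requests: all permits are back once they finish.
        permits.acquire(maxConcurrent);
        permits.release(maxConcurrent);
    }
}
```

With something like this, one thread would call new BoundedAsyncCrawler(fetcher, 500).crawl(needCrawlQueue, htmlQueue) and another would consume htmlQueue. Whether it actually beats the blocking client depends on the fetcher implementation, which is the part I am asking about.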

On Fri, Jul 18, 2014 at 10:02 PM, Oleg Kalnichevski <[email protected]> wrote:
> On Fri, 2014-07-18 at 18:16 +0800, Li Li wrote:
>> hi all,
>> I used to use HttpComponents Client to crawl webpages. I need to
>> improve it by using the async client. What I want is something like:
>> BlockingQueue<URL> needCrawlQueue;
>> BlockingQueue<String[]> htmlQueue;
>>
>> HttpAsyncClient client;
>> int maxConcurrent=500;
>>
>> //if finished a url, then get notified and call back this code
>> if(client.currentCrawlingCount<maxConcurrent){
>>     URL url=needCrawlQueue.take();
>>     //request this url
>> }
>>
>> //if finished a url, then get notified and call back this code
>> //String url;String html are the callback arguments
>> htmlQueue.put(new String[]{url, html});
>>
>> I mean I have an async client class which takes two queues.
>> If the number of unfinished urls is less than maxConcurrent, it takes a
>> url from one queue and requests it. If a url succeeds (or fails),
>> it adds the result to the other queue.
>>
>
> Why do you think the use of an async client would necessarily be an
> improvement? What is it exactly you want to improve? Generally a decent
> blocking client with a moderate number of threads is likely to be faster
> than an async one.
>
> Oleg
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [email protected]
> For additional commands, e-mail: [email protected]
