Hi James,
Wget2 is built on top of the libwget library, which uses asynchronous network
calls. However, Wget2 is written such that it only uses one connection per
thread. This is a design decision that keeps the codebase simple. If you
want a more complex crawler, you can use libwget directly.
On 31.07.2018 20:17, James Read wrote:
> Thanks,
>
> as I understand it, though, there is only so much you can do with
> threading. For more scalable solutions you need to go with async
> programming techniques. See http://www.kegel.com/c10k.html for a summary
> of the problem. I want to do large scale webcrawling and am not sure if
> wget2 is up to the task.
Thanks,
as I understand it, though, there is only so much you can do with threading.
For more scalable solutions you need to go with async programming
techniques. See http://www.kegel.com/c10k.html for a summary of the
problem. I want to do large scale webcrawling and am not sure if wget2 is
up to the task.
On 31.07.2018 18:39, James Read wrote:
> Hi,
>
> how much work would it take to convert wget into a fully fledged
> asynchronous webcrawler?
>
> I was thinking something like using select. Ideally, I want to be able to
> supply wget with a list of starting point URLs and then for wget to crawl
> the web from those starting points in an asynchronous fashion.
Hi,
how much work would it take to convert wget into a fully fledged
asynchronous webcrawler?
I was thinking something like using select. Ideally, I want to be able to
supply wget with a list of starting point URLs and then for wget to crawl
the web from those starting points in an asynchronous fashion.