I recommended this about a year ago, I think, and it fell through the
cracks, so let's try again. If you haven't yet looked at 'Pavuk' or
tested it, you might want to try it. The main page is here:

        http://www.idata.sk/~ondrej/pavuk/

        It is a full-blown spider with a very intricate GUI configuration
screen, and it includes a built-in parser and manipulation engine for
JavaScript, forms, SSL, and cookies. It lets you schedule a gather of
content and merge it with other content. It is completely multithreaded and
written in C, with a GTK front end as well. Here's one screenshot of the GUI
in use:

        http://www.idata.sk/~ondrej/pavuk/pic/gtk12-common.gif

        This code is very well developed and, yes, as with anything, there
are some bugs, but it will help us develop our ideas and remove the need to
reinvent the wheel in our own work. I think we can really leverage what
Ondrej has done with Pavuk here.

        Comments?



/d

