Re: developing web spider

2008-04-04 Thread Kenji Noguchi
Attached is the essence of my crawler. It collects tags from a given URL. HTML parsing is not a big deal, as "tidy" does it all for you: it converts broken HTML to valid XHTML. From that point there is a wealth of XML libraries. Just write whatever you want, such as an element handler. I've extended it
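
The attachment itself is not part of this archive preview, so the following is only a minimal sketch of the "XML library after tidy" step, not the poster's actual code. It assumes the page has already been converted to well-formed XHTML (for example by tidy), assumes the tags of interest are `<a>` links, and uses the standard library's `xml.etree.ElementTree` as one possible XML library. The `collect_links` name and the `SAMPLE` document are made up for illustration.

```python
import xml.etree.ElementTree as ET


def collect_links(xhtml_text):
    """Return the href of every <a> element in a well-formed XHTML document.

    Assumption: the input is already valid XHTML (e.g. tidy output).
    Broken HTML would make ET.fromstring raise a ParseError.
    """
    root = ET.fromstring(xhtml_text)
    links = []
    for elem in root.iter():
        # ElementTree reports namespaced tags as "{namespace}name";
        # strip the namespace so plain and namespaced documents both work.
        tag = elem.tag.rsplit("}", 1)[-1]
        href = elem.get("href")
        if tag == "a" and href:
            links.append(href)
    return links


# Hypothetical stand-in for XHTML that a tidy pass might produce.
SAMPLE = """<html xmlns="http://www.w3.org/1999/xhtml">
<head><title>Example</title></head>
<body>
<p><a href="http://example.com/one">one</a></p>
<p><a href="/two">two</a> and <a name="anchor">no link</a></p>
</body>
</html>"""

if __name__ == "__main__":
    for link in collect_links(SAMPLE):
        print(link)
```

A real crawler would still need to fetch the page, run it through tidy, and resolve relative links such as `/two` against the page URL (for instance with `urllib.parse.urljoin`).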

Re: developing web spider

2008-04-03 Thread Nikita the Spider
In article <[EMAIL PROTECTED]>, John Nagle <[EMAIL PROTECTED]> wrote:
> abeen wrote:
> > Hello,
> >
> > I would want to know which could be the best programming language for
> > developing web spider.
> > More information about the spider, much better,,
>
> As someone who actually runs a Py

Re: developing web spider

2008-04-02 Thread John Nagle
abeen wrote:
> Hello,
>
> I would want to know which could be the best programming language for
> developing web spider.
> More information about the spider, much better,,

As someone who actually runs a Python based web spider in production, I should comment. You need a very robust parse

Re: developing web spider

2008-04-02 Thread Pete Wright
The O'Reilly Spidering Hacks book is also really good, albeit a little too focussed on Perl.

On Apr 2, 9:54 am, [EMAIL PROTECTED] wrote:
> On Apr 2, 6:37 am, abeen <[EMAIL PROTECTED]> wrote:
>
> > Hello,
>
> > I would want to know which could be the best programming language for
> > developing we

Re: developing web spider

2008-04-02 Thread zillow10
On Apr 2, 2:54 pm, [EMAIL PROTECTED] wrote:
> On Apr 2, 6:37 am, abeen <[EMAIL PROTECTED]> wrote:
>
> > Hello,
>
> > I would want to know which could be the best programming language for
> > developing web spider.
> > More information about the spider, much better,,
>
> > thanks
>
> >http://www.ima

Re: developing web spider

2008-04-02 Thread zillow10
On Apr 2, 6:37 am, abeen <[EMAIL PROTECTED]> wrote:
> Hello,
>
> I would want to know which could be the best programming language for
> developing web spider.
> More information about the spider, much better,,
>
> thanks
>
> http://www.imavista.com

Just saw this while passing by... There's a nice

Re: developing web spider

2008-04-02 Thread Stefan Scholl
abeen <[EMAIL PROTECTED]> wrote:
> I would want to know which could be the best programming language for
> developing web spider.

Since you ask in comp.lang.python: I'd suggest APL

--
Web (en): http://www.no-spoon.de/ -*- Web (de): http://www.frell.de/
--
http://mail.python.org/mailman/listinf

Re: developing web spider

2008-04-01 Thread Daniel Fetchinson
> I would want to know which could be the best programming language for
> developing web spider.
> More information about the spider, much better,,

I hear Larry and Sergey were not exactly unsuccessful with a Python implementation, although you might of course try something even better :) If you a