On Fri, Dec 15, 2000 at 09:05:29AM +0100, Josip Rodin wrote:
>
> Could this be useful?
>

Maybe. It appears to be extremely similar to the program I wrote.
They are both written in Python and use the same Python libraries.
The big difference is that he chose to use threads, while I chose to
limit the time spent searching for a given URL, which requires using
signals (whose handlers raise exceptions). The problem with this
approach is that signals, and the exceptions raised from their
handlers, don't work well with threads.

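For readers who have not seen the technique, here is a minimal sketch of a signal-based time limit, assuming a Unix system and modern Python. It is not the code from either program; the names CheckTimeout, run_with_timeout, check and hanging_fetch are made up for illustration. The alarm handler raises an exception that aborts whatever call is in progress. Python only runs signal handlers in the main thread, which is why this does not combine easily with worker threads.

```python
import signal
import time


class CheckTimeout(Exception):
    """Raised when a single URL check exceeds its time budget."""


def _on_alarm(signum, frame):
    # Turn SIGALRM into an ordinary Python exception.
    raise CheckTimeout()


def run_with_timeout(func, seconds, *args):
    """Call func(*args), giving up after `seconds` (Unix, main thread only)."""
    old_handler = signal.signal(signal.SIGALRM, _on_alarm)
    signal.setitimer(signal.ITIMER_REAL, seconds)
    try:
        return func(*args)
    finally:
        signal.setitimer(signal.ITIMER_REAL, 0)      # cancel a pending alarm
        signal.signal(signal.SIGALRM, old_handler)   # restore previous handler


def check(url, fetch, seconds=15.0):
    """Return fetch(url), or the string "timeout" if it takes too long."""
    try:
        return run_with_timeout(fetch, seconds, url)
    except CheckTimeout:
        return "timeout"


def hanging_fetch(url):
    # Hypothetical stand-in for a connection that never answers in time.
    time.sleep(1)
    return "ok"
```

Calling check(url, hanging_fetch, seconds=0.2) gives up after roughly 0.2 seconds instead of waiting for the fetch to finish.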
The biggest bottleneck for the program is sites that time out. By
default, it takes about 13 minutes for a connection to time out. I
currently have the timeout set to 15 seconds in my program, which
should give it an edge over the other program, assuming it uses 10
threads. Also, my program caches the results of a timeout, while his
doesn't, which will really slow his down.

I am running linkchecker on our pages (as of Fri Dec 15 18:57:57 UTC
2000) and you can see the output at
http://www.debian.org/~treacy/urlcheck/scripts/out

The following is from an included mail, not from Josip Rodin:
> LinkChecker is a tool I wrote some time ago to check my HTML pages.
> It has grown into a program that can check whole web structures
> for broken links.
> It's licensed under the GPL.
>

Comparison (responses are whether my program has that function):
> Features:
> o recursive checking
yes
> o multithreading
no
> o output in colored or normal text, HTML, SQL, CSV or a sitemap
>   graph in GML or XML.
who cares
> o HTTP/1.1, HTTPS, FTP, mailto:, news:, nntp:, Gopher, Telnet and local
>   file links support
http and ftp only, which work well for our site.
> o restriction of link checking with regular expression filters for URLs
yes
> o proxy support
no and not needed since it's our site
> o username/password authorization for HTTP and FTP
no, but not needed for our site
> o robots.txt exclusion protocol support
no, but since it is our site, I want to check what I want to check
> o i18n support
no
> o a command line interface
yes
> o a (Fast)CGI web interface (requires HTTP server)
trivial to set up

My program allows you to set the timeout for trying a connection,
which his doesn't. Unfortunately, there is a tradeoff between
multithreading and a settable timeout, unless you are willing to jump
through some hoops.

-- 
James (Jay) Treacy
[EMAIL PROTECTED]
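
The timeout caching mentioned above could look roughly like the sketch below. It assumes the cache is keyed by host name, which the mail does not actually specify, and that the fetch function signals a timeout by raising TimeoutError. check_cached and its parameters are hypothetical names, not taken from either program.

```python
from urllib.parse import urlsplit


def check_cached(url, fetch, timed_out_hosts):
    """Check url, skipping hosts that have already timed out once.

    fetch is any callable that returns a status for the URL or raises
    TimeoutError; timed_out_hosts is a set shared between calls.
    """
    host = urlsplit(url).netloc
    if host in timed_out_hosts:
        # Assumption: a host that timed out once will time out again,
        # so do not pay for another full timeout on it.
        return "timeout (cached)"
    try:
        return fetch(url)
    except TimeoutError:
        timed_out_hosts.add(host)
        return "timeout"
```

With many links pointing at the same dead host, only the first one costs a full timeout.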

