Hi
I am programming a Web crawler and an indexer.
I am using Lynx to convert HTML documents into text files, with the command
"lynx -dump".
The problem is that it rewrites relative URLs as local file URLs of the form
FILE:///db/www/...
I am also using Lynx to extract links from the HTML files, so I have to do a
lot of extra work to convert those local URLs back to relative ones, which I
can then combine with the host name to create an absolute WWW URL.
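
To show what I mean by the conversion, here is a rough Python sketch of it,
not a finished solution. It assumes the crawler knows both the local file
that was handed to Lynx and the URL that page was originally fetched from.
The function name local_to_absolute, the file /db/www/index.html and the
host www.example.org are made up for illustration only.

```python
import posixpath
from urllib.parse import urljoin, urlsplit


def local_to_absolute(link, local_file, page_url):
    """Map a file:// link printed by lynx back to an absolute web URL.

    link       -- a link as lynx printed it, e.g. "FILE:///db/www/a.html"
    local_file -- local path of the page given to lynx (assumed known)
    page_url   -- URL the page was originally fetched from (assumed known)
    """
    parts = urlsplit(link)  # urlsplit lowercases the scheme for us
    if parts.scheme != "file":
        return link  # already a non-local URL, leave it untouched

    # Recover the relative reference by comparing the link's path with
    # the directory of the local copy of the page.
    base_dir = posixpath.dirname(local_file)
    relative = posixpath.relpath(parts.path, base_dir)
    if parts.path.endswith("/"):
        relative += "/"  # relpath drops a trailing slash, put it back
    if parts.query:
        relative += "?" + parts.query
    if parts.fragment:
        relative += "#" + parts.fragment

    # Resolve the relative reference against the page's real URL.
    return urljoin(page_url, relative)


if __name__ == "__main__":
    print(local_to_absolute("FILE:///db/www/docs/a.html",
                            "/db/www/index.html",
                            "http://www.example.org/index.html"))
```

This only works when the local copy mirrors the directory layout of the site,
which may not hold for every crawler.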
If you know of any program other than Lynx that does similar tasks with
comparable performance, I would be interested to hear about it. Thanks...
Jari Tuominen
http://www.vunet.org
_______________________________________________
Lynx-dev mailing list
http://lists.nongnu.org/mailman/listinfo/lynx-dev