> i need to get only the report body, not the whole page as Lynx does....
As I noted, that's almost certainly illegal.

However, there is an XML-based format that is commonly used to give abstracts of news items. I won't name it, in case part of your task was to discover it, but if the commercial people are on the ball, you will only get enough of the article to make you want to read the full page, with its advertising. (In most cases, if you are given the whole article, you are probably viewing a propaganda site rather than a news site; i.e. the editorial is the advertising.)

One other point: in the unlikely event that you are actually dealing with something that was designed with the semantic web in mind, you would need to process the document object model, which means using a full SGML parser. Normal web browsers are about taking syntactically badly broken HTML and making it visually usable, so most of their code exists to deal with SGML violations, whereas a semantic web document ought to be easy to parse.

_______________________________________________
Lynx-dev mailing list
[email protected]
http://lists.nongnu.org/mailman/listinfo/lynx-dev
