On Wed, 04 Jun 2003 23:07:08 +0200 "H.J.Bathoorn" <[EMAIL PROTECTED]> wrote:
> On Tuesday 03 June 2003 07:12, Todd Slater wrote:
> > http://clevername.homeip.net/gnews2
> >
> > I've tested it out a little and it seems to work. If no new
> > headlines are available, it just says "No new headlines" in the
> > email. Note that it still has to pull all the pages from google.
> >
> > Put it in cron and run every hour? Let me know of any bugs, er,
> > features!
> >
> > Todd
>
> running it!.....looks fine no probs (except I had to install &
> configure sendmail).
>
> Wouldn't it be an idea to show the first 5 or 10 lines from the
> extracted URL so one gets a better description of the article itself?
>
> Good luck,
> HarM

That would be nice. I thought about trying to include the blurb from
google but decided against it, especially with the version that only
sends new headlines. The reason is that it

1. pulls the page(s) from google
2. strips out lines in the html with headlines (and writes to a file)
3. strips out the headline text and writes that to a file
4. strips out the url and writes that to a file
5. greps the old headline and url file for matches
6. if no match is found, writes that to a new file (one each for
   headline and url)
7. pastes new headlines and new urls together
8. puts headlines and urls for each category in a mail

It is beyond my skill right now to throw in extra text and guarantee
that everything matches up like it does now :).

As far as actually visiting the urls of the news sources, it would take
quite a bit of work to have wget retrieve each page and then try to
filter out just the first few lines of text of the article.

Today's the first day I really ran it. I only retrieve 2 headlines each
for business, health, entertainment, and sports, 5 each for usa and
world, and all of them for technology. Running it every 2-3 hours gave
me new headlines in each category every time.

BTW, you can remove or comment out the lines about "Old hl exists".
That was just for debugging and I forgot to take it out of the script.
Removing them should save you from getting mail output from cron.

Todd
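
For anyone wondering how steps 5 to 7 above hang together, here is a rough sketch. It is not the code from gnews2: the function name new_items and the three file arguments are made up for illustration, and it pairs each headline with its url before checking the old headline file, which is one possible way to keep the two from getting out of step.

```sh
#!/bin/sh
# Sketch only: new_items and its arguments are hypothetical names,
# not taken from the gnews2 script.
#
# usage: new_items HEADLINES URLS OLD_HEADLINES
# Prints "headline<TAB>url" for every headline that does not already
# appear as a whole line in OLD_HEADLINES.
new_items() {
    hl=$1
    url=$2
    old=$3
    tab=$(printf '\t')
    # paste puts line N of the headline file next to line N of the url file
    paste "$hl" "$url" | while IFS="$tab" read -r h u; do
        # -F: fixed string, -x: match the whole line, -q: no output
        if ! grep -Fxq -- "$h" "$old" 2>/dev/null; then
            printf '%s\t%s\n' "$h" "$u"
        fi
    done
}
```

If the function prints nothing, that would be the "No new headlines" case. A missing old-headline file is treated as "everything is new".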
