The old NY Times AvantGo server seems to have stopped being updated, so I
modified my site file to look like this:
#
URL: http://www.nytimes.com/nytimes-partners/avantgo/main.html
Name: NY Times
Levels: 3
ImageURL: /.*gif
ImageScaleToMaxWidth: 150
ContentsCacheable: 0
StoryCacheable: 0
The URL works fine in a web browser, but sitescooper complains with this
weird error (shown below). I can't find either NYT_HEADLINE or NYT_HEADER
in my site file. How do I get rid of these errors?
Also, the top-level "page" has no links to the second-level pages in the
resulting .pdb file. Why is that?
sitescooper -fullrefresh -isilo -site ./nytimes.site
Reading configuration from "/home/dwight/.sitescooper/sitescooper.cf".
Using site choices from "/home/dwight/.sitescooper/site_choices.txt".
Restricting to sites: ./nytimes.site
SITE START: now scooping site "./nytimes.site".
Reading level-3 front page:
http://www.nytimes.com/nytimes-partners/avantgo/main.html
Image: http://www.nytimes.com/nytimes-partners/avantgo/nytlogo2.gif
Printing: http://www.nytimes.com/nytimes-partners/avantgo/main.html
Found 3 links, examining them.
Reading level-2 front page:
http://www.nytimes.com/nytimes-partners/avantgo/quicknews.html
File "nytimes.site" line 2: LinksStart pattern "</NYT_HEADER" not found in
page http://www.nytimes.com/nytimes-partners/avantgo/quicknews.html
Image: http://www.nytimes.com/nytimes-partners/avantgo/nytlogo5.gif
Image: http://www.nytimes.com/nytimes-partners/avantgo/spacer.gif
Printing: http://www.nytimes.com/nytimes-partners/avantgo/quicknews.html
Found 8 links, examining them.
Reading:
http://www.nytimes.com/nytimes-partners/avantgo/quicknews_story1.html
File "nytimes.site" line 2: StoryStart pattern "</NYT_HEADLINE" not found
in page
http://www.nytimes.com/nytimes-partners/avantgo/quicknews_story1.html
Reading:
http://www.nytimes.com/nytimes-partners/avantgo/quicknews_story2.html
File "nytimes.site" line 2: StoryStart pattern "</NYT_HEADLINE" not found
in page
http://www.nytimes.com/nytimes-partners/avantgo/quicknews_story2.html
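Since neither NYT_HEADER nor NYT_HEADLINE appears in the site file above, one
way to track them down is to search the other files on disk for those strings.
The following is only a rough sketch: it assumes that some other file defines
the patterns, which is a guess, and the function name find_pattern_sources and
the directory passed to it are hypothetical, not part of sitescooper.

```python
import os
import sys


def find_pattern_sources(root, needles=("NYT_HEADER", "NYT_HEADLINE")):
    """Return (path, line_number, line) for each line under root that
    contains one of the needles. Hypothetical helper, not a sitescooper API."""
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            try:
                # errors="replace" keeps the scan going on non-UTF-8 files.
                with open(path, encoding="utf-8", errors="replace") as fh:
                    for lineno, line in enumerate(fh, start=1):
                        if any(needle in line for needle in needles):
                            hits.append((path, lineno, line.rstrip("\n")))
            except OSError:
                # Skip files that cannot be opened or read.
                continue
    return hits


if __name__ == "__main__":
    # Directory to scan; defaults to the current directory (an assumption).
    root = sys.argv[1] if len(sys.argv) > 1 else "."
    for path, lineno, line in find_pattern_sources(root):
        print(f"{path}:{lineno}: {line}")
```

Run against whichever directories hold site files on the machine in question.
Any hit outside ./nytimes.site would suggest that the LinksStart and
StoryStart patterns in the log are coming from a different file.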
--
Dwight D. McKay, Senior Technical Consultant
Network Kitchen Consulting
[EMAIL PROTECTED]