Here's a site file I've written for the National Review Online
(http://www.nationalreview.com/). I guess it belongs in opinion.
The file could be better, I think, but I can't figure out how to get
StoryToPrintableSub to work.
This site has stories with URLs like
http://www.nationalreview.com/<author>/<author><date>.shtml
A printable version of the same story is
http://www.nationalreview.com/<author>/<author>print<date>.html
This seems like a job for StoryToPrintableSub. I constructed the
following substitution:
StoryURL: /\S+/\D+\d+\.shtml
StoryURL: /\S+/\D+print\d+\.html
StoryToPrintableSub: s,(/\S+/\D+)(\d+)\.sht,\1print\2.ht
(which is only the last of many different attempts).
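As a sanity check on the pattern itself, independent of sitescooper, here is a minimal sketch in Python (not the Perl that sitescooper uses) that applies the same search and replacement to one story URL. The author name "smith" and date "010203" are made up for illustration, and the helper name to_printable is hypothetical. It only tests whether the regular expression does the intended rewrite; it says nothing about how sitescooper parses the StoryToPrintableSub line.

```python
import re

# Same pattern and replacement as in the StoryToPrintableSub attempt above.
# Group 1 is the path up to the date, group 2 is the date digits.
PATTERN = re.compile(r"(/\S+/\D+)(\d+)\.sht")
REPLACEMENT = r"\1print\2.ht"


def to_printable(url: str) -> str:
    """Rewrite a story URL into its printable form (hypothetical helper)."""
    return PATTERN.sub(REPLACEMENT, url, count=1)


# Hypothetical story URL following the <author>/<author><date>.shtml scheme.
story = "http://www.nationalreview.com/smith/smith010203.shtml"
print(to_printable(story))
# prints http://www.nationalreview.com/smith/smithprint010203.html
```

If the rewrite comes out right here, the regular expression is probably not the cause of the error, and the problem is more likely in how the substitution line is written or read in the site file.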
However, when I run sitescooper against this site file, I get the
following error message:
File "nro.site" line 3: Printable substitution failed! (Bad file number)
This error message shows up no matter what substitution I use. I
don't speak enough Perl to understand what this means, and I haven't
been able to find any information that explains it. Can anybody help
me?
# National Review Online
#
URL: http://www.nationalreview.com/
Name: National Review Online
Levels: 2
ImageURL: /images/\D\.gif
StoryURL: /\S+/\D+\d+\.shtml
StoryURL: /\S+/\D+print\d+\.html
#StoryToPrintableSub: s,(/\S+/\D+)(\d+)\.sht,\1print\2.ht
--
John Straw [EMAIL PROTECTED]
Moderation in all things.
-- Publius Terentius Afer [Terence]