----- Bennett Wrote -----
I have found that the NYT server accepts level2 story urls with the
magic "?pagewanted=all" tag even if the story does not have following
pages. Is it possible to preprocess the level 2 scooped urls to
append "?pagewanted=all" before the level2 urls are collected?
----- End -----
Interesting. I've been using "?printpage=yes", which works too. To
use it, I've added the following line to my site file:

    StoryToPrintableSub: s,(.*),\1\?printpage=yes,

You could try your solution the same way:

    StoryToPrintableSub: s,(.*),\1\?pagewanted=all,

Quick explanation: the "s" is the Perl substitution operator, which
takes the form "s,pattern,replacement,". (I'm using commas as
delimiters, but any non-alphanumeric, non-whitespace character could
be used instead.) In this case the pattern is "(.*)". The dot (".")
matches any character except a newline, the star ("*") means zero or
more times, and the parentheses remember the match (storing it in the
first pattern match variable, "\1"; Perl prefers the spelling "$1" in
the replacement, but "\1" is accepted there too). So the pattern
matches the whole URL, and the replacement replaces it with itself
(the contents of "\1") followed by our addition, "?printpage=yes" or
"?pagewanted=all". The backslash ("\") in front of the question mark
("?") is only a precaution: "?" is a metacharacter in the pattern,
not in the replacement, so in plain Perl the backslash is not strictly
needed there, but it is harmless.
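
If you want to see what the substitution does before touching your
site file, here is a minimal stand-alone Perl sketch. The helper name
"to_printable" and the example URL are made up for illustration; they
are not part of sitescooper, and I'm assuming sitescooper applies the
StoryToPrintableSub expression much as plain Perl would. It uses "$1"
where the site-file line uses "\1", since that is the spelling Perl
prefers in the replacement.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not part of sitescooper): append a query tag to a
# story URL using the same kind of substitution as the site-file line.
sub to_printable {
    my ($url, $tag) = @_;

    # Copy the URL, then substitute on the copy.
    # (.*) captures the whole URL; $1 puts it back, followed by "?" and
    # the tag. No backslash is needed before "?" in the replacement.
    (my $printable = $url) =~ s,(.*),$1?$tag,;

    return $printable;
}

# Made-up example URL, for illustration only.
my $story = "http://www.nytimes.com/example/story.html";

print to_printable($story, "printpage=yes"), "\n";
print to_printable($story, "pagewanted=all"), "\n";
```

Run as is, this prints the example URL twice, once with
"?printpage=yes" and once with "?pagewanted=all" on the end. Note that
the substitution appends blindly: a URL that already contains a "?"
query string would end up with a second "?" in it.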
- Kennis
_______________________________________________
Sitescooper-talk mailing list
[EMAIL PROTECTED]
http://lists.sourceforge.net/lists/listinfo/sitescooper-talk