[EMAIL PROTECTED] said:
> I think the problem is that the story tags in the contents page start
> on one line and end on another.
>
> I think the same problem happened when I tried to scoop some Linux
> documents from http://www.linuxdoc.org, because all DocBook-generated
> documents are a mess. Take a look at the source of some HOWTOs at
> linuxdoc.
>
> I'm attaching the .site file for the Brazilian News site so you can
> take a look at it.

Hi Rodrigo -- that is a mess! Whitespace in the URLs themselves!!

2 tips:

1. If you use "\s+" in the patterns, it will match newlines and any
   other whitespace.

2. If sitescooper picks up the story URL with spaces in it, which it
   might (as far as I know, Sitescooper is totally right to do this!),
   then you might have to use a substitution on the URL, via
   URLProcess, to fix it.

--j.

_______________________________________________
Sitescooper-talk mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/sitescooper-talk
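
For illustration only, here is a rough sketch of the two tips above in plain Python rather than in .site file syntax. The HTML fragment, the example.com URL, and the names link_re and clean_url are all made up for this sketch; it only shows the regex behaviour, not how Sitescooper itself implements patterns or URLProcess.

```python
import re

# Hypothetical contents-page fragment: the link tag starts on one line and
# ends on another, and the URL itself contains stray whitespace.
html = '<a\n   href="http://news.example.com/story\n 123.html">Headline</a>'

# Tip 1: \s+ matches any run of whitespace, newlines included, so the
# pattern still matches even though the tag is split across lines.
link_re = re.compile(r'<a\s+href="([^"]*)"')


def clean_url(url):
    """Tip 2: substitute away all whitespace inside a picked-up URL."""
    return re.sub(r'\s+', '', url)


raw_url = link_re.search(html).group(1)
fixed_url = clean_url(raw_url)

print(repr(raw_url))   # the URL as captured, whitespace and all
print(fixed_url)       # the URL after the substitution
```

In a real .site file the same substitution would presumably live in the URLProcess code for that site; check the Sitescooper documentation for the exact form it expects.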
