Title: Parsing HTML files correctly

I don't know if this list is for technical support, but I am unaware of any other channels.

I built a site file, which looks like this -----(guardfull)>

URL: http://www.guardian.co.uk/guardian/todays_stories
Name: guardianfull
Levels: 3
StoryURL: http://www.guardian.co.uk/Print/.*\.(html|htm)

(end of guardfull)---------------------<

and I ran this command line --->


perl sitescooper.pl -site site/guardfull -html

but the output was practically the whole site, instead of just the pages that match the StoryURL constraint.
Does anyone know why?
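
To make the expectation concrete, here is a minimal sketch in Python (not sitescooper's own code) of the filter the StoryURL line is meant to express. It assumes the pattern has to match the whole URL, which may not be how sitescooper applies it, and the candidate URLs are made up for illustration only.

```python
import re

# Pattern copied from the StoryURL line of the site file above.
STORY_URL = r"http://www.guardian.co.uk/Print/.*\.(html|htm)"


def is_story(url: str) -> bool:
    # Assumption: the pattern must match the entire URL, not just a prefix.
    return re.fullmatch(STORY_URL, url) is not None


# Hypothetical URLs, invented for this example.
candidates = [
    "http://www.guardian.co.uk/Print/0,3858,1234,00.html",
    "http://www.guardian.co.uk/guardian/todays_stories",
    "http://www.guardian.co.uk/Print/archive/story.htm",
    "http://www.guardian.co.uk/sport/index.html",
]

# Keep only the URLs that satisfy the StoryURL pattern.
stories = [u for u in candidates if is_story(u)]
for u in stories:
    print(u)
```

Under that assumption only the two URLs under /Print/ would be kept, which is the behaviour I expected from the site file.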

Any help is appreciated.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

        Jim McIntosh
