On Wed, 21 Mar 2001, Justin Mason wrote:
>
> Jarl Friis said:
>
> > In the URLProcessor.pm there are some lines
> >   if ($url =~ /[\/\&\?](\S{5,20}?)$/) {
> >     $self->{to_string_name} = $1;
> >   } else {
> >     $self->{to_string_name} = $url;
> >   }
> > They seem to give me problems with a story link like
> > '/shareware/index.html', which in the example above means that the
> > to_string_name becomes 'index.html'. That seems to conflict with the
> > contents (level 1) URL, hence it stops because the queue seems empty.
> > Does someone know what to do about it?
>
> Hmm...
>
> This shouldn't happen, as the queue is actually keyed numerically.
> But anyway, I've checked in a fix for it so that sitescooper never
> uses index.html/cgi/shtml/whatever as the key...

The site is www.ing.dk. The story that triggered the error is gone; I'll let
you know if I encounter (and notice) it again. I have tried to reproduce the
error with a copy at
http://www.diku.dk/students/jarl/ing.dk.html
but I haven't succeeded. I even copied the problem story and the story I lost
after it, but now everything seems to work :-?

The included site file points at the URL of my copy, with the real URL
commented out. The site is in Danish, so you may not understand it. The
second-to-last story has the title "Sharewarehjælp til Basic-programmering"
and links to http://www.ing.dk/shareware/index.html. That link caused the
problem: I missed the last story. I am mad at myself that I didn't save the
debug info from the runs, sorry :-)
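
For reference, here is a minimal standalone sketch of what the pattern quoted
above captures. Only the regex itself comes from URLProcessor.pm as quoted; the
helper name to_string_name and the two example.invalid URLs are made up for
illustration. If I read the pattern right, the leftmost '/', '&' or '?' whose
tail is 5 to 20 non-space characters wins, so a short path keeps its directory
while a longer one is cut down to the bare file name:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not part of sitescooper): applies the pattern quoted
# from URLProcessor.pm and returns the name it would pick for a URL.
sub to_string_name {
    my ($url) = @_;
    # Leftmost delimiter followed by 5..20 non-space characters up to the
    # end of the string; \S also matches '/', so directories can be included.
    if ($url =~ /[\/\&\?](\S{5,20}?)$/) {
        return $1;
    }
    # No suitable tail found: fall back to the whole URL.
    return $url;
}

my @urls = (
    'http://www.ing.dk/shareware/index.html',            # tail is exactly 20 chars
    'http://www.example.invalid/section-one/index.html', # made-up URL, tail too long
    'http://www.example.invalid/section-two/index.html', # made-up URL, tail too long
);

for my $url (@urls) {
    printf "%-52s => %s\n", $url, to_string_name($url);
}
# The first URL yields 'shareware/index.html'; the other two both yield
# 'index.html', which is the kind of name clash described above.
```

With a file name of index.html, the bare name only results when the last
directory name is longer than nine characters; shorter ones stay in the name.
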
Thanks for an excellent program.
I guess I'll send you some Danish news site files soon.

Jarl
# De Studerendes Vandreklub Kalender
# Author: Jarl Friis <[EMAIL PROTECTED]>
URL: http://www.diku.dk/students/jarl/ing.dk.html
#URL: www.ing.dk
ImageURL: 1
Name: Ingeniøren
Levels: 2
AuthorName: Jarl Friis
AuthorEmail: [EMAIL PROTECTED]
Active: 1
ContentsIncludeStartPattern: 1
ContentsIncludeEndPattern: 1
ContentsStart: <!-- Indholde Start -->
#This will include ShortNews
ContentsEnd: </TD></TR></TABLE> <BR>
#This will NOT include the ShortNews:
#ContentsEnd: <TR><TD COLSPAN="2"><IMG SRC="/ress/ramme/d.gif" WIDTH="2" HEIGHT="3" ALT=""></TD></TR></TABLE>
StoryStart: <!-- .BeginEditable "trumpet" -->
StoryEnd: <!-- .BeginEditable "hojre_spalte_nede_bund" -->
#StoryURL: http://www.ing.dk.*
StorySkipURL: mailto:.*
TableRender: flatten
#StoryLifetime: 0
#ContentsCacheable: 0
#StoryCacheable: 0
#ContentsDiff: 0
#StoryDiff: 0
#ContentsHTMLPreProcess: {
# $_ =~ s,<p>,,sgi;
# }
#StoryHTMLPreProcess: {
# $_ =~ s,<p>,,sgi;
# }
#StoryPostProcess: {
# $_ =~ s,(<p>)+,,gsi;
# }