Here's a patch for a bug pointed out a month or so ago on htdig-dev,
and independently discovered by me a few days ago. It fixes a
problem with empty exclude_urls and bad_querystr attributes
matching everything, rather than nothing, on 3.2b4ish.
Is htdig-general the best place for these patches, or is htdig-dev
preferred?
====================================================================
--- htdig/htdig/Retriever.cc Wed Jul 4 09:39:04 2001
+++ htdig-altered/htdig/Retriever.cc Wed Oct 10 14:54:38 2001
@@ -900,8 +900,9 @@ Retriever::IsValidURL(const String &u)
//
tmpList.Create(config->Find(&aUrl,"exclude_urls")," \t");
HtRegexList excludes;
+ int populated_exclude_list = tmpList.Count();
excludes.setEscaped(tmpList);
- if (excludes.match(url, 0, 0) != 0)
+ if (populated_exclude_list && excludes.match(url, 0, 0) != 0)
{
if (debug >= 2)
cout << endl << " Rejected: item in exclude list ";
@@ -914,10 +915,12 @@ Retriever::IsValidURL(const String &u)
//
tmpList.Destroy();
tmpList.Create(config->Find(&aUrl,"bad_querystr")," \t");
+
+ int populated_badquerystr = tmpList.Count();
HtRegexList badquerystr;
badquerystr.setEscaped(tmpList);
char *ext = strrchr((char*)url, '?');
- if (ext && badquerystr.match(url, 0, 0) != 0)
+ if (ext && populated_badquerystr && badquerystr.match(ext, 0, 0) != 0)
{
if (debug >= 2)
cout << endl << " Rejected: item in bad query list ";
====================================================================
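For anyone who wants to see the failure mode in isolation, here is a minimal standalone sketch using std::regex rather than HtRegexList. It assumes the pattern list is compiled by joining the entries into a single alternation, so that an empty list becomes an empty pattern; that is my guess at the mechanism, not something checked against the HtRegexList source. The names join_patterns, matches_unguarded and matches_guarded are made up for illustration, and unlike setEscaped() the sketch does no escaping of the patterns.

```cpp
#include <cstddef>
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: joins patterns into one alternation, e.g. "a|b".
// An empty list produces the empty pattern "".
std::string join_patterns(const std::vector<std::string>& patterns) {
    std::string joined;
    for (std::size_t i = 0; i < patterns.size(); ++i) {
        if (i > 0) joined += '|';
        joined += patterns[i];
    }
    return joined;
}

// Unguarded check: the empty pattern matches at the start of any input,
// so an empty list ends up matching every URL.
bool matches_unguarded(const std::vector<std::string>& patterns,
                       const std::string& url) {
    return std::regex_search(url, std::regex(join_patterns(patterns)));
}

// Guarded check, mirroring the patch: only consult the regex when the
// list actually contains something.
bool matches_guarded(const std::vector<std::string>& patterns,
                     const std::string& url) {
    return !patterns.empty() && matches_unguarded(patterns, url);
}
```

With an empty list, matches_unguarded() accepts any URL while matches_guarded() rejects it, which is the same effect the Count() test has in the patch above.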
thx
Jamie Anstice
Search Engineer
S.L.I. Systems
[EMAIL PROTECTED]
ph: 64 961 3262
mobile: 64 21 264 9347