OK, I've solved my problem: I can now use a working exclude_urls list
without the serious performance decrease. Here are the changes I made
(sorry I'm not sending a proper diff file; I need guidance on how to
do that properly):


htdig/htdig.h
--------------------

added:

extern int exclude_checked;
extern int badquerystr_checked;
extern HtRegexList  excludes;
extern HtRegexList  badquerystr;



htdig/htdig.cc
----------------------

added these as global variable definitions:

int exclude_checked = 0;
int badquerystr_checked = 0;

HtRegexList     excludes;
HtRegexList     badquerystr;


htdig/Retriever.cc
----------------------

added these conditionals, replacing the previous unconditional
tmpList.Create() and .setEscaped() calls:

if(!(exclude_checked)){
    //only parse this once and store into global variable
    tmpList.Destroy();
    tmpList.Create(config->Find(&aUrl, "exclude_urls"), " \t");
    excludes.setEscaped(tmpList, config->Boolean("case_sensitive"));
    exclude_checked = 1;
}

if(!(badquerystr_checked)){
    //only parse this once and store into global variable
    tmpList.Destroy();
    tmpList.Create(config->Find(&aUrl, "bad_querystr"), " \t");
    badquerystr.setEscaped(tmpList, config->Boolean("case_sensitive"));
    badquerystr_checked = 1;
}
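
For anyone who wants to try the idea outside the ht://Dig tree, here is a
minimal, self-contained sketch of the same parse-once pattern. It is only
an illustration under stated assumptions: PatternList is a hypothetical
stand-in built on std::regex, not the real HtRegexList, and isExcluded()
and its configWords parameter are invented names standing in for the
per-URL check and the already-split exclude_urls value.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical stand-in for HtRegexList (NOT the real ht://Dig class):
// compiles a list of literal substrings into one alternation regex.
class PatternList {
public:
    // Rebuild the matcher from literal words and count each rebuild.
    void setEscaped(const std::vector<std::string>& words, bool caseSensitive) {
        std::string alternation;
        for (const std::string& w : words) {
            if (!alternation.empty()) alternation += '|';
            alternation += escape(w);
        }
        std::regex::flag_type flags = std::regex::ECMAScript;
        if (!caseSensitive) flags |= std::regex::icase;
        compiled_ = std::regex(alternation, flags);
        empty_ = words.empty();
        ++builds_;
    }

    // True if any of the stored words occurs anywhere in the URL.
    bool match(const std::string& url) const {
        return !empty_ && std::regex_search(url, compiled_);
    }

    int builds() const { return builds_; }

private:
    // Backslash-escape regex metacharacters so each word matches literally.
    static std::string escape(const std::string& s) {
        static const std::string special = "\\^$.|?*+()[]{}";
        std::string out;
        for (char c : s) {
            if (special.find(c) != std::string::npos) out += '\\';
            out += c;
        }
        return out;
    }

    std::regex compiled_;
    bool empty_ = true;
    int builds_ = 0;
};

// Globals mirroring the ones added to htdig.cc above.
PatternList excludes;
int exclude_checked = 0;

// Hypothetical per-URL check: the expensive compile runs only on the
// first call; every later call reuses the stored matcher.
bool isExcluded(const std::string& url,
                const std::vector<std::string>& configWords) {
    if (!exclude_checked) {
        excludes.setEscaped(configWords, true);
        exclude_checked = 1;
    }
    return excludes.match(url);
}
```

Calling isExcluded() for any number of URLs leaves excludes.builds() at 1,
which is the whole point of the change: the cost of compiling the list is
paid once rather than per URL.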

The difference in performance is night and day: the excludes list is
now parsed once per dig rather than for *every* URL found.
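
One thing to watch with a process-wide flag: the snippets above look the
setting up with config->Find(&aUrl, ...), so if the value can differ from
one URL to another (an assumption worth checking against your own
configuration), only the first URL's value would ever be used. A possible
variation, sketched below with a hypothetical CachedWordList class that is
not part of ht://Dig, is to remember the raw setting string and rebuild
only when it changes. That keeps the common case cheap without freezing
the first value forever.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper (not part of ht://Dig): caches the split form of a
// whitespace-separated setting and re-splits only when the raw text changes.
class CachedWordList {
public:
    const std::vector<std::string>& get(const std::string& raw) {
        if (!valid_ || raw != lastRaw_) {
            words_.clear();
            std::istringstream in(raw);
            std::string word;
            // operator>> splits on any whitespace, which covers the
            // " \t" separators used in the original snippets.
            while (in >> word) words_.push_back(word);
            lastRaw_ = raw;
            valid_ = true;
            ++rebuilds_;
        }
        return words_;
    }

    // Number of times the list was actually re-split.
    int rebuilds() const { return rebuilds_; }

private:
    std::vector<std::string> words_;
    std::string lastRaw_;
    bool valid_ = false;
    int rebuilds_ = 0;
};
```

The per-URL cost drops to one string comparison when the setting is
unchanged, and the list is rebuilt only on the rare URL where the value
actually differs.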

If this is at all useful to anyone, let me know. I can send the files,
or, if someone would enlighten me (even RTFM me), proper diffs/patches.

Cheers,

Chris

-- 
Christopher Murtagh
Enterprise Systems Administrator
ISR / Web Communications Group 
McGill University
Montreal, Quebec
Canada

Tel.: (514) 398-3122
Fax:  (514) 398-2017


