On Tue, 20 Apr 2004, Christopher Murtagh wrote:

> Date: Tue, 20 Apr 2004 00:03:01 -0400
> From: Christopher Murtagh <[EMAIL PROTECTED]>
> To: [EMAIL PROTECTED]
> Cc: [EMAIL PROTECTED]
> Subject: [htdig-dev] Solved! - Re: Performance issue with exclude_urls
>
> Ok, I've solved my problem, and can now have a list of working
> exclude_urls without the serious performance decrease. Here are the
> changes I made (sorry I'm not sending a proper diff file... need
> guidance on how to do that properly):
>
>
> htdig/htdig.h
> --------------------
>
> added:
>
>   extern int exclude_checked;
>   extern int badquerystr_checked;
>   extern HtRegexList excludes;
>   extern HtRegexList badquerystr;
>
>
> htdig/htdig.cc
> ----------------------
>
> added these as global variable definitions:
>
>   int exclude_checked = 0;
>   int badquerystr_checked = 0;
>
>   HtRegexList excludes;
>   HtRegexList badquerystr;
>
>
> htdig/Retriever.cc
>
> added these conditionals and removed the previous tmplist creates and
> .setEscaped() calls:
>
>   if(!(exclude_checked)){
>     //only parse this once and store into global variable
>     tmpList.Destroy();
>     tmpList.Create(config->Find(&aUrl, "exclude_urls"), " \t");
>     excludes.setEscaped(tmpList, config->Boolean("case_sensitive"));
>     exclude_checked = 1;
>   }
>
>   if(!(badquerystr_checked)){
>     //only parse this once and store into global variable
>     tmpList.Destroy();
>     tmpList.Create(config->Find(&aUrl, "bad_querystr"), " \t");
>     badquerystr.setEscaped(tmpList, config->Boolean("case_sensitive"));
>     badquerystr_checked = 1;
>   }
>
> The difference in performance is night and day, and the excludes list
> is only parsed once per dig rather than at *every* URL found.
>
> If this is at all useful to anyone, let me know. I can send files or if
> someone would enlighten me (even RTFM me) I can send diff/patches.
>
> Cheers,
>
> Chris
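For readers without the ht://Dig sources at hand, the parse-once idea in the change quoted above can be sketched in standalone C++. This is only an illustration under stated assumptions, not the actual Retriever.cc code: find_config() is a hypothetical stand-in for config->Find(), a std::vector of strings with plain substring matching stands in for HtRegexList, and parse_count exists only to show how often the list is built.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for config->Find(&aUrl, "exclude_urls");
// the returned value is made up for this sketch.
std::string find_config(const std::string &key) {
    if (key == "exclude_urls") return "/cgi-bin/ .cgi\t/private/";
    return "";
}

// Illustration only: counts how many times the attribute is parsed.
int parse_count = 0;

// Cached state, analogous to the globals exclude_checked and excludes.
bool exclude_checked = false;
std::vector<std::string> excludes;

// Split a string on any of the given delimiter characters,
// skipping empty tokens (similar in spirit to tmpList.Create(..., " \t")).
std::vector<std::string> split(const std::string &s, const std::string &delims) {
    std::vector<std::string> out;
    std::string::size_type start = s.find_first_not_of(delims);
    while (start != std::string::npos) {
        std::string::size_type end = s.find_first_of(delims, start);
        out.push_back(s.substr(start, end - start));
        start = s.find_first_not_of(delims, end);
    }
    return out;
}

// Returns true if the URL contains any exclude pattern.
// The pattern list is built on the first call only and reused afterwards.
bool is_excluded(const std::string &url) {
    if (!exclude_checked) {
        excludes = split(find_config("exclude_urls"), " \t");
        ++parse_count;
        exclude_checked = true;
    }
    for (const std::string &pat : excludes) {
        if (url.find(pat) != std::string::npos) return true;
    }
    return false;
}
```

Calling is_excluded() for many URLs leaves parse_count at 1, which is the saving described in the quoted message. One caveat: the original call passes &aUrl to config->Find(), which suggests the value could differ per URL; a single global cache like this one assumes it does not.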

Let us call the unmodified copy of each file <file>.orig; then do this:

  diff -u htdig.h.orig htdig.h > exclude_perform.0
  diff -u htdig.cc.orig htdig.cc >> exclude_perform.0
  diff -u Retriever.cc.orig Retriever.cc >> exclude_perform.0

If your diff command does not accept "-u", use "-c" instead.

Attach exclude_perform.0 to an email to the list, or if you prefer,
ftp it to:

  ftp://ftp.ccsf.org/incoming

Regards,

Joe

_______________________________________________
ht://Dig Developer mailing list: [EMAIL PROTECTED]
List information (subscribe/unsubscribe, etc.)
https://lists.sourceforge.net/lists/listinfo/htdig-dev