On Tue, 20 Apr 2004, Christopher Murtagh wrote:

> Date: Tue, 20 Apr 2004 00:03:01 -0400
> From: Christopher Murtagh <[EMAIL PROTECTED]>
> To: [EMAIL PROTECTED]
> Cc: [EMAIL PROTECTED]
> Subject: [htdig-dev] Solved! - Re: Performance issue with exclude_urls
> 
>  Ok, I've solved my problem, and can now have a list of working
> exclude_urls without the serious performance decrease. Here are the
> changes I made (sorry I'm not sending a proper diff file... need
> guidance on how to do that properly):
> 
> 
> htdig/htdig.h
> --------------------
> 
> added:
> 
> extern int exclude_checked;
> extern int badquerystr_checked;
> extern HtRegexList  excludes;
> extern HtRegexList  badquerystr;
> 
> 
> 
> htdig/htdig.cc
> ----------------------
> 
> added these as global variable definitions:
> 
> int exclude_checked = 0;
> int badquerystr_checked = 0;
> 
> HtRegexList     excludes;
> HtRegexList     badquerystr;
> 
> 
> htdig/Retriever.cc
> --------------------
> 
> added these conditionals and removed the previous tmpList.Create() and
> .setEscaped() calls:
> 
> if(!(exclude_checked)){
>     //only parse this once and store into global variable
>     tmpList.Destroy();
>     tmpList.Create(config->Find(&aUrl, "exclude_urls"), " \t");
>     excludes.setEscaped(tmpList, config->Boolean("case_sensitive"));
>     exclude_checked = 1;
> }
> 
> if(!(badquerystr_checked)){
>     //only parse this once and store into global variable
>     tmpList.Destroy();
>     tmpList.Create(config->Find(&aUrl, "bad_querystr"), " \t");
>     badquerystr.setEscaped(tmpList, config->Boolean("case_sensitive"));
>     badquerystr_checked = 1;
> }
> 
>  The difference in performance is night and day, and the excludes list
> is only parsed once per dig rather than at *every* URL found.
> 
>  If this is at all useful to anyone, let me know. I can send files or if
> someone would enlighten me (even RTFM me) I can send diff/patches.
> 
> Cheers,
> 
> Chris

Assuming the unmodified originals are saved as <file>.orig, run:

diff -u htdig.h.orig htdig.h > exclude_perform.0
diff -u htdig.cc.orig htdig.cc >> exclude_perform.0
diff -u Retriever.cc.orig Retriever.cc >> exclude_perform.0

If your diff command does not accept "-u", use "-c" instead.

Attach exclude_perform.0 to an email to the list or, if you prefer, upload
it by FTP to:

  ftp://ftp.ccsf.org/incoming

Regards,

Joe
-- 
     _/   _/_/_/       _/              ____________    __o
     _/   _/   _/      _/         ______________     _-\<,_
 _/  _/   _/_/_/   _/  _/                     ......(_)/ (_)
  _/_/ oe _/   _/.  _/_/ ah        [EMAIL PROTECTED]


