On Sat, 17 Apr 2004, Lachlan Andrew wrote:

> Date: Sat, 17 Apr 2004 00:20:07 +1000
> From: Lachlan Andrew <[EMAIL PROTECTED]>
> To: Joe R. Jah <[EMAIL PROTECTED]>
> Cc: [EMAIL PROTECTED]
> Subject: Profile
> 
> Thanks, Joe :)
> 
> There are some unexpected things about this profile.  In particular, 
> it seems that some files are not being profiled.
> 
> * regcomp is taking *much* too much time!  As I understand it, regular
>   expressions are being used for things which really "should" be just
>   straight string compares.  I vote that we go back to doing standard
>   compares rather than create an escaped string and then call an
>   expensive regex function...
> 
> * Rather than 'fork'ing a separate external parser each time, could
>   we look at having a "persistent parser", like a persistent TCP
>   connection?  It would require a way of specifying the end of one
>   file and the start of the next, but it looks like the performance
>   gain might be worthwhile.
> 
> * regcomp, calloc and gethostbyname all seem to be called far more
>   often than  gprof  recognises.
> 
> * gethostbyname  seems to be too expensive.  Joe, were you using
>   "persistent connections"?  Gabriele, could we reduce the number
>   of times  gethostbyname  is called?  Perhaps we could cache the
>   names?

That's the default, and I did not override it in my configuration file.
By the way, all indexed documents were on the same server as htdig.
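
On the caching idea: with every document on one server, a per-name cache
would cut the resolver calls down to one.  Below is a rough sketch only,
not code from htdig.  HostCache and resolve_with_gethostbyname are
made-up names for illustration; the sketch assumes IPv4 and does not
cache failed lookups.

```cpp
#include <cassert>
#include <cstring>
#include <functional>
#include <string>
#include <unordered_map>
#include <utility>
#include <netdb.h>
#include <netinet/in.h>
#include <sys/socket.h>

// Hypothetical cache, not part of htdig: remembers the first successful
// answer for each host name so later lookups skip the resolver.
class HostCache {
public:
    using Resolver = std::function<bool(const std::string&, in_addr&)>;

    explicit HostCache(Resolver resolver) : resolver_(std::move(resolver)) {}

    // Returns true and fills addr on success.  Failures are not cached,
    // so a host that was down is retried on the next call.
    bool lookup(const std::string& host, in_addr& addr) {
        auto it = cache_.find(host);
        if (it != cache_.end()) {
            addr = it->second;
            return true;
        }
        if (!resolver_(host, addr)) {
            return false;
        }
        cache_.emplace(host, addr);
        return true;
    }

private:
    Resolver resolver_;
    std::unordered_map<std::string, in_addr> cache_;
};

// Resolver backed by gethostbyname (IPv4 only).  The result is copied
// out at once because gethostbyname returns a pointer to static storage.
bool resolve_with_gethostbyname(const std::string& host, in_addr& addr) {
    hostent* he = gethostbyname(host.c_str());
    if (he == nullptr || he->h_addrtype != AF_INET ||
        he->h_addr_list[0] == nullptr) {
        return false;
    }
    std::memcpy(&addr, he->h_addr_list[0], sizeof(addr));
    return true;
}
```

A caller would construct one HostCache(resolve_with_gethostbyname) per
run and call lookup() wherever gethostbyname is called directly today.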

Regards,

Joe
-- 
     _/   _/_/_/       _/              ____________    __o
     _/   _/   _/      _/         ______________     _-\<,_
 _/  _/   _/_/_/   _/  _/                     ......(_)/ (_)
  _/_/ oe _/   _/.  _/_/ ah        [EMAIL PROTECTED]

> * None of the functions in  Connection.cc  seem to be profiled.
>   Any ideas why?
> 
> Thanks all,
> Lachlan
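
To illustrate the regcomp point above, here is a minimal sketch of the
two routes for a literal match: escape the string and compile a regex,
or just do a plain substring search.  None of these functions exist in
htdig; contains_literal, escape_for_regex and contains_via_regex are
names invented for the example, and the escape list is an assumption
covering the usual POSIX extended-regex metacharacters.

```cpp
#include <cassert>
#include <string>
#include <regex.h>

// Plain substring test: no pattern compilation at all.
bool contains_literal(const std::string& text, const std::string& needle) {
    return text.find(needle) != std::string::npos;
}

// Backslash-escape regex metacharacters so the pattern matches the
// literal text only.
std::string escape_for_regex(const std::string& literal) {
    static const std::string special = "\\^$.|?*+()[{";
    std::string out;
    for (char c : literal) {
        if (special.find(c) != std::string::npos) {
            out += '\\';
        }
        out += c;
    }
    return out;
}

// The expensive route: escape, regcomp, regexec, regfree on every call.
bool contains_via_regex(const std::string& text, const std::string& needle) {
    regex_t re;
    const std::string pattern = escape_for_regex(needle);
    if (regcomp(&re, pattern.c_str(), REG_EXTENDED | REG_NOSUB) != 0) {
        return false;
    }
    const bool found = regexec(&re, text.c_str(), 0, nullptr, 0) == 0;
    regfree(&re);
    return found;
}
```

Both functions give the same answer for a non-empty literal; the first
one simply never calls regcomp.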
> 
> On Wed, 14 Apr 2004 14:36, Joe R. Jah wrote:
> 
> > I compiled htdig-3.2.0b5 with -pg; the following patches applied:
> > DESTDIR.0 TMPFILE.0 extension_filter.0 fileSpace.0 operator[].0 and
> > robots.0
> >
> > I ran htdig on ~13k documents; it ran about 40% slower than my
> > regular htdig (without -pg).  I ran gprof htdig > htdig.gmon; gzip
> > htdig.gmon, and put the profile on the patch site, although it's
> > not a patch ;)
> >
> >   ftp://ftp.ccsf.org/htdig-patches/3.2.0b5/htdig.gmon.gz
> >
> > Hope it can help in improving htdig performance.
> >
> > Regards,
> >
> > Joe


