On Sat, 17 Apr 2004, Lachlan Andrew wrote:

> Date: Sat, 17 Apr 2004 00:20:07 +1000
> From: Lachlan Andrew <[EMAIL PROTECTED]>
> To: Joe R. Jah <[EMAIL PROTECTED]>
> Cc: [EMAIL PROTECTED]
> Subject: Profile
>
> Thanks, Joe :)
>
> There are some unexpected things about this profile.  In particular,
> it seems that some files are not being profiled.
>
> * regcomp is taking *much* too much time!  As I understand, regular
>   expressions are being used for things which really "should" be just
>   straight string compares.  I vote that we go back to doing standard
>   compares rather than create an escaped string and then call an
>   expensive regex function...
>
> * Rather than 'fork'ing a separate external parser each time, could
>   we look at having a "persistent parser", like a persistent TCP
>   connection?  It would require a way of specifying the end of one
>   file and the start of the next, but it looks like the performance
>   gain might be worthwhile.
>
> * regcomp, calloc and gethostbyname all seem to be being called a lot
>   more often than gprof recognises.
>
> * gethostbyname seems to be too expensive.  Joe, were you using
>   "persistent connections"?  Gabriele, could we reduce the number
>   of times gethostbyname is called?  Perhaps we could cache the
>   names?

That's the default, and I did not override it in my configuration file.
By the way, all indexed documents were on the same server as htdig.

Regards,

Joe

> * None of the functions in Connection.cc seem to be profiled.
>   Any ideas why?
>
> Thanks all,
> Lachlan
>
> On Wed, 14 Apr 2004 14:36, Joe R. Jah wrote:
>
> > I compiled htdig-3.2.0b5 with -pg; the following patches applied:
> > DESTDIR.0 TMPFILE.0 extension_filter.0 fileSpace.0 operator[].0 and
> > robots.0
> >
> > I ran htdig on ~13k documents; it ran about 40% slower than my
> > regular htdig, (without -pg).  I ran gprof htdig > htdig.gmon;gzip
> > htdig.gmon, and put the profile on the patch site, although it's
> > not a patch;)
> >
> > ftp://ftp.ccsf.org/htdig-patches/3.2.0b5/htdig.gmon.gz
> >
> > Hope it can help in improving htdig performance.
> >
> > Regards,
> >
> > Joe

_______________________________________________
ht://Dig Developer mailing list: [EMAIL PROTECTED]
List information (subscribe/unsubscribe, etc.)
https://lists.sourceforge.net/lists/listinfo/htdig-dev
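
On Lachlan's question about caching the names: a rough sketch of what such a cache could look like follows. This is not taken from the htdig sources. HostCache, Resolver, lookup() and misses() are made-up names, and the resolver is passed in as a function so that a wrapper around gethostbyname() (or a stub) can be plugged in. It assumes addresses stay valid for the length of one dig run, so there is no expiry and failed lookups are not treated specially.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical helper, not part of htdig: remembers the result of a name
// lookup so that each distinct host name is resolved only once per run.
class HostCache {
public:
    // Assumption: the resolver returns the address as text. In real code
    // this would wrap gethostbyname() or a similar call.
    using Resolver = std::function<std::string(const std::string&)>;

    explicit HostCache(Resolver resolve) : resolve_(std::move(resolve)) {
        assert(static_cast<bool>(resolve_));  // an empty resolver is a bug
    }

    // Returns the cached address; the resolver is called only on a miss.
    const std::string& lookup(const std::string& host) {
        auto it = cache_.find(host);
        if (it == cache_.end()) {
            ++misses_;
            it = cache_.emplace(host, resolve_(host)).first;
        }
        return it->second;
    }

    // Number of times the resolver was actually called.
    std::size_t misses() const { return misses_; }

private:
    Resolver resolve_;
    std::unordered_map<std::string, std::string> cache_;
    std::size_t misses_ = 0;
};
```

With all documents on one server, as in Joe's run, every lookup after the first would be served from the map, so the resolver would be called once rather than once per connection. Whether that matches where the time really goes would need another profile.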