At 1:39 PM -0600 11/29/01, Gilles Detillieux wrote:
>According to [EMAIL PROTECTED]:
>> i'm using htdig 3.2.0b4.
>>
>> i meant to ask also if the other "programs" fuzzy, et al, would be
>> working considering how i installed (see below, please).
>> shall i move htfuzzy and htdig, et al, into cgi-bin with htsearch?
>
>No, you should only put htsearch in cgi-bin. The other programs can
>go wherever works for you, but usually somewhere in your PATH is the
>most convenient place for them. Some scripts may need to be customised
>if you move htdig, htmerge, htfuzzy, htnotify or others, if these scripts
>refer to the pathname of the directory where they were originally to be
>installed. The rundig script is one such script that you may need to
>customise.
>
>The tricky part is if you move your htsearch configuration file(s) to
>a different directory than you specified when you originally configured
>the software, because this directory name is compiled into the htsearch
>program, so if you move the directory htsearch won't find its config
>files.

Very good of you. Yes, I was careful with my installation. I am
confident everything is installed (prefixes, etc.) so that I could move
htdig and also install everything else in my home dir instead of the
default install locations.

>
>> and one more question: here is a set of test urls i put in my htdig conf:
>>
>> start_url: http://slis-two.lis.fsu.edu/~G634-23/LIS5364/
>> http://slis-two.lis.fsu.edu/~G634-1/ip1.htm
>> http://slis-two.lis.fsu.edu/~G634-1/ip2.htm
>> http://slis-two.lis.fsu.edu/~G634-1/ip3.htm
>> http://slis-two.lis.fsu.edu/~G634-1/tp1.htm
>>
>> (it may not come out right in email but each url is separated by 4
>> spaces, the last one has 3 spaces) when i added these and ran my test
>> search (http://slis-two.lis.fsu.edu/~G634-23/test.html) i had to go
>> back and run ./rundig again to get it to pick up the 2nd and 3rd urls
>> -it doesn't get the last two at all... when i searched for the word
>> "information".
>>
>> why do i have to run rundig again and again and it still doesn't
>> get all urls? i am soon going to put 50!!!!!
>
>There are two different ways of interpreting your question.
>
>1) You're expecting the database to automatically pick up any new
>URLs in start_url without having to run rundig again, or run htdig
>and htpurge.
>
>2) You are running rundig again after adding URLs to start_url, and
>the database is still not picking up the new URLs.
>
>If it's the first case, you don't understand how the system works.
>The htsearch program doesn't update the databases, it only reads them,
>so whenever you change a config attribute that affects what goes into
>the database, you need to rebuild or at least update the database, with
>the htdig program. The rundig script runs htdig with the -i option, to
>rebuild from scratch, and then runs htpurge to clean up unused entries.
>You can update the database instead of rebuilding from scratch, by
>running htdig (without -i) and htpurge separately.
>You will need to do this from time to time to make sure your database
>picks up any updates to the web sites as well. This is usually done
>via a shell script run from your crontab. (See "man crontab" on your
>system.)
>
>If the second point above is what you mean, then you need to find out why
>htdig isn't indexing everything. See http://www.htdig.org/FAQ.html#q4.1

I'm sorry to have caused you to type so much. I REALLY appreciate what
you do on this list. I was referring to #2, and it turned out (my bad)
the pages had meta tag blocks on them. DOH! sorry.

>
>... and earlier...
>> i have tested to the following extent: i can search my own pages and
>> as far as i know any other public www pages, e.g., htdig.org and
>> others, including a student's public site that resides on the same
>> server -i use the urls, and i will use the urls, in the form of
>> http://blah.blah.blah/ for every student url i add to the conf start
>> url "list".
>>
>> so... will it work?
>
>Well, if you can index one site and search it, then obviously it works.
>There's nothing in htdig to prevent it from also working on 50 or more
>sites, so it should work. The only way to know for sure is to try it, and
>run it in debugging mode (with -v options) if it doesn't work the way you
>think it should. Of course, it helps to have a correct idea of how you
>think it should work, and that's where reading the documentation comes in.

Great. I will hopefully report back in December how wonderfully this
worked for this OS X user, unix beginner.

Btw, does the "max_excerpts" attribute work in 3.2.0b4?

Thanks very much.

Ted Rogers

_______________________________________________
htdig-general mailing list <[EMAIL PROTECTED]>
To unsubscribe, send a message to <[EMAIL PROTECTED]> with a subject of unsubscribe
FAQ: http://htdig.sourceforge.net/FAQ.html
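
For anyone who hits the same symptom (some start_url pages never show up in search results), here is a rough Python sketch for checking a page for meta tags that may ask an indexer to skip it. It is not part of ht://Dig. The names RobotsMetaFinder and find_blocking_meta are made up for this illustration, and the tag names it looks for (a "robots" meta tag with noindex, nofollow or none, and "htdig-noindex") are an assumption about what "meta tag blocks" meant above, so treat it as a starting point only.

```python
from html.parser import HTMLParser

# Directives in a "robots" meta tag that commonly tell an indexer to skip a page
# or not follow its links. This list is an assumption for the sketch.
BLOCKING_DIRECTIVES = {"noindex", "nofollow", "none"}


class RobotsMetaFinder(HTMLParser):
    """Collect <meta> tags that may tell an indexer to skip a page (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.blocking = []  # list of (name, content) pairs, both lowercased

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # Attribute names arrive lowercased; values can be None for bare attributes.
        attr = {key: (value or "") for key, value in attrs}
        name = attr.get("name", "").lower()
        content = attr.get("content", "").lower()
        directives = {part.strip() for part in content.split(",")}
        if name == "robots" and directives & BLOCKING_DIRECTIVES:
            self.blocking.append((name, content))
        elif name == "htdig-noindex":
            self.blocking.append((name, content))


def find_blocking_meta(html_text):
    """Return the (name, content) pairs of meta tags that look like indexing blocks."""
    finder = RobotsMetaFinder()
    finder.feed(html_text)
    return finder.blocking


if __name__ == "__main__":
    sample = '<html><head><META NAME="robots" CONTENT="NOINDEX, NOFOLLOW"></head></html>'
    for name, content in find_blocking_meta(sample):
        print("possible indexing block:", name, content)
```

To check real pages, save each start_url page to a file (or fetch it with urllib.request) and pass its text to find_blocking_meta. An empty list means none of the tags this sketch knows about were found, not that the page is guaranteed to be indexed.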

