Questions:

1) If you run htdump -w before and after the purge, do the db.docs files differ? For me they differ by exactly one line: the URL I purged. I do notice that the db files don't seem to differ in size, but the deleted URL no longer shows up in search results for me after the purge.
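A minimal sketch of that before/after comparison, in case it helps to reproduce it. The helper name removed_lines is hypothetical, and the commented usage assumes the dump lands in db.docs next to the other db files and that /tmp/docs.before is a scratch copy; adjust both to your setup.

```shell
# removed_lines BEFORE AFTER
# Print every line that is present in BEFORE but missing from AFTER.
# Both files are sorted first, so record order in the dumps does not matter.
removed_lines() {
    before_sorted=$(mktemp) || return 1
    after_sorted=$(mktemp) || return 1
    sort "$1" > "$before_sorted"
    sort "$2" > "$after_sorted"
    # comm -23 suppresses lines unique to AFTER and lines common to both,
    # leaving only the lines unique to BEFORE.
    comm -23 "$before_sorted" "$after_sorted"
    rm -f "$before_sorted" "$after_sorted"
}

# Hypothetical usage (paths are assumptions based on the quoted message):
#   ./htdump -w -c /www/htdig/install/conf/ads.conf
#   cp /www/htdig/install/var/ads/db.docs /tmp/docs.before
#   ./htpurge -c /www/htdig/install/conf/ads.conf -u 'URL-to-purge'
#   ./htdump -w -c /www/htdig/install/conf/ads.conf
#   removed_lines /tmp/docs.before /www/htdig/install/var/ads/db.docs
```

If the purge worked the way it does for me, the only line printed should be the record for the purged URL.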
I'll investigate this further, but my gut feeling is that the record in db.docdb is not being purged and is instead 'changing state' to Reference_Obsolete.

As for trying to re-add it: I can't get your htdig command to work at all. It fails for me with an error that it can't find the default htdig.conf file, even though I gave it the '-c' option. This indicates an error somewhere; I'll keep digging.

Thanks.

On 13 Nov 2003, Christopher Murtagh wrote:

> Greetings htdig folks,
>
> Recently I've been trying to have htDig purge and re-index items (via a
> trigger in Postgres). The purge seems to work as I no longer see the
> item in the search results, however, when I try to re-index, I cannot
> bring the page back in unless I do a full index. I've just installed
> 3.2.0b5 hoping that this would help, but no luck. Here's some output
> from my command line attempts to get it to work:
>
>
> [EMAIL PROTECTED] bin]# ./htpurge -c /www/htdig/install/conf/ads.conf -u
> http://newfind.mcgill.ca/indexes/ads/?AdsID=10266860
>
> [EMAIL PROTECTED] bin]# echo 'http://newfind.mcgill.ca/indexes/ads/?AdsID=1026860' |
> ./htdig - -s -v -m -c /www/htdig/install/conf/ads.conf
>
> ht://dig Start Time: Thu Nov 13 16:36:02 2003
>
> New server: newfind.mcgill.ca, 80
> 0:11472:0:http://newfind.mcgill.ca/indexes/ads/?AdsID=1026860: (changed) size = 660
> htdig: Run complete
> htdig: 1 server seen:
> htdig: newfind.mcgill.ca:80 1 document
>
> HTTP statistics
> ===============
> Persistent connections : Yes
> HEAD call before GET : Yes
> Connections opened : 2
> Connections closed : 1
> Changes of server : 0
> HTTP Requests : 3
> HTTP KBytes requested : 0.442383
> HTTP Average request time : 0 secs
> HTTP Average speed : inf KBytes/secs
>
> ht://dig End Time: Thu Nov 13 16:36:03 2003
>
> So although this thing has been purged and re-entered, it no longer
> shows up in the query results. Also, it seems that the dbs aren't being
> updated after the htpurge and htdig.
> Again more output from my konsole
> (note the moddates and filesizes - also, the filesize of db.docdb
> doesn't change between the purge and re-index):
>
> [EMAIL PROTECTED] bin]# ls -ltr /www/htdig/install/var/ads
> total 1584
> -rw-r--r-- 1 root root  24576 Nov 13 13:35 db.excerpts.work
> -rw-r--r-- 1 root root  24576 Nov 13 13:35 db.docs.index.work
> -rw-r--r-- 1 root root  24576 Nov 13 13:35 db.docdb.work
> -rw-r--r-- 1 root root  16384 Nov 13 16:14 db.words.db_weakcmpr
> -rw-r--r-- 1 root root 619520 Nov 13 16:36 db.words.db
> -rw-r--r-- 1 root root 655360 Nov 13 16:36 db.excerpts
> -rw-r--r-- 1 root root 172032 Nov 13 16:36 db.docs.index
> -rw-r--r-- 1 root root 344064 Nov 13 16:38 db.docdb
>
> [EMAIL PROTECTED] bin]# ./htpurge -c /www/htdig/install/conf/ads.conf -u
> http://newfind.mcgill.ca/indexes/ads/?AdsID=1025825
>
> [EMAIL PROTECTED] bin]# echo 'http://newfind.mcgill.ca/indexes/ads/?AdsID=1025825' |
> ./htdig - -s -v -m -c /www/htdig/install/conf/ads.conf
>
> ht://dig Start Time: Thu Nov 13 17:05:14 2003
>
> New server: newfind.mcgill.ca, 80
> 0:11475:0:http://newfind.mcgill.ca/indexes/ads/?AdsID=1025825: (changed) size = 336
> htdig: Run complete
> htdig: 1 server seen:
> htdig: newfind.mcgill.ca:80 1 document
>
> HTTP statistics
> ===============
> Persistent connections : Yes
> HEAD call before GET : Yes
> Connections opened : 2
> Connections closed : 1
> Changes of server : 0
> HTTP Requests : 3
> HTTP KBytes requested : 0.442383
> HTTP Average request time : 0 secs
> HTTP Average speed : inf KBytes/secs
>
> ht://dig End Time: Thu Nov 13 17:05:14 2003
>
> [EMAIL PROTECTED] bin]# ls -ltr /www/htdig/install/var/ads
> total 1584
> -rw-r--r-- 1 root root  24576 Nov 13 13:35 db.excerpts.work
> -rw-r--r-- 1 root root  24576 Nov 13 13:35 db.docs.index.work
> -rw-r--r-- 1 root root  24576 Nov 13 13:35 db.docdb.work
> -rw-r--r-- 1 root root  16384 Nov 13 16:14 db.words.db_weakcmpr
> -rw-r--r-- 1 root root 619520 Nov 13 16:36 db.words.db
> -rw-r--r-- 1 root root 655360 Nov 13 16:36 db.excerpts
> -rw-r--r-- 1 root root 172032 Nov 13 16:36 db.docs.index
> -rw-r--r-- 1 root root 344064 Nov 13 17:05 db.docdb
>
>
> So, any info or help on this would be much appreciated.
>
> Cheers,
>
> Chris
>
> --
> Christopher Murtagh
> Enterprise Systems Administrator
> ISR / Web Communications Group
> McGill University
> Montreal, Quebec
> Canada
>
> Tel.: (514) 398-3122
> Fax: (514) 398-2017

Neal Richter
Knowledgebase Developer
RightNow Technologies, Inc.
Customer Service for Every Web Site
Office: 406-522-1485

_______________________________________________
ht://Dig Developer mailing list:
[EMAIL PROTECTED]
List information (subscribe/unsubscribe, etc.)
https://lists.sourceforge.net/lists/listinfo/htdig-dev