Questions:

1) If you run an htdump -w before and after the purge, do the db.docs
files differ?
    For me, they differ by one line: the URL I purged.  I do notice that
the db files don't seem to differ in size.
    But the deleted URL no longer shows up in search results for me
(after the purge).
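
The before/after comparison can be sketched roughly as follows. This is
only an illustration: it assumes the two htdump outputs were saved under
the hypothetical names db.docs.before and db.docs.after, and it assumes
nothing about the dump format beyond one record per line. The helper name
removed_lines is made up for this example.

```python
from pathlib import Path


def removed_lines(before_path, after_path):
    """Return lines found in the 'before' dump but missing from the 'after' dump.

    Both arguments are paths to plain-text dumps, one record per line.
    The order of the 'before' file is preserved in the result.
    """
    before = Path(before_path).read_text(errors="replace").splitlines()
    after = set(Path(after_path).read_text(errors="replace").splitlines())
    return [line for line in before if line not in after]


# Hypothetical usage, with file names chosen for this example only:
#   for line in removed_lines("db.docs.before", "db.docs.after"):
#       print(line)
```

If the purge behaves as described above, the result should contain a
single line, the one for the purged URL.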

    I'll investigate this further, but my gut feeling is that the record
in db.docdb is not being purged, but is instead 'changing state' to
Reference_Obsolete.
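
To illustrate why that would fit the symptoms, here is a toy model of a
record store that flags records instead of deleting them. It is not
ht://Dig code: the class and method names (ToyDocDB, purge, search) and
the State enum are hypothetical, and OBSOLETE merely stands in for the
Reference_Obsolete state mentioned above.

```python
from dataclasses import dataclass
from enum import Enum


class State(Enum):
    NORMAL = "normal"
    OBSOLETE = "obsolete"  # stand-in for Reference_Obsolete (assumption)


@dataclass
class Record:
    url: str
    state: State = State.NORMAL


class ToyDocDB:
    """Hypothetical store in which a purge flags a record rather than removing it."""

    def __init__(self, urls):
        self.records = {url: Record(url) for url in urls}

    def purge(self, url):
        # The record stays in the store; only its state changes.
        self.records[url].state = State.OBSOLETE

    def search(self):
        # Searches skip anything that is not in the NORMAL state.
        return [r.url for r in self.records.values() if r.state is State.NORMAL]
```

In such a model the number of stored records is the same before and after
a purge, while the purged URL drops out of the search results, which would
match an unchanged file size combined with a missing hit.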

    As for trying to re-add it: I can't get your htdig command to work
at all.  It fails for me, complaining that it can't find the default
htdig.conf file, even though I gave it the '-c' option.  This suggests
an error is occurring somewhere.

   I'll keep digging.

Thanks.

On 13 Nov 2003, Christopher Murtagh wrote:

> Greetings htdig folks,
>
>  Recently I've been trying to have htDig purge and re-index items (via a
> trigger in Postgres). The purge seems to work as I no longer see the
> item in the search results, however, when I try to re-index, I cannot
> bring the page back in unless I do a full index. I've just installed
> 3.2.0b5 hoping that this would help, but no luck. Here's some output
> from my command line attempts to get it to work:
>
>
> [EMAIL PROTECTED] bin]# ./htpurge -c /www/htdig/install/conf/ads.conf -u http://newfind.mcgill.ca/indexes/ads/?AdsID=10266860
>
> [EMAIL PROTECTED] bin]# echo 'http://newfind.mcgill.ca/indexes/ads/?AdsID=1026860' | ./htdig - -s -v -m -c /www/htdig/install/conf/ads.conf
>
> ht://dig Start Time: Thu Nov 13 16:36:02 2003
>
> New server: newfind.mcgill.ca, 80
> 0:11472:0:http://newfind.mcgill.ca/indexes/ads/?AdsID=1026860:  (changed)  size = 660
> htdig: Run complete
> htdig: 1 server seen:
> htdig:     newfind.mcgill.ca:80 1 document
>
> HTTP statistics
> ===============
>  Persistent connections    : Yes
>  HEAD call before GET      : Yes
>  Connections opened        : 2
>  Connections closed        : 1
>  Changes of server         : 0
>  HTTP Requests             : 3
>  HTTP KBytes requested     : 0.442383
>  HTTP Average request time : 0 secs
>  HTTP Average speed        : inf KBytes/secs
>
> ht://dig End Time: Thu Nov 13 16:36:03 2003
>
> So although this thing has been purged and re-entered, it no longer
> shows up in the query results. Also, it seems that the dbs aren't being
> updated after the htpurge and htdig. Again more output from my konsole
> (note the moddates and filesizes - also, the filesize of db.docdb
> doesn't change between the purge and re-index):
>
> [EMAIL PROTECTED] bin]# ls -ltr /www/htdig/install/var/ads
> total 1584
> -rw-r--r--    1 root     root        24576 Nov 13 13:35 db.excerpts.work
> -rw-r--r--    1 root     root        24576 Nov 13 13:35 db.docs.index.work
> -rw-r--r--    1 root     root        24576 Nov 13 13:35 db.docdb.work
> -rw-r--r--    1 root     root        16384 Nov 13 16:14 db.words.db_weakcmpr
> -rw-r--r--    1 root     root       619520 Nov 13 16:36 db.words.db
> -rw-r--r--    1 root     root       655360 Nov 13 16:36 db.excerpts
> -rw-r--r--    1 root     root       172032 Nov 13 16:36 db.docs.index
> -rw-r--r--    1 root     root       344064 Nov 13 16:38 db.docdb
>
> [EMAIL PROTECTED] bin]# ./htpurge -c /www/htdig/install/conf/ads.conf -u http://newfind.mcgill.ca/indexes/ads/?AdsID=1025825
>
> [EMAIL PROTECTED] bin]# echo 'http://newfind.mcgill.ca/indexes/ads/?AdsID=1025825' | ./htdig - -s -v -m -c /www/htdig/install/conf/ads.conf
>
> ht://dig Start Time: Thu Nov 13 17:05:14 2003
>
> New server: newfind.mcgill.ca, 80
> 0:11475:0:http://newfind.mcgill.ca/indexes/ads/?AdsID=1025825:  (changed)  size = 336
> htdig: Run complete
> htdig: 1 server seen:
> htdig:     newfind.mcgill.ca:80 1 document
>
> HTTP statistics
> ===============
>  Persistent connections    : Yes
>  HEAD call before GET      : Yes
>  Connections opened        : 2
>  Connections closed        : 1
>  Changes of server         : 0
>  HTTP Requests             : 3
>  HTTP KBytes requested     : 0.442383
>  HTTP Average request time : 0 secs
>  HTTP Average speed        : inf KBytes/secs
>
> ht://dig End Time: Thu Nov 13 17:05:14 2003
>
> [EMAIL PROTECTED] bin]# ls -ltr /www/htdig/install/var/ads
> total 1584
> -rw-r--r--    1 root     root        24576 Nov 13 13:35 db.excerpts.work
> -rw-r--r--    1 root     root        24576 Nov 13 13:35 db.docs.index.work
> -rw-r--r--    1 root     root        24576 Nov 13 13:35 db.docdb.work
> -rw-r--r--    1 root     root        16384 Nov 13 16:14 db.words.db_weakcmpr
> -rw-r--r--    1 root     root       619520 Nov 13 16:36 db.words.db
> -rw-r--r--    1 root     root       655360 Nov 13 16:36 db.excerpts
> -rw-r--r--    1 root     root       172032 Nov 13 16:36 db.docs.index
> -rw-r--r--    1 root     root       344064 Nov 13 17:05 db.docdb
>
>
> So, any info or help on this would be much appreciated.
>
> Cheers,
>
> Chris
>
> --
> Christopher Murtagh
> Enterprise Systems Administrator
> ISR / Web Communications Group
> McGill University
> Montreal, Quebec
> Canada
>
> Tel.: (514) 398-3122
> Fax:  (514) 398-2017
>
>
> _______________________________________________
> ht://Dig Developer mailing list:
> [EMAIL PROTECTED]
> List information (subscribe/unsubscribe, etc.)
> https://lists.sourceforge.net/lists/listinfo/htdig-dev
>

Neal Richter
Knowledgebase Developer
RightNow Technologies, Inc.
Customer Service for Every Web Site
Office: 406-522-1485



