> Yes, it works. As of release version 3.1.0 (or 3.1.1) 

Hmm, it doesn't work for me...

> >I'm trying to use the noindex_start, noindex_stop options to eliminate
> >some HTML code from digging, but with no success. Does this really work?
> >I've setup a test page at http://www.tu-chemnitz.de/~fri/test/htdig.html
> >and tried to ignore words within [...] 
> ><!--htdig-noindex--> ... but with no success.

This page contains:
<!--htdig-noindex-->
        htdig - don't dig this silly text! 
<!--/htdig-noindex-->

"noindex_start" and "..._stop" aren't set in htdig-build.conf, so the
default (which should be <!--htdig-noindex-->) ought to apply.

htdigging:
htdig -vvvvvvvvv -i -l -t -s -c ../conf/htdig-build.conf
..
0:0:0:http://www.tu-chemnitz.de/~fri/test/htdig.html: Retrieval command
for http://www.tu-chemnitz.de/~fri/test/htdig.html: GET
/~fri/test/htdig.html HTTP/1.0
User-Agent: htdig/3.1.1 ([EMAIL PROTECTED])
Host: www.tu-chemnitz.de

Header line: HTTP/1.1 200 OK
..
returnStatus = 0
Read 459 from document
Read a total of 459 bytes
Tag: HTML>, matched -1
Tag: HEAD>, matched -1
Tag: TITLE>, matched 0
word: FTP-Archive@52
Tag: /TITLE>, matched 1

title: FTP-Archive
..
Tag: /H1>, matched 10
word: htdig@758
word: don't@775
word: dig@788
word: this@797
word: silly@808
word: text!@821
word: Does@838
word: this@849
word: work@860
Tag: /BODY>, matched -1
Tag: /HTML>, matched -1
head:   dummy dummy dummystyle dummystyle Willkommen auf der Testseite!
htdig - don't dig this silly text! Does this work? 
 size = 459
pick: www.tu-chemnitz.de, # servers = 1
htdig: Run complete
htdig: 1 server seen:
htdig:     www.tu-chemnitz.de:80 1 document

Then merge:
% htmerge -vvvvvvvv -s -c $DIR/conf/htdig-build.conf

htmerge: Sorting...
htmerge: Merging...
htmerge: Total word count: 12
htmerge: Total documents: 1
htmerge: Total doc db size (in K): 0

db.wordlist contains 13 (not 12!) words - WITH the words inside
<!--htdig-noindex-->:
auf     i:0     l:684   w:1580
dig     i:0     l:788   w:212
does    i:0     l:838   w:162
dont    i:0     l:775   w:225
dummy   i:0     l:302   w:1383  c:2
dummystyle      i:0     l:368   w:1240  c:2
ftparchive      i:0     l:52    w:94800
htdig   i:0     l:758   w:242
silly   i:0     l:808   w:192
testseite       i:0     l:701   w:1495
text    i:0     l:821   w:179
willkommen      i:0     l:660   w:1700
work    i:0     l:860   w:140
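
To double-check the list above without reading it by eye, here is a rough
Python sketch. It is only an illustration: the helper names (normalize,
wordlist_words, leaked_words) are made up, and the normalization rule is my
guess from how "don't" shows up as "dont" in the dump, not htdig's actual
word rule. It counts the entries in the dump and reports which words from
the marked-off text were indexed anyway.

```python
import re

# Text between the noindex markers on the test page (copied from above).
EXCLUDED_TEXT = "htdig - don't dig this silly text!"

# The db.wordlist dump shown above, one entry per line.
WORDLIST_DUMP = """\
auf     i:0     l:684   w:1580
dig     i:0     l:788   w:212
does    i:0     l:838   w:162
dont    i:0     l:775   w:225
dummy   i:0     l:302   w:1383  c:2
dummystyle      i:0     l:368   w:1240  c:2
ftparchive      i:0     l:52    w:94800
htdig   i:0     l:758   w:242
silly   i:0     l:808   w:192
testseite       i:0     l:701   w:1495
text    i:0     l:821   w:179
willkommen      i:0     l:660   w:1700
work    i:0     l:860   w:140
"""


def normalize(token):
    """Lowercase and strip non-alphanumerics (assumed rule, see note above)."""
    return re.sub(r"[^a-z0-9]", "", token.lower())


def wordlist_words(dump):
    """Return the first column (the word) of every non-empty dump line."""
    return [line.split()[0] for line in dump.splitlines() if line.strip()]


def leaked_words(excluded_text, dump):
    """Words from the excluded text that still appear in the word list."""
    indexed = set(wordlist_words(dump))
    candidates = {normalize(token) for token in excluded_text.split()}
    candidates.discard("")  # a lone "-" normalizes to an empty string
    return sorted(candidates & indexed)


if __name__ == "__main__":
    print("entries in dump:", len(wordlist_words(WORDLIST_DUMP)))
    print("indexed despite noindex:", leaked_words(EXCLUDED_TEXT, WORDLIST_DUMP))
```

On the dump above this reports 13 entries and finds dig, dont, htdig, silly
and text in the index. "this" is not in the word list at all, so it is not
reported.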

And htsearching for "silly" succeeds:
http://www.tu-chemnitz.de/cgi-bin/htsearch?words=silly&method=or&format=builtin-long&config=htdig-test
 
So these words inside <!--htdig-noindex-->... are not left out...

> >Any hints available? 

- Frank
-- 
Email: [EMAIL PROTECTED]  http://www.tu-chemnitz.de/~fri/
Work:  Computing Services,  Chemnitz University of Technology,  Germany
