> Yes, it works. As of release version 3.1.0 (or 3.1.1)
Hmm, it doesn't work for me...
> >I'm trying to use the noindex_start, noindex_stop options to eliminate
> >some HTML code from digging, but with no success. Does this really work?
> >I've setup a test page at http://www.tu-chemnitz.de/~fri/test/htdig.html
> >and tried to ignore words within [...]
> ><!--htdig-noindex--> ... but with no success.
This page contains:
<!--htdig-noindex-->
htdig - don't dig this silly text!
<!--/htdig-noindex-->
"noindex_start" and "..._stop" aren't defined in htdig-build.conf,
so the default (<!--htdig-noindex-->) should apply.
htdigging:
htdig -vvvvvvvvv -i -l -t -s -c ../conf/htdig-build.conf
..
0:0:0:http://www.tu-chemnitz.de/~fri/test/htdig.html: Retrieval command
for http://www.tu-chemnitz.de/~fri/test/htdig.html: GET
/~fri/test/htdig.html HTTP/1.0
User-Agent: htdig/3.1.1 ([EMAIL PROTECTED])
Host: www.tu-chemnitz.de
Header line: HTTP/1.1 200 OK
..
returnStatus = 0
Read 459 from document
Read a total of 459 bytes
Tag: HTML>, matched -1
Tag: HEAD>, matched -1
Tag: TITLE>, matched 0
word: FTP-Archive@52
Tag: /TITLE>, matched 1
title: FTP-Archive
..
Tag: /H1>, matched 10
word: htdig@758
word: don't@775
word: dig@788
word: this@797
word: silly@808
word: text!@821
word: Does@838
word: this@849
word: work@860
Tag: /BODY>, matched -1
Tag: /HTML>, matched -1
head: dummy dummy dummystyle dummystyle Willkommen auf der Testseite!
htdig - don't dig this silly text! Does this work?
size = 459
pick: www.tu-chemnitz.de, # servers = 1
htdig: Run complete
htdig: 1 server seen:
htdig: www.tu-chemnitz.de:80 1 document
Then merge:
% htmerge -vvvvvvvv -s -c $DIR/conf/htdig-build.conf
htmerge: Sorting...
htmerge: Merging...
htmerge: Total word count: 12
htmerge: Total documents: 1
htmerge: Total doc db size (in K): 0
db.wordlist contains 13 (not 12!) words - WITH the words inside
<!--htdig-noindex-->:
auf i:0 l:684 w:1580
dig i:0 l:788 w:212
does i:0 l:838 w:162
dont i:0 l:775 w:225
dummy i:0 l:302 w:1383 c:2
dummystyle i:0 l:368 w:1240 c:2
ftparchive i:0 l:52 w:94800
htdig i:0 l:758 w:242
silly i:0 l:808 w:192
testseite i:0 l:701 w:1495
text i:0 l:821 w:179
willkommen i:0 l:660 w:1700
work i:0 l:860 w:140
And htsearching for "silly" is successful:
http://www.tu-chemnitz.de/cgi-bin/htsearch?words=silly&method=or&format=builtin-long&config=htdig-test
So the words inside <!--htdig-noindex--> ... <!--/htdig-noindex--> are not left out.
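To make the expectation concrete, here is a minimal Python sketch of what I
would expect the markers to do. It is not htdig's actual parser, only an
illustration: the function name strip_noindex and its parameters are
hypothetical, the marker strings are the ones from the test page, and the
sample page text is a shortened assumption based on the "head:" excerpt above.

```python
def strip_noindex(text,
                  start="<!--htdig-noindex-->",
                  stop="<!--/htdig-noindex-->"):
    """Return text with every start...stop span removed (hypothetical helper)."""
    kept = []
    pos = 0
    while True:
        begin = text.find(start, pos)
        if begin == -1:
            # No further start marker: keep the rest of the document.
            kept.append(text[pos:])
            break
        kept.append(text[pos:begin])
        end = text.find(stop, begin + len(start))
        if end == -1:
            # Unterminated marker: drop everything up to the end.
            break
        pos = end + len(stop)
    return "".join(kept)


# Assumed, shortened version of the test page body.
page = (
    "Willkommen auf der Testseite!\n"
    "<!--htdig-noindex-->\n"
    "htdig - don't dig this silly text!\n"
    "<!--/htdig-noindex-->\n"
    "Does this work?\n"
)

# Words that should remain once the marked span is skipped.
words = strip_noindex(page).split()
```

With this behaviour, "htdig", "don't", "dig", "silly" and "text!" would never
reach the word list, and only the second "this" (from "Does this work?") would
remain.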
> >Any hints available?
- Frank
--
Email: [EMAIL PROTECTED] http://www.tu-chemnitz.de/~fri/
Work: Computing Services, Chemnitz University of Technology, Germany