At 9:50 AM -0600 12/8/99, Gilles Detillieux wrote:
>I'd like to get a handle on this too. However, there may be legitimate
>reasons for word frequencies to change as a result of fixes/enhancements
>to the parser, as is the case with the two changes in 3.1.4, namely
>the bare ampersand bug fix, and the handling of img alt text, which
>can both lead to increased word frequencies. Another fix in 3.1.4,
>to avoid indexing meta keywords or descriptions while under a "noindex"
>condition, would instead serve to decrease the word frequencies. Any
>test suite would have to allow for this.
Well, I'm not really talking about a test suite. To me, that implies a
"pass/fail" kind of scoring. I'd expect we would have to follow up on
differences manually. But at least we'd know if, for example, a
version suddenly ignored a whole bunch of documents!
>I think a maindocs snapshot would make a good test suite. By something
>else, I assume you mean something like find and grep. It would also
>make sense to have a snapshot of earlier htdig/htsearch output to compare
>against, but again, you'd have to account for legitimate changes.
Right, I'm talking about a simple form of regression testing, so we
can compare new versions to old ones. Yes, I'm talking about using
find and/or a Perl script to "walk" the pages and give us a count.
And yes, grep would probably be the best way to count the number of
matches. We could then compare just about any version against that
baseline.
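
To make the idea concrete, here is a minimal sketch of the "walk the
pages and count" step, written in Python rather than Perl. The function
name count_matches and the list of file extensions are made up for
illustration, and nothing here is part of ht://Dig. It counts matches in
the raw file text, much like grep would, so words inside markup are
counted too and the numbers are only a rough baseline to compare
against what a given htdig/htsearch version reports.

```python
import os
import re


def count_matches(root, word, extensions=(".html", ".htm")):
    """Walk root and return (documents containing word, total occurrences).

    Hypothetical helper: a whole-word, case-insensitive match against the
    raw file contents, roughly what find plus grep would give.
    """
    pattern = re.compile(r"\b" + re.escape(word) + r"\b", re.IGNORECASE)
    docs = 0
    hits = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            # Skip anything that is not one of the assumed page types.
            if not name.lower().endswith(extensions):
                continue
            path = os.path.join(dirpath, name)
            # latin-1 never fails to decode, which suits a rough count.
            with open(path, encoding="latin-1") as fh:
                n = len(pattern.findall(fh.read()))
            if n:
                docs += 1
                hits += n
    return docs, hits
```

Running this once over a fixed snapshot of the pages for a handful of
words gives numbers to keep on file. A large drop in the document count
for some later version would be the cue to go and look by hand.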
-Geoff