According to Quim Sanmarti:
> From: ... On behalf of Geoff Hutchison ...
>
> > At 1:12 PM +1300 1/3/02, Jamie Anstice wrote:
> > >phrase searching where if the phrase being queried contains a
> > >stop-word then it doesn't match even when there's a match in the
> > >database.
> >
> > Yeah, this is a known bug. The fix isn't as bad as you make it out to
> > be if you allow for false positive matches. You do the query, keeping
> > track of the word offsets in the phrase query itself. So "Foo and Bar
> > Esquire" would search for Bar as +2 and Esquire as +3 relative to
> > Foo. Some cases will match with some other word replacing the "and,"
> > but you certainly won't miss any.
>
> Hmm. htdig doesn't index like that right now. Wouldn't it be easier to
> *remove* stop-words from the query?
> This is relatively easy to do with the new parser (I haven't checked the old
> one :), and would give similar results.
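For anyone following along, the offset-tracking idea Geoff describes could be sketched roughly as below. This is only a toy illustration in Python, not htdig's actual code or index format: the stop list, the in-memory index layout, and the names `build_index` and `phrase_search` are all made up for the example. It assumes word positions are counted with stop words included, even though the stop words themselves are not indexed.

```python
# Hypothetical stop list for illustration only (not htdig's bad_words file).
STOP_WORDS = {"and", "the", "of", "a"}


def tokenize(text):
    """Lowercase and split on whitespace (no punctuation handling)."""
    return text.lower().split()


def build_index(docs):
    """Map each non-stop word to {doc_id: set of word positions}.

    Stop words are skipped, but they still consume a position, so the
    offsets between the remaining words are preserved.
    """
    index = {}
    for doc_id, text in docs.items():
        for pos, word in enumerate(tokenize(text)):
            if word in STOP_WORDS:
                continue
            index.setdefault(word, {}).setdefault(doc_id, set()).add(pos)
    return index


def phrase_search(index, phrase):
    """Return the ids of documents matching the phrase, ignoring stop words.

    Each non-stop query word keeps its offset within the phrase, and a
    document matches if every such word occurs at the same relative offset
    from the first one. Whatever sits in a stop-word slot is not checked,
    so false positives are possible.
    """
    terms = [(off, w) for off, w in enumerate(tokenize(phrase))
             if w not in STOP_WORDS]
    if not terms:
        return set()
    first_off, first_word = terms[0]
    hits = set()
    for doc_id, positions in index.get(first_word, {}).items():
        for start in positions:
            if all((start + off - first_off) in index.get(w, {}).get(doc_id, ())
                   for off, w in terms[1:]):
                hits.add(doc_id)
                break
    return hits


docs = {
    1: "Foo and Bar Esquire",
    2: "Foo or Bar Esquire",   # "or" sits where "and" was
    3: "Foo Bar Esquire",      # no word between Foo and Bar
}
index = build_index(docs)
result = phrase_search(index, "Foo and Bar Esquire")
```

Here `result` contains documents 1 and 2: document 2 is the kind of false positive Geoff mentions, with another word in place of "and", while document 3 is rejected because Bar is at +1 rather than +2.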
Maybe I'm confused here, but isn't this the bug that was fixed back in May
by a simple patch to the parser (a missing return statement or something
like that)? Jamie, you didn't mention which 3.2 distribution you're using,
but anything older than the July 3.2.0b4 snapshots is hopelessly buggy, so
if you haven't already done so I recommend updating to the latest 3.2.0b4
snapshot (http://www.htdig.org/files/snapshots/).

The latest snapshots (both 3.1.6 and 3.2.0b4) also have improved handling
of noindex tags in the HTML parser, to prevent conflicts between different
pairs of tags. They also feature handling of <noindex follow>...</noindex>
tags, which is akin to one of the features you (Jamie) requested. I'm not
sure why you want or need multiple levels of noindex tags, though, other
than maybe to work around bugs in the pre-August parser.

-- 
Gilles R. Detillieux              E-mail: <[EMAIL PROTECTED]>
Spinal Cord Research Centre       WWW:    http://www.scrc.umanitoba.ca/~grdetil
Dept. Physiology, U. of Manitoba  Phone:  (204)789-3766
Winnipeg, MB  R3E 3J7  (Canada)   Fax:    (204)789-3930

