By "exact scoring formula" I meant whether you use different weights for different headings, and how you calculate wscore. I found the answer by going through the code.
Sorry for sending a personal mail. (The reason was fear of getting "disk quota exceeded" and, of course, ignorance.)

Thank you.

On Wed, 13 Feb 2002, Geoff Hutchison wrote:

> Asking about the "exact scoring formula" is a bit strange. In short,
> htsearch in the 3.2 code scores on the fly, adding up the weight of all
> occurrences of a word in a document. (So if a word is considered a
> heading, it gets the weight of the heading_factor variable.) This is
> then added to any ratings from the date_factor and backlink_factor and
> other URL-based weightings which are turned off by default.
>
> I hope that answers your question and I apologize for not writing
> sooner, but you may want to see the FAQ about e-mailing people directly,
> specifically:
> <http://www.htdig.org/FAQ.html#q1.16>
> <http://www.htdig.org/FAQ.html#q1.4>
>
> --
> -Geoff Hutchison
> Williams Students Online
> http://wso.williams.edu/
>
> On Sunday, February 3, 2002, at 05:59 AM, T.Srikanth wrote:
>
> > Hi,
> > I am doing a project in which an efficient search is to be
> > implemented on educational material (PDF, PS, DOC, PPT).
> >
> > I am using htdig as the search engine. (ht://Dig 3.2.0b3)
> >
> > I used external parsers (pdftohtml, antiword) to
> > convert the above formats to text and while doing this,
> > I am storing the font information as well. The idea
> > is to use this font information to achieve better search.
> > So I used the heading option of the external parsers
> > to assign weight to a word.
> > But this does not seem to work well.
> >
> > Can you give me the exact scoring formula that is used
> > by htsearch, so that I can improve the performance.
> >
> > Thanking you in anticipation.
> >
> > Srikanth.
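
For anyone finding this thread later, the scoring Geoff describes can be sketched roughly as below. This is only an illustration in Python, not the htsearch code: the Occurrence type, the function names, the text_factor weight for non-heading words, the date_rating and backlink_rating parameters, and all numeric values are assumptions made up for the example. Only heading_factor, date_factor and backlink_factor are named in the thread.

```python
from dataclasses import dataclass


@dataclass
class Occurrence:
    """One word occurrence in a document (hypothetical type for this sketch)."""
    word: str
    is_heading: bool  # True if the parser marked the word as heading text


def word_score(occurrences, query_word, heading_factor=5.0, text_factor=1.0):
    """Add up the weight of every occurrence of query_word in one document.

    The weights are illustrative assumptions, not ht://Dig defaults.
    """
    total = 0.0
    for occ in occurrences:
        if occ.word != query_word:
            continue
        # A heading occurrence gets heading_factor; anything else gets the
        # assumed plain-text weight.
        total += heading_factor if occ.is_heading else text_factor
    return total


def document_score(occurrences, query_word,
                   date_rating=0.0, backlink_rating=0.0, **factors):
    """Word score plus the URL-based ratings.

    date_rating and backlink_rating stand in for whatever date_factor and
    backlink_factor contribute; they default to 0.0 here, mirroring the
    remark that these weightings are turned off by default.
    """
    return word_score(occurrences, query_word, **factors) + date_rating + backlink_rating


doc = [
    Occurrence("search", is_heading=True),
    Occurrence("search", is_heading=False),
    Occurrence("engine", is_heading=False),
]
print(document_score(doc, "search"))  # 5.0 (heading) + 1.0 (text) = 6.0
```

With this shape, a word that an external parser marks as a heading only pulls ahead if heading_factor is noticeably larger than the weight given to ordinary text, which may be worth checking when heading weighting "does not seem to work well".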

