> Therefore we would need an interface where we could change the lucene
> document boost factor during runtime. For example, a document's ranking
> could be based on:
>   links pointing to that document (like Google)
>   last modification date,
>   size of the document,
>   term frequency,
>   how often was it displayed by other users, sending the same query
>   terms to the system
>   .....

4 of these 5 are based on a pre-calculated document value/weight/score
(I don't exactly understand what term frequency means in this context).
If I could assign a value to every document (as I proposed in a mail), we
could start to implement some algorithms to calculate different values
(for example, calculating link popularity/page rank needs a matrix
inversion that isn't too simple).

> Let me know if you find that idea interesting, I would like to work on
> that topic.

I find it very interesting.

peter

On 4/13/02 6:05 AM, "Bernhard Messer" <[EMAIL PROTECTED]> wrote:

> > the topic you are focusing on is a never ending story in content
> > retrieval in general. There is no perfect solution which fits in every
> > environment. Retrieving a document's context based on a single query
> > term seems to be very difficult also. In Lucene it isn't very
> > difficult to change the ranking algorithm. If you don't like the field
> > normalization, you could comment out the following line in the
> > TermScorer class.
> >
> > score *= Similarity.norm(norms[d]);
> >
> > If you put a comment around this line, your scoring is based on the
> > term frequency.
> >
> > If more people are interested, we could think about a little bit more
> > flexible ranking system within Lucene. There would be several
> > parameters from the environment which could be used to rank a document.
> > Therefore we would need an interface where we could change the lucene
> > document boost factor during runtime. For example, a document's ranking
> > could be based on:
> >   links pointing to that document (like Google)
> >   last modification date,
> >   size of the document,
> >   term frequency,
> >   how often was it displayed by other users, sending the same query
> >   terms to the system
> >   .....
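
As a rough illustration of the per-document value idea discussed above, here is a minimal Java sketch. It does not use any Lucene API: StaticRankSketch, DocSignals, Hit, staticWeight, combine and rerank are all hypothetical names, and the coefficients are arbitrary assumptions chosen only to show the shape of the calculation. It covers three of the listed signals (inbound links, last modification date, display count); document size could be folded in the same way.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Hypothetical sketch: none of these names are Lucene APIs. */
final class StaticRankSketch {

    /** Precomputed, query-independent signals for one document (assumed inputs). */
    static final class DocSignals {
        final int inboundLinks;    // links pointing to the document
        final double ageInDays;    // days since last modification
        final int timesDisplayed;  // how often users viewed it for similar queries

        DocSignals(int inboundLinks, double ageInDays, int timesDisplayed) {
            this.inboundLinks = inboundLinks;
            this.ageInDays = ageInDays;
            this.timesDisplayed = timesDisplayed;
        }
    }

    /** One search hit: a document id, its query-time score, and its stored signals. */
    static final class Hit {
        final String docId;
        final double queryScore;
        final DocSignals signals;

        Hit(String docId, double queryScore, DocSignals signals) {
            this.docId = docId;
            this.queryScore = queryScore;
            this.signals = signals;
        }
    }

    /** Folds the signals into one weight >= 1.0; the coefficients are arbitrary. */
    static double staticWeight(DocSignals s) {
        double linkPart = Math.log1p(s.inboundLinks);          // dampen large link counts
        double viewPart = Math.log1p(s.timesDisplayed);        // dampen large view counts
        double freshness = 1.0 / (1.0 + s.ageInDays / 365.0);  // 1.0 when new, decays with age
        return 1.0 + 0.5 * linkPart + 0.25 * viewPart + 0.25 * freshness;
    }

    /** Multiplies the query-dependent score by the precomputed weight. */
    static double combine(double queryScore, DocSignals s) {
        return queryScore * staticWeight(s);
    }

    /** Returns a copy of the hits ordered by combined score, best first. */
    static List<Hit> rerank(List<Hit> hits) {
        List<Hit> sorted = new ArrayList<>(hits);
        sorted.sort(Comparator
                .comparingDouble((Hit h) -> combine(h.queryScore, h.signals))
                .reversed());
        return sorted;
    }

    public static void main(String[] args) {
        List<Hit> hits = new ArrayList<>();
        hits.add(new Hit("a", 0.8, new DocSignals(0, 0.0, 0)));
        hits.add(new Hit("b", 0.6, new DocSignals(100, 0.0, 0)));
        for (Hit h : rerank(hits)) {
            System.out.println(h.docId + " " + combine(h.queryScore, h.signals));
        }
    }
}
```

Because the weight depends only on the document, it can be calculated offline and stored once per document, and only the final multiplication has to happen at query time. Multiplying is just one possible way to combine the two values; an additive or otherwise weighted combination would fit the same structure.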

--
To unsubscribe, e-mail: <mailto:[EMAIL PROTECTED]>
For additional commands, e-mail: <mailto:[EMAIL PROTECTED]>
