Is it possible to set a document boost factor in the current CVS? And if not, is anybody working on it? I am VERY interested and would gladly test performance issues :-)
Christian

> -----Original Message-----
> From: Halácsy Péter [mailto:[EMAIL PROTECTED]]
> Sent: 13 April 2002 18:03
> To: Lucene Users List
> Subject: RE: Normalization of Documents
>
> > Therefore we would need an interface where we could change the Lucene
> > document boost factor during runtime. For example, a document's ranking
> > could be based on:
> > links pointing to that document (like Google)
> > last modification date,
> > size of the document,
> > term frequency,
> > how often was it displayed by other users, sending the same query
> > terms to the system
> > .....
>
> 4 of these 5 are based on a pre-calculated document value/weight/score
> (I don't exactly understand what term frequency means in this context).
> If I could assign a value to every document (as I proposed in a mail) we
> could start to implement some algorithm to calculate different values
> (for example, calculating link popularity/page rank needs a matrix
> inversion that isn't too simple)
>
> > Let me know if you find that idea interesting, I would like to work on
> > that topic.
> I find it very interesting.
>
> peter
>
> > On 4/13/02 6:05 AM, "Bernhard Messer" <[EMAIL PROTECTED]> wrote:
> >
> > > the topic you are focusing on is a never ending story in content
> > > retrieval in general. There is no perfect solution which fits in every
> > > environment. Retrieving a document's context based on a single query
> > > term seems to be very difficult also. In Lucene it isn't very
> > > difficult to change the ranking algorithm. If you don't like the field
> > > normalization, you could comment out the following line in the
> > > TermScorer class.
> > >
> > > score *= Similarity.norm(norms[d]);
> > >
> > > If you comment out this line, your scoring is based on the
> > > term frequency.
> > >
> > > If more people are interested, we could think about a somewhat more
> > > flexible ranking system within Lucene. There would be several
> > > parameters from the environment which could be used to rank a document.
> > > Therefore we would need an interface where we could change the Lucene
> > > document boost factor during runtime. For example, a document's ranking
> > > could be based on:
> > > links pointing to that document (like Google)
> > > last modification date,
> > > size of the document,
> > > term frequency,
> > > how often was it displayed by other users, sending the same query
> > > terms to the system
> > > .....

--
To unsubscribe, e-mail: <mailto:[EMAIL PROTECTED]>
For additional commands, e-mail: <mailto:[EMAIL PROTECTED]>
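
The idea discussed in this thread, assigning a pre-calculated value to every document and being able to change it at runtime, could be sketched roughly as follows. This is not Lucene code and does not use any Lucene API: the class `DocumentBoosts` and its methods `setBoost`, `getBoost` and `apply` are hypothetical names made up for illustration, and multiplying the raw score by the stored weight is only an assumption about how such a value might be combined with the term score.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, NOT part of Lucene: keeps one pre-calculated weight
// per document and applies it to a raw score at query time.
final class DocumentBoosts {
    // Document id -> weight. Documents without an entry use a neutral 1.0.
    private final Map<Integer, Float> boostByDoc = new HashMap<>();

    // Store or replace the weight of a document. It can be called again later,
    // for example after the link count or the modification date has changed.
    void setBoost(int docId, float boost) {
        if (Float.isNaN(boost) || boost < 0f) {
            throw new IllegalArgumentException("boost must be a non-negative number");
        }
        boostByDoc.put(docId, boost);
    }

    // Return the stored weight, or 1.0 if none was set for this document.
    float getBoost(int docId) {
        return boostByDoc.getOrDefault(docId, 1.0f);
    }

    // Assumption: the weight is combined with the raw score by multiplication.
    float apply(int docId, float rawScore) {
        return rawScore * getBoost(docId);
    }
}
```

With this sketch, a document whose weight is set to 2.0 has its raw score doubled, while a document that never received a weight keeps its raw score unchanged. How the weights themselves are calculated (links, modification date, size, display count) is left open here, as it is in the thread.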
