Sorry Claude, but I have some trouble following what you are doing with your CustomScoreQuery. It feels like your query is doing something that breaks some assumptions that Lucene makes.
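As a minimal, hypothetical illustration of the kind of assumption that can break (plain Java, no Lucene classes; `OnlyOnceScoreDemo`, `statefulScorer` and `statelessScorer` are made-up names): if the score of a document is requested more than once, which is what the customScore log quoted in this thread suggests, a scorer that remembers earlier calls returns a different value the second time, while a stateless one does not.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.function.IntToDoubleFunction;

class OnlyOnceScoreDemo {

    // Hypothetical stand-in for an "only once" scorer: the first call for a
    // document id returns the field score, every later call returns 0.
    static IntToDoubleFunction statefulScorer(double fieldScore) {
        Set<Integer> seen = new HashSet<>();
        // Set.add returns true only the first time an id is added.
        return docId -> seen.add(docId) ? fieldScore : 0.0;
    }

    // Stateless variant: the same document always yields the same score,
    // no matter how many times it is asked for.
    static IntToDoubleFunction statelessScorer(double fieldScore) {
        return docId -> fieldScore;
    }

    public static void main(String[] args) {
        IntToDoubleFunction stateful = statefulScorer(2.5);
        IntToDoubleFunction stateless = statelessScorer(2.5);

        // Simulate a caller that asks for the score of document 7 twice,
        // for example once to build a sort key and once to report the hit.
        System.out.println(stateful.applyAsDouble(7) + " then " + stateful.applyAsDouble(7));   // 2.5 then 0.0
        System.out.println(stateless.applyAsDouble(7) + " then " + stateless.applyAsDouble(7)); // 2.5 then 2.5
    }
}
```

With the stateful variant, whichever caller asks second sees 0, which would be consistent with a sort key that differs from the reported score.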
Have you looked at existing ways that Lucene supports boosting documents by recency, such as putting a LongDistanceFeatureQuery as a SHOULD clause in a BooleanQuery?

On Mon, Mar 14, 2022 at 7:00 PM Claude Lepere <claudelep...@gmail.com> wrote:
>
> Adrien, thank you for your answer and sorry for the lack of clarity.
>
> No, the score of a document does not depend on the score of another
> document, the problem lies within a document.
>
> There are several "only once score" fields; to simplify, I suppose there is
> only one "only once score" field;
> a document can contain several times this "only once score" field with
> different values;
> a query can contain several clauses on the different values of this field
> and these clauses can be SHOULD or MUST.
> But for such a document, the score of this field should only be counted on
> the first pass through my CustomScoreQuery subclass, on subsequent passes,
> the custom score = 0 ;
> to process so, the constructor of the subclass has as argument the map "my
> document id (not Lucene doc!) to the field".
>
> Then, the score of the first pass is multiplied by a date factor which
> depends on the age of the document (age = maximum date of the query results
> - date of the document):
> the score of a document decreases with its age.
>
> The total score (field + date) is correctly calculated, but the explanation
> log shows that the sort score (the first element of fields[]) is not the
> total score but the total score minus the "only once score" or to put it
> another way, a total score where the "only once score" = 0, and that's why
> a hit with a lower total score happens to be ranked before a hit with a
> higher total score.
>
> The log of my CustomScoreQuery subclass shows that even if the document
> contains only one "only once score" field,
> Lucene passes the CustomScoreProvider's customScore method twice, so the
> score = 0 and it seems to me that this value is retained for the sort score.
>
> I did not find why a TopFieldDocs search (with Sort = SortField.FIELD_SCORE
> and date) uses the "diminished" score and not the total score, as TopDocs
> does.
>
>
> Thanks in advance.
>
>
> Claude Lepère
>
> On 2022/03/14 12:59:45 Adrien Grand wrote:
> > It's a bit hard for me to parse what you are trying to do, but it
> > looks like you are making assumptions about how Lucene works
> > internally that are not correct.
> >
> > Do I understand correctly that your scoring mechanism has dependencies
> > on other documents, ie. the score of a document could depend on the
> > score of other documents? This is something that Lucene doesn't
> > support.
> >
> > On Thu, Mar 10, 2022 at 12:23 PM Claude Lepere <cl...@gmail.com> wrote:
> > >
> > > Hi.
> > > The problem is that although sorting by score a match with a lower score is
> > > ranked before a match with a greater score.
> > > The origin of the problem lies in a subclass of CustomScoreQuery which
> > > calculates an "only once" score for each document: on the first pass the
> > > document gets its score and, if the document contains several times the
> > > same field, on the subsequent passes it gets 0.
> > > I wonder if it is possible for Lucene to give a score that depends on a
> > > previous pass in the CustomScoreProvider customScore routine for the same
> > > document.
> > > I ran 2 searches with IndexSearcher: the first one returns a TopDocs which
> > > is sorted by default by relevance, and the second search - with the Sort
> > > array = [SortField.FIELD_SCORE, a date SortField] argument - returns a
> > > TopFieldDocs.
> > > The TopDocs results are sorted by the score with the first pass value of
> > > the only once method while the TopFieldDocs results are sorted by the score
> > > with the value (= 0) of the next pass, hence the ranking errors.
> > > I did not find why does the TopFieldDocs search not use to sort the score
> > > of the hit, as the TopDocs search?
> > > I did not find how to tell the TopFieldDocs search to use the hit score to
> > > sort.
> > >
> > > Claude Lepère
> > >
> > >
> > --
> > Adrien
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
> > For additional commands, e-mail: java-user-h...@lucene.apache.org
> >
> >

--
Adrien
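
To make the recency suggestion at the top of this message more concrete, here is a rough sketch of the decay a distance feature query applies, assuming the formula given in the Lucene javadocs: boost * pivotDistance / (pivotDistance + |value - origin|). It is plain Java that only mirrors the arithmetic and is not Lucene's implementation; `RecencyDecaySketch`, `distanceFeatureScore`, the timestamp and the 30-day pivot are illustrative assumptions. In recent Lucene versions such a query is typically obtained from `LongPoint.newDistanceFeatureQuery` and added to a `BooleanQuery` with `Occur.SHOULD`, though the exact entry point depends on the version in use.

```java
class RecencyDecaySketch {

    // Mirrors the documented scoring formula of a distance feature query:
    // boost * pivotDistance / (pivotDistance + |value - origin|).
    // The score is a pure function of the document's own value, so calling
    // it any number of times for the same document gives the same result.
    static double distanceFeatureScore(double boost, long origin, long pivotDistance, long value) {
        double distance = Math.abs((double) value - (double) origin);
        return boost * pivotDistance / (pivotDistance + distance);
    }

    public static void main(String[] args) {
        long dayMs = 24L * 60 * 60 * 1000;
        long origin = 1_647_216_000_000L; // hypothetical "newest" date, epoch milliseconds
        long pivot = 30 * dayMs;          // hypothetical pivot: the score halves at 30 days

        System.out.println(distanceFeatureScore(1.0, origin, pivot, origin));             // 1.0
        System.out.println(distanceFeatureScore(1.0, origin, pivot, origin - pivot));     // 0.5
        System.out.println(distanceFeatureScore(1.0, origin, pivot, origin - 3 * pivot)); // 0.25
    }
}
```

Because the recency part is added as a separate optional clause, it contributes to the same score that both a relevance-sorted search and a sort on SortField.FIELD_SCORE would see.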