Re: Index-time boosting: Deprecated setBoost method

baris . kazar Fri, 18 Oct 2019 12:14:35 -0700

Uwe,-

Two questions there:


I guess this is applicable to TextField, too.

And I was expecting an IndexWriter object in the example for index-time boosting.


Best regards


On 10/18/19 2:57 PM, Uwe Schindler wrote:

Sorry, I was imprecise. It's a mix of both. The factors are stored per document 
in the index (this is why I called it index time). At query time, the expression 
uses those index-time values and folds them into the query boost.
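
To make the "mix of both" concrete, here is a minimal plain-Java sketch. It is a toy model, not the Lucene API: a HashMap stands in for a per-document docvalues field, and the class name IndexTimeFactorDemo, its methods, and the 1 + ln(1 + clicks) formula are all assumptions chosen for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model only (hypothetical names): a map stands in for a docvalues field.
class IndexTimeFactorDemo {
    private final Map<Integer, Long> clickCounts = new HashMap<>();

    /** "Index time": store one numeric factor per document. */
    void indexDocument(int docId, long clickCount) {
        clickCounts.put(docId, clickCount);
    }

    /** "Query time": read the stored factor and fold it into the query score. */
    double score(int docId, double queryScore) {
        long clicks = clickCounts.getOrDefault(docId, 0L);
        // Assumed formula: a document with zero clicks keeps its original score.
        return queryScore * (1.0 + Math.log1p(clicks));
    }

    public static void main(String[] args) {
        IndexTimeFactorDemo index = new IndexTimeFactorDemo();
        index.indexDocument(1, 0L);
        index.indexDocument(2, 50L);
        System.out.println(index.score(1, 1.5));
        System.out.println(index.score(2, 1.5));
    }
}
```

The stored value is written once per document, while the formula that consumes it lives entirely on the query side and can be changed without touching the stored values.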

What's your problem with that approach?

Uwe

On October 18, 2019 6:50:40 PM UTC, baris.ka...@oracle.com wrote:

Uwe,-

  Thanks. If possible, I am looking for a pure Java methodology to do the
index-time boosting.

This example looks like a search-time boosting example:

https://lucene.apache.org/core/7_7_2/expressions/org/apache/lucene/expressions/Expression.html



Best regards

On 10/18/19 2:31 PM, Uwe Schindler wrote:

Hi,

Is there a working example for this? Is this mentioned in the Lucene
Javadocs or any other docs so that I can look it up?

To index the docvalues, see NumericDocValuesField (it can be added to
documents like indexed or stored fields). You may have used them for
sorting already.

This methodology seems to discourage using index-time boosting.

Not really. Many use this all the time. It's one of the killer
features of both Solr and Elasticsearch. The problem was how
Document.setBoost() worked (it did not work correctly, see below).

The previous setBoost method call was fine and easy to use.
Did it have some performance issues, and is that why it was
deprecated?

No, there were several reasons for deprecating it. setBoost
was not doing what the user expected. Internally the boost value
was just multiplied into the document norm factor (which is internally
also a docvalues field). The norm factors are only very imprecise
floats stored in a byte, so precision is poor. If you put some
values into it and the length norm was already consuming all bits, the
boosting was very coarse. The boost was also only ever multiplied in,
whereas most users want to record something like click counts in the
index and then boost with, for example, the logarithm or some other
function. If the boost is just multiplied into the length norm you have
no flexibility at all.
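
The precision problem can be illustrated with a small plain-Java sketch. This is a toy model and not Lucene's actual norm codec: it assumes a length norm of 1/sqrt(termCount) and models the "float stored in a byte" by keeping only the top 3 mantissa bits of the product. The class and method names are hypothetical.

```java
// Toy model only (hypothetical names); it does not reproduce Lucene's norm encoding.
class CoarseNormDemo {
    /** Keeps the sign, the exponent and only the top 3 of the 23 mantissa bits. */
    static float truncateMantissaTo3Bits(float value) {
        int bits = Float.floatToIntBits(value);
        int mask = ~((1 << 20) - 1); // clears the low 20 mantissa bits
        return Float.intBitsToFloat(bits & mask);
    }

    /** Old-style approach: the boost is multiplied into the length norm, then quantized. */
    static float storedNorm(int termCount, float boost) {
        float lengthNorm = (float) (1.0 / Math.sqrt(termCount)); // assumed length norm
        return truncateMantissaTo3Bits(lengthNorm * boost);
    }

    public static void main(String[] args) {
        System.out.println(storedNorm(10, 1.0f));
        System.out.println(storedNorm(10, 1.05f));
        System.out.println(storedNorm(10, 2.0f));
    }
}
```

Under these assumptions a boost of 1.05 is stored as the same value as a boost of 1.0 for a ten-term field, so the small boost is lost entirely, while a boost of 2.0 survives.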

In addition, you can have several docvalues fields and use their
values in a function (e.g. one field with click count and another one
with product price). You can then combine click count and price
(which can be modified independently during index updates) so that
lower prices and higher click counts are boosted.
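
A function that combines two per-document values might look like the following plain-Java sketch. The formula, the class name BoostFunctionDemo, and the parameter names are assumptions for illustration only; with the expressions module the function would be supplied to Lucene as an expression rather than written as a Java method.

```java
// Hypothetical scoring function; the formula is an illustration, not a recommendation.
class BoostFunctionDemo {
    /**
     * Raises the score with the logarithm of the click count and lowers it
     * as the price grows. With zero clicks and zero price the score is unchanged.
     */
    static double boostedScore(double queryScore, long clickCount, double price) {
        double clickFactor = 1.0 + Math.log1p(clickCount); // ln(1 + clicks), damped growth
        double priceFactor = 1.0 / (1.0 + price);          // cheaper products score higher
        return queryScore * clickFactor * priceFactor;
    }

    public static void main(String[] args) {
        System.out.println(boostedScore(2.0, 100, 10.0));
        System.out.println(boostedScore(2.0, 100, 5.0));
        System.out.println(boostedScore(2.0, 10, 10.0));
    }
}
```

Because click count and price are separate inputs, either one can change without affecting the other, and the weighting between them can be tuned in the function alone.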

This is what you can do with the expressions module. You just give it
a function.

Here is an example; the second example uses a FunctionScoreQuery
that modifies the score based on the function and the given docvalues:
https://lucene.apache.org/core/7_7_2/expressions/org/apache/lucene/expressions/Expression.html

An example of FunctionScoreQuery usage with MultiFieldQueryParser would
also be nice, where MultiFieldQueryParser already takes a boosts argument
in its constructor to do this.

The boosts in the query parser are applied to fields at query
time (to have a different weight per field). Index-time boosting is per
document. So you can combine both.

Maybe it is not needed with MultiFieldQueryParser.

You use MultiFieldQueryParser to adjust the weights of the fields (e.g.
title versus body). The parsed query is then wrapped with an expression
that modifies the score per document according to the docvalues.
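
The two layers described here can be modeled in plain Java as follows. This is a simplified sketch, not Lucene code: it assumes the per-field scores are simply summed after weighting and that the per-document factor is multiplied in afterwards. The class name FieldAndDocumentBoostDemo and both method names are hypothetical.

```java
import java.util.Map;

// Simplified model (hypothetical names): field weights first, per-document factor second.
class FieldAndDocumentBoostDemo {
    /** Query-time field weights, e.g. title counts more than body. Missing fields get weight 1. */
    static double weightedFieldScore(Map<String, Double> fieldScores,
                                     Map<String, Float> fieldBoosts) {
        double total = 0.0;
        for (Map.Entry<String, Double> entry : fieldScores.entrySet()) {
            float boost = fieldBoosts.getOrDefault(entry.getKey(), 1.0f);
            total += entry.getValue() * boost;
        }
        return total;
    }

    /** Wraps the parsed query's score with a factor derived from a per-document value. */
    static double finalScore(double parsedQueryScore, double documentFactor) {
        return parsedQueryScore * documentFactor;
    }

    public static void main(String[] args) {
        double parsed = weightedFieldScore(
                Map.of("title", 1.0, "body", 0.5),
                Map.of("title", 2.0f));
        System.out.println(finalScore(parsed, 2.0));
    }
}
```

The field weights are the same for every document, while the document factor differs per document, which is why the two kinds of boost complement each other.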

Uwe

On 10/18/19 1:28 PM, Uwe Schindler wrote:

Hi,

That's not true. You can do index-time boosting, but you need to do that
using a separate field. You just index a numeric docvalues field (which may
contain a long or float value per document). Later you wrap your query with
some FunctionScoreQuery (e.g., use the JavaScript function query syntax in
the expressions module). This allows you to compile a JavaScript function
that calculates the final score based on the score returned by the inner
query and combines it with docvalues that were indexed per document.

Uwe

-----
Uwe Schindler
Achterdiek 19, D-28357 Bremen
https://www.thetaphi.de

eMail: u...@thetaphi.de

-----Original Message-----
From: baris.ka...@oracle.com <baris.ka...@oracle.com>
Sent: Friday, October 18, 2019 5:28 PM
To: java-user@lucene.apache.org
Cc: baris.ka...@oracle.com
Subject: Re: Index-time boosting: Deprecated setBoost method

It looks like index-time boosting (per field) is not possible as of Lucene
version 7.7.2. I was previously using BoostQuery at search time for
another case, and this seems to be the only boosting option now in Lucene.

Best regards


On 10/18/19 10:01 AM, baris.ka...@oracle.com wrote:

Hi,-

I saw this in the Field class docs and I am trying to figure out the
following note in the docs:

setBoost(float boost)
Deprecated.
Index-time boosts are deprecated, please index index-time scoring
factors into a doc value field and combine them with the score at
query time using eg. FunctionScoreQuery.

I appreciate this note. Is there an example of this? I wish the docs
would give a simple example to further help.

https://lucene.apache.org/core/6_6_0//core/org/apache/lucene/document/Field.html

vs

https://lucene.apache.org/core/7_7_2/core/org/apache/lucene/document/Field.html

Best regards

---------------------------------------------------------------------

To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-user-h...@lucene.apache.org


--
Uwe Schindler
Achterdiek 19, 28357 Bremen
https://www.thetaphi.de

