From looking at
https://github.com/apache/lucene/blob/main/lucene/queries/src/java/org/apache/lucene/queries/function/valuesource/ScaleFloatFunction.java#L70
I conclude that min and max are obtained from all docs in the index.
But if you specify query() as an argument for scale(), it takes only
matching docs for evaluating min and max. So, as I understand it so far, you
are looking for a query which matches the intersection of $q AND $fq but
yields the price field value as its score.
It seems I've got the problem definition. I'll come up with a proposal a
little bit later.
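
To make the difference concrete, here is a minimal Python sketch (not Solr
code) of the min-max arithmetic, assuming scale() maps a value v to
(v - min) / (max - min) * (target_max - target_min) + target_min. The helper
name scale and its lo/hi parameters are made up for illustration; the prices
and the 0.0 to 2199.0 range are the ones quoted below from the techproducts
example.

```python
def scale(values, target_min=0.0, target_max=1.0, lo=None, hi=None):
    """Min-max scale values into [target_min, target_max].

    lo/hi are the source min and max. When omitted they are taken from
    the values themselves, i.e. from the "result set" only.
    """
    lo = min(values) if lo is None else lo
    hi = max(values) if hi is None else hi
    if hi == lo:
        # Degenerate range: map everything to the lower target bound.
        return [target_min for _ in values]
    span = target_max - target_min
    return [target_min + (v - lo) * span / (hi - lo) for v in values]


# Prices of the 6 docs matching popularity:(1 OR 7) in the thread.
prices = [74.99, 19.95, 11.5, 329.95, 479.95, 649.99]

# min/max taken from the whole index (0.0 and 2199.0 in the thread).
whole_index = scale(prices, lo=0.0, hi=2199.0)

# min/max taken from the matching docs only (11.5 and 649.99).
result_set = scale(prices)

for p, a, b in zip(prices, whole_index, result_set):
    print(f"{p:>7}  index-wide={a:.4f}  result-set={b:.4f}")
```

For 74.99 this gives roughly 0.0341 with the index-wide range and roughly
0.0994 with the result-set range, which agrees with the two responses quoted
below (q=*:* with fq versus q=popularity:(1 OR 7)).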

On Wed, Jun 1, 2022 at 11:33 AM Vincenzo D'Amore <[email protected]> wrote:

> Hi Mikhail,
>
> sorry for not being clear, I'll try again.
> As I understand it, the Solr scale function, once applied to a field,
> needs the min and max for that field.
> By default those min and max values are calculated over all the existing
> documents; I don't know exactly how this is implemented internally in Solr.
> I assume that, in the worst case scenario, all the documents have to be
> traversed, reading all the values for the given field and then somehow
> saving the min/max.
> The Solr scale function documentation also says:
> > The current implementation cannot distinguish when documents have been
> > deleted or documents that have no value. It uses 0.0 values for these
> > cases.
> This means that the min value can often be 0 even if you have only positive
> values.
>
> But what happens if I need to scale the values of a field only within the
> documents that are the result of a query? Only a few hundreds or thousands
> of documents?
> First of all, min and max have to be calculated only on the result set of
> your query.
> That is what I was trying to say when I wrote "apply the scale function
> only to the result set (and not to the entire collection)".
>
> For example, if you apply the scale function to the field price in the Solr
> techproducts example, "min" and "max" are 0.0 and 2199.0:
>
>
> http://localhost:8983/solr/techproducts/select?q=*:*&rows=0&stats=true&stats.field=price
>
> So even if a filter query is added - fq=popularity:(1 OR 7) - the values
> are scaled between 0.0 and 2199.0.
>
>
> http://localhost:8983/solr/techproducts/select?q=*:*&fq=popularity:(1%20OR%207)&rows=100&fl=price,scale(price,%200,%201)
>
> {
>   "responseHeader":{
>     "status":0,
>     "QTime":30,
>     "params":{
>       "q":"*:*",
>       "fl":"price,scale(price, 0, 1)",
>       "fq":"popularity:(1 OR 7)",
>       "rows":"100"}},
>   "response":{"numFound":6,"start":0,"numFoundExact":true,"docs":[
>       {
>         "price":74.99,
>         "scale(price, 0, 1)":0.034101862},
>       {
>         "price":19.95,
>         "scale(price, 0, 1)":0.009072306},
>       {
>         "price":11.5,
>         "scale(price, 0, 1)":0.0052296496},
>       {
>         "price":329.95,
>         "scale(price, 0, 1)":0.15004548},
>       {
>         "price":479.95,
>         "scale(price, 0, 1)":0.2182583},
>       {
>         "price":649.99,
>         "scale(price, 0, 1)":0.29558435}]
>   }}
>
> As you can see in the results of this query, prices are between 11.5 and
> 649.99.
> What if I want to scale the prices between 11.5 and 649.99?
> Or, in other words, what is the easiest way to scale all the values of a
> field with the min and max of the current query results?
>
> Right now I'm investigating what's the best way to scale the values of one
> or more fields within Solr, but only within the documents that are in the
> current result set.
>
> Hope this helps to make things clearer.
>
> Best regards,
> Vincenzo
>
>
>
>
> On Tue, May 31, 2022 at 9:27 PM Mikhail Khludnev <[email protected]> wrote:
>
> > Vincenzo,
> > Can you elaborate on what 'apply the scale function only to the
> > result set (and not to the entire collection)' means?
> >
> > On Tue, May 31, 2022 at 4:33 PM Vincenzo D'Amore <[email protected]>
> > wrote:
> >
> > > Hi Mikhail,
> > >
> > > I'm trying to apply the scale function only to the result set (and not
> > > to the entire collection).
> > > And I discovered that adding "query($q)" to the scale function does the
> > > trick.
> > > In other words, adding "query($q)" forces solr to restrict the scale
> > > function only to the result set.
> > >
> > > But if I add an fq to the query parameters the scale function applies
> > > only to the q param.
> > > For example:
> > >
> > > http://localhost:8983/solr/techproducts/select?q=manu_id_s:(corsair%20belkin%20canon%20viewsonic)&fq=price:[0%20TO%20200]&rows=100&fl=price,scale(sum(price,query($q)),%200,%201),manu_id_s
> > >
> > > {
> > >   "responseHeader":{
> > >     "status":0,
> > >     "QTime":8,
> > >     "params":{
> > >       "q":"*:*",
> > >       "fl":"price,scale(sum(price,query($q)), 0, 1)",
> > >       "fq":"popularity:(1 OR 7)",
> > >       "rows":"100"}},
> > >   "response":{"numFound":6,"start":0,"numFoundExact":true,"docs":[
> > >       {
> > >         "price":74.99,
> > >         "scale(sum(price,query($q)), 0, 1)":0.034101862},
> > >       {
> > >         "price":19.95,
> > >         "scale(sum(price,query($q)), 0, 1)":0.009072306},
> > >       {
> > >         "price":11.5,
> > >         "scale(sum(price,query($q)), 0, 1)":0.0052296496},
> > >       {
> > >         "price":329.95,
> > >         "scale(sum(price,query($q)), 0, 1)":0.15004548},
> > >       {
> > >         "price":479.95,
> > >         "scale(sum(price,query($q)), 0, 1)":0.2182583},
> > >       {
> > >         "price":649.99,
> > >         "scale(sum(price,query($q)), 0, 1)":0.29558435}]
> > >   }}
> > >
> > > I can avoid this problem by adding a new parameter query($fq) to the
> > > scale function, but this solution is cumbersome and not maintainable.
> > > For example:
> > >
> > > http://localhost:8983/solr/techproducts/select?q=manu_id_s:(corsair%20belkin%20canon%20viewsonic)&fq=price:[0%20TO%20200]&rows=100&fl=price,scale(sum(sum(price,query($q)),query($fq)),%200,%201),manu_id_s
> > >
> > > {
> > >   "responseHeader":{
> > >     "status":0,
> > >     "QTime":1,
> > >     "params":{
> > >       "q":"manu_id_s:(corsair belkin canon viewsonic)",
> > >       "fl":"price,scale(sum(sum(price,query($q)),query($fq)), 0, 1),manu_id_s",
> > >       "fq":"price:[0 TO 200]",
> > >       "rows":"100"}},
> > >   "response":{"numFound":5,"start":0,"numFoundExact":true,"docs":[
> > >       {
> > >         "manu_id_s":"belkin",
> > >         "price":19.95,
> > >         "scale(sum(sum(price,query($q)),query($fq)), 0, 1)":0.048746154},
> > >       {
> > >         "manu_id_s":"belkin",
> > >         "price":11.5,
> > >         "scale(sum(sum(price,query($q)),query($fq)), 0, 1)":0.0},
> > >       {
> > >         "manu_id_s":"canon",
> > >         "price":179.99,
> > >         "scale(sum(sum(price,query($q)),query($fq)), 0, 1)":0.97198087},
> > >       {
> > >         "manu_id_s":"corsair",
> > >         "price":185.0,
> > >         "scale(sum(sum(price,query($q)),query($fq)), 0, 1)":1.0},
> > >       {
> > >         "manu_id_s":"corsair",
> > >         "price":74.99,
> > >         "scale(sum(sum(price,query($q)),query($fq)), 0, 1)":0.3653772}]
> > >   }}
> > >
> > >
> > >
> > >
> > > On Tue, May 31, 2022 at 2:48 PM Mikhail Khludnev <[email protected]>
> > > wrote:
> > >
> > > > Hello Vincenzo,
> > > >
> > > > I'm not getting your point:
> > > >
> > > > > if I add an fq parameter the scale function still continues to work
> > > > > only on the q param.
> > > >
> > > > well, but the function actually refers to q param:
> > > > scale(sum(price,query($q)), 0, 1).
> > > >
> > > > What values do you expect from query($q) with "q":"popularity:(1 OR
> > > > 7)"? I suggest checking it with fl=score
> > > >
> > > >
> > > > On Tue, May 31, 2022 at 2:05 PM Vincenzo D'Amore <[email protected]>
> > > > wrote:
> > > >
> > > > > Hi all,
> > > > >
> > > > > playing with the Solr scale function I found a few corner cases
> > > > > where I need to scale only the result set.
> > > > >
> > > > > I found a workaround that works but it does not seem to be viable,
> > > > > because if I add an fq parameter the scale function still continues
> > > > > to work only on the q param.
> > > > >
> > > > > For example with q=popularity:(1 OR 7):
> > > > >
> > > > > http://localhost:8983/solr/techproducts/select?q=popularity:(1 OR 7)&rows=100&fl=price,scale(sum(price,query($q)), 0, 1)
> > > > >
> > > > > {
> > > > >   "responseHeader":{
> > > > >     "status":0,
> > > > >     "QTime":1,
> > > > >     "params":{
> > > > >       "q":"popularity:(1 OR 7)",
> > > > >       "fl":"price,scale(sum(price,query($q)), 0, 1)",
> > > > >       "rows":"100"}},
> > > > >   "response":{"numFound":6,"start":0,"numFoundExact":true,"docs":[
> > > > >       {
> > > > >         "price":74.99,
> > > > >         "scale(sum(price,query($q)), 0, 1)":0.099437736},
> > > > >       {
> > > > >         "price":19.95,
> > > > >         "scale(sum(price,query($q)), 0, 1)":0.013234352},
> > > > >       {
> > > > >         "price":11.5,
> > > > >         "scale(sum(price,query($q)), 0, 1)":0.0},
> > > > >       {
> > > > >         "price":329.95,
> > > > >         "scale(sum(price,query($q)), 0, 1)":0.49875492},
> > > > >       {
> > > > >         "price":479.95,
> > > > >         "scale(sum(price,query($q)), 0, 1)":0.7336842},
> > > > >       {
> > > > >         "price":649.99,
> > > > >         "scale(sum(price,query($q)), 0, 1)":1.0}]
> > > > >   }}
> > > > >
> > > > > but moving the filter in fq:
> > > > >
> > > > > http://localhost:8983/solr/techproducts/select?q=*:*&fq=popularity:(1%20OR%207)&rows=100&fl=price,scale(sum(price,query($q)),%200,%201)
> > > > >
> > > > > {
> > > > >   "responseHeader":{
> > > > >     "status":0,
> > > > >     "QTime":8,
> > > > >     "params":{
> > > > >       "q":"*:*",
> > > > >       "fl":"price,scale(sum(price,query($q)), 0, 1)",
> > > > >       "fq":"popularity:(1 OR 7)",
> > > > >       "rows":"100"}},
> > > > >   "response":{"numFound":6,"start":0,"numFoundExact":true,"docs":[
> > > > >       {
> > > > >         "price":74.99,
> > > > >         "scale(sum(price,query($q)), 0, 1)":0.034101862},
> > > > >       {
> > > > >         "price":19.95,
> > > > >         "scale(sum(price,query($q)), 0, 1)":0.009072306},
> > > > >       {
> > > > >         "price":11.5,
> > > > >         "scale(sum(price,query($q)), 0, 1)":0.0052296496},
> > > > >       {
> > > > >         "price":329.95,
> > > > >         "scale(sum(price,query($q)), 0, 1)":0.15004548},
> > > > >       {
> > > > >         "price":479.95,
> > > > >         "scale(sum(price,query($q)), 0, 1)":0.2182583},
> > > > >       {
> > > > >         "price":649.99,
> > > > >         "scale(sum(price,query($q)), 0, 1)":0.29558435}]
> > > > >   }}
> > > > >
> > > > >
> > > > > On the other hand, I was thinking of implementing a custom scale
> > > > > function that by default works only on the current result set and
> > > > > not on the entire collection.
> > > > >
> > > > > Any suggestions on how to solve this problem?
> > > > >
> > > > > Best regards,
> > > > > Vincenzo
> > > > >
> > > > >
> > > > > --
> > > > > Vincenzo D'Amore
> > > > >
> > > >
> > > >
> > > > --
> > > > Sincerely yours
> > > > Mikhail Khludnev
> > > >
> > >
> > >
> > > --
> > > Vincenzo D'Amore
> > >
> >
> >
> > --
> > Sincerely yours
> > Mikhail Khludnev
> >
>
>
> --
> Vincenzo D'Amore
>


-- 
Sincerely yours
Mikhail Khludnev
