[ 
https://issues.apache.org/jira/browse/LUCENE-9395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17135876#comment-17135876
 ] 

Michael McCandless commented on LUCENE-9395:
--------------------------------------------

{quote}Hello! {{org.apache.lucene.search.DoubleValuesSource#getValues}} is 
called about once per segment, and so it's not an allocation hotspot.
{quote}
Hmm it is true there are other possible allocation hot spots (e.g. anything 
per-hit, or per collected hit, etc.), but 1) some indices have a long tail of 
small segments (LUCENE-8962 is trying to help there), 2) if you use many 
dynamic expressions, you might be creating quite a few constant 
{{DoubleValues}} per segment and 3) every little bit of allocation does add 
some (yes, tiny, but still some) load to GC.  Likely most of these allocations 
"die young" and so the added cost is really small, I agree.
{quote}There are tons of implementations of this method and similarly for 
{{org.apache.lucene.search.Weight#scorer}} as well which is also called about 
once per segment that typically allocate a bunch of stuff.
{quote}
It is true there are other places that allocate per-segment objects, but I do 
not think that is a valid argument against fixing this one?  That is like 
saying "the world is already dirty so why should I pick up this trash lying on 
the sidewalk myself and throw it away?".
{quote}I don't think it's worth bothering changing the code.
{quote}
Whose/what "bother" are you referring to here?  We commiters who would actually 
push the change?   I would say the "bother" was really on [~hypothesisx86] who 
has already taken the initiative here to contribute a small improvement.  Thank 
you [~hypothesisx86]!
{quote}There is a very slight help for the GC (that I doubt you could even 
measure) and a very slight negative impact on complexity.
{quote}
+1.

I also doubt it is measurable over the noise limit, but I don't think for small 
improvements like this that we really must be able to measure the gain as a 
blocker to improving.  Say I find a change that removes one multiplication or 
addition per-hit at defaults.  Likely I could not prove that change moves the 
needle, yet, it is still an improvement that we should want to make (all else 
being equal).  And we should not flaunt waste in Lucene's sources.

Yes, there is a miniscule change in code complexity, but it's exceptionally 
tiny in my opinion.

I think when a newish developer offers a small contribution rather than 
replying with "let's not bother", we should strongly welcome and encourage it.  
This is how a healthy open-source community grows!

 

> ConstantValuesSource creates more than one DoubleValues unnecessarily 
> ----------------------------------------------------------------------
>
>                 Key: LUCENE-9395
>                 URL: https://issues.apache.org/jira/browse/LUCENE-9395
>             Project: Lucene - Core
>          Issue Type: Improvement
>          Components: core/search
>    Affects Versions: 8.5.2
>            Reporter: Tony Xu
>            Priority: Minor
>         Attachments: LUCENE-9395.patch
>
>
> At my day job, we use ConstantValuesSource to represent default values or a 
> constant query-level feature by calling *_DoubleValuesSource.constant_*. I 
> realized under the hood the _*ConstantValuesSource.getDoubleValues*_ creates 
> a new _*DoubleValues*_ which simply return the specified value each time it 
> is called. 
> Unless I missed something, I don't see a risk of creating one 
> _*DoubleValues*_ as use it as the return value of all _*getDoubleValues**()*_ 
> calls given that the constant _*DoubleValues*_ doesn't maintain any state.
> We can also offer the user flexibilities of how to initialize it. 
> 1) _*DoubleValuesSource.constant(double constant)*_ – we can eagerly 
> initialize an `DoubleValues` that returns the constant and make it the return 
> value of all _*getDoubleValues()*_ calls.
> 2) _*DoubleValuesSource.constant(DoubleSupplier doubleSupplier)*_  – For lazy 
> evaluation if the constant takes some time to compute and user expects the 
> returned DVS will not be used in all code path.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org

Reply via email to