[
https://issues.apache.org/jira/browse/SOLR-12688?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16809734#comment-16809734
]
Kamal Kishore commented on SOLR-12688:
--------------------------------------
Hi Stanislav,
The third point says: 'LTRScoringModel was a mutable object. It was
leading to the calculation of hashcode on each query. So I decided to make
LTRScoringModel immutable and cache hashCode calculation.'
But if my model depends on query-related or EFI params passed in the rerank
query, is it still worth making the model immutable?
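To make the question concrete, here is a minimal Java sketch of how I
understand the change. It is not the actual patch: ImmutableModelSketch and
its score() method are hypothetical names, and the "boost" EFI key is made up
for illustration. It assumes that per-request values such as EFI params are
passed in at scoring time rather than stored on the model; whether
LTRScoringModel can work that way is exactly what I am asking.

```java
import java.util.Arrays;
import java.util.Map;

// Hypothetical stand-in for a scoring model; NOT the Solr LTRScoringModel API.
final class ImmutableModelSketch {
    private final String name;
    private final float[] weights;  // defensive copy, never modified afterwards
    private final int cachedHash;   // computed once, not on every query

    ImmutableModelSketch(String name, float[] weights) {
        this.name = name;
        this.weights = weights.clone();
        this.cachedHash = 31 * name.hashCode() + Arrays.hashCode(this.weights);
    }

    @Override
    public int hashCode() {
        return cachedHash;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof ImmutableModelSketch)) return false;
        ImmutableModelSketch other = (ImmutableModelSketch) o;
        return name.equals(other.name) && Arrays.equals(weights, other.weights);
    }

    // Assumption: per-request values (EFI) arrive as an argument and are
    // never stored on the model, so the cached hash stays valid.
    float score(float[] featureValues, Map<String, String> efi) {
        float boost = Float.parseFloat(efi.getOrDefault("boost", "1.0"));
        float sum = 0f;
        for (int i = 0; i < weights.length; i++) {
            sum += weights[i] * featureValues[i];
        }
        return boost * sum;
    }
}
```

If that assumption holds, the cached hashCode stays valid across queries. If
the model itself had to hold per-query state, it would not.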
> LTR Multiple performance fixes + pure DocValues support for FieldValueFeature
> -----------------------------------------------------------------------------
>
> Key: SOLR-12688
> URL: https://issues.apache.org/jira/browse/SOLR-12688
> Project: Solr
> Issue Type: Bug
> Security Level: Public(Default Security Level. Issues are Public)
> Components: contrib - LTR
> Reporter: Stanislav Livotov
> Priority: Major
> Attachments: DocValuesSupportForFieldValueFeature.patch,
> LTRModelHashCodeAfter.png, LTRModelHashCodeBefore.png,
> LTRScoringModelHashCodeCaching.patch, LTRSolrFeatureAfter.png,
> LTRSolrFeatureBefore.png, LTRwithDVOptimisation.png,
> LTRwithoutDVOptimisation.png, MultiplePerformanceFixes.patch,
> NoFQSolrFeatureOptimisation.patch
>
>
> This ticket covers 2 performance issues and 1 functional/performance issue
> that I found while integrating LTR into our e-commerce search engine:
> # FieldValueFeature doesn't support pure DocValues fields (stored=false).
> Please also note that for fields which are both stored and DocValues it
> works suboptimally, because it extracts just one field from the stored
> document. DocValues are obviously faster for such use cases. Below are
> screenshots of JFR profiles without and with the new DocValues support, for
> the case when the value can be read from DocValues.
> !LTRwithoutDVOptimisation.png!
> !LTRwithDVOptimisation.png!
> # SolrFeature was not optimally implemented for the case when no fq
> parameter was passed. I'm not absolutely sure what the intention was in
> introducing both the q parameter (which is supposed to be a function query)
> and the fq parameter for the same SolrFeature at all (is there a case when
> they will be used together?), so I decided not to change the behavior but
> just to optimize the described case.
> !LTRSolrFeatureBefore.png! !LTRSolrFeatureAfter.png!
> # LTRScoringModel was a mutable object. This led to the hashcode being
> recalculated on each query, which in turn can consume a lot of time when
> a model is big (in our case we were using LambdaMART with 100 trees and
> leaves, which was consuming 3MB of disk space). So I decided to make
> LTRScoringModel immutable and cache the hashCode calculation. Below are the
> screenshots before and after.
> !LTRModelHashCodeBefore.png!!LTRModelHashCodeAfter.png!
> In our case, we had a feature.json file with 8 FieldValueFeatures, 5
> SolrFeatures and 1 OriginalScoreFeature.
> Before introducing the optimizations, the performance overhead of LTR
> reranking of the top 48 documents was 300ms. With all the optimizations in
> place, it decreased to 35ms.
> Please also note that the JFR screenshots were captured on the Solr 6.6
> codebase. All the numbers are also taken from Solr version 6.6.
> I hope that the changes to the DocValues interface (method get() was removed
> and advanceExact was added) won't affect it (at least for
> DenseNumericDocValues it will work as expected).
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)