[ 
https://issues.apache.org/jira/browse/SOLR-12688?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16591893#comment-16591893
 ] 

Christine Poerschke commented on SOLR-12688:
--------------------------------------------

Thanks [~slivotov] for opening this ticket with analysis results and patches!

I'll aim to take a proper look at the changes towards the middle or end of next 
week.

bq. And excuse me in advance for creating one Jira for 3 potential problems. 
Probably it is better to create 3 separate Jira tickets? Anyway I had uploaded 
3 separate patches ...

No worries. May I suggest using JIRA's sub-task feature here? So this 
SOLR-12688 here would remain as the parent ticket with the overall context and 
analysis results and we'd have 3 sub-tasks for the 3 different changes.

I've gone ahead and created the sub-tasks; would that approach work for you? 
And if future patch attachments to those sub-tasks match the SOLR-NNNNN.patch 
naming pattern, then automatic patch validation can also be used -- 
https://wiki.apache.org/solr/HowToContribute#Contributing_your_work has more 
info on how that works.

> LTR Multiple performance fixes + pure DocValues support for FieldValueFeature
> -----------------------------------------------------------------------------
>
>                 Key: SOLR-12688
>                 URL: https://issues.apache.org/jira/browse/SOLR-12688
>             Project: Solr
>          Issue Type: Bug
>      Security Level: Public(Default Security Level. Issues are Public) 
>          Components: contrib - LTR
>            Reporter: Stanislav Livotov
>            Priority: Major
>         Attachments: DocValuesSupportForFieldValueFeature.patch, 
> LTRModelHashCodeAfter.png, LTRModelHashCodeBefore.png, 
> LTRScoringModelHashCodeCaching.patch, LTRSolrFeatureAfter.png, 
> LTRSolrFeatureBefore.png, LTRwithDVOptimisation.png, 
> LTRwithoutDVOptimisation.png, MultiplePerformanceFixes.patch, 
> NoFQSolrFeatureOptimisation.patch
>
>
> This ticket covers 2 performance issues and 1 functional/performance issue 
> that I found while integrating LTR into our e-commerce search engine: 
>  # FieldValueFeature doesn't support pure DocValues fields (stored=false). 
> Please also note that for fields which are both stored and DocValues it 
> works suboptimally, because it extracts just one field from the stored 
> document. DocValues are obviously faster for such use cases. Below are 
> screenshots of JFR profiles without and with the new DocValues support, for 
> the case when the value can be read from DocValues. 
>  !LTRwithoutDVOptimisation.png! 
>  !LTRwithDVOptimisation.png!
>  # SolrFeature was not optimally implemented for the case when no fq 
> parameter is passed. I'm not sure what the intention was in introducing 
> both the q parameter (which is supposed to be a function query) and the fq 
> parameter for the same SolrFeature (is there a case when they would be used 
> together?), so I decided not to change the behavior and only to optimize the 
> described case. !LTRSolrFeatureBefore.png! !LTRSolrFeatureAfter.png!
>  # LTRScoringModel was a mutable object. This led to the hashCode being 
> calculated on each query, which in turn can consume a lot of time when a 
> model is big (in our case we were using LambdaMART with 100 trees and leaves, 
> which took 3MB of disk space). So I decided to make LTRScoringModel 
> immutable and cache the hashCode calculation. Below are the screenshots 
> before and after. 
> !LTRModelHashCodeBefore.png!!LTRModelHashCodeAfter.png!
> In our case, we had a feature.json file with 8 FieldValueFeatures, 5 
> SolrFeatures and 1 OriginalScoreFeature. 
> Before introducing the optimizations, the performance overhead of LTR 
> reranking of the top 48 documents was 300ms. With all the optimizations in 
> place it decreased to 35ms. 
> Please also note that the JFR screenshots were captured on the Solr 6.6 
> codebase. All the numbers are also taken from Solr version 6.6. 
> I hope that the changes to the DocValues interface (method get() was removed 
> and advanceExact was added) won't affect it (at least for 
> DenseNumericDocValues it will work as expected).
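
The hashCode caching described in item 3 of the description can be sketched in 
plain Java as follows. This is only an illustration of the general idea, not 
the code from LTRScoringModelHashCodeCaching.patch: the class CachedHashModel 
and its fields are hypothetical stand-ins for a large model, and the sketch 
assumes the object is immutable so that the hash can be computed once at 
construction time.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for a large scoring model; NOT Solr's LTRScoringModel.
final class CachedHashModel {
    private final String name;
    private final List<Float> weights;
    // Computed once in the constructor; safe only because the object is immutable.
    private final int cachedHashCode;

    CachedHashModel(String name, List<Float> weights) {
        this.name = name;
        // Defensive copy so later changes to the caller's list cannot
        // invalidate the cached hash.
        this.weights = Collections.unmodifiableList(new ArrayList<>(weights));
        this.cachedHashCode = 31 * name.hashCode() + this.weights.hashCode();
    }

    @Override
    public int hashCode() {
        // Constant-time lookup instead of walking the whole model per call.
        return cachedHashCode;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) {
            return true;
        }
        if (!(o instanceof CachedHashModel)) {
            return false;
        }
        CachedHashModel other = (CachedHashModel) o;
        // Cheap hash comparison first, full comparison only if hashes match.
        return cachedHashCode == other.cachedHashCode
            && name.equals(other.name)
            && weights.equals(other.weights);
    }
}
```

The defensive copy is what makes the cached value trustworthy: if a caller 
could still mutate the list after construction, equals() and hashCode() would 
drift apart.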



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

