1. I think this is a Spark problem first (https://issues.apache.org/jira/browse/SPARK-10413). What we can do is create our own model (cf. https://github.com/apache/lucene-solr/tree/master/solr/contrib/ltr/src/java/org/apache/solr/ltr/model) that applies the prediction. That should be easy to do for a simple model, like logistic regression. For PMML, the idea would also be to implement a Model that reuses a Java library able to apply PMML.
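For the logistic regression case, the prediction step could look roughly like the sketch below. It is only an illustration: the class name LogisticRegressionScorer is hypothetical, it deliberately does not extend any Solr class, and it assumes the weights and bias have already been exported from the trained model by some other means. A real LTR model would wrap this logic in a subclass of the model base class from the package linked above.

```java
/**
 * Hypothetical, standalone sketch of logistic regression scoring.
 * Not part of Solr; weights and bias are assumed to come from a
 * model trained elsewhere (e.g. exported from Spark).
 */
final class LogisticRegressionScorer {
    private final float[] weights;
    private final float bias;

    LogisticRegressionScorer(float[] weights, float bias) {
        this.weights = weights.clone(); // defensive copy
        this.bias = bias;
    }

    /** Returns sigmoid(bias + sum_i weights[i] * featureValues[i]). */
    float score(float[] featureValues) {
        if (featureValues.length != weights.length) {
            throw new IllegalArgumentException(
                "Expected " + weights.length + " features, got " + featureValues.length);
        }
        double z = bias;
        for (int i = 0; i < weights.length; i++) {
            z += weights[i] * featureValues[i];
        }
        return (float) (1.0 / (1.0 + Math.exp(-z)));
    }
}
```

Since the sigmoid is monotonic, ranking by the raw linear sum would give the same document order; the sigmoid only matters if you want the score to read as a probability.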
2. This function query gives you the TF-IDF of textField vs userQuery for the doc:

   {!edismax qf='textField' mm=100% v=${userQuery} tie=0.1}

   Also, it seems to me LTR only allows float features, which is a limitation.

3. If the boost value is an index-time boost, I don't think it is possible. You could put the feature you want in a field at index time and then use FieldValueFeature to extract it.

On Thu, May 11, 2017 at 8:16 PM, Grant Ingersoll <gsing...@apache.org> wrote:

> Hi,
>
> Just getting up to speed on LTR and have a few questions (most of which are
> speculative at this point and exploratory, as I have a couple of talks
> coming up on this and other relevance features):
>
> 1. Has anyone looked at what's involved with supporting SparkML or other
> models (e.g. PMML)?
>
> 2. Has anyone looked at features for text? i.e. returning TF-IDF vectors
> or similar. FieldValueFeature is kind of like this, but I might want
> weights for the terms, not just the actual values. I could get this via
> term vectors, but then it doesn't fit the framework.
>
> 3. How about payloads and/or things like boost values for documents as
> features?
>
> 4. Are there example docs of training and using the
> MultipleAdditiveTreesModel? I see unit tests for them, but looking for
> something similar to the python script in the example dir.
>
> On 2 and 3, I imagine some of this can be done creatively via the
> SolrFeature and function queries.
>
> Thanks,
> Grant