[ https://issues.apache.org/jira/browse/SOLR-16596?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Alessandro Benedetti updated SOLR-16596:
----------------------------------------
Component/s: contrib - LTR
> LTR MultipleAdditiveTreeModel do not support missing features' value
> --------------------------------------------------------------------
>
> Key: SOLR-16596
> URL: https://issues.apache.org/jira/browse/SOLR-16596
> Project: Solr
> Issue Type: Improvement
> Components: contrib - LTR
> Reporter: Anna
> Priority: Minor
> Fix For: 9.2
>
> Time Spent: 20m
> Remaining Estimate: 0h
>
> The current MultipleAdditiveTree model doesn't support missing feature
> values.
> When a feature value is not passed, the model silently treats it as zero.
> Other LTR model libraries, like XGBoost, can distinguish missing values
> from all other values, including zero. They learn how to treat missing
> values at training time and add an additional "missing" branch to the
> tree, pointing in the direction learned to be best in that situation.
> It would be useful to support this in Solr's MultipleAdditiveTree models
> as well. An additional "missing" parameter should be added to the
> RegressionTreeNode. It will determine the direction to take when the
> feature value is missing.
> This integration will allow us to differentiate between zero-valued and
> missing features.
> For example, if the feature is "hotel_avg_review" (a rating between
> zero and five stars), we would like to behave differently when the hotel has
> no reviews (we do not know whether it is good) than when it has a review of
> zero stars (the hotel is bad).
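
The proposal above can be illustrated with a minimal sketch. This is not Solr's RegressionTreeNode: the `Node` class, its field names, the "left"/"right" values of `missing`, the use of NaN or an absent key to represent a missing feature, and the `<=` split rule are all assumptions made for illustration.

```python
import math
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    """Hypothetical regression tree node with a learned 'missing' direction."""
    value: float = 0.0              # used only by leaf nodes
    feature: Optional[str] = None   # None marks a leaf
    threshold: float = 0.0
    left: Optional["Node"] = None
    right: Optional["Node"] = None
    missing: str = "left"           # assumed: branch to take when the feature is absent

    def score(self, features: dict) -> float:
        if self.feature is None:
            return self.value
        # An absent key, None, or NaN all count as "missing" in this sketch.
        x = features.get(self.feature, math.nan)
        if x is None or math.isnan(x):
            branch = self.left if self.missing == "left" else self.right
        else:
            # Assumed split rule: values <= threshold go left.
            branch = self.left if x <= self.threshold else self.right
        return branch.score(features)


tree = Node(
    feature="hotel_avg_review",
    threshold=2.5,
    left=Node(value=-1.0),
    right=Node(value=1.0),
    missing="right",
)
print(tree.score({"hotel_avg_review": 0.0}))  # zero stars: left leaf, -1.0
print(tree.score({}))                         # no reviews: right leaf, 1.0
```

With a plain zero default, both calls would land in the left leaf; the explicit `missing` direction is what lets a hotel with no reviews be scored differently from a hotel rated zero stars.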