[
https://issues.apache.org/jira/browse/SPARK-22871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16307210#comment-16307210
]
Nick Pentreath commented on SPARK-22871:
----------------------------------------
Tree-based feature transformation is covered in SPARK-13677. I think this
duplicates that ticket. I also think it is best to leave the functionality
separate rather than create a new estimator in Spark, i.e. we could add the
leaf-based feature transformation to the tree models, and leave it up to the
user to combine that with LR etc. I think this separation of concerns and
modularity is better.
Finally, as [~srowen] mentions in SPARK-22867, I think this particular model is
best kept as a separate Spark package.
> Add GBT+LR Algorithm in MLlib
> -----------------------------
>
> Key: SPARK-22871
> URL: https://issues.apache.org/jira/browse/SPARK-22871
> Project: Spark
> Issue Type: New Feature
> Components: MLlib
> Affects Versions: 2.2.1
> Reporter: Fangzhou Yang
>
> GBTLRClassifier is a hybrid model of Gradient Boosting Trees and Logistic
> Regression.
> It is quite practical and popular in many data mining competitions. In this
> hybrid model, input features are transformed by means of boosted decision
> trees. The output of each individual tree is treated as a categorical input
> feature to a sparse linear classifier. Boosted decision trees prove to be very
> powerful feature transforms.
> Model details about GBTLR can be found in the following paper:
> "Practical Lessons from Predicting Clicks on Ads at Facebook"
> (https://dl.acm.org/citation.cfm?id=2648589)
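
To make the hybrid described above concrete, here is a minimal pure-Python sketch of the idea: each tree maps an input to a leaf, the leaf index is one-hot encoded as a categorical feature, and a logistic regression is fit on the concatenated encodings. Everything in it is an assumption made for illustration: the two hand-written trees (TREES), the toy data, and the helper names (leaf_index, leaf_features, train_lr, predict) are hypothetical, nothing is trained by boosting, and none of it is Spark's API or the proposed GBTLRClassifier.

```python
import math

# Hypothetical, hand-written trees standing in for a trained GBT ensemble.
# Internal node: (feature_index, threshold, left, right); leaf: integer leaf id.
TREES = [
    (0, 0.5, 0, (1, 2.0, 1, 2)),  # tree 0 has leaves 0, 1, 2
    (1, 1.0, 0, 1),               # tree 1 has leaves 0, 1
]
LEAF_COUNTS = [3, 2]


def leaf_index(tree, x):
    """Walk one tree and return the id of the leaf that x falls into."""
    node = tree
    while not isinstance(node, int):
        feature, threshold, left, right = node
        node = left if x[feature] <= threshold else right
    return node


def leaf_features(x):
    """One-hot encode the leaf reached in each tree and concatenate."""
    encoded = []
    for tree, n_leaves in zip(TREES, LEAF_COUNTS):
        one_hot = [0.0] * n_leaves
        one_hot[leaf_index(tree, x)] = 1.0
        encoded.extend(one_hot)
    return encoded


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_lr(rows, labels, epochs=200, step=0.5):
    """Plain stochastic gradient descent on the logistic loss."""
    weights = [0.0] * len(rows[0])
    bias = 0.0
    for _ in range(epochs):
        for row, y in zip(rows, labels):
            p = sigmoid(bias + sum(w * v for w, v in zip(weights, row)))
            err = p - y
            bias -= step * err
            weights = [w - step * err * v for w, v in zip(weights, row)]
    return weights, bias


def predict(weights, bias, x):
    """Probability of the positive class for a raw input x."""
    row = leaf_features(x)
    return sigmoid(bias + sum(w * v for w, v in zip(weights, row)))


# Toy data (made up): two raw features per example, binary labels.
X = [[0.2, 0.5], [0.9, 1.5], [0.8, 3.0], [0.1, 2.5]]
Y = [0, 1, 1, 0]

ROWS = [leaf_features(x) for x in X]
WEIGHTS, BIAS = train_lr(ROWS, Y)
```

Keeping leaf_features separate from train_lr mirrors the modular split suggested in the comment: the tree model only produces leaf-based features, and the user decides which linear model to fit on them.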
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)