[ 
https://issues.apache.org/jira/browse/SPARK-22871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16307210#comment-16307210
 ] 

Nick Pentreath commented on SPARK-22871:
----------------------------------------

Tree-based feature transformation is already covered in SPARK-13677, so I think
this ticket duplicates it. I also think it is best to keep the functionality
separate rather than create a new estimator in Spark. That is, we could add the
leaf-based feature transformation to the tree models and leave it up to the
user to combine that with LR etc. I think this separation of concerns and
modularity is better.

Finally, as [~srowen] mentions in SPARK-22867, I think this particular model is 
best kept as a separate Spark package.

> Add GBT+LR Algorithm in MLlib
> -----------------------------
>
>                 Key: SPARK-22871
>                 URL: https://issues.apache.org/jira/browse/SPARK-22871
>             Project: Spark
>          Issue Type: New Feature
>          Components: MLlib
>    Affects Versions: 2.2.1
>            Reporter: Fangzhou Yang
>
> GBTLRClassifier is a hybrid model of Gradient Boosting Trees and Logistic 
> Regression. 
> It is quite practical and popular in many data mining competitions. In this 
> hybrid model, input features are transformed by means of boosted decision 
> trees. The output of each individual tree is treated as a categorical input 
> feature to a sparse linear classifier. Boosted decision trees prove to be very 
> powerful feature transforms.
> Model details about GBTLR can be found in the following paper:
> "Practical Lessons from Predicting Clicks on Ads at Facebook"
> (https://dl.acm.org/citation.cfm?id=2648589)
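
The leaf-based transformation described above can be illustrated with a small,
self-contained sketch. It is plain Python rather than Spark code, and it does
not reflect any existing or proposed MLlib API: the Leaf and Split classes, the
hand-built trees, and the function names are all hypothetical, chosen only to
show the idea. Each tree maps an input to the index of the leaf it lands in,
that index is one-hot encoded, and the per-tree blocks are concatenated into a
sparse binary vector that a separately trained linear classifier could consume.

```python
import math
from dataclasses import dataclass
from typing import List, Union


@dataclass
class Leaf:
    index: int  # position of this leaf within its own tree (hypothetical layout)


@dataclass
class Split:
    feature: int      # index of the input feature tested at this node
    threshold: float  # go left when x[feature] <= threshold
    left: "Node"
    right: "Node"


Node = Union[Leaf, Split]


def leaf_index(tree: Node, x: List[float]) -> int:
    """Walk one tree and return the index of the leaf that x falls into."""
    node = tree
    while isinstance(node, Split):
        node = node.left if x[node.feature] <= node.threshold else node.right
    return node.index


def count_leaves(tree: Node) -> int:
    """Number of leaves in a tree, i.e. the width of its one-hot block."""
    if isinstance(tree, Leaf):
        return 1
    return count_leaves(tree.left) + count_leaves(tree.right)


def leaf_features(trees: List[Node], x: List[float]) -> List[float]:
    """Concatenate one one-hot block per tree; each block has exactly one 1.0."""
    features: List[float] = []
    for tree in trees:
        block = [0.0] * count_leaves(tree)
        block[leaf_index(tree, x)] = 1.0
        features.extend(block)
    return features


def predict_proba(weights: List[float], bias: float, features: List[float]) -> float:
    """Logistic-regression score on the transformed features (weights assumed given)."""
    z = bias + sum(w * f for w, f in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))


# Two tiny hand-built trees standing in for a trained boosted ensemble.
tree_a = Split(0, 0.5, Leaf(0), Split(1, 2.0, Leaf(1), Leaf(2)))
tree_b = Split(1, 1.0, Leaf(0), Leaf(1))
trees = [tree_a, tree_b]

encoded = leaf_features(trees, [0.9, 1.5])
print(encoded)
```

Keeping leaf_features separate from predict_proba mirrors the modular split
suggested in the comment: the tree model exposes the transformation, and the
user decides which downstream classifier to train on its output.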



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

