Joseph K. Bradley created SPARK-7129:
----------------------------------------

             Summary: Add generic boosting algorithm to spark.ml
                 Key: SPARK-7129
                 URL: https://issues.apache.org/jira/browse/SPARK-7129
             Project: Spark
          Issue Type: New Feature
          Components: ML
            Reporter: Joseph K. Bradley


The Pipelines API will make it easier to create a generic Boosting algorithm 
that works with any Classifier or Regressor. Building this feature will 
require researching the variants and extensions of boosting that we may want 
to support, now or in the future, and planning an API that is properly 
extensible.
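
To make "generic" concrete, here is a minimal sketch in plain Python (not 
Spark code) of a booster that only assumes its base learner exposes 
fit(X, y, weights) and returns a model with predict(x). It uses discrete 
AdaBoost on labels in {-1, +1}. The names GenericBooster, StumpLearner and 
StumpModel are hypothetical and are not part of spark.ml.

```python
import math


class StumpModel:
    """Hypothetical weak model: thresholds a single feature."""

    def __init__(self, feature, threshold, sign):
        self.feature = feature
        self.threshold = threshold
        self.sign = sign

    def predict(self, x):
        return self.sign if x[self.feature] > self.threshold else -self.sign


class StumpLearner:
    """Hypothetical weak learner: picks the stump with the lowest weighted error."""

    def fit(self, X, y, w):
        best_err, best_model = None, None
        for f in range(len(X[0])):
            for t in sorted({x[f] for x in X}):
                for sign in (1, -1):
                    model = StumpModel(f, t, sign)
                    err = sum(wi for xi, yi, wi in zip(X, y, w)
                              if model.predict(xi) != yi)
                    if best_err is None or err < best_err:
                        best_err, best_model = err, model
        return best_model


class GenericBooster:
    """Discrete AdaBoost over any learner with fit(X, y, w) -> model.predict(x)."""

    def __init__(self, learner, num_rounds=10):
        self.learner = learner
        self.num_rounds = num_rounds
        self.ensemble = []  # list of (alpha, model) pairs

    def fit(self, X, y):
        n = len(X)
        w = [1.0 / n] * n
        self.ensemble = []
        for _ in range(self.num_rounds):
            model = self.learner.fit(X, y, w)
            preds = [model.predict(x) for x in X]
            err = sum(wi for wi, p, yi in zip(w, preds, y) if p != yi)
            if err >= 0.5:
                break  # no better than chance under the current weights
            err = max(err, 1e-10)  # avoid division by zero for a perfect model
            alpha = 0.5 * math.log((1.0 - err) / err)
            self.ensemble.append((alpha, model))
            # Up-weight misclassified points, down-weight the rest, renormalize.
            w = [wi * math.exp(-alpha * yi * p) for wi, p, yi in zip(w, preds, y)]
            total = sum(w)
            w = [wi / total for wi in w]
        return self

    def predict(self, x):
        score = sum(alpha * m.predict(x) for alpha, m in self.ensemble)
        return 1 if score >= 0 else -1


# Toy data that no single stump can separate.
X = [[float(i)] for i in range(10)]
y = [1, 1, 1, -1, -1, -1, 1, 1, 1, -1]
booster = GenericBooster(StumpLearner(), num_rounds=3).fit(X, y)
predictions = [booster.predict(x) for x in X]
```

Because the booster never looks inside the learner, swapping StumpLearner for 
any other object with the same two methods needs no change to GenericBooster, 
which is the property an extensible API would have to preserve.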

In particular, it will be important to think about supporting:
* multiple loss functions (for AdaBoost, LogitBoost, gradient boosting, etc.)
* multiclass variants
* multilabel variants (which will probably be in a separate class and JIRA)
* more esoteric variants, such as totally corrective boosting and cascaded 
models (we should consider these but not design too much around them)
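
One way to picture support for multiple loss functions is to pass the loss to 
the booster as a value. The sketch below is plain Python (not Spark code) and 
assumes a gradient-boosting style loop in which each round fits a base 
regressor to the negative gradient of the chosen loss. Loss, SQUARED, LOGISTIC, 
fit_regression_stump and gradient_boost are hypothetical names, and the 
logistic case here takes plain gradient steps rather than the Newton steps of 
LogitBoost proper.

```python
import math
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Loss:
    """Hypothetical loss plug-in: a value and its negative gradient in f."""
    name: str
    value: Callable[[float, float], float]         # loss(y, f)
    neg_gradient: Callable[[float, float], float]  # -d loss(y, f) / d f


# Squared error for regression targets.
SQUARED = Loss("squared",
               lambda y, f: 0.5 * (y - f) ** 2,
               lambda y, f: y - f)

# Logistic loss for labels in {-1, +1}.
LOGISTIC = Loss("logistic",
                lambda y, f: math.log1p(math.exp(-y * f)),
                lambda y, f: y / (1.0 + math.exp(y * f)))


def fit_regression_stump(xs, targets):
    """Least-squares stump on 1-D inputs; needs at least two distinct xs."""
    best = None
    for t in sorted(set(xs))[:-1]:
        left = [r for x, r in zip(xs, targets) if x <= t]
        right = [r for x, r in zip(xs, targets) if x > t]
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        sse = (sum((r - lm) ** 2 for r in left)
               + sum((r - rm) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, t, lm, rm)
    _, t, lm, rm = best
    return lambda x: lm if x <= t else rm


def gradient_boost(xs, ys, loss, fit_base, rounds=20, step=0.5):
    """Each round fits the base learner to the loss's negative gradient."""
    models = []

    def predict(x):
        return sum(step * m(x) for m in models)

    for _ in range(rounds):
        pseudo_residuals = [loss.neg_gradient(y, predict(x))
                            for x, y in zip(xs, ys)]
        models.append(fit_base(xs, pseudo_residuals))
    return predict


def mean_loss(loss, predict, xs, ys):
    return sum(loss.value(y, predict(x)) for x, y in zip(xs, ys)) / len(xs)


xs = [float(i) for i in range(8)]
regression_targets = [1.0, 1.0, 1.0, 3.0, 3.0, 3.0, 5.0, 5.0]
binary_labels = [-1, -1, -1, 1, 1, 1, 1, -1]

regressor = gradient_boost(xs, regression_targets, SQUARED, fit_regression_stump)
classifier_score = gradient_boost(xs, binary_labels, LOGISTIC, fit_regression_stump)
```

The loop itself is identical for both losses; only the Loss value changes. 
Multiclass and multilabel variants would likely need a richer interface than 
this scalar one, which is part of what the design work would have to settle.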

Note: This may interact somewhat with the existing tree ensemble methods, but 
it should be largely separate, since the tree ensemble APIs and 
implementations are specialized for trees.



