[ https://issues.apache.org/jira/browse/FLINK-1727?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Till Rohrmann updated FLINK-1727:
---------------------------------
    Issue Type: New Feature  (was: Improvement)

> Add decision tree to machine learning library
> ---------------------------------------------
>
>                 Key: FLINK-1727
>                 URL: https://issues.apache.org/jira/browse/FLINK-1727
>             Project: Flink
>          Issue Type: New Feature
>          Components: Machine Learning Library
>            Reporter: Till Rohrmann
>              Labels: ML
>
> Decision trees are widely used for classification and regression tasks. Thus,
> it would be worthwhile to add support for them to Flink's machine learning
> library.
> A streaming parallel decision tree learning algorithm has been proposed by
> Ben-Haim and Tom-Tov [1]. It might be adaptable to a batch use case as well.
> [2] contains an overview of different techniques for scaling up inductive
> learning algorithms. A presentation of Spark's MLlib decision tree
> implementation can be found in [3].
> Resources:
> [1] [http://www.jmlr.org/papers/volume11/ben-haim10a/ben-haim10a.pdf]
> [2] [http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.46.8226&rep=rep1&type=pdf]
> [3] [http://spark-summit.org/wp-content/uploads/2014/07/Scalable-Distributed-Decision-Trees-in-Spark-Made-Das-Sparks-Talwalkar.pdf]