All supervised learning algorithms in Spark ML follow the same pattern. You 
provide a set of ‘features’ (X) and a corresponding label (y) as part of a 
pipeline and call the fit method on the pipeline. The output of fit is a model. 
You then pass new examples (new Xs) to the transform method on that model, 
which gives you a prediction for each example. As a result, the code for 
running different algorithms often looks very similar: the details of each 
algorithm are hidden behind the fit/transform interface.
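
To make the fit/transform pattern concrete, here is a minimal sketch in plain 
Python. It is not Spark code: the class names and the train_and_predict helper 
are made up for illustration, and the toy transform returns a plain list, 
whereas in Spark transform returns a DataFrame with a prediction column added. 
The only point is that the calling code is identical for both toy algorithms.

```python
from collections import Counter


class ConstantModel:
    """Fitted model that predicts the same label for every example."""

    def __init__(self, label):
        self.label = label

    def transform(self, X_new):
        return [self.label for _ in X_new]


class MajorityClassEstimator:
    """Toy algorithm 1: learn the most common training label."""

    def fit(self, X, y):
        return ConstantModel(Counter(y).most_common(1)[0][0])


class NearestNeighbourModel:
    """Fitted model that copies the label of the closest training example."""

    def __init__(self, X, y):
        self.X, self.y = X, y

    def transform(self, X_new):
        predictions = []
        for x in X_new:
            # Squared Euclidean distance to every training example.
            dists = [sum((a - b) ** 2 for a, b in zip(x, xi)) for xi in self.X]
            predictions.append(self.y[dists.index(min(dists))])
        return predictions


class NearestNeighbourEstimator:
    """Toy algorithm 2: 1-nearest-neighbour; fit just stores the data."""

    def fit(self, X, y):
        return NearestNeighbourModel(list(X), list(y))


def train_and_predict(estimator, X, y, X_new):
    """Same two calls whichever algorithm is plugged in."""
    model = estimator.fit(X, y)      # estimator + data -> model
    return model.transform(X_new)    # model + new examples -> predictions


X = [[0.0, 0.0], [0.1, 0.2], [1.0, 1.0], [0.9, 1.1], [1.2, 0.8]]
y = [0, 0, 1, 1, 1]
X_new = [[0.05, 0.1], [1.0, 0.9]]

majority_predictions = train_and_predict(MajorityClassEstimator(), X, y, X_new)
neighbour_predictions = train_and_predict(NearestNeighbourEstimator(), X, y, X_new)
print(majority_predictions)   # [1, 1]
print(neighbour_predictions)  # [0, 1]
```

Swapping one algorithm for the other changes only the estimator object passed 
in, which is why the Random Forest and Decision Tree examples look so alike.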

In the case of Random Forest, the Spark implementation (i.e. what sits behind 
the interface) trains a number of different decision trees, each often quite 
simple, and then combines (ensembles) their predictions into a single result. 
You don’t need to ‘create’ the decision trees yourself; the implementation 
handles that.
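
If it helps to see the idea of "many simple trees, then ensemble", here is a toy 
sketch in plain Python. It is an illustration only and makes no claim to match 
Spark's implementation: every name in it (Stump, fit_stump, fit_forest, 
predict_forest) is hypothetical, each "tree" is a single split, and the data is 
invented. In Spark you would not write any of this yourself; the number of trees 
is a parameter of RandomForestClassifier (setNumTrees).

```python
import random
from collections import Counter


def majority(labels):
    """Return the most frequent label in a non-empty list."""
    return Counter(labels).most_common(1)[0][0]


class Stump:
    """A decision tree with a single split on one feature."""

    def __init__(self, feature, threshold, left_label, right_label):
        self.feature = feature
        self.threshold = threshold
        self.left_label = left_label
        self.right_label = right_label

    def predict(self, x):
        if x[self.feature] <= self.threshold:
            return self.left_label
        return self.right_label


def fit_stump(X, y, feature):
    """Pick the threshold on `feature` that misclassifies the fewest rows."""
    best, best_errors = None, None
    for threshold in sorted({row[feature] for row in X}):
        left = [lab for row, lab in zip(X, y) if row[feature] <= threshold]
        right = [lab for row, lab in zip(X, y) if row[feature] > threshold]
        left_label = majority(left)
        # If nothing falls to the right, reuse the left label.
        right_label = majority(right) if right else left_label
        stump = Stump(feature, threshold, left_label, right_label)
        errors = sum(stump.predict(row) != lab for row, lab in zip(X, y))
        if best_errors is None or errors < best_errors:
            best, best_errors = stump, errors
    return best


def fit_forest(X, y, num_trees=25, seed=42):
    """Train each stump on a bootstrap sample and one randomly chosen feature."""
    rng = random.Random(seed)
    n, num_features = len(X), len(X[0])
    trees = []
    for _ in range(num_trees):
        idx = [rng.randrange(n) for _ in range(n)]   # sample rows with replacement
        feature = rng.randrange(num_features)        # random feature for this tree
        trees.append(fit_stump([X[i] for i in idx], [y[i] for i in idx], feature))
    return trees


def predict_forest(trees, x):
    """Ensemble step: every tree votes and the majority label wins."""
    return majority([tree.predict(x) for tree in trees])


# Invented, well-separated data: class 0 has small values, class 1 large ones.
X = [[1.0, 10.0], [2.0, 11.0], [1.5, 12.0], [2.5, 10.5], [1.2, 11.5], [2.2, 12.5],
     [8.0, 30.0], [9.0, 31.0], [8.5, 32.0], [9.5, 30.5], [8.2, 31.5], [9.2, 32.5]]
y = [0] * 6 + [1] * 6

trees = fit_forest(X, y)
low_prediction = predict_forest(trees, [0.0, 5.0])
high_prediction = predict_forest(trees, [10.0, 40.0])
print(len(trees), low_prediction, high_prediction)
```

Each individual stump is weak, but because every one sees a different sample and 
feature, the majority vote is more stable than any single tree.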

Hope that helps

Robin
-------------------------------------------------------------------------------
Robin East
Spark GraphX in Action Michael Malak and Robin East
Manning Publications Co.
http://www.manning.com/books/spark-graphx-in-action 





> On 4 Aug 2016, at 09:48, 陈哲 <czhenj...@gmail.com> wrote:
> 
> Hi all
>      I'm trying to use Spark ML to do some prediction with random forest. 
> Reading the example code 
> https://github.com/apache/spark/blob/master/examples/src/main/java/org/apache/spark/examples/ml/JavaRandomForestClassifierExample.java
> , I can only see that it is similar to 
> https://github.com/apache/spark/blob/master/examples/src/main/java/org/apache/spark/examples/ml/JavaDecisionTreeClassificationExample.java
> . Isn't the random forest algorithm supposed to use multiple decision trees?
>      I'm new to Spark and ML. Could anyone help me, perhaps by providing an 
> example of using multiple decision trees in a random forest in Spark?
> 
> Thanks
> Best Regards
> Patrick
