Re: Feature importance for RandomForestRegressor in Spark 1.5

2016-01-17 Thread Yanbo Liang
Hi Robin,

#1 This feature is available from Spark 1.5.0.
#2 You should use the new ML package rather than the old MLlib package to
train the Random Forest model and get featureImportances, because it is only
exposed in the ML package. You can refer to the documentation:
https://spark.apache.org/docs/latest/ml-classification-regression.html#random-forest-classifier
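
A minimal sketch of that pattern in Scala, assuming Spark 1.5 or 1.6 with the
spark.ml package on the classpath (the SQLContext entry point shown here was
superseded by SparkSession in 2.0). The object name, column names, and toy data
are made up for illustration and are not from the documentation:

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.ml.feature.VectorAssembler
import org.apache.spark.ml.regression.RandomForestRegressor
import org.apache.spark.sql.SQLContext

// Hypothetical object name and toy data, for illustration only.
object FeatureImportanceSketch {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setAppName("rf-feature-importance")
      .setMaster("local[2]")
    val sc = new SparkContext(conf)
    val sqlContext = new SQLContext(sc)

    // Toy rows: two input columns (x1, x2) and a numeric label.
    val raw = sqlContext.createDataFrame(Seq(
      (1.0, 10.0, 1.1),
      (2.0, 9.0, 2.0),
      (3.0, 11.0, 3.2),
      (4.0, 10.5, 3.9),
      (5.0, 9.5, 5.1),
      (6.0, 10.2, 6.0)
    )).toDF("x1", "x2", "label")

    // The ML package expects the inputs packed into one vector column.
    val featureCols = Array("x1", "x2")
    val assembler = new VectorAssembler()
      .setInputCols(featureCols)
      .setOutputCol("features")
    val training = assembler.transform(raw)

    // Train with the ML (DataFrame-based) estimator, not mllib.tree.RandomForest.
    val rf = new RandomForestRegressor()
      .setLabelCol("label")
      .setFeaturesCol("features")
      .setNumTrees(10)
      .setSeed(42L)
    val model = rf.fit(training)

    // featureImportances is a Vector indexed like the assembled feature columns.
    featureCols.zip(model.featureImportances.toArray).foreach {
      case (name, importance) => println(s"$name: $importance")
    }

    sc.stop()
  }
}
```

The model returned by fit() here is a RandomForestRegressionModel from the ML
package, which is where featureImportances lives; the RandomForestModel built
by following the mllib-ensembles templates does not have it.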

Thanks
Yanbo

2016-01-16 0:16 GMT+08:00 Robin East :

> re 1.
> The pull requests reference the JIRA ticket in this case
> https://issues.apache.org/jira/browse/SPARK-5133. The JIRA says it was
> released in 1.5.
>
>
>
> ---
> Robin East
> *Spark GraphX in Action* Michael Malak and Robin East
> Manning Publications Co.
> http://www.manning.com/books/spark-graphx-in-action
>
>
>
>
>
> On 15 Jan 2016, at 16:06, Scott Imig  wrote:
>
> Hello,
>
> I have a couple of quick questions about this pull request, which adds
> feature importance calculations to the random forests in MLLib.
>
> https://github.com/apache/spark/pull/7838
>
> 1. Can someone help me determine the Spark version where this is first
> available?  (1.5.0?  1.5.1?)
>
> 2. Following the templates in this  documentation to construct a
> RandomForestModel, should I be able to retrieve model.featureImportances?
> Or is there a different pattern for random forests in more recent spark
> versions?
>
> https://spark.apache.org/docs/1.2.0/mllib-ensembles.html
>
> Thanks for the help!
> Imig
> --
> S. Imig | Senior Data Scientist Engineer | *rich**relevance *|m:
> 425.999.5725
>
> I support Bip 101 and BitcoinXT .
>
>
>


Feature importance for RandomForestRegressor in Spark 1.5

2016-01-15 Thread Scott Imig
Hello,

I have a couple of quick questions about this pull request, which adds feature 
importance calculations to the random forests in MLLib.

https://github.com/apache/spark/pull/7838

1. Can someone help me determine the Spark version where this is first 
available?  (1.5.0?  1.5.1?)

2. Following the templates in this  documentation to construct a 
RandomForestModel, should I be able to retrieve model.featureImportances?  Or 
is there a different pattern for random forests in more recent spark versions?

https://spark.apache.org/docs/1.2.0/mllib-ensembles.html

Thanks for the help!
Imig
--
S. Imig | Senior Data Scientist Engineer | richrelevance |m: 425.999.5725

I support Bip 101 and BitcoinXT.


Re: Feature importance for RandomForestRegressor in Spark 1.5

2016-01-15 Thread Robin East
re 1.
The pull requests reference the JIRA ticket in this case
https://issues.apache.org/jira/browse/SPARK-5133
The JIRA says it was released in 1.5.


---
Robin East
Spark GraphX in Action Michael Malak and Robin East
Manning Publications Co.
http://www.manning.com/books/spark-graphx-in-action 






> On 15 Jan 2016, at 16:06, Scott Imig  wrote:
> 
> Hello,
> 
> I have a couple of quick questions about this pull request, which adds 
> feature importance calculations to the random forests in MLLib.
> 
> https://github.com/apache/spark/pull/7838 
> 
> 
> 1. Can someone help me determine the Spark version where this is first 
> available?  (1.5.0?  1.5.1?)
> 
> 2. Following the templates in this  documentation to construct a 
> RandomForestModel, should I be able to retrieve model.featureImportances?  Or 
> is there a different pattern for random forests in more recent spark versions?
> 
> https://spark.apache.org/docs/1.2.0/mllib-ensembles.html 
> 
> 
> Thanks for the help!
> Imig
> -- 
> S. Imig | Senior Data Scientist Engineer | richrelevance |m: 425.999.5725
> 
> I support Bip 101 and BitcoinXT .