This might not help, but I once tried Spark's Random Forest in a Kaggle competition, and its predictions were terrible compared to R's. So you may be better off looking for an external library instead of using MLlib's Random Forest.
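If you still want to try the Spark side, here is a rough sketch of how the R arguments quoted below might translate into keyword arguments for `pyspark.ml.regression.RandomForestRegressor`. The helper name `spark_rf_params` and the `max_depth` default are made up for illustration. It assumes a Spark version (2.0 or later, as far as I know) where `featureSubsetStrategy` accepts an integer given as a string, and it treats `maxnodes = NULL` as "grow deep trees", approximated by Spark's maximum `maxDepth` of 30. This is not an exact equivalence, so do not expect identical models.

```python
def spark_rf_params(ntree=500, mtry=None, maxnodes=None, max_depth=30):
    """Approximate R randomForest arguments as Spark ML regressor kwargs.

    Hypothetical helper for illustration; the mapping is a best guess,
    not an official correspondence between the two libraries.
    """
    if maxnodes is not None:
        # Spark limits tree depth, not the number of terminal nodes,
        # so a non-NULL maxnodes has to be translated by hand.
        raise ValueError("maxnodes has no direct Spark counterpart; pick maxDepth manually")
    return {
        # ntree -> numTrees
        "numTrees": int(ntree),
        # mtry -> featureSubsetStrategy (integer passed as a string; assumed
        # to be supported by the Spark version in use). "auto" lets Spark
        # choose, which for regression is one third of the features.
        "featureSubsetStrategy": "auto" if mtry is None else str(int(mtry)),
        # maxnodes = NULL means unrestricted trees in R; Spark's default
        # maxDepth of 5 is much shallower, so raise it explicitly.
        "maxDepth": int(max_depth),
    }


params = spark_rf_params(ntree=500, mtry=3, maxnodes=None)
print(params)
```

With pyspark installed, the result could be passed as `RandomForestRegressor(labelCol="label", featuresCol="features", **params)`, where the column names are placeholders. There is no training-time switch corresponding to `importance=TRUE`; the fitted Spark model exposes `featureImportances`, which is an impurity-based measure and will not match R's permutation-based importance numbers.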
-- 
http://mariussoutier.com/blog

> On 27.06.2016, at 07:47, Neha Mehta <nehamehta...@gmail.com> wrote:
>
> Hi All,
>
> Request help with the problem mentioned in the mail below. I have an existing
> random forest model in R which needs to be deployed on Spark. I am trying to
> recreate the model in Spark but am facing the problem mentioned below.
>
> Thanks,
> Neha
>
> On Jun 24, 2016 5:10 PM, wrote:
> >
> > Hi Sun,
> >
> > I am trying to build a model in Spark. Here are the parameters which were
> > used in R for creating the model. I am unable to figure out how to specify
> > a similar input to the random forest regressor in Spark so that I get a
> > similar model in Spark.
> >
> > https://cran.r-project.org/web/packages/randomForest/randomForest.pdf
> >
> > mtry=3
> >
> > ntree=500
> >
> > importance=TRUE
> >
> > maxnodes = NULL
> >
> > On May 31, 2016 7:03 AM, "Sun Rui" <sunrise_...@163.com> wrote:
> >>
> >> I mean train a random forest model (not using R) and use it for prediction
> >> together using Spark ML.
> >>
> >>> On May 30, 2016, at 20:15, Neha Mehta <nehamehta...@gmail.com> wrote:
> >>>
> >>> Thanks Sujeet.. will try it out.
> >>>
> >>> Hi Sun,
> >>>
> >>> Can you please tell me what you meant by "Maybe you can try using the
> >>> existing random forest model"? Did you mean creating the model again
> >>> using Spark MLlib?
> >>>
> >>> Thanks,
> >>> Neha
> >>>
> >>>>
> >>>> From: sujeet jog <sujeet....@gmail.com>
> >>>> Date: Mon, May 30, 2016 at 4:52 PM
> >>>> Subject: Re: Can we use existing R model in Spark
> >>>> To: Sun Rui <sunrise_...@163.com>
> >>>> Cc: Neha Mehta <nehamehta...@gmail.com>,
> >>>> user <user@spark.apache.org>
> >>>>
> >>>> Try to invoke an R script from Spark using the rdd pipe method, get the
> >>>> work done, and receive the model back in an RDD.
> >>>>
> >>>> for ex :-
> >>>> . rdd.pipe("<FileName.R>")
> >>>>
> >>>> On Mon, May 30, 2016 at 3:57 PM, Sun Rui <sunrise_...@163.com> wrote:
> >>>>>
> >>>>> Unfortunately no. Spark does not support loading external models (for
> >>>>> example, PMML) for now.
> >>>>> Maybe you can try using the existing random forest model in Spark.
> >>>>>
> >>>>>> On May 30, 2016, at 18:21, Neha Mehta <nehamehta...@gmail.com> wrote:
> >>>>>>
> >>>>>> Hi,
> >>>>>>
> >>>>>> I have an existing random forest model created using R. I want to use
> >>>>>> that to predict values on Spark. Is it possible to do the same? If
> >>>>>> yes, then how?
> >>>>>>
> >>>>>> Thanks & Regards,
> >>>>>> Neha