OK. Did you change the Spark version? The Java/Scala/Python version? Have you tried different versions of any of the above?
Hope this helps.
Kr
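A minimal sketch for collecting the version information asked about above, assuming the driver is written in Python. The function name `environment_versions` is made up for illustration, and the thread does not say which language binding is in use. Java and Scala versions are not covered here.

```python
import platform


def environment_versions():
    """Return the Python version and, if importable, the PySpark version."""
    versions = {"python": platform.python_version()}
    try:
        import pyspark  # optional: only present where Spark's Python API is installed
        # getattr guards against builds that do not expose __version__;
        # on a live session, sc.version reports the Spark version as well.
        versions["pyspark"] = getattr(pyspark, "__version__", None)
    except ImportError:
        versions["pyspark"] = None
    return versions


print(environment_versions())
```

Running this on the driver and on each executor host would show whether the nodes agree on versions.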
On 10 Dec 2016 10:37 pm, "Morten Hornbech" <mor...@datasolvr.com> wrote:

> I haven’t actually experienced any non-determinism. We have nightly
> integration tests comparing output from random forests with no variations.
>
> The workaround we will probably try is to split the dataset, either
> randomly or on one of the variables, and then train a forest on each
> partition, which should then be sufficiently small.
>
> I hope to be able to provide a good repro case in some weeks. If the
> problem turns out to be in our own code I will also post it in this thread.
>
> Morten
>
> On 10 Dec 2016 at 23:25, Marco Mistroni <mmistr...@gmail.com> wrote:
>
> Hello Morten,
> OK.
> AFAIK there is a tiny bit of randomness in these ML algorithms (please,
> anyone, correct me if I'm wrong).
> In fact, if you run your RDF code multiple times, it will not give you
> EXACTLY the same results (though accuracy and errors should be more or less
> similar). At least this is what I found when playing around with
> RDF and decision trees and other ML algorithms.
>
> If RDF is not a must for your use case, could you try to 'scale back' to
> Decision Trees and see if you still get intermittent failures?
> This would at least exclude issues with the data.
>
> HTH
> Marco
>
> On Sat, Dec 10, 2016 at 5:20 PM, Morten Hornbech <mor...@datasolvr.com>
> wrote:
>
>> Already did. There are no issues with smaller samples. I am running this
>> in a cluster of three t2.large instances on AWS.
>>
>> I have tried to find the threshold where the error occurs, but it is not
>> a single factor causing it. Input size and subsampling rate seem to be
>> the most significant, and number of trees the least.
>>
>> I have also tried running on a test frame of randomized numbers with the
>> same number of rows, and could not reproduce the problem there.
>>
>> By the way, maxDepth is 5 and maxBins is 32.
>>
>> I will probably need to leave this for a few weeks to focus on more
>> short-term stuff, but I will write here if I solve it or reproduce it more
>> consistently.
>>
>> Morten
>>
>> On 10 Dec 2016 at 17:29, Marco Mistroni <mmistr...@gmail.com> wrote:
>>
>> Hi
>> Bring the samples back to the 1k range to debug, or as suggested reduce
>> tree and bins. I had rdd running on same size data with no issues. Or send
>> me some sample code and data and I will try it out on my EC2 instance.
>> Kr
>>
>> On 10 Dec 2016 3:16 am, "Md. Rezaul Karim" <rezaul.karim@insight-centre.org>
>> wrote:
>>
>>> I had a similar experience last week. I could not find any error trace
>>> either.
>>>
>>> Later on, I did the following to get rid of the problem:
>>> i) I downgraded to Spark 2.0.0
>>> ii) Decreased the value of maxBins and maxDepth
>>>
>>> Additionally, make sure that you set the featureSubsetStrategy as "auto" to
>>> let the algorithm choose the best feature subset strategy for your
>>> data. Finally, set the impurity as "gini" for the information gain.
>>>
>>> However, setting the number of trees to just 1 gives you neither the
>>> real advantage of a forest nor better predictive performance.
>>>
>>> Best,
>>> Karim
>>>
>>> On Dec 9, 2016 11:29 PM, "mhornbech" <mor...@datasolvr.com> wrote:
>>>
>>>> Hi
>>>>
>>>> I have spent quite some time trying to debug an issue with the Random
>>>> Forest algorithm on Spark 2.0.2. The input dataset is relatively large at
>>>> around 600k rows and 200MB, but I use subsampling to make each tree
>>>> manageable. However, even with only 1 tree and a low sample rate of 0.05
>>>> the job hangs at one of the final stages (see attached). I have checked
>>>> the logs on all executors and the driver and find no traces of error.
>>>> Could it be a memory issue even though no error appears?
>>>> The error does seem sporadic to some extent so I also wondered whether
>>>> it could be a data issue, that only occurs if the subsample includes the
>>>> bad data rows.
>>>>
>>>> Please comment if you have a clue.
>>>>
>>>> Morten
>>>>
>>>> <http://apache-spark-user-list.1001560.n3.nabble.com/file/n28192/Sk%C3%A6rmbillede_2016-12-10_kl.png>
>>>>
>>>> --
>>>> View this message in context:
>>>> http://apache-spark-user-list.1001560.n3.nabble.com/Random-Forest-hangs-without-trace-of-error-tp28192.html
>>>> Sent from the Apache Spark User List mailing list archive at Nabble.com.
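
For readers following the thread, the workaround Morten describes (split the dataset randomly, then train one forest per split) could be sketched roughly as below. This is an illustration under assumptions, not code from the thread. It assumes the PySpark DataFrame API and a classification task, which the thread never confirms (only the "gini" suggestion hints at it). The helper names, the column names "label" and "features", the number of splits, and the seed are all hypothetical. The parameter values are the ones quoted in the messages above.

```python
# Settings mentioned in the thread; the dict itself is only an illustration.
FOREST_PARAMS = {
    "numTrees": 1,
    "maxDepth": 5,
    "maxBins": 32,
    "subsamplingRate": 0.05,
    "featureSubsetStrategy": "auto",
    "impurity": "gini",
}


def split_weights(n_parts):
    """Equal weights for a random split into n_parts pieces."""
    if n_parts < 1:
        raise ValueError("n_parts must be at least 1")
    return [1.0 / n_parts] * n_parts


def train_forest_per_split(df, n_parts=4, seed=42,
                           label_col="label", features_col="features"):
    """Train one random forest on each random split of df (hypothetical helper)."""
    # Imported lazily so the helpers above can be used without Spark installed.
    from pyspark.ml.classification import RandomForestClassifier

    models = []
    for part in df.randomSplit(split_weights(n_parts), seed=seed):
        rf = RandomForestClassifier(labelCol=label_col,
                                    featuresCol=features_col,
                                    seed=seed,
                                    **FOREST_PARAMS)
        models.append(rf.fit(part))
    return models
```

Fixing the seed keeps the splits and the trees repeatable between runs, which should make an intermittent hang easier to pin to a particular subset of rows. How to combine the per-split models into one prediction is not discussed in the thread and is left open here.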