No. I am running Spark on YARN on a 3-node testing cluster. My guess is that, given the number of splits produced by a hundred trees of depth 30 (which should be more than 100 * 2^30), either the executors or the driver die from OOM while trying to store all the split metadata. I suspect the same issue affects both local and distributed modes. But those are just conjectures.
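For what it's worth, here is the back-of-the-envelope arithmetic behind that guess (worst-case full trees; just a sketch, not a statement about Spark internals):

    // Worst-case node count for the forest described in the thread.
    val numTrees = 100
    val maxDepth = 30
    val maxNodesPerTree = math.pow(2, maxDepth + 1) - 1   // ~2.1e9 nodes in a full tree of depth 30
    val maxTotalNodes = numTrees * maxNodesPerTree        // ~2.1e11 nodes across the forest
    // Even a few bytes of split metadata per node would be far beyond 8 x 5G of executor memory.

In practice the trees will not be anywhere near full, but the bound gives an idea of why depth 30 is dangerous for a 100-tree forest.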
--
Julio

> On 10 Jan 2017, at 11:22, Marco Mistroni <mmistr...@gmail.com> wrote:
>
> Are you running locally? I found exactly the same issue.
> Two solutions:
> - reduce data size
> - run on EMR
> Hth
>
>> On 10 Jan 2017 10:07 am, "Julio Antonio Soto" <ju...@esbet.es> wrote:
>> Hi,
>>
>> I am running into OOM problems while training a Spark ML
>> RandomForestClassifier (maxDepth of 30, 32 maxBins, 100 trees).
>>
>> My dataset is arguably pretty big given the executor count and size (8x5G),
>> with approximately 20M rows and 130 features.
>>
>> The "fun fact" is that a single DecisionTreeClassifier with the same specs
>> (same maxDepth and maxBins) is able to train without problems in a couple of
>> minutes.
>>
>> AFAIK the current random forest implementation grows each tree sequentially,
>> which means that DecisionTreeClassifiers are fit one by one, and therefore
>> the training process should be similar in terms of memory consumption. Am I
>> missing something here?
>>
>> Thanks
>> Julio
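For reference, a minimal sketch of the two configurations being compared in the original question (the training DataFrame and its column names are placeholders I am assuming, not something taken from the thread):

    import org.apache.spark.ml.classification.{DecisionTreeClassifier, RandomForestClassifier}

    // The single tree with the same depth/bins, which reportedly trains in minutes.
    val dt = new DecisionTreeClassifier()
      .setMaxDepth(30)
      .setMaxBins(32)

    // The 100-tree forest with identical per-tree settings, which hits OOM.
    val rf = new RandomForestClassifier()
      .setNumTrees(100)
      .setMaxDepth(30)
      .setMaxBins(32)

    // Hypothetical usage, assuming a DataFrame `train` with "features" and "label" columns:
    // val dtModel = dt.fit(train)
    // val rfModel = rf.fit(train)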