Hi all,

I am trying to run an ML program against some data, using decision trees. To fine-tune the parameters, I am running this loop to find the optimal values for impurity, depth and bins:

    // Train one model per (impurity, depth, bins) combination
    // and score it against the test set.
    for (impurity <- Array("gini", "entropy");
         depth    <- Array(1, 2, 3, 4, 5);
         bins     <- Array(10, 20, 25, 28)) yield {
      val model = DecisionTree.trainClassifier(
        trainingData, numClasses, categoricalFeaturesInfo, impurity, depth, bins)
      val accuracy = getMetrics(model, testData).precision
      ((impurity, depth, bins), accuracy)
    }

Could anyone explain why, if I run my program multiple times against the SAME data, I get different optimal values for the parameters above? I assumed that running the loop above against the same data would always give the same results.

To give you an example, run 1 returned the following top results:

    ((gini,4,28),0.8)
    ((gini,4,25),0.8)
    ((gini,3,28),0.8)
    ((gini,3,25),0.8)
    ((entropy,3,28),0.7333333333333333)

while run 2 gives me these top results:

    ((entropy,2,28),0.6842105263157895)
    ((entropy,2,25),0.6842105263157895)
    ((entropy,2,20),0.6842105263157895)
    ((entropy,2,10),0.6842105263157895)
    ((entropy,1,28),0.684210526315789

Could anyone explain why?

Kind regards,
Marco