Hi Lokesh,

Glad the update fixed the bug. maxBins is a parameter you can tune based on your data: a larger maxBins is potentially more accurate, but it will run more slowly and use more memory. Note that maxBins must be <= the training set size. I would suggest trying a few small values (4, 8, 16); if there is a difference in performance between those, you can tune it further, otherwise just pick one.

Good luck!
Joseph
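P.S. Here's a rough sketch of what that comparison could look like. It's a minimal example, assuming Spark 1.1's DecisionTree.trainClassifier API and that you already have trainingData and testData as RDD[LabeledPoint]; adjust numClasses, impurity, and maxDepth for your problem:

    import org.apache.spark.mllib.tree.DecisionTree
    import org.apache.spark.mllib.regression.LabeledPoint
    import org.apache.spark.rdd.RDD

    // Try a few small maxBins values and compare held-out error.
    // trainingData and testData are assumed to be RDD[LabeledPoint].
    for (maxBins <- Seq(4, 8, 16)) {
      val model = DecisionTree.trainClassifier(
        trainingData,
        numClasses = 2,                             // set to your number of classes
        categoricalFeaturesInfo = Map[Int, Int](),  // empty map = all features continuous
        impurity = "gini",
        maxDepth = 5,
        maxBins = maxBins)
      // Fraction of misclassified test points for this maxBins setting
      val testErr = testData.filter(p => model.predict(p.features) != p.label)
        .count().toDouble / testData.count()
      println(s"maxBins = $maxBins, test error = $testErr")
    }

If the error barely changes across those values, the smallest one is usually the cheapest choice.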
On Fri, Oct 24, 2014 at 12:54 AM, lokeshkumar <lok...@dataken.net> wrote:

> Hi Joseph,
>
> Thanks for the help.
>
> I have tried this DecisionTree example with the latest Spark code and it is
> working fine now. But how do we choose the maxBins for this model?
>
> Thanks
> Lokesh