Re: Input size increasing every iteration of gradient boosted trees [1.4]

2015-09-03 Thread Sean Owen
Since it sounds like this has been encountered 3 times, and I've personally seen it and mostly verified it, I think it's legit enough for a JIRA: SPARK-10433. I am sorry to say I don't know what is going on here, though. On Thu, Sep 3, 2015 at 1:56 PM, Peter Rudenko wrote: > Confirm, having the same

Re: Input size increasing every iteration of gradient boosted trees [1.4]

2015-09-03 Thread Peter Rudenko
Confirm, having the same issue (1.4.1 mllib package). For a smaller dataset, accuracy also degraded. Haven’t tested yet in 1.5 with the ml package implementation. val boostingStrategy = BoostingStrategy.defaultParams("Classification") boostingStrategy.setNumIterations(30) boostingStrategy.setLear

Re: Input size increasing every iteration of gradient boosted trees [1.4]

2015-08-13 Thread Matt Forbes
Is this an artifact of a recent change? Does this not show up in any of the tests or benchmarks? On Thu, Aug 13, 2015 at 2:33 PM, Sean Owen wrote: > Not that I have any answer at this point, but I was discussing this > exact same problem with Johannes today. An input size of ~20K records > was g

Re: Input size increasing every iteration of gradient boosted trees [1.4]

2015-08-13 Thread Sean Owen
Not that I have any answer at this point, but I was discussing this exact same problem with Johannes today. An input size of ~20K records was growing each iteration by ~15M records. I could not see why on a first look. @jkbradley I know it's not much info but does that ring any bells? I think Joha