137alpha commented on pull request #32813: URL: https://github.com/apache/spark/pull/32813#issuecomment-857194103
@asolimando

> As a closing remark, I understand that this have caused some issues and frustrations to some people including yourself, but sometimes trying to make things better (maybe by volunteering in our spare time, like it was the case for me for this PR), we can cause other issues, which can in turn be tackled and hopefully solved, that's the beauty of OSS.

We definitely appreciate all the time volunteers like yourself have spent improving the code. We wouldn't be able to do our work without it :)

My comments above were intended to convey that I personally know a significant number of people who have encountered this in one form or another, so this isn't a rare side effect; it's actually a common one.

There are workarounds too (although not ideal):

* Instead of using DecisionTreeClassifier, use GBTClassifier with numiter=1 and settings which make it functionally identical to a decision tree
* Instead of using RandomForestClassifier, use GBTClassifier (although this isn't helpful if you actually want a RandomForestClassifier for some reason)

Thankfully it's an easily addressable issue, notwithstanding the performance hit.
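For anyone who wants to try the first workaround, here is a rough sketch in PySpark. It assumes that "numiter" corresponds to the `maxIter` parameter of `pyspark.ml.classification.GBTClassifier`, and that disabling shrinkage and subsampling is what "settings which make it functionally identical" refers to; neither is confirmed in this thread. The helper names are made up for illustration. Note also that a GBT fits a regression tree to the loss gradient and, as far as I know, only supports binary labels, so treat this as an approximation of a decision tree rather than a guaranteed equivalent.

```python
def single_tree_gbt_params(max_depth=5, seed=42):
    """Hypothetical helper: keyword arguments for a one-tree GBTClassifier."""
    return {
        "maxIter": 1,                    # one boosting round, so one tree (assumed meaning of "numiter=1")
        "stepSize": 1.0,                 # no shrinkage applied to the single tree
        "subsamplingRate": 1.0,          # train on every row, no row sampling
        "featureSubsetStrategy": "all",  # consider every feature at each split
        "maxDepth": max_depth,
        "seed": seed,
    }


def build_single_tree_gbt(label_col="label", features_col="features", **overrides):
    """Hypothetical helper: build the estimator; column names are assumptions."""
    # Imported lazily so the parameter helper above works without Spark installed.
    from pyspark.ml.classification import GBTClassifier

    params = single_tree_gbt_params()
    params.update(overrides)  # e.g. maxDepth=8
    return GBTClassifier(labelCol=label_col, featuresCol=features_col, **params)
```

Usage would look something like `build_single_tree_gbt(maxDepth=8).fit(train_df)`, where `train_df` is a placeholder for a DataFrame with `label` and `features` columns. For the second workaround, the same builder with a larger `maxIter` override gives an ordinary GBTClassifier in place of a RandomForestClassifier.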
