[GitHub] spark pull request: [SPARK-5133] [ml] [WIP] Added featureImportanc...

jkbradley Fri, 31 Jul 2015 21:17:47 -0700

Github user jkbradley commented on the pull request:

    https://github.com/apache/spark/pull/7838#issuecomment-126859467
  
    @feynmanliang Thanks for writing those tests.  I could not think of a good 
way to make the tests robust.
    
    The issue is that Random Forests can only be run in the same way in both 
MLlib and sklearn if resampling on each iteration is turned off.  If we did 
that, then all of the trees in the forest would be the same, so it would not be 
much of a test of the feature importance calculation.
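
To illustrate the point about identical trees, here is a minimal pure-Python sketch. It is not code from the PR, MLlib, or sklearn: `gini`, `best_stump`, `toy_forest`, and the dataset `data` are hypothetical names made up for this example, and each "tree" is only a one-split stump. It assumes that resampling is the only source of randomness, which is the situation described above.

```python
import random

def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    p = sum(labels) / n
    return 2.0 * p * (1.0 - p)

def best_stump(rows):
    """Return (feature, threshold, gain) for the best single split.

    rows is a list of (features, label) pairs. Returns (None, None, 0.0)
    when no split improves on the parent impurity.
    """
    parent = gini([y for _, y in rows])
    best = (None, None, 0.0)
    for f in range(len(rows[0][0])):
        for t in sorted({x[f] for x, _ in rows}):
            left = [y for x, y in rows if x[f] <= t]
            right = [y for x, y in rows if x[f] > t]
            if not left or not right:
                continue
            weighted = (len(left) * gini(left) + len(right) * gini(right)) / len(rows)
            gain = parent - weighted
            if gain > best[2]:
                best = (f, t, gain)
    return best

def toy_forest(rows, num_trees, bootstrap, seed=0):
    """Train num_trees stumps, optionally on bootstrap resamples of rows."""
    rng = random.Random(seed)
    trees = []
    for _ in range(num_trees):
        # Without resampling every tree sees exactly the same rows.
        sample = [rng.choice(rows) for _ in rows] if bootstrap else rows
        trees.append(best_stump(sample))
    return trees

# Hypothetical toy dataset: ((feature0, feature1), label)
data = [((1.0, 5.0), 0), ((2.0, 3.0), 0), ((3.0, 6.0), 1),
        ((4.0, 1.0), 1), ((5.0, 4.0), 0), ((6.0, 2.0), 1)]

no_resampling = toy_forest(data, 5, bootstrap=False)
print(len(set(no_resampling)))  # prints 1: all five stumps are identical
```

With `bootstrap=True` the stumps will generally differ from one another, but the result then depends on the random number generator, which is why the two libraries cannot easily be compared tree for tree.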
    
    So I wrote some tests by hand instead.  Not a great solution, but hopefully 
good enough for now.
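
As a rough idea of what a hand-checked test might look like, here is a small sketch. It is not the test code from the PR. It assumes that a tree's importance for a feature is the sum of split gains on that feature, each weighted by the number of instances reaching the node, normalized per tree and then averaged over the forest. The functions `tree_importances` and `forest_importances` and the two example trees are hypothetical.

```python
def tree_importances(splits, num_features):
    """Per-tree importances from (feature, gain, num_instances) split records."""
    totals = [0.0] * num_features
    for feature, gain, count in splits:
        totals[feature] += gain * count
    s = sum(totals)
    # Normalize so that one tree's importances sum to 1 (if it has any splits).
    return [t / s for t in totals] if s > 0 else totals

def forest_importances(trees, num_features):
    """Sum the normalized per-tree importances, then normalize again."""
    summed = [0.0] * num_features
    for splits in trees:
        for i, v in enumerate(tree_importances(splits, num_features)):
            summed[i] += v
    s = sum(summed)
    return [v / s for v in summed] if s > 0 else summed

# Hypothetical trees, written out by hand as internal-node records.
tree_a = [(0, 0.5, 10), (1, 0.25, 4)]   # weighted gains: feature 0 -> 5.0, feature 1 -> 1.0
tree_b = [(1, 0.4, 10)]                 # only feature 1 is used

# Worked by hand: tree_a -> [5/6, 1/6], tree_b -> [0, 1],
# sum -> [5/6, 7/6], normalized -> [5/12, 7/12].
result = forest_importances([tree_a, tree_b], 2)
expected = [5.0 / 12.0, 7.0 / 12.0]
assert all(abs(r - e) < 1e-12 for r, e in zip(result, expected))
```

The advantage of this style is that the expected values do not depend on any random sampling, so the check stays stable across runs.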



