Github user jkbradley commented on the pull request: https://github.com/apache/spark/pull/2607#issuecomment-59589703 @manishamde Sorry for the delay; the code is looking good! I made some small comments inline. My main overall comment is about specifying parameters. How would it be if we started mimicking the coming API update (as much as possible)? Parameter specification would work as follows: In DecisionTree, add a static âdefaultParamsâ method so users can construct a tree.Strategy instance without having to worry about importing Strategy (and remembering its name). Likewise for GradientBoosting. Change GradientBoostingStrategy to store tree params in a field weakLearnerParams: tree.Strategy Hereâs the use pattern I envision: val treeParams = DecisionTree.defaultParams() treeParams.maxDepth = ... val boostingParams = GradientBoosting.defaultParams() boostingParams.weakLearnerParams = treeParams val model = GradientBoosting.train(myData, boostingParams) This API should work for Scala and Python right away. (Though a Python API can be another PR.) For Java, this API should almost work; I believe the only issue will be setting fields which take special types (e.g., quantileCalculationStrategy and categoricalFeaturesInfo). For those, there is a nice annotation you can use which will automatically add getParamName and setParamName methods for Java users to call, and you can override them as needed. For the special params like categoricalFeaturesInfo, you can overload them with versions which take Java-friendly types (such as a Java map for categoricalFeaturesInfo and a string for quantileCalculationStrategy). Hereâs the BeanProperty doc: [http://www.scala-lang.org/api/current/scala/beans/BeanProperty.html] Does that sound reasonable? Let me know when itâs ready for another pass and for testing.
--- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- --------------------------------------------------------------------- To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org