[GitHub] spark pull request #16017: [SPARK-18592][ML] Move DT/RF/GBT Param setter met...

2016-12-08 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/16017#discussion_r91624043
  
--- Diff: 
mllib/src/main/scala/org/apache/spark/ml/classification/DecisionTreeClassifier.scala
 ---
@@ -52,33 +52,49 @@ class DecisionTreeClassifier @Since("1.4.0") (
 
   // Override parameter setters from parent trait for Java API 
compatibility.
 
+  /** @group setParam */
--- End diff --

I support that idea too! Thanks for your comment.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #16017: [SPARK-18592][ML] Move DT/RF/GBT Param setter met...

2016-12-08 Thread jkbradley
Github user jkbradley commented on a diff in the pull request:

https://github.com/apache/spark/pull/16017#discussion_r91591028
  
--- Diff: 
mllib/src/main/scala/org/apache/spark/ml/classification/DecisionTreeClassifier.scala
 ---
@@ -52,33 +52,49 @@ class DecisionTreeClassifier @Since("1.4.0") (
 
   // Override parameter setters from parent trait for Java API 
compatibility.
 
+  /** @group setParam */
--- End diff --

Btw, I created http://issues.apache.org/jira/browse/SPARK-18692 for 
monitoring Java 8 unidoc in the future


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #16017: [SPARK-18592][ML] Move DT/RF/GBT Param setter met...

2016-12-08 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/16017#discussion_r91495968
  
--- Diff: 
mllib/src/main/scala/org/apache/spark/ml/classification/DecisionTreeClassifier.scala
 ---
@@ -52,33 +52,49 @@ class DecisionTreeClassifier @Since("1.4.0") (
 
   // Override parameter setters from parent trait for Java API 
compatibility.
 
+  /** @group setParam */
--- End diff --

Oh, I should have built and checked it first closely by myself. Thank you 
for correcting me.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #16017: [SPARK-18592][ML] Move DT/RF/GBT Param setter met...

2016-12-08 Thread yanboliang
Github user yanboliang commented on a diff in the pull request:

https://github.com/apache/spark/pull/16017#discussion_r91484503
  
--- Diff: 
mllib/src/main/scala/org/apache/spark/ml/classification/DecisionTreeClassifier.scala
 ---
@@ -52,33 +52,49 @@ class DecisionTreeClassifier @Since("1.4.0") (
 
   // Override parameter setters from parent trait for Java API 
compatibility.
 
+  /** @group setParam */
--- End diff --

I checked the generated documentation and found it works well. Could you 
let me know what you found and how to reproduce it? Thanks.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #16017: [SPARK-18592][ML] Move DT/RF/GBT Param setter met...

2016-12-07 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/16017#discussion_r91432051
  
--- Diff: 
mllib/src/main/scala/org/apache/spark/ml/classification/DecisionTreeClassifier.scala
 ---
@@ -52,33 +52,49 @@ class DecisionTreeClassifier @Since("1.4.0") (
 
   // Override parameter setters from parent trait for Java API 
compatibility.
 
+  /** @group setParam */
--- End diff --

I am sorry but please allow me leave another trivial comment about 
documentation. I haven't checked the built documentation but it seems we should 
use multiple-lines in this case (see 
https://github.com/apache/spark/pull/13855).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #16017: [SPARK-18592][ML] Move DT/RF/GBT Param setter met...

2016-12-01 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/16017#discussion_r90586118
  
--- Diff: 
mllib/src/main/scala/org/apache/spark/ml/classification/DecisionTreeClassifier.scala
 ---
@@ -52,33 +52,49 @@ class DecisionTreeClassifier @Since("1.4.0") (
 
   // Override parameter setters from parent trait for Java API 
compatibility.
 
+  /** @group setParam */
   @Since("1.4.0")
-  override def setMaxDepth(value: Int): this.type = 
super.setMaxDepth(value)
+  override def setMaxDepth(value: Int): this.type = set(maxDepth, value)
 
+  /** @group setParam */
   @Since("1.4.0")
-  override def setMaxBins(value: Int): this.type = super.setMaxBins(value)
+  override def setMaxBins(value: Int): this.type = set(maxBins, value)
 
+  /** @group setParam */
   @Since("1.4.0")
-  override def setMinInstancesPerNode(value: Int): this.type =
-super.setMinInstancesPerNode(value)
+  override def setMinInstancesPerNode(value: Int): this.type = 
set(minInstancesPerNode, value)
 
+  /** @group setParam */
   @Since("1.4.0")
-  override def setMinInfoGain(value: Double): this.type = 
super.setMinInfoGain(value)
+  override def setMinInfoGain(value: Double): this.type = set(minInfoGain, 
value)
 
+  /** @group expertSetParam */
   @Since("1.4.0")
-  override def setMaxMemoryInMB(value: Int): this.type = 
super.setMaxMemoryInMB(value)
+  override def setMaxMemoryInMB(value: Int): this.type = 
set(maxMemoryInMB, value)
 
+  /** @group expertSetParam */
   @Since("1.4.0")
-  override def setCacheNodeIds(value: Boolean): this.type = 
super.setCacheNodeIds(value)
+  override def setCacheNodeIds(value: Boolean): this.type = 
set(cacheNodeIds, value)
 
+  /**
+   * Specifies how often to checkpoint the cached node IDs.
+   * E.g. 10 means that the cache will get checkpointed every 10 
iterations.
+   * This is only used if cacheNodeIds is true and if the checkpoint 
directory is set in
+   * [[org.apache.spark.SparkContext]].
+   * Must be >= 1.
--- End diff --

Hi all, this might be too trivial but I just want to let you know we 
probably should write `<` or `>` as other ones such as `{@literal <}` or 
`{@literal >}`. Please refer 
https://github.com/apache/spark/pull/16013#discussion_r89666914.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #16017: [SPARK-18592][ML] Move DT/RF/GBT Param setter met...

2016-11-29 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/16017


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #16017: [SPARK-18592][ML] Move DT/RF/GBT Param setter met...

2016-11-29 Thread jkbradley
Github user jkbradley commented on a diff in the pull request:

https://github.com/apache/spark/pull/16017#discussion_r90072953
  
--- Diff: mllib/src/main/scala/org/apache/spark/ml/tree/treeParams.scala ---
@@ -107,25 +107,41 @@ private[ml] trait DecisionTreeParams extends 
PredictorParams
   setDefault(maxDepth -> 5, maxBins -> 32, minInstancesPerNode -> 1, 
minInfoGain -> 0.0,
 maxMemoryInMB -> 256, cacheNodeIds -> false, checkpointInterval -> 10)
 
-  /** @group setParam */
+  /**
+   * @deprecated This method is deprecated and will be removed in 2.2.0.
+   * @group setParam
+   */
+  @deprecated("This method is deprecated and will be removed in 2.2.0.", 
"2.1.0")
--- End diff --

Great, thank you for checking!


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #16017: [SPARK-18592][ML] Move DT/RF/GBT Param setter met...

2016-11-28 Thread yanboliang
Github user yanboliang commented on a diff in the pull request:

https://github.com/apache/spark/pull/16017#discussion_r89949760
  
--- Diff: mllib/src/main/scala/org/apache/spark/ml/tree/treeParams.scala ---
@@ -107,25 +107,41 @@ private[ml] trait DecisionTreeParams extends 
PredictorParams
   setDefault(maxDepth -> 5, maxBins -> 32, minInstancesPerNode -> 1, 
minInfoGain -> 0.0,
 maxMemoryInMB -> 256, cacheNodeIds -> false, checkpointInterval -> 10)
 
-  /** @group setParam */
+  /**
+   * @deprecated This method is deprecated and will be removed in 2.2.0.
+   * @group setParam
+   */
+  @deprecated("This method is deprecated and will be removed in 2.2.0.", 
"2.1.0")
--- End diff --

This will not appear in estimator classes, only in models. This can be 
confirmed by the following ways:
The IDE will prompt deprecated methods only for models: 

![image](https://cloud.githubusercontent.com/assets/1962026/20699424/247abfc4-b5bc-11e6-8eb2-a9fe53661cba.png)
We can also confirm it in Scala API docs:
```DecisionTreeClassifier```

![image](https://cloud.githubusercontent.com/assets/1962026/20699474/63b03688-b5bc-11e6-9e8d-b6d83f6bb00a.png)
```DecisionTreeClassificationModel```

![image](https://cloud.githubusercontent.com/assets/1962026/20699522/a0bf3556-b5bc-11e6-9289-d2c893f9428e.png)


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #16017: [SPARK-18592][ML] Move DT/RF/GBT Param setter met...

2016-11-28 Thread jkbradley
Github user jkbradley commented on a diff in the pull request:

https://github.com/apache/spark/pull/16017#discussion_r89824958
  
--- Diff: mllib/src/main/scala/org/apache/spark/ml/tree/treeParams.scala ---
@@ -107,25 +107,41 @@ private[ml] trait DecisionTreeParams extends 
PredictorParams
   setDefault(maxDepth -> 5, maxBins -> 32, minInstancesPerNode -> 1, 
minInfoGain -> 0.0,
 maxMemoryInMB -> 256, cacheNodeIds -> false, checkpointInterval -> 10)
 
-  /** @group setParam */
+  /**
+   * @deprecated This method is deprecated and will be removed in 2.2.0.
+   * @group setParam
+   */
+  @deprecated("This method is deprecated and will be removed in 2.2.0.", 
"2.1.0")
--- End diff --

Actually, if this is a problem, we could mark the setters in Models as 
deprecated and leave the treeParams alone since the treeParams traits are 
private.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #16017: [SPARK-18592][ML] Move DT/RF/GBT Param setter met...

2016-11-28 Thread jkbradley
Github user jkbradley commented on a diff in the pull request:

https://github.com/apache/spark/pull/16017#discussion_r89819028
  
--- Diff: mllib/src/main/scala/org/apache/spark/ml/tree/treeParams.scala ---
@@ -134,27 +150,31 @@ private[ml] trait DecisionTreeParams extends 
PredictorParams
   /** @group setParam */
   def setSeed(value: Long): this.type = set(seed, value)
--- End diff --

Are you intentionally keeping setSeed in Models?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #16017: [SPARK-18592][ML] Move DT/RF/GBT Param setter met...

2016-11-28 Thread jkbradley
Github user jkbradley commented on a diff in the pull request:

https://github.com/apache/spark/pull/16017#discussion_r89806761
  
--- Diff: 
mllib/src/main/scala/org/apache/spark/ml/classification/DecisionTreeClassifier.scala
 ---
@@ -52,33 +52,49 @@ class DecisionTreeClassifier @Since("1.4.0") (
 
   // Override parameter setters from parent trait for Java API 
compatibility.
 
+  /** @group setParam */
   @Since("1.4.0")
-  override def setMaxDepth(value: Int): this.type = 
super.setMaxDepth(value)
+  override def setMaxDepth(value: Int): this.type = set(maxDepth, value)
 
+  /** @group setParam */
   @Since("1.4.0")
-  override def setMaxBins(value: Int): this.type = super.setMaxBins(value)
+  override def setMaxBins(value: Int): this.type = set(maxBins, value)
 
+  /** @group setParam */
   @Since("1.4.0")
-  override def setMinInstancesPerNode(value: Int): this.type =
-super.setMinInstancesPerNode(value)
+  override def setMinInstancesPerNode(value: Int): this.type = 
set(minInstancesPerNode, value)
 
+  /** @group setParam */
   @Since("1.4.0")
-  override def setMinInfoGain(value: Double): this.type = 
super.setMinInfoGain(value)
+  override def setMinInfoGain(value: Double): this.type = set(minInfoGain, 
value)
 
+  /** @group expertSetParam */
   @Since("1.4.0")
-  override def setMaxMemoryInMB(value: Int): this.type = 
super.setMaxMemoryInMB(value)
+  override def setMaxMemoryInMB(value: Int): this.type = 
set(maxMemoryInMB, value)
 
+  /** @group expertSetParam */
   @Since("1.4.0")
-  override def setCacheNodeIds(value: Boolean): this.type = 
super.setCacheNodeIds(value)
+  override def setCacheNodeIds(value: Boolean): this.type = 
set(cacheNodeIds, value)
 
+  /**
+   * Specifies how often to checkpoint the cached node IDs.
+   * E.g. 10 means that the cache will get checkpointed every 10 
iterations.
+   * This is only used if cacheNodeIds is true and if the checkpoint 
directory is set in
+   * [[org.apache.spark.SparkContext]].
+   * Must be >= 1.
+   * (default = 10)
+   * @group setParam
--- End diff --

Right, that would make the setter appear in models.  Your way is better.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #16017: [SPARK-18592][ML] Move DT/RF/GBT Param setter met...

2016-11-27 Thread yanboliang
Github user yanboliang commented on a diff in the pull request:

https://github.com/apache/spark/pull/16017#discussion_r89715218
  
--- Diff: 
mllib/src/main/scala/org/apache/spark/ml/classification/DecisionTreeClassifier.scala
 ---
@@ -52,33 +52,49 @@ class DecisionTreeClassifier @Since("1.4.0") (
 
   // Override parameter setters from parent trait for Java API 
compatibility.
 
+  /** @group setParam */
   @Since("1.4.0")
-  override def setMaxDepth(value: Int): this.type = 
super.setMaxDepth(value)
+  override def setMaxDepth(value: Int): this.type = set(maxDepth, value)
 
+  /** @group setParam */
   @Since("1.4.0")
-  override def setMaxBins(value: Int): this.type = super.setMaxBins(value)
+  override def setMaxBins(value: Int): this.type = set(maxBins, value)
 
+  /** @group setParam */
   @Since("1.4.0")
-  override def setMinInstancesPerNode(value: Int): this.type =
-super.setMinInstancesPerNode(value)
+  override def setMinInstancesPerNode(value: Int): this.type = 
set(minInstancesPerNode, value)
 
+  /** @group setParam */
   @Since("1.4.0")
-  override def setMinInfoGain(value: Double): this.type = 
super.setMinInfoGain(value)
+  override def setMinInfoGain(value: Double): this.type = set(minInfoGain, 
value)
 
+  /** @group expertSetParam */
   @Since("1.4.0")
-  override def setMaxMemoryInMB(value: Int): this.type = 
super.setMaxMemoryInMB(value)
+  override def setMaxMemoryInMB(value: Int): this.type = 
set(maxMemoryInMB, value)
 
+  /** @group expertSetParam */
   @Since("1.4.0")
-  override def setCacheNodeIds(value: Boolean): this.type = 
super.setCacheNodeIds(value)
+  override def setCacheNodeIds(value: Boolean): this.type = 
set(cacheNodeIds, value)
 
+  /**
+   * Specifies how often to checkpoint the cached node IDs.
+   * E.g. 10 means that the cache will get checkpointed every 10 
iterations.
+   * This is only used if cacheNodeIds is true and if the checkpoint 
directory is set in
+   * [[org.apache.spark.SparkContext]].
+   * Must be >= 1.
+   * (default = 10)
+   * @group setParam
--- End diff --

The cause of this change was suggested at 
https://github.com/apache/spark/pull/15913#discussion_r89662469 , since Param 
setter methods in traits used to have the wrong type in Java. We would like to 
remove the setter method from the trait since it does not make sense to have it 
in the Model classes. We could put the setter method in each subclass and then 
deprecate the method in the Model classes.

So if we remove the setter method from the traits, we can not inherit docs 
from them. BTW, the current change is consistent with other ML algorithms which 
inherit traits.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #16017: [SPARK-18592][ML] Move DT/RF/GBT Param setter met...

2016-11-27 Thread jkbradley
Github user jkbradley commented on a diff in the pull request:

https://github.com/apache/spark/pull/16017#discussion_r89694871
  
--- Diff: 
mllib/src/main/scala/org/apache/spark/ml/classification/DecisionTreeClassifier.scala
 ---
@@ -52,33 +52,49 @@ class DecisionTreeClassifier @Since("1.4.0") (
 
   // Override parameter setters from parent trait for Java API 
compatibility.
 
+  /** @group setParam */
   @Since("1.4.0")
-  override def setMaxDepth(value: Int): this.type = 
super.setMaxDepth(value)
+  override def setMaxDepth(value: Int): this.type = set(maxDepth, value)
--- End diff --

Do you need to change this?  I'd prefer the former, in case some special 
logic is ever added to super.setMaxDepth.

Same elsewhere


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #16017: [SPARK-18592][ML] Move DT/RF/GBT Param setter met...

2016-11-26 Thread yanboliang
GitHub user yanboliang opened a pull request:

https://github.com/apache/spark/pull/16017

[SPARK-18592][ML] Move DT/RF/GBT Param setter methods to subclasses

## What changes were proposed in this pull request?
Mainly two changes:
* Move DT/RF/GBT Param setter methods to subclasses.
* Deprecate corresponding setter methods in the model classes.

See discussion here 
https://github.com/apache/spark/pull/15913#discussion_r89662469.

## How was this patch tested?
Existing tests.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/yanboliang/spark spark-18592

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/16017.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #16017


commit 7e52a0856c1381cd751ddc10f0195866bd40b404
Author: Yanbo Liang 
Date:   2016-11-26T15:36:23Z

Put DT/RF/GBT setter methods into each subclass.

commit 39cbf4267d9d136f6a16f85e8e2d88939a35e22f
Author: Yanbo Liang 
Date:   2016-11-26T15:46:40Z

Deprecate DT/RF/GBT setter methods in the Model classes




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org