[2/2] spark git commit: [SPARK-14812][ML][MLLIB][PYTHON] Experimental, DeveloperApi annotation audit for ML

2016-07-13 Thread jkbradley
[SPARK-14812][ML][MLLIB][PYTHON] Experimental, DeveloperApi annotation audit 
for ML

## What changes were proposed in this pull request?

General decisions to follow, except where noted:
* spark.mllib, pyspark.mllib: Remove all Experimental annotations.  Leave 
DeveloperApi annotations alone.
* spark.ml, pyspark.ml
** Annotate Estimator-Model pairs of classes and companion objects the same way.
** For all algorithms marked Experimental with Since tag <= 1.6, remove 
Experimental annotation.
** For all algorithms marked Experimental with Since tag = 2.0, leave 
Experimental annotation.
* DeveloperApi annotations are left alone, except where noted.
* No changes to which types are sealed.

Exceptions where I am leaving items Experimental in spark.ml, pyspark.ml, 
mainly because the items are new:
* Model Summary classes
* MLWriter, MLReader, MLWritable, MLReadable
* Evaluator and subclasses: There is discussion of changes around evaluating 
multiple metrics at once for efficiency.
* RFormula: Its behavior may need to change slightly to match R in edge cases.
* AFTSurvivalRegression
* MultilayerPerceptronClassifier
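
For reviewers, the general decisions and named exceptions above can be read as a small decision rule. The following is only a sketch of that rule, not code from this patch: `keep_experimental`, `NAMED_EXCEPTIONS`, and the other names are hypothetical, and it assumes Since tags have the form `major.minor[.patch]`.

```python
# Illustrative sketch only: every name here is hypothetical, not a Spark API.

# Items named in the exceptions list that stay Experimental regardless of
# their Since tag. "Model Summary classes" and "Evaluator and subclasses"
# are categories rather than single names, so this sketch does not model them.
NAMED_EXCEPTIONS = frozenset({
    "MLWriter", "MLReader", "MLWritable", "MLReadable",
    "RFormula", "AFTSurvivalRegression", "MultilayerPerceptronClassifier",
})

MLLIB_PACKAGES = ("spark.mllib", "pyspark.mllib")


def major_minor(since):
    """Turn a Since tag such as '1.6.0' into a comparable (major, minor) tuple."""
    parts = since.split(".")
    return int(parts[0]), int(parts[1])


def keep_experimental(package, since, name=""):
    """Return True if the Experimental annotation stays under the rules above."""
    if package in MLLIB_PACKAGES:
        # spark.mllib, pyspark.mllib: remove all Experimental annotations.
        return False
    if name in NAMED_EXCEPTIONS:
        # Explicitly listed exceptions stay Experimental.
        return True
    # spark.ml, pyspark.ml: Since <= 1.6 loses the annotation, Since 2.0 keeps it.
    return major_minor(since) >= (2, 0)
```

Under this sketch, `keep_experimental("spark.ml", "1.6.0")` is false while `keep_experimental("spark.ml", "1.6.0", "RFormula")` is true, which matches the exception for RFormula.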

DeveloperApi changes:
* ml.tree.Node, ml.tree.Split, and subclasses should no longer be DeveloperApi

## How was this patch tested?

N/A

Note to reviewers:
* spark.ml.clustering.LDA underwent significant changes (additional methods), 
so let me know if you want me to leave it Experimental.
* Be careful to check for cases where a class should no longer be Experimental 
but has an Experimental method, val, or other feature.  I did not find such 
cases, but please verify.

Author: Joseph K. Bradley 

Closes #14147 from jkbradley/experimental-audit.

(cherry picked from commit 01f09b161217193b797c8c85969d17054c958615)
Signed-off-by: Joseph K. Bradley 


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/2e97f3a0
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/2e97f3a0
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/2e97f3a0

Branch: refs/heads/branch-2.0
Commit: 2e97f3a08e3b48ce8ad0d669ef844210d0a3d2be
Parents: 90f0e81
Author: Joseph K. Bradley 
Authored: Wed Jul 13 12:33:39 2016 -0700
Committer: Joseph K. Bradley 
Committed: Wed Jul 13 12:34:15 2016 -0700

--
 .../scala/org/apache/spark/ml/Pipeline.scala    |  6 +-
 .../classification/DecisionTreeClassifier.scala |  7 +--
 .../spark/ml/classification/GBTClassifier.scala |  7 +--
 .../ml/classification/LogisticRegression.scala  |  4 --
 .../spark/ml/classification/NaiveBayes.scala    |  6 +-
 .../spark/ml/classification/OneVsRest.scala     |  7 +--
 .../classification/RandomForestClassifier.scala |  7 +--
 .../org/apache/spark/ml/feature/Binarizer.scala |  4 +-
 .../apache/spark/ml/feature/Bucketizer.scala    |  4 +-
 .../apache/spark/ml/feature/ChiSqSelector.scala |  6 +-
 .../spark/ml/feature/CountVectorizer.scala      |  6 +-
 .../scala/org/apache/spark/ml/feature/DCT.scala |  4 +-
 .../spark/ml/feature/ElementwiseProduct.scala   |  4 +-
 .../org/apache/spark/ml/feature/HashingTF.scala |  4 +-
 .../scala/org/apache/spark/ml/feature/IDF.scala |  6 +-
 .../apache/spark/ml/feature/Interaction.scala   |  4 +-
 .../apache/spark/ml/feature/LabeledPoint.scala  |  2 +
 .../apache/spark/ml/feature/MinMaxScaler.scala  |  6 +-
 .../org/apache/spark/ml/feature/NGram.scala     |  4 +-
 .../apache/spark/ml/feature/Normalizer.scala    |  4 +-
 .../apache/spark/ml/feature/OneHotEncoder.scala |  4 +-
 .../scala/org/apache/spark/ml/feature/PCA.scala |  7 +--
 .../spark/ml/feature/PolynomialExpansion.scala  |  4 +-
 .../spark/ml/feature/QuantileDiscretizer.scala  |  4 +-
 .../spark/ml/feature/SQLTransformer.scala       |  4 +-
 .../spark/ml/feature/StandardScaler.scala       |  6 +-
 .../spark/ml/feature/StopWordsRemover.scala     |  4 +-
 .../apache/spark/ml/feature/StringIndexer.scala |  8 +--
 .../org/apache/spark/ml/feature/Tokenizer.scala |  6 +-
 .../spark/ml/feature/VectorAssembler.scala      |  4 +-
 .../apache/spark/ml/feature/VectorIndexer.scala |  6 +-
 .../apache/spark/ml/feature/VectorSlicer.scala  |  4 +-
 .../org/apache/spark/ml/feature/Word2Vec.scala  |  7 +--
 .../org/apache/spark/ml/param/params.scala      |  9 +--
 .../apache/spark/ml/recommendation/ALS.scala    |  8 +--
 .../ml/regression/DecisionTreeRegressor.scala   |  7 +--
 .../spark/ml/regression/GBTRegressor.scala      |  6 --
 .../ml/regression/IsotonicRegression.scala      |  6 +-
 .../spark/ml/regression/LinearRegression.scala  |  4 --
 .../ml/regression/RandomForestRegressor.scala   |  7 +--
 .../scala/org/apache/spark/ml/tree/Node.scala   | 10 +--
 .../scala/org/apache/spark/ml/tree/Split.scala  |  8 +--
 .../apache/spark/ml/tuning/CrossValidator.scala |  6 +-
 .../spark/ml/tuning/ParamGridBuilder.scala      |  4 +-
 

[2/2] spark git commit: [SPARK-14812][ML][MLLIB][PYTHON] Experimental, DeveloperApi annotation audit for ML

2016-07-13 Thread jkbradley
[SPARK-14812][ML][MLLIB][PYTHON] Experimental, DeveloperApi annotation audit 
for ML

## What changes were proposed in this pull request?

General decisions to follow, except where noted:
* spark.mllib, pyspark.mllib: Remove all Experimental annotations.  Leave 
DeveloperApi annotations alone.
* spark.ml, pyspark.ml
** Annotate Estimator-Model pairs of classes and companion objects the same way.
** For all algorithms marked Experimental with Since tag <= 1.6, remove 
Experimental annotation.
** For all algorithms marked Experimental with Since tag = 2.0, leave 
Experimental annotation.
* DeveloperApi annotations are left alone, except where noted.
* No changes to which types are sealed.

Exceptions where I am leaving items Experimental in spark.ml, pyspark.ml, 
mainly because the items are new:
* Model Summary classes
* MLWriter, MLReader, MLWritable, MLReadable
* Evaluator and subclasses: There is discussion of changes around evaluating 
multiple metrics at once for efficiency.
* RFormula: Its behavior may need to change slightly to match R in edge cases.
* AFTSurvivalRegression
* MultilayerPerceptronClassifier

DeveloperApi changes:
* ml.tree.Node, ml.tree.Split, and subclasses should no longer be DeveloperApi

## How was this patch tested?

N/A

Note to reviewers:
* spark.ml.clustering.LDA underwent significant changes (additional methods), 
so let me know if you want me to leave it Experimental.
* Be careful to check for cases where a class should no longer be Experimental 
but has an Experimental method, val, or other feature.  I did not find such 
cases, but please verify.

Author: Joseph K. Bradley 

Closes #14147 from jkbradley/experimental-audit.


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/01f09b16
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/01f09b16
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/01f09b16

Branch: refs/heads/master
Commit: 01f09b161217193b797c8c85969d17054c958615
Parents: d8220c1
Author: Joseph K. Bradley 
Authored: Wed Jul 13 12:33:39 2016 -0700
Committer: Joseph K. Bradley 
Committed: Wed Jul 13 12:33:39 2016 -0700

--
 .../scala/org/apache/spark/ml/Pipeline.scala    |  6 +-
 .../classification/DecisionTreeClassifier.scala |  7 +--
 .../spark/ml/classification/GBTClassifier.scala |  7 +--
 .../ml/classification/LogisticRegression.scala  |  4 --
 .../spark/ml/classification/NaiveBayes.scala    |  6 +-
 .../spark/ml/classification/OneVsRest.scala     |  7 +--
 .../classification/RandomForestClassifier.scala |  7 +--
 .../org/apache/spark/ml/feature/Binarizer.scala |  4 +-
 .../apache/spark/ml/feature/Bucketizer.scala    |  4 +-
 .../apache/spark/ml/feature/ChiSqSelector.scala |  6 +-
 .../spark/ml/feature/CountVectorizer.scala      |  6 +-
 .../scala/org/apache/spark/ml/feature/DCT.scala |  4 +-
 .../spark/ml/feature/ElementwiseProduct.scala   |  4 +-
 .../org/apache/spark/ml/feature/HashingTF.scala |  4 +-
 .../scala/org/apache/spark/ml/feature/IDF.scala |  6 +-
 .../apache/spark/ml/feature/Interaction.scala   |  4 +-
 .../apache/spark/ml/feature/LabeledPoint.scala  |  2 +
 .../apache/spark/ml/feature/MinMaxScaler.scala  |  6 +-
 .../org/apache/spark/ml/feature/NGram.scala     |  4 +-
 .../apache/spark/ml/feature/Normalizer.scala    |  4 +-
 .../apache/spark/ml/feature/OneHotEncoder.scala |  4 +-
 .../scala/org/apache/spark/ml/feature/PCA.scala |  7 +--
 .../spark/ml/feature/PolynomialExpansion.scala  |  4 +-
 .../spark/ml/feature/QuantileDiscretizer.scala  |  4 +-
 .../spark/ml/feature/SQLTransformer.scala       |  4 +-
 .../spark/ml/feature/StandardScaler.scala       |  6 +-
 .../spark/ml/feature/StopWordsRemover.scala     |  4 +-
 .../apache/spark/ml/feature/StringIndexer.scala |  8 +--
 .../org/apache/spark/ml/feature/Tokenizer.scala |  6 +-
 .../spark/ml/feature/VectorAssembler.scala      |  4 +-
 .../apache/spark/ml/feature/VectorIndexer.scala |  6 +-
 .../apache/spark/ml/feature/VectorSlicer.scala  |  4 +-
 .../org/apache/spark/ml/feature/Word2Vec.scala  |  7 +--
 .../org/apache/spark/ml/param/params.scala      |  9 +--
 .../apache/spark/ml/recommendation/ALS.scala    |  8 +--
 .../ml/regression/DecisionTreeRegressor.scala   |  7 +--
 .../spark/ml/regression/GBTRegressor.scala      |  6 --
 .../ml/regression/IsotonicRegression.scala      |  6 +-
 .../spark/ml/regression/LinearRegression.scala  |  4 --
 .../ml/regression/RandomForestRegressor.scala   |  7 +--
 .../scala/org/apache/spark/ml/tree/Node.scala   | 10 +--
 .../scala/org/apache/spark/ml/tree/Split.scala  |  8 +--
 .../apache/spark/ml/tuning/CrossValidator.scala |  6 +-
 .../spark/ml/tuning/ParamGridBuilder.scala      |  4 +-
 .../spark/ml/tuning/TrainValidationSplit.scala  |  6 +-
 .../mllib/clustering/BisectingKMeans.scala      |  8 +--