[jira] [Commented] (SPARK-18319) ML, Graph 2.1 QA: API: Experimental, DeveloperApi, final, sealed audit

2016-12-07 Thread Nick Pentreath (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-18319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15731413#comment-15731413
 ] 

Nick Pentreath commented on SPARK-18319:


Went ahead and re-marked fix version to {{2.1.0}} since RC2 has been cut.

> ML, Graph 2.1 QA: API: Experimental, DeveloperApi, final, sealed audit
> --
>
> Key: SPARK-18319
> URL: https://issues.apache.org/jira/browse/SPARK-18319
> Project: Spark
> Issue Type: Sub-task
> Components: Documentation, GraphX, ML, MLlib
> Reporter: Joseph K. Bradley
> Assignee: yuhao yang
> Priority: Blocker
> Fix For: 2.1.0
>
>
> We should make a pass through the items marked as Experimental or 
> DeveloperApi and see if any are stable enough to be unmarked.
> We should also check for items marked final or sealed to see if they are 
> stable enough to be opened up as APIs.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-18319) ML, Graph 2.1 QA: API: Experimental, DeveloperApi, final, sealed audit

2016-11-21 Thread Apache Spark (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-18319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15685345#comment-15685345
 ] 

Apache Spark commented on SPARK-18319:
--

User 'hhbyyh' has created a pull request for this issue:
https://github.com/apache/spark/pull/15972




[jira] [Commented] (SPARK-18319) ML, Graph 2.1 QA: API: Experimental, DeveloperApi, final, sealed audit

2016-11-18 Thread Joseph K. Bradley (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-18319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15678230#comment-15678230
 ] 

Joseph K. Bradley commented on SPARK-18319:
---

I'd prefer not to open up Vector and Matrix.  There have been many people 
asking for improved linear algebra functionality, and if we unseal those, then 
we can never add functionality to those types.

I'm OK with opening up LDAModel and LDAOptimizer.  I would say no if 
LDAOptimizer were in spark.ml, but I'm OK with opening it in spark.mllib.
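
The concern about unsealing can be illustrated with a small sketch. It is written in Python rather than Scala, with abstract base classes standing in for an unsealed trait, and every name in it (VectorV1, VectorV2, third_party_hash_vector, norm1) is hypothetical rather than part of Spark's API. It assumes a third party has subclassed the open type and that a later release then adds a new required member.

```python
from abc import ABC, abstractmethod


class VectorV1(ABC):
    """Hypothetical 'release 1' of an open (unsealed) vector type."""

    @abstractmethod
    def size(self): ...

    @abstractmethod
    def apply(self, i): ...


class VectorV2(ABC):
    """Hypothetical 'release 2': the library adds a new required member."""

    @abstractmethod
    def size(self): ...

    @abstractmethod
    def apply(self, i): ...

    @abstractmethod
    def norm1(self): ...  # new functionality added after release 1


def third_party_hash_vector(base):
    """Build an external subclass that only knows release 1's members."""

    class HashVector(base):
        def __init__(self, entries, n):
            self._entries = dict(entries)
            self._n = n

        def size(self):
            return self._n

        def apply(self, i):
            # Missing indices are treated as zeros (sparse storage).
            return self._entries.get(i, 0.0)

    return HashVector


# Against release 1 the external subclass works.
v = third_party_hash_vector(VectorV1)({0: 1.5, 3: -2.0}, 5)

# Against release 2 the same subclass can no longer be instantiated,
# because it does not implement the newly added member.
try:
    third_party_hash_vector(VectorV2)({0: 1.5}, 5)
    broke = False
except TypeError:
    broke = True
```

While a Scala type stays sealed, all of its direct subclasses live in the same source file as the type, so a new member and its implementations can be added together in one change.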




[jira] [Commented] (SPARK-18319) ML, Graph 2.1 QA: API: Experimental, DeveloperApi, final, sealed audit

2016-11-18 Thread Joseph K. Bradley (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-18319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15678231#comment-15678231
 ] 

Joseph K. Bradley commented on SPARK-18319:
---

Thanks [~yuhaoyan] for the audit!




[jira] [Commented] (SPARK-18319) ML, Graph 2.1 QA: API: Experimental, DeveloperApi, final, sealed audit

2016-11-18 Thread Joseph K. Bradley (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-18319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15678218#comment-15678218
 ] 

Joseph K. Bradley commented on SPARK-18319:
---

I agree with the "probably ready to be unmarked" items.

Among the items "to be discussed," I would recommend unmarking:
* LabeledPoint: simple & safe
* MLWriter/Reader (able): These are pretty thin APIs, so I think it's 
reasonable to unmark them.
* AFTSurvivalRegression (I have not heard complaints.  But I'm OK with leaving 
it Experimental too since there may be few users trying it out.)

I would prefer to leave the summaries and GeneralizedLinearRegression as 
Experimental since they have seen significant changes in 2.1.  I'm ambivalent 
about the Evaluators since they work reasonably well but do not fit some use 
cases (such as computing more stats at once, which several people have 
mentioned over time).




[jira] [Commented] (SPARK-18319) ML, Graph 2.1 QA: API: Experimental, DeveloperApi, final, sealed audit

2016-11-15 Thread yuhao yang (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-18319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15669027#comment-15669027
 ] 

yuhao yang commented on SPARK-18319:


Experimental classes in ml:

*probably ready to be unmarked:*
org.apache.spark.ml.classification.MultilayerPerceptronClassifier
org.apache.spark.ml.classification.MultilayerPerceptronClassificationModel
org.apache.spark.ml.clustering.GaussianMixtureModel, Model
org.apache.spark.ml.clustering.BisectingKMeans, Model
org.apache.spark.ml.clustering.KMeans, Model
org.apache.spark.ml.clustering.LDA, Model
org.apache.spark.ml.feature.MaxAbsScaler

*To be discussed: training summary, evaluator, new features, regression, 
Writer/Readers:*
org.apache.spark.ml.classification.BinaryLogisticRegressionTrainingSummary
org.apache.spark.ml.classification.BinaryLogisticRegressionSummary
org.apache.spark.ml.clustering.ClusteringSummary
org.apache.spark.ml.evaluation.BinaryClassificationEvaluator
org.apache.spark.ml.evaluation.MulticlassClassificationEvaluator
org.apache.spark.ml.evaluation.RegressionEvaluator
org.apache.spark.ml.feature.LabeledPoint
org.apache.spark.ml.feature.MinHashModel
org.apache.spark.ml.feature.RandomProjectionModel
org.apache.spark.ml.feature.RFormula
org.apache.spark.ml.regression.AFTSurvivalRegression
org.apache.spark.ml.regression.GeneralizedLinearRegression
org.apache.spark.ml.regression.LinearRegressionTrainingSummary
org.apache.spark.ml.util.MLWriter, Reader (able)
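
A list like the ones above can be started mechanically by searching the sources for the annotation. The sketch below is a rough, regex-based assumption and not the method used for this audit: it scans Scala source text for `@Experimental` followed by a class, trait, or object declaration. The function name find_experimental and the SAMPLE text are hypothetical; SAMPLE is a made-up snippet, not copied from Spark.

```python
import re

# An @Experimental annotation, optionally followed by further annotations
# and modifiers, and then the declaration it applies to.
PATTERN = re.compile(
    r"@Experimental\b"                # the marker annotation
    r"(?:\s*@\w+(?:\([^)]*\))?)*"     # further annotations, optional arguments
    r"\s*(?:(?:final|sealed|abstract|case|private\[\w+\])\s+)*"  # modifiers
    r"(class|trait|object)\s+(\w+)"   # declaration kind and name
)


def find_experimental(source):
    """Return (kind, name) pairs for declarations marked @Experimental."""
    return [(m.group(1), m.group(2)) for m in PATTERN.finditer(source)]


# Made-up Scala-like text for illustration only.
SAMPLE = '''
@Since("1.5.0")
@Experimental
class MultilayerPerceptronClassifier(override val uid: String)

@Since("2.0.0")
@Experimental
final class MaxAbsScaler(override val uid: String)

@Since("1.2.0")
class Pipeline(override val uid: String)

@Experimental
@Since("2.0.0")
sealed trait SomeSummary
'''

found = find_experimental(SAMPLE)
for kind, name in found:
    print(kind, name)
```

A regex pass like this only produces candidates. It misses annotations placed on methods or written with a fully qualified name, so the result still needs a manual review.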










[jira] [Commented] (SPARK-18319) ML, Graph 2.1 QA: API: Experimental, DeveloperApi, final, sealed audit

2016-11-15 Thread yuhao yang (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-18319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15668892#comment-15668892
 ] 

yuhao yang commented on SPARK-18319:


sealed: 
org.apache.spark.ml.linalg.Matrix
org.apache.spark.ml.linalg.Vector
org.apache.spark.ml.attribute.Attribute
org.apache.spark.ml.attribute.AttributeType
org.apache.spark.ml.classification.LogisticRegressionTrainingSummary
org.apache.spark.ml.classification.LogisticRegressionSummary
org.apache.spark.ml.clustering.LDAModel
org.apache.spark.ml.feature.Term
org.apache.spark.ml.feature.InteractableTerm
org.apache.spark.ml.optim.NormalEquationSolver
org.apache.spark.ml.tree.Node
org.apache.spark.ml.tree.Split
org.apache.spark.ml.util.BaseReadWrite
org.apache.spark.mllib.clustering.LDAOptimizer
org.apache.spark.mllib.linalg.Matrix
org.apache.spark.mllib.linalg.Vector
org.apache.spark.mllib.stat.test.StreamingTestMethod
org.apache.spark.mllib.tree.model.TreeEnsembleModel

I think Vector, Matrix, LDAModel, and LDAOptimizer can be opened up to allow 
extensions in Spark Packages, such as HashVector and Gibbs Sampling LDA.




[jira] [Commented] (SPARK-18319) ML, Graph 2.1 QA: API: Experimental, DeveloperApi, final, sealed audit

2016-11-15 Thread yuhao yang (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-18319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15668703#comment-15668703
 ] 

yuhao yang commented on SPARK-18319:


I'll start making a pass.
