[ https://issues.apache.org/jira/browse/SPARK-7443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14539258#comment-14539258 ]
Joseph K. Bradley commented on SPARK-7443:
------------------------------------------
Note: The Naive Bayes user guide section has already been updated for the
Bernoulli model.
> MLlib 1.4 QA plan
> -----------------
>
> Key: SPARK-7443
> URL: https://issues.apache.org/jira/browse/SPARK-7443
> Project: Spark
> Issue Type: Umbrella
> Components: ML, MLlib
> Affects Versions: 1.4.0
> Reporter: Xiangrui Meng
> Assignee: Joseph K. Bradley
> Priority: Critical
>
> TODO: create JIRAs for each task and assign them accordingly.
> h2. API
> * Check API compliance using java-compliance-checker (SPARK-7458)
> * Audit new public APIs (from the generated html doc)
> ** Scala (do not forget to check the object doc) (SPARK-7537)
> ** Java compatibility (SPARK-7529)
> ** Python API coverage (SPARK-7536)
> * Audit Pipeline APIs (SPARK-7535)
> * Graduate spark.ml from alpha
> ** Remove AlphaComponent annotations
> ** Remove MiMa excludes for spark.ml
> h2. Algorithms and performance
> *Performance*
> * _List any other missing performance tests from spark-perf here_
> * LDA online/EM (SPARK-7455)
> * ElasticNet for linear regression and logistic regression (SPARK-7456)
> * Bernoulli naive Bayes (SPARK-7453)
> * PIC (SPARK-7454)
> * ALS.recommendAll (SPARK-7457)
> * perf-tests in Python (SPARK-7539)
> *Correctness*
> * PMML
> ** scoring using PMML evaluator vs. MLlib models (SPARK-7540)
> * model save/load (SPARK-7541)
> h2. Documentation and example code
> * Create JIRAs for the user guide section for each new algorithm and assign
> them to the corresponding author. Link them here as "requires".
> ** Now that we have algorithms in spark.ml which are not in spark.mllib, we
> should start making subsections for the spark.ml API as needed. We can
> follow the structure of the spark.mllib user guide.
> *** The spark.ml user guide can provide: (a) code examples and (b) info on
> algorithms which do not exist in spark.mllib.
> *** The spark.ml guides should not duplicate info from the spark.mllib
> guide. Since spark.mllib is still the primary API, we should link to the
> corresponding algorithms in the spark.mllib user guide for more info.
> * Create example code for major components. Link them here as "requires".
> ** Cross validation in Python
> ** Pipeline with complex feature transformations (Scala/Java/Python)
> ** Elastic-net (possibly with cross validation)