[jira] [Commented] (SPARK-7443) MLlib 1.4 QA plan

Joseph K. Bradley (JIRA) Mon, 11 May 2015 22:16:49 -0700

    [ https://issues.apache.org/jira/browse/SPARK-7443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14539258#comment-14539258 ]


Joseph K. Bradley commented on SPARK-7443:
------------------------------------------

Note: The Naive Bayes user guide section has already been updated for the 
Bernoulli model.

> MLlib 1.4 QA plan
> -----------------
>
>                 Key: SPARK-7443
>                 URL: https://issues.apache.org/jira/browse/SPARK-7443
>             Project: Spark
>          Issue Type: Umbrella
>          Components: ML, MLlib
>    Affects Versions: 1.4.0
>            Reporter: Xiangrui Meng
>            Assignee: Joseph K. Bradley
>            Priority: Critical
>
> TODO: create JIRAs for each task and assign them accordingly.
> h2. API
> * Check API compliance using java-compliance-checker (SPARK-7458)
> * Audit new public APIs (from the generated html doc)
> ** Scala (do not forget to check the object doc) (SPARK-7537)
> ** Java compatibility (SPARK-7529)
> ** Python API coverage (SPARK-7536)
> * Audit Pipeline APIs (SPARK-7535)
> * Graduate spark.ml from alpha
> ** Remove AlphaComponent annotations
> ** Remove MiMa excludes for spark.ml
> h2. Algorithms and performance
> *Performance*
> * _List any other missing performance tests from spark-perf here_
> * LDA online/EM (SPARK-7455)
> * ElasticNet for linear regression and logistic regression (SPARK-7456)
> * Bernoulli naive Bayes (SPARK-7453)
> * PIC (SPARK-7454)
> * ALS.recommendAll (SPARK-7457)
> * perf-tests in Python (SPARK-7539)
> *Correctness*
> * PMML
> ** scoring using PMML evaluator vs. MLlib models (SPARK-7540)
> * model save/load (SPARK-7541)
> h2. Documentation and example code
> * Create JIRAs for the user guide to each new algorithm and assign them to 
> the corresponding author.  Link them here as "requires".
> ** Now that we have algorithms in spark.ml which are not in spark.mllib, we 
> should start making subsections for the spark.ml API as needed.  We can 
> follow the structure of the spark.mllib user guide.
> *** The spark.ml user guide can provide: (a) code examples and (b) info on 
> algorithms which do not exist in spark.mllib.
> *** We should not duplicate info in the spark.ml guides.  Since spark.mllib 
> is still the primary API, we should provide links to the corresponding 
> algorithms in the spark.mllib user guide for more info.
> * Create example code for major components.  Link them here as "requires".
> ** Cross-validation in Python
> ** Pipeline with complex feature transformations (Scala/Java/Python)
> ** Elastic-net (possibly with cross-validation)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)