[jira] [Commented] (SPARK-16872) Include Gaussian Naive Bayes Classifier

2016-08-05 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15408986#comment-15408986 ] Nick Pentreath commented on SPARK-16872: I think this would be good to add - though we would only

[jira] [Closed] (SPARK-3692) RBF Kernel implementation to SVM

2016-08-02 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3692?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath closed SPARK-3692. - Resolution: Duplicate > RBF Kernel implementation to SVM > > >

[jira] [Commented] (SPARK-16775) Reduce internal warnings from deprecated accumulator API

2016-08-01 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15402049#comment-15402049 ] Nick Pentreath commented on SPARK-16775: Ah I misunderstood that _all_ warnings would be worked

[jira] [Commented] (SPARK-16728) migrate internal API for MLlib trees from spark.mllib to spark.ml

2016-08-01 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16728?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15401639#comment-15401639 ] Nick Pentreath commented on SPARK-16728: cc [~sethah] > migrate internal API for MLlib trees

[jira] [Commented] (SPARK-16765) Add Pipeline API example for KMeans

2016-08-01 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15401637#comment-15401637 ] Nick Pentreath commented on SPARK-16765: I think the example is somewhat redundant. The idea with

[jira] [Closed] (SPARK-16765) Add Pipeline API example for KMeans

2016-08-01 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16765?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath closed SPARK-16765. -- Resolution: Won't Fix > Add Pipeline API example for KMeans >

[jira] [Comment Edited] (SPARK-16765) Add Pipeline API example for KMeans

2016-08-01 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15401637#comment-15401637 ] Nick Pentreath edited comment on SPARK-16765 at 8/1/16 7:21 AM: I think

[jira] [Commented] (SPARK-16775) Reduce internal warnings from deprecated accumulator API

2016-08-01 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15401631#comment-15401631 ] Nick Pentreath commented on SPARK-16775: Is the best solution not to replace all usages of the

[jira] [Updated] (SPARK-15254) Improve ML pipeline Cross Validation Scaladoc & PyDoc

2016-07-27 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15254?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath updated SPARK-15254: --- Assignee: Krishna Kalyan > Improve ML pipeline Cross Validation Scaladoc & PyDoc >

[jira] [Resolved] (SPARK-15254) Improve ML pipeline Cross Validation Scaladoc & PyDoc

2016-07-27 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15254?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath resolved SPARK-15254. Resolution: Fixed Fix Version/s: 2.1.0 Issue resolved by pull request 13894

[jira] [Assigned] (SPARK-14489) RegressionEvaluator returns NaN for ALS in Spark ml

2016-07-27 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14489?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath reassigned SPARK-14489: -- Assignee: Nick Pentreath > RegressionEvaluator returns NaN for ALS in Spark ml >

[jira] [Comment Edited] (SPARK-14489) RegressionEvaluator returns NaN for ALS in Spark ml

2016-07-27 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15395275#comment-15395275 ] Nick Pentreath edited comment on SPARK-14489 at 7/27/16 9:03 AM: - Thanks

[jira] [Commented] (SPARK-14489) RegressionEvaluator returns NaN for ALS in Spark ml

2016-07-27 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15395275#comment-15395275 ] Nick Pentreath commented on SPARK-14489: Thanks for the thoughts Krishna. # Initially I also

[jira] [Commented] (SPARK-16365) Ideas for moving "mllib-local" forward

2016-07-18 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15382193#comment-15382193 ] Nick Pentreath commented on SPARK-16365: The idea is more like [~rnowling] said - use Spark to

[jira] [Commented] (SPARK-16365) Ideas for moving "mllib-local" forward

2016-07-18 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15382178#comment-15382178 ] Nick Pentreath commented on SPARK-16365: Fair point Sean about the lists. However, I've seen many

[jira] [Commented] (SPARK-16421) Improve output from ML examples

2016-07-18 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15382155#comment-15382155 ] Nick Pentreath commented on SPARK-16421: You should be good to go? > Improve output from ML

[jira] [Commented] (SPARK-16495) Add ADMM optimizer in mllib package

2016-07-18 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15382148#comment-15382148 ] Nick Pentreath commented on SPARK-16495: I think most people would agree ADMM is an interesting

[jira] [Commented] (SPARK-14464) Logistic regression performs poorly for very large vectors, even when the number of non-zero features is small

2016-07-18 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14464?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15382142#comment-15382142 ] Nick Pentreath commented on SPARK-14464: [~daniel.siegmann.aol] would you mind trying out the

[jira] [Commented] (SPARK-14813) ML 2.0 QA: API: Python API coverage

2016-07-11 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14813?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15370526#comment-15370526 ] Nick Pentreath commented on SPARK-14813: [~holdenk] [~srowen] [~yanboliang] [~josephkb] I'd say

[jira] [Commented] (SPARK-15581) MLlib 2.1 Roadmap

2016-07-08 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15368318#comment-15368318 ] Nick Pentreath commented on SPARK-15581: I think it would be pretty interesting to explore a

[jira] [Commented] (SPARK-16365) Ideas for moving "mllib-local" forward

2016-07-08 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15368310#comment-15368310 ] Nick Pentreath commented on SPARK-16365: Good question - and part of the reason for getting

[jira] [Commented] (SPARK-15790) Audit @Since annotations in ML

2016-07-05 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15363017#comment-15363017 ] Nick Pentreath commented on SPARK-15790: Yeah all the {{ml.feature}} stuff is done. There might

[jira] [Resolved] (SPARK-14811) ML, Graph 2.0 QA: API: New Scala APIs, docs

2016-07-05 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14811?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath resolved SPARK-14811. Resolution: Fixed Fix Version/s: 2.0.0 > ML, Graph 2.0 QA: API: New Scala APIs,

[jira] [Comment Edited] (SPARK-14812) ML, Graph 2.0 QA: API: Experimental, DeveloperApi, final, sealed audit

2016-07-05 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15362200#comment-15362200 ] Nick Pentreath edited comment on SPARK-14812 at 7/5/16 8:22 AM:

[jira] [Commented] (SPARK-14812) ML, Graph 2.0 QA: API: Experimental, DeveloperApi, final, sealed audit

2016-07-05 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15362200#comment-15362200 ] Nick Pentreath commented on SPARK-14812: [~josephkb] are we going to "graduate" anything for

[jira] [Commented] (SPARK-15163) Mark experimental algorithms experimental in PySpark

2016-07-05 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15163?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15362196#comment-15362196 ] Nick Pentreath commented on SPARK-15163: [~holdenk] are there outstanding tasks for this? > Mark

[jira] [Commented] (SPARK-14815) ML, Graph, R 2.0 QA: Update user guide for new features & APIs

2016-07-05 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15362195#comment-15362195 ] Nick Pentreath commented on SPARK-14815: Ok I resolved for {{2.0.0}} since we haven't yet cut

[jira] [Resolved] (SPARK-14815) ML, Graph, R 2.0 QA: Update user guide for new features & APIs

2016-07-05 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14815?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath resolved SPARK-14815. Resolution: Fixed Fix Version/s: 2.0.0 > ML, Graph, R 2.0 QA: Update user guide for

[jira] [Comment Edited] (SPARK-14810) ML, Graph 2.0 QA: API: Binary incompatible changes

2016-07-04 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14810?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15293316#comment-15293316 ] Nick Pentreath edited comment on SPARK-14810 at 7/4/16 2:31 PM: List of

[jira] [Resolved] (SPARK-14810) ML, Graph 2.0 QA: API: Binary incompatible changes

2016-07-04 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14810?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath resolved SPARK-14810. Resolution: Fixed > ML, Graph 2.0 QA: API: Binary incompatible changes >

[jira] [Commented] (SPARK-14815) ML, Graph, R 2.0 QA: Update user guide for new features & APIs

2016-07-04 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15361333#comment-15361333 ] Nick Pentreath commented on SPARK-14815: [~yuhaoyan] is this done now? > ML, Graph, R 2.0 QA:

[jira] [Commented] (SPARK-13448) Document MLlib behavior changes in Spark 2.0

2016-07-04 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15361326#comment-15361326 ] Nick Pentreath commented on SPARK-13448: [~yanboliang] is this done now with the PRs for

[jira] [Commented] (SPARK-13944) Separate out local linear algebra as a standalone module without Spark dependency

2016-07-04 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13944?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15361316#comment-15361316 ] Nick Pentreath commented on SPARK-13944: FYI, I created a sort of "follow up" at SPARK-16365, to

[jira] [Created] (SPARK-16365) Ideas for moving "mllib-local" forward

2016-07-04 Thread Nick Pentreath (JIRA)
Nick Pentreath created SPARK-16365: -- Summary: Ideas for moving "mllib-local" forward Key: SPARK-16365 URL: https://issues.apache.org/jira/browse/SPARK-16365 Project: Spark Issue Type:

[jira] [Commented] (SPARK-15575) Remove breeze from dependencies?

2016-07-01 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15358831#comment-15358831 ] Nick Pentreath commented on SPARK-15575: Also related to the discussion on SPARK-15581 and here,

[jira] [Commented] (SPARK-15581) MLlib 2.1 Roadmap

2016-07-01 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15358821#comment-15358821 ] Nick Pentreath commented on SPARK-15581: Could we move the discussion for Breeze specifically

[jira] [Commented] (SPARK-15575) Remove breeze from dependencies?

2016-07-01 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15358820#comment-15358820 ] Nick Pentreath commented on SPARK-15575: So it seems from the various discussions on SPARK-15581,

[jira] [Comment Edited] (SPARK-15944) Make spark.ml package backward compatible with spark.mllib vectors

2016-06-30 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15944?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15356944#comment-15356944 ] Nick Pentreath edited comment on SPARK-15944 at 6/30/16 11:46 AM: -- As I

[jira] [Commented] (SPARK-15944) Make spark.ml package backward compatible with spark.mllib vectors

2016-06-30 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15944?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15356944#comment-15356944 ] Nick Pentreath commented on SPARK-15944: As I commented on [PR

[jira] [Created] (SPARK-16328) Implement conversion utility functions for single instances in Python

2016-06-30 Thread Nick Pentreath (JIRA)
Nick Pentreath created SPARK-16328: -- Summary: Implement conversion utility functions for single instances in Python Key: SPARK-16328 URL: https://issues.apache.org/jira/browse/SPARK-16328 Project:

[jira] [Resolved] (SPARK-16261) Fix Incorrect appNames in PySpark ML Examples

2016-06-29 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16261?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath resolved SPARK-16261. Resolution: Fixed Fix Version/s: 2.1.0 2.0.1 > Fix Incorrect

[jira] [Comment Edited] (SPARK-15944) Make spark.ml package backward compatible with spark.mllib vectors

2016-06-29 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15944?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15355014#comment-15355014 ] Nick Pentreath edited comment on SPARK-15944 at 6/29/16 11:12 AM: -- Just

[jira] [Commented] (SPARK-15944) Make spark.ml package backward compatible with spark.mllib vectors

2016-06-29 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15944?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15355014#comment-15355014 ] Nick Pentreath commented on SPARK-15944: Just checking - we don't have {{asML}} / {{fromML}} for

[jira] [Commented] (SPARK-16149) API consistency discussion: CountVectorizer.{minDF -> minDocFreq, minTF -> minTermFreq}

2016-06-24 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15348173#comment-15348173 ] Nick Pentreath commented on SPARK-16149: I'd generally vote for: * if it's a new param / model,

[jira] [Resolved] (SPARK-15997) Audit ml.feature Update documentation for ml feature transformers

2016-06-24 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15997?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath resolved SPARK-15997. Resolution: Fixed Fix Version/s: 2.0.1 Issue resolved by pull request 13745

[jira] [Resolved] (SPARK-15164) Mark classification algorithms as experimental where marked so in scala

2016-06-22 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15164?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath resolved SPARK-15164. Resolution: Fixed Fix Version/s: 2.0.0 Issue resolved by pull request 12938

[jira] [Resolved] (SPARK-15162) Update PySpark LogisticRegression threshold PyDoc to be as complete as Scaladoc

2016-06-22 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15162?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath resolved SPARK-15162. Resolution: Fixed Fix Version/s: 2.0.0 Issue resolved by pull request 12938

[jira] [Comment Edited] (SPARK-14810) ML, Graph 2.0 QA: API: Binary incompatible changes

2016-06-22 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14810?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15293316#comment-15293316 ] Nick Pentreath edited comment on SPARK-14810 at 6/22/16 9:17 AM: - List of

[jira] [Updated] (SPARK-16127) Audit @Since annotations related to ml.linalg

2016-06-22 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16127?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath updated SPARK-16127: --- Description: SPARK-14615 converted {{spark.ml}} to use the new {{Vector}}/{{Matrix}} classes

[jira] [Assigned] (SPARK-16127) Audit @Since annotations related to ml.linalg

2016-06-22 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16127?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath reassigned SPARK-16127: -- Assignee: Nick Pentreath > Audit @Since annotations related to ml.linalg >

[jira] [Created] (SPARK-16127) Audit @Since annotations related to ml.linalg

2016-06-22 Thread Nick Pentreath (JIRA)
Nick Pentreath created SPARK-16127: -- Summary: Audit @Since annotations related to ml.linalg Key: SPARK-16127 URL: https://issues.apache.org/jira/browse/SPARK-16127 Project: Spark Issue

[jira] [Commented] (SPARK-16075) Make VectorUDT/MatrixUDT singleton under spark.ml package

2016-06-21 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16075?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15341506#comment-15341506 ] Nick Pentreath commented on SPARK-16075: [~wangmiao1981] SPARK-15746 will probably be superseded

[jira] [Updated] (SPARK-16063) Add storageLevel to Dataset

2016-06-21 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16063?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath updated SPARK-16063: --- Summary: Add storageLevel to Dataset (was: Add getStorageLevel to Dataset) > Add

[jira] [Updated] (SPARK-16063) Add storageLevel to Dataset

2016-06-21 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16063?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath updated SPARK-16063: --- Description: SPARK-11905 added {{cache}}/{{persist}} to {{Dataset}}. We should add

[jira] [Updated] (SPARK-10258) Add @Since annotation to ml.feature

2016-06-21 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10258?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath updated SPARK-10258: --- Assignee: Martin Brown (was: Nick Pentreath) > Add @Since annotation to ml.feature >

[jira] [Assigned] (SPARK-10258) Add @Since annotation to ml.feature

2016-06-21 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10258?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath reassigned SPARK-10258: -- Assignee: Nick Pentreath (was: Martin Brown) > Add @Since annotation to ml.feature >

[jira] [Comment Edited] (SPARK-14810) ML, Graph 2.0 QA: API: Binary incompatible changes

2016-06-20 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14810?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15293316#comment-15293316 ] Nick Pentreath edited comment on SPARK-14810 at 6/20/16 8:17 AM: - List of

[jira] [Comment Edited] (SPARK-14810) ML, Graph 2.0 QA: API: Binary incompatible changes

2016-06-20 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14810?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15293316#comment-15293316 ] Nick Pentreath edited comment on SPARK-14810 at 6/20/16 8:16 AM: - List of

[jira] [Created] (SPARK-16063) Add getStorageLevel to Dataset

2016-06-20 Thread Nick Pentreath (JIRA)
Nick Pentreath created SPARK-16063: -- Summary: Add getStorageLevel to Dataset Key: SPARK-16063 URL: https://issues.apache.org/jira/browse/SPARK-16063 Project: Spark Issue Type: Improvement

[jira] [Commented] (SPARK-15501) ML 2.0 QA: Scala APIs audit for recommendation

2016-06-17 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15501?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15335961#comment-15335961 ] Nick Pentreath commented on SPARK-15501: It's done - resolved it. > ML 2.0 QA: Scala APIs audit

[jira] [Resolved] (SPARK-15501) ML 2.0 QA: Scala APIs audit for recommendation

2016-06-17 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15501?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath resolved SPARK-15501. Resolution: Fixed Fix Version/s: 2.0.0 > ML 2.0 QA: Scala APIs audit for

[jira] [Resolved] (SPARK-15447) Performance test for ALS in Spark 2.0

2016-06-17 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15447?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath resolved SPARK-15447. Resolution: Fixed Fix Version/s: 2.0.0 > Performance test for ALS in Spark 2.0 >

[jira] [Commented] (SPARK-15447) Performance test for ALS in Spark 2.0

2016-06-17 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15335956#comment-15335956 ] Nick Pentreath commented on SPARK-15447: Finalized results in the linked Google sheet. Also

[jira] [Updated] (SPARK-15447) Performance test for ALS in Spark 2.0

2016-06-17 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15447?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath updated SPARK-15447: --- Description: We made several changes to ALS in 2.0. It is necessary to run some tests to

[jira] [Commented] (SPARK-15995) Gradient Boosted Trees - handling of Categorical Inputs

2016-06-17 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15335801#comment-15335801 ] Nick Pentreath commented on SPARK-15995: cc [~sethah] > Gradient Boosted Trees - handling of

[jira] [Updated] (SPARK-16008) ML Logistic Regression aggregator serializes unnecessary data

2016-06-17 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16008?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath updated SPARK-16008: --- Assignee: Seth Hendrickson > ML Logistic Regression aggregator serializes unnecessary data >

[jira] [Updated] (SPARK-15997) Audit ml.feature Update documentation for ml feature transformers

2016-06-17 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15997?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath updated SPARK-15997: --- Assignee: Gayathri Murali > Audit ml.feature Update documentation for ml feature

[jira] [Commented] (SPARK-15447) Performance test for ALS in Spark 2.0

2016-06-16 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1530#comment-1530 ] Nick Pentreath commented on SPARK-15447: Almost there - I'll be able to close this off by Friday

[jira] [Commented] (SPARK-15746) SchemaUtils.checkColumnType with VectorUDT prints instance details in error message

2016-06-13 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15327237#comment-15327237 ] Nick Pentreath commented on SPARK-15746: I think you can go ahead now - I also vote for the

[jira] [Commented] (SPARK-15904) High Memory Pressure using MLlib K-means

2016-06-13 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15904?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15327220#comment-15327220 ] Nick Pentreath commented on SPARK-15904: Could you explain why you're using K>3000 when your

[jira] [Commented] (SPARK-15790) Audit @Since annotations in ML

2016-06-13 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15327193#comment-15327193 ] Nick Pentreath commented on SPARK-15790: Yes, I've just looked at things in the concrete classes

[jira] [Commented] (SPARK-15790) Audit @Since annotations in ML

2016-06-13 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15327028#comment-15327028 ] Nick Pentreath commented on SPARK-15790: Ah thanks - missed that umbrella. It's actually really

[jira] [Resolved] (SPARK-15788) PySpark IDFModel missing "idf" property

2016-06-09 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15788?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath resolved SPARK-15788. Resolution: Fixed Fix Version/s: 2.0.0 Issue resolved by pull request 13540

[jira] [Created] (SPARK-15790) Audit @Since annotations in ML

2016-06-06 Thread Nick Pentreath (JIRA)
Nick Pentreath created SPARK-15790: -- Summary: Audit @Since annotations in ML Key: SPARK-15790 URL: https://issues.apache.org/jira/browse/SPARK-15790 Project: Spark Issue Type: Documentation

[jira] [Created] (SPARK-15788) PySpark IDFModel missing "idf" property

2016-06-06 Thread Nick Pentreath (JIRA)
Nick Pentreath created SPARK-15788: -- Summary: PySpark IDFModel missing "idf" property Key: SPARK-15788 URL: https://issues.apache.org/jira/browse/SPARK-15788 Project: Spark Issue Type:

[jira] [Updated] (SPARK-15761) pyspark shell should load if PYSPARK_DRIVER_PYTHON is ipython an Python3

2016-06-03 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15761?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath updated SPARK-15761: --- Assignee: Manoj Kumar > pyspark shell should load if PYSPARK_DRIVER_PYTHON is ipython an

[jira] [Resolved] (SPARK-15168) Add missing params to Python's MultilayerPerceptronClassifier

2016-06-03 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15168?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath resolved SPARK-15168. Resolution: Fixed Fix Version/s: 2.0.0 Issue resolved by pull request 12943

[jira] [Updated] (SPARK-15168) Add missing params to Python's MultilayerPerceptronClassifier

2016-06-03 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15168?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath updated SPARK-15168: --- Assignee: holdenk > Add missing params to Python's MultilayerPerceptronClassifier >

[jira] [Commented] (SPARK-15746) SchemaUtils.checkColumnType with VectorUDT prints instance details in error message

2016-06-03 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15314627#comment-15314627 ] Nick Pentreath commented on SPARK-15746: I'd say hold off on working on it until we decide which

[jira] [Commented] (SPARK-14811) ML, Graph 2.0 QA: API: New Scala APIs, docs

2016-06-03 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14811?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15314489#comment-15314489 ] Nick Pentreath commented on SPARK-14811: Yes, that does make sense. I will take a pass through

[jira] [Comment Edited] (SPARK-15447) Performance test for ALS in Spark 2.0

2016-06-03 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15314441#comment-15314441 ] Nick Pentreath edited comment on SPARK-15447 at 6/3/16 5:22 PM: Added a

[jira] [Commented] (SPARK-15447) Performance test for ALS in Spark 2.0

2016-06-03 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15314441#comment-15314441 ] Nick Pentreath commented on SPARK-15447: Added a second tab to the sheet for testing DF-based API

[jira] [Updated] (SPARK-15746) SchemaUtils.checkColumnType with VectorUDT prints instance details in error message

2016-06-02 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15746?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath updated SPARK-15746: --- Summary: SchemaUtils.checkColumnType with VectorUDT prints instance details in error message

[jira] [Created] (SPARK-15746) SchemaUtils.checkColumnType with VectorUDT prints instance details

2016-06-02 Thread Nick Pentreath (JIRA)
Nick Pentreath created SPARK-15746: -- Summary: SchemaUtils.checkColumnType with VectorUDT prints instance details Key: SPARK-15746 URL: https://issues.apache.org/jira/browse/SPARK-15746 Project:

[jira] [Resolved] (SPARK-15668) ml.feature: update check schema to avoid confusion when user use MLlib.vector as input type

2016-06-02 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15668?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath resolved SPARK-15668. Resolution: Fixed Fix Version/s: 2.0.0 Issue resolved by pull request 13411

[jira] [Updated] (SPARK-15139) PySpark TreeEnsemble missing methods

2016-06-02 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath updated SPARK-15139: --- Assignee: holdenk > PySpark TreeEnsemble missing methods >

[jira] [Resolved] (SPARK-15139) PySpark TreeEnsemble missing methods

2016-06-02 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath resolved SPARK-15139. Resolution: Fixed Fix Version/s: 2.0.0 Issue resolved by pull request 12919

[jira] [Resolved] (SPARK-15092) toDebugString missing from ML DecisionTreeClassifier

2016-06-02 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15092?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath resolved SPARK-15092. Resolution: Fixed Fix Version/s: 2.0.0 Issue resolved by pull request 12919

[jira] [Comment Edited] (SPARK-14811) ML, Graph 2.0 QA: API: New Scala APIs, docs

2016-06-02 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14811?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15313208#comment-15313208 ] Nick Pentreath edited comment on SPARK-14811 at 6/2/16 10:31 PM: -

[jira] [Comment Edited] (SPARK-14811) ML, Graph 2.0 QA: API: New Scala APIs, docs

2016-06-02 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14811?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15313208#comment-15313208 ] Nick Pentreath edited comment on SPARK-14811 at 6/2/16 10:31 PM: -

[jira] [Commented] (SPARK-14811) ML, Graph 2.0 QA: API: New Scala APIs, docs

2016-06-02 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14811?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15313208#comment-15313208 ] Nick Pentreath commented on SPARK-14811: Question on this - we seem to be inconsistent with the

[jira] [Updated] (SPARK-15668) ml.feature: update check schema to avoid confusion when user use MLlib.vector as input type

2016-06-01 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15668?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath updated SPARK-15668: --- Assignee: yuhao yang > ml.feature: update check schema to avoid confusion when user use

[jira] [Updated] (SPARK-15164) Mark classification algorithms as experimental where marked so in scala

2016-06-01 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15164?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath updated SPARK-15164: --- Assignee: holdenk > Mark classification algorithms as experimental where marked so in scala

[jira] [Updated] (SPARK-15162) Update PySpark LogisticRegression threshold PyDoc to be as complete as Scaladoc

2016-06-01 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15162?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath updated SPARK-15162: --- Assignee: holdenk > Update PySpark LogisticRegression threshold PyDoc to be as complete as

[jira] [Comment Edited] (SPARK-14810) ML, Graph 2.0 QA: API: Binary incompatible changes

2016-06-01 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14810?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15293316#comment-15293316 ] Nick Pentreath edited comment on SPARK-14810 at 6/1/16 5:56 PM: List of

[jira] [Updated] (SPARK-15587) ML 2.0 QA: Scala APIs audit for feature

2016-06-01 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15587?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath updated SPARK-15587: --- Assignee: Yanbo Liang > ML 2.0 QA: Scala APIs audit for feature >

[jira] [Resolved] (SPARK-15587) ML 2.0 QA: Scala APIs audit for feature

2016-06-01 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15587?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath resolved SPARK-15587. Resolution: Fixed Fix Version/s: 2.0.0 Issue resolved by pull request 13410

[jira] [Commented] (SPARK-15447) Performance test for ALS in Spark 2.0

2016-05-31 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15308797#comment-15308797 ] Nick Pentreath commented on SPARK-15447: Created a Google sheet with initial results:
