[jira] [Updated] (SPARK-16063) Add storageLevel to Dataset

2016-06-21 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16063?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath updated SPARK-16063: --- Summary: Add storageLevel to Dataset (was: Add getStorageLevel to Dataset) >

[jira] [Updated] (SPARK-16063) Add storageLevel to Dataset

2016-06-21 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16063?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath updated SPARK-16063: --- Description: SPARK-11905 added {{cache}}/{{persist}} to {{Dataset}}. We should add

[jira] [Updated] (SPARK-10258) Add @Since annotation to ml.feature

2016-06-21 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10258?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath updated SPARK-10258: --- Assignee: Martin Brown (was: Nick Pentreath) > Add @Since annotation to ml.feat

[jira] [Assigned] (SPARK-10258) Add @Since annotation to ml.feature

2016-06-21 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10258?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath reassigned SPARK-10258: -- Assignee: Nick Pentreath (was: Martin Brown) > Add @Since annotation to ml.feat

[jira] [Comment Edited] (SPARK-14810) ML, Graph 2.0 QA: API: Binary incompatible changes

2016-06-20 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14810?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15293316#comment-15293316 ] Nick Pentreath edited comment on SPARK-14810 at 6/20/16 8:1

[jira] [Comment Edited] (SPARK-14810) ML, Graph 2.0 QA: API: Binary incompatible changes

2016-06-20 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14810?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15293316#comment-15293316 ] Nick Pentreath edited comment on SPARK-14810 at 6/20/16 8:1

[jira] [Created] (SPARK-16063) Add getStorageLevel to Dataset

2016-06-20 Thread Nick Pentreath (JIRA)
Nick Pentreath created SPARK-16063: -- Summary: Add getStorageLevel to Dataset Key: SPARK-16063 URL: https://issues.apache.org/jira/browse/SPARK-16063 Project: Spark Issue Type: Improvement

[jira] [Commented] (SPARK-15501) ML 2.0 QA: Scala APIs audit for recommendation

2016-06-17 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15501?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15335961#comment-15335961 ] Nick Pentreath commented on SPARK-15501: It's done - resolved it.

[jira] [Resolved] (SPARK-15501) ML 2.0 QA: Scala APIs audit for recommendation

2016-06-17 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15501?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath resolved SPARK-15501. Resolution: Fixed Fix Version/s: 2.0.0 > ML 2.0 QA: Scala APIs audit

[jira] [Resolved] (SPARK-15447) Performance test for ALS in Spark 2.0

2016-06-17 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15447?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath resolved SPARK-15447. Resolution: Fixed Fix Version/s: 2.0.0 > Performance test for ALS in Spark

[jira] [Commented] (SPARK-15447) Performance test for ALS in Spark 2.0

2016-06-17 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15335956#comment-15335956 ] Nick Pentreath commented on SPARK-15447: Finalized results in the linked Go

[jira] [Updated] (SPARK-15447) Performance test for ALS in Spark 2.0

2016-06-17 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15447?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath updated SPARK-15447: --- Description: We made several changes to ALS in 2.0. It is necessary to run some tests to

[jira] [Updated] (SPARK-15447) Performance test for ALS in Spark 2.0

2016-06-17 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15447?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath updated SPARK-15447: --- Description: We made several changes to ALS in 2.0. It is necessary to run some tests to

[jira] [Commented] (SPARK-15995) Gradient Boosted Trees - handling of Categorical Inputs

2016-06-17 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15335801#comment-15335801 ] Nick Pentreath commented on SPARK-15995: cc [~sethah] > Gradient Booste

[jira] [Updated] (SPARK-16008) ML Logistic Regression aggregator serializes unnecessary data

2016-06-17 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16008?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath updated SPARK-16008: --- Assignee: Seth Hendrickson > ML Logistic Regression aggregator serializes unnecessary d

[jira] [Updated] (SPARK-15997) Audit ml.feature Update documentation for ml feature transformers

2016-06-17 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15997?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath updated SPARK-15997: --- Assignee: Gayathri Murali > Audit ml.feature Update documentation for ml feat

[jira] [Commented] (SPARK-15447) Performance test for ALS in Spark 2.0

2016-06-16 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1530#comment-1530 ] Nick Pentreath commented on SPARK-15447: Almost there - I'll be able

[jira] [Commented] (SPARK-15746) SchemaUtils.checkColumnType with VectorUDT prints instance details in error message

2016-06-13 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15327237#comment-15327237 ] Nick Pentreath commented on SPARK-15746: I think you can go ahead now - I

[jira] [Commented] (SPARK-15904) High Memory Pressure using MLlib K-means

2016-06-13 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15904?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15327220#comment-15327220 ] Nick Pentreath commented on SPARK-15904: Could you explain why you'r

[jira] [Commented] (SPARK-15790) Audit @Since annotations in ML

2016-06-13 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15327193#comment-15327193 ] Nick Pentreath commented on SPARK-15790: Yes, I've just looked at thin

[jira] [Commented] (SPARK-15790) Audit @Since annotations in ML

2016-06-13 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15327028#comment-15327028 ] Nick Pentreath commented on SPARK-15790: Ah thanks - missed that umbrella.

[jira] [Resolved] (SPARK-15788) PySpark IDFModel missing "idf" property

2016-06-09 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15788?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath resolved SPARK-15788. Resolution: Fixed Fix Version/s: 2.0.0 Issue resolved by pull request 13540 [https

[jira] [Created] (SPARK-15790) Audit @Since annotations in ML

2016-06-06 Thread Nick Pentreath (JIRA)
Nick Pentreath created SPARK-15790: -- Summary: Audit @Since annotations in ML Key: SPARK-15790 URL: https://issues.apache.org/jira/browse/SPARK-15790 Project: Spark Issue Type: Documentation

[jira] [Created] (SPARK-15788) PySpark IDFModel missing "idf" property

2016-06-06 Thread Nick Pentreath (JIRA)
Nick Pentreath created SPARK-15788: -- Summary: PySpark IDFModel missing "idf" property Key: SPARK-15788 URL: https://issues.apache.org/jira/browse/SPARK-15788 Project: Spark

Re: Welcoming Yanbo Liang as a committer

2016-06-04 Thread Nick Pentreath
Congratulations Yanbo and welcome On Sat, 4 Jun 2016 at 10:17, Hortonworks wrote: > Congratulations, Yanbo > > Zhan Zhang > > Sent from my iPhone > > > On Jun 3, 2016, at 8:39 PM, Dongjoon Hyun wrote: > > > > Congratulations > > -- > CONFIDENTIALITY NOTICE > NOTICE: This message is intended for

[jira] [Updated] (SPARK-15761) pyspark shell should load if PYSPARK_DRIVER_PYTHON is ipython and Python3

2016-06-03 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15761?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath updated SPARK-15761: --- Assignee: Manoj Kumar > pyspark shell should load if PYSPARK_DRIVER_PYTHON is ipython

[jira] [Resolved] (SPARK-15168) Add missing params to Python's MultilayerPerceptronClassifier

2016-06-03 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15168?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath resolved SPARK-15168. Resolution: Fixed Fix Version/s: 2.0.0 Issue resolved by pull request 12943 [https

[jira] [Updated] (SPARK-15168) Add missing params to Python's MultilayerPerceptronClassifier

2016-06-03 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15168?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath updated SPARK-15168: --- Assignee: holdenk > Add missing params to Python's MultilayerPerceptronCl

[jira] [Commented] (SPARK-15746) SchemaUtils.checkColumnType with VectorUDT prints instance details in error message

2016-06-03 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15314627#comment-15314627 ] Nick Pentreath commented on SPARK-15746: I'd say hold off on working on

[jira] [Commented] (SPARK-14811) ML, Graph 2.0 QA: API: New Scala APIs, docs

2016-06-03 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14811?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15314489#comment-15314489 ] Nick Pentreath commented on SPARK-14811: Yes, that does make sense. I will

[jira] [Comment Edited] (SPARK-15447) Performance test for ALS in Spark 2.0

2016-06-03 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15314441#comment-15314441 ] Nick Pentreath edited comment on SPARK-15447 at 6/3/16 5:2

[jira] [Commented] (SPARK-15447) Performance test for ALS in Spark 2.0

2016-06-03 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15314441#comment-15314441 ] Nick Pentreath commented on SPARK-15447: Added a second tab to the sheet

[jira] [Updated] (SPARK-15746) SchemaUtils.checkColumnType with VectorUDT prints instance details in error message

2016-06-02 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15746?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath updated SPARK-15746: --- Summary: SchemaUtils.checkColumnType with VectorUDT prints instance details in error message

[jira] [Created] (SPARK-15746) SchemaUtils.checkColumnType with VectorUDT prints instance details

2016-06-02 Thread Nick Pentreath (JIRA)
Nick Pentreath created SPARK-15746: -- Summary: SchemaUtils.checkColumnType with VectorUDT prints instance details Key: SPARK-15746 URL: https://issues.apache.org/jira/browse/SPARK-15746 Project

[jira] [Resolved] (SPARK-15668) ml.feature: update check schema to avoid confusion when user use MLlib.vector as input type

2016-06-02 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15668?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath resolved SPARK-15668. Resolution: Fixed Fix Version/s: 2.0.0 Issue resolved by pull request 13411 [https

[jira] [Updated] (SPARK-15139) PySpark TreeEnsemble missing methods

2016-06-02 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath updated SPARK-15139: --- Assignee: holdenk > PySpark TreeEnsemble missing meth

Re: Classpath hell and Elasticsearch 2.3.2...

2016-06-02 Thread Nick Pentreath
hat out since your last email... I deleted > the 2.10 by accident but then put 2+2 together. > > Got it working now. > > Still sticking to my story that it's somewhat complicated to setup :) > > Kevin > > On Thu, Jun 2, 2016 at 3:59 PM, Nick Pentreath > wrote:

Re: Classpath hell and Elasticsearch 2.3.2...

2016-06-02 Thread Nick Pentreath
dAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:497) > at > org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:731) > at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala

[jira] [Resolved] (SPARK-15139) PySpark TreeEnsemble missing methods

2016-06-02 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath resolved SPARK-15139. Resolution: Fixed Fix Version/s: 2.0.0 Issue resolved by pull request 12919 [https

[jira] [Resolved] (SPARK-15092) toDebugString missing from ML DecisionTreeClassifier

2016-06-02 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15092?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath resolved SPARK-15092. Resolution: Fixed Fix Version/s: 2.0.0 Issue resolved by pull request 12919 [https

Re: Classpath hell and Elasticsearch 2.3.2...

2016-06-02 Thread Nick Pentreath
Hey there. When I used es-hadoop, I just pulled the dependency into my pom.xml, with spark as a "provided" dependency, and built a fat jar with assembly. Then with spark-submit use the --jars option to include your assembly jar (IIRC I sometimes also needed to use --driver-class-path too, but pe
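
The approach described in this message can be sketched roughly as follows. This is only an illustration in Python that assembles a spark-submit command line without running it. The `--class`, `--jars` and `--driver-class-path` options are standard spark-submit options, but the function name, the jar path and the main class below are hypothetical placeholders, and whether the extra driver classpath is needed depends on the setup.

```python
import shlex


def build_spark_submit_command(assembly_jar, main_class, driver_class_path=None):
    """Assemble (but do not run) a spark-submit argument list.

    assembly_jar: path to the fat jar built with the assembly plugin
                  (hypothetical path, supplied by the caller).
    main_class:   fully qualified entry point (hypothetical name).
    driver_class_path: optional extra classpath for the driver; the message
                  only says this was sometimes needed.
    """
    cmd = ["spark-submit", "--class", main_class, "--jars", assembly_jar]
    if driver_class_path:
        cmd += ["--driver-class-path", driver_class_path]
    # Assumption: the assembly jar is also the application jar.
    cmd.append(assembly_jar)
    return cmd


def render(cmd):
    """Return a shell-quoted string for display or logging."""
    return " ".join(shlex.quote(part) for part in cmd)


if __name__ == "__main__":
    jar = "target/my-app-assembly.jar"  # hypothetical
    print(render(build_spark_submit_command(jar, "com.example.Main", jar)))
```

Building the argument list separately from running it makes it easy to check which jars end up on which classpath before submitting.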

[jira] [Comment Edited] (SPARK-14811) ML, Graph 2.0 QA: API: New Scala APIs, docs

2016-06-02 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14811?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15313208#comment-15313208 ] Nick Pentreath edited comment on SPARK-14811 at 6/2/16 10:3

[jira] [Comment Edited] (SPARK-14811) ML, Graph 2.0 QA: API: New Scala APIs, docs

2016-06-02 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14811?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15313208#comment-15313208 ] Nick Pentreath edited comment on SPARK-14811 at 6/2/16 10:3

[jira] [Commented] (SPARK-14811) ML, Graph 2.0 QA: API: New Scala APIs, docs

2016-06-02 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14811?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15313208#comment-15313208 ] Nick Pentreath commented on SPARK-14811: Question on this - we seem t

[jira] [Updated] (SPARK-15668) ml.feature: update check schema to avoid confusion when user use MLlib.vector as input type

2016-06-01 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15668?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath updated SPARK-15668: --- Assignee: yuhao yang > ml.feature: update check schema to avoid confusion when user

[jira] [Updated] (SPARK-15164) Mark classification algorithms as experimental where marked so in scala

2016-06-01 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15164?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath updated SPARK-15164: --- Assignee: holdenk > Mark classification algorithms as experimental where marked so in sc

[jira] [Updated] (SPARK-15162) Update PySpark LogisticRegression threshold PyDoc to be as complete as Scaladoc

2016-06-01 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15162?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath updated SPARK-15162: --- Assignee: holdenk > Update PySpark LogisticRegression threshold PyDoc to be as complete

[jira] [Comment Edited] (SPARK-14810) ML, Graph 2.0 QA: API: Binary incompatible changes

2016-06-01 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14810?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15293316#comment-15293316 ] Nick Pentreath edited comment on SPARK-14810 at 6/1/16 5:5

[jira] [Updated] (SPARK-15587) ML 2.0 QA: Scala APIs audit for feature

2016-06-01 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15587?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath updated SPARK-15587: --- Assignee: Yanbo Liang > ML 2.0 QA: Scala APIs audit for feat

[jira] [Resolved] (SPARK-15587) ML 2.0 QA: Scala APIs audit for feature

2016-06-01 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15587?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath resolved SPARK-15587. Resolution: Fixed Fix Version/s: 2.0.0 Issue resolved by pull request 13410 [https

[jira] [Commented] (SPARK-15447) Performance test for ALS in Spark 2.0

2016-05-31 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15308797#comment-15308797 ] Nick Pentreath commented on SPARK-15447: Created a Google sheet with ini

[jira] [Commented] (SPARK-15575) Remove breeze from dependencies?

2016-05-27 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15304546#comment-15304546 ] Nick Pentreath commented on SPARK-15575: What specifically are the "pe

[jira] [Resolved] (SPARK-15492) Binarization scala example copy & paste to spark-shell error

2016-05-26 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath resolved SPARK-15492. Resolution: Fixed Fix Version/s: 2.0.0 Issue resolved by pull request 13266 [https

[jira] [Resolved] (SPARK-15500) Remove defaults in storage level param doc in ALS

2016-05-25 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15500?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath resolved SPARK-15500. Resolution: Fixed Fix Version/s: 2.0.0 Issue resolved by pull request 13277 [https

Re: Cannot build master with sbt

2016-05-25 Thread Nick Pentreath
I've filed https://issues.apache.org/jira/browse/SPARK-15525 For now, you would have to check out sbt-antlr4 at https://github.com/ihji/sbt-antlr4/commit/23eab68b392681a7a09f6766850785afe8dfa53d (since I don't see any branches or tags in the github repo for different versions), and sbt publishLoca

[jira] [Created] (SPARK-15525) Clean sbt build fails to resolve sbt-antlr4 plugin

2016-05-25 Thread Nick Pentreath (JIRA)
Nick Pentreath created SPARK-15525: -- Summary: Clean sbt build fails to resolve sbt-antlr4 plugin Key: SPARK-15525 URL: https://issues.apache.org/jira/browse/SPARK-15525 Project: Spark Issue

[jira] [Resolved] (SPARK-15504) Could MatrixFactorizationModel support recommend for some users only ?

2016-05-25 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15504?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath resolved SPARK-15504. Resolution: Duplicate Please see SPARK-10802 which already exists. For the old RDD-based

[jira] [Updated] (SPARK-15501) ML 2.0 QA: Scala APIs audit for recommendation

2016-05-24 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15501?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath updated SPARK-15501: --- Component/s: ML Documentation > ML 2.0 QA: Scala APIs audit

[jira] [Updated] (SPARK-15500) Remove defaults in storage level param doc in ALS

2016-05-24 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15500?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath updated SPARK-15500: --- Component/s: PySpark ML Documentation > Remove defaults

[jira] [Assigned] (SPARK-15502) Add note in ML ALS docs that user / item column only supports Int

2016-05-24 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15502?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath reassigned SPARK-15502: -- Assignee: Nick Pentreath > Add note in ML ALS docs that user / item column o

[jira] [Created] (SPARK-15502) Add note in ML ALS docs that user / item column only supports Int

2016-05-24 Thread Nick Pentreath (JIRA)
Nick Pentreath created SPARK-15502: -- Summary: Add note in ML ALS docs that user / item column only supports Int Key: SPARK-15502 URL: https://issues.apache.org/jira/browse/SPARK-15502 Project: Spark

[jira] [Created] (SPARK-15501) ML 2.0 QA: Scala APIs audit for recommendation

2016-05-24 Thread Nick Pentreath (JIRA)
Nick Pentreath created SPARK-15501: -- Summary: ML 2.0 QA: Scala APIs audit for recommendation Key: SPARK-15501 URL: https://issues.apache.org/jira/browse/SPARK-15501 Project: Spark Issue

[jira] [Created] (SPARK-15500) Remove defaults in storage level param doc in ALS

2016-05-24 Thread Nick Pentreath (JIRA)
Nick Pentreath created SPARK-15500: -- Summary: Remove defaults in storage level param doc in ALS Key: SPARK-15500 URL: https://issues.apache.org/jira/browse/SPARK-15500 Project: Spark Issue

[jira] [Updated] (SPARK-15254) Improve ML pipeline Cross Validation Scaladoc & PyDoc

2016-05-24 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15254?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath updated SPARK-15254: --- Description: The ML pipeline Cross Validation Scaladoc & PyDoc is very sparse - we sh

[jira] [Commented] (SPARK-15254) Improve ML pipeline Cross Validation Scaladoc & PyDoc

2016-05-24 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15254?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15297871#comment-15297871 ] Nick Pentreath commented on SPARK-15254: Please go ahead! > Improve ML p

[jira] [Resolved] (SPARK-15442) PySpark QuantileDiscretizer missing "relativeError" param

2016-05-24 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath resolved SPARK-15442. Resolution: Fixed Fix Version/s: 2.0.0 Issue resolved by pull request 13228 [https

[jira] [Updated] (SPARK-15492) Binarization scala example copy & paste to spark-shell error

2016-05-24 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath updated SPARK-15492: --- Assignee: Miao Wang > Binarization scala example copy & paste to spark-shel

Re: [DISCUSS] PredictionIO incubation proposal

2016-05-24 Thread Nick Pentreath
Hi everyone I just want to make it clear that my suggestion was in no way some sort of attempt to hijack the project or push a corporate agenda. For me personally, I have not been directly involved in PredictionIO, that is true. I have however spent the past 3 years prior to joining IBM building

[jira] [Assigned] (SPARK-15447) Performance test for ALS in Spark 2.0

2016-05-23 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15447?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath reassigned SPARK-15447: -- Assignee: Nick Pentreath > Performance test for ALS in Spark

Re: [VOTE] Removing module maintainer process

2016-05-22 Thread Nick Pentreath
+1 (binding) On Mon, 23 May 2016 at 04:19, Matei Zaharia wrote: > Correction, let's run this for 72 hours, so until 9 PM EST May 25th. > > > On May 22, 2016, at 8:34 PM, Matei Zaharia > wrote: > > > > It looks like the discussion thread on this has only had positive > replies, so I'm going to ca

[jira] [Commented] (SPARK-15447) Performance test for ALS in Spark 2.0

2016-05-20 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15294116#comment-15294116 ] Nick Pentreath commented on SPARK-15447: [~mengxr] yes will aim to run

[jira] [Assigned] (SPARK-15442) PySpark QuantileDiscretizer missing "relativeError" param

2016-05-20 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath reassigned SPARK-15442: -- Assignee: Nick Pentreath > PySpark QuantileDiscretizer missing "relativeErro

[jira] [Comment Edited] (SPARK-15442) PySpark QuantileDiscretizer missing "relativeError" param

2016-05-20 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15442?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15293753#comment-15293753 ] Nick Pentreath edited comment on SPARK-15442 at 5/20/16 5:1

[jira] [Commented] (SPARK-15442) PySpark QuantileDiscretizer missing "relativeError" param

2016-05-20 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15442?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15293753#comment-15293753 ] Nick Pentreath commented on SPARK-15442: When do you plan to submit a PR?

[jira] [Comment Edited] (SPARK-14810) ML, Graph 2.0 QA: API: Binary incompatible changes

2016-05-20 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14810?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15293318#comment-15293318 ] Nick Pentreath edited comment on SPARK-14810 at 5/20/16 1:0

[jira] [Comment Edited] (SPARK-14810) ML, Graph 2.0 QA: API: Binary incompatible changes

2016-05-20 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14810?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15280592#comment-15280592 ] Nick Pentreath edited comment on SPARK-14810 at 5/20/16 1:0

[jira] [Commented] (SPARK-14810) ML, Graph 2.0 QA: API: Binary incompatible changes

2016-05-20 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14810?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15293318#comment-15293318 ] Nick Pentreath commented on SPARK-14810: Yeah makes sense - I've

[jira] [Updated] (SPARK-14810) ML, Graph 2.0 QA: API: Binary incompatible changes

2016-05-20 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14810?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath updated SPARK-14810: --- Description: Generate a list of binary incompatible changes using MiMa and create new JIRAs

[jira] [Commented] (SPARK-14810) ML, Graph 2.0 QA: API: Binary incompatible changes

2016-05-20 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14810?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15293316#comment-15293316 ] Nick Pentreath commented on SPARK-14810: List of changes since {{1.6.0}} aud

[jira] [Updated] (SPARK-15412) Improve linear & isotonic regression methods PyDocs

2016-05-20 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15412?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath updated SPARK-15412: --- Assignee: holdenk > Improve linear & isotonic regression methods

[jira] [Updated] (SPARK-15444) Default value mismatch of param linkPredictionCol for GeneralizedLinearRegression

2016-05-20 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15444?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath updated SPARK-15444: --- Assignee: Liang-Chi Hsieh > Default value mismatch of param linkPredictionCol

[jira] [Resolved] (SPARK-15444) Default value mismatch of param linkPredictionCol for GeneralizedLinearRegression

2016-05-20 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15444?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath resolved SPARK-15444. Resolution: Fixed Fix Version/s: 2.0.0 Issue resolved by pull request 13220 [https

[jira] [Commented] (SPARK-15100) Audit: ml.feature

2016-05-20 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15292899#comment-15292899 ] Nick Pentreath commented on SPARK-15100: I created SPARK-15442 for #1 >

[jira] [Commented] (SPARK-15100) Audit: ml.feature

2016-05-20 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15292895#comment-15292895 ] Nick Pentreath commented on SPARK-15100: I'm not sure we need to set

[jira] [Created] (SPARK-15442) PySpark QuantileDiscretizer missing "relativeError" param

2016-05-20 Thread Nick Pentreath (JIRA)
Nick Pentreath created SPARK-15442: -- Summary: PySpark QuantileDiscretizer missing "relativeError" param Key: SPARK-15442 URL: https://issues.apache.org/jira/browse/SPARK-15442 Proj

[jira] [Resolved] (SPARK-15316) PySpark GeneralizedLinearRegression missing linkPredictionCol param

2016-05-19 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15316?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath resolved SPARK-15316. Resolution: Fixed Fix Version/s: 2.0.0 Issue resolved by pull request 13106 [https

[jira] [Resolved] (SPARK-14891) ALS in ML never validates input schema

2016-05-18 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14891?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath resolved SPARK-14891. Resolution: Fixed Fix Version/s: 2.0.0 > ALS in ML never validates input sch

[jira] [Commented] (SPARK-14978) PySpark TrainValidationSplitModel should support validationMetrics

2016-05-18 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14978?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15288827#comment-15288827 ] Nick Pentreath commented on SPARK-14978: thanks! >

[jira] [Commented] (SPARK-14978) PySpark TrainValidationSplitModel should support validationMetrics

2016-05-18 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14978?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15288790#comment-15288790 ] Nick Pentreath commented on SPARK-14978: [~srowen] how do I add JIRA user

[jira] [Comment Edited] (SPARK-15378) Unable to load NLTK in spark RDD pipeline

2016-05-18 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15378?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15288665#comment-15288665 ] Nick Pentreath edited comment on SPARK-15378 at 5/18/16 9:1

[jira] [Commented] (SPARK-15378) Unable to load NLTK in spark RDD pipeline

2016-05-18 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15378?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15288665#comment-15288665 ] Nick Pentreath commented on SPARK-15378: If you are trying to run on a clu

[jira] [Resolved] (SPARK-14978) PySpark TrainValidationSplitModel should support validationMetrics

2016-05-17 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14978?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath resolved SPARK-14978. Resolution: Fixed Fix Version/s: 2.0.0 Issue resolved by pull request 12767 [https

Re: [DISCUSS] PredictionIO incubation proposal

2016-05-17 Thread Nick Pentreath
Hi there I'm glad to see the proposal to incubate PredictionIO. In my previous life as a startup co-founder, I kept a close eye on the project, and it would be fantastic to see it become an Apache incubating project! The folks working on Apache Spark and Apache SystemML (incubating) here at IBM a

[jira] [Resolved] (SPARK-15182) Copy MLlib doc to ML: ml.feature

2016-05-17 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15182?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath resolved SPARK-15182. Resolution: Fixed Fix Version/s: 2.0.0 Issue resolved by pull request 12957 [https

[jira] [Updated] (SPARK-15182) Copy MLlib doc to ML: ml.feature

2016-05-17 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15182?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath updated SPARK-15182: --- Assignee: yuhao yang > Copy MLlib doc to ML: ml.feat

[jira] [Updated] (SPARK-14434) User guide doc and examples for GaussianMixture in spark.ml

2016-05-17 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14434?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath updated SPARK-14434: --- Assignee: Miao Wang > User guide doc and examples for GaussianMixture in spark

[jira] [Resolved] (SPARK-14434) User guide doc and examples for GaussianMixture in spark.ml

2016-05-17 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14434?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath resolved SPARK-14434. Resolution: Fixed Fix Version/s: 2.0.0 Issue resolved by pull request 12788 [https

[jira] [Commented] (SPARK-14709) spark.ml API for linear SVM

2016-05-16 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15284304#comment-15284304 ] Nick Pentreath commented on SPARK-14709: It would be great to get the lis

[jira] [Resolved] (SPARK-14979) Add examples for GeneralizedLinearRegression

2016-05-16 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14979?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath resolved SPARK-14979. Resolution: Fixed Fix Version/s: 2.0.0 Issue resolved by pull request 12754 [https

[jira] [Updated] (SPARK-15316) PySpark GeneralizedLinearRegression missing linkPredictionCol param

2016-05-16 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15316?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath updated SPARK-15316: --- Assignee: holdenk > PySpark GeneralizedLinearRegression missing linkPredictionCol pa
