[jira] [Updated] (SPARK-15188) PySpark NaiveBayes is missing Thresholds param

2016-05-09 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15188?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath updated SPARK-15188: --- Assignee: holdenk > PySpark NaiveBayes is missing Thresholds param >

[jira] [Updated] (SPARK-15188) PySpark NaiveBayes is missing Thresholds param

2016-05-09 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15188?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath updated SPARK-15188: --- Summary: PySpark NaiveBayes is missing Thresholds param (was: NaiveBayes is missing Threshol

[jira] [Commented] (SPARK-14815) ML, Graph, R 2.0 QA: Update user guide for new features & APIs

2016-05-05 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15272258#comment-15272258 ] Nick Pentreath commented on SPARK-14815: Not sure if this is the best JIRA to com

[jira] [Updated] (SPARK-15092) toDebugString missing from ML DecisionTreeClassifier

2016-05-05 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15092?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath updated SPARK-15092: --- Assignee: holdenk > toDebugString missing from ML DecisionTreeClassifier > --

[jira] [Assigned] (SPARK-14810) ML, Graph 2.0 QA: API: Binary incompatible changes

2016-05-04 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14810?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath reassigned SPARK-14810: -- Assignee: Nick Pentreath > ML, Graph 2.0 QA: API: Binary incompatible changes > --

[jira] [Commented] (SPARK-14900) spark.ml classification metrics should include accuracy

2016-05-04 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14900?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15270574#comment-15270574 ] Nick Pentreath commented on SPARK-14900: +1 for deprecating "precision" in favour

[jira] [Resolved] (SPARK-14844) KMeansModel in spark.ml should allow to change featureCol and predictionCol

2016-05-04 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14844?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath resolved SPARK-14844. Resolution: Fixed Fix Version/s: 2.0.0 Issue resolved by pull request 12609 [https:/

[jira] [Commented] (SPARK-14900) spark.ml classification metrics should include accuracy

2016-05-04 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14900?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15270403#comment-15270403 ] Nick Pentreath commented on SPARK-14900: It's ok to put this in MultiClassMetrics

[jira] [Updated] (SPARK-15094) CodeGenerator: failed to compile - when using dataset.rdd with generic case class

2016-05-03 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15094?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath updated SPARK-15094: --- Component/s: SQL > CodeGenerator: failed to compile - when using dataset.rdd with generic cas

[jira] [Commented] (SPARK-15027) ALS.train should use DataFrame instead of RDD

2016-05-03 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15269367#comment-15269367 ] Nick Pentreath commented on SPARK-15027: Will take a look at repartitioning. 2.1

[jira] [Created] (SPARK-15094) CodeGenerator: failed to compile - when using dataset.rdd with generic case class

2016-05-03 Thread Nick Pentreath (JIRA)
Nick Pentreath created SPARK-15094: -- Summary: CodeGenerator: failed to compile - when using dataset.rdd with generic case class Key: SPARK-15094 URL: https://issues.apache.org/jira/browse/SPARK-15094

[jira] [Updated] (SPARK-13448) Document MLlib behavior changes in Spark 2.0

2016-05-03 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath updated SPARK-13448: --- Description: This JIRA keeps a list of MLlib behavior changes in Spark 2.0. So we can rememb

[jira] [Resolved] (SPARK-14971) PySpark ML Params setter code clean up

2016-05-03 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14971?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath resolved SPARK-14971. Resolution: Fixed Fix Version/s: 2.0.0 Issue resolved by pull request 12749 [https:/

[jira] [Updated] (SPARK-14971) PySpark ML Params setter code clean up

2016-05-03 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14971?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath updated SPARK-14971: --- Assignee: Yanbo Liang > PySpark ML Params setter code clean up >

[jira] [Commented] (SPARK-15027) ALS.train should use DataFrame instead of RDD

2016-05-03 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15268662#comment-15268662 ] Nick Pentreath commented on SPARK-15027: I've managed to get it working for the f

[jira] [Updated] (SPARK-14971) PySpark ML Params setter code clean up

2016-05-03 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14971?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath updated SPARK-14971: --- Shepherd: Nick Pentreath > PySpark ML Params setter code clean up > -

[jira] [Commented] (SPARK-14812) ML, Graph 2.0 QA: API: Experimental, DeveloperApi, final, sealed audit

2016-05-03 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15268326#comment-15268326 ] Nick Pentreath commented on SPARK-14812: fair enough - any changes to ALS {{trans

[jira] [Commented] (SPARK-15027) ALS.train should use DataFrame instead of RDD

2016-04-30 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15265232#comment-15265232 ] Nick Pentreath commented on SPARK-15027: Ok - it would make sense to have it in 2

[jira] [Commented] (SPARK-15027) ALS.train should use DataFrame instead of RDD

2016-04-30 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15265228#comment-15265228 ] Nick Pentreath commented on SPARK-15027: [~mengxr] are you intending this to be a

[jira] [Updated] (SPARK-14571) Log instrumentation in ALS

2016-04-29 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14571?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath updated SPARK-14571: --- Assignee: Miao Wang > Log instrumentation in ALS > -- > >

[jira] [Resolved] (SPARK-14571) Log instrumentation in ALS

2016-04-29 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14571?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath resolved SPARK-14571. Resolution: Fixed Fix Version/s: 2.0.0 Issue resolved by pull request 12560 [https:/

[jira] [Commented] (SPARK-14900) spark.ml classification metrics should include accuracy

2016-04-29 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14900?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15263778#comment-15263778 ] Nick Pentreath commented on SPARK-14900: Sure, go ahead > spark.ml classificatio

[jira] [Assigned] (SPARK-14891) ALS in ML never validates input schema

2016-04-29 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14891?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath reassigned SPARK-14891: -- Assignee: Nick Pentreath > ALS in ML never validates input schema > --

[jira] [Updated] (SPARK-14886) RankingMetrics.ndcgAt throw java.lang.ArrayIndexOutOfBoundsException

2016-04-29 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14886?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath updated SPARK-14886: --- Assignee: Sean Owen > RankingMetrics.ndcgAt throw java.lang.ArrayIndexOutOfBoundsException

[jira] [Resolved] (SPARK-14886) RankingMetrics.ndcgAt throw java.lang.ArrayIndexOutOfBoundsException

2016-04-29 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14886?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath resolved SPARK-14886. Resolution: Fixed Fix Version/s: 2.0.0 Issue resolved by pull request 12756 [https:/

[jira] [Comment Edited] (SPARK-8971) Support balanced class labels when splitting train/cross validation sets

2016-04-28 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8971?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15261655#comment-15261655 ] Nick Pentreath edited comment on SPARK-8971 at 4/28/16 7:18 AM:

[jira] [Commented] (SPARK-8971) Support balanced class labels when splitting train/cross validation sets

2016-04-28 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8971?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15261655#comment-15261655 ] Nick Pentreath commented on SPARK-8971: --- I think it would be good to have something

[jira] [Commented] (SPARK-9656) Add missing methods to linalg.distributed

2016-04-27 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9656?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15260707#comment-15260707 ] Nick Pentreath commented on SPARK-9656: --- I can't seem to find your name in the searc

[jira] [Resolved] (SPARK-9656) Add missing methods to linalg.distributed

2016-04-27 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9656?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath resolved SPARK-9656. --- Resolution: Fixed Fix Version/s: 2.0.0 Issue resolved by pull request 9441 [https://git

[jira] [Commented] (SPARK-14891) ALS in ML never validates input schema

2016-04-26 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15257682#comment-15257682 ] Nick Pentreath commented on SPARK-14891: {{ALS.train}} is generic in the ID type,

[jira] [Resolved] (SPARK-13962) spark.ml Evaluators should support other numeric types for label

2016-04-25 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13962?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath resolved SPARK-13962. Resolution: Fixed Fix Version/s: 2.0.0 Issue resolved by pull request 12500 [https:/

[jira] [Updated] (SPARK-14844) KMeansModel in spark.ml should allow to change featureCol and predictionCol

2016-04-25 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14844?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath updated SPARK-14844: --- Shepherd: Nick Pentreath (was: Dominik Jastrzębski) > KMeansModel in spark.ml should allow t

[jira] [Updated] (SPARK-14768) Remove expectedType arg for PySpark Param

2016-04-25 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14768?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath updated SPARK-14768: --- Assignee: Jason C Lee > Remove expectedType arg for PySpark Param > -

[jira] [Resolved] (SPARK-14768) Remove expectedType arg for PySpark Param

2016-04-25 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14768?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath resolved SPARK-14768. Resolution: Fixed Fix Version/s: 2.0.0 Issue resolved by pull request 12581 [https:/

[jira] [Updated] (SPARK-14409) Investigate adding a RankingEvaluator to ML

2016-04-25 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14409?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath updated SPARK-14409: --- Shepherd: Nick Pentreath > Investigate adding a RankingEvaluator to ML >

[jira] [Commented] (SPARK-14891) ALS in ML never validates input schema

2016-04-25 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15256140#comment-15256140 ] Nick Pentreath commented on SPARK-14891: Currently the only doc is {code} /** *

[jira] [Comment Edited] (SPARK-14891) ALS in ML never validates input schema

2016-04-25 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15256140#comment-15256140 ] Nick Pentreath edited comment on SPARK-14891 at 4/25/16 9:27 AM: --

[jira] [Commented] (SPARK-14891) ALS in ML never validates input schema

2016-04-25 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15256084#comment-15256084 ] Nick Pentreath commented on SPARK-14891: [~srowen] [~mengxr] [~josephkb] thoughts

[jira] [Created] (SPARK-14891) ALS in ML never validates input schema

2016-04-25 Thread Nick Pentreath (JIRA)
Nick Pentreath created SPARK-14891: -- Summary: ALS in ML never validates input schema Key: SPARK-14891 URL: https://issues.apache.org/jira/browse/SPARK-14891 Project: Spark Issue Type: Bug

[jira] [Comment Edited] (SPARK-14886) RankingMetrics.ndcgAt throw java.lang.ArrayIndexOutOfBoundsException

2016-04-25 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14886?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15256065#comment-15256065 ] Nick Pentreath edited comment on SPARK-14886 at 4/25/16 8:26 AM: --

[jira] [Commented] (SPARK-14886) RankingMetrics.ndcgAt throw java.lang.ArrayIndexOutOfBoundsException

2016-04-25 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14886?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15256065#comment-15256065 ] Nick Pentreath commented on SPARK-14886: Are you saying that the "maxDCG" should

[jira] [Updated] (SPARK-6717) Clear shuffle files after checkpointing in ALS

2016-04-22 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6717?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath updated SPARK-6717: -- Shepherd: Nick Pentreath > Clear shuffle files after checkpointing in ALS >

[jira] [Updated] (SPARK-14843) Error while encoding: java.lang.ClassCastException with LibSVMRelation

2016-04-22 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath updated SPARK-14843: --- Component/s: SQL > Error while encoding: java.lang.ClassCastException with LibSVMRelation > -

[jira] [Created] (SPARK-14843) Error while encoding: java.lang.ClassCastException with LibSVMRelation

2016-04-22 Thread Nick Pentreath (JIRA)
Nick Pentreath created SPARK-14843: -- Summary: Error while encoding: java.lang.ClassCastException with LibSVMRelation Key: SPARK-14843 URL: https://issues.apache.org/jira/browse/SPARK-14843 Project: S

[jira] [Commented] (SPARK-14489) RegressionEvaluator returns NaN for ALS in Spark ml

2016-04-22 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15253536#comment-15253536 ] Nick Pentreath commented on SPARK-14489: Is naive sampling not an option then for

[jira] [Commented] (SPARK-14812) ML 2.0 QA: API: Experimental, DeveloperApi, final, sealed audit

2016-04-22 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15253508#comment-15253508 ] Nick Pentreath commented on SPARK-14812: I would like to keep ALS experimental un

[jira] [Updated] (SPARK-9656) Add missing methods to linalg.distributed

2016-04-21 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9656?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath updated SPARK-9656: -- Shepherd: Nick Pentreath > Add missing methods to linalg.distributed > -

[jira] [Updated] (SPARK-13857) Feature parity for ALS ML with MLLIB

2016-04-21 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13857?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath updated SPARK-13857: --- Shepherd: (was: Nick Pentreath) > Feature parity for ALS ML with MLLIB > --

[jira] [Commented] (SPARK-14760) Feature transformers should always invoke transformSchema in transform or fit

2016-04-20 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15251391#comment-15251391 ] Nick Pentreath commented on SPARK-14760: In general, given the name {{transformSc

[jira] [Commented] (SPARK-14760) Feature transformers should always invoke transformSchema in transform or fit

2016-04-20 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15251372#comment-15251372 ] Nick Pentreath commented on SPARK-14760: It seems it is there for validation now,

[jira] [Comment Edited] (SPARK-14760) Feature transformers should always invoke transformSchema in transform or fit

2016-04-20 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15250475#comment-15250475 ] Nick Pentreath edited comment on SPARK-14760 at 4/20/16 6:38 PM: --

[jira] [Commented] (SPARK-14760) Feature transformers should always invoke transformSchema in transform or fit

2016-04-20 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15250475#comment-15250475 ] Nick Pentreath commented on SPARK-14760: I've noticed that most (in fact pretty m

[jira] [Commented] (SPARK-6174) Improve doc: Python ALS, MatrixFactorizationModel

2016-04-19 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6174?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15247288#comment-15247288 ] Nick Pentreath commented on SPARK-6174: --- [~josephkb] I think SPARK-12632 took care o

[jira] [Updated] (SPARK-6174) Improve doc: Python ALS, MatrixFactorizationModel

2016-04-19 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6174?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath updated SPARK-6174: -- Component/s: (was: Documentation) > Improve doc: Python ALS, MatrixFactorizationModel >

[jira] [Closed] (SPARK-14644) Binary param can be a shared param with rewording

2016-04-18 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14644?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath closed SPARK-14644. -- Resolution: Won't Fix > Binary param can be a shared param with rewording > ---

[jira] [Commented] (SPARK-14644) Binary param can be a shared param with rewording

2016-04-18 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15245265#comment-15245265 ] Nick Pentreath commented on SPARK-14644: Closed since we decided not to do this a

[jira] [Updated] (SPARK-13289) Word2Vec generate infinite distances when numIterations>5

2016-04-18 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13289?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath updated SPARK-13289: --- Shepherd: Nick Pentreath > Word2Vec generate infinite distances when numIterations>5 > --

[jira] [Updated] (SPARK-11171) PMML for Pipelines API

2016-04-15 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11171?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath updated SPARK-11171: --- Assignee: holdenk > PMML for Pipelines API > -- > > Key:

[jira] [Commented] (SPARK-13944) Separate out local linear algebra as a standalone module without Spark dependency

2016-04-15 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13944?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15242779#comment-15242779 ] Nick Pentreath commented on SPARK-13944: [~josephkb] [~mengxr] [~dbtsai] [~srowen

[jira] [Commented] (SPARK-13944) Separate out local linear algebra as a standalone module without Spark dependency

2016-04-15 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13944?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15242663#comment-15242663 ] Nick Pentreath commented on SPARK-13944: Ok, fair enough on just leaving the old

[jira] [Resolved] (SPARK-14238) Add binary toggle Param to PySpark HashingTF in ML & MLlib

2016-04-14 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14238?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath resolved SPARK-14238. Resolution: Fixed Fix Version/s: 2.0.0 Issue resolved by pull request 12079 [https:/

[jira] [Comment Edited] (SPARK-14352) approxQuantile should support multi columns

2016-04-14 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15241729#comment-15241729 ] Nick Pentreath edited comment on SPARK-14352 at 4/14/16 7:02 PM: --

[jira] [Commented] (SPARK-14352) approxQuantile should support multi columns

2016-04-14 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15241729#comment-15241729 ] Nick Pentreath commented on SPARK-14352: This duplicates SPARK-14432 - which did

[jira] [Resolved] (SPARK-13967) Add binary toggle Param to PySpark CountVectorizer

2016-04-14 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13967?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath resolved SPARK-13967. Resolution: Fixed Fix Version/s: 2.0.0 > Add binary toggle Param to PySpark CountVec

[jira] [Comment Edited] (SPARK-13857) Feature parity for ALS ML with MLLIB

2016-04-14 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15241360#comment-15241360 ] Nick Pentreath edited comment on SPARK-13857 at 4/14/16 3:33 PM: --

[jira] [Commented] (SPARK-13857) Feature parity for ALS ML with MLLIB

2016-04-14 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15241360#comment-15241360 ] Nick Pentreath commented on SPARK-13857: For now I won't do this, but later we co

[jira] [Created] (SPARK-14635) Documentation and Examples for TF-IDF only refer to HashingTF

2016-04-14 Thread Nick Pentreath (JIRA)
Nick Pentreath created SPARK-14635: -- Summary: Documentation and Examples for TF-IDF only refer to HashingTF Key: SPARK-14635 URL: https://issues.apache.org/jira/browse/SPARK-14635 Project: Spark

[jira] [Commented] (SPARK-14409) Investigate adding a RankingEvaluator to ML

2016-04-14 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14409?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15240797#comment-15240797 ] Nick Pentreath commented on SPARK-14409: [~yongtang] [~josephkb] it would also be

[jira] [Commented] (SPARK-14489) RegressionEvaluator returns NaN for ALS in Spark ml

2016-04-14 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15240718#comment-15240718 ] Nick Pentreath commented on SPARK-14489: +1 for having CrossValidator be able to

[jira] [Commented] (SPARK-14489) RegressionEvaluator returns NaN for ALS in Spark ml

2016-04-13 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15239847#comment-15239847 ] Nick Pentreath commented on SPARK-14489: In the live setting you definitely want

[jira] [Commented] (SPARK-14409) Investigate adding a RankingEvaluator to ML

2016-04-12 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14409?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15238707#comment-15238707 ] Nick Pentreath commented on SPARK-14409: Given the amount of existing code in mll

[jira] [Resolved] (SPARK-3724) RandomForest: More options for feature subset size

2016-04-12 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3724?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath resolved SPARK-3724. --- Resolution: Fixed Fix Version/s: 2.0.0 Issue resolved by pull request 11989 [https://gi

[jira] [Updated] (SPARK-3724) RandomForest: More options for feature subset size

2016-04-12 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3724?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath updated SPARK-3724: -- Shepherd: Nick Pentreath > RandomForest: More options for feature subset size >

[jira] [Updated] (SPARK-3724) RandomForest: More options for feature subset size

2016-04-12 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3724?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath updated SPARK-3724: -- Assignee: Yong Tang > RandomForest: More options for feature subset size > -

[jira] [Commented] (SPARK-13969) Extend input format that feature hashing can handle

2016-04-12 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13969?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15237098#comment-15237098 ] Nick Pentreath commented on SPARK-13969: [~josephkb] thoughts? [~sowen] I know yo

[jira] [Comment Edited] (SPARK-13857) Feature parity for ALS ML with MLLIB

2016-04-12 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15237074#comment-15237074 ] Nick Pentreath edited comment on SPARK-13857 at 4/12/16 12:28 PM: -

[jira] [Commented] (SPARK-13857) Feature parity for ALS ML with MLLIB

2016-04-12 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15237074#comment-15237074 ] Nick Pentreath commented on SPARK-13857: Do we want to support user-user and item

[jira] [Comment Edited] (SPARK-13857) Feature parity for ALS ML with MLLIB

2016-04-12 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15237067#comment-15237067 ] Nick Pentreath edited comment on SPARK-13857 at 4/12/16 12:22 PM: -

[jira] [Commented] (SPARK-13857) Feature parity for ALS ML with MLLIB

2016-04-12 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15237067#comment-15237067 ] Nick Pentreath commented on SPARK-13857: My main point is that in cross-validatio

[jira] [Comment Edited] (SPARK-13857) Feature parity for ALS ML with MLLIB

2016-04-12 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15236796#comment-15236796 ] Nick Pentreath edited comment on SPARK-13857 at 4/12/16 9:24 AM: --

[jira] [Comment Edited] (SPARK-13857) Feature parity for ALS ML with MLLIB

2016-04-12 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15236796#comment-15236796 ] Nick Pentreath edited comment on SPARK-13857 at 4/12/16 9:20 AM: --

[jira] [Comment Edited] (SPARK-13857) Feature parity for ALS ML with MLLIB

2016-04-12 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15236796#comment-15236796 ] Nick Pentreath edited comment on SPARK-13857 at 4/12/16 9:19 AM: --

[jira] [Updated] (SPARK-14238) Add binary toggle Param to PySpark HashingTF in ML & MLlib

2016-04-12 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14238?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath updated SPARK-14238: --- Shepherd: Nick Pentreath > Add binary toggle Param to PySpark HashingTF in ML & MLlib > -

[jira] [Updated] (SPARK-14238) Add binary toggle Param to PySpark HashingTF in ML & MLlib

2016-04-12 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14238?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath updated SPARK-14238: --- Assignee: Yong Tang > Add binary toggle Param to PySpark HashingTF in ML & MLlib > --

[jira] [Commented] (SPARK-10574) HashingTF should use MurmurHash3

2016-04-12 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10574?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15236842#comment-15236842 ] Nick Pentreath commented on SPARK-10574: This also has the problem that {{mllib.f

[jira] [Updated] (SPARK-13967) Add binary toggle Param to PySpark CountVectorizer

2016-04-12 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13967?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath updated SPARK-13967: --- Shepherd: Nick Pentreath > Add binary toggle Param to PySpark CountVectorizer > -

[jira] [Updated] (SPARK-13967) Add binary toggle Param to PySpark CountVectorizer

2016-04-12 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13967?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath updated SPARK-13967: --- Assignee: Bryan Cutler > Add binary toggle Param to PySpark CountVectorizer > ---

[jira] [Commented] (SPARK-13857) Feature parity for ALS ML with MLLIB

2016-04-12 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15236796#comment-15236796 ] Nick Pentreath commented on SPARK-13857: [~mengxr] [~josephkb] In an ideal world

[jira] [Commented] (SPARK-14409) Investigate adding a RankingEvaluator to ML

2016-04-12 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14409?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15236784#comment-15236784 ] Nick Pentreath commented on SPARK-14409: Thanks for working up the design doc. I

[jira] [Commented] (SPARK-13967) Add binary toggle Param to PySpark CountVectorizer

2016-04-11 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13967?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15234771#comment-15234771 ] Nick Pentreath commented on SPARK-13967: Now that SPARK-14392 is done, this is re

[jira] [Comment Edited] (SPARK-13944) Separate out local linear algebra as a standalone module without Spark dependency

2016-04-10 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13944?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15234078#comment-15234078 ] Nick Pentreath edited comment on SPARK-13944 at 4/10/16 12:15 PM: -

[jira] [Commented] (SPARK-13944) Separate out local linear algebra as a standalone module without Spark dependency

2016-04-10 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13944?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15234078#comment-15234078 ] Nick Pentreath commented on SPARK-13944: What about the case of {{dataFrame.rdd.m

[jira] [Commented] (SPARK-13944) Separate out local linear algebra as a standalone module without Spark dependency

2016-04-10 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13944?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15234077#comment-15234077 ] Nick Pentreath commented on SPARK-13944: Type alias is a better solution if we ai

[jira] [Updated] (SPARK-14392) CountVectorizer Estimator should include binary toggle Param

2016-04-09 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath updated SPARK-14392: --- Assignee: Miao Wang > CountVectorizer Estimator should include binary toggle Param >

[jira] [Resolved] (SPARK-14392) CountVectorizer Estimator should include binary toggle Param

2016-04-09 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath resolved SPARK-14392. Resolution: Fixed Fix Version/s: 2.0.0 Issue resolved by pull request 12200 [https:/

[jira] [Commented] (SPARK-14489) RegressionEvaluator returns NaN for ALS in Spark ml

2016-04-08 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15232226#comment-15232226 ] Nick Pentreath commented on SPARK-14489: This issue would also apply to any ranki

[jira] [Comment Edited] (SPARK-13944) Separate out local linear algebra as a standalone module without Spark dependency

2016-04-08 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13944?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15231948#comment-15231948 ] Nick Pentreath edited comment on SPARK-13944 at 4/8/16 11:09 AM: --

[jira] [Commented] (SPARK-13944) Separate out local linear algebra as a standalone module without Spark dependency

2016-04-08 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13944?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15231948#comment-15231948 ] Nick Pentreath commented on SPARK-13944: What's the reasoning behind breaking cha

[jira] [Commented] (SPARK-14433) PySpark ml GaussianMixture

2016-04-08 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15231922#comment-15231922 ] Nick Pentreath commented on SPARK-14433: Go ahead > PySpark ml GaussianMixture >
