[jira] [Commented] (SPARK-12299) Remove history serving functionality from standalone Master

2016-01-18 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15105845#comment-15105845 ] Bryan Cutler commented on SPARK-12299: -- I'd be happy to work on this since I recently made some

[jira] [Created] (SPARK-16231) PySpark ML DataFrame example fails on Vector conversion

2016-06-27 Thread Bryan Cutler (JIRA)
Bryan Cutler created SPARK-16231: Summary: PySpark ML DataFrame example fails on Vector conversion Key: SPARK-16231 URL: https://issues.apache.org/jira/browse/SPARK-16231 Project: Spark

[jira] [Created] (SPARK-16197) Cleanup PySpark status api and example

2016-06-24 Thread Bryan Cutler (JIRA)
Bryan Cutler created SPARK-16197: Summary: Cleanup PySpark status api and example Key: SPARK-16197 URL: https://issues.apache.org/jira/browse/SPARK-16197 Project: Spark Issue Type:

[jira] [Commented] (SPARK-12731) PySpark docstring cleanup

2016-02-01 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15126886#comment-15126886 ] Bryan Cutler commented on SPARK-12731: -- Just to add my 2 cents since I've been working on a similar

[jira] [Resolved] (SPARK-13500) Add an example for LDA in PySpark

2016-02-26 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13500?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bryan Cutler resolved SPARK-13500. -- Resolution: Duplicate this example and others are being added as part of this > Add an

[jira] [Commented] (SPARK-13500) Add an example for LDA in PySpark

2016-02-25 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13500?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15168021#comment-15168021 ] Bryan Cutler commented on SPARK-13500: -- I'm working on it :D > Add an example for LDA in PySpark >

[jira] [Created] (SPARK-13500) Add an example for LDA in PySpark

2016-02-25 Thread Bryan Cutler (JIRA)
Bryan Cutler created SPARK-13500: Summary: Add an example for LDA in PySpark Key: SPARK-13500 URL: https://issues.apache.org/jira/browse/SPARK-13500 Project: Spark Issue Type: Improvement

[jira] [Resolved] (SPARK-11219) Make Parameter Description Format Consistent in PySpark.MLlib

2016-02-29 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11219?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bryan Cutler resolved SPARK-11219. -- Resolution: Done Fix Version/s: 2.0.0 > Make Parameter Description Format Consistent in

[jira] [Commented] (SPARK-13430) Expose ml summary function in PySpark for classification and regression models

2016-02-29 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15172775#comment-15172775 ] Bryan Cutler commented on SPARK-13430: -- I can work on adding this > Expose ml summary function in

[jira] [Commented] (SPARK-11219) Make Parameter Description Format Consistent in PySpark.MLlib

2016-01-25 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11219?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15115756#comment-15115756 ] Bryan Cutler commented on SPARK-11219: -- Regarding overall style in PySpark, I generally see single

[jira] [Commented] (SPARK-12986) Fix pydoc warnings in mllib/regression.py

2016-01-25 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15116153#comment-15116153 ] Bryan Cutler commented on SPARK-12986: -- It looks like this is caused by an indented line not being

[jira] [Commented] (SPARK-9844) File appender race condition during SparkWorker shutdown

2016-02-17 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15150913#comment-15150913 ] Bryan Cutler commented on SPARK-9844: - This error is benign for the most part, once it gets here, the

[jira] [Updated] (SPARK-10086) Flaky StreamingKMeans test in PySpark

2016-02-12 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10086?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bryan Cutler updated SPARK-10086: - Attachment: flakyRepro.py Simple script with similar operations to this StreamingKMeans test,

[jira] [Comment Edited] (SPARK-10086) Flaky StreamingKMeans test in PySpark

2016-02-12 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15145628#comment-15145628 ] Bryan Cutler edited comment on SPARK-10086 at 2/13/16 12:44 AM: Simple

[jira] [Commented] (SPARK-10086) Flaky StreamingKMeans test in PySpark

2016-02-12 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15145625#comment-15145625 ] Bryan Cutler commented on SPARK-10086: -- I was able to track down the cause of these failures, so

[jira] [Commented] (SPARK-13963) Add binary toggle Param to ml.HashingTF

2016-03-18 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1520#comment-1520 ] Bryan Cutler commented on SPARK-13963: -- Hi [~mlnick], mind if I work on this? > Add binary toggle

[jira] [Created] (SPARK-13937) PySpark ML JavaWrapper, variable _java_obj should not be static

2016-03-18 Thread Bryan Cutler (JIRA)
Bryan Cutler created SPARK-13937: Summary: PySpark ML JavaWrapper, variable _java_obj should not be static Key: SPARK-13937 URL: https://issues.apache.org/jira/browse/SPARK-13937 Project: Spark

[jira] [Commented] (SPARK-13967) Add binary toggle Param to PySpark CountVectorizer

2016-03-18 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13967?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15201803#comment-15201803 ] Bryan Cutler commented on SPARK-13967: -- Sure, I'd like to do this - thanks! > Add binary toggle

[jira] [Commented] (SPARK-14472) Cleanup PySpark-ML Java wrapper classes so that JavaWrapper will inherit from JavaCallable

2016-04-07 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14472?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15231320#comment-15231320 ] Bryan Cutler commented on SPARK-14472: -- I'm working on it :D > Cleanup PySpark-ML Java wrapper

[jira] [Created] (SPARK-14472) Cleanup PySpark-ML Java wrapper classes so that JavaWrapper will inherit from JavaCallable

2016-04-07 Thread Bryan Cutler (JIRA)
Bryan Cutler created SPARK-14472: Summary: Cleanup PySpark-ML Java wrapper classes so that JavaWrapper will inherit from JavaCallable Key: SPARK-14472 URL: https://issues.apache.org/jira/browse/SPARK-14472

[jira] [Commented] (SPARK-10086) Flaky StreamingKMeans test in PySpark

2016-04-08 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15232888#comment-15232888 ] Bryan Cutler commented on SPARK-10086: -- The changes to the test I proposed earlier are still valid,

[jira] [Commented] (SPARK-13691) Scala and Python generate inconsistent results

2016-03-19 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13691?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15197775#comment-15197775 ] Bryan Cutler commented on SPARK-13691: -- Since the problem comes from the structure of the code in

[jira] [Commented] (SPARK-13937) PySpark ML JavaWrapper, variable _java_obj should not be static

2016-03-19 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15197805#comment-15197805 ] Bryan Cutler commented on SPARK-13937: -- I'll submit a PR for this > PySpark ML JavaWrapper,

[jira] [Created] (SPARK-14087) PySpark ML JavaModel does not properly own params after being fit

2016-03-22 Thread Bryan Cutler (JIRA)
Bryan Cutler created SPARK-14087: Summary: PySpark ML JavaModel does not properly own params after being fit Key: SPARK-14087 URL: https://issues.apache.org/jira/browse/SPARK-14087 Project: Spark

[jira] [Commented] (SPARK-14087) PySpark ML JavaModel does not properly own params after being fit

2016-03-22 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15207555#comment-15207555 ] Bryan Cutler commented on SPARK-14087: -- I can post a PR for this > PySpark ML JavaModel does not

[jira] [Updated] (SPARK-14087) PySpark ML JavaModel does not properly own params after being fit

2016-03-22 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14087?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bryan Cutler updated SPARK-14087: - Attachment: feature.py > PySpark ML JavaModel does not properly own params after being fit >

[jira] [Updated] (SPARK-13625) PySpark-ML method to get list of params for an obj should not check property attr

2016-03-02 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13625?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bryan Cutler updated SPARK-13625: - Description: In PySpark params.__init__.py, the method {{Param.params()}} returns a list of

[jira] [Commented] (SPARK-13602) o.a.s.deploy.worker.DriverRunner may leak the driver processes

2016-03-02 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15176787#comment-15176787 ] Bryan Cutler commented on SPARK-13602: -- Hi [~zsxwing], mind if I work on this one? >

[jira] [Updated] (SPARK-13625) PySpark-ML method to get list of params for an obj should not check property attr

2016-03-02 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13625?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bryan Cutler updated SPARK-13625: - Description: In PySpark params.__init__.py, the method {{Param.params()}} returns a list of

[jira] [Updated] (SPARK-13625) PySpark-ML method to get list of params for an obj should not check property attr

2016-03-02 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13625?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bryan Cutler updated SPARK-13625: - Description: In PySpark params.__init__.py, the method {{Param.params()}} returns a list of

[jira] [Commented] (SPARK-13625) PySpark-ML method to get list of params for an obj should not check property attr

2016-03-02 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15176655#comment-15176655 ] Bryan Cutler commented on SPARK-13625: -- I have a fix for this, will post PR soon > PySpark-ML

[jira] [Created] (SPARK-13625) PySpark-ML method to get list of params for an obj should not check property attr

2016-03-02 Thread Bryan Cutler (JIRA)
Bryan Cutler created SPARK-13625: Summary: PySpark-ML method to get list of params for an obj should not check property attr Key: SPARK-13625 URL: https://issues.apache.org/jira/browse/SPARK-13625

[jira] [Commented] (SPARK-13602) o.a.s.deploy.worker.DriverRunner may leak the driver processes

2016-03-02 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15177169#comment-15177169 ] Bryan Cutler commented on SPARK-13602: -- Great! Thanks :D > o.a.s.deploy.worker.DriverRunner may

[jira] [Commented] (SPARK-13691) Scala and Python generate inconsistent results

2016-03-07 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13691?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15183438#comment-15183438 ] Bryan Cutler commented on SPARK-13691: -- The reason for this is that PySpark serializes the closure

[jira] [Commented] (SPARK-14087) PySpark ML JavaModel does not properly own params after being fit

2016-04-04 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15225326#comment-15225326 ] Bryan Cutler commented on SPARK-14087: -- I don't think this would completely solve it, please see my

[jira] [Commented] (SPARK-15448) Flaky test:pyspark.ml.tests.DefaultValuesTests.test_java_params

2016-05-20 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15293920#comment-15293920 ] Bryan Cutler commented on SPARK-15448: -- I believe this was recently fixed in SPARK-15444 > Flaky

[jira] [Commented] (SPARK-15391) Spark executor OOM during TimSort

2016-05-19 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15291582#comment-15291582 ] Bryan Cutler commented on SPARK-15391: -- This looks to be a duplicate of SPARK-15332 > Spark

[jira] [Commented] (SPARK-15456) PySpark Shell fails to create SparkContext if HiveConf not found

2016-05-20 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15456?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15294446#comment-15294446 ] Bryan Cutler commented on SPARK-15456: -- I can submit a fix for this > PySpark Shell fails to create

[jira] [Created] (SPARK-15456) PySpark Shell fails to create SparkContext if HiveConf not found

2016-05-20 Thread Bryan Cutler (JIRA)
Bryan Cutler created SPARK-15456: Summary: PySpark Shell fails to create SparkContext if HiveConf not found Key: SPARK-15456 URL: https://issues.apache.org/jira/browse/SPARK-15456 Project: Spark

[jira] [Commented] (SPARK-15100) Audit: ml.feature

2016-05-16 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15285366#comment-15285366 ] Bryan Cutler commented on SPARK-15100: -- I can do a PR to update CountVectorizer and HashingTF >

[jira] [Commented] (SPARK-15100) Audit: ml.feature

2016-05-18 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15289350#comment-15289350 ] Bryan Cutler commented on SPARK-15100: -- I did a quick pass through Scala and Python APIs, just found

[jira] [Commented] (SPARK-15100) Audit: ml.feature

2016-05-18 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15289402#comment-15289402 ] Bryan Cutler commented on SPARK-15100: -- sure, I hadn't started on those yet > Audit: ml.feature >

[jira] [Commented] (SPARK-15009) PySpark CountVectorizerModel should be able to construct from vocabulary list

2016-04-29 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15009?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15264515#comment-15264515 ] Bryan Cutler commented on SPARK-15009: -- I'm working on this > PySpark CountVectorizerModel should

[jira] [Created] (SPARK-15009) PySpark CountVectorizerModel should be able to construct from vocabulary list

2016-04-29 Thread Bryan Cutler (JIRA)
Bryan Cutler created SPARK-15009: Summary: PySpark CountVectorizerModel should be able to construct from vocabulary list Key: SPARK-15009 URL: https://issues.apache.org/jira/browse/SPARK-15009

[jira] [Created] (SPARK-15018) PySpark ML Pipeline fails when no stages set

2016-04-29 Thread Bryan Cutler (JIRA)
Bryan Cutler created SPARK-15018: Summary: PySpark ML Pipeline fails when no stages set Key: SPARK-15018 URL: https://issues.apache.org/jira/browse/SPARK-15018 Project: Spark Issue Type: Bug

[jira] [Commented] (SPARK-15018) PySpark ML Pipeline fails when no stages set

2016-04-29 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15264858#comment-15264858 ] Bryan Cutler commented on SPARK-15018: -- I have a fix for this > PySpark ML Pipeline fails when no

[jira] [Created] (SPARK-14779) Incorrect log message in Worker while handling KillExecutor message

2016-04-20 Thread Bryan Cutler (JIRA)
Bryan Cutler created SPARK-14779: Summary: Incorrect log message in Worker while handling KillExecutor message Key: SPARK-14779 URL: https://issues.apache.org/jira/browse/SPARK-14779 Project: Spark

[jira] [Commented] (SPARK-15497) DecisionTreeClassificationModel can't be saved within in Pipeline caused by not implement Writable

2016-05-24 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15497?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15298837#comment-15298837 ] Bryan Cutler commented on SPARK-15497: -- This was added in SPARK-11888 and will be in Spark 2.0. >

[jira] [Resolved] (SPARK-15497) DecisionTreeClassificationModel can't be saved within in Pipeline caused by not implement Writable

2016-05-24 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15497?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bryan Cutler resolved SPARK-15497. -- Resolution: Duplicate Fix Version/s: 2.0.0 > DecisionTreeClassificationModel can't be

[jira] [Updated] (SPARK-16197) Cleanup PySpark status api and example

2016-07-22 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16197?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bryan Cutler updated SPARK-16197: - Description: Cleanup of Status API example to use SparkSession and be more consistent with

[jira] [Updated] (SPARK-16800) Fix Java Examples that throw exception

2016-07-29 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16800?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bryan Cutler updated SPARK-16800: - Description: Some Java examples fail to run due to an exception thrown when using

[jira] [Updated] (SPARK-16800) Fix Java Examples that throw exception

2016-07-29 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16800?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bryan Cutler updated SPARK-16800: - Description: Some Java examples fail to run due to an exception thrown when using

[jira] [Commented] (SPARK-16832) CrossValidator and TrainValidationSplit are not random without seed

2016-08-01 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15403067#comment-15403067 ] Bryan Cutler commented on SPARK-16832: -- The default seed value is a constant, this is the trait

[jira] [Commented] (SPARK-16421) Improve output from ML examples

2016-07-18 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15382579#comment-15382579 ] Bryan Cutler commented on SPARK-16421: -- Yeah, I'm working on it now > Improve output from ML

[jira] [Updated] (SPARK-16403) Example cleanup and fix minor issues

2016-07-11 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16403?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bryan Cutler updated SPARK-16403: - Description: General cleanup of examples, focused on PySpark ML, to remove unused imports, sync

[jira] [Resolved] (SPARK-14087) PySpark ML JavaModel does not properly own params after being fit

2016-07-11 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14087?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bryan Cutler resolved SPARK-14087. -- Resolution: Resolved Fix Version/s: 2.0.0 This is no longer an issue as the PySpark

[jira] [Created] (SPARK-16260) PySpark ML Example Improvements and Cleanup

2016-06-28 Thread Bryan Cutler (JIRA)
Bryan Cutler created SPARK-16260: Summary: PySpark ML Example Improvements and Cleanup Key: SPARK-16260 URL: https://issues.apache.org/jira/browse/SPARK-16260 Project: Spark Issue Type:

[jira] [Created] (SPARK-16261) Fix Incorrect appNames in PySpark ML Examples

2016-06-28 Thread Bryan Cutler (JIRA)
Bryan Cutler created SPARK-16261: Summary: Fix Incorrect appNames in PySpark ML Examples Key: SPARK-16261 URL: https://issues.apache.org/jira/browse/SPARK-16261 Project: Spark Issue Type:

[jira] [Commented] (SPARK-12428) Write a script to run all PySpark MLlib examples for testing

2016-06-28 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15353488#comment-15353488 ] Bryan Cutler commented on SPARK-12428: -- Hey Holden, I was thinking about doing this anyway and found

[jira] [Commented] (SPARK-16247) Using pyspark dataframe with pipeline and cross validator

2016-06-28 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15353555#comment-15353555 ] Bryan Cutler commented on SPARK-16247: -- I'm not sure if this is the issue, but the first parameter

[jira] [Commented] (SPARK-16260) PySpark ML Example Improvements and Cleanup

2016-07-05 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16260?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15363135#comment-15363135 ] Bryan Cutler commented on SPARK-16260: -- I have a couple tasks I still plan to add here, I will close

[jira] [Created] (SPARK-16421) Improve output from ML examples

2016-07-07 Thread Bryan Cutler (JIRA)
Bryan Cutler created SPARK-16421: Summary: Improve output from ML examples Key: SPARK-16421 URL: https://issues.apache.org/jira/browse/SPARK-16421 Project: Spark Issue Type: Improvement

[jira] [Commented] (SPARK-16421) Improve output from ML examples

2016-07-07 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15366461#comment-15366461 ] Bryan Cutler commented on SPARK-16421: -- I'll be working on this once the blocking issue is resolved,

[jira] [Updated] (SPARK-16421) Improve output from ML examples

2016-07-07 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16421?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bryan Cutler updated SPARK-16421: - Issue Type: Sub-task (was: Improvement) Parent: SPARK-16260 > Improve output from ML

[jira] [Commented] (SPARK-16403) Example cleanup and fix minor issues

2016-07-06 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15365174#comment-15365174 ] Bryan Cutler commented on SPARK-16403: -- I'm working on this > Example cleanup and fix minor issues

[jira] [Updated] (SPARK-16403) Example cleanup and fix minor issues

2016-07-06 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16403?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bryan Cutler updated SPARK-16403: - Priority: Trivial (was: Major) > Example cleanup and fix minor issues >

[jira] [Created] (SPARK-16403) Example cleanup and fix minor issues

2016-07-06 Thread Bryan Cutler (JIRA)
Bryan Cutler created SPARK-16403: Summary: Example cleanup and fix minor issues Key: SPARK-16403 URL: https://issues.apache.org/jira/browse/SPARK-16403 Project: Spark Issue Type: Sub-task

[jira] [Updated] (SPARK-16403) Example cleanup and fix minor issues

2016-07-06 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16403?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bryan Cutler updated SPARK-16403: - Description: General cleanup of examples, focused on PySpark ML, to remove unused imports, sync

[jira] [Commented] (SPARK-15623) 2.0 python coverage ml.feature

2016-07-11 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15371229#comment-15371229 ] Bryan Cutler commented on SPARK-15623: -- Hey [~holdenk], think I can close this off now or would you

[jira] [Commented] (SPARK-16832) CrossValidator and TrainValidationSplit are not random without seed

2016-08-04 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15408190#comment-15408190 ] Bryan Cutler commented on SPARK-16832: -- Yeah, I'm not sure of the reason myself, but I agree with

[jira] [Updated] (SPARK-16260) ML Example Improvements and Cleanup

2016-08-05 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16260?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bryan Cutler updated SPARK-16260: - Summary: ML Example Improvements and Cleanup (was: PySpark ML Example Improvements and Cleanup)

[jira] [Reopened] (SPARK-15702) Update document programming-guide accumulator section

2016-08-05 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15702?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bryan Cutler reopened SPARK-15702: -- I'm reopening this because I think the current programming guide accumulator section is

[jira] [Updated] (SPARK-16260) ML Example Improvements and Cleanup

2016-08-05 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16260?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bryan Cutler updated SPARK-16260: - Description: This parent task is to track a few possible improvements and cleanup for PySpark

[jira] [Created] (SPARK-16932) Programming-guide Accumulator section should be more clear w.r.t new API

2016-08-05 Thread Bryan Cutler (JIRA)
Bryan Cutler created SPARK-16932: Summary: Programming-guide Accumulator section should be more clear w.r.t new API Key: SPARK-16932 URL: https://issues.apache.org/jira/browse/SPARK-16932 Project:

[jira] [Closed] (SPARK-15702) Update document programming-guide accumulator section

2016-08-05 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15702?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bryan Cutler closed SPARK-15702. Resolution: Fixed > Update document programming-guide accumulator section >

[jira] [Commented] (SPARK-16765) Add Pipeline API example for KMeans

2016-07-28 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15397994#comment-15397994 ] Bryan Cutler commented on SPARK-16765: -- Was there some specific use of Pipelines with KMeans that

[jira] [Created] (SPARK-16800) Fix Java Examples that throw exception

2016-07-29 Thread Bryan Cutler (JIRA)
Bryan Cutler created SPARK-16800: Summary: Fix Java Examples that throw exception Key: SPARK-16800 URL: https://issues.apache.org/jira/browse/SPARK-16800 Project: Spark Issue Type: Sub-task

[jira] [Commented] (SPARK-16247) Using pyspark dataframe with pipeline and cross validator

2016-06-30 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15357548#comment-15357548 ] Bryan Cutler commented on SPARK-16247: -- Great, glad that solved the problem! A cross-validation

[jira] [Commented] (SPARK-16247) Using pyspark dataframe with pipeline and cross validator

2016-06-29 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15355456#comment-15355456 ] Bryan Cutler commented on SPARK-16247: -- I think you need to specify the {labelCol} in

[jira] [Comment Edited] (SPARK-16247) Using pyspark dataframe with pipeline and cross validator

2016-06-29 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15355456#comment-15355456 ] Bryan Cutler edited comment on SPARK-16247 at 6/29/16 3:53 PM: --- I think you

[jira] [Commented] (SPARK-15009) PySpark CountVectorizerModel should be able to construct from vocabulary list

2016-07-03 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15009?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15360782#comment-15360782 ] Bryan Cutler commented on SPARK-15009: -- At the time I reported this, it was blocked by SPARK-14087

[jira] [Commented] (SPARK-19282) RandomForestRegressionModel should expose getMaxDepth

2017-02-01 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15848963#comment-15848963 ] Bryan Cutler commented on SPARK-19282: -- Is this another case of SPARK-10931? The PySpark ML models

[jira] [Commented] (SPARK-18813) MLlib 2.2 Roadmap

2017-02-01 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18813?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15849132#comment-15849132 ] Bryan Cutler commented on SPARK-18813: -- It would be nice to get some of the Python param/uid issues

[jira] [Commented] (SPARK-19348) pyspark.ml.Pipeline gets corrupted under multi threaded use

2017-02-02 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15850705#comment-15850705 ] Bryan Cutler commented on SPARK-19348: -- The problem here is with the @keyword_only decorator that is

[jira] [Commented] (SPARK-19071) Optimizations for ML Pipeline Tuning

2017-01-24 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15836958#comment-15836958 ] Bryan Cutler commented on SPARK-19071: -- [~cyp] thanks for your interest. I agree with Nick that

[jira] [Commented] (SPARK-19071) Optimizations for ML Pipeline Tuning

2017-01-24 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15836957#comment-15836957 ] Bryan Cutler commented on SPARK-19071: -- Thanks for the comments [~josephkb] and [~mlnick]. I'll

[jira] [Commented] (SPARK-19357) Parallel Model Evaluation for ML Pipeline Tuning

2017-01-24 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15836977#comment-15836977 ] Bryan Cutler commented on SPARK-19357: -- I'm working on this > Parallel Model Evaluation for ML

[jira] [Created] (SPARK-19357) Parallel Model Evaluation for ML Pipeline Tuning

2017-01-24 Thread Bryan Cutler (JIRA)
Bryan Cutler created SPARK-19357: Summary: Parallel Model Evaluation for ML Pipeline Tuning Key: SPARK-19357 URL: https://issues.apache.org/jira/browse/SPARK-19357 Project: Spark Issue Type:

[jira] [Updated] (SPARK-19357) Parallel Model Evaluation for ML Tuning

2017-01-24 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19357?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bryan Cutler updated SPARK-19357: - Summary: Parallel Model Evaluation for ML Tuning (was: Parallel Model Evaluation for ML

[jira] [Updated] (SPARK-13534) Implement Apache Arrow serializer for Spark DataFrame for use in DataFrame.toPandas

2017-01-24 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13534?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bryan Cutler updated SPARK-13534: - Attachment: benchmark.py Script for benchmarks > Implement Apache Arrow serializer for Spark

[jira] [Commented] (SPARK-19216) LogisticRegressionModel is missing getThreshold()

2017-01-17 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19216?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15826632#comment-15826632 ] Bryan Cutler commented on SPARK-19216: -- This is a valid issue, but is sort of a duplicate of

[jira] [Updated] (SPARK-15018) PySpark ML Pipeline raises unclear error when no stages set

2016-08-19 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15018?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bryan Cutler updated SPARK-15018: - Description: When fitting a PySpark Pipeline with no stages, it should work as an identity

[jira] [Updated] (SPARK-15018) PySpark ML Pipeline fails when no stages set

2016-08-19 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15018?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bryan Cutler updated SPARK-15018: - Issue Type: Improvement (was: Bug) > PySpark ML Pipeline fails when no stages set >

[jira] [Updated] (SPARK-15018) PySpark ML Pipeline raises unclear error when no stages set

2016-08-19 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15018?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bryan Cutler updated SPARK-15018: - Summary: PySpark ML Pipeline raises unclear error when no stages set (was: PySpark ML Pipeline

[jira] [Updated] (SPARK-15018) PySpark ML Pipeline raises unclear error when no stages set

2016-08-19 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15018?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bryan Cutler updated SPARK-15018: - Description: When fitting a PySpark Pipeline with no stages, it should work as an identity

[jira] [Updated] (SPARK-15018) PySpark ML Pipeline fails when no stages set

2016-08-19 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15018?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bryan Cutler updated SPARK-15018: - Priority: Minor (was: Major) > PySpark ML Pipeline fails when no stages set >

[jira] [Resolved] (SPARK-16197) Cleanup PySpark status api and example

2016-08-19 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16197?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bryan Cutler resolved SPARK-16197. -- Resolution: Won't Fix This minor change would be better addressed during a QA audit >

[jira] [Updated] (SPARK-17161) Add PySpark-ML JavaWrapper convenience function to create py4j JavaArrays

2016-08-19 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17161?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bryan Cutler updated SPARK-17161: - Summary: Add PySpark-ML JavaWrapper convenience function to create py4j JavaArrays (was: Add

[jira] [Created] (SPARK-17161) Add PySpark-ML JavaWrapper convienience function to create py4j JavaArrays

2016-08-19 Thread Bryan Cutler (JIRA)
Bryan Cutler created SPARK-17161: Summary: Add PySpark-ML JavaWrapper convienience function to create py4j JavaArrays Key: SPARK-17161 URL: https://issues.apache.org/jira/browse/SPARK-17161 Project:

[jira] [Commented] (SPARK-17508) Setting weightCol to None in ML library causes an error

2016-09-14 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17508?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15491163#comment-15491163 ] Bryan Cutler commented on SPARK-17508: -- I had a similar discussion in this PR
