[jira] [Commented] (SPARK-14740) CrossValidatorModel.bestModel does not include hyper-parameters

2016-04-20 Thread Paul Shearer (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15250055#comment-15250055 ] Paul Shearer commented on SPARK-14740: It appears it's accessible through

[jira] [Updated] (SPARK-14740) CrossValidatorModel.bestModel does not include hyper-parameters

2016-04-20 Thread Paul Shearer (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14740?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Paul Shearer updated SPARK-14740: Component/s: (was: Spark Core) PySpark MLlib >

[jira] [Commented] (SPARK-13973) `ipython notebook` is going away...

2016-04-20 Thread Paul Shearer (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15249921#comment-15249921 ] Paul Shearer commented on SPARK-13973: Done: https://github.com/apache/spark/pull/12528 I'm a bit

[jira] [Comment Edited] (SPARK-13973) `ipython notebook` is going away...

2016-04-20 Thread Paul Shearer (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15249705#comment-15249705 ] Paul Shearer edited comment on SPARK-13973 at 4/20/16 12:09 PM: Bottom

[jira] [Comment Edited] (SPARK-13973) `ipython notebook` is going away...

2016-04-20 Thread Paul Shearer (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15249705#comment-15249705 ] Paul Shearer edited comment on SPARK-13973 at 4/20/16 12:08 PM: Bottom

[jira] [Commented] (SPARK-13973) `ipython notebook` is going away...

2016-04-20 Thread Paul Shearer (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15249705#comment-15249705 ] Paul Shearer commented on SPARK-13973: Bottom line... I think IPYTHON=1 should either (1) mean

[jira] [Comment Edited] (SPARK-13973) `ipython notebook` is going away...

2016-04-20 Thread Paul Shearer (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15249656#comment-15249656 ] Paul Shearer edited comment on SPARK-13973 at 4/20/16 11:36 AM: In my

[jira] [Comment Edited] (SPARK-13973) `ipython notebook` is going away...

2016-04-20 Thread Paul Shearer (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15249656#comment-15249656 ] Paul Shearer edited comment on SPARK-13973 at 4/20/16 11:18 AM: Just to

[jira] [Comment Edited] (SPARK-13973) `ipython notebook` is going away...

2016-04-20 Thread Paul Shearer (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15249656#comment-15249656 ] Paul Shearer edited comment on SPARK-13973 at 4/20/16 11:17 AM: Just to

[jira] [Commented] (SPARK-13973) `ipython notebook` is going away...

2016-04-20 Thread Paul Shearer (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15249656#comment-15249656 ] Paul Shearer commented on SPARK-13973: Just to get us on the same page - pyspark is a python shell

[jira] [Comment Edited] (SPARK-13973) `ipython notebook` is going away...

2016-04-20 Thread Paul Shearer (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15249628#comment-15249628 ] Paul Shearer edited comment on SPARK-13973 at 4/20/16 10:48 AM: The

[jira] [Commented] (SPARK-13973) `ipython notebook` is going away...

2016-04-20 Thread Paul Shearer (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15249628#comment-15249628 ] Paul Shearer commented on SPARK-13973: The problem with this change is that it creates a bug for

[jira] [Updated] (SPARK-14740) CrossValidatorModel.bestModel does not include hyper-parameters

2016-04-19 Thread Paul Shearer (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14740?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Paul Shearer updated SPARK-14740: Description: If you tune hyperparameters using a CrossValidator object in PySpark, you may not

[jira] [Created] (SPARK-14740) CrossValidatorModel.bestModel does not include hyper-parameters

2016-04-19 Thread Paul Shearer (JIRA)
Paul Shearer created SPARK-14740: Summary: CrossValidatorModel.bestModel does not include hyper-parameters Key: SPARK-14740 URL: https://issues.apache.org/jira/browse/SPARK-14740 Project: Spark

[jira] [Updated] (SPARK-14241) Output of monotonically_increasing_id lacks stable relation with rows of DataFrame

2016-03-29 Thread Paul Shearer (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14241?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Paul Shearer updated SPARK-14241: Description: If you use monotonically_increasing_id() to append a column of IDs to a DataFrame,

[jira] [Created] (SPARK-14241) Output of monotonically_increasing_id lacks stable relation with rows of DataFrame

2016-03-29 Thread Paul Shearer (JIRA)
Paul Shearer created SPARK-14241: Summary: Output of monotonically_increasing_id lacks stable relation with rows of DataFrame Key: SPARK-14241 URL: https://issues.apache.org/jira/browse/SPARK-14241

[jira] [Updated] (SPARK-12824) Failure to maintain consistent RDD references in pyspark

2016-01-14 Thread Paul Shearer (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12824?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Paul Shearer updated SPARK-12824: Description: Below is a simple `pyspark` script that tries to split an RDD into a dictionary

[jira] [Updated] (SPARK-12824) Failure to maintain consistent RDD references in pyspark

2016-01-14 Thread Paul Shearer (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12824?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Paul Shearer updated SPARK-12824: Affects Version/s: 1.5.2 > Failure to maintain consistent RDD references in pyspark >

[jira] [Updated] (SPARK-12824) Failure to maintain consistent RDD references in pyspark

2016-01-14 Thread Paul Shearer (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12824?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Paul Shearer updated SPARK-12824: Description: Below is a simple {{pyspark}} script that tries to split an RDD into a dictionary

[jira] [Created] (SPARK-12824) Failure to maintain consistent RDD references in pyspark

2016-01-14 Thread Paul Shearer (JIRA)
Paul Shearer created SPARK-12824: Summary: Failure to maintain consistent RDD references in pyspark Key: SPARK-12824 URL: https://issues.apache.org/jira/browse/SPARK-12824 Project: Spark

[jira] [Resolved] (SPARK-12824) Failure to maintain consistent RDD references in pyspark

2016-01-14 Thread Paul Shearer (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12824?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Paul Shearer resolved SPARK-12824. Resolution: Not A Problem. This is not so much a Spark issue as a Python gotcha. key_value
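
The resolution text above is cut off, so the exact cause is not shown here. One common Python gotcha that fits the description (a loop building a dictionary of filtered collections) is late binding of loop variables in lambdas, and the sketch below assumes that is what was meant. It uses plain Python lists rather than RDDs so it runs without Spark; `records`, `keys`, and `select` are hypothetical names, and `key_value` is borrowed from the truncated snippet with an assumed role as the loop variable.

```python
# Hypothetical data standing in for an RDD of (key, value) pairs.
records = [("a", 1), ("b", 2), ("c", 3)]
keys = ["a", "b", "c"]

# Late binding: each lambda looks up `key_value` when it is called,
# not when it is defined, so every predicate sees the final loop value.
late_bound = {}
for key_value in keys:
    late_bound[key_value] = lambda rec: rec[0] == key_value

# Common workaround: bind the current value as a default argument,
# which is evaluated once, at definition time.
eager = {}
for key_value in keys:
    eager[key_value] = lambda rec, k=key_value: rec[0] == k


def select(predicates):
    """Apply each predicate to `records` after the loops have finished."""
    return {k: [r for r in records if p(r)] for k, p in predicates.items()}


late_result = select(late_bound)   # every key maps to the records for "c"
eager_result = select(eager)       # each key maps to its own records
```

With a lazily evaluated `RDD.filter`, the predicate runs even later (when an action is triggered), so the same effect would show up as several dictionary entries that all appear to refer to the same data.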

[jira] [Updated] (SPARK-12519) "Managed memory leak detected" when using distinct on PySpark DataFrame

2015-12-24 Thread Paul Shearer (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12519?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Paul Shearer updated SPARK-12519: Description: After running the distinct() method to transform a DataFrame, subsequent actions

[jira] [Created] (SPARK-12519) "Managed memory leak detected" when using distinct on PySpark DataFrame

2015-12-24 Thread Paul Shearer (JIRA)
Paul Shearer created SPARK-12519: Summary: "Managed memory leak detected" when using distinct on PySpark DataFrame Key: SPARK-12519 URL: https://issues.apache.org/jira/browse/SPARK-12519 Project:
