[GitHub] spark issue #15843: [SPARK-18274][ML][PYSPARK] Memory leak in PySpark String...

2016-11-11 Thread jkbradley
Github user jkbradley commented on the issue:

https://github.com/apache/spark/pull/15843
  
Good point, @holdenk. @techaddict, could you also please update the PR 
title to say "JavaWrapper" instead of "StringIndexer"?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #15843: [SPARK-18274][ML][PYSPARK] Memory leak in PySpark String...

2016-11-11 Thread holdenk
Github user holdenk commented on the issue:

https://github.com/apache/spark/pull/15843
  
ping @davies if you have time for final review/merge?





[GitHub] spark issue #15843: [SPARK-18274][ML][PYSPARK] Memory leak in PySpark String...

2016-11-11 Thread holdenk
Github user holdenk commented on the issue:

https://github.com/apache/spark/pull/15843
  
LGTM thanks for fixing this @techaddict :D :)





[GitHub] spark issue #15843: [SPARK-18274][ML][PYSPARK] Memory leak in PySpark String...

2016-11-11 Thread techaddict
Github user techaddict commented on the issue:

https://github.com/apache/spark/pull/15843
  
@holdenk updated the description.





[GitHub] spark issue #15843: [SPARK-18274][ML][PYSPARK] Memory leak in PySpark String...

2016-11-11 Thread holdenk
Github user holdenk commented on the issue:

https://github.com/apache/spark/pull/15843
  
This change looks good to me, but with @jkbradley's change integrated it 
seems to fix more than just the bug described in the JIRA & PR description 
(namely, the param copy issue we have). For people looking at what changed 
between versions, it might make sense to explain the copy-related fix in the 
PR description as well, since that is what is used in the commit log.





[GitHub] spark issue #15843: [SPARK-18274][ML][PYSPARK] Memory leak in PySpark String...

2016-11-11 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/15843
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/68515/





[GitHub] spark issue #15843: [SPARK-18274][ML][PYSPARK] Memory leak in PySpark String...

2016-11-11 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/15843
  
Merged build finished. Test PASSed.





[GitHub] spark issue #15843: [SPARK-18274][ML][PYSPARK] Memory leak in PySpark String...

2016-11-11 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/15843
  
**[Test build #68515 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/68515/consoleFull)**
 for PR 15843 at commit 
[`01a80b9`](https://github.com/apache/spark/commit/01a80b9d783dca7e74b717b0374305cb4376208a).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #15843: [SPARK-18274][ML][PYSPARK] Memory leak in PySpark String...

2016-11-11 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/15843
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/68516/





[GitHub] spark issue #15843: [SPARK-18274][ML][PYSPARK] Memory leak in PySpark String...

2016-11-11 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/15843
  
Merged build finished. Test PASSed.





[GitHub] spark issue #15843: [SPARK-18274][ML][PYSPARK] Memory leak in PySpark String...

2016-11-11 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/15843
  
**[Test build #68516 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/68516/consoleFull)**
 for PR 15843 at commit 
[`a76a1fb`](https://github.com/apache/spark/commit/a76a1fb1f10532e0e99592c53fbaa548279f69a8).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #15843: [SPARK-18274][ML][PYSPARK] Memory leak in PySpark String...

2016-11-11 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/15843
  
**[Test build #68516 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/68516/consoleFull)**
 for PR 15843 at commit 
[`a76a1fb`](https://github.com/apache/spark/commit/a76a1fb1f10532e0e99592c53fbaa548279f69a8).





[GitHub] spark issue #15843: [SPARK-18274][ML][PYSPARK] Memory leak in PySpark String...

2016-11-11 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/15843
  
**[Test build #68515 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/68515/consoleFull)**
 for PR 15843 at commit 
[`01a80b9`](https://github.com/apache/spark/commit/01a80b9d783dca7e74b717b0374305cb4376208a).





[GitHub] spark issue #15843: [SPARK-18274][ML][PYSPARK] Memory leak in PySpark String...

2016-11-11 Thread viirya
Github user viirya commented on the issue:

https://github.com/apache/spark/pull/15843
  
LGTM with minor doc comment.





[GitHub] spark issue #15843: [SPARK-18274][ML][PYSPARK] Memory leak in PySpark String...

2016-11-11 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/15843
  
**[Test build #68514 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/68514/consoleFull)**
 for PR 15843 at commit 
[`dc5aee3`](https://github.com/apache/spark/commit/dc5aee399707f712bd4627dc41719760a0db997d).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
  * `final class ParquetLogRedirector implements Serializable `
  * `  case class OutputSpec(`





[GitHub] spark issue #15843: [SPARK-18274][ML][PYSPARK] Memory leak in PySpark String...

2016-11-11 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/15843
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/68514/





[GitHub] spark issue #15843: [SPARK-18274][ML][PYSPARK] Memory leak in PySpark String...

2016-11-11 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/15843
  
Merged build finished. Test PASSed.





[GitHub] spark issue #15843: [SPARK-18274][ML][PYSPARK] Memory leak in PySpark String...

2016-11-11 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/15843
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/68513/





[GitHub] spark issue #15843: [SPARK-18274][ML][PYSPARK] Memory leak in PySpark String...

2016-11-11 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/15843
  
Merged build finished. Test PASSed.





[GitHub] spark issue #15843: [SPARK-18274][ML][PYSPARK] Memory leak in PySpark String...

2016-11-11 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/15843
  
**[Test build #68513 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/68513/consoleFull)**
 for PR 15843 at commit 
[`3d858a2`](https://github.com/apache/spark/commit/3d858a2326809b7e1c679b712d84a8a21767d13c).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #15843: [SPARK-18274][ML][PYSPARK] Memory leak in PySpark String...

2016-11-11 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/15843
  
**[Test build #68514 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/68514/consoleFull)**
 for PR 15843 at commit 
[`dc5aee3`](https://github.com/apache/spark/commit/dc5aee399707f712bd4627dc41719760a0db997d).





[GitHub] spark issue #15843: [SPARK-18274][ML][PYSPARK] Memory leak in PySpark String...

2016-11-10 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/15843
  
**[Test build #68513 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/68513/consoleFull)**
 for PR 15843 at commit 
[`3d858a2`](https://github.com/apache/spark/commit/3d858a2326809b7e1c679b712d84a8a21767d13c).





[GitHub] spark issue #15843: [SPARK-18274][ML][PYSPARK] Memory leak in PySpark String...

2016-11-10 Thread techaddict
Github user techaddict commented on the issue:

https://github.com/apache/spark/pull/15843
  
@jkbradley looks good, merged 👍 





[GitHub] spark issue #15843: [SPARK-18274][ML][PYSPARK] Memory leak in PySpark String...

2016-11-10 Thread jkbradley
Github user jkbradley commented on the issue:

https://github.com/apache/spark/pull/15843
  
You're right!  It's another bug: copy should be implemented in JavaParams, 
not JavaModel.  I'm sending this PR to fix that: 
https://github.com/techaddict/spark/pull/1

Can you please check it out and merge it into your PR if it looks OK to 
you?  All pyspark.ml tests ran successfully with it.
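
The thread does not spell out why `copy` matters here, so the following is only a sketch of one way a shared Java-side reference could go wrong once wrappers release their object on deletion. `FakeJvm`, `ParamsSketch`, `shallow_copy`, and `demo` are hypothetical names invented for illustration; none of this is the PySpark or Py4J API, and it is not the code from the PR.

```python
import copy


class FakeJvm:
    """Hypothetical stand-in for the JVM side: maps handle ids to param dicts."""

    def __init__(self):
        self.objects = {}
        self._next_id = 0

    def create(self, params):
        self._next_id += 1
        self.objects[self._next_id] = dict(params)
        return self._next_id

    def clone(self, handle):
        # Make an independent JVM-side object with the same params.
        return self.create(self.objects[handle])

    def detach(self, handle):
        self.objects.pop(handle, None)


JVM = FakeJvm()


class ParamsSketch:
    """Wrapper holding a handle to a JVM-side object, released in __del__."""

    def __init__(self, handle):
        self._handle = handle

    def __del__(self):
        JVM.detach(self._handle)

    def shallow_copy(self):
        # Shares the handle: two Python objects point at one JVM-side object.
        return copy.copy(self)

    def copy(self):
        # Clones the JVM-side object so each Python object owns its own handle.
        that = copy.copy(self)
        that._handle = JVM.clone(self._handle)
        return that


def demo():
    a = ParamsSketch(JVM.create({"maxIter": 10}))
    b = a.shallow_copy()
    del b  # releases the handle that `a` still refers to
    shared_broken = a._handle not in JVM.objects

    c = ParamsSketch(JVM.create({"maxIter": 10}))
    d = c.copy()
    del d  # releases only the clone
    clone_safe = c._handle in JVM.objects
    return shared_broken, clone_safe
```

In this toy model, putting the cloning `copy` on the common base class means every wrapper that can be copied gets its own Java-side object, rather than only the model subclasses.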





[GitHub] spark issue #15843: [SPARK-18274][ML][PYSPARK] Memory leak in PySpark String...

2016-11-10 Thread techaddict
Github user techaddict commented on the issue:

https://github.com/apache/spark/pull/15843
  
@jkbradley yes, I did it for `JavaWrapper` first, but running the tests with 
it gives: 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/68478/consoleFull





[GitHub] spark issue #15843: [SPARK-18274][ML][PYSPARK] Memory leak in PySpark String...

2016-11-10 Thread jkbradley
Github user jkbradley commented on the issue:

https://github.com/apache/spark/pull/15843
  
Thanks a lot for finding & reporting this!  The fix should probably go in 
JavaWrapper, not JavaModel, right?

I tested this manually (in JavaWrapper), and it seems to fix the 
problematic case with StringIndexer.
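
As a rough illustration of the idea of putting the cleanup in the base wrapper rather than in the model subclass, here is a minimal sketch that runs without Spark. `FakeGateway`, `WrapperSketch`, `ModelSketch`, `EstimatorSketch`, and `leak_check` are hypothetical names made up for this example; the real classes and the actual patch are in the PR and may differ.

```python
import gc


class FakeGateway:
    """Hypothetical stand-in for a Python-to-JVM gateway that tracks live objects."""

    def __init__(self):
        self.live = set()
        self._next_id = 0

    def new_object(self):
        self._next_id += 1
        self.live.add(self._next_id)
        return self._next_id

    def detach(self, obj_id):
        self.live.discard(obj_id)


GATEWAY = FakeGateway()


class WrapperSketch:
    """Base wrapper: releases the JVM-side object when the Python object dies."""

    def __init__(self, java_obj=None):
        self._java_obj = java_obj

    def __del__(self):
        # Guard against partially constructed instances.
        if getattr(self, "_java_obj", None) is not None:
            GATEWAY.detach(self._java_obj)


class ModelSketch(WrapperSketch):
    pass


class EstimatorSketch(WrapperSketch):
    def fit(self):
        # Each fit creates a new JVM-side model object.
        return ModelSketch(GATEWAY.new_object())


def leak_check(n=100):
    est = EstimatorSketch(GATEWAY.new_object())
    for _ in range(n):
        est.fit()  # result is dropped immediately
    gc.collect()
    # Only the estimator's own object should still be tracked.
    return len(GATEWAY.live)
```

Because the cleanup lives on the base class, both the estimator and the models it produces release their JVM-side objects; with the cleanup only on the model class, the estimator's object would stay tracked in this toy gateway.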





[GitHub] spark issue #15843: [SPARK-18274][ML][PYSPARK] Memory leak in PySpark String...

2016-11-10 Thread techaddict
Github user techaddict commented on the issue:

https://github.com/apache/spark/pull/15843
  
cc: @jkbradley @davies @holdenk 

