[GitHub] spark issue #15843: [SPARK-18274][ML][PYSPARK] Memory leak in PySpark String...
Github user jkbradley commented on the issue: https://github.com/apache/spark/pull/15843 Good point @holdenk --- @techaddict could you also please update the PR title to say "JavaWrapper" instead of "StringIndexer"? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #15843: [SPARK-18274][ML][PYSPARK] Memory leak in PySpark String...
Github user holdenk commented on the issue: https://github.com/apache/spark/pull/15843 ping @davies if you have time for final review/merge? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #15843: [SPARK-18274][ML][PYSPARK] Memory leak in PySpark String...
Github user holdenk commented on the issue: https://github.com/apache/spark/pull/15843 LGTM thanks for fixing this @techaddict :D :) --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #15843: [SPARK-18274][ML][PYSPARK] Memory leak in PySpark String...
Github user techaddict commented on the issue: https://github.com/apache/spark/pull/15843 @holdenk updated the description. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #15843: [SPARK-18274][ML][PYSPARK] Memory leak in PySpark String...
Github user holdenk commented on the issue: https://github.com/apache/spark/pull/15843 So this change looks good to me, but it seems like it fixes more than just the bug described in the JIRA & PR description with @jkbradley's change integrated (namely the issue with param copy which we have). For people who are looking for what's changed between versions it might make sense to explain the copy related fix the PR description as well since that is what is used in the commit log. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #15843: [SPARK-18274][ML][PYSPARK] Memory leak in PySpark String...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15843 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/68515/ Test PASSed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #15843: [SPARK-18274][ML][PYSPARK] Memory leak in PySpark String...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15843 Merged build finished. Test PASSed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #15843: [SPARK-18274][ML][PYSPARK] Memory leak in PySpark String...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15843 **[Test build #68515 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/68515/consoleFull)** for PR 15843 at commit [`01a80b9`](https://github.com/apache/spark/commit/01a80b9d783dca7e74b717b0374305cb4376208a). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #15843: [SPARK-18274][ML][PYSPARK] Memory leak in PySpark String...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15843 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/68516/ Test PASSed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #15843: [SPARK-18274][ML][PYSPARK] Memory leak in PySpark String...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15843 Merged build finished. Test PASSed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #15843: [SPARK-18274][ML][PYSPARK] Memory leak in PySpark String...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15843 **[Test build #68516 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/68516/consoleFull)** for PR 15843 at commit [`a76a1fb`](https://github.com/apache/spark/commit/a76a1fb1f10532e0e99592c53fbaa548279f69a8). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #15843: [SPARK-18274][ML][PYSPARK] Memory leak in PySpark String...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15843 **[Test build #68516 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/68516/consoleFull)** for PR 15843 at commit [`a76a1fb`](https://github.com/apache/spark/commit/a76a1fb1f10532e0e99592c53fbaa548279f69a8). --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #15843: [SPARK-18274][ML][PYSPARK] Memory leak in PySpark String...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15843 **[Test build #68515 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/68515/consoleFull)** for PR 15843 at commit [`01a80b9`](https://github.com/apache/spark/commit/01a80b9d783dca7e74b717b0374305cb4376208a). --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #15843: [SPARK-18274][ML][PYSPARK] Memory leak in PySpark String...
Github user viirya commented on the issue: https://github.com/apache/spark/pull/15843 LGTM with minor doc comment. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #15843: [SPARK-18274][ML][PYSPARK] Memory leak in PySpark String...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15843 **[Test build #68514 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/68514/consoleFull)** for PR 15843 at commit [`dc5aee3`](https://github.com/apache/spark/commit/dc5aee399707f712bd4627dc41719760a0db997d). * This patch passes all tests. * This patch merges cleanly. * This patch adds the following public classes _(experimental)_: * `final class ParquetLogRedirector implements Serializable ` * ` case class OutputSpec(` --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #15843: [SPARK-18274][ML][PYSPARK] Memory leak in PySpark String...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15843 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/68514/ Test PASSed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #15843: [SPARK-18274][ML][PYSPARK] Memory leak in PySpark String...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15843 Merged build finished. Test PASSed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #15843: [SPARK-18274][ML][PYSPARK] Memory leak in PySpark String...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15843 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/68513/ Test PASSed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #15843: [SPARK-18274][ML][PYSPARK] Memory leak in PySpark String...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15843 Merged build finished. Test PASSed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #15843: [SPARK-18274][ML][PYSPARK] Memory leak in PySpark String...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15843 **[Test build #68513 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/68513/consoleFull)** for PR 15843 at commit [`3d858a2`](https://github.com/apache/spark/commit/3d858a2326809b7e1c679b712d84a8a21767d13c). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #15843: [SPARK-18274][ML][PYSPARK] Memory leak in PySpark String...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15843 **[Test build #68514 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/68514/consoleFull)** for PR 15843 at commit [`dc5aee3`](https://github.com/apache/spark/commit/dc5aee399707f712bd4627dc41719760a0db997d). --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #15843: [SPARK-18274][ML][PYSPARK] Memory leak in PySpark String...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15843 **[Test build #68513 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/68513/consoleFull)** for PR 15843 at commit [`3d858a2`](https://github.com/apache/spark/commit/3d858a2326809b7e1c679b712d84a8a21767d13c). --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #15843: [SPARK-18274][ML][PYSPARK] Memory leak in PySpark String...
Github user techaddict commented on the issue: https://github.com/apache/spark/pull/15843 @jkbradley looks good, merged ð --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #15843: [SPARK-18274][ML][PYSPARK] Memory leak in PySpark String...
Github user jkbradley commented on the issue: https://github.com/apache/spark/pull/15843 You're right! It's another bug: copy should be implemented in JavaParams, not JavaModel. I'm sending this PR to fix that: https://github.com/techaddict/spark/pull/1 Can you please check it out and merge it into your PR if it looks OK to you? All pyspark.ml tests ran successfully with it. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #15843: [SPARK-18274][ML][PYSPARK] Memory leak in PySpark String...
Github user techaddict commented on the issue: https://github.com/apache/spark/pull/15843 @jkbradley yes I did it for `JavaWrapper` first, but try running tests with it gives https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/68478/consoleFull --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #15843: [SPARK-18274][ML][PYSPARK] Memory leak in PySpark String...
Github user jkbradley commented on the issue: https://github.com/apache/spark/pull/15843 Thanks a lot for finding & reporting this! The fix should probably go in JavaWrapper, not JavaModel, right? I tested this manually (in JavaWrapper), and it seems to fix the problematic case with StringIndexer. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #15843: [SPARK-18274][ML][PYSPARK] Memory leak in PySpark String...
Github user techaddict commented on the issue: https://github.com/apache/spark/pull/15843 cc: @jkbradley @davies @holdenk --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org