[GitHub] spark issue #15415: [SPARK-14503][ML] spark.ml API for FPGrowth
Github user jkbradley commented on the issue: https://github.com/apache/spark/pull/15415 No problem, thanks! Could you please create a subtask for docs? Merging with master --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #15415: [SPARK-14503][ML] spark.ml API for FPGrowth
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15415 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/73616/ Test PASSed.
[GitHub] spark issue #15415: [SPARK-14503][ML] spark.ml API for FPGrowth
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15415 **[Test build #73616 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73616/testReport)** for PR 15415 at commit [`9940c47`](https://github.com/apache/spark/commit/9940c4716daf47c6678fdd45abba8afa71a3e53a). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #15415: [SPARK-14503][ML] spark.ml API for FPGrowth
Github user hhbyyh commented on the issue: https://github.com/apache/spark/pull/15415 Sorry I missed your comments. I can send a follow-up together with the documentation.
[GitHub] spark issue #15415: [SPARK-14503][ML] spark.ml API for FPGrowth
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15415 **[Test build #73616 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73616/testReport)** for PR 15415 at commit [`9940c47`](https://github.com/apache/spark/commit/9940c4716daf47c6678fdd45abba8afa71a3e53a).
[GitHub] spark issue #15415: [SPARK-14503][ML] spark.ml API for FPGrowth
Github user jkbradley commented on the issue: https://github.com/apache/spark/pull/15415 I'm going to go ahead and merge this after tests to make sure it's in 2.2, but can you please send a follow-up for my last 2 comments? Thanks!
[GitHub] spark issue #15415: [SPARK-14503][ML] spark.ml API for FPGrowth
Github user jkbradley commented on the issue: https://github.com/apache/spark/pull/15415 test this please
[GitHub] spark issue #15415: [SPARK-14503][ML] spark.ml API for FPGrowth
Github user jkbradley commented on the issue: https://github.com/apache/spark/pull/15415 I don't think we need to support the default prediction (for empty/null inputs) now. I agree we could use an Imputer or add something as an option later on. Will take a final look now.
[GitHub] spark issue #15415: [SPARK-14503][ML] spark.ml API for FPGrowth
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15415 Merged build finished. Test PASSed.
[GitHub] spark issue #15415: [SPARK-14503][ML] spark.ml API for FPGrowth
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15415 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/73453/ Test PASSed.
[GitHub] spark issue #15415: [SPARK-14503][ML] spark.ml API for FPGrowth
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15415 **[Test build #73453 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73453/testReport)** for PR 15415 at commit [`9940c47`](https://github.com/apache/spark/commit/9940c4716daf47c6678fdd45abba8afa71a3e53a). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #15415: [SPARK-14503][ML] spark.ml API for FPGrowth
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15415 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/73452/ Test PASSed.
[GitHub] spark issue #15415: [SPARK-14503][ML] spark.ml API for FPGrowth
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15415 Merged build finished. Test PASSed.
[GitHub] spark issue #15415: [SPARK-14503][ML] spark.ml API for FPGrowth
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15415 **[Test build #73452 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73452/testReport)** for PR 15415 at commit [`3d7ed0b`](https://github.com/apache/spark/commit/3d7ed0ba58ec16274930448e80ad57d52430cc95). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #15415: [SPARK-14503][ML] spark.ml API for FPGrowth
Github user hhbyyh commented on the issue: https://github.com/apache/spark/pull/15415 > Btw, I could imagine us wanting to change this later. If we're recommending items a user could add to their basket, then we might want to suggest the most frequent item rather than nothing. Do we need to support this now? That may not be expected in all scenarios. And it seems an Imputer can help with the issue.
[GitHub] spark issue #15415: [SPARK-14503][ML] spark.ml API for FPGrowth
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15415 **[Test build #73453 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73453/testReport)** for PR 15415 at commit [`9940c47`](https://github.com/apache/spark/commit/9940c4716daf47c6678fdd45abba8afa71a3e53a).
[GitHub] spark issue #15415: [SPARK-14503][ML] spark.ml API for FPGrowth
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15415 **[Test build #73452 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73452/testReport)** for PR 15415 at commit [`3d7ed0b`](https://github.com/apache/spark/commit/3d7ed0ba58ec16274930448e80ad57d52430cc95).
[GitHub] spark issue #15415: [SPARK-14503][ML] spark.ml API for FPGrowth
Github user hhbyyh commented on the issue: https://github.com/apache/spark/pull/15415 Thanks @jkbradley for contributing the code. That helps a lot.
[GitHub] spark issue #15415: [SPARK-14503][ML] spark.ml API for FPGrowth
Github user jkbradley commented on the issue: https://github.com/apache/spark/pull/15415 I agree that, if the set of rules is small (1-2 GB max), then collecting and broadcasting it is best. But for larger sets of rules, we'd have to keep it distributed. I'm very surprised by the time difference in your comparison. I'll experiment a little myself and get back soon.
[GitHub] spark issue #15415: [SPARK-14503][ML] spark.ml API for FPGrowth
Github user hhbyyh commented on the issue: https://github.com/apache/spark/pull/15415 Hi @jkbradley After further performance comparison, I found that using a broadcast gives much better performance for the transform. I tested with some public data from http://fimi.ua.ac.be/data/. For the kosarak (.gz) data (300K records), the current transform takes more than 3 hours with only 2 rules, while the broadcast version takes only 0.15 sec with 900 rules (I adjusted support and confidence).
```
import org.apache.spark.sql.functions.{col, udf}

// Collect the (antecedent, consequent) pairs to the driver and broadcast them
val rules = associationRules.rdd.map(r =>
  (r.getSeq[Int](0), r.getSeq[Int](1))
).collect()
val brRules = dataset.sparkSession.sparkContext.broadcast(rules)

// For each rule, examine the input items and summarize the consequents
val predictUDF = udf((items: Seq[Int]) => brRules.value.flatMap( r =>
  if (r._1.forall(items.contains(_))) r._2 else Seq.empty[Int]
).distinct)
dataset.withColumn($(predictionCol), predictUDF(col($(featuresCol))))
```
The test can be verified with the code: https://gist.github.com/hhbyyh/06fcf3fdc8f6edda971847bcb5783d99 https://gist.github.com/hhbyyh/889b88ae2176d1263fdc9dd3e29d1c2d Thinking again, the broadcast implementation may have much better performance in any case. The major issue is how to support generic item types with the UDF.
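For readers following the thread, the per-row matching done inside the broadcast UDF can be tried out without a Spark session. The following is only a minimal sketch in plain Scala, assuming rules are held as (antecedent, consequent) pairs as in the snippet above; the names `RuleMatchSketch`, `Rule`, and `predict` are hypothetical and not part of the PR, and converting the basket to a `Set` for membership checks is an illustrative variation rather than what the comment's code does.

```scala
object RuleMatchSketch {
  // Hypothetical alias: a rule is (antecedent, consequent), mirroring the
  // tuples collected from the association rules in the comment above.
  type Rule = (Seq[Int], Seq[Int])

  // For one basket, gather the consequents of every rule whose antecedent
  // is fully contained in the basket, then drop duplicate consequents.
  def predict(rules: Seq[Rule], items: Seq[Int]): Seq[Int] = {
    // Assumption for this sketch: a Set gives cheap membership checks;
    // the snippet in the thread calls items.contains on the Seq directly.
    val itemSet = items.toSet
    rules.flatMap { case (antecedent, consequent) =>
      if (antecedent.forall(itemSet.contains)) consequent else Seq.empty[Int]
    }.distinct
  }
}
```

As in the snippet from the comment, consequents that already appear in the input basket are not filtered out, and a rule with an empty antecedent would match every basket.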
[GitHub] spark issue #15415: [SPARK-14503][ML] spark.ml API for FPGrowth
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15415 Merged build finished. Test PASSed.
[GitHub] spark issue #15415: [SPARK-14503][ML] spark.ml API for FPGrowth
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15415 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/73384/ Test PASSed.
[GitHub] spark issue #15415: [SPARK-14503][ML] spark.ml API for FPGrowth
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15415 **[Test build #73384 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73384/testReport)** for PR 15415 at commit [`bfcef4a`](https://github.com/apache/spark/commit/bfcef4a72ba1a6ee27d6099f591bb140f9c0d59e). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #15415: [SPARK-14503][ML] spark.ml API for FPGrowth
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15415 **[Test build #73384 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73384/testReport)** for PR 15415 at commit [`bfcef4a`](https://github.com/apache/spark/commit/bfcef4a72ba1a6ee27d6099f591bb140f9c0d59e).
[GitHub] spark issue #15415: [SPARK-14503][ML] spark.ml API for FPGrowth
Github user hhbyyh commented on the issue: https://github.com/apache/spark/pull/15415 I tried a few different ways to implement the transform: https://gist.github.com/hhbyyh/889b88ae2176d1263fdc9dd3e29d1c2d. The performance is actually similar, while the current one can maintain the original order of the input dataset. I would be glad to see a more optimized version.
[GitHub] spark issue #15415: [SPARK-14503][ML] spark.ml API for FPGrowth
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15415 Merged build finished. Test PASSed.
[GitHub] spark issue #15415: [SPARK-14503][ML] spark.ml API for FPGrowth
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15415 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/73354/ Test PASSed.
[GitHub] spark issue #15415: [SPARK-14503][ML] spark.ml API for FPGrowth
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15415 **[Test build #73354 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73354/testReport)** for PR 15415 at commit [`d8e4884`](https://github.com/apache/spark/commit/d8e48846599ca3e137a2ad3e5b7eda624157ed5f). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #15415: [SPARK-14503][ML] spark.ml API for FPGrowth
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15415 **[Test build #73354 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73354/testReport)** for PR 15415 at commit [`d8e4884`](https://github.com/apache/spark/commit/d8e48846599ca3e137a2ad3e5b7eda624157ed5f).
[GitHub] spark issue #15415: [SPARK-14503][ML] spark.ml API for FPGrowth
Github user jkbradley commented on the issue: https://github.com/apache/spark/pull/15415 > wrap the old AssociationRules code Sorry, forget this comment from me; I was thinking that something like FPGrowthModel.transform had already been implemented, but it's new. Btw, I'm adding more comments now, so please hold off on changes for an hour. Thanks!
[GitHub] spark issue #15415: [SPARK-14503][ML] spark.ml API for FPGrowth
Github user hhbyyh commented on the issue: https://github.com/apache/spark/pull/15415 Hi @jkbradley We can hold the transform code. > wrap the old AssociationRules code Do you mean to make transform return the association rules DataFrame, like the current `getAssociationRules`?
[GitHub] spark issue #15415: [SPARK-14503][ML] spark.ml API for FPGrowth
Github user hhbyyh commented on the issue: https://github.com/apache/spark/pull/15415 @jkbradley Sent an update to refine the transform code and address the comments. Regarding the concern about behavior changes, I think a different partitioning strategy will only affect the overall efficiency and maybe the order of the frequent itemsets and association rules. As long as the result set does not change, I don't think it will disturb users. Let me know if I missed anything.
[GitHub] spark issue #15415: [SPARK-14503][ML] spark.ml API for FPGrowth
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15415 Merged build finished. Test PASSed.
[GitHub] spark issue #15415: [SPARK-14503][ML] spark.ml API for FPGrowth
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15415 **[Test build #73149 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73149/testReport)** for PR 15415 at commit [`dfdf85d`](https://github.com/apache/spark/commit/dfdf85d4cf26864fdbcf57d2e60153d299741197). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #15415: [SPARK-14503][ML] spark.ml API for FPGrowth
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15415 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/73149/ Test PASSed.
[GitHub] spark issue #15415: [SPARK-14503][ML] spark.ml API for FPGrowth
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15415 **[Test build #73149 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73149/testReport)** for PR 15415 at commit [`dfdf85d`](https://github.com/apache/spark/commit/dfdf85d4cf26864fdbcf57d2e60153d299741197).
[GitHub] spark issue #15415: [SPARK-14503][ML] spark.ml API for FPGrowth
Github user hhbyyh commented on the issue: https://github.com/apache/spark/pull/15415 Thanks @jkbradley. I'm also working on improving the `transform` performance and adding more unit tests. I'll address the comments in a combined update.
[GitHub] spark issue #15415: [SPARK-14503][ML] spark.ml API for FPGrowth
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15415 Merged build finished. Test PASSed.
[GitHub] spark issue #15415: [SPARK-14503][ML] spark.ml API for FPGrowth
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15415 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/72943/ Test PASSed.
[GitHub] spark issue #15415: [SPARK-14503][ML] spark.ml API for FPGrowth
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15415 **[Test build #72943 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/72943/testReport)** for PR 15415 at commit [`e141776`](https://github.com/apache/spark/commit/e1417761a9833d52ecd512099ce0003108844f0e). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #15415: [SPARK-14503][ML] spark.ml API for FPGrowth
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15415 **[Test build #72943 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/72943/testReport)** for PR 15415 at commit [`e141776`](https://github.com/apache/spark/commit/e1417761a9833d52ecd512099ce0003108844f0e).
[GitHub] spark issue #15415: [SPARK-14503][ML] spark.ml API for FPGrowth
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15415 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/72611/ Test PASSed.
[GitHub] spark issue #15415: [SPARK-14503][ML] spark.ml API for FPGrowth
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15415 Merged build finished. Test PASSed.
[GitHub] spark issue #15415: [SPARK-14503][ML] spark.ml API for FPGrowth
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15415 **[Test build #72611 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/72611/testReport)** for PR 15415 at commit [`049e1a3`](https://github.com/apache/spark/commit/049e1a326daee4c55edb6d65090fafd229b93b6a). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #15415: [SPARK-14503][ML] spark.ml API for FPGrowth
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15415 **[Test build #72611 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/72611/testReport)** for PR 15415 at commit [`049e1a3`](https://github.com/apache/spark/commit/049e1a326daee4c55edb6d65090fafd229b93b6a).
[GitHub] spark issue #15415: [SPARK-14503][ML] spark.ml API for FPGrowth
Github user hhbyyh commented on the issue: https://github.com/apache/spark/pull/15415 @aray Thanks for the comments. I'd like to collect more feedback on the current `transform` behavior. I saw your comment regarding its necessity. Given the current ML class hierarchy (Estimator and Transformer), I'd like to hear more suggestions on what `transform` in `FPGrowthModel` is expected to do, or whether you think the model should not extend Transformer at all. Thanks.
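To make the open question concrete: one possible expectation for `transform` on a frequent-pattern model is rule-based prediction, where each input row's items are checked against the antecedents of the mined association rules and the consequents of every matching rule are returned. The sketch below illustrates that idea in plain Python, not Spark. It is not the behavior this PR had settled on at this point in the thread, and the `predict_items` name and the `(antecedent, consequent)` rule format are assumptions made for illustration only.

```python
from typing import FrozenSet, Iterable, List, Set, Tuple

# Hypothetical rule format: (antecedent items, consequent items).
Rule = Tuple[FrozenSet[str], FrozenSet[str]]


def predict_items(items: Iterable[str], rules: List[Rule]) -> Set[str]:
    """Collect consequents of every rule whose antecedent is fully
    contained in `items`, skipping items the row already has."""
    basket = set(items)
    predicted: Set[str] = set()
    for antecedent, consequent in rules:
        # A rule applies only if all of its antecedent items are present.
        if antecedent <= basket:
            predicted |= consequent - basket
    return predicted


# Illustrative rules, not mined from any real data set.
rules: List[Rule] = [
    (frozenset({"bread"}), frozenset({"butter"})),
    (frozenset({"bread", "butter"}), frozenset({"jam"})),
    (frozenset({"milk"}), frozenset({"bread"})),
]
```

Under this reading the model behaves like any other Transformer: it maps an items column to a prediction column, row by row, which is one argument for keeping `FPGrowthModel` inside the Estimator/Transformer hierarchy.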
[GitHub] spark issue #15415: [SPARK-14503][ML] spark.ml API for FPGrowth
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15415 Merged build finished. Test PASSed.
[GitHub] spark issue #15415: [SPARK-14503][ML] spark.ml API for FPGrowth
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15415 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/72213/ Test PASSed.
[GitHub] spark issue #15415: [SPARK-14503][ML] spark.ml API for FPGrowth
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15415 **[Test build #72213 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/72213/testReport)** for PR 15415 at commit [`57c9437`](https://github.com/apache/spark/commit/57c943798d06299e336c184054338188d7edba32). * This patch passes all tests. * This patch merges cleanly. * This patch adds the following public classes _(experimental)_: * `class AssociationRules(override val uid: String)` * `class AssociationRulesModelWriter(instance: AssociationRulesModel) extends MLWriter`
[GitHub] spark issue #15415: [SPARK-14503][ML] spark.ml API for FPGrowth
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15415 **[Test build #72213 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/72213/testReport)** for PR 15415 at commit [`57c9437`](https://github.com/apache/spark/commit/57c943798d06299e336c184054338188d7edba32).
[GitHub] spark issue #15415: [SPARK-14503][ML] spark.ml API for FPGrowth
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15415 Merged build finished. Test FAILed.
[GitHub] spark issue #15415: [SPARK-14503][ML] spark.ml API for FPGrowth
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15415 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/71642/ Test FAILed.
[GitHub] spark issue #15415: [SPARK-14503][ML] spark.ml API for FPGrowth
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15415 **[Test build #71642 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71642/testReport)** for PR 15415 at commit [`3273b76`](https://github.com/apache/spark/commit/3273b76c3d818636a822f98ecd3df0706a4cae26).