[GitHub] spark issue #15415: [SPARK-14503][ML] spark.ml API for FPGrowth

2017-02-28 Thread jkbradley
Github user jkbradley commented on the issue:

https://github.com/apache/spark/pull/15415
  
No problem, thanks!  Could you please create a subtask for docs?

Merging with master




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #15415: [SPARK-14503][ML] spark.ml API for FPGrowth

2017-02-28 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/15415
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/73616/
Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #15415: [SPARK-14503][ML] spark.ml API for FPGrowth

2017-02-28 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/15415
  
**[Test build #73616 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73616/testReport)**
 for PR 15415 at commit 
[`9940c47`](https://github.com/apache/spark/commit/9940c4716daf47c6678fdd45abba8afa71a3e53a).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #15415: [SPARK-14503][ML] spark.ml API for FPGrowth

2017-02-28 Thread hhbyyh
Github user hhbyyh commented on the issue:

https://github.com/apache/spark/pull/15415
  
Sorry to miss your comments. I can send a follow-up together with document.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #15415: [SPARK-14503][ML] spark.ml API for FPGrowth

2017-02-28 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/15415
  
**[Test build #73616 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73616/testReport)**
 for PR 15415 at commit 
[`9940c47`](https://github.com/apache/spark/commit/9940c4716daf47c6678fdd45abba8afa71a3e53a).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #15415: [SPARK-14503][ML] spark.ml API for FPGrowth

2017-02-28 Thread jkbradley
Github user jkbradley commented on the issue:

https://github.com/apache/spark/pull/15415
  
I'm going to go ahead and merge this after tests to make sure it's in 2.2, 
but can you please send a follow-up for my last 2 comments?  Thanks!


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #15415: [SPARK-14503][ML] spark.ml API for FPGrowth

2017-02-28 Thread jkbradley
Github user jkbradley commented on the issue:

https://github.com/apache/spark/pull/15415
  
test this please


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #15415: [SPARK-14503][ML] spark.ml API for FPGrowth

2017-02-24 Thread jkbradley
Github user jkbradley commented on the issue:

https://github.com/apache/spark/pull/15415
  
I don't think we need to support the default prediction (for empty/null 
inputs) now.  I agree we could use an inputer or add something as an option 
later on.

Will take a final look now


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #15415: [SPARK-14503][ML] spark.ml API for FPGrowth

2017-02-24 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/15415
  
Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #15415: [SPARK-14503][ML] spark.ml API for FPGrowth

2017-02-24 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/15415
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/73453/
Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #15415: [SPARK-14503][ML] spark.ml API for FPGrowth

2017-02-24 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/15415
  
**[Test build #73453 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73453/testReport)**
 for PR 15415 at commit 
[`9940c47`](https://github.com/apache/spark/commit/9940c4716daf47c6678fdd45abba8afa71a3e53a).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #15415: [SPARK-14503][ML] spark.ml API for FPGrowth

2017-02-24 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/15415
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/73452/
Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #15415: [SPARK-14503][ML] spark.ml API for FPGrowth

2017-02-24 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/15415
  
Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #15415: [SPARK-14503][ML] spark.ml API for FPGrowth

2017-02-24 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/15415
  
**[Test build #73452 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73452/testReport)**
 for PR 15415 at commit 
[`3d7ed0b`](https://github.com/apache/spark/commit/3d7ed0ba58ec16274930448e80ad57d52430cc95).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #15415: [SPARK-14503][ML] spark.ml API for FPGrowth

2017-02-24 Thread hhbyyh
Github user hhbyyh commented on the issue:

https://github.com/apache/spark/pull/15415
  
> Btw, I could imagine us wanting to change this later. If we're 
recommending items a user could add to their basket, then we might want to 
suggest the most frequent item rather than nothing.

Do we need to support this now? That may not be expected for all the 
scenarios. And it seems an Imputer can help with the issue. 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #15415: [SPARK-14503][ML] spark.ml API for FPGrowth

2017-02-24 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/15415
  
**[Test build #73453 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73453/testReport)**
 for PR 15415 at commit 
[`9940c47`](https://github.com/apache/spark/commit/9940c4716daf47c6678fdd45abba8afa71a3e53a).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #15415: [SPARK-14503][ML] spark.ml API for FPGrowth

2017-02-24 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/15415
  
**[Test build #73452 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73452/testReport)**
 for PR 15415 at commit 
[`3d7ed0b`](https://github.com/apache/spark/commit/3d7ed0ba58ec16274930448e80ad57d52430cc95).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #15415: [SPARK-14503][ML] spark.ml API for FPGrowth

2017-02-24 Thread hhbyyh
Github user hhbyyh commented on the issue:

https://github.com/apache/spark/pull/15415
  
Thanks @jkbradley for contributing the code. That helps a lot. 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #15415: [SPARK-14503][ML] spark.ml API for FPGrowth

2017-02-24 Thread jkbradley
Github user jkbradley commented on the issue:

https://github.com/apache/spark/pull/15415
  
I agree that, if the set of rules is small (1-2 GB max), then collecting 
and broadcasting it is best.  But for larger sets of rules, we'd have to keep 
it distributed.

I'm very surprised by the time difference in your comparison.  I'll 
experiment a little myself and get back soon.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #15415: [SPARK-14503][ML] spark.ml API for FPGrowth

2017-02-24 Thread hhbyyh
Github user hhbyyh commented on the issue:

https://github.com/apache/spark/pull/15415
  
Hi @jkbradley After further performance comparison, I found using broadcast 
would give much better performance for the transform.

I tested with some public data from http://fimi.ua.ac.be/data/.
For kosarak (.gz) data (300K records), the current transform would take 
more than 3 hours for only 2 rules, while the broadcast version only cost 0.15 
sec with 900 rules. ( I adjusted support and confidence)

```
val rules = associationRules.rdd.map(r =>
  (r.getSeq[Int](0), r.getSeq[Int](1))
).collect()
val brRules = dataset.sparkSession.sparkContext.broadcast(rules)

// For each rule, examine the input items and summarize the consequents
val predictUDF = udf((items: Seq[Int]) => brRules.value.flatMap( r =>
  if (r._1.forall(items.contains(_))) r._2 else Seq.empty[Int]
).distinct)
dataset.withColumn($(predictionCol), predictUDF(col($(featuresCol
```
The test can be verified by the code:
https://gist.github.com/hhbyyh/06fcf3fdc8f6edda971847bcb5783d99
https://gist.github.com/hhbyyh/889b88ae2176d1263fdc9dd3e29d1c2d

Thinking again, the broadcast implementation may have much better 
performance in any case. The major issue is how to support generic with the 
UDF. 



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #15415: [SPARK-14503][ML] spark.ml API for FPGrowth

2017-02-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/15415
  
Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #15415: [SPARK-14503][ML] spark.ml API for FPGrowth

2017-02-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/15415
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/73384/
Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #15415: [SPARK-14503][ML] spark.ml API for FPGrowth

2017-02-23 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/15415
  
**[Test build #73384 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73384/testReport)**
 for PR 15415 at commit 
[`bfcef4a`](https://github.com/apache/spark/commit/bfcef4a72ba1a6ee27d6099f591bb140f9c0d59e).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #15415: [SPARK-14503][ML] spark.ml API for FPGrowth

2017-02-23 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/15415
  
**[Test build #73384 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73384/testReport)**
 for PR 15415 at commit 
[`bfcef4a`](https://github.com/apache/spark/commit/bfcef4a72ba1a6ee27d6099f591bb140f9c0d59e).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #15415: [SPARK-14503][ML] spark.ml API for FPGrowth

2017-02-23 Thread hhbyyh
Github user hhbyyh commented on the issue:

https://github.com/apache/spark/pull/15415
  
I tried a few different ways to implement the transform. 
https://gist.github.com/hhbyyh/889b88ae2176d1263fdc9dd3e29d1c2d.

The performance actually are similiar, while the current one can maintain 
the original order of the input dataset. I would be glad to see an more 
optimized version.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #15415: [SPARK-14503][ML] spark.ml API for FPGrowth

2017-02-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/15415
  
Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #15415: [SPARK-14503][ML] spark.ml API for FPGrowth

2017-02-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/15415
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/73354/
Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #15415: [SPARK-14503][ML] spark.ml API for FPGrowth

2017-02-23 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/15415
  
**[Test build #73354 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73354/testReport)**
 for PR 15415 at commit 
[`d8e4884`](https://github.com/apache/spark/commit/d8e48846599ca3e137a2ad3e5b7eda624157ed5f).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #15415: [SPARK-14503][ML] spark.ml API for FPGrowth

2017-02-23 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/15415
  
**[Test build #73354 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73354/testReport)**
 for PR 15415 at commit 
[`d8e4884`](https://github.com/apache/spark/commit/d8e48846599ca3e137a2ad3e5b7eda624157ed5f).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #15415: [SPARK-14503][ML] spark.ml API for FPGrowth

2017-02-22 Thread jkbradley
Github user jkbradley commented on the issue:

https://github.com/apache/spark/pull/15415
  
> wrap the old AssociationRules code

Sorry, forget this comment from me; I was thinking that something like 
FPGrowthModel.transform had already been implemented, but it's new.

Btw, I'm adding more comments now, so please hold off on changes for an 
hour.  Thanks!


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #15415: [SPARK-14503][ML] spark.ml API for FPGrowth

2017-02-21 Thread hhbyyh
Github user hhbyyh commented on the issue:

https://github.com/apache/spark/pull/15415
  
Hi @jkbradley 
We can hold the transform code. 

> wrap the old AssociationRules code

Do you mean to make transform return the Association Rules DataFrame, like 
the currect `getAssociationRules` ?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #15415: [SPARK-14503][ML] spark.ml API for FPGrowth

2017-02-20 Thread hhbyyh
Github user hhbyyh commented on the issue:

https://github.com/apache/spark/pull/15415
  
@jkbradley Sent an update to refine the transform code and address the 
comments.

Regarding to the behavior changing concern, I think different partition 
strategy will only affect the overall efficiency and maybe the order of the 
frequent itemsets and association rules. As long as the result set does not 
change, I don't think it will disturb the users. Let me know if I miss 
anything. 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #15415: [SPARK-14503][ML] spark.ml API for FPGrowth

2017-02-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/15415
  
Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #15415: [SPARK-14503][ML] spark.ml API for FPGrowth

2017-02-19 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/15415
  
**[Test build #73149 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73149/testReport)**
 for PR 15415 at commit 
[`dfdf85d`](https://github.com/apache/spark/commit/dfdf85d4cf26864fdbcf57d2e60153d299741197).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #15415: [SPARK-14503][ML] spark.ml API for FPGrowth

2017-02-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/15415
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/73149/
Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #15415: [SPARK-14503][ML] spark.ml API for FPGrowth

2017-02-19 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/15415
  
**[Test build #73149 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/73149/testReport)**
 for PR 15415 at commit 
[`dfdf85d`](https://github.com/apache/spark/commit/dfdf85d4cf26864fdbcf57d2e60153d299741197).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #15415: [SPARK-14503][ML] spark.ml API for FPGrowth

2017-02-17 Thread hhbyyh
Github user hhbyyh commented on the issue:

https://github.com/apache/spark/pull/15415
  
Thanks @jkbradley . I'm also working on improving the `transform` 
performance and add more unit tests. I'll address the comments in a combined 
update.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #15415: [SPARK-14503][ML] spark.ml API for FPGrowth

2017-02-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/15415
  
Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #15415: [SPARK-14503][ML] spark.ml API for FPGrowth

2017-02-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/15415
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/72943/
Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #15415: [SPARK-14503][ML] spark.ml API for FPGrowth

2017-02-15 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/15415
  
**[Test build #72943 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/72943/testReport)**
 for PR 15415 at commit 
[`e141776`](https://github.com/apache/spark/commit/e1417761a9833d52ecd512099ce0003108844f0e).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #15415: [SPARK-14503][ML] spark.ml API for FPGrowth

2017-02-15 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/15415
  
**[Test build #72943 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/72943/testReport)**
 for PR 15415 at commit 
[`e141776`](https://github.com/apache/spark/commit/e1417761a9833d52ecd512099ce0003108844f0e).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #15415: [SPARK-14503][ML] spark.ml API for FPGrowth

2017-02-08 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/15415
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/72611/
Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #15415: [SPARK-14503][ML] spark.ml API for FPGrowth

2017-02-08 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/15415
  
Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #15415: [SPARK-14503][ML] spark.ml API for FPGrowth

2017-02-08 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/15415
  
**[Test build #72611 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/72611/testReport)**
 for PR 15415 at commit 
[`049e1a3`](https://github.com/apache/spark/commit/049e1a326daee4c55edb6d65090fafd229b93b6a).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #15415: [SPARK-14503][ML] spark.ml API for FPGrowth

2017-02-08 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/15415
  
**[Test build #72611 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/72611/testReport)**
 for PR 15415 at commit 
[`049e1a3`](https://github.com/apache/spark/commit/049e1a326daee4c55edb6d65090fafd229b93b6a).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #15415: [SPARK-14503][ML] spark.ml API for FPGrowth

2017-02-01 Thread hhbyyh
Github user hhbyyh commented on the issue:

https://github.com/apache/spark/pull/15415
  
@aray Thanks for the comments. I'd like to collect more feedback on the 
current `transform` behavior. I saw your comment regarding to the necessity. 
Given the current ML class hierarchy (Estimator and transformer), I'd like to 
hear more suggestions on the expectation of `transform` in `FPGrowthModel`, or 
do you think we should not extend from Transformer. Thanks.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #15415: [SPARK-14503][ML] spark.ml API for FPGrowth

2017-01-31 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/15415
  
Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #15415: [SPARK-14503][ML] spark.ml API for FPGrowth

2017-01-31 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/15415
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/72213/
Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #15415: [SPARK-14503][ML] spark.ml API for FPGrowth

2017-01-31 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/15415
  
**[Test build #72213 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/72213/testReport)**
 for PR 15415 at commit 
[`57c9437`](https://github.com/apache/spark/commit/57c943798d06299e336c184054338188d7edba32).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
  * `class AssociationRules(override val uid: String)`
  * `  class AssociationRulesModelWriter(instance: AssociationRulesModel) 
extends MLWriter `


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #15415: [SPARK-14503][ML] spark.ml API for FPGrowth

2017-01-31 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/15415
  
**[Test build #72213 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/72213/testReport)**
 for PR 15415 at commit 
[`57c9437`](https://github.com/apache/spark/commit/57c943798d06299e336c184054338188d7edba32).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #15415: [SPARK-14503][ML] spark.ml API for FPGrowth

2017-01-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/15415
  
Merged build finished. Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #15415: [SPARK-14503][ML] spark.ml API for FPGrowth

2017-01-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/15415
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/71642/
Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #15415: [SPARK-14503][ML] spark.ml API for FPGrowth

2017-01-18 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/15415
  
**[Test build #71642 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/71642/testReport)**
 for PR 15415 at commit 
[`3273b76`](https://github.com/apache/spark/commit/3273b76c3d818636a822f98ecd3df0706a4cae26).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org