[
https://issues.apache.org/jira/browse/SPARK-8855?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15071263#comment-15071263
]
Caique Rodrigues Marques commented on SPARK-8855:
-------------------------------------------------
I have spent the last few days trying to implement association rules for a
custom RDD, and I am now stuck.
All the code can be seen
[here|https://github.com/mrcaique/spark/commit/45d221b8e04a6a006b2baab11e897be61e256167].
When I call the train method of association rules to get the associations
from a custom RDD (in the example, the content of {{data}}), I get a
MapPartitionsRDD. If I change the train method to something like this:
In {{PythonMLLibAPI.scala}}:
{code}
def trainAssociationRules(
    data: JavaRDD[FPGrowth.FreqItemset[Any]]):
    JavaRDD[AssociationRules.Rule[Any]] = {
  val model = new FPGrowthModel(data)
  new FPGrowthModelWrapper(model)
}
{code}
And this in {{fpm.py}}:
{code}
@classmethod
def train(cls, data, minConfidence=0.8):
    model = callMLlibFunc("trainAssociationRules", data)
    return FPGrowthModel(model).generateAssociationRules(minConfidence)
{code}
then I get a PythonRDD instead.
In both cases I cannot inspect the contents of the returned RDD: calling
{{ar.collect()}} in {{association_rules_example.py}} raises several errors.
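For reference, here is a minimal plain-Python sketch of what collecting the rules would be expected to show, assuming rules with a single-item consequent and confidence computed as freq(antecedent + consequent) / freq(antecedent). It does not use Spark at all; the function name {{generate_association_rules}} and the {{(items, freq)}} tuple format are hypothetical and are not part of the MLlib API.

```python
def generate_association_rules(freq_itemsets, min_confidence=0.8):
    """Derive single-consequent rules from (items, freq) pairs.

    Hypothetical helper for illustration only; not Spark's implementation.
    Returns a sorted list of (antecedent, consequent, confidence) tuples.
    """
    # Index every frequent itemset by its (order-independent) set of items.
    freq = {frozenset(items): count for items, count in freq_itemsets}
    rules = []
    for itemset, union_freq in freq.items():
        if len(itemset) < 2:
            continue  # a rule needs a non-empty antecedent and a consequent
        for consequent in itemset:
            antecedent = itemset - {consequent}
            antecedent_freq = freq.get(antecedent)
            if antecedent_freq is None:
                continue  # antecedent is not itself a known frequent itemset
            confidence = union_freq / antecedent_freq
            if confidence >= min_confidence:
                rules.append((sorted(antecedent), consequent, confidence))
    return sorted(rules)


# Toy input: item "a" occurs 15 times, "b" 35 times, and both together 12 times.
example_itemsets = [(["a"], 15), (["b"], 35), (["a", "b"], 12)]
example_rules = generate_association_rules(example_itemsets, min_confidence=0.8)
for antecedent, consequent, confidence in example_rules:
    print(antecedent, "=>", consequent, confidence)
```

With this toy input only the rule a => b passes the 0.8 threshold (12 / 15), while b => a is dropped (12 / 35).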
Can anyone help?
Thanks.
> Python API for Association Rules
> --------------------------------
>
> Key: SPARK-8855
> URL: https://issues.apache.org/jira/browse/SPARK-8855
> Project: Spark
> Issue Type: New Feature
> Components: MLlib
> Reporter: Feynman Liang
> Priority: Minor
>
> A simple Python wrapper and doctests need to be written for Association
> Rules. The relevant method is {{FPGrowthModel.generateAssociationRules}}. The
> code will likely live in {{fpm.py}}.