[ 
https://issues.apache.org/jira/browse/SPARK-8855?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15047432#comment-15047432
 ] 

Joseph K. Bradley commented on SPARK-8855:
------------------------------------------

{quote}
1) Create the class "AssociationRules" inside the "fpm.py" file
1.1) Method train(data, minConfidence), which will generate the association 
rules for a dataset with the specified minConfidence (0.6 by default).  This 
method will call "trainAssociationRules" from PythonMLLibAPI with the 
parameters data and minConfidence. Returns an FPGrowthModel.
{quote}

This won't be needed for now since the only public API is via FPGrowthModel.  
(Users can't construct an AssociationRules instance since the constructor is 
private.)

{quote}
1.2) Class Rule, which will be a namedtuple representing an antecedent or 
consequent tuple.
{quote}

Sounds good.
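
A minimal sketch of what such a namedtuple could look like. The field names 
(antecedent, consequent, confidence) are an assumption modeled on the Scala 
Rule class and are not fixed by this thread:

```python
from collections import namedtuple


class Rule(namedtuple("Rule", ["antecedent", "consequent", "confidence"])):
    """Hypothetical Python-side association rule.

    antecedent: list of items on the left-hand side of the rule
    consequent: list of items on the right-hand side of the rule
    confidence: confidence of the rule, a float in [0, 1]
    """
    # Keep instances as lightweight as a plain tuple.
    __slots__ = ()


# Example: {a, b} => {c} with confidence 0.8
rule = Rule(antecedent=["a", "b"], consequent=["c"], confidence=0.8)
```

Since a namedtuple is still a tuple, it can be unpacked positionally or 
accessed by field name.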

{quote}
2) Add the method generateAssociationRules to FPGrowthModel class (inside 
fpm.py). This method will map the Rules generated (calling the method 
"getAssociationRule" from FPGrowthModelWrapper) to the namedtuple.
{quote}

Sounds good, but I'd use the same name for the method as in FPGrowthModel.

{quote}
Now comes my real problem: how to make trainAssociationRules return a 
FPGrowthModel to the Wrapper, so the Wrapper can map the rule received to the 
antecedent/consequent? I can't make trainAssociationRules return a 
FPGrowthModel. The wrapper for association rules is in FPGrowthModelWrapper, 
right? Is something wrong with this idea?
{quote}

I hope the above answers simplify this problem.  You'll just need to be able to 
return an RDD of Rule objects, which you could do by either (a) writing custom 
serialization or (b) passing via a DataFrame, which could handle serialization.  
You can find examples of both code paths in the Python-Scala interface, so I'd 
do whichever is simpler.
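
As a rough illustration of the Python side of option (b), the sketch below 
converts row-like tuples into Rule namedtuples. The Rule fields, the to_rule 
helper, and the column order (antecedent, consequent, confidence) are all 
assumptions for illustration; plain tuples stand in for DataFrame rows so the 
snippet runs without Spark:

```python
from collections import namedtuple

# Hypothetical Python-side rule type; field names are an assumption.
Rule = namedtuple("Rule", ["antecedent", "consequent", "confidence"])


def to_rule(row):
    """Convert one (antecedent, consequent, confidence) row into a Rule."""
    antecedent, consequent, confidence = row
    return Rule(list(antecedent), list(consequent), float(confidence))


# Plain tuples standing in for rows received from the JVM side.
rows = [
    (("bread",), ("butter",), 0.75),
    (("bread", "butter"), ("jam",), 1),
]
rules = [to_rule(r) for r in rows]
```

With a real DataFrame, the same helper could presumably be applied with 
something like df.rdd.map(to_rule), since PySpark Row objects unpack like 
tuples.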

> Python API for Association Rules
> --------------------------------
>
>                 Key: SPARK-8855
>                 URL: https://issues.apache.org/jira/browse/SPARK-8855
>             Project: Spark
>          Issue Type: New Feature
>          Components: MLlib
>            Reporter: Feynman Liang
>            Priority: Minor
>
> A simple Python wrapper and doctests need to be written for Association 
> Rules. The relevant method is {{FPGrowthModel.generateAssociationRules}}. The 
> code will likely live in {{fpm.py}}.


