GitHub user hhbyyh opened a pull request:

    https://github.com/apache/spark/pull/13656

    [SPARK-15938]Adding "support" property to MLlib Association Rule

    ## What changes were proposed in this pull request?
    jira: https://issues.apache.org/jira/browse/SPARK-15938
    
    Support indicates how frequently an item-set appears in the database.
    Besides confidence, support is another critical property of an
    association rule.
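    As a rough illustration of the two properties, here is a minimal
    plain-Python sketch (not Spark code). The helper names `support_count`
    and `confidence` and the toy transactions are assumptions made up for
    this example; they are not part of MLlib.

```python
from typing import FrozenSet, Iterable, List


def support_count(itemset: Iterable[str],
                  transactions: List[FrozenSet[str]]) -> int:
    """Count the transactions that contain every item of `itemset`."""
    items = frozenset(itemset)
    return sum(1 for t in transactions if items <= t)


def confidence(antecedent: Iterable[str],
               consequent: Iterable[str],
               transactions: List[FrozenSet[str]]) -> float:
    """support(antecedent + consequent) / support(antecedent)."""
    both = support_count(set(antecedent) | set(consequent), transactions)
    ante = support_count(antecedent, transactions)
    return both / ante if ante else 0.0


# Hypothetical toy dataset of four transactions.
transactions = [
    frozenset({"bread", "milk"}),
    frozenset({"bread", "butter"}),
    frozenset({"bread", "milk", "butter"}),
    frozenset({"milk"}),
]

# Rule {bread} => {milk}: support is counted over the union of both sides.
rule_support = support_count({"bread", "milk"}, transactions)
rule_confidence = confidence({"bread"}, {"milk"}, transactions)
```

    In this toy dataset the rule {bread} => {milk} has a support count of 2
    and a confidence of 2/3.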
    References:
    https://en.wikipedia.org/wiki/Association_rule_learning
    http://www.philippe-fournier-viger.com/spmf/index.php?link=documentation.php#allassociationrules
    https://www-users.cs.umn.edu/~kumar/dmbook/ch6.pdf
    Support can be expressed either as the count of appearances or as the
    fraction of the dataset. I chose the count for two reasons:
    1. API compatibility: currently neither FPGrowthModel nor Association Rule
    has information about the size of the dataset, and I'd like to avoid
    breaking a list of public APIs.
    2. It matches the API of SPMF:
    http://www.philippe-fournier-viger.com/spmf/index.php?link=documentation.php#allassociationrules
    As next steps, we could add a constraint such as minSupport, as in other
    libraries. FPGrowthModel should also contain the size of the dataset.
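    To illustrate why the fraction form needs the dataset size, here is a
    minimal plain-Python sketch (not Spark code). The `Rule` class,
    `relative_support`, `filter_by_min_support`, and the `min_support`
    parameter are hypothetical names for this example only; they do not
    describe the MLlib API or this patch.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Rule:
    """Hypothetical rule record that stores support as a raw count."""
    antecedent: Tuple[str, ...]
    consequent: Tuple[str, ...]
    confidence: float
    support: int  # number of transactions containing both sides


def relative_support(rule: Rule, num_transactions: int) -> float:
    """Convert a support count to a fraction; needs the dataset size."""
    if num_transactions <= 0:
        raise ValueError("num_transactions must be positive")
    return rule.support / num_transactions


def filter_by_min_support(rules: List[Rule],
                          min_support: float,
                          num_transactions: int) -> List[Rule]:
    """Keep rules whose fractional support is at least `min_support`."""
    return [r for r in rules
            if relative_support(r, num_transactions) >= min_support]


# Hypothetical rules mined from a dataset of four transactions.
rules = [
    Rule(("bread",), ("milk",), confidence=2 / 3, support=2),
    Rule(("butter",), ("milk",), confidence=0.5, support=1),
]
kept = filter_by_min_support(rules, min_support=0.5, num_transactions=4)
```

    The count can be stored without knowing the dataset size, while both the
    fraction and a fractional minSupport threshold require it to be passed
    in separately.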
    
    
    ## How was this patch tested?
    Existing unit tests.
    
    


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/hhbyyh/spark supportAsso

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/13656.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #13656
    
----
commit 60efd0520a3af52995c2d6b1a2abaeebe658bb32
Author: Yuhao Yang <[email protected]>
Date:   2016-06-14T06:27:21Z

    add support for association rule

----

