[jira] [Created] (SPARK-31094) Removing redundant rules in the output of Frequent Pattern Growth Algorithm

Aditya Addepalli (Jira) Mon, 09 Mar 2020 06:01:01 -0700

Aditya Addepalli created SPARK-31094:
----------------------------------------


             Summary: Removing redundant rules in the output of Frequent Pattern Growth Algorithm
                 Key: SPARK-31094
                 URL: https://issues.apache.org/jira/browse/SPARK-31094
             Project: Spark
          Issue Type: Brainstorming
          Components: ML
    Affects Versions: 2.4.5
            Reporter: Aditya Addepalli


The plan is to implement an is.redundant() function similar to the one in the 
R arules package: [https://rdrr.io/cran/arules/man/is.redundant.html]

By definition:

A rule is redundant if a more general rule with the same or a higher 
confidence exists. That is, a more specific rule is redundant if it is only 
equally or even less predictive than a more general rule.
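
To make the definition concrete, here is a minimal sketch in plain Python (not Spark code, and not the arules implementation). It assumes that "more general" means the same consequent with an antecedent that is a proper subset of the other rule's antecedent. The Rule type, the is_redundant name, and the sample rules are all hypothetical and exist only for illustration.

```python
from typing import FrozenSet, List, NamedTuple


class Rule(NamedTuple):
    """Hypothetical container for one association rule."""
    antecedent: FrozenSet[str]
    consequent: FrozenSet[str]
    confidence: float


def is_redundant(rules: List[Rule]) -> List[bool]:
    """Return one flag per rule: True if a more general rule is at least as confident.

    Assumption: a rule is "more general" when it has the same consequent and
    its antecedent is a proper subset of this rule's antecedent.
    """
    flags = []
    for rule in rules:
        redundant = any(
            other.consequent == rule.consequent
            and other.antecedent < rule.antecedent  # proper-subset test on frozensets
            and other.confidence >= rule.confidence
            for other in rules
        )
        flags.append(redundant)
    return flags


# Made-up example rules, purely illustrative.
rules = [
    Rule(frozenset({"bread"}), frozenset({"butter"}), 0.80),
    Rule(frozenset({"bread", "milk"}), frozenset({"butter"}), 0.75),
    Rule(frozenset({"bread", "jam"}), frozenset({"butter"}), 0.90),
]
flags = is_redundant(rules)
kept = [rule for rule, flag in zip(rules, flags) if not flag]
```

In this sample, the rule with antecedent {bread, milk} is flagged because the more general {bread} rule already reaches a higher confidence, while {bread, jam} is kept because the extra item improves confidence. The pairwise scan is quadratic in the number of rules, so a distributed version would likely need to group rules by consequent first.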

As FP Growth is an exhaustive algorithm, many of the rules it produces are 
redundant. Therefore there is merit in implementing this function in Spark. 
Filtering out redundant rules would reduce the total number of rules in the 
output and leave a more concise, more useful rule set.



