[
https://issues.apache.org/jira/browse/MADLIB-1031?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Frank McQuillan updated MADLIB-1031:
------------------------------------
Description:
Limit itemset size: see maxlen parameter on p. 10 of
https://cran.r-project.org/web/packages/arules/arules.pdf
This can also improve performance by limiting the number of items in each itemset.
This parameter will not change support or confidence computations.
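To illustrate the idea, here is a minimal level-wise Apriori sketch in Python with a cap on itemset size. It is not MADlib's implementation: the function name `frequent_itemsets` and the parameter `max_len` are hypothetical, chosen to mirror the arules `maxlen` parameter. The cap only stops candidate generation early; the support of every itemset that is still reported is computed the same way as without the cap.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support, max_len=None):
    """Return {itemset: support} for all frequent itemsets.

    max_len (hypothetical name) caps the itemset size; None means no cap.
    """
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)

    # Level 1: every distinct item is a candidate itemset of size 1.
    items = sorted({item for t in transactions for item in t})
    candidates = [frozenset([item]) for item in items]

    result = {}
    k = 1
    while candidates and (max_len is None or k <= max_len):
        # Support is always counted over the full transaction list.
        frequent = {}
        for c in candidates:
            support = sum(1 for t in transactions if c <= t) / n
            if support >= min_support:
                frequent[c] = support
        result.update(frequent)

        # Join step: combine frequent k-itemsets into (k+1)-candidates.
        joined = {a | b for a, b in combinations(list(frequent), 2)
                  if len(a | b) == k + 1}
        # Prune step: every k-subset of a candidate must itself be frequent.
        candidates = [c for c in joined
                      if all(frozenset(s) in frequent
                             for s in combinations(c, k))]
        k += 1
    return result
```

With a cap of 2, the loop never generates or counts itemsets of size 3 or larger, which is where the saving would come from, while the supports reported for the smaller itemsets stay identical to an uncapped run.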
was:
Three potential improvements:
1) Limit itemset size: see maxlen parameter on p. 10 of
https://cran.r-project.org/web/packages/arules/arules.pdf
as an example.
2) Something like a WHERE clause for LHS and RHS in order to reduce execution
time. The filtered transactions must still be counted for the support and
confidence computations. (That is, you can't filter them out ahead of time
because that would skew the support and confidence values.)
3) Also review general performance of the core algorithm. Are there any
improvements that can be made?
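The concern in item 2 above can be shown with a small Python sketch. The data and the helper names `support` and `confidence` are made up for illustration and are not MADlib functions. Dropping the transactions that lack the RHS item before counting changes both measures for the same rule.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in itemset."""
    itemset = frozenset(itemset)
    hits = sum(1 for t in transactions if itemset <= frozenset(t))
    return hits / len(transactions)


def confidence(transactions, lhs, rhs):
    """support(lhs and rhs together) divided by support(lhs)."""
    both = set(lhs) | set(rhs)
    return support(transactions, both) / support(transactions, lhs)


# Hypothetical toy data: four market baskets.
transactions = [{"bread", "milk"}, {"bread"}, {"milk"}, {"eggs"}]

# Naive pre-filter: keep only transactions containing the RHS item.
filtered = [t for t in transactions if "milk" in t]

# Rule under study: bread => milk
full_support = support(transactions, {"bread", "milk"})        # 0.25
skewed_support = support(filtered, {"bread", "milk"})          # 0.5
full_confidence = confidence(transactions, {"bread"}, {"milk"})  # 0.5
skewed_confidence = confidence(filtered, {"bread"}, {"milk"})    # 1.0
```

A filter on LHS and RHS would therefore have to restrict which rules are generated while still counting over the full transaction set.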
> Improve performance of Apriori
> ------------------------------
>
> Key: MADLIB-1031
> URL: https://issues.apache.org/jira/browse/MADLIB-1031
> Project: Apache MADlib
> Issue Type: Improvement
> Components: Module: Association Rules
> Reporter: Frank McQuillan
> Fix For: v1.10
>
>
> Limit itemset size: see maxlen parameter on p. 10 of
> https://cran.r-project.org/web/packages/arules/arules.pdf
> This can also improve performance by limiting the number of items in each itemset.
> This parameter will not change support or confidence computations.