[ 
https://issues.apache.org/jira/browse/MADLIB-1031?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Frank McQuillan updated MADLIB-1031:
------------------------------------
    Description: 
Three potential improvements:

1) Limit itemset size:  see maxlen parameter on p. 10 of 
https://cran.r-project.org/web/packages/arules/arules.pdf
as an example.

2)  Something like a WHERE clause for the LHS and RHS of rules, in order to 
reduce execution time. The filtered transactions must still be counted in the 
support and confidence computation.  (That is, you can't filter them out ahead 
of time because that would skew the support and confidence values.)

3) Also review general performance of the core algorithm.  Are there any 
improvements that can be made?

  was:
Two potential improvements:

1) Limit itemset size:  see maxlen parameter on p. 10 of 
https://cran.r-project.org/web/packages/arules/arules.pdf
as an example.

2)  Something like a WHERE clause for LHS and RHS in order to reduce execution 
time, but still need the existence of the filtered transactions for support and 
confidence computation.   (That is you can't filter them out ahead of time 
because would skew support and confidence values.)



> Improve performance of Apriori
> ------------------------------
>
>                 Key: MADLIB-1031
>                 URL: https://issues.apache.org/jira/browse/MADLIB-1031
>             Project: Apache MADlib
>          Issue Type: Improvement
>          Components: Module: Association Rules
>            Reporter: Frank McQuillan
>             Fix For: v1.10
>
>
> Three potential improvements:
> 1) Limit itemset size:  see maxlen parameter on p. 10 of 
> https://cran.r-project.org/web/packages/arules/arules.pdf
> as an example.
> 2)  Something like a WHERE clause for the LHS and RHS of rules, in order to 
> reduce execution time. The filtered transactions must still be counted in 
> the support and confidence computation.  (That is, you can't filter them out 
> ahead of time because that would skew the support and confidence values.)
> 3) Also review general performance of the core algorithm.  Are there any 
> improvements that can be made?
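
To make items 1) and 2) concrete, here is a minimal Python sketch of a level-wise Apriori. It is not MADlib's implementation or its API. The names max_len, lhs_filter and rhs_filter are hypothetical, chosen only for illustration. The sketch applies the LHS/RHS filter at rule generation only, so support and confidence are still computed over all transactions. Pushing the filter down into candidate generation, which is where most of the time savings would likely come from, is not shown.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support, max_len=None):
    """Level-wise Apriori. Stops growing itemsets once they reach max_len items."""
    tx = [frozenset(t) for t in transactions]
    n = len(tx)
    # Level 1 candidates: every distinct item as a 1-itemset.
    candidates = {frozenset([item]) for t in tx for item in t}
    frequent = {}
    k = 1
    while candidates and (max_len is None or k <= max_len):
        # Support is always counted over ALL transactions.
        level = {}
        for c in candidates:
            support = sum(1 for t in tx if c <= t) / n
            if support >= min_support:
                level[c] = support
        frequent.update(level)
        # Join frequent k-itemsets into (k+1)-candidates, then prune any
        # candidate that has an infrequent k-subset.
        candidates = set()
        for a, b in combinations(list(level), 2):
            u = a | b
            if len(u) == k + 1 and all(
                frozenset(s) in level for s in combinations(u, k)
            ):
                candidates.add(u)
        k += 1
    return frequent


def rules(frequent, min_confidence, lhs_filter=None, rhs_filter=None):
    """Generate (lhs, rhs, support, confidence) tuples, keeping only rules
    whose sides pass the optional filter predicates."""
    out = []
    for itemset, support in frequent.items():
        for r in range(1, len(itemset)):
            for combo in combinations(sorted(itemset), r):
                lhs = frozenset(combo)
                rhs = itemset - lhs
                if lhs_filter is not None and not lhs_filter(lhs):
                    continue
                if rhs_filter is not None and not rhs_filter(rhs):
                    continue
                # Every subset of a frequent itemset is itself frequent,
                # so frequent[lhs] is always present.
                confidence = support / frequent[lhs]
                if confidence >= min_confidence:
                    out.append((lhs, rhs, support, confidence))
    return out


transactions = [
    {"bread", "milk"},
    {"bread", "diapers", "beer"},
    {"milk", "diapers", "beer"},
    {"bread", "milk", "diapers"},
    {"bread", "milk", "beer"},
]


def only_beer(rhs):
    return rhs == {"beer"}


# Item 1: capping the itemset size shrinks the search space.
unbounded = frequent_itemsets(transactions, 0.2)
capped = frequent_itemsets(transactions, 0.2, max_len=2)

# Item 2: filter rules by RHS, with metrics computed on the full data.
full_rules = rules(
    frequent_itemsets(transactions, 0.4, max_len=2), 0.6, rhs_filter=only_beer
)

# Counter-example: dropping the non-beer transactions first skews the metrics.
beer_only = [t for t in transactions if "beer" in t]
skewed_rules = rules(
    frequent_itemsets(beer_only, 0.4, max_len=2), 0.6, rhs_filter=only_beer
)
```

On this toy data, the correctly filtered run keeps a single rule, diapers -> beer, with support 2/5 and confidence 2/3. Pre-filtering to the three beer transactions instead reports three rules with confidence 1.0, which is the skew that item 2) warns about.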



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
