[ https://issues.apache.org/jira/browse/SPARK-12999?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Sean Owen resolved SPARK-12999.
-------------------------------
    Resolution: Duplicate

Questions should go to user@. I think this is a duplicate of SPARK-12163, as you're asking about limits other than minimum support.

> Guidance on adding a stopping criterion (maximum literal length or itemset
> count) for FPGrowth
> ----------------------------------------------------------------------------------------------
>
>                 Key: SPARK-12999
>                 URL: https://issues.apache.org/jira/browse/SPARK-12999
>             Project: Spark
>          Issue Type: Question
>    Affects Versions: 1.6.0
>            Reporter: Tomas Kliegr
>
> The absence of a stopping criterion results in combinatorial explosion and
> hence excessive run time even on small UCI datasets. Since our workflow makes
> it difficult to terminate the FPGrowth job when it runs for too long and to
> iteratively increase the support threshold, we would like to extend the Spark
> FPGrowth implementation with either of the following stopping criteria:
> - maximum number of generated itemsets,
> - maximum length of generated itemsets (i.e. number of items).
> We would like to ask for any suggestions that could help us modify the
> implementation.
> A workaround for this problem would make a difference not only to our use
> case but, by enabling more datasets to be processed without painful support
> tweaking, hopefully also to the Spark community.
> This question is related to the following issue:
> https://issues.apache.org/jira/browse/SPARK-12163
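For illustration only (not part of the ticket or of Spark's codebase): the two proposed stopping criteria can be sketched in a minimal, pure-Python level-wise (Apriori-style, rather than FP-growth) frequent-itemset miner. The function name, parameters, and data are hypothetical; in an FP-growth implementation the same length bound would instead cap the recursion depth over conditional trees.

```python
from collections import Counter

def frequent_itemsets(transactions, min_support, max_length=None, max_count=None):
    """Level-wise frequent-itemset mining with the two stopping criteria
    proposed in SPARK-12999: a cap on itemset length (max_length) and a
    cap on the total number of itemsets emitted (max_count)."""
    n = len(transactions)
    min_count = min_support * n
    # Frequent 1-itemsets.
    counts = Counter(item for t in transactions for item in set(t))
    current = {frozenset([i]) for i, c in counts.items() if c >= min_count}
    result = {}
    k = 1
    while current:
        for s in current:
            if max_count is not None and len(result) >= max_count:
                return result  # stopping criterion: total itemset count
            result[s] = sum(1 for t in transactions if s <= set(t)) / n
        k += 1
        if max_length is not None and k > max_length:
            return result      # stopping criterion: itemset length
        # Candidate generation: extend each frequent k-itemset by one item,
        # then keep only the candidates that meet the support threshold.
        items = set().union(*current)
        candidates = {s | {i} for s in current for i in items if i not in s}
        current = {c for c in candidates
                   if sum(1 for t in transactions if c <= set(t)) >= min_count}
    return result
```

With `max_length=2`, mining stops before any 3-itemset is counted, which is exactly the kind of bound that would prevent the combinatorial blow-up described above without raising the support threshold.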