[ 
https://issues.apache.org/jira/browse/SPARK-12999?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Owen resolved SPARK-12999.
-------------------------------
    Resolution: Duplicate

Questions should go to user@. I think this is a duplicate of SPARK-12163, as 
you're asking about limits beyond min support.

> Guidance on adding a stopping criterion (maximum literal length or itemset 
> count) for FPGrowth
> ----------------------------------------------------------------------------------------------
>
>                 Key: SPARK-12999
>                 URL: https://issues.apache.org/jira/browse/SPARK-12999
>             Project: Spark
>          Issue Type: Question
>    Affects Versions: 1.6.0
>            Reporter: Tomas Kliegr
>
> The absence of stopping criteria results in combinatorial explosion, and 
> hence excessive run time, even on small UCI datasets. Since our workflow 
> makes it difficult to terminate an FPGrowth job that has been running too 
> long and to iteratively increase the support threshold, we would like to 
> extend the Spark FPGrowth implementation with either of the following 
> stopping criteria:
> - maximum number of generated itemsets,
> - maximum length of generated itemsets (i.e. number of items).
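> To make the second criterion concrete, here is a rough sketch of the kind of 
> check we have in mind: a depth-first pattern growth in which the length cap 
> prunes the recursion itself, so the combinatorial explosion is actually 
> bounded. This is deliberately a self-contained toy, not Spark's FPTree code; 
> all names here (mine, maxLength, ...) are ours:
> {code:scala}
> // Toy depth-first pattern growth with a maxLength cutoff.
> // The check prunes whole branches of the search, which is what bounds
> // the running time (a post-hoc filter on the results would not).
> def mine(
>     transactions: Seq[Set[String]],
>     minCount: Long,
>     maxLength: Int,                       // proposed stopping criterion
>     prefix: List[String] = Nil,
>     candidates: Seq[String] = Nil): Iterator[(List[String], Long)] = {
>   if (prefix.length >= maxLength) Iterator.empty  // stop growing this branch
>   else candidates.iterator.zipWithIndex.flatMap { case (item, i) =>
>     val pattern = item :: prefix
>     val count = transactions.count(t => pattern.forall(t.contains)).toLong
>     if (count < minCount) Iterator.empty          // usual min-support pruning
>     else Iterator((pattern, count)) ++
>       mine(transactions, minCount, maxLength, pattern, candidates.drop(i + 1))
>   }
> }
>
> val tx = Seq(Set("a", "b", "c"), Set("a", "b"), Set("b", "c"))
> mine(tx, minCount = 2, maxLength = 2, candidates = tx.flatten.distinct).toList
> {code}
> In Spark itself the analogous place would presumably be the recursive itemset 
> extraction in mllib's FPTree, though we would have to verify the exact hook 
> point against the source.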
> We would welcome any suggestion that could help us modify the 
> implementation. A solution to this problem would not only make a difference 
> to our use case but, by making it possible to process more datasets without 
> painful support tweaking, hopefully to the wider Spark community as well.
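> For completeness: the closest workaround we found with the current public 
> API is to post-filter the mined itemsets by length. A minimal sketch against 
> MLlib 1.6 (the helper name, threshold, and maxLen are placeholders of ours); 
> note that this only trims the output and does nothing for the run time, 
> which is exactly our problem:
> {code:scala}
> import org.apache.spark.mllib.fpm.FPGrowth
> import org.apache.spark.rdd.RDD
>
> def frequentItemsetsUpTo(transactions: RDD[Array[String]], maxLen: Int) = {
>   val model = new FPGrowth()
>     .setMinSupport(0.2)        // placeholder threshold
>     .setNumPartitions(10)
>     .run(transactions)
>   // Everything is still generated during mining; we merely drop the
>   // long itemsets afterwards.
>   model.freqItemsets.filter(_.items.length <= maxLen)
> }
> {code}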
> This question is related to the following issue: 
> https://issues.apache.org/jira/browse/SPARK-12163


