[
https://issues.apache.org/jira/browse/MAHOUT-625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13005770#comment-13005770
]
Ted Dunning commented on MAHOUT-625:
------------------------------------
Can you use a portable, standard compression format for this attachment? 7-zip
is not widely used.
Try zip, tar.gz, or bzip2.
> Some of generated patterns have support higher than in reality
> --------------------------------------------------------------
>
> Key: MAHOUT-625
> URL: https://issues.apache.org/jira/browse/MAHOUT-625
> Project: Mahout
> Issue Type: Bug
> Components: Frequent Itemset/Association Rule Mining
> Affects Versions: 0.4
> Reporter: Jaroslaw Odzga
> Priority: Critical
> Attachments: mahout-test.7z
>
>
> It turns out that some of the generated patterns have incorrect support. The
> returned support is slightly higher than the true one.
> I attached a test that shows that FPGrowth has a bug. The test uses the
> retail data set found here: http://fimi.ua.ac.be/data/
> The pattern (36, 39, 41) occurs in 572 transactions (the test also computes
> this count), but FPGrowth returns the pattern (36, 39, 41) with support 573.
> Please note that this pattern is not the only one with incorrect support;
> the test points out just one example to have something to focus on. There
> are many more patterns with support higher than the real one. The biggest
> difference I noticed was a support 8 higher than the real one for one of the
> patterns.
> Please find the attached failing unit test. It is actually a Maven project
> that contains the test data and is ready to run.
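
The reference count mentioned in the report (the number of transactions that contain all of 36, 39 and 41) can be checked independently of FPGrowth with a brute-force scan. The following is only a minimal sketch, assuming the retail file holds one transaction per line with whitespace-separated integer item ids. The class and method names are hypothetical and are not part of Mahout or of the attached test.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not part of Mahout: brute-force support counting.
public class SupportCounter {

    /** Counts the transactions (lines) that contain every item of the pattern. */
    public static int support(List<String> transactions, Set<Integer> pattern) {
        int count = 0;
        for (String line : transactions) {
            String trimmed = line.trim();
            if (trimmed.isEmpty()) {
                continue; // skip blank lines
            }
            // A set per transaction, so a transaction is counted at most once
            // even if an item id is repeated within the line.
            Set<Integer> items = new HashSet<>();
            for (String token : trimmed.split("\\s+")) {
                items.add(Integer.parseInt(token));
            }
            if (items.containsAll(pattern)) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // Tiny made-up sample, not taken from the retail data set.
        List<String> sample = Arrays.asList(
            "36 39 41 48",
            "39 41",
            "41 36 39",
            "36 38 39");
        Set<Integer> pattern = new HashSet<>(Arrays.asList(36, 39, 41));
        System.out.println(support(sample, pattern)); // prints 2
    }
}
```

To compare against the support returned by FPGrowth, the lines of the downloaded retail file could be read with java.nio.file.Files.readAllLines and passed to support() together with the pattern in question.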