Only closed itemsets are mined. You can generate all the subset combinations yourself.
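
A rough sketch of what generating the combinations yourself could look like, in Python rather than Mahout code. The function name expand_closed and the input dictionary are made up for illustration. It assumes the closed itemsets come with their true supports, in which case the support of any itemset is the largest support among the closed itemsets that contain it. The numbers are the XYZ example from the report quoted below.

```python
from itertools import combinations

def expand_closed(closed, min_support=1):
    """Derive all frequent itemsets from closed itemsets (hypothetical helper).

    closed: dict mapping frozenset(items) -> support.
    Assumes each closed itemset carries its true support, so the support
    of any subset is the maximum support over its closed supersets.
    """
    result = {}
    for items, support in closed.items():
        if support < min_support:
            continue  # an infrequent closed set cannot yield frequent subsets
        for size in range(1, len(items) + 1):
            for subset in combinations(sorted(items), size):
                key = frozenset(subset)
                result[key] = max(result.get(key, 0), support)
    return result

# Closed itemsets for the XYZ example, with supports taken from the
# expected output in the quoted report (Z, XZ and YZ are not closed).
closed = {
    frozenset("X"): 33,
    frozenset("Y"): 25,
    frozenset("XY"): 21,
    frozenset("XYZ"): 11,
}

all_sets = expand_closed(closed)
for items, support in sorted(all_sets.items(), key=lambda kv: (-kv[1], sorted(kv[0]))):
    print(support, "".join(sorted(items)))
```

This recovers Z, XZ and YZ with support 11 from the closed set XYZ, while X, Y and XY keep their larger supports.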

sent from handheld device excuse typos
On Mar 18, 2011 7:08 AM, "Vipul Pandey (JIRA)" <[email protected]> wrote:
>
> [https://issues.apache.org/jira/browse/MAHOUT-617?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13008276#comment-13008276]
>
> Vipul Pandey commented on MAHOUT-617:
> -------------------------------------
>
> Looks like FPGrowth reports only the closed sets, is that right?
> In that case I may have to mine for all the frequent subsets manually?
>
>> FPGrowth/PFPGrowth giving out wrong results.
>> ---------------------------------------------
>>
>> Key: MAHOUT-617
>> URL: https://issues.apache.org/jira/browse/MAHOUT-617
>> Project: Mahout
>> Issue Type: Bug
>> Components: Frequent Itemset/Association Rule Mining
>> Affects Versions: 0.4
>> Environment: Mac OS X, Linux
>> Reporter: Vipul Pandey
>> Assignee: Robin Anil
>> Labels: AssociationMining, FPGrowth, FrequentItemsets
>> Attachments: XY, XYZ
>>
>>
>> FPGrowth reports the support of each itemset individually. That is, if
>> item X appears "individually" 12 times and appears with item Y 10 times
>> (a total of 22 times), AND item Y appears "individually" 4 times (a total
>> of 14 times), then this is what the output will be (say, for min-support 2):
>> 12 X
>> 10 XY
>> 4 Y
>> Instead of
>> 22 X
>> 10 XY
>> 14 Y
>> Also, because of this, if the minimum support is 5 then the output will
>> look like:
>> 12 X
>> 10 X Y
>> thus totally ignoring Y.
>> If the minimum support is 11 then the output will look like
>> 12 X
>> again ignoring Y.
>> If the minimum support is 13 then there will be NO output, even though
>> X's support was 22 and Y's was 14 all along.
>> Even if we want to show just the maximal itemsets (although I would like
>> to see ALL the frequent itemsets, maximal or not), this output is wrong, as
>> with a support of 13 we should still have seen X (22) and Y (14).
>> Now say you add XYZ 11 times.
>> For support 1 you'd see
>> 12 X
>> 10 X Y
>> 11 X Y Z
>> 4 Y
>> And for support 11 you'd see
>> 12 X
>> 11 X Y Z
>> Although I'd expect the output (for both s=1 & s=11) to be
>> 33 X
>> 25 Y
>> 21 XY
>> 11 Z
>> 11 XZ
>> 11 YZ
>> 11 XYZ
>> attached are the sample inputs:
>
> --
> This message is automatically generated by JIRA.
> For more information on JIRA, see: http://www.atlassian.com/software/jira
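
For what it's worth, the expected counts in the quoted report can be checked by brute force. This is a small Python sketch, not Mahout code. The function name count_supports is made up, and the transaction lists are rebuilt from the numbers in the description; the actual contents of the XY and XYZ attachments are assumed to match them.

```python
from collections import Counter
from itertools import combinations

def count_supports(transactions):
    """Count the support of every itemset by enumerating all subsets.

    transactions: list of (items, times) pairs, where items is a string of
    single-character item names and times is how often that transaction occurs.
    Exponential in transaction length, so only suitable for tiny examples.
    """
    counts = Counter()
    for items, times in transactions:
        unique = sorted(set(items))
        for size in range(1, len(unique) + 1):
            for subset in combinations(unique, size):
                counts["".join(subset)] += times
    return counts

# Transactions as described in the report (assumed, not read from the attachments).
xy = [("X", 12), ("XY", 10), ("Y", 4)]
xyz = xy + [("XYZ", 11)]

for name, data in (("XY", xy), ("XYZ", xyz)):
    print(name)
    for itemset, support in sorted(count_supports(data).items()):
        print(" ", support, itemset)
```

For the first input this gives X 22, XY 10, Y 14, and for the second X 33, Y 25, XY 21, and 11 each for Z, XZ, YZ and XYZ, which matches the output the reporter expected.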
