Hi all! I'm trying to run PFPGrowth on my data and this is the output I get. (Please note that I parse the output in the frequentpatterns folder and print each result as the support followed by the itemset.)
support : Itemset
*234 1518311 1476937*
235 55843184
238 1238079
244 34541
247 4516454
252 106478
252 670864
*254 1476937 1518311*

You can see that the same two items (*1518311 1476937*) are reported twice, with different supports. Below are all the occurrences of these two items together. If you look closely, the list contains every permutation of the three items (*1476937* *720020* *1518311*):

22 *1476937* 720020 *1518311*
30 *1518311* *1476937* 720020
30 720020 *1518311* *1476937*
34 720020 *1476937* *1518311*
38 *1518311* 720020 *1476937*
42 *1476937* *1518311* 720020
234 *1518311* *1476937*
254 *1476937* *1518311*

Does this mean that to get the support of just the pair (*1476937* *1518311*) I have to add all of these up? Even in that case, the total comes out to *684*, whereas if I count the co-occurrences of these two items in the original baskets, the support is *766*. Why is there a difference? Any ideas?

Thanks!
Vipul
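
P.S. A minimal sketch of the two counts described above, in case someone wants to reproduce the arithmetic. The rows are copied from the parsed output in this post; the helper names (summed_support, support_by_itemset, cooccurrence_count) and the `baskets` argument are hypothetical, since the format of the original basket file is not shown here. It only reproduces the sums and does not claim that adding the rows is the right way to read the PFPGrowth output.

```python
from collections import defaultdict

# (support, itemset in the order it was reported), copied from the post.
rows = [
    (22, ["1476937", "720020", "1518311"]),
    (30, ["1518311", "1476937", "720020"]),
    (30, ["720020", "1518311", "1476937"]),
    (34, ["720020", "1476937", "1518311"]),
    (38, ["1518311", "720020", "1476937"]),
    (42, ["1476937", "1518311", "720020"]),
    (234, ["1518311", "1476937"]),
    (254, ["1476937", "1518311"]),
]

PAIR = ("1476937", "1518311")


def summed_support(rows, pair):
    """Add up the support of every reported row that contains all items of `pair`."""
    wanted = set(pair)
    return sum(support for support, items in rows if wanted <= set(items))


def support_by_itemset(rows):
    """Merge rows that differ only in item order and sum their supports."""
    totals = defaultdict(int)
    for support, items in rows:
        totals[frozenset(items)] += support
    return dict(totals)


def cooccurrence_count(baskets, pair):
    """Count baskets (any iterable of item ids) that contain every item of `pair`."""
    wanted = set(pair)
    return sum(1 for basket in baskets if wanted <= set(basket))


print(summed_support(rows, PAIR))  # 684, the total quoted above
for itemset, total in support_by_itemset(rows).items():
    print(sorted(itemset), total)
```

cooccurrence_count is the brute-force check against the raw baskets (the number reported as 766 above); it needs the baskets loaded as lists of item ids, which is left out because the input format is an assumption.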
