[ 
https://issues.apache.org/jira/browse/MAHOUT-759?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13064432#comment-13064432
 ] 

Han Hui Wen  edited comment on MAHOUT-759 at 7/13/11 8:47 AM:
--------------------------------------------------------------

Yes,

1) The end user can add another M/R job after ItemSimilarityJob to transform 
the output into the second format, if ItemSimilarityJob cannot do this itself.

2) Or change MostSimilarItemPairsMapper:
http://svn.apache.org/viewvc/mahout/trunk/core/src/main/java/org/apache/mahout/cf/taste/hadoop/similarity/item/MostSimilarItemPairsMapper.java?view=markup

Change:

    long itemID = indexItemIDMap.get(itemIDIndex);
    for (SimilarItem similarItem : topKMostSimilarItems.retrieve()) {
      long otherItemID = similarItem.getItemID();
      if (itemID < otherItemID) {
        ctx.write(new EntityEntityWritable(itemID, otherItemID),
            new DoubleWritable(similarItem.getSimilarity()));
      } else {
        ctx.write(new EntityEntityWritable(otherItemID, itemID),
            new DoubleWritable(similarItem.getSimilarity()));
      }
    }

to:

    long itemID = indexItemIDMap.get(itemIDIndex);
    // Sketch only: the mapper's output key/value types would need to change accordingly.
    ctx.write(itemID, new RecommendedItemsWritable(topKMostSimilarItems.retrieve()));


3) Or write another mapper for ItemSimilarityJob.
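
As a rough illustration of option 1, the regrouping step could look like the plain-Java sketch below. It is not a Hadoop job and does not use any Mahout classes; it only shows the grouping logic that a follow-up job's reducer might apply. The names SimilarItemsRegrouper, regroup and format are hypothetical. It assumes whitespace-separated input lines of the form "itemA itemB similarity" and that each pair appears only once, so every pair is added to the lists of both items.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper, not part of Mahout: regroups pairwise similarity
// lines into one list of "otherItemID:similarity" entries per item.
class SimilarItemsRegrouper {

    // Assumes each input line is "itemA<whitespace>itemB<whitespace>similarity".
    static Map<Long, List<String>> regroup(List<String> pairLines) {
        Map<Long, List<String>> byItem = new TreeMap<>();
        for (String line : pairLines) {
            String[] fields = line.trim().split("\\s+");
            long itemA = Long.parseLong(fields[0]);
            long itemB = Long.parseLong(fields[1]);
            // Keep the similarity as text so it is written back unchanged.
            String similarity = fields[2];
            // Assumption: each pair is listed once, so record it for both items.
            byItem.computeIfAbsent(itemA, k -> new ArrayList<>()).add(itemB + ":" + similarity);
            byItem.computeIfAbsent(itemB, k -> new ArrayList<>()).add(itemA + ":" + similarity);
        }
        return byItem;
    }

    // Formats one item in the "itemID<TAB>[id:value,id:value]" style.
    static String format(long itemID, List<String> similarItems) {
        return itemID + "\t[" + String.join(",", similarItems) + "]";
    }

    public static void main(String[] args) {
        List<String> pairs = Arrays.asList(
            "-7748456450926673883\t-4835531939219667484\t0.2",
            "-7748456450926673883\t-4314955996498817413\t0.5");
        for (Map.Entry<Long, List<String>> entry : regroup(pairs).entrySet()) {
            System.out.println(format(entry.getKey(), entry.getValue()));
        }
    }
}
```

The sketch keeps entries in input order; sorting each list by similarity, or trimming it to the top K, would be an extra step.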

> improve the output for ItemSimilarityJob
> ----------------------------------------
>
>                 Key: MAHOUT-759
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-759
>             Project: Mahout
>          Issue Type: Improvement
>          Components: Collaborative Filtering
>    Affects Versions: 0.5
>            Reporter: Han Hui Wen 
>            Assignee: Sean Owen
>            Priority: Minor
>              Labels: ItemSimilarityJob, Mahout
>             Fix For: 0.6
>
>
> Currently the output of ItemSimilarityJob looks like the following:
> -7757148334301255842  8179634876330318523     0.003430531732418525
> -7748456450926673883  -4835531939219667484    0.2
> -7748456450926673883  -4314955996498817413    0.5
> -7748456450926673883  2808714190706572296     0.16666666666666666
> -7748456450926673883  6553837338030757853     0.14285714285714285
> -7748456450926673883  8751415108300656176     0.25
> -7747582778903926086  -7015341798833970389    0.05
> -7745456649800833279  -4355275072474512298    4.2444821731748726E-4
> -7743453627722079138  -3667977661496669483    0.0625
> -7743453627722079138  5506208171850960507     0.0625
> -7743453627722079138  7221367701058721462     0.0625
> -7721326863046534787  4345458182369739840     0.1111111111111111
> It's hard to store and view the similar items for a given item. Can we 
> transform the output into the same format as RecommenderJob, like the following:
> -9220680374247203656  
> [1352180348488328600:2.5,-7757148334301255842:2.5,-7490490145790861630:2.5,-2522983126042570313:2.5,-6799281597153282746:2.5,2068144185705723774:2.5,-6007350693723349387:2.5,-6926986971196173463:2.5,5406899818760113425:2.5,-1490410533166829581:2.5,-27094582027403342:2.5,5665136340246000627:2.5]
> -9218599019595753787  
> [7535853797920985421:2.5,6375444791143058470:2.5,-6278686364859964742:2.5,4842183991621375854:2.5,-5371123101058190798:2.5,8606934083257321678:2.5,8043580185091202137:2.5,5264973095582397115:2.5,1990532764981555035:2.5,5406899818760113425:2.5,-5208048021997301514:2.5,-5565838412826072017:2.5]

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
