[
https://issues.apache.org/jira/browse/MAHOUT-759?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13064432#comment-13064432
]
Han Hui Wen edited comment on MAHOUT-759 at 7/13/11 8:47 AM:
--------------------------------------------------------------
Yep, there are three options:
1) If ItemSimilarityJob cannot do this itself, the end user can add another M/R
job after ItemSimilarityJob that converts its output to the second style.
2) Or change MostSimilarItemPairsMapper:
http://svn.apache.org/viewvc/mahout/trunk/core/src/main/java/org/apache/mahout/cf/taste/hadoop/similarity/item/MostSimilarItemPairsMapper.java?view=markup
Change:
    long itemID = indexItemIDMap.get(itemIDIndex);
    for (SimilarItem similarItem : topKMostSimilarItems.retrieve()) {
      long otherItemID = similarItem.getItemID();
      if (itemID < otherItemID) {
        ctx.write(new EntityEntityWritable(itemID, otherItemID),
            new DoubleWritable(similarItem.getSimilarity()));
      } else {
        ctx.write(new EntityEntityWritable(otherItemID, itemID),
            new DoubleWritable(similarItem.getSimilarity()));
      }
    }
to:
    long itemID = indexItemIDMap.get(itemIDIndex);
    // Sketch only: the key needs a Writable wrapper, and the mapper's
    // declared output types must change to match.
    ctx.write(itemID,
        new RecommendedItemsWritable(topKMostSimilarItems.retrieve()));
3) Or write another mapper for ItemSimilarityJob.
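
To make option 1 concrete, here is a minimal sketch of the grouping step in plain Java, without any Hadoop or Mahout classes. It assumes each pair appears only once in the pairwise output, with the smaller item ID first (as the sample output suggests), so every pair is registered under both of its items. The class and method names (SimilarItemGrouper, Pair, Neighbour, group) are made up for illustration and are not part of Mahout.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class SimilarItemGrouper {

  /** One row of the pairwise output: itemA, itemB, similarity (hypothetical type). */
  public static final class Pair {
    public final long itemA;
    public final long itemB;
    public final double similarity;

    public Pair(long itemA, long itemB, double similarity) {
      this.itemA = itemA;
      this.itemB = itemB;
      this.similarity = similarity;
    }
  }

  /** A similar item together with its similarity value (hypothetical type). */
  public static final class Neighbour {
    public final long itemID;
    public final double similarity;

    public Neighbour(long itemID, double similarity) {
      this.itemID = itemID;
      this.similarity = similarity;
    }
  }

  /** Groups pairwise similarities into one neighbour list per item, most similar first. */
  public static Map<Long, List<Neighbour>> group(List<Pair> pairs) {
    Map<Long, List<Neighbour>> byItem = new TreeMap<>();
    for (Pair p : pairs) {
      // Each pair is assumed to be stored once, so register it under both items.
      byItem.computeIfAbsent(p.itemA, k -> new ArrayList<>())
          .add(new Neighbour(p.itemB, p.similarity));
      byItem.computeIfAbsent(p.itemB, k -> new ArrayList<>())
          .add(new Neighbour(p.itemA, p.similarity));
    }
    for (List<Neighbour> neighbours : byItem.values()) {
      // Sort each list by descending similarity.
      neighbours.sort((x, y) -> Double.compare(y.similarity, x.similarity));
    }
    return byItem;
  }

  public static void main(String[] args) {
    // Three rows taken from the sample output in the issue description.
    List<Pair> pairs = Arrays.asList(
        new Pair(-7748456450926673883L, -4835531939219667484L, 0.2),
        new Pair(-7748456450926673883L, -4314955996498817413L, 0.5),
        new Pair(-7743453627722079138L, 5506208171850960507L, 0.0625));
    for (Map.Entry<Long, List<Neighbour>> e : group(pairs).entrySet()) {
      System.out.println(e.getKey() + " has " + e.getValue().size() + " similar item(s)");
    }
  }
}
```

In a real follow-up M/R job the same idea would probably be split in two: the mapper emits each pair under both item IDs, and the reducer collects and sorts the neighbours for one item.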
> improve the output for ItemSimilarityJob
> ----------------------------------------
>
> Key: MAHOUT-759
> URL: https://issues.apache.org/jira/browse/MAHOUT-759
> Project: Mahout
> Issue Type: Improvement
> Components: Collaborative Filtering
> Affects Versions: 0.5
> Reporter: Han Hui Wen
> Assignee: Sean Owen
> Priority: Minor
> Labels: ItemSimilarityJob, Mahout
> Fix For: 0.6
>
>
> Currently the output of ItemSimilarityJob looks like the following:
> -7757148334301255842 8179634876330318523 0.003430531732418525
> -7748456450926673883 -4835531939219667484 0.2
> -7748456450926673883 -4314955996498817413 0.5
> -7748456450926673883 2808714190706572296 0.16666666666666666
> -7748456450926673883 6553837338030757853 0.14285714285714285
> -7748456450926673883 8751415108300656176 0.25
> -7747582778903926086 -7015341798833970389 0.05
> -7745456649800833279 -4355275072474512298 4.2444821731748726E-4
> -7743453627722079138 -3667977661496669483 0.0625
> -7743453627722079138 5506208171850960507 0.0625
> -7743453627722079138 7221367701058721462 0.0625
> -7721326863046534787 4345458182369739840 0.1111111111111111
> It is hard to store and view the similar items for a single item in this
> format. Could the output be grouped per item, in the same style as
> RecommenderJob, like the following:
> -9220680374247203656 [1352180348488328600:2.5,-7757148334301255842:2.5,-7490490145790861630:2.5,-2522983126042570313:2.5,-6799281597153282746:2.5,2068144185705723774:2.5,-6007350693723349387:2.5,-6926986971196173463:2.5,5406899818760113425:2.5,-1490410533166829581:2.5,-27094582027403342:2.5,5665136340246000627:2.5]
> -9218599019595753787 [7535853797920985421:2.5,6375444791143058470:2.5,-6278686364859964742:2.5,4842183991621375854:2.5,-5371123101058190798:2.5,8606934083257321678:2.5,8043580185091202137:2.5,5264973095582397115:2.5,1990532764981555035:2.5,5406899818760113425:2.5,-5208048021997301514:2.5,-5565838412826072017:2.5]
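
The difference between the two text layouts can be sketched in a few lines of plain Java: one method parses a row of the current pairwise output, another renders rows that share an item as a single line in the requested style. The class and method names are hypothetical, and the tab between the item ID and the bracketed list is an assumption, since the exact separator cannot be recovered from the wrapped sample above.

```java
import java.util.List;

public class SimilarityLineFormat {

  /** One parsed row of the pairwise output (hypothetical type). */
  public static final class Row {
    public final long itemA;
    public final long itemB;
    public final double similarity;

    public Row(long itemA, long itemB, double similarity) {
      this.itemA = itemA;
      this.itemB = itemB;
      this.similarity = similarity;
    }
  }

  /** Parses a whitespace-separated line of the form "itemA itemB similarity". */
  public static Row parse(String line) {
    String[] fields = line.trim().split("\\s+");
    if (fields.length != 3) {
      throw new IllegalArgumentException("expected 3 fields: " + line);
    }
    return new Row(Long.parseLong(fields[0]), Long.parseLong(fields[1]),
        Double.parseDouble(fields[2]));
  }

  /**
   * Renders all rows that involve itemID as "itemID<TAB>[other:sim,other:sim,...]".
   * The tab separator is an assumption.
   */
  public static String formatGrouped(long itemID, List<Row> rows) {
    StringBuilder sb = new StringBuilder();
    sb.append(itemID).append('\t').append('[');
    for (int i = 0; i < rows.size(); i++) {
      Row r = rows.get(i);
      // Pick whichever side of the pair is not the item being described.
      long other = (r.itemA == itemID) ? r.itemB : r.itemA;
      if (i > 0) {
        sb.append(',');
      }
      sb.append(other).append(':').append(r.similarity);
    }
    return sb.append(']').toString();
  }
}
```

Calling parse on each row for one item and passing the results to formatGrouped yields a single line per item, which is easier to store and to inspect than the one-pair-per-line layout.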
--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira