[
https://issues.apache.org/jira/browse/MAHOUT-1739?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
lariven updated MAHOUT-1739:
----------------------------
Description:
The number of similar items that ItemSimilarityJob outputs for a target item
may exceed the value set by the maxSimilarItemsPerItem parameter. The following
code in ItemSimilarityJob.java, around line 200, may be the cause:

    if (itemID < otherItemID) {
      ctx.write(new EntityEntityWritable(itemID, otherItemID),
          new DoubleWritable(similarItem.getSimilarity()));
    } else {
      ctx.write(new EntityEntityWritable(otherItemID, itemID),
          new DoubleWritable(similarItem.getSimilarity()));
    }

I don't know why itemID needs to be swapped with otherItemID, but I think a
single statement is enough:

    ctx.write(new EntityEntityWritable(itemID, otherItemID),
        new DoubleWritable(similarItem.getSimilarity()));
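
To illustrate how the reported behavior could arise, here is a small standalone
Java sketch. It does not use Mahout or Hadoop: the class TopNPairDemo, its
methods, and the sample data are all hypothetical. It assumes that duplicate
pairs are collapsed downstream (modeled here with a Set) and that the similar
items listed for a target are the pairs whose first ID is that target.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

public class TopNPairDemo {

    // Hypothetical data: each item's top-N similar items, with N = 1.
    // Items 2, 3 and 4 all pick item 1 as their single most similar item.
    public static Map<Long, List<Long>> sampleTopN() {
        Map<Long, List<Long>> topN = new TreeMap<>();
        topN.put(1L, Arrays.asList(2L));
        topN.put(2L, Arrays.asList(1L));
        topN.put(3L, Arrays.asList(1L));
        topN.put(4L, Arrays.asList(1L));
        return topN;
    }

    // Mirrors the if/else in the report: the smaller ID is always written first.
    public static Set<List<Long>> emitCanonical(Map<Long, List<Long>> topN) {
        Set<List<Long>> pairs = new LinkedHashSet<>();
        for (Map.Entry<Long, List<Long>> e : topN.entrySet()) {
            long itemID = e.getKey();
            for (long otherItemID : e.getValue()) {
                pairs.add(Arrays.asList(Math.min(itemID, otherItemID),
                                        Math.max(itemID, otherItemID)));
            }
        }
        return pairs;
    }

    // Mirrors the proposed single statement: (itemID, otherItemID) as is.
    public static Set<List<Long>> emitDirected(Map<Long, List<Long>> topN) {
        Set<List<Long>> pairs = new LinkedHashSet<>();
        for (Map.Entry<Long, List<Long>> e : topN.entrySet()) {
            long itemID = e.getKey();
            for (long otherItemID : e.getValue()) {
                pairs.add(Arrays.asList(itemID, otherItemID));
            }
        }
        return pairs;
    }

    // Counts how many emitted pairs start with each item ID.
    public static Map<Long, Integer> countByFirst(Set<List<Long>> pairs) {
        Map<Long, Integer> counts = new TreeMap<>();
        for (List<Long> pair : pairs) {
            counts.merge(pair.get(0), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        Map<Long, List<Long>> topN = sampleTopN();
        System.out.println("smaller ID first: " + countByFirst(emitCanonical(topN)));
        System.out.println("as written:       " + countByFirst(emitDirected(topN)));
    }
}
```

With this sample data, writing the smaller ID first leaves item 1 with three
pairs even though N is 1, while writing the pair as is keeps one pair per item.
Note that the second variant emits both (1, 2) and (2, 1), so the swap may have
been intended to collapse symmetric pairs; that is a guess, not something the
source confirms.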
was:
The number of similar items that ItemSimilarityJob outputs for a target item
may exceed the value set by this parameter. The following code in
ItemSimilarityJob.java, around line 200, may be the cause:

    if (itemID < otherItemID) {
      ctx.write(new EntityEntityWritable(itemID, otherItemID),
          new DoubleWritable(similarItem.getSimilarity()));
    } else {
      ctx.write(new EntityEntityWritable(otherItemID, itemID),
          new DoubleWritable(similarItem.getSimilarity()));
    }

I don't know why itemID needs to be swapped with otherItemID, but I think a
single statement is enough:

    ctx.write(new EntityEntityWritable(itemID, otherItemID),
        new DoubleWritable(similarItem.getSimilarity()));

> maxSimilarItemsPerItem param of ItemSimilarityJob doesn't behave correctly
> --------------------------------------------------------------------------
>
> Key: MAHOUT-1739
> URL: https://issues.apache.org/jira/browse/MAHOUT-1739
> Project: Mahout
> Issue Type: Bug
> Components: Collaborative Filtering
> Affects Versions: 0.10.0
> Reporter: lariven
> Labels: easyfix, patch
> Fix For: 0.10.0, 0.10.1
>