[ https://issues.apache.org/jira/browse/MAHOUT-1622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14174359#comment-14174359 ]
Jesse Daniels commented on MAHOUT-1622:
---------------------------------------
Here is a possible fix:
    public void run() {
      // Keep draining until every worker has finished AND the queue is empty.
      while (numActiveWorkers.get() != 0 || !results.isEmpty()) {
        try {
          // poll() returns null if no batch arrives within the timeout
          List<SimilarItems> similarItemsOfABatch = results.poll(10,
              TimeUnit.MILLISECONDS);
          if (similarItemsOfABatch != null) {
            for (SimilarItems similarItems : similarItemsOfABatch) {
              writer.add(similarItems);
              numSimilaritiesProcessed += similarItems.numSimilarItems();
            }
          }
        } catch (Exception e) {
          throw new RuntimeException(e);
        }
      }
    }
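
The effect of the extra check can be tried outside of Mahout with a small standalone sketch. This is not the Mahout code: the class DrainDemo and the method produceAndDrain are hypothetical names, integers stand in for SimilarItems, and it assumes each worker decrements the active-worker counter only after it has queued its last batch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical standalone demo; it only mirrors the shape of the loop above.
public class DrainDemo {

  /** Starts the producers, drains the queue, and returns the number of items consumed. */
  public static int produceAndDrain(int numWorkers, int batchesPerWorker, int batchSize) {
    BlockingQueue<List<Integer>> results = new LinkedBlockingQueue<>();
    AtomicInteger numActiveWorkers = new AtomicInteger(numWorkers);
    ExecutorService pool = Executors.newFixedThreadPool(numWorkers);

    for (int w = 0; w < numWorkers; w++) {
      pool.execute(() -> {
        try {
          for (int b = 0; b < batchesPerWorker; b++) {
            List<Integer> batch = new ArrayList<>();
            for (int i = 0; i < batchSize; i++) {
              batch.add(i);
            }
            results.add(batch);
          }
        } finally {
          // Assumption: the counter drops only after the last batch is queued.
          numActiveWorkers.decrementAndGet();
        }
      });
    }

    int processed = 0;
    try {
      // Exit only when no worker is active AND nothing is left in the queue.
      while (numActiveWorkers.get() != 0 || !results.isEmpty()) {
        List<Integer> batch = results.poll(10, TimeUnit.MILLISECONDS);
        if (batch != null) {
          processed += batch.size();
        }
      }
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new RuntimeException(e);
    } finally {
      pool.shutdown();
    }
    return processed;
  }

  public static void main(String[] args) {
    // 4 workers * 50 batches * 10 items per batch
    System.out.println(produceAndDrain(4, 50, 10)); // prints 2000
  }
}
```

With only the numActiveWorkers check in the loop condition, the same sketch could return fewer than 2000 items whenever the last workers finish while batches are still queued.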
> MultithreadedBatchItemSimilarities outputs incorrect number of similarities.
> ----------------------------------------------------------------------------
>
> Key: MAHOUT-1622
> URL: https://issues.apache.org/jira/browse/MAHOUT-1622
> Project: Mahout
> Issue Type: Bug
> Components: Collaborative Filtering
> Affects Versions: 0.9
> Reporter: Jesse Daniels
> Priority: Minor
>
> In some cases the Output class in MultithreadedBatchItemSimilarities does not
> output all of the similarity pairs that it should. The number of active
> workers can drop to zero while the output loop is still running, in which
> case any similarities that the finished workers have queued but that have not
> yet been written are never flushed to the output. This happens because the
> while loop is conditioned only on whether there are active workers. An easy
> fix is to also check that the results structure is not empty, so that the
> loop exits only when the number of active workers is 0 and the result set is
> empty.