[ https://issues.apache.org/jira/browse/MAHOUT-1622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14396132#comment-14396132 ]
ASF GitHub Bot commented on MAHOUT-1622:
----------------------------------------
GitHub user avati opened a pull request:
https://github.com/apache/mahout/pull/108
MAHOUT-1622: MultithreadedBatchItemSimilarities output fix
In some cases the Output class in MultithreadedBatchItemSimilarities does
not write out all of the similarity pairs that it should. The number of
active workers can drop to zero while the output loop is still running,
in which case the remaining similarities from the finished workers are
never flushed to the output. This happens because the while loop is
conditioned only on whether there are active workers. An easy fix is to
also check that the results structure is not empty, so that the loop
exits only when the number of active workers is 0 and the result set is
empty.
On-behalf-of: Jesse Daniels <[email protected]>
Signed-off-by: Anand Avati <[email protected]>
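
The loop condition described above can be illustrated with a minimal sketch.
This is not the Mahout source: the class DrainLoopSketch, the drain method,
and the alsoCheckQueue flag are hypothetical names, and the results structure
is assumed here to be a BlockingQueue of batches. It only shows why a loop
that checks the worker count alone can leave results behind.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class DrainLoopSketch {

  // Hypothetical stand-in for the output thread. With alsoCheckQueue == false
  // the loop depends only on the worker count (the behaviour reported in the
  // issue); with true it also keeps going until the queue is empty.
  static List<String> drain(BlockingQueue<List<String>> results,
                            AtomicInteger numActiveWorkers,
                            boolean alsoCheckQueue) {
    List<String> written = new ArrayList<>();
    while (numActiveWorkers.get() != 0 || (alsoCheckQueue && !results.isEmpty())) {
      try {
        // Wait briefly for the next batch; null means nothing arrived in time.
        List<String> batch = results.poll(10, TimeUnit.MILLISECONDS);
        if (batch != null) {
          written.addAll(batch);
        }
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        break;
      }
    }
    return written;
  }

  // Simulates the moment after the last worker finished: the worker count is
  // already 0, but two batches are still waiting in the queue.
  static BlockingQueue<List<String>> pendingResults() {
    BlockingQueue<List<String>> results = new LinkedBlockingQueue<>();
    results.add(Arrays.asList("item1-item2", "item1-item3"));
    results.add(Arrays.asList("item2-item3"));
    return results;
  }

  public static void main(String[] args) {
    AtomicInteger noWorkers = new AtomicInteger(0);
    System.out.println("workers only:      " + drain(pendingResults(), noWorkers, false).size());
    System.out.println("workers and queue: " + drain(pendingResults(), noWorkers, true).size());
  }
}
```

In this simulated state the first call exits immediately and writes nothing,
while the second call drains both pending batches before exiting.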
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/avati/mahout MAHOUT-1622
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/mahout/pull/108.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #108
----
commit d77f7c48e68ae4f4d2b87dab0cefb0f245e189f6
Author: Anand Avati <[email protected]>
Date: 2015-04-05T05:59:09Z
MAHOUT-1622: MultithreadedBatchItemSimilarities output fix
----
> MultithreadedBatchItemSimilarities outputs incorrect number of similarities.
> ----------------------------------------------------------------------------
>
> Key: MAHOUT-1622
> URL: https://issues.apache.org/jira/browse/MAHOUT-1622
> Project: Mahout
> Issue Type: Bug
> Components: Collaborative Filtering
> Affects Versions: 0.9
> Reporter: Jesse Daniels
> Assignee: Anand Avati
> Priority: Minor
> Labels: legacy
> Fix For: 0.10.0
>
> Attachments: batchSimilarities.patch
>
>
> In some cases the Output class in MultithreadedBatchItemSimilarities does not
> write out all of the similarity pairs that it should. The number of active
> workers can drop to zero while the output loop is still running, in which
> case the remaining similarities from the finished workers are never flushed
> to the output. This happens because the while loop is conditioned only on
> whether there are active workers. An easy fix is to also check that the
> results structure is not empty, so that the loop exits only when the number
> of active workers is 0 and the result set is empty.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)