[ 
https://issues.apache.org/jira/browse/MAHOUT-1622?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesse Daniels updated MAHOUT-1622:
----------------------------------
    Comment: was deleted

(was: Here is a possible fix:
public void run() {
  while (numActiveWorkers.get() != 0 || !results.isEmpty()) {
    try {
      List<SimilarItems> similarItemsOfABatch = results.poll(10, TimeUnit.MILLISECONDS);
      if (similarItemsOfABatch != null) {
        for (SimilarItems similarItems : similarItemsOfABatch) {
          writer.add(similarItems);
          numSimilaritiesProcessed += similarItems.numSimilarItems();
        }
      }
    } catch (Exception e) {
      throw new RuntimeException(e);
    }
  }
}
)

> MultithreadedBatchItemSimilarities outputs incorrect number of similarities.
> ----------------------------------------------------------------------------
>
>                 Key: MAHOUT-1622
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-1622
>             Project: Mahout
>          Issue Type: Bug
>          Components: Collaborative Filtering
>    Affects Versions: 0.9
>            Reporter: Jesse Daniels
>            Priority: Minor
>         Attachments: batchSimilarities.patch
>
>
> In some cases the Output class in MultithreadedBatchItemSimilarities does not
> write all of the similarity pairs that it should. The number of active workers
> can drop to zero while the output loop is still running, in which case the
> similarities that the finished workers have already queued are never flushed
> to the output. This happens because the while loop is conditioned only on
> whether there are active workers. An easy fix is to also check that the
> results structure is not empty, so that the loop exits only when the number
> of active workers is 0 and the result set is empty.
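
The exit condition described above can be illustrated with a small, self-contained sketch. This is not the Mahout code: the class DrainLoopSketch, the methods drain and runOnce, and the use of List<Integer> batches in place of SimilarItems are all hypothetical stand-ins. It also assumes that each worker decrements the active-worker counter only after it has enqueued its last batch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch of the output loop; names are illustrative, not Mahout's.
public class DrainLoopSketch {

  // Consumes batches until no worker is active AND the queue is empty.
  // Checking only the worker count could exit with batches still queued.
  static int drain(BlockingQueue<List<Integer>> results, AtomicInteger numActiveWorkers) {
    int processed = 0;
    while (numActiveWorkers.get() != 0 || !results.isEmpty()) {
      try {
        List<Integer> batch = results.poll(10, TimeUnit.MILLISECONDS);
        if (batch != null) {
          processed += batch.size();
        }
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        throw new RuntimeException(e);
      }
    }
    return processed;
  }

  // Starts the workers, drains their output, and returns the number of items seen.
  static int runOnce(int numWorkers, int batchesPerWorker, int batchSize) {
    BlockingQueue<List<Integer>> results = new LinkedBlockingQueue<>();
    AtomicInteger numActiveWorkers = new AtomicInteger(numWorkers);
    List<Thread> workers = new ArrayList<>();
    for (int w = 0; w < numWorkers; w++) {
      Thread worker = new Thread(() -> {
        for (int b = 0; b < batchesPerWorker; b++) {
          List<Integer> batch = new ArrayList<>();
          for (int i = 0; i < batchSize; i++) {
            batch.add(i);
          }
          results.add(batch);
        }
        // Assumption: the counter is decremented only after the last batch is queued.
        numActiveWorkers.decrementAndGet();
      });
      workers.add(worker);
      worker.start();
    }
    int processed = drain(results, numActiveWorkers);
    for (Thread worker : workers) {
      try {
        worker.join();
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        throw new RuntimeException(e);
      }
    }
    return processed;
  }

  public static void main(String[] args) {
    System.out.println("items processed: " + runOnce(4, 50, 10));
  }
}
```

With the second condition in place, the loop cannot exit while a queued batch remains, because the counter reaches zero only after every batch has been added. Dropping the `!results.isEmpty()` check from drain reproduces the lost-output behaviour described in the issue whenever the workers finish before the queue has been emptied.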



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
