[ 
https://issues.apache.org/jira/browse/MAHOUT-247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12800466#action_12800466
 ] 

Sean Owen commented on MAHOUT-247:
----------------------------------

This is messy to fix completely: LongPrimitiveIterator is used in a lot of 
places, and making sure it is "drained" on every exception path would make 
quite a mess of the code.

I did add a finalize() method to the implementations which are based on a JDBC 
connection, to close the connection. This doesn't really fix anything but at 
least may get the connection closed via finalization somewhat faster.
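
To illustrate the idea, here is a minimal sketch of a finalize() safety net. 
It is not the code that was committed: FakeConnection and IdIterator are 
hypothetical stand-ins for a JDBC connection and a JDBC-backed ID iterator, 
and finalize() has since been deprecated in newer Java versions.

```java
import java.util.NoSuchElementException;

// Hypothetical sketch only; none of these names are taken from Mahout.
class FinalizableIdIteratorSketch {

  /** Stand-in for a pooled resource such as a JDBC connection. */
  static final class FakeConnection implements AutoCloseable {
    private boolean closed;

    @Override
    public void close() {
      closed = true; // idempotent: closing twice is harmless
    }

    boolean isClosed() {
      return closed;
    }
  }

  /** Stand-in for an ID iterator that owns a connection while it is in use. */
  static final class IdIterator {
    private final FakeConnection connection;
    private final long[] ids;
    private int position;

    IdIterator(FakeConnection connection, long... ids) {
      this.connection = connection;
      this.ids = ids;
    }

    boolean hasNext() {
      if (position < ids.length) {
        return true;
      }
      close(); // normal path: release as soon as the iterator is drained
      return false;
    }

    long next() {
      if (!hasNext()) {
        throw new NoSuchElementException();
      }
      return ids[position++];
    }

    void close() {
      connection.close();
    }

    // Last-resort cleanup for iterators that were abandoned before being
    // drained. The JVM gives no guarantee about when, or whether, this runs.
    @Override
    @SuppressWarnings({"deprecation", "removal"})
    protected void finalize() {
      close();
    }
  }
}
```

Since finalization timing is up to the garbage collector, this only narrows 
the window in which an abandoned iterator holds its connection.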

While I think it's hard to guarantee that the connection is closed 
immediately on every exception path, it's relatively easy to apply fixes that 
cover the cases most likely to cause problems in practice, such as the 
code you cite above.

I think my best bet at a fix here is to apply your change and related changes 
in key parts of the code and leave it at that.
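
A rough sketch of the kind of change I mean, under stated assumptions: 
TrackedIdIterator, Estimator and sumEstimates below are hypothetical names 
made up for illustration, not Mahout classes, and the explicit close() call 
stands in for whatever mechanism releases the underlying connection.

```java
import java.util.NoSuchElementException;

// Hypothetical sketch only; none of these names are taken from Mahout.
class DrainOnErrorSketch {

  /** Iterator that holds a resource until it is exhausted or closed. */
  static final class TrackedIdIterator {
    private final long[] ids;
    private int position;
    private boolean released;

    TrackedIdIterator(long... ids) {
      this.ids = ids;
    }

    boolean hasNext() {
      if (position < ids.length) {
        return true;
      }
      released = true; // draining the iterator releases the resource
      return false;
    }

    long next() {
      if (!hasNext()) {
        throw new NoSuchElementException();
      }
      return ids[position++];
    }

    void close() {
      released = true; // explicit release for the early-exit case
    }

    boolean isReleased() {
      return released;
    }
  }

  /** Per-ID computation that may fail partway through the loop. */
  interface Estimator {
    double estimate(long id);
  }

  /** Consumes the iterator and releases it even if the estimator throws. */
  static double sumEstimates(TrackedIdIterator ids, Estimator estimator) {
    double total = 0.0;
    try {
      while (ids.hasNext()) {
        total += estimator.estimate(ids.next());
      }
      return total;
    } finally {
      ids.close(); // runs on both the normal and the exceptional path
    }
  }
}
```

The point is only that the release happens in a finally block around the 
loop, so an exception thrown from inside the loop no longer strands the 
iterator half-consumed.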

> GenericUserBasedRecommender.recommend causes connection leak when called for 
> user with no preferences
> -----------------------------------------------------------------------------------------------------
>
>                 Key: MAHOUT-247
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-247
>             Project: Mahout
>          Issue Type: Bug
>          Components: Collaborative Filtering
>    Affects Versions: 0.2
>         Environment: Reproducible on Win32 and Ubuntu
>            Reporter: Tolga Oral
>            Assignee: Sean Owen
>            Priority: Minor
>             Fix For: 0.3
>
>
>     UserSimilarity userSimilarity =
>         new TanimotoCoefficientSimilarity(getBooleanPrefDataModel());
>     UserNeighborhood neighborhood =
>         new NearestNUserNeighborhood(3, userSimilarity, getBooleanPrefDataModel());
>     Recommender recommender =
>         new GenericBooleanPrefUserBasedRecommender(getBooleanPrefDataModel(), neighborhood, userSimilarity);
>     recommender.recommend(userwithnopreferencesdata, 10);
> The code properly throws NoSuchUserException; however, one of the 
> connections is left hanging on the LongPrimitiveIterator backed by 
> org.apache.mahout.cf.taste.impl.model.jdbc.AbstractJDBCDataModel$ResultSetIDIterator, 
> because the exception is thrown before TopItems.getTopUsers finishes its 
> while loop:
> public static long[] getTopUsers(int howMany,
>                                    LongPrimitiveIterator allUserIDs,
>                                    Rescorer<Long> rescorer,
>                                    Estimator<Long> estimator) throws TasteException {
>     Queue<SimilarUser> topUsers =
>         new PriorityQueue<SimilarUser>(howMany + 1, Collections.reverseOrder());
>     boolean full = false;
>     double lowestTopValue = Double.NEGATIVE_INFINITY;
> //HERE IS THE ITERATOR
>     while (allUserIDs.hasNext()) {
>       long userID = allUserIDs.next();
>       if (rescorer != null && rescorer.isFiltered(userID)) {
>         continue;
>       }
> //EXCEPTION THROWN HERE CAUSES THE CONNECTION LEAK
>       double similarity = estimator.estimate(userID);
>       double rescoredSimilarity =
>           rescorer == null ? similarity : rescorer.rescore(userID, similarity);
>       if (!Double.isNaN(rescoredSimilarity)
>           && (!full || rescoredSimilarity > lowestTopValue)) {
>         topUsers.add(new SimilarUser(userID, similarity));
>         if (full) {
>           topUsers.poll();
>         } else if (topUsers.size() > howMany) {
>           full = true;
>           topUsers.poll();
>         }
>         lowestTopValue = topUsers.peek().getSimilarity();
>       }
>     }
>     if (topUsers.isEmpty()) {
>       return NO_IDS;
>     }
>     List<SimilarUser> sorted = new ArrayList<SimilarUser>(topUsers.size());
>     sorted.addAll(topUsers);
>     Collections.sort(sorted);
>     long[] result = new long[sorted.size()];
>     int i = 0;
>     for (SimilarUser similarUser : sorted) {
>       result[i++] = similarUser.getUserID();
>     }
>     return result;
>   }
> ============================================================================================================
> I have worked around it in our application for now by first checking whether 
> the user has preferences for the given dataset (the user might exist and have 
> preferences for a different dataset).
> However, this edge case does not cause issues in some other recommenders as 
> long as we handle the NoSuchUserException.
> An easy solution is to always use AbstractJDBCDataModel$ResultSetIDIterator 
> inside try/catch/finally and release the connection in the finally block.
