Re: TopItems.getTopUsers()

Sean Owen Tue, 08 Nov 2011 23:24:47 -0800

The PriorityQueue there is a min heap. It's used to keep finding the
smallest among a collection of large values. So, no it can't be used
directly to create a list of items from large to small as it has a
"get smallest" method, not "get largest".


The result of the method is a list of IDs, not a list of item-score
pairs. This is the reason for the last step where it's converted into
just the IDs.

The size of queue here is typically really small, like 10 items. The
queue implementation probably makes no measurable difference at that
size.

On Wed, Nov 9, 2011 at 12:46 AM, Grant Ingersoll <[email protected]> wrote:
> I've been reading code and am wondering about TopItems.getTopUsers
>
> Here's my pseudocode of it (lines 96-134 in TopItems)
> Get a Priority Queue (PQ)
> for all users
>        estimate the similarity of user i with our current user (the one we 
> are generating a rec for)
>        put a SimilarUser object on the PQ.
>
> //HERE
> Create a list of SimilarUser objects
> Populate that list with the PQ objects, which also contains SimilarUser 
> objects
> Sort that list
> Iterate over that list once more and grab the ids and put it into an array
> //HERE
> return the array
>
> Why all that in between moving stuff marked by //HERE?  Isn't that traversing 
> the same list 3 times?  Can't we just pop from the priority queue in reverse 
> order and fill the array from back to front?  Am I missing something?
>
> Here's the same code in Lucene:
> <snip>
> protected void populateResults(ScoreDoc[] results, int howMany) {
>    for (int i = howMany - 1; i >= 0; i--) {
>      results[i] = pq.pop();
>    }
>  }
> </snip>
>
> Also, FWIW, Lucene's PQ implementation is faster than Java's, from what I 
> understand (which is why we wrote our own).
>
> -Grant

Re: TopItems.getTopUsers()

Reply via email to