On Tue, Aug 16, 2011 at 3:28 PM, Jake Mannix <[email protected]> wrote:

> > The need on the recommendation side was to have IDs that would not collide
> > without having to check.  That is a bit different from the matrix world
> > where you have a conceptually dense set of integer indexes.
> >
>
> Why is it conceptually different than, say, the old DocumentVectorizer, which
> takes a random jumble of vocabulary, and creates a dictionary, which is a
> strictly no-collision mapping of (term: string) <-> (termId: int)?
>

Using longs gives an acceptably small probability of collision without a
dictionary.
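
A rough way to check that claim is the birthday bound. The sketch below assumes
the IDs are spread uniformly over the long range (for example, by hashing the
original keys), which is an assumption and not something stated in this thread.
The function name and parameters are made up for illustration.

```python
import math


def collision_probability(n_ids: int, bits: int = 64) -> float:
    """Approximate chance that at least two of n_ids uniformly random
    IDs collide in a space of 2**bits values (birthday bound):

        P ~= 1 - exp(-n * (n - 1) / 2**(bits + 1))
    """
    space = 2.0 ** bits
    # expm1 keeps precision when the probability is very small.
    return -math.expm1(-n_ids * (n_ids - 1) / (2.0 * space))


if __name__ == "__main__":
    for n in (10**6, 10**8, 10**9):
        p64 = collision_probability(n, bits=64)
        p32 = collision_probability(n, bits=32)
        print(f"n={n:>13,}  64-bit: {p64:.3e}  32-bit: {p32:.3e}")
```

Under that assumption, a million IDs in a 64-bit space collide with probability
well below one in a million, while the same million IDs in a 32-bit space
collide almost certainly. Whether the 64-bit figure is "acceptably small" at
larger scales depends on the application.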


> Why not do the same thing in the recommender world (other than for legacy
> reasons), for user and item ids?


Could be done any time a dictionary is acceptable.  Sometimes it isn't.
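
To make the trade-off concrete, here is a minimal sketch of the two approaches.
Both `Dictionary` and `hashed_id` are hypothetical names, not Mahout APIs, and
using the first 8 bytes of an MD5 digest is just one possible choice of hash.
The dictionary guarantees no collisions but has to be built, stored, and shared
by everything that assigns IDs. The hash needs no shared state, at the cost of
a small collision probability.

```python
import hashlib


class Dictionary:
    """Collision-free mapping from keys to dense ints.

    Requires state: every producer of IDs must consult the same mapping.
    """

    def __init__(self) -> None:
        self._ids: dict[str, int] = {}

    def id_for(self, key: str) -> int:
        # A new key gets the next unused index; a known key keeps its index.
        return self._ids.setdefault(key, len(self._ids))


def hashed_id(key: str) -> int:
    """Stateless mapping from a key to a signed 64-bit ID.

    Any process can compute it independently, but two distinct keys
    may (rarely) map to the same value.
    """
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big", signed=True)


if __name__ == "__main__":
    d = Dictionary()
    for user in ("alice", "bob", "alice"):
        print(user, d.id_for(user), hashed_id(user))
```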
