On Tue, Aug 16, 2011 at 3:28 PM, Jake Mannix <[email protected]> wrote:
> > The need on the recommendation side was to have id's that would not collide
> > without having to check. That is a bit different from the matrix world
> > where you have a conceptually dense set of integer indexes.
> >
>
> Why is it conceptually different than, say, the old DocumentVectorizer, which
> takes a random jumble of vocabulary, and creates a dictionary, which is a
> strictly no-collision mapping of (term: string) <-> (termId: int)?
>

Using longs gives an acceptably small probability of collision without a dictionary.

> Why not do the same thing in the recommender world (other than for legacy
> reasons), for user and item ids?

Could be done any time a dictionary is acceptable. Sometimes it isn't.
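
To make the trade-off concrete, here is a rough sketch of the two approaches in Python. It is illustrative only, not Mahout code: `hashed_id`, `Dictionary`, and `collision_probability` are hypothetical names, the choice of MD5 is an assumption, and the probability is the standard birthday approximation.

```python
import hashlib
import math


def hashed_id(key: str) -> int:
    """Map a string to a signed 64-bit id with no shared state.

    Assumption: the first 8 bytes of an MD5 digest stand in for
    whatever hash is actually used.
    """
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], byteorder="big", signed=True)


class Dictionary:
    """Strictly collision-free (term: str) <-> (termId: int) mapping.

    The ids are dense, but every writer has to consult this shared table.
    """

    def __init__(self) -> None:
        self._ids = {}
        self._terms = []

    def id_for(self, term: str) -> int:
        # Assign the next dense index the first time a term is seen.
        if term not in self._ids:
            self._ids[term] = len(self._terms)
            self._terms.append(term)
        return self._ids[term]

    def term_for(self, term_id: int) -> str:
        return self._terms[term_id]


def collision_probability(n: int, bits: int = 64) -> float:
    """Birthday approximation: p ~= 1 - exp(-n(n-1) / 2^(bits+1))."""
    return -math.expm1(-n * (n - 1) / 2.0 ** (bits + 1))
```

With a million distinct keys, `collision_probability(10**6)` comes out well below one in a million for 64-bit ids, whereas the same call with `bits=32` is essentially 1. The dictionary never collides, but it only works where building and sharing that table is acceptable.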
