<rant>
Which joker thought of removing uint from Java?
</rant>

Dan, the cost of moving to 64-bit indices is extra RAM usage. My
experiments show that 32 bits is enough to hash down billions of features.
Do we ever need quadrillions of features? Can machine learning truly
work at that scale? Think about these.
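
A rough way to put numbers on both sides of this tradeoff, sketched in Java.
None of this is Mahout code: the class and method names are hypothetical, the
collision figure assumes a uniformly distributed hash, and the memory figure
counts only the index array of a sparse vector, not the values.

```java
final class IndexWidthSketch {

    // Map an arbitrary 32-bit hash to a non-negative index in [0, numBuckets).
    // Math.floorMod never returns a negative result for a positive divisor,
    // unlike the % operator.
    static int toIndex(int hash, int numBuckets) {
        return Math.floorMod(hash, numBuckets);
    }

    // Expected number of colliding pairs when n keys are hashed uniformly
    // into m buckets: n * (n - 1) / (2 * m).
    static double expectedCollidingPairs(double n, double m) {
        return n * (n - 1) / (2 * m);
    }

    // Bytes needed to store one index per non-zero entry.
    static long indexBytes(long nonZeros, int bytesPerIndex) {
        return nonZeros * bytesPerIndex;
    }

    public static void main(String[] args) {
        // Hypothetical feature name, hashed into the non-negative int range.
        System.out.println(toIndex("some-feature".hashCode(), Integer.MAX_VALUE));

        // Collision estimate for one billion keys at 31 and 63 usable bits.
        System.out.println(expectedCollidingPairs(1e9, Math.pow(2, 31)));
        System.out.println(expectedCollidingPairs(1e9, Math.pow(2, 63)));

        // Index storage for one billion non-zero entries, int vs long.
        System.out.println(indexBytes(1_000_000_000L, Integer.BYTES));
        System.out.println(indexBytes(1_000_000_000L, Long.BYTES));
    }
}
```

Whether the extra collisions matter depends on the learner; the sketch only
shows how the collision count and the index storage scale with the width.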

Robin Anil | Software Engineer | +1 312 869 2602 | Google Inc.


On Wed, Jun 19, 2013 at 5:16 AM, Dan Filimon <[email protected]> wrote:

> Also, this is particularly problematic because indices can't be negative so
> only 2^31 elements are actually possible.
>
>
> On Wed, Jun 19, 2013 at 1:15 PM, Dan Filimon <[email protected]> wrote:
>
> > Hi everyone!
> >
> > The current Vector API only supports 32bit maximum indices for Vectors.
> >
> > I feel that 64bits would be more appropriate especially because the
> > indices are likely to be hash values of other data and 32bit will result
> > in quite a few collisions.
> >
> > Also, for some jobs, notably ItemSimilarityJob, this restriction means
> > that we need a special id to index map where we'll collide anyway.
> >
> > What do you think about adding support for 64bit indices?
> > Is anyone at all interested?
> >
>
