This brings up a point about our linear primitives: are 32-bit integers big enough to index our vectors and matrices? Especially for matrices, having billions of rows is entirely possible, even if it is on the large side.
If we want to be about "scalable" machine learning, we really don't want to seal ourselves in to "only" 2 billion x 2 billion matrices in the long run, do we? How hard would it be to promote our ints to longs?

  -jake

On Sat, Dec 5, 2009 at 4:48 AM, Sean Owen <sro...@gmail.com> wrote:
> I'm trying to use Vectors to represent a vector of user preferences.
> All is well since items are numeric and can be used as indexes into a
> Vector -- almost. I have longs, and of course indexes are ints.
>
> I could fold the long IDs into ints without too much worry about the
> effects of collision. However I still need to remember the original
> item IDs for each index. I could do it with labels, but I can't
> retrieve the label for an index (and the other mapping isn't
> serialized anyway?).
>
> So I guess I must separately store this mapping? Just making sure I'm
> not missing something.
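
A minimal sketch of the separate mapping discussed in the quoted message: fold each long ID into a non-negative int index and remember the original ID per index. The class IdIndexMap and its methods are hypothetical names invented for illustration, not part of any existing Vector API, and the fold used here (XOR of the high and low 32 bits, sign bit masked off) is just one assumed choice.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical helper, not an existing library class. */
final class IdIndexMap {
    // index -> first original ID seen for that index
    private final Map<Integer, Long> indexToId = new HashMap<>();
    private int collisions = 0;

    /** Folds a long ID into [0, Integer.MAX_VALUE] and records the original ID. */
    int toIndex(long id) {
        // Long.hashCode XORs the high and low 32 bits; the mask clears the sign bit.
        int index = Long.hashCode(id) & Integer.MAX_VALUE;
        Long previous = indexToId.putIfAbsent(index, id);
        if (previous != null && previous != id) {
            collisions++; // a different ID already owns this index
        }
        return index;
    }

    /** Returns the first ID recorded for this index, or null if none was. */
    Long toId(int index) {
        return indexToId.get(index);
    }

    /** Number of toIndex calls that hit an index owned by a different ID. */
    int collisionCount() {
        return collisions;
    }

    public static void main(String[] args) {
        IdIndexMap map = new IdIndexMap();
        long itemId = 5_000_000_000L; // does not fit in an int
        int index = map.toIndex(itemId);
        System.out.println("index=" + index + " id=" + map.toId(index));

        // 1 and 2^32 fold to the same index, so the second call is a collision.
        map.toIndex(1L);
        map.toIndex(1L << 32);
        System.out.println("collisions=" + map.collisionCount());
    }
}
```

The reverse map keeps only the first ID per index, so a collision loses the second ID; whether that is acceptable depends on how rare collisions are in the data. The map also has to be stored alongside the vectors, since it is not serialized with them.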