On Mon, Aug 3, 2009 at 8:52 PM, Grant Ingersoll <gsing...@apache.org> wrote:
> Why is long less flexible? I mean, I get that Comparable is an interface
> and thus can be most anything, but really Taste just needs a way of
> identifying something uniquely right? long satisfies that, no?

Really, it's that Strings are possible now too (and in theory other types, but Strings would be by far the most common non-numeric type). Yes, the framework doesn't care what the key is. Right now I can have keys like "A09BC3", and this change would make that impossible. You'd have to maintain, separately, a mapping between your keys and some numeric identifier.

> I guess my question is mainly along the lines of how users interact with
> said id. I would suspect it is then used as a key into a database or a map
> or something like that right? Are they going to be then forced to
> constantly box it to Long? I think it is reasonable to push those

(To be clear, I'm suggesting long primitives, not Long objects -- the point being to avoid the Object overhead entirely.) I also perceive that it's usually a *numeric* key, which is why this could be a sensible assumption.

> questions out to users to answer while focusing on being as lean as
> possible. After all, Taste is a library, not an application, so in order
> for it to appeal to a broad set of users, it needs to be lean and fast and
> make as few assumption as possible. Comparable is, in some sense, a bigger
> assumption than long. Perhaps that stuff can be layered on top while the
> core just uses long.

I suppose I view long as a stronger assumption -- more limiting -- since it assumes you use a numeric type for keys, as opposed to merely assuming the key is something with an ordering, which could be a String or a... or a... well, String is really the only other imaginable, common use case. Before, you could use a user name like 'srowen' as an ID, and now the assumption means you can't.

> Also, are there other places where memory could be saved first?

This is definitely next up on the list of memory consumers. Right now roughly half the heap is storing arrays of Integers, in my particular test case (but one that's pretty representative).
And if those take 32 bytes compared to 8 bytes for a long (which has a satisfyingly larger range to boot), we're looking at a 37.5% overall savings or so: a 75% reduction applied to the half of the heap that holds them.

Then the remaining overhead is really the 'indexes' like the specialized Maps (which already do linear probing instead of separate chaining -- no Map.Entry objects). Their storage would come down somewhat too. I am also sure performance would increase -- avoiding millions and millions of method calls for hashCode() and equals() and compareTo() at least. Not to mention less GC pressure.

I don't know of other 'big wins'. Well, I can think of some that are specific to slope-one, but those are less interesting than the ones that could affect many implementations -- those affecting the model.
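
To make the point about primitive keys concrete, here is a rough sketch of what a long-keyed index with linear probing could look like. It is not Taste's actual Map implementation; the class name LongIdSet, the reserved Long.MIN_VALUE sentinel, and the 0.5 load factor are all assumptions made up for illustration. IDs are stored directly in a long[], and lookups compare with == instead of calling hashCode() or equals().

```java
import java.util.Arrays;

/** Hypothetical sketch of an open-addressing set of primitive long IDs. */
final class LongIdSet {

  // Assumption: Long.MIN_VALUE marks an empty slot, so it cannot be used as an ID.
  private static final long EMPTY = Long.MIN_VALUE;

  private long[] slots;
  private int size;

  LongIdSet() {
    slots = new long[16]; // capacity stays a power of two so a mask replaces modulo
    Arrays.fill(slots, EMPTY);
  }

  /** Adds an ID; returns true if it was not already present. */
  boolean add(long id) {
    if (id == EMPTY) {
      throw new IllegalArgumentException("Long.MIN_VALUE is reserved");
    }
    if ((size + 1) * 2 > slots.length) {
      grow(); // keep the table at most half full so probing always finds a gap
    }
    int i = indexFor(id, slots.length);
    while (slots[i] != EMPTY) {
      if (slots[i] == id) {
        return false; // primitive comparison, no equals() call
      }
      i = (i + 1) & (slots.length - 1); // linear probing: try the next slot
    }
    slots[i] = id;
    size++;
    return true;
  }

  /** Returns true if the ID has been added. */
  boolean contains(long id) {
    if (id == EMPTY) {
      return false;
    }
    int i = indexFor(id, slots.length);
    while (slots[i] != EMPTY) {
      if (slots[i] == id) {
        return true;
      }
      i = (i + 1) & (slots.length - 1);
    }
    return false;
  }

  int size() {
    return size;
  }

  // Folds the 64-bit ID into 32 bits, mixes it, and masks it to a slot index.
  private static int indexFor(long id, int capacity) {
    int h = (int) (id ^ (id >>> 32));
    h ^= (h >>> 16);
    return h & (capacity - 1);
  }

  // Doubles the table and reinserts every stored ID.
  private void grow() {
    long[] old = slots;
    slots = new long[old.length * 2];
    Arrays.fill(slots, EMPTY);
    size = 0;
    for (long id : old) {
      if (id != EMPTY) {
        add(id);
      }
    }
  }
}
```

With something like this, each stored ID costs one 8-byte array slot (plus the empty slots kept for probing) and no per-key object. The trade-off is the one discussed above: a key such as "A09BC3" or 'srowen' would first have to be mapped to a long somewhere outside the core.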