I'm among the interested rather than the knowledgeable ;)

I also vote +1 for long. Since Taste is a helper for building a
recommendation system, a gain in performance matters much more to me than
a gain in flexibility. Once the recommended item IDs are obtained
(generally only 5 or 10 items), everything else can be done pretty quickly.
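
To illustrate what I mean, here is a minimal sketch of the separate
key-to-number mapping that Sean mentions below. IdMapping, idFor and keyFor
are names I made up for this example; nothing like this exists in Taste as
far as I know, and a real application would more likely keep the mapping in
its database.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper (not part of Taste): maps String keys to dense long IDs and back. */
final class IdMapping {
  private final Map<String, Long> toId = new HashMap<>();
  private final List<String> toKey = new ArrayList<>();

  /** Returns the long ID for a key, assigning the next free ID the first time the key is seen. */
  long idFor(String key) {
    Long existing = toId.get(key);
    if (existing != null) {
      return existing;
    }
    long id = toKey.size();
    toId.put(key, id);
    toKey.add(key);
    return id;
  }

  /** Translates a long ID (e.g. one returned by a recommender) back to the original key. */
  String keyFor(long id) {
    // In-memory sketch only: assumes fewer than Integer.MAX_VALUE distinct keys.
    return toKey.get((int) id);
  }
}
```

The translation back only has to run for the handful of recommended items,
so its cost should be negligible next to the recommendation itself.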

Regards,

On Mon, Aug 3, 2009 at 5:22 PM, Sean Owen <sro...@gmail.com> wrote:

> On Mon, Aug 3, 2009 at 8:52 PM, Grant Ingersoll<gsing...@apache.org>
> wrote:
> > Why is long less flexible?  I mean, I get that Comparable is an interface
> > and thus can be most anything, but really Taste just needs a way of
> > identifying something uniquely right?  long satisfies that, no?
>
> Really, it's that Strings are possible now too (and in theory other
> stuff, but those would be by far the most common non-numeric type).
> Yes, the framework doesn't care what it is. Right now I can have keys
> like "A09BC3" and now this change would make that impossible. You'd
> have to maintain, separately, a mapping between your keys and some
> numeric identifier, if this were the case.
>
> > I guess my question is mainly along the lines of how users interact with
> > said id.  I would suspect it is then used as a key into a database or a
> > map or something like that right?  Are they going to be then forced to
> > constantly box it to Long?   I think it is reasonable to push those
>
> (To be clear I'm suggesting long primitives, not Long objects -- the
> point being to avoid the Object overhead entirely.)
>
> I also perceive it's usually a *numeric* key which is why this could
> make sense to assume.
>
> > questions out to users to answer while focusing on being as lean as
> > possible.  After all, Taste is a library, not an application, so in order
> > for it to appeal to a broad set of users, it needs to be lean and fast
> > and make as few assumption as possible.  Comparable is, in some sense, a
> > bigger assumption than long. Perhaps that stuff can be layered on top
> > while the core just uses long.
>
> I suppose I view long as a stronger assumption -- more limiting --
> since it assumes you use a numeric type for keys, as opposed to merely
> assuming it is something with an ordering, which could be a String or
> a... or a... well String is really the only other imaginable, common
> use case. Before you could use a user name like 'srowen' as an ID and
> now the assumption means you can't.
>
> > Also, are there other places where memory could be saved first?
>
> This is definitely next up on the list of memory consumers. Right now
> roughly half the heap is storing arrays of Integers, in my particular
> test case (but one that's pretty representative). And if those take 32
> bytes compared to 8 bytes for a long (which has a satisfyingly larger
> range to boot), looking at a 37.5% overall savings or so.
>
> Then the overhead is really the 'indexes' like the specialized Maps
> (which already do linear probing instead of separate chaining -- no
> Map.Entry objects). Their storage would come down somewhat too.
>
> I am also sure performance would increase -- avoiding millions and
> millions of method calls for hashCode() and equals() and compareTo()
> at least. Not to mention less GC pressure.
>
> I don't know of other 'big wins'. Well, can think of some that are
> specific to slope-one, but those are less interesting than the ones
> that could affect many implementations -- those affecting the model.
>
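
For readers unfamiliar with the linear probing mentioned in the quoted
message, here is a rough sketch of an open-addressing set over primitive
long keys. It is not Taste's actual implementation; LongOpenHashSet and all
of its members are hypothetical names, and it assumes Long.MIN_VALUE is
never used as a real ID so that it can mark empty slots.

```java
import java.util.Arrays;

/** Hypothetical sketch: a set of primitive longs using linear probing, with no entry objects. */
final class LongOpenHashSet {
  // Sentinel for an empty slot; assumes this value never occurs as a real ID.
  private static final long EMPTY = Long.MIN_VALUE;

  private long[] slots = new long[16]; // length is always a power of two
  private int size;

  LongOpenHashSet() {
    Arrays.fill(slots, EMPTY);
  }

  /** Adds a key; returns false if it was already present. */
  boolean add(long key) {
    if (key == EMPTY) {
      throw new IllegalArgumentException("Long.MIN_VALUE is reserved");
    }
    // Keep the table at most half full so a probe always reaches an empty slot.
    if ((size + 1) * 2 > slots.length) {
      grow();
    }
    int i = indexOf(slots, key);
    if (slots[i] == key) {
      return false;
    }
    slots[i] = key;
    size++;
    return true;
  }

  boolean contains(long key) {
    return key != EMPTY && slots[indexOf(slots, key)] == key;
  }

  int size() {
    return size;
  }

  /** Returns the slot holding the key, or the empty slot where it would be stored. */
  private static int indexOf(long[] table, long key) {
    int mask = table.length - 1;
    int i = ((int) (key ^ (key >>> 32))) & mask;
    while (table[i] != EMPTY && table[i] != key) {
      i = (i + 1) & mask; // linear probing: step to the next slot, wrapping around
    }
    return i;
  }

  /** Doubles the table and re-inserts every stored key. */
  private void grow() {
    long[] old = slots;
    long[] bigger = new long[old.length * 2];
    Arrays.fill(bigger, EMPTY);
    for (long k : old) {
      if (k != EMPTY) {
        bigger[indexOf(bigger, k)] = k;
      }
    }
    slots = bigger;
  }
}
```

Each lookup here compares primitives directly, so there are no hashCode(),
equals() or compareTo() calls and no per-key object on the heap.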
