I think equals() and hashCode() ought to be retained, at least for
consistency. There is no sin in implementing these per se. If hashCode() is
too slow in some context, it can be cached if that matters enough. This
is what String does.
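
A rough sketch of the caching idea, in the style String uses (compute on
first call, remember the result). CachedHashVector is a hypothetical class
for illustration only, not Mahout's Vector API, and it assumes the values
never change after construction; a mutable vector would have to reset the
cached value on every write.

```java
import java.util.Arrays;

// Hypothetical immutable vector used only to illustrate hash caching.
final class CachedHashVector {
    private final double[] values;
    // 0 means "not computed yet"; a vector whose real hash is 0 is simply
    // recomputed on each call, which is correct but not cached.
    private int hash;

    CachedHashVector(double... values) {
        this.values = values.clone(); // defensive copy keeps the object immutable
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof CachedHashVector)) return false;
        // Still O(n): caching the hash does not make equals() cheaper.
        return Arrays.equals(values, ((CachedHashVector) o).values);
    }

    @Override
    public int hashCode() {
        int h = hash;
        if (h == 0) {
            h = Arrays.hashCode(values); // O(n), paid once per instance
            hash = h;
        }
        return h;
    }

    public static void main(String[] args) {
        CachedHashVector v = new CachedHashVector(1.0, 2.0, 3.0);
        System.out.println(v.hashCode() == v.hashCode());
    }
}
```

Note that this only removes the repeated hashing cost. A lookup that lands
on a matching hash still calls equals(), which walks the values.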

However, I doubt vectors should be used as map keys in general. If there
is a performance problem, that is the thing to fix.
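
To make the alternatives from the quoted message concrete, here is a small
sketch comparing value-based keys, IdentityHashMap, and keying by an id.
It uses List<Double> as a stand-in for a vector with value-based equals(),
and the integer ids are made up for the example; none of this is Mahout
code.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;

final class VectorKeyDemo {
    // Value-based keys: each put/get hashes and compares every element,
    // and two equal-valued vectors collapse into one entry.
    static int countWithValueKeys(List<Double> a, List<Double> b) {
        Map<List<Double>, String> m = new HashMap<>();
        m.put(a, "first");
        m.put(b, "second");
        return m.size();
    }

    // Identity keys: uses == and System.identityHashCode, so the cost does
    // not depend on vector length, but equal-valued vectors stay distinct.
    static int countWithIdentityKeys(List<Double> a, List<Double> b) {
        Map<List<Double>, String> m = new IdentityHashMap<>();
        m.put(a, "first");
        m.put(b, "second");
        return m.size();
    }

    // Keying by an id (hypothetical documentId values) keeps the vector
    // as the map value instead of the key.
    static Map<Integer, List<Double>> byId(List<Double> a, List<Double> b) {
        Map<Integer, List<Double>> m = new HashMap<>();
        m.put(1, a);
        m.put(2, b);
        return m;
    }

    public static void main(String[] args) {
        List<Double> a = Arrays.asList(1.0, 2.0);
        List<Double> b = Arrays.asList(1.0, 2.0); // equal values, different object
        System.out.println(countWithValueKeys(a, b));    // 1
        System.out.println(countWithIdentityKeys(a, b)); // 2
        System.out.println(byId(a, b).size());           // 2
    }
}
```

The identity map changes semantics as well as cost, so it only fits when
callers really mean "this object" rather than "these values".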

Do you want people to use equals()? Dunno, it's up to the caller really.
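
For what the floating point concern looks like in practice, a minimal
sketch using plain double[] arrays (the helper names and the tolerance are
assumptions for the example). A caller who wants "close enough" can use a
separate tolerance check rather than equals(); a tolerance-based comparison
is not transitive, so it cannot safely back equals()/hashCode() itself.

```java
import java.util.Arrays;

final class ApproxEqualsDemo {
    // Exact comparison, element by element, as a value-based equals() would do.
    static boolean exactlyEqual(double[] a, double[] b) {
        return Arrays.equals(a, b);
    }

    // Tolerance-based comparison for callers who accept small rounding error.
    // Written as !(diff <= eps) so that NaN never compares as equal.
    static boolean approxEqual(double[] a, double[] b, double eps) {
        if (a.length != b.length) return false;
        for (int i = 0; i < a.length; i++) {
            if (!(Math.abs(a[i] - b[i]) <= eps)) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        double[] x = {0.1 + 0.2}; // 0.30000000000000004 in IEEE 754 doubles
        double[] y = {0.3};
        System.out.println(exactlyEqual(x, y));      // false
        System.out.println(approxEqual(x, y, 1e-9)); // true
    }
}
```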

Sean
On Feb 23, 2012 5:25 PM, "Jake Mannix" <[email protected]> wrote:

> Hey Devs.
>
>  Was prototyping some stuff in Mahout last night, and noticed something
> I'm not sure if we've talked about before: because we have equals() for
> Vector instances return true iff the numeric values of the vectors are
> equal, and we also have a consistent hashCode(), anytime you have
> HashMap<Vector, Anything>, all the typical things you think are O(1) are
> really O(vector.numNonZeroes()).  I tried to look through the codebase and
> see where we hang onto maps with vector keys, and we do it sometimes.
>  Maybe we shouldn't?  Most Vectors have identities (clusterId, documentId,
> topicId, etc...) which we could normalize away... or maybe we should be
> using IdentityHashMap, to ensure you're using strict object identity and
> avoid doing this calculation?  This could be really slow if these are big
> dense vectors, for instance.
>
>  This looks like it could be a really easy place to accidentally add heavy
> complexity to things.  Do we really want people to be checking
> *mathematical* equals() on vectors which have floating point precision?
>
>  -jake
>
