You could do the deep check only when the hashcodes are equal, and design the hashcodes so they don't take all elements into account (keeping them cheap to compute over large arrays).
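A minimal sketch of that guard in Java (class and method names are hypothetical, not GraphX code): compare a cheap sampled hash first, and fall back to the expensive element-by-element comparison only when the hashes match.

```java
import java.util.Arrays;

class IndexCompare {
    // Hypothetical cheap hash that samples only ~16 elements rather than
    // hashing the whole array, as suggested above. Unequal hashes prove the
    // arrays differ; equal hashes may collide, so a deep check follows.
    static int sampledHash(long[] index) {
        int h = index.length;
        int step = Math.max(1, index.length / 16);
        for (int i = 0; i < index.length; i += step) {
            h = 31 * h + Long.hashCode(index[i]);
        }
        return h;
    }

    static boolean sameIndex(long[] a, long[] b, int hashA, int hashB) {
        if (a == b) return true;           // fast reference-equality check
        if (hashA != hashB) return false;  // cheap hash rules out inequality
        return Arrays.equals(a, b);        // deep check only on a hash match
    }
}
```

The hashes would be computed once and cached alongside each index, so repeated comparisons stay O(1) unless the hashes actually collide.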
The alternative seems to be putting cache statements all over GraphX, as is currently the case, which is trouble for any long-lived application where caching is carefully managed. I think? I am currently forced to call unpersist on the vertices after almost every intermediate graph transformation, or accept my RDD cache getting polluted.

On Jul 7, 2014 12:03 AM, "Ankur Dave" <ankurd...@gmail.com> wrote:
> Well, the alternative is to do a deep equality check on the index arrays,
> which would be somewhat expensive since these are pretty large arrays (one
> element per vertex in the graph). But, in case the reference equality check
> fails, it actually might be a good idea to do the deep check before
> resorting to the slow code path.
>
> Ankur <http://www.ankurdave.com/>