Re: Primitives vs Objects (the Movie!)

Avery Ching Tue, 06 Sep 2011 23:33:36 -0700

On 9/6/11 10:05 PM, Jake Mannix wrote:

On Wed, Sep 7, 2011 at 1:02 AM, Avery Ching <[email protected]<mailto:[email protected]>> wrote:
    Yeah, one edge is pretty silly.  To get some real numbers, I
    should try it out with a more realistic (power-law distributed)
    bit of synthetic data.
    Agreed.
I'll see if I can write up some simple extensions of the PageRankbenchmark to test this out. Addedhttps://issues.apache.org/jira/browse/GIRAPH-26 to track it.
    Lots of questions here, I'll try my best. =)  Basically the idea
    is that as a sorted map, the user would be able to know that the
    edges are sorted by their destination ids.  Range based edge
    queries are possible (i.e. the edges with destinations from
    com.yahoo.www to com.zillow.www).  Basically this would give the
    users a bit more functionality than a basic map.  A list would
    require a full scan to find/remove an edge.
I see, that makes sense. I wonder if inverting the API a bit wouldmake sense, instead of exposing such a concrete inner domain objectlike the actual SortedMap? By this, I mean: if you want to search /subselect edges by some criterion, then instead of having the caller do
  for(Edge<E,V> edge : getOutEdgeMap().values()) {
    // select edges you want to do something with
  }

  you expose to the caller:
public Collection<Edge<E,V>> filterEdges(Predicate<Edge<E,V>>filter), which the caller can use like
  for(Edge<E,V> edge : filterEdges(new Predicate<Edge<E,V>>() {
    boolean apply(Edge<E,V> e) {
return e.getVertexId().compareTo("com.zillow.www") >= 0 &&e.getVertexId().compareTo("com.yahoo.www") < 0;
    }
  }) {
    // do stuff on the edges you wanted to get
  }
I'm not sure that this is precisely the right API, but exposing theinner SortedMap definitely has a "leaky abstraction" smell to it tome, especially where there are no examples or algorithmimplementations in the codebase which currently *need* this to beexposed in such a way.

We could expose just the Map and provide helper methods as you suggest.It's a thought. You're right that it's not used in this way in anyexamples.

    As far as why not (Sorted)Map<I, E>, I believe we actually used to
    have that interface and I changed it use the Edge class.  This was
    primarily done since Edge objects would be used for edge requests
    (i.e. void addEdgeRequest(I sourceVertexId, Edge<I, E> edge)
    stores the add edge request with the Edge object in a list.  I
    think I also changed the user interfaces to use the Edge object
    since I initially thought it might make usage it a little clearer
    for users (i.e. edge.getDestVertexid() and some of the
    serialization a little simpler, but looking at it now, that might
    not be the case.  We can probably go back to Map<I, E> or
    SortedMap<I, E> to save some memory and internally use the Edge
    object to store the add edge requests.
I can see how the Edge object is nice, it does seem like a better apithen basically just a Map<I,E>. Why not a sorted List<Edge<I,E>>,then, where the Comparator ignores the edge value, but only sorts bythe dest vertexId? Because the contract of List doesn't *require* thesort order to remain fixed, I guess? This yet again underscores forme the "spidey-sense" wrongness of exposing the SortedMap exactly asit is. Maybe just expose the methods on a map that are expected to beneeded, and methods can be added as time goes on?

I think Map over List makes sense for the edges to find them quickly (mypersonal opinion).

    For my present case, I can probably hack around it by making a
    virtual
    SortedMap<LongWritable, Edge<LongWritable,FloatWritable>> which
    implements that interface, yet is backed by a
    primitive map, but I'm trying to understand what the API is
    trying to support...
    Hope my explanation helped somewhat.
Yes, definitely. But I've run into some ugly ugliness when I tried torun my code:
Problem 1) static mutable state (ew!) - lots of references toVertex.setSuperStep(), Vertex.setNumVertices(), Vertex.setNumEdges(),Vertex.setGraphMapper(), etc... which means that implemetingBasicVertex / MutableVertex isn't good enough, you have to actually bea subclass of Vertex! :\

Oh, you found our dirty little secret. =) Yes, we do have staticmutable state. The concern is that if every vertex actually keeps trackof all those things as members, I think a decent amount of memory wouldbe wasted. I'm sure there is a cleaner way to implement this though(i.e. Flyweight pattern perhaps), just never got around to fixing it.

Problem 2) Vertex.class is directly referenced as the base class inplaces like BspUtils, etc.

We never expected to have another base Vertex class. =)

So some refactoring may be necessary to even try out this Object ->primitive test (well, my code *compiles*, but won't run on account ofproblems 1 and 2 above - runtime exceptions galore), even after I"faked" a SortedMap by implementing a slimmed down delegator class.
Most likely the easiest solution would be to pull out a AbstractVertexclass, which holds this ugly static state, and make said stateprotected, instead of private, and have Vertex andLongDoubleFloatDoubleVertex both subclass it, and change reflectionreferences from setClass(VERTEX_CLASS, vertexClass, Vertex.class);to setClass(VERTEX_CLASS, vertexClass, AbstractVertex.class);

That sounds reasonable. One thing we have thought about is compressingthe vertices in memory at the cost of some CPU. While this isorthogonal, they are both efforts to conserve memory.

But the *right* solution is to remove that static stuff. Why is itthere, anyways? Why doesn't GraphMapper just call
            Vertex.setSuperstep(superstep);
            Vertex.setNumVertices(serviceWorker.getTotalVertices());
            Vertex.setNumEdges(serviceWorker.getTotalEdges());
directly on each of the BasicVertex'sin serviceWorker.getVertexRangeMap()?

Memory consumption (see above), these are aggregate members for all thevertices.

  -jake

Re: Primitives vs Objects (the Movie!)

Reply via email to