[ 
https://issues.apache.org/jira/browse/GIRAPH-494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13568068#comment-13568068
 ] 

Alessandro Presta commented on GIRAPH-494:
------------------------------------------

1) Good point. Again, a benchmark would close the argument.
2) I agree with rationalizing the API. I'm not sure a single change like this 
will necessarily be good or bad, because we don't have a bigger picture in mind.

I think the issue is the following: we have many Vertex implementations with 
different storage/performance guarantees. Unfortunately this also carries 
differences in semantics for the API: in one case Edge is a reference that can 
be modified in-place; in the other it just carries a copy of immutable data.
We should get to a point where the API is entirely clear (we can even have a 
couple alternatives, like Vertex/MutableVertex, but not ill-defined as they are 
right now), and all implementations respect it. We should also get rid of those 
that aren't justified and keep only a few reasonable alternatives.
At that point, we can actually decouple algorithm implementation from low-level 
storage details: the user will just extend Vertex (or, say, MutableVertex) and 
define compute(). The underlying storage model can be customized by passing an 
option, as opposed to changing your vertex just to extend a different class. 
This will make a user codebase a lot cleaner.
                
> Edge should be an interface
> ---------------------------
>
>                 Key: GIRAPH-494
>                 URL: https://issues.apache.org/jira/browse/GIRAPH-494
>             Project: Giraph
>          Issue Type: Bug
>            Reporter: Nitay Joffe
>            Assignee: Nitay Joffe
>         Attachments: GIRAPH-494.patch
>
>
> In terms of architecture and for flexibility I think our Edge class should be 
> an interface instead of a real class. In this diff I change it to an 
> interface and add a sub interface called MutableEdge. The existing Edge class 
> is now called DefaultEdge. Note that only one class in our codebase actually 
> needs a MutableEdge - RepresentativeVertex. Everything else works perfectly 
> fine using the immutable Edge interface.
> One nice thing this allowed me to do is to create a EdgeNoValue which we can 
> use for algorithms whose edges have no value at all. Currently the same 
> functionality is achieved by using NullWritable, however using EdgeNoValue 
> means not storing a reference to the single NullWritable instance in every 
> single edge. Working on a job that reads 1B+ edges per worker, a pointer per 
> edge adds up.
> https://reviews.apache.org/r/9172/

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

Reply via email to