[
https://issues.apache.org/jira/browse/GIRAPH-494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13568068#comment-13568068
]
Alessandro Presta commented on GIRAPH-494:
------------------------------------------
1) Good point. Again, a benchmark would close the argument.
2) I agree with rationalizing the API. I'm not sure a single change like this
will necessarily be good or bad, because we don't have a bigger picture in mind.
I think the issue is the following: we have many Vertex implementations with
different storage/performance guarantees. Unfortunately this also carries
differences in semantics for the API: in one case Edge is a reference that can
be modified in-place; in the other it just carries a copy of immutable data.
We should get to a point where the API is entirely clear (we can even have a
couple alternatives, like Vertex/MutableVertex, but not ill-defined as they are
right now), and all implementations respect it. We should also get rid of those
that aren't justified and keep only a few reasonable alternatives.
At that point, we can actually decouple algorithm implementation from low-level
storage details: the user will just extend Vertex (or, say, MutableVertex) and
define compute(). The underlying storage model can be customized by passing an
option, as opposed to changing your vertex just to extend a different class.
This will make a user codebase a lot cleaner.
> Edge should be an interface
> ---------------------------
>
> Key: GIRAPH-494
> URL: https://issues.apache.org/jira/browse/GIRAPH-494
> Project: Giraph
> Issue Type: Bug
> Reporter: Nitay Joffe
> Assignee: Nitay Joffe
> Attachments: GIRAPH-494.patch
>
>
> In terms of architecture and for flexibility I think our Edge class should be
> an interface instead of a real class. In this diff I change it to an
> interface and add a sub interface called MutableEdge. The existing Edge class
> is now called DefaultEdge. Note that only one class in our codebase actually
> needs a MutableEdge - RepresentativeVertex. Everything else works perfectly
> fine using the immutable Edge interface.
> One nice thing this allowed me to do is to create a EdgeNoValue which we can
> use for algorithms whose edges have no value at all. Currently the same
> functionality is achieved by using NullWritable, however using EdgeNoValue
> means not storing a reference to the single NullWritable instance in every
> single edge. Working on a job that reads 1B+ edges per worker, a pointer per
> edge adds up.
> https://reviews.apache.org/r/9172/
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira