I am +1 for options two or three. If we go with #2, it is better to do it now and get the disruption for users over with early in Giraph's history.

On Wed, Aug 14, 2013 at 7:28 AM, Claudio Martella <[email protected]> wrote:
> Hi Nitay,
>
> I'm +1 for (2). It would remove another explicit leakage of the Hadoop API
> from Giraph. Although it makes clear that the Vertex signature does need to
> support serialization, I agree that it is kind of all over the place (with
> good and bad results, like checkpointing and ooc coming more easily).
> At this point though, Giraph is more mature to take a different path in the
> face of the initial advantage of having Writable API.
>
>
> On Tue, Aug 13, 2013 at 11:25 PM, Nitay Joffe <[email protected]> wrote:
>
> > Hello Friends,
> >
> > I have a diff up that substantially changes our API,
> > https://issues.apache.org/jira/browse/GIRAPH-684, which I would like to
> > get people's vote on.
> >
> > Basically the question is whether we think that forcing the graph types
> > (I,V,E,M1,M2) to be Writable/Comparable is the right thing to do.
> > This requirement means we cannot easily externalize how a type gets
> > serialized (for example if you wanted to test out different ways of
> > serializing an integer).
> > It also makes it more difficult to implement things like the Jython
> > integration because every Jython object must be constantly
> > wrapped/unwrapped in a Writable wrapper in order to conform to Giraph.
> > Personally I have never liked the fact that we have serialization tied to
> > the object in terms of code design patterns, but that is just me.
> > The diff I have up removes the requirement of IVEMM being Writable, and
> > allows you to specify, via a separate parameter, the serializer to use for
> > each type. Note that it is completely backwards compatible. That is, if we
> > detect that you are actually using a Writable then we stick in an internal
> > WritableSerializer (which just calls readFields()/write()) and you do not
> > need to specify anything.
> >
> > The major con of removing the Writable interface is it makes things less
> > clear for our users. So, the potential solutions to vote on here are:
> >
> > 1) Leave things as they are. Let Jython be ugly. Serialization stays tied
> > to the object. No external serializers.
> > 2) Remove Writable everywhere (as diff currently does). Explain to users
> > that they can use Writable or define their own Serializer.
> > 3) Remove Writable _internally_ only. Keep the outermost-facing Java API
> > (Computation, VertexInputFormat, etc) still allowing Writable only, but
> > internally no Writable required. This allows Jython and any expert users to
> > work with the more internal types that have no type restrictions yet leaves
> > our Java API Writable. Don't mention external serializers to average users
> > so our story stays the same. Essentially this option makes things a bit
> > more confusing for developers instead of users.
> >
> > We have discussed this at Facebook (Avery, Alessandro, Maja and myself)
> > for a while, and would like to get your opinions as well since this is a
> > large change.
> > What do folks think? Please weigh in.
> >
> > Thanks,
> > - Nitay
> >
>
>
> --
> Claudio Martella
> [email protected]
>
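
For readers following the thread, here is a rough sketch of the idea behind option 2: serialization lives in a separate object instead of on the type itself, with a fallback that delegates to readFields()/write() when the type happens to be a Writable. This is not the code from GIRAPH-684. The `Serializer` interface, `FixedIntSerializer`, `VarIntSerializer`, and `IntBox` are hypothetical names made up for illustration, the `Writable` interface here is a local stand-in mirroring the two methods of Hadoop's Writable so the example runs without Hadoop, and only the name `WritableSerializer` comes from the thread (its shape is assumed).

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.function.Supplier;

class SerializerSketch {
  /** Local stand-in for Hadoop's Writable (assumption: same two methods). */
  interface Writable {
    void write(DataOutput out) throws IOException;
    void readFields(DataInput in) throws IOException;
  }

  /** Hypothetical external serializer; the real interface in the diff may differ. */
  interface Serializer<T> {
    void write(T value, DataOutput out) throws IOException;
    T read(DataInput in) throws IOException;
  }

  /** Backwards-compatible fallback: just calls write()/readFields(). */
  static final class WritableSerializer<T extends Writable> implements Serializer<T> {
    private final Supplier<T> factory;
    WritableSerializer(Supplier<T> factory) { this.factory = factory; }
    public void write(T value, DataOutput out) throws IOException { value.write(out); }
    public T read(DataInput in) throws IOException {
      T value = factory.get();
      value.readFields(in);
      return value;
    }
  }

  /** Plain 4-byte encoding of an integer. */
  static final class FixedIntSerializer implements Serializer<Integer> {
    public void write(Integer value, DataOutput out) throws IOException { out.writeInt(value); }
    public Integer read(DataInput in) throws IOException { return in.readInt(); }
  }

  /** Variable-length encoding: 7 payload bits per byte, high bit means "more". */
  static final class VarIntSerializer implements Serializer<Integer> {
    public void write(Integer value, DataOutput out) throws IOException {
      int v = value;
      while ((v & ~0x7F) != 0) {
        out.writeByte((v & 0x7F) | 0x80);
        v >>>= 7;
      }
      out.writeByte(v);
    }
    public Integer read(DataInput in) throws IOException {
      int result = 0;
      int shift = 0;
      while (true) {
        int b = in.readByte();
        result |= (b & 0x7F) << shift;
        if ((b & 0x80) == 0) {
          return result;
        }
        shift += 7;
      }
    }
  }

  /** Example user type that still implements the Writable stand-in. */
  static final class IntBox implements Writable {
    int value;
    IntBox() { }
    IntBox(int value) { this.value = value; }
    public void write(DataOutput out) throws IOException { out.writeInt(value); }
    public void readFields(DataInput in) throws IOException { value = in.readInt(); }
  }

  static <T> byte[] toBytes(Serializer<T> s, T value) {
    try {
      ByteArrayOutputStream buf = new ByteArrayOutputStream();
      s.write(value, new DataOutputStream(buf));
      return buf.toByteArray();
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  static <T> T fromBytes(Serializer<T> s, byte[] bytes) {
    try {
      return s.read(new DataInputStream(new ByteArrayInputStream(bytes)));
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    // Same Integer type, two encodings, chosen outside the type itself.
    System.out.println(toBytes(new FixedIntSerializer(), 5).length);
    System.out.println(toBytes(new VarIntSerializer(), 5).length);
    // A Writable type needs no custom serializer; the fallback wraps it.
    Serializer<IntBox> s = new WritableSerializer<>(IntBox::new);
    System.out.println(fromBytes(s, toBytes(s, new IntBox(42))).value);
  }
}
```

The point of the sketch is only that the integer encoding can be swapped without touching the integer type, while existing Writable types keep working through the wrapper.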
