Hi Marko,

I definitely like this, and it would be closer to what we are doing with OrientDB's distributed architecture. How quickly do you think you could create a draft of this in TP 3.x?

On 13 January 2017 at 12:23, Marko Rodriguez <[email protected]> wrote:

> Hi,
>
> *CURRENT*: The GraphComputer framework assumes “vertex-centric”
> computing. That is, a vertex receives a message and does something with
> it. Moreover, it can send messages to other vertices.
>
> We got this wrong and I think we should do it right with GraphActors.
>
> *FUTURE*: The GraphActors framework assumes “partition-centric”
> computing. That is, a partition receives a message and does something
> with it. Moreover, it can send messages to other partitions.
>
> ----
>
>     VertexProgram.execute(final Vertex vertex, Iterator<M> messages)
>
> should have been:
>
>     PartitionProgram.execute(Partition partition, Iterator<M> messages)
>
> In fact, ActorProgram’s execute() method is defined as:
>
>     ActorProgram.execute(M message)
>
> 1. Every Actor owns a Partition and thus, you don’t need to pass in the
>    Partition.
> 2. To support ASP (asynchronous) and BSP (synchronous) computing, you
>    don’t provide an Iterator<M>, just an M as they come through
>    (event-driven).
> 3. All partitions are assumed to have random access capabilities. All
>    the data in the partition is randomly accessible.
> 4. A partition is a generalization of GraphComputer’s Vertex, where at
>    the micro-limit, every Vertex is in its own Partition. This is how we
>    think about SparkGraphComputer, GiraphGraphComputer, etc.: the “star
>    graph.” However, by generalizing to larger subgraphs than just
>    Vertex, we can have more work being done per iteration in
>    SparkGraphComputer, etc. Moreover, by generalizing to partition, we
>    don’t have to have all edges of a vertex co-located and thus, can
>    support edge-cut systems (like DSEGraph).
>
> So, what does this mean for the future? This injection from
> “vertex-centric” to “partition-centric” allows us to easily create
> SparkGraphActors. Next, how do you verify if a traversal will be able to
> legally execute against the underlying GraphActors system? It depends on
> “the rules” of the Partitioner. A Partitioner should have Features which
> define the boundaries of its data sphere. By looking at those Features
> and looking at the semantics of the Traversal, it is possible to ensure
> that the Traversal will work against the Features. If not,
> ActorVerificationException. If so, execute it.
>
> In conclusion, I’m starting to see GraphComputer as our OLAP 1.0 and
> GraphActors as our OLAP/OLTP 2.0. I put OLTP in there because with
> systems like Akka that don’t require big bulk data migrations, you can
> execute against the Graph connection object. Even with SparkGraphActors,
> you could just have workers that work against Graph connection objects
> (the only RDD data is messages!). Thus, with GraphActors, we start to
> smear the concepts of OLAP and OLTP.
>
> Anywho, I think if we get GraphActors right, we will solve many of the
> shortcomings of GraphComputer while, at the same time, providing a
> powerful distributed graph computing framework.
>
> Take care,
> Marko.
>
> http://markorodriguez.com
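
To make the quoted proposal a bit more concrete, here is a minimal Java sketch of partition-centric, event-driven execution. Only the `ActorProgram.execute(M message)` shape and the `ActorVerificationException` name come from the message above; everything else (`Feature`, `Partitioner`, `Partition`, `Visit`, `VisitActor`, `verify`, `run`) is a hypothetical name made up for illustration and is not TinkerPop API. A single in-memory queue stands in for the real messaging layer, so this shows the idea only, not how GraphActors would actually be built.

```java
import java.util.ArrayDeque;
import java.util.EnumSet;
import java.util.HashMap;
import java.util.Map;
import java.util.Queue;
import java.util.Set;

public class PartitionCentricSketch {

    /** Hypothetical capability flags a Partitioner could advertise. */
    enum Feature { RANDOM_ACCESS, CO_LOCATED_EDGES }

    /** Thrown when a program needs a feature the partitioner lacks. */
    static final class ActorVerificationException extends RuntimeException {
        ActorVerificationException(String reason) { super(reason); }
    }

    /** Mirrors the single-message signature quoted in the email. */
    interface ActorProgram<M> {
        void execute(M message);
    }

    /** Toy message (assumption): visit a vertex, then hop onward `hops` more times. */
    static final class Visit {
        final int vertexId;
        final int hops;
        Visit(int vertexId, int hops) { this.vertexId = vertexId; this.hops = hops; }
    }

    /** Toy partition (assumption): randomly accessible per-vertex visit counts. */
    static final class Partition {
        final Map<Integer, Integer> visits = new HashMap<>();
    }

    /** Toy partitioner (assumption): assigns vertices by id modulo the partition count. */
    static final class Partitioner {
        final int partitionCount;
        final Set<Feature> features;
        Partitioner(int partitionCount, Set<Feature> features) {
            this.partitionCount = partitionCount;
            this.features = features;
        }
        int partitionOf(int vertexId) { return Math.floorMod(vertexId, partitionCount); }
    }

    /** Rejects a program whose required features the partitioner does not offer. */
    static void verify(Set<Feature> required, Partitioner partitioner) {
        if (!partitioner.features.containsAll(required)) {
            throw new ActorVerificationException("partitioner lacks one of " + required);
        }
    }

    /** An actor owns its partition, so execute() takes only the message. */
    static final class VisitActor implements ActorProgram<Visit> {
        final Partition partition = new Partition();
        private final Queue<Visit> outbox;
        VisitActor(Queue<Visit> outbox) { this.outbox = outbox; }

        @Override
        public void execute(Visit message) {
            // Random access into the local partition: bump this vertex's count.
            partition.visits.merge(message.vertexId, 1, Integer::sum);
            // Send a message onward; the target may live in another partition.
            if (message.hops > 0) {
                outbox.add(new Visit(message.vertexId + 1, message.hops - 1));
            }
        }
    }

    /** Single-threaded stand-in for the messaging layer: one message at a time. */
    static VisitActor[] run(Partitioner partitioner, Visit seed) {
        verify(EnumSet.of(Feature.RANDOM_ACCESS), partitioner);
        Queue<Visit> mailbox = new ArrayDeque<>();
        VisitActor[] actors = new VisitActor[partitioner.partitionCount];
        for (int i = 0; i < actors.length; i++) {
            actors[i] = new VisitActor(mailbox);
        }
        mailbox.add(seed);
        while (!mailbox.isEmpty()) {
            Visit next = mailbox.remove();
            // Route each message to the actor that owns the target vertex.
            actors[partitioner.partitionOf(next.vertexId)].execute(next);
        }
        return actors;
    }

    public static void main(String[] args) {
        Partitioner partitioner = new Partitioner(2, EnumSet.of(Feature.RANDOM_ACCESS));
        VisitActor[] actors = run(partitioner, new Visit(0, 3));
        for (int i = 0; i < actors.length; i++) {
            System.out.println("partition " + i + " -> " + actors[i].partition.visits);
        }
    }
}
```

With one vertex per partition this collapses to the vertex-centric case, and a program that required `CO_LOCATED_EDGES` would be rejected by `verify` before any message is delivered, which is roughly the Features check the message describes.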
