[ 
https://issues.apache.org/jira/browse/TINKERPOP-1074?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stephen mallette updated TINKERPOP-1074:
----------------------------------------
    Fix Version/s:     (was: 3.2.7)

> More contractual testing/specifications around Persist and ResultGraph.
> -----------------------------------------------------------------------
>
>                 Key: TINKERPOP-1074
>                 URL: https://issues.apache.org/jira/browse/TINKERPOP-1074
>             Project: TinkerPop
>          Issue Type: Improvement
>          Components: process
>    Affects Versions: 3.1.0-incubating
>            Reporter: Marko A. Rodriguez
>
> A {{ComputerResult}} references two objects: a graph and a memory. The graph 
> is the resultant computed graph and the memory contains all the sideEffect 
> data from the computation (if any).
> Right now, we have the following {{Persist}} options: {{NOTHING}}, 
> {{VERTEX_PROPERTIES}}, {{EDGES}}. We also have the following {{ResultGraph}} 
> options: {{ORIGINAL}}, {{NEW}}.
> * NOTHING + ORIGINAL = ComputerResult contains original graph reference.
> * NOTHING + NEW = ?? No test to force what this means! Should be 
> {{EmptyGraph.instance()}}.
> * VERTEX_PROPERTIES + ORIGINAL = ComputerResult contains original graph, but 
> the computed vertex properties have been "saved" to it. (no contractual test 
> cases here either!)
> * VERTEX_PROPERTIES + NEW = ComputerResult contains new graph with only 
> vertices and their properties.
> * EDGES + NEW = ComputerResult contains new graph with vertices, edges, and 
> their properties.
> * EDGES + ORIGINAL = ComputerResult contains original graph, but the computed 
> vertex properties and edges have been "saved" to it. (no contractual test 
> cases here either!)
> {{TinkerGraphComputer}} is the only system that supports all the above 
> configuration combinations. Add test cases to {{GraphComputerTest}} that 
> verify the behavior of all combinations.
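
The six combinations listed above can be written down as a small lookup that a contract test could assert against. This is only a sketch: the `Persist` and `ResultGraph` enums below are local stand-ins that mirror the names in the issue rather than the TinkerPop classes, `expectedGraph` is a hypothetical helper, and the NOTHING + NEW row encodes the suggested (not yet agreed) empty-graph behavior.

```java
// Hypothetical stand-ins mirroring the option names in the issue; declared
// locally so the sketch compiles without TinkerPop on the classpath.
enum Persist { NOTHING, VERTEX_PROPERTIES, EDGES }
enum ResultGraph { ORIGINAL, NEW }

final class PersistResultGraphMatrix {

    // Describes what ComputerResult's graph should hold for each combination,
    // following the bullet list in the issue description.
    static String expectedGraph(Persist persist, ResultGraph resultGraph) {
        if (resultGraph == ResultGraph.ORIGINAL) {
            if (persist == Persist.NOTHING) return "original graph, untouched";
            if (persist == Persist.VERTEX_PROPERTIES) return "original graph + computed vertex properties";
            return "original graph + computed vertex properties and edges";
        }
        // ResultGraph.NEW
        if (persist == Persist.NOTHING) return "empty graph"; // assumption: the issue only suggests this
        if (persist == Persist.VERTEX_PROPERTIES) return "new graph: vertices and their properties";
        return "new graph: vertices, edges, and their properties";
    }

    public static void main(String[] args) {
        // Enumerate every combination, as a contract test would need to.
        for (ResultGraph resultGraph : ResultGraph.values()) {
            for (Persist persist : Persist.values()) {
                System.out.println(persist + " + " + resultGraph + " = "
                        + expectedGraph(persist, resultGraph));
            }
        }
    }
}
```

A real test in {{GraphComputerTest}} would replace the strings with assertions on the returned graph, and would need to skip combinations a given provider does not support.
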
> HOWEVER !!!! ------ should we really respect ORIGINAL+PERSIST? Most providers 
> will use {{BulkLoaderVertexProgram}} to write the computed graph back to the 
> original graph. If there are TWO ways of doing this, this seems bad? In fact, 
> the way that TinkerGraphComputer writes the computed graph back to the 
> original graph is nearly identical to how BulkLoaderVertexProgram works. 
> Thus, I'm wondering if we simply get rid of the concept of {{ResultGraph}} and 
> ONLY have {{Persist}}.
> * Persist.NOTHING: Returns the original graph in {{ComputerResult}}.
> * Persist.VERTEX_PROPERTIES: Returns a new graph with only vertices and 
> properties.
> * Persist.EDGES: Returns a new graph with vertices, edges, and their 
> properties.
> For in-memory graphs like {{TinkerGraph}}, "new graph" can mean the original 
> graph with the {{GraphView}} overlay. Thus, it's not really a full copy of the 
> original graph. Moreover, Persist.NOTHING just garbage collects the GraphView, 
> leaving only the original graph.
> ------------------
> Next, what does {{Persist}} mean for memory? Remember, {{ComputerResult}} 
> also has a reference to sideEffect memory. What if you want to run a job, NOT 
> persist the graph, but persist the memory only? I think we should ALWAYS 
> assume memory persistence. For TinkerGraph, that means the 
> ComputerResult.memory() has a HashMap of memory values. For Giraph/Spark, 
> that means that the {{Storage}} will always have resultant sideEffect data in 
> the output directory even if there is no graph.
> * {{NOTHING}}: persist memory and return the original graph.
> * {{VERTEX_PROPERTIES}}: persist memory and return new graph of just vertex 
> properties.
> * {{EDGES}}: persist memory and return new graph of vertex properties and 
> edges.
> Decisions, decisions, decisions....
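
A minimal sketch of the Persist-only proposal, in which memory is always carried into the result and `Persist` alone decides which graph comes back. Everything here is hypothetical: `Result` is a simplified stand-in for {{ComputerResult}}, graphs are represented by descriptive strings, and `resultFor` is an illustrative helper, not a TinkerPop method.

```java
import java.util.HashMap;
import java.util.Map;

final class PersistOnlySketch {

    // Local stand-in mirroring the Persist option names in the issue.
    enum Persist { NOTHING, VERTEX_PROPERTIES, EDGES }

    // Hypothetical, simplified stand-in for ComputerResult:
    // a description of the returned graph plus the sideEffect memory.
    static final class Result {
        final String graph;
        final Map<String, Object> memory;

        Result(String graph, Map<String, Object> memory) {
            this.graph = graph;
            this.memory = memory;
        }
    }

    static Result resultFor(Persist persist, Map<String, Object> sideEffects) {
        // Memory is always persisted, regardless of the Persist option.
        Map<String, Object> memory = new HashMap<>(sideEffects);

        String graph;
        switch (persist) {
            case NOTHING:
                graph = "original graph";
                break;
            case VERTEX_PROPERTIES:
                graph = "new graph: vertices and properties";
                break;
            default: // EDGES
                graph = "new graph: vertices, edges, and properties";
                break;
        }
        return new Result(graph, memory);
    }

    public static void main(String[] args) {
        Map<String, Object> sideEffects = new HashMap<>();
        sideEffects.put("iterations", 30); // made-up sideEffect value for illustration

        for (Persist persist : Persist.values()) {
            Result result = resultFor(persist, sideEffects);
            System.out.println(persist + " -> " + result.graph + ", memory=" + result.memory);
        }
    }
}
```

Under this reading there is no separate {{ResultGraph}} setting to test: each `Persist` value has exactly one contract, and the memory assertion is the same for all three.
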



