[GitHub] tinkerpop issue #705: make TinkerGraph cloneable
Github user mpollmeier commented on the issue: https://github.com/apache/tinkerpop/pull/705 Thanks @spmallette. I already voted +1 above. Made those changes, merged (--no-ff) to tp32 and then merged tp32 to master (again --no-ff). ---
[GitHub] tinkerpop issue #705: make TinkerGraph cloneable
Github user spmallette commented on the issue: https://github.com/apache/tinkerpop/pull/705 Added a couple of minor nits as comments, but otherwise: VOTE +1 @mpollmeier if you vote yourself, you'll have your 3 +1s and can merge. please be sure to merge tp32 to master as well. ---
[GitHub] tinkerpop issue #705: make TinkerGraph cloneable
Github user robertdale commented on the issue: https://github.com/apache/tinkerpop/pull/705 VOTE +1 ---
[GitHub] tinkerpop issue #705: make TinkerGraph cloneable
Github user mpollmeier commented on the issue: https://github.com/apache/tinkerpop/pull/705 @spmallette addressed your comments. VOTE +1 ---
[GitHub] tinkerpop issue #705: make TinkerGraph cloneable
Github user spmallette commented on the issue: https://github.com/apache/tinkerpop/pull/705 At this point I'd be -1 if we turned this into a "feature". I only thought of this as a convenience to TinkerGraph. As I mentioned before I really don't see why a `clone()` would make sense in most other graph databases. I sort of think of `clone()` as a feature of TinkerGraph the way indexing is a feature of TinkerGraph. So I technically preferred the PR as it was as opposed to a generalized utility function that will work shoddily for large graphs. Anyway, here's the solution I have that should make everyone content. @okram liked this as a utility class but ultimately didn't have strong feeling about it either way. @mpollmeier seemed to make it clear that this was to help with testing. How about we just move `GraphHelper` to gremlin-test. https://github.com/apache/tinkerpop/tree/master/gremlin-test/src/main/java/org/apache/tinkerpop/gremlin Then it is a utility that clearly exists for testing use cases only. TinkerGraph depends on gremlin-test and can thus directly test it's capabilties - maybe just add your "clone" test to: https://github.com/apache/tinkerpop/blob/master/tinkergraph-gremlin/src/test/java/org/apache/tinkerpop/gremlin/tinkergraph/structure/TinkerGraphTest.java @mpollmeier if this is agreeable to you, perhaps wait a few days to see if there are other comments before progressing forward. i'd hate for you to make more changes and then someone yells -1 at you. ---
[GitHub] tinkerpop issue #705: make TinkerGraph cloneable
Github user robertdale commented on the issue: https://github.com/apache/tinkerpop/pull/705 That dependency cycle is bad. It should probably be put in `gremlin-test`. Maybe even make it a [Graph Feature](http://tinkerpop.apache.org/docs/current/reference/#_features) - Cloning. ---
[GitHub] tinkerpop issue #705: make TinkerGraph cloneable
Github user mpollmeier commented on the issue: https://github.com/apache/tinkerpop/pull/705 hmm, since I introduced a dependency from gremlin-core:test to tinkergraph-gremlin the build complains about a cyclic dependency. I thought that would have been fine, given that it's just a test dependency. Any suggestions how to best resolve that without scrapping the test? ---
[GitHub] tinkerpop issue #705: make TinkerGraph cloneable
Github user mpollmeier commented on the issue: https://github.com/apache/tinkerpop/pull/705 Ok gents, next iteration is ready for review: * addressed stephen's comments, rebased and changed target to tp32 * moved the impl to GraphHelper which lives next to ElementHelper in org.apache.tinkerpop.gremlin.structure.util. Note: I added a dependency with scope `test` to gremlin-core * updated CHANGELOG and upgrade docs ---
[GitHub] tinkerpop issue #705: make TinkerGraph cloneable
Github user okram commented on the issue: https://github.com/apache/tinkerpop/pull/705 Yes. Exactly. Here is the thing is -- I don't really care for TinkerPop3 anymore so it can go as people want it too, I won't fight. But if you want to start aligning concepts so TinkerPop4 is done "right," we need to start getting away from methods on `Graph` as users will not have access to `Graph` only `Connections`. Thus, every action/process is a `Traversal` and thus, all notions of cloning, clearing, etc. should be via compilation/strategy_introspection of the `Traversal`. "But Marko, what about schema stuff like `createIndex`? How will that work in TinkerPop4" ... That is a vendor specific schema DSL. `schema.index("name")`, etc. I think instead of just going "willy nilly" with TinkerPop3, we should start to make decisions on its design that will align best with TinkerPop4 so we start to get the right thinking in place now for the big effort to come. ---
[GitHub] tinkerpop issue #705: make TinkerGraph cloneable
Github user spmallette commented on the issue: https://github.com/apache/tinkerpop/pull/705 hmm - and then a TinkerGraph strategy that converts `g.V().drop()` to a `clear()` operation somehow (to avoid overhead of iterating all the vertices? by that logic is `clone()` really `g.E().subgraph('sg').cap('sg')`? ---
[GitHub] tinkerpop issue #705: make TinkerGraph cloneable
Github user okram commented on the issue: https://github.com/apache/tinkerpop/pull/705 `TinkerGraph.clear()` is bad too. It is `g.V().drop()`. ---
[GitHub] tinkerpop issue #705: make TinkerGraph cloneable
Github user spmallette commented on the issue: https://github.com/apache/tinkerpop/pull/705 `clone()` won't work well for all graphs though. Making it a generalized utility for `clone()` just seems to invite more trouble than having it just available to TinkerGraph where it at least makes some sense. Cloning a JanusGraph or DseGraph instance, - weird - but `clone()` a small in-memory TinkerGraph - makes more sense. Any reason you think that `clone()` on TinkerGraph is different from a TinkerGraph specific method like `clear()`? ---
[GitHub] tinkerpop issue #705: make TinkerGraph cloneable
Github user okram commented on the issue: https://github.com/apache/tinkerpop/pull/705 I don't think it is a good idea to implement `TinkerGraph.clone()`. I think, instead, that there could be a "clone utility" that works for all graphs, not just TinkerGraph. E.g. ``` GraphHelper.cloneGraph(original, new) ``` The reasoning of implementing `clone()` just for test cases is weak and doesn't warrant the semantic confusion that this would entail (as no other graphs are cloneable). ---
[GitHub] tinkerpop issue #705: make TinkerGraph cloneable
Github user spmallette commented on the issue: https://github.com/apache/tinkerpop/pull/705 @mpollmeier a few things: 1. could you please retarget this at tp32 - seems like it would be ok to include this feature there 2. update CHANGELOG to mention this change 3. add a brief entry to the upgrade docs to bring attention to this new feature 4. looks like you need to rebase given some conflicts and i made some other comments on the code itself generally speaking though, it looks good. nice ---
[GitHub] tinkerpop issue #705: make TinkerGraph cloneable
Github user mpollmeier commented on the issue: https://github.com/apache/tinkerpop/pull/705 there we go @spmallette @robertdale ---
[GitHub] tinkerpop issue #705: make TinkerGraph cloneable
Github user mpollmeier commented on the issue: https://github.com/apache/tinkerpop/pull/705 Nice! And it does preserve the IDs as well. I'll update the PR. Thanks @spmallette ---
[GitHub] tinkerpop issue #705: make TinkerGraph cloneable
Github user spmallette commented on the issue: https://github.com/apache/tinkerpop/pull/705 You could deep clone (brand new vertices/edges/properties) of tinkergraph by using detachment. Do something like this: ```text gremlin> graph1 = TinkerFactory.createModern() ==>tinkergraph[vertices:6 edges:6] gremlin> graph2 = TinkerGraph.open() ==>tinkergraph[vertices:0 edges:0] gremlin> graph1.vertices().forEachRemaining { v -> DetachedFactory.detach(v, true).attach(Attachable.Method.create(graph2)) } gremlin> graph2 ==>tinkergraph[vertices:6 edges:0] gremlin> graph1.edges().forEachRemaining { e -> DetachedFactory.detach(e, true).attach(Attachable.Method.create(graph2)) } gremlin> graph2 ==>tinkergraph[vertices:6 edges:6] ``` I think that gets it, right? ---
[GitHub] tinkerpop issue #705: make TinkerGraph cloneable
Github user mpollmeier commented on the issue: https://github.com/apache/tinkerpop/pull/705 That's very true. I just experimented with deep clones and it doesn't look straightforward * kryo can do it directly, but all referenced classes need a no-arg-constructor, and since there are some coming in from apache commons, it doesn't look like it's feasible. https://github.com/EsotericSoftware/kryo#copyingcloning * java's builtin ByteArrayOutputStream has a similar problem: all referenced classes need to be marked as `Serializable` * implementing `clone` for TinkerVertex, TinkerEdge, TinkerProperty etc. is hard because they all reference each other, i.e. you're in a chicken and egg situation. I guess we have two options: 1) accept that it's a shallow copy and add a comment, as you suggested 2) disregard this PR Thoughts? Any other ideas for making a deep copy? There's always the option to serialise it into graphml/graphson etc., but that's slow for large graphs. ---
[GitHub] tinkerpop issue #705: make TinkerGraph cloneable
Github user robertdale commented on the issue: https://github.com/apache/tinkerpop/pull/705 This doesn't look like a deep copy e.g. modifying a property would be reflected in both graphs. Which may be fine. I think any limitations should be noted in the clone method comment. ---