Cool. Keep it coming, Nuo!

Cheers,
/peter neubauer

GTalk:    neubauer.peter
Skype:    peter.neubauer
Phone:    +46 704 106975
LinkedIn: http://www.linkedin.com/in/neubauer
Twitter:  http://twitter.com/peterneubauer

http://www.neo4j.org - NOSQL for the Enterprise.
http://startupbootcamp.org/ - Öresund - Innovation happens HERE.

On Tue, Oct 25, 2011 at 1:43 PM, Nuo Yan <[email protected]> wrote:

> For the record, in case someone else has a similar need, I came up with the
> following query that does what I described in the last email below (still on
> gremlin 1.2 so still using Commit Manager):
>
> manager = TransactionalGraphHelper.createCommitManager(g, 50);
> g.v(1).out('foo').transform{[it, it.name, it.outE('bar').count()]}.aggregate().cap.next().groupBy{it[1]}.each{key,value -> value.sort{a,b -> b[2] <=> a[2]}.eachWithIndex{a,i -> if(i > 0) {g.removeVertex(a[0]); manager.incrCounter()}}}
> manager.close();
>
> After going through this I got a much better understanding of Gremlin. Thanks
> Peter and Marko.
>
>
> On Sat, Oct 22, 2011 at 6:04 PM, Nuo Yan <[email protected]> wrote:
>
>> Thanks very much Marko. I researched the query one step at a time and
>> gained much more knowledge about gremlin.
>>
>> However, I wanted to do something a little bit different: instead of
>> comparing the "name" property of the children nodes to the source node, I
>> wanted to compare among the siblings of the children nodes (only first level
>> under the source node) and if there are duplicates, only keep the one with
>> the biggest degree of "bar" relationship. (The source node doesn't have a
>> "name" property).
>>
>> For example,
>>
>> v(1) --foo--> v(2) name: "abc" --bar--> (15 nodes)
>> v(1) --foo--> v(3) name: "abc" --bar--> (20 nodes)
>> v(1) --foo--> v(4) name: "xyz" --bar--> (15 nodes)
>> v(1) --foo--> v(5) name: "xyz" --bar--> (25 nodes)
>>
>> would become:
>>
>> v(1) --foo--> v(3) name: "abc" --bar--> (20 nodes)
>> v(1) --foo--> v(5) name: "xyz" --bar--> (25 nodes)
>>
>> So instead of doing
>>
>> g.v(1).sideEffect{x = it.getProperty('name')}.out('foo').filter{it.getProperty('name').equals(x)}
>>
>> I proposed doing:
>>
>> g.v(1).out("foo").transform{[it, it.name, it.out("bar").count]}.aggregate.cap
>>
>> to get an array of first level children nodes, their names, and degree of
>> "bar" edges like [v(2), "abc", 15], [v(3), "abc", 20], [v(4), "xyz", 15],
>> [v(5), "xyz", 25]
>>
>> And then I can sort the array by the name property, and iterate through
>> that array to delete nodes that have a smaller count based on the count
>> value specified in each sub array.
>>
>> But since my gremlin knowledge is still very limited, before digging too
>> much into this proposed solution I want to verify with you that it would
>> work and see if you have a better or easier approach to do it (i.e. maybe one
>> simple method that I can make use of that I'm not aware of). Thanks very much
>> again.
>>
>>
>> On Sat, Oct 22, 2011 at 9:40 AM, Marko Rodriguez <[email protected]> wrote:
>>
>>> Hi,
>>>
>>> > Currently I'm doing the following in my own code with multiple requests
>>> > to the standalone neo4j server. I wonder if it's possible to achieve in one
>>> > gremlin query/script so that I can post the gremlin query to the server as 1
>>> > request and done. What I'm trying to achieve is:
>>> >
>>> > Start from one given node (e.g. v1), get all of the nodes connected
>>> > through a given type of relationship (e.g.
>>> > relationship "foo"), within all of these nodes, see if their "name"
>>> > property has the same value, and if so, delete the node (and the "foo"
>>> > relationship connected to it) with smaller outgoing degree (on a specific
>>> > type of relationship, say, "bar"). If there are more than two nodes with
>>> > the same "name" property, only keep the one with biggest outgoing degree
>>> > (on type "bar").
>>>
>>>
>>> The query below is to warm you up. It will delete all vertices with same
>>> property value as source vertex that are 'foo' related to source vertex.
>>> Given that you are mutating the graph, you will want to deal with
>>> transaction buffers so you don't do one transaction per mutation:
>>> https://github.com/tinkerpop/blueprints/wiki/Graph-Transactions
>>>
>>> g.v(1).sideEffect{x = it.getProperty('name')}.out('foo').filter{it.getProperty('name').equals(x)}.sideEffect{g.removeVertex(it)}
>>>
>>> -----------------------------------------
>>>
>>> To do the stuff with the smaller counts, etc., you can do:
>>>
>>> g.v(1).sideEffect{x = it.getProperty('name')}.out('foo').filter{it.getProperty('name').equals(x)}.transform{[it, it.outE('bar').count()]}.filter{it[1] > 0}.aggregate.cap.next().sort{a,b -> b[1] <=> a[1]}.eachWithIndex{a,i -> if(i > 0) g.removeVertex(a[0])}
>>>
>>> There you go! One big fatty Gremlin query to solve your problem.
>>>
>>> I would recommend going through each step and seeing what it returns so
>>> you understand what is going on.... Again, given that you are mutating the
>>> graph, be sure to be wise about transactions.
>>>
>>> Enjoy!,
>>> Marko.
>>>
>>> http://markorodriguez.com
>>>
>>> _______________________________________________
>>> Neo4j mailing list
>>> [email protected]
>>> https://lists.neo4j.org/mailman/listinfo/user
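
For anyone stepping through the final query: everything after .next() is
plain Groovy collection work (groupBy, sort, eachWithIndex), so that part can
be tried without a graph. The following is only a minimal sketch under stated
assumptions. The rows use integer ids as hypothetical stand-ins for vertices,
idsToRemove is a made-up helper name rather than anything from Gremlin, and
the ids are collected instead of being passed to g.removeVertex().

```groovy
// Hypothetical stand-in rows: [vertex id, name, number of 'bar' edges].
// They mirror the example in the thread; no graph is involved here.
def rows = [
    [2, 'abc', 15],
    [3, 'abc', 20],
    [4, 'xyz', 15],
    [5, 'xyz', 25],
]

// Returns the ids that the query would hand to g.removeVertex().
// (idsToRemove is an illustrative helper, not a Gremlin function.)
def idsToRemove(List rows) {
    def doomed = []
    // Group siblings by name, as groupBy{it[1]} does in the query.
    rows.groupBy { it[1] }.each { name, group ->
        // Sort each group by count, largest first.
        group.sort { a, b -> b[2] <=> a[2] }
        // Keep index 0 (the largest count), mark every other row for removal.
        group.eachWithIndex { row, i ->
            if (i > 0) doomed << row[0]
        }
    }
    doomed
}

println idsToRemove(rows)   // prints [2, 4]
```

Note that when two siblings share both name and count, only the one that
sorts first is kept, so one of the tied nodes is still removed.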

