Thanks very much, Marko. I worked through the query one step at a time and
learned a lot more about Gremlin.
However, I want to do something slightly different. Instead of comparing the
"name" property of the child nodes to that of the source node, I want to
compare the "name" property among the sibling child nodes (first level under
the source node only) and, where there are duplicates, keep only the one with
the highest outgoing degree on the "bar" relationship. (The source node
doesn't have a "name" property.)
For example,
v(1) --foo--> v(2) name: "abc" --bar--> (15 nodes)
v(1) --foo--> v(3) name: "abc" --bar--> (20 nodes)
v(1) --foo--> v(4) name: "xyz" --bar--> (15 nodes)
v(1) --foo--> v(5) name: "xyz" --bar--> (25 nodes)
would become:
v(1) --foo--> v(3) name: "abc" --bar--> (20 nodes)
v(1) --foo--> v(5) name: "xyz" --bar--> (25 nodes)
So instead of doing
g.v(1).sideEffect{x =
it.getProperty('name')}.out('foo').filter{it.getProperty('name').equals(x)}
I proposed doing:
g.v(1).out("foo").transform{[it, it.name,
it.out("bar").count()]}.aggregate.cap
to get a list of the first-level child nodes with their names and their "bar"
out-degrees, like [v(2), "abc", 15], [v(3), "abc", 20], [v(4), "xyz", 15],
[v(5), "xyz", 25].
Then I can sort the list by the name property and iterate through it,
deleting, within each group of identical names, every node whose count is
smaller than the largest count in that group.
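As a rough sketch of that last step, assuming the triples have already been
pulled out into a plain Groovy list: the names rows and toDelete are made up
for illustration, and strings stand in for the real vertices. Grouping by
name instead of sorting might look like this:

```groovy
// Stand-in data: [vertex, name, "bar" out-degree]; strings replace real vertices.
def rows = [['v2', 'abc', 15], ['v3', 'abc', 20],
            ['v4', 'xyz', 15], ['v5', 'xyz', 25]]

// Group the triples by name, keep the one with the largest count in each
// group, and collect the vertex of every other triple for deletion.
def toDelete = rows.groupBy { it[1] }.values().collectMany { group ->
    def keep = group.max { it[2] }
    group.findAll { !it.is(keep) }.collect { it[0] }
}

assert toDelete == ['v2', 'v4']
```

With real vertices in place of the strings, each entry of toDelete could then
be passed to g.removeVertex as in your query. If two siblings tie on the
count, max keeps only one of them and the others are deleted.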
But since my Gremlin knowledge is still very limited, before digging too
deeply into this approach I'd like to check with you that it would work, and
to ask whether there is a better or easier way to do it (e.g. a simple method
I'm not aware of). Thanks very much again.
On Sat, Oct 22, 2011 at 9:40 AM, Marko Rodriguez <[email protected]> wrote:
> Hi,
>
> > Currently I'm doing the following in my own code with multiple requests
> to the standalone neo4j server. I wonder if it's possible to achieve in one
> gremlin query/script so that I can post the gremlin query to the server as 1
> request and done. What I'm trying to achieve is:
> >
> > Start from one given node (e.g. v1), get all of the nodes connected
> through a given type of relationship (e.g. relationship "foo"), within all
> of these nodes, see if their "name" property has the same value, and if so,
> delete the node (and the "foo" relationship connected to it) with smaller
> outgoing degree (on a specific type of relationship, say, "bar"). If there
> are more than two nodes with the same "name" property, only keep the one
> with biggest outgoing degree (on type "bar").
>
>
> The query below is to warm you up. It will delete all vertices with same
> property value as source vertex that are 'foo' related to source vertex.
> Given that you are mutating the graph, you will want to deal with
> transaction buffers so you don't do one transaction per mutations:
> https://github.com/tinkerpop/blueprints/wiki/Graph-Transactions
>
> g.v(1).sideEffect{x =
> it.getProperty('name')}.out('foo').filter{it.getProperty('name').equals(x)}.sideEffect{g.removeVertex(it)}
>
> -----------------------------------------
>
> To do the stuff with the smaller counts, etc. You can do:
>
> g.v(1).sideEffect{x =
> it.getProperty('name')}.out('foo').filter{it.getProperty('name').equals(x)}.transform{[it,
> it.outE('bar').count()]}.filter{it[1] > 0}.aggregate.cap.next().sort{a,b ->
> b[1] <=> a[1]}.eachWithIndex{a,i -> if(i > 0) g.removeVertex(a[0])}
>
> There you go! One big fatty Gremlin query to solve your problem.
>
> I would recommend going through each step and seeing what it returns so you
> understand what is going on.... Again, given that you are mutating the
> graph, be sure to be wise about transactions.
>
> Enjoy!,
> Marko.
>
> http://markorodriguez.com
>
> _______________________________________________
> Neo4j mailing list
> [email protected]
> https://lists.neo4j.org/mailman/listinfo/user
>