Hi,

Note that with Blueprints 1.0, you do not have to deal with a commit manager. 
You can do:

        graph.setTransactionBufferSize(50);

...and then simply do your traversal. No manager.incrCounter() needed. I believe 
the latest Neo4j release uses Gremlin 1.3 and Blueprints 1.0. Peter, can you confirm?
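
As a rough illustration of what the buffer saves you from doing by hand, here
is a plain Groovy sketch that runs without any graph. It keeps one entry per
name (the one with the most 'bar' edges), as in the quoted query below, and
counts mutations against a buffer of 50. The children list, the removeVertex
closure, the counters, and the final flush are all hypothetical stand-ins and
not the Blueprints API; the commit-when-full behaviour is an assumption about
how the buffer works.

```groovy
// Hypothetical stand-ins for the 'foo' children of v(1):
// vertex id, 'name' property, and number of outgoing 'bar' edges.
def children = [
    [id: 2, name: 'abc', barCount: 15],
    [id: 3, name: 'abc', barCount: 20],
    [id: 4, name: 'xyz', barCount: 15],
    [id: 5, name: 'xyz', barCount: 25]
]

int bufferSize = 50   // same value as in the call above
int pending = 0       // mutations not yet committed
int commits = 0       // simulated transaction commits
def removed = []      // ids of the vertices that were "removed"

// Stand-in for g.removeVertex(...) on a graph with a transaction buffer:
// every mutation is counted, and a full buffer triggers one commit.
def removeVertex = { child ->
    removed << child.id
    pending++
    if (pending == bufferSize) {
        commits++
        pending = 0
    }
}

children.groupBy { it.name }.each { name, group ->
    // Highest 'bar' degree first; everything after index 0 is a duplicate.
    def ranked = group.sort { a, b -> b.barCount <=> a.barCount }
    ranked.eachWithIndex { child, i -> if (i > 0) removeVertex(child) }
}

// Assumed final flush of whatever is left in the buffer.
if (pending > 0) { commits++; pending = 0 }

println removed   // prints [2, 4]
```

With only two removals the buffer never fills, so the sketch ends with a
single commit. With thousands of removals it would commit once per 50
mutations instead of once per mutation.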

Take care,
Marko.

http://markorodriguez.com

On Oct 25, 2011, at 12:43 PM, Nuo Yan wrote:

> For the record, in case someone else has a similar need, I came up with the
> following query that does what I described in the last email below (still on
> Gremlin 1.2, so still using the commit manager):
> 
> manager = TransactionalGraphHelper.createCommitManager(g, 50);
> g.v(1).out('foo').transform{[it, it.name,
> it.outE('bar').count()]}.aggregate().cap.next().groupBy{it[1]}.each{key,value
> -> value.sort{a,b -> b[2] <=> a[2]}.eachWithIndex{a,i -> if(i > 0)
> {g.removeVertex(a[0]); manager.incrCounter()}}}
> manager.close();
> 
> After going through this I have a much better understanding of Gremlin. Thanks
> Peter and Marko.
> 
> 
> On Sat, Oct 22, 2011 at 6:04 PM, Nuo Yan <[email protected]> wrote:
> 
>> Thanks very much Marko. I researched the query one step at a time and
>> gained much more knowledge about Gremlin.
>> 
>> However, I want to do something a little different. Instead of comparing
>> the "name" property of the children nodes to the source node, I want to
>> compare among the sibling children nodes (only the first level under the
>> source node) and, if there are duplicates, keep only the one with the
>> highest degree of "bar" relationships. (The source node doesn't have a
>> "name" property.)
>> 
>> For example,
>> 
>> v(1) --foo--> v(2) name: "abc" --bar--> (15 nodes)
>> v(1) --foo--> v(3) name: "abc" --bar--> (20 nodes)
>> v(1) --foo--> v(4) name: "xyz" --bar--> (15 nodes)
>> v(1) --foo--> v(5) name: "xyz" --bar--> (25 nodes)
>> 
>> would become:
>> 
>> v(1) --foo--> v(3) name: "abc" --bar--> (20 nodes)
>> v(1) --foo--> v(5) name: "xyz" --bar--> (25 nodes)
>> 
>> So instead of doing
>> 
>> 
>> g.v(1).sideEffect{x =
>> it.getProperty('name')}.out('foo').filter{it.getProperty('name').equals(x)}
>> 
>> I proposed doing:
>> 
>> g.v(1).out("foo").transform{[it, it.name,
>> it.out("bar").count()]}.aggregate.cap
>> 
>> to get an array of first-level children nodes, their names, and their degree
>> of "bar" edges, like [v(2), "abc", 15], [v(3), "abc", 20], [v(4), "xyz", 15],
>> [v(5), "xyz", 25]
>> 
>> And then I can sort the array by the name property, and iterate through
>> that array to delete nodes that have a smaller count based on the count
>> value specified in each sub array.
>> 
>> But since my Gremlin knowledge is still very limited, before digging too
>> deep into this proposed solution I want to verify with you that it would
>> work and see if you have a better or easier approach (e.g. a simple
>> method that I'm not aware of). Thanks very much again.
>> 
>> 
>> On Sat, Oct 22, 2011 at 9:40 AM, Marko Rodriguez <[email protected]> wrote:
>> 
>>> Hi,
>>> 
>>>> Currently I'm doing the following in my own code with multiple requests
>>>> to the standalone neo4j server. I wonder if it's possible to achieve in one
>>>> gremlin query/script so that I can post the gremlin query to the server as 1
>>>> request and done. What I'm trying to achieve is:
>>>> 
>>>> Start from one given node (e.g. v1), get all of the nodes connected
>>>> through a given type of relationship (e.g. relationship "foo"), within all
>>>> of these nodes, see if their "name" property has the same value, and if so,
>>>> delete the node (and the "foo" relationship connected to it) with smaller
>>>> outgoing degree (on a specific type of relationship, say, "bar"). If there
>>>> are more than two nodes with the same "name" property, only keep the one
>>>> with biggest outgoing degree (on type "bar").
>>> 
>>> 
>>> The query below is to warm you up. It will delete all vertices that are
>>> 'foo'-related to the source vertex and have the same 'name' property value
>>> as the source vertex. Given that you are mutating the graph, you will want
>>> to deal with transaction buffers so you don't do one transaction per mutation:
>>>       https://github.com/tinkerpop/blueprints/wiki/Graph-Transactions
>>> 
>>> g.v(1).sideEffect{x =
>>> it.getProperty('name')}.out('foo').filter{it.getProperty('name').equals(x)}.sideEffect{g.removeVertex(it)}
>>> 
>>> -----------------------------------------
>>> 
>>> To handle the smaller counts, you can do:
>>> 
>>> g.v(1).sideEffect{x =
>>> it.getProperty('name')}.out('foo').filter{it.getProperty('name').equals(x)}.transform{[it,
>>> it.outE('bar').count()]}.filter{it[1] > 0}.aggregate.cap.next().sort{a,b ->
>>> b[1] <=> a[1]}.eachWithIndex{a,i -> if(i > 0) g.removeVertex(a[0])}
>>> 
>>> There you go! One big fatty Gremlin query to solve your problem.
>>> 
>>> I would recommend going through each step and seeing what it returns so
>>> you understand what is going on. Again, given that you are mutating the
>>> graph, be careful about how you handle transactions.
>>> 
>>> Enjoy!
>>> Marko.
>>> 
>>> http://markorodriguez.com
>>> 
>>> _______________________________________________
>>> Neo4j mailing list
>>> [email protected]
>>> https://lists.neo4j.org/mailman/listinfo/user
>>> 
>> 
>> 
