Re: [Neo4j] Gremlin help
On Tue, Oct 25, 2011 at 12:35 PM, Peter Neubauer <peter.neuba...@neotechnology.com> wrote:

> Yes, that is true. We are still in QA with 1.5 GA; expect it during the
> next few weeks, as we are hunting down potential HA issues. Hope it is
> OK to wait for some more days?

Sure, no problem. :)
Re: [Neo4j] Gremlin help
Yes, that is true. We are still in QA with 1.5 GA; expect it during the next few weeks, as we are hunting down potential HA issues. Hope it is OK to wait for some more days?

Cheers,

/peter neubauer

GTalk: neubauer.peter
Skype peter.neubauer
Phone +46 704 106975
LinkedIn http://www.linkedin.com/in/neubauer
Twitter http://twitter.com/peterneubauer

http://www.neo4j.org - NOSQL for the Enterprise.
http://startupbootcamp.org/ - Öresund - Innovation happens HERE.

On Tue, Oct 25, 2011 at 2:32 PM, Nuo Yan wrote:

> Hi Marko,
>
> I believe 1.5 milestone release has Gremlin 1.3 and Blueprints 1.0 but
> before 1.5 stable release I'm going to be using 1.4.x. In 1.4.2 it only has
> Gremlin 1.2 and doesn't appear to have the setTransactionBufferSize stuff.
Re: [Neo4j] Gremlin help
Hi Marko,

I believe the 1.5 milestone release has Gremlin 1.3 and Blueprints 1.0, but until the 1.5 stable release I'm going to be using 1.4.x. 1.4.2 only has Gremlin 1.2 and doesn't appear to have the setTransactionBufferSize stuff.

On Tue, Oct 25, 2011 at 11:52 AM, Marko Rodriguez wrote:

> Hi,
>
> Note that with Blueprints 1.0, you do not have to deal with a commit
> manager. You can do:
>
>     graph.setTransactionBufferSize(50);
>
> ...and then simply do your traversal. No manager.incrCounter() needed. I
> believe the latest Neo4j release uses Gremlin 1.3 and Blueprints 1.0. ??
> Peter?
>
> Take care,
> Marko.
>
> http://markorodriguez.com
Re: [Neo4j] Gremlin help
Cool. Keep it coming Nuo!

Cheers,

/peter neubauer

On Tue, Oct 25, 2011 at 1:43 PM, Nuo Yan wrote:

> For the record, in case someone else has similar need, I came up with the
> following query that does what I described in the last email below (still on
> gremlin 1.2 so still using Commit Manager):
> [...]
Re: [Neo4j] Gremlin help
Hi,

Note that with Blueprints 1.0, you do not have to deal with a commit manager. You can do:

    graph.setTransactionBufferSize(50);

...and then simply do your traversal. No manager.incrCounter() needed. I believe the latest Neo4j release uses Gremlin 1.3 and Blueprints 1.0. ?? Peter?

Take care,
Marko.

http://markorodriguez.com

On Oct 25, 2011, at 12:43 PM, Nuo Yan wrote:

> For the record, in case someone else has similar need, I came up with the
> following query that does what I described in the last email below (still on
> gremlin 1.2 so still using Commit Manager):
>
>     manager = TransactionalGraphHelper.createCommitManager(g, 50);
>     g.v(1).out('foo').
>       transform{[it, it.name, it.outE('bar').count()]}.
>       aggregate().cap.next().
>       groupBy{it[1]}.
>       each{key, value ->
>         value.sort{a, b -> b[2] <=> a[2]}.
>           eachWithIndex{a, i -> if (i > 0) {g.removeVertex(a[0]); manager.incrCounter()}}
>       }
>     manager.close();
>
> After going through this I got a lot better understanding in Gremlin. Thanks
> Peter and Marko.
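The commit manager and the transaction buffer size of 50 mentioned in this message both come down to the same idea: count mutations and commit once per batch instead of once per mutation. The sketch below illustrates that idea in plain Groovy with no graph involved. `BufferedCommitter` is a hypothetical class written for this illustration, not the Blueprints `CommitManager`; only the method names `incrCounter()` and `close()` are borrowed from the thread, and the "commit" is just a counter.

```groovy
// Hypothetical helper, NOT part of Blueprints: it counts mutations and
// "commits" once every bufferSize mutations, plus once more on close()
// if anything is still pending.
class BufferedCommitter {
    final int bufferSize
    int pending = 0
    int commits = 0

    BufferedCommitter(int bufferSize) { this.bufferSize = bufferSize }

    // Call once after each mutation.
    void incrCounter() {
        pending++
        if (pending >= bufferSize) flush()
    }

    // Call when the traversal is done, to commit any leftover mutations.
    void close() {
        if (pending > 0) flush()
    }

    private void flush() {
        commits++      // a real implementation would commit the transaction here
        pending = 0
    }
}

def committer = new BufferedCommitter(50)
120.times { committer.incrCounter() }   // 120 simulated mutations
committer.close()
println committer.commits
```

With a buffer of 50, the 120 simulated mutations are committed in three batches (50, 50 and the remaining 20 on close) rather than in 120 separate transactions.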
Re: [Neo4j] Gremlin help
For the record, in case someone else has a similar need, I came up with the following query that does what I described in the last email below (still on Gremlin 1.2, so still using the Commit Manager):

    manager = TransactionalGraphHelper.createCommitManager(g, 50);
    g.v(1).out('foo').
      transform{[it, it.name, it.outE('bar').count()]}.
      aggregate().cap.next().
      groupBy{it[1]}.
      each{key, value ->
        value.sort{a, b -> b[2] <=> a[2]}.
          eachWithIndex{a, i -> if (i > 0) {g.removeVertex(a[0]); manager.incrCounter()}}
      }
    manager.close();

After going through this I got a much better understanding of Gremlin. Thanks Peter and Marko.

On Sat, Oct 22, 2011 at 6:04 PM, Nuo Yan wrote:

> Thanks very much Marko. I researched the query one step at a time and
> gained much more knowledge about gremlin.
> [...]

___
Neo4j mailing list
User@lists.neo4j.org
https://lists.neo4j.org/mailman/listinfo/user
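For anyone stepping through the "for the record" query, the part after `.next()` is ordinary Groovy collection code and can be tried without a graph. The sketch below replays it on the example data from the thread. The rows are stand-ins for the [vertex, name, degree] triples, the integer ids replace real vertices (an assumption made for illustration), and nothing is actually deleted: the ids that the query would pass to `g.removeVertex` are only collected in a list.

```groovy
// Stand-in rows for [vertex, name, barDegree]; integer ids replace vertices.
// The names and degrees are the example values given in the thread.
def rows = [
    [2, 'abc', 15],
    [3, 'abc', 20],
    [4, 'xyz', 15],
    [5, 'xyz', 25]
]

def toRemove = []
rows.groupBy { it[1] }.each { name, group ->
    // Sort each name group by degree, highest first; index 0 is the keeper.
    group.sort { a, b -> b[2] <=> a[2] }.eachWithIndex { row, i ->
        if (i > 0) toRemove << row[0]
    }
}
println toRemove
```

In each name group only the row with the highest degree survives, so ids 2 and 4 end up in `toRemove` while 3 and 5 are kept, matching the "would become" example in the thread.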
Re: [Neo4j] Gremlin help
Thanks very much, Marko. I researched the query one step at a time and gained much more knowledge about Gremlin.

However, I wanted to do something a little bit different. Instead of comparing the "name" property of the child nodes to the source node, I want to compare among the sibling child nodes (only the first level under the source node) and, if there are duplicates, keep only the one with the highest degree on the "bar" relationship. (The source node doesn't have a "name" property.)

For example,

v(1) --foo--> v(2) name: "abc" --bar--> (15 nodes)
v(1) --foo--> v(3) name: "abc" --bar--> (20 nodes)
v(1) --foo--> v(4) name: "xyz" --bar--> (15 nodes)
v(1) --foo--> v(5) name: "xyz" --bar--> (25 nodes)

would become:

v(1) --foo--> v(3) name: "abc" --bar--> (20 nodes)
v(1) --foo--> v(5) name: "xyz" --bar--> (25 nodes)

So instead of doing

g.v(1).sideEffect{x = it.getProperty('name')}.out('foo').filter{it.getProperty('name').equals(x)}

I propose doing:

g.v(1).out("foo").transform{[it, it.name, it.out("bar").count]}.aggregate.cap

to get an array of the first-level child nodes, their names, and their degree of "bar" edges, like

[v(2), "abc", 15], [v(3), "abc", 20], [v(4), "xyz", 15], [v(5), "xyz", 25]

Then I can sort the array by the name property and iterate through it to delete the nodes that have a smaller count, based on the count value in each sub-array.

But since my Gremlin knowledge is still very limited, before digging too far into this proposed solution I want to verify with you that it would work, and see whether you have a better or easier approach (e.g. one simple method that I'm not aware of).

Thanks very much again.

On Sat, Oct 22, 2011 at 9:40 AM, Marko Rodriguez wrote:

> Hi,
>
> > Currently I'm doing the following in my own code with multiple requests
> > to the standalone neo4j server. I wonder if it's possible to achieve in
> > one gremlin query/script so that I can post the gremlin query to the
> > server as 1 request and done. What I'm trying to achieve is:
> >
> > Start from one given node (e.g. v1), get all of the nodes connected
> > through a given type of relationship (e.g. relationship "foo"), within
> > all of these nodes, see if their "name" property has the same value, and
> > if so, delete the node (and the "foo" relationship connected to it) with
> > smaller outgoing degree (on a specific type of relationship, say, "bar").
> > If there are more than two nodes with the same "name" property, only keep
> > the one with biggest outgoing degree (on type "bar").
>
> The query below is to warm you up. It will delete all vertices with same
> property value as source vertex that are 'foo' related to source vertex.
> Given that you are mutating the graph, you will want to deal with
> transaction buffers so you don't do one transaction per mutation:
> https://github.com/tinkerpop/blueprints/wiki/Graph-Transactions
>
> g.v(1).sideEffect{x = it.getProperty('name')}.out('foo').filter{it.getProperty('name').equals(x)}.sideEffect{g.removeVertex(it)}
>
> -
>
> To do the stuff with the smaller counts, etc., you can do:
>
> g.v(1).sideEffect{x = it.getProperty('name')}.out('foo').filter{it.getProperty('name').equals(x)}.transform{[it, it.outE('bar').count()]}.filter{it[1] > 0}.aggregate.cap.next().sort{a,b -> b[1] <=> a[1]}.eachWithIndex{a,i -> if(i > 0) g.removeVertex(a[0])}
>
> There you go! One big fatty Gremlin query to solve your problem.
>
> I would recommend going through each step and seeing what it returns so you
> understand what is going on. Again, given that you are mutating the
> graph, be sure to be wise about transactions.
>
> Enjoy!,
> Marko.
>
> http://markorodriguez.com

___
Neo4j mailing list
User@lists.neo4j.org
https://lists.neo4j.org/mailman/listinfo/user
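The sort-and-delete idea proposed in this message can be tried out without a graph. The following is only a minimal sketch in plain Groovy: the `rows` list is a hypothetical stand-in for the [vertex, name, "bar" degree] triples that the transform step would return, and the vertex ids are plain strings rather than real vertices.

```groovy
// Hypothetical stand-in data: [vertex id, name, number of outgoing "bar" edges].
// In the real query these triples would come out of the transform step.
def rows = [
    ['v2', 'abc', 15],
    ['v3', 'abc', 20],
    ['v4', 'xyz', 15],
    ['v5', 'xyz', 25],
]

def toDelete = []
rows.groupBy { it[1] }.each { name, group ->
    // Largest "bar" degree first; every entry after index 0 is a duplicate to drop.
    group.sort { a, b -> b[2] <=> a[2] }.eachWithIndex { row, i ->
        if (i > 0) toDelete << row[0]
    }
}

assert toDelete == ['v2', 'v4']
```

Grouping by name before sorting means each name only needs a descending sort on the degree, so no separate sort on the name property is required.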
Re: [Neo4j] Gremlin help
Hi,

> Currently I'm doing the following in my own code with multiple requests to
> the standalone neo4j server. I wonder if it's possible to achieve in one
> gremlin query/script so that I can post the gremlin query to the server as 1
> request and done. What I'm trying to achieve is:
>
> Start from one given node (e.g. v1), get all of the nodes connected through a
> given type of relationship (e.g. relationship "foo"), within all of these
> nodes, see if their "name" property has the same value, and if so, delete the
> node (and the "foo" relationship connected to it) with smaller outgoing
> degree (on a specific type of relationship, say, "bar"). If there are more
> than two nodes with the same "name" property, only keep the one with biggest
> outgoing degree (on type "bar").

The query below is to warm you up. It will delete all vertices that are 'foo' related to the source vertex and have the same property value as the source vertex. Given that you are mutating the graph, you will want to deal with transaction buffers so you don't do one transaction per mutation:

https://github.com/tinkerpop/blueprints/wiki/Graph-Transactions

g.v(1).sideEffect{x = it.getProperty('name')}.out('foo').filter{it.getProperty('name').equals(x)}.sideEffect{g.removeVertex(it)}

-

To do the stuff with the smaller counts, etc., you can do:

g.v(1).sideEffect{x = it.getProperty('name')}.out('foo').filter{it.getProperty('name').equals(x)}.transform{[it, it.outE('bar').count()]}.filter{it[1] > 0}.aggregate.cap.next().sort{a,b -> b[1] <=> a[1]}.eachWithIndex{a,i -> if(i > 0) g.removeVertex(a[0])}

There you go! One big fatty Gremlin query to solve your problem.

I would recommend going through each step and seeing what it returns so you understand what is going on. Again, given that you are mutating the graph, be sure to be wise about transactions.

Enjoy!,
Marko.

http://markorodriguez.com
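The advice about not running one transaction per mutation can be illustrated with a small counter. This is only a sketch of the batching idea in plain Groovy: `BatchCommitter` is a hypothetical helper written for this example, not a Blueprints or Gremlin class, and its `commit()` merely counts where a real commit would happen.

```groovy
// Hypothetical helper, not part of Blueprints or Gremlin: it counts mutations
// and "commits" once every bufferSize operations.
class BatchCommitter {
    final int bufferSize
    int pending = 0
    int commits = 0

    BatchCommitter(int bufferSize) { this.bufferSize = bufferSize }

    // Call once per mutation, e.g. after each vertex removal.
    void recordMutation() {
        pending++
        if (pending >= bufferSize) commit()
    }

    // Flush whatever is left so the final partial batch is not lost.
    void close() {
        if (pending > 0) commit()
    }

    private void commit() {
        // A real implementation would commit the open graph transaction here.
        commits++
        pending = 0
    }
}

def committer = new BatchCommitter(50)
120.times { committer.recordMutation() }
committer.close()

assert committer.commits == 3
```

With a buffer of 50, 120 mutations cost three commits (two full batches plus the remainder on close) instead of 120.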
Re: [Neo4j] Gremlin help
Nuo,

In principle this looks OK, except you will have to take care that you are not deleting nodes that are in the current traversal, since that would recursively change your traversal result. I don't know the Groovy expression for this, but if you can do it in Java, you can do it in Groovy. For instance, see

http://docs.neo4j.org/chunked/snapshot/gremlin-plugin.html#rest-api-send-an-arbitrary-groovy-script---lucene-sorting

HTH

/peter

On Friday, October 21, 2011, Nuo Yan wrote:
> Hi Marko and Gremlin gurus:
>
> Currently I'm doing the following in my own code with multiple requests to
> the standalone neo4j server. I wonder if it's possible to achieve in one
> gremlin query/script so that I can post the gremlin query to the server as 1
> request and done. What I'm trying to achieve is:
>
> Start from one given node (e.g. v1), get all of the nodes connected through
> a given type of relationship (e.g. relationship "foo"), within all of these
> nodes, see if their "name" property has the same value, and if so, delete
> the node (and the "foo" relationship connected to it) with smaller outgoing
> degree (on a specific type of relationship, say, "bar"). If there are more
> than two nodes with the same "name" property, only keep the one with biggest
> outgoing degree (on type "bar").
>
> For example, for the following graph:
>
> v1 --foo--> v2("name" => "abc") --"bar"--> (15 nodes)
> v1 --foo--> v3("name" => "abc") --"bar"--> (5 nodes)
> v1 --foo--> v4("name" => "abc") --"bar"--> (8 nodes)
> v1 --foo--> v5("name" => "xyz") --"bar"--> (16 nodes)
> v1 --foo--> v6("name" => "abc") --"not_bar"--> (20 nodes)
>
> Ideally, after running the gremlin script, it should be:
>
> v1 --foo--> v2("name" => "abc") --"bar"--> (15 nodes)
> v1 --foo--> v5("name" => "xyz") --"bar"--> (16 nodes)
> v1 --foo--> v6("name" => "abc") --"not_bar"--> (20 nodes)
>
> with v3 and v4 (and the "foo" relationships connecting them to v1) deleted
> because they have the same "name" attribute as v2 but a smaller outgoing
> degree on the "bar" relationship.
>
> Is this possible to achieve relatively easily with Gremlin?
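One common way to respect the warning about deleting nodes that are still being traversed is to split the work into a read-only phase and a mutating phase. The sketch below uses plain Groovy collections; the `children` list is a hypothetical in-memory stand-in for the 'foo' children of the source vertex, not a real graph.

```groovy
// Hypothetical in-memory stand-in for the 'foo' children of the source vertex.
def children = [
    [id: 'v2', name: 'abc', bar: 15],
    [id: 'v3', name: 'abc', bar: 5],
    [id: 'v4', name: 'abc', bar: 8],
]

// Phase 1: read only. Take a snapshot and decide what to delete.
def snapshot = new ArrayList(children)
def best = snapshot.max { it.bar }
def doomed = snapshot.findAll { it.id != best.id }

// Phase 2: mutate, now that no iteration over `children` is in progress.
doomed.each { children.remove(it) }

assert children*.id == ['v2']
```

Because the decisions are made against the snapshot, removing elements in the second phase cannot change which elements were visited in the first.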
[Neo4j] Gremlin help
Hi Marko and Gremlin gurus:

Currently I'm doing the following in my own code with multiple requests to the standalone neo4j server. I wonder if it's possible to achieve in one gremlin query/script so that I can post the gremlin query to the server as 1 request and done. What I'm trying to achieve is:

Start from one given node (e.g. v1), get all of the nodes connected through a given type of relationship (e.g. relationship "foo"), within all of these nodes, see if their "name" property has the same value, and if so, delete the node (and the "foo" relationship connected to it) with smaller outgoing degree (on a specific type of relationship, say, "bar"). If there are more than two nodes with the same "name" property, only keep the one with biggest outgoing degree (on type "bar").

For example, for the following graph:

v1 --foo--> v2("name" => "abc") --"bar"--> (15 nodes)
v1 --foo--> v3("name" => "abc") --"bar"--> (5 nodes)
v1 --foo--> v4("name" => "abc") --"bar"--> (8 nodes)
v1 --foo--> v5("name" => "xyz") --"bar"--> (16 nodes)
v1 --foo--> v6("name" => "abc") --"not_bar"--> (20 nodes)

Ideally, after running the gremlin script, it should be:

v1 --foo--> v2("name" => "abc") --"bar"--> (15 nodes)
v1 --foo--> v5("name" => "xyz") --"bar"--> (16 nodes)
v1 --foo--> v6("name" => "abc") --"not_bar"--> (20 nodes)

with v3 and v4 (and the "foo" relationships connecting them to v1) deleted because they have the same "name" attribute as v2 but a smaller outgoing degree on the "bar" relationship.

Is this possible to achieve relatively easily with Gremlin?
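The expected result in the example above can be checked with plain Groovy collections. This is a sketch under one reading of the example: v6 survives because it has no outgoing "bar" edges at all, so it is left out of the comparison. The `children` list is a hypothetical stand-in for the graph, with `bar` holding each child's outgoing "bar" degree.

```groovy
// Hypothetical stand-in for the graph in the question: each child of v1 with its
// name and its count of outgoing "bar" edges (v6 only has "not_bar" edges, so 0).
def children = [
    [id: 'v2', name: 'abc', bar: 15],
    [id: 'v3', name: 'abc', bar: 5],
    [id: 'v4', name: 'abc', bar: 8],
    [id: 'v5', name: 'xyz', bar: 16],
    [id: 'v6', name: 'abc', bar: 0],
]

// Assumption: only children that have "bar" edges take part in the comparison.
def candidates = children.findAll { it.bar > 0 }

// Per name, the child with the largest "bar" degree survives.
def winners = candidates.groupBy { it.name }.collect { name, group -> group.max { it.bar }.id }

def deleted = candidates*.id - winners
def kept = children*.id - deleted

assert deleted == ['v3', 'v4']
assert kept == ['v2', 'v5', 'v6']
```

If children without "bar" edges should also count as duplicates, dropping the `findAll` step would make v6 a deletion candidate as well.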