[jira] [Commented] (TINKERPOP-1495) Global list deduplication doesn't work in OLAP
[ https://issues.apache.org/jira/browse/TINKERPOP-1495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15568292#comment-15568292 ]

ASF GitHub Bot commented on TINKERPOP-1495:
---

Github user dkuppitz commented on the issue:

    https://github.com/apache/tinkerpop/pull/455

    VOTE: +1

> Global list deduplication doesn't work in OLAP
> --
>
>                 Key: TINKERPOP-1495
>                 URL: https://issues.apache.org/jira/browse/TINKERPOP-1495
>             Project: TinkerPop
>          Issue Type: Bug
>          Components: process
>    Affects Versions: 3.2.2
>            Reporter: Daniel Kuppitz
>            Assignee: Marko A. Rodriguez
>
> {noformat}
> gremlin> g = TinkerFactory.createModern().traversal()
> ==>graphtraversalsource[tinkergraph[vertices:6 edges:6], standard]
> gremlin> a = TinkerFactory.createModern().traversal().withComputer()
> ==>graphtraversalsource[tinkergraph[vertices:6 edges:6], graphcomputer]
> gremlin>
> gremlin> g.V().as("a").repeat(both()).times(3).emit().as("b").group().by(select("a")).by(select("b").dedup().order().by(id).fold()).select(values).unfold().dedup()
> ==>[v[1],v[2],v[3],v[4],v[5],v[6]]
> gremlin> a.V().as("a").repeat(both()).times(3).emit().as("b").group().by(select("a")).by(select("b").dedup().order().by(id).fold()).select(values).unfold().dedup()
> ==>[v[1],v[2],v[3],v[4],v[5],v[6]]
> ==>[v[1],v[2],v[3],v[4],v[5],v[6]]
> ==>[v[1],v[2],v[3],v[4],v[5],v[6]]
> ==>[v[1],v[3],v[4],v[5],v[6]]
> ==>[v[1],v[2],v[3],v[4],v[6]]
> ==>[v[1],v[2],v[3],v[4],v[5]]
> gremlin>
> {noformat}
> _Not tested in {{tp31}}._

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
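For readers without a Gremlin console at hand, the OLTP result in the issue description can be checked with a small plain-Python model. This is only a sketch: it assumes the edge list of TinkerPop's "modern" toy graph and treats `repeat(both()).times(3).emit()` as "emit every vertex reached after 1, 2, and 3 undirected hops". The function names are invented for illustration and are not part of TinkerPop.

```python
# Undirected edges of the "modern" toy graph (6 vertices, 6 edges),
# written as pairs of vertex ids. Edge labels are irrelevant for both().
EDGES = [(1, 2), (1, 4), (1, 3), (4, 5), (4, 3), (6, 3)]


def neighbors(v):
    """Adjacent vertex ids of v, ignoring edge direction (like both())."""
    out = []
    for a, b in EDGES:
        if a == v:
            out.append(b)
        elif b == v:
            out.append(a)
    return out


def emitted_within(start, hops):
    """Vertices emitted after each of `hops` iterations, starting at `start`."""
    frontier = [start]
    emitted = set()
    for _ in range(hops):
        frontier = [n for v in frontier for n in neighbors(v)]
        emitted.update(frontier)
    return emitted


def expected_result():
    """Group by start vertex, keep the sorted distinct ids, then dedup the lists."""
    groups = {v: sorted(emitted_within(v, 3)) for v in range(1, 7)}
    distinct = []
    for ids in groups.values():
        if ids not in distinct:
            distinct.append(ids)
    return distinct


print(expected_result())  # [[1, 2, 3, 4, 5, 6]]
```

Every start vertex yields the same sorted list, so a correct global `dedup()` leaves exactly one list, which matches the OLTP output and differs from the six lists returned in OLAP.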
[jira] [Commented] (TINKERPOP-1495) Global list deduplication doesn't work in OLAP
[ https://issues.apache.org/jira/browse/TINKERPOP-1495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15565588#comment-15565588 ]

Marko A. Rodriguez commented on TINKERPOP-1495:
---

The problem is with {{emit()}} in {{RepeatStep}} in OLAP. I fixed the dedup issue in this branch. I'll see what's up with {{repeat()}} (perhaps it's an OLAP traverser bulk side-effect).
[jira] [Commented] (TINKERPOP-1495) Global list deduplication doesn't work in OLAP
[ https://issues.apache.org/jira/browse/TINKERPOP-1495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15565556#comment-15565556 ]

Marko A. Rodriguez commented on TINKERPOP-1495:
---

So I got it working, but then [~dkuppitz]'s query isn't working as expected. However, I think it's a different bug in that query, related to {{select()}} in {{group()}}! Dah! Not sure. Created a branch with the kuppitz test commented out.
[jira] [Commented] (TINKERPOP-1495) Global list deduplication doesn't work in OLAP
[ https://issues.apache.org/jira/browse/TINKERPOP-1495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15565468#comment-15565468 ]

Marko A. Rodriguez commented on TINKERPOP-1495:
---

I know why this is happening. There is a simple solution and there is the real solution. The problem is that the traversal goes into "local mode" after {{groupCount()}} and thus, until an element is seen, the traversal should be treated like an OLTP traversal. However, it's still treated like an OLAP traversal, as {{GraphComputing}} steps need a {{setOnGraphComputer(boolean)}} method to say "we are in local mode, treat as OLTP."
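The idea in the comment above can be pictured with a toy model. The sketch below is not TinkerPop code: `ToyDedupStep`, `set_on_graph_computer`, and `run_after_reduction` are hypothetical names, and the assumption that an OLAP-mode step simply forwards everything (expecting duplicates to be merged elsewhere) is a simplification made only for illustration.

```python
class ToyDedupStep:
    """Toy stand-in for a dedup step; not TinkerPop's implementation."""

    def __init__(self):
        self.on_graph_computer = False
        self.seen = set()

    def set_on_graph_computer(self, flag):
        # Hypothetical analogue of the setOnGraphComputer(boolean)
        # method proposed in the comment.
        self.on_graph_computer = flag

    def process(self, items):
        if self.on_graph_computer:
            # Assumption: in OLAP mode the step forwards every item and
            # relies on a later cross-worker merge to drop duplicates.
            return list(items)
        # OLTP behaviour: filter against a local set of seen values.
        out = []
        for item in items:
            if item not in self.seen:
                self.seen.add(item)
                out.append(item)
        return out


def run_after_reduction(values, tell_step_it_is_local):
    """Run the toy step on values produced after a reducing step."""
    step = ToyDedupStep()
    step.set_on_graph_computer(True)  # the traversal started out in OLAP
    if tell_step_it_is_local:
        # After the reduction the traversal is in "local mode".
        step.set_on_graph_computer(False)
    return step.process(values)


unfolded = [1, 1, 1, 1, 1, 1]
print(run_after_reduction(unfolded, False))  # [1, 1, 1, 1, 1, 1]
print(run_after_reduction(unfolded, True))   # [1]
```

In this model, the six duplicates survive only when the step is never told that the traversal has left distributed mode, which is the shape of the symptom reported in the issue.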
[jira] [Commented] (TINKERPOP-1495) Global list deduplication doesn't work in OLAP
[ https://issues.apache.org/jira/browse/TINKERPOP-1495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15565442#comment-15565442 ]

Marc de Lignie commented on TINKERPOP-1495:
---

Maybe the following is handy for comparison. This works ok:

gremlin> :> g.V().map{it -> 1}.dedup()
==>1
[jira] [Commented] (TINKERPOP-1495) Global list deduplication doesn't work in OLAP
[ https://issues.apache.org/jira/browse/TINKERPOP-1495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15565386#comment-15565386 ]

Marko A. Rodriguez commented on TINKERPOP-1495:
---

Great. This helps a lot. [~dkuppitz] Is this the "full situation" in which this problem occurs? That is, when you {{select(values).unfold().dedup()}}?

{code}
gremlin> g = TinkerFactory.createModern().traversal()
==>graphtraversalsource[tinkergraph[vertices:6 edges:6], standard]
gremlin> g.V().groupCount().select(values).unfold().dedup()
==>1
gremlin> g.withComputer().V().groupCount().select(values).unfold().dedup()
==>1
==>1
==>1
==>1
==>1
==>1
{code}
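As a sanity check on what the reduced traversal should return, the same pipeline can be mimicked with plain Python collections. This is a rough analogue, assuming six distinct vertices as in the modern graph; the function name and the mapping of each line to a Gremlin step are illustrative only.

```python
from collections import Counter


def group_count_values_unfold_dedup(items):
    """Plain-Python analogue of groupCount().select(values).unfold().dedup()."""
    counts = Counter(items)          # groupCount(): item -> occurrences
    values = list(counts.values())   # select(values): the collection of counts
    seen = set()
    result = []
    for value in values:             # unfold(): one entry per count
        if value not in seen:        # dedup(): keep first occurrence only
            seen.add(value)
            result.append(value)
    return result


vertices = ["v1", "v2", "v3", "v4", "v5", "v6"]
print(group_count_values_unfold_dedup(vertices))  # [1]
```

Each vertex is counted once, so all six counts equal 1 and deduplication should collapse them into a single value, as the OLTP run does.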
[jira] [Commented] (TINKERPOP-1495) Global list deduplication doesn't work in OLAP
[ https://issues.apache.org/jira/browse/TINKERPOP-1495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15565366#comment-15565366 ]

Marc de Lignie commented on TINKERPOP-1495:
---

This one looks a lot simpler:

gremlin> graph = GraphFactory.open('conf/hadoop/hadoop-gryo.properties')
==>hadoopgraph[gryoinputformat->gryooutputformat]
gremlin> g = graph.traversal(computer(SparkGraphComputer))
==>graphtraversalsource[hadoopgraph[gryoinputformat->gryooutputformat], sparkgraphcomputer]
gremlin> g.V().groupCount().select(values).unfold().dedup()
==>1
==>1
==>1
==>1
==>1
==>1

In TinkerGraph this gives a single "1". OLAP via remote connection also fails.