Hi All, I propose that the semantics of the repeat() step should be changed starting 3.8.x. In particular, the repeat() step uses a hybrid execution approach that can be unintuitive. Initially, a single traverser is passed to the repeat traversal, so in its first iteration, there is only a single starting traverser. But in subsequent iterations there can be multiple starting traversers, if the first iteration produces multiple traversers. It may make more sense to allow multiple starting traversers in the first iteration to make it consistent throughout the traversal. This would make the entire repeat traversal a "global child" rather than a mix of "global and local child". The most significant impact would be on barriers, as they could now pull in everything even in the first iteration.
Current: gremlin> g.V().repeat(both().order()).times(2).path() ==>[v[1],v[2],v[1]] ==>[v[1],v[3],v[1]] ==>[v[1],v[4],v[1]] ==>[v[1],v[4],v[3]] ==>[v[1],v[3],v[4]] ==>[v[1],v[4],v[5]] ==>[v[1],v[3],v[6]] ==>[v[2],v[1],v[2]] ==>[v[2],v[1],v[3]] ==>[v[2],v[1],v[4]] ==>[v[3],v[4],v[1]] ==>[v[3],v[1],v[2]] ==>[v[3],v[1],v[3]] ==>[v[3],v[4],v[3]] ==>[v[3],v[6],v[3]] ==>[v[3],v[1],v[4]] ==>[v[3],v[4],v[5]] ==>[v[4],v[3],v[1]] ==>[v[4],v[1],v[2]] ==>[v[4],v[1],v[3]] ==>[v[4],v[1],v[4]] ==>[v[4],v[3],v[4]] ==>[v[4],v[5],v[4]] ==>[v[4],v[3],v[6]] ==>[v[5],v[4],v[1]] ==>[v[5],v[4],v[3]] ==>[v[5],v[4],v[5]] ==>[v[6],v[3],v[1]] ==>[v[6],v[3],v[4]] ==>[v[6],v[3],v[6]] Proposed: gremlin> g.V().repeat(both().order()).times(2).path() ==>[v[1],v[2],v[1]] ==>[v[1],v[3],v[1]] ==>[v[4],v[3],v[1]] ==>[v[6],v[3],v[1]] ==>[v[1],v[4],v[1]] ==>[v[3],v[4],v[1]] ==>[v[5],v[4],v[1]] ==>[v[2],v[1],v[2]] ==>[v[3],v[1],v[2]] ==>[v[4],v[1],v[2]] ==>[v[2],v[1],v[3]] ==>[v[3],v[1],v[3]] ==>[v[4],v[1],v[3]] ==>[v[1],v[4],v[3]] ==>[v[3],v[4],v[3]] ==>[v[5],v[4],v[3]] ==>[v[3],v[6],v[3]] ==>[v[2],v[1],v[4]] ==>[v[3],v[1],v[4]] ==>[v[4],v[1],v[4]] ==>[v[1],v[3],v[4]] ==>[v[4],v[3],v[4]] ==>[v[6],v[3],v[4]] ==>[v[4],v[5],v[4]] ==>[v[1],v[4],v[5]] ==>[v[3],v[4],v[5]] ==>[v[5],v[4],v[5]] ==>[v[1],v[3],v[6]] ==>[v[4],v[3],v[6]] ==>[v[6],v[3],v[6]] For comparison, here's what the unrolled query does: gremlin> g.V().both().order().both().order().path() ==>[v[1],v[2],v[1]] ==>[v[1],v[3],v[1]] ==>[v[4],v[3],v[1]] ==>[v[6],v[3],v[1]] ==>[v[1],v[4],v[1]] ==>[v[3],v[4],v[1]] ==>[v[5],v[4],v[1]] ==>[v[2],v[1],v[2]] ==>[v[3],v[1],v[2]] ==>[v[4],v[1],v[2]] ==>[v[2],v[1],v[3]] ==>[v[3],v[1],v[3]] ==>[v[4],v[1],v[3]] ==>[v[1],v[4],v[3]] ==>[v[3],v[4],v[3]] ==>[v[5],v[4],v[3]] ==>[v[3],v[6],v[3]] ==>[v[2],v[1],v[4]] ==>[v[3],v[1],v[4]] ==>[v[4],v[1],v[4]] ==>[v[1],v[3],v[4]] ==>[v[4],v[3],v[4]] ==>[v[6],v[3],v[4]] ==>[v[4],v[5],v[4]] ==>[v[1],v[4],v[5]] ==>[v[3],v[4],v[5]] ==>[v[5],v[4],v[5]] ==>[v[1],v[3],v[6]] ==>[v[4],v[3],v[6]] ==>[v[6],v[3],v[6]] Notice how currently, the elements are ordered based on the starting traverser while in the proposed solution the traversers are globally ordered per iteration and so the final iteration is sorted. Also, the unrolled query is generally what I expect the result to be and it matches the proposed semantics. Any questions or comments? Thanks, Ken
