[
https://issues.apache.org/jira/browse/TINKERPOP-2875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17695684#comment-17695684
]
Stephen Mallette edited comment on TINKERPOP-2875 at 3/2/23 12:08 PM:
----------------------------------------------------------------------
> One more question is that I can count the correct number of vertices even if
> we set bulk to false. I think they should be consistent.
I'm not sure. There seems to be some tension between {{withBulk()}} and
injections of {{barrier()}} that can produce situations like the one you show.
I don't know if the inconsistency is desirable for certain exotic use cases.
That would need more consideration.
> Also, I would like to know how can I turn off "Bulk Optimization"?
I think you're confusing {{withBulk()}} with {{barrier()}} injection.
{{withBulk()}} is a little-used configuration that forces bulk to be "1" for
each traverser. When a barrier is injected you won't see it in a {{profile()}}
but will see a {{NoOpBarrierStep}} in the {{explain()}}.
{code}
gremlin> :> g.withBulk(false).E().outV()
==>v[1]
{code}
If you do an {{explain()}} on the above you will see it includes the
{{NoOpBarrierStep}}, which then tries to do the type of bulking you'd like to
avoid. As you've forced bulk to "1" for all traversers, the {{NoOpBarrierStep}}
basically behaves like a {{dedup()}}. In your second query:
{code}
gremlin> :> g.withBulk(false).E().outV().count()
==>2
{code}
If you profile the above you will find that because you {{count()}}, bulking is
not enabled and there are no {{NoOpBarrierStep}} instances, so you get all the
vertices counted. Generally speaking, the {{withBulk()}} setting is for
serious fine-tuning of your traversal mechanics when you know exactly what you
are trying to do. That said, I wonder how necessary it is and if the name
"bulk" has been overloaded. It's easy to think, as you did, that setting it to
{{false}} would disable bulking in general. Perhaps it should...again, not sure.
If you did the {{explain()}} on the first traversal as I suggested, you can
probably see how to disable the bulking you are concerned about. You need to
remove {{LazyBarrierStrategy}} (and any other strategy that injects a {{barrier()}}).
{code}
gremlin> g.withoutStrategies(LazyBarrierStrategy).withBulk(false).E().outV()
==>v[1]
==>v[1]
==>v[1]
==>v[4]
==>v[4]
==>v[6]
gremlin> g.withoutStrategies(LazyBarrierStrategy).withBulk(false).E().outV().count()
==>6
{code}
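To make the interaction described above easier to see, here is a minimal sketch in plain Java. It is a toy model, not TinkerPop code: {{BulkModel}}, {{Traverser}}, {{barrier}}, {{emit}}, {{count}} and {{outVOfAllEdges}} are all hypothetical names invented for illustration, and the merge logic is only an assumption about how a barrier combines traversers. It models the two-edge graph from the issue, where both edges leave v[1].

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class BulkModel {
    /** Hypothetical stand-in for a traverser: an element label plus its bulk. */
    static final class Traverser {
        final String element;
        final long bulk;

        Traverser(String element, long bulk) {
            this.element = element;
            this.bulk = bulk;
        }
    }

    /**
     * Toy barrier: merges traversers that sit on the same element.
     * With bulking enabled the bulks are summed; with bulking disabled
     * the merged traverser is forced back to a bulk of 1.
     */
    static List<Traverser> barrier(List<Traverser> in, boolean bulkEnabled) {
        Map<String, Long> merged = new LinkedHashMap<>();
        for (Traverser t : in) {
            merged.merge(t.element, t.bulk, Long::sum);
        }
        List<Traverser> out = new ArrayList<>();
        for (Map.Entry<String, Long> e : merged.entrySet()) {
            out.add(new Traverser(e.getKey(), bulkEnabled ? e.getValue() : 1L));
        }
        return out;
    }

    /** Expands each traverser into as many results as its bulk says. */
    static List<String> emit(List<Traverser> in) {
        List<String> results = new ArrayList<>();
        for (Traverser t : in) {
            for (long i = 0; i < t.bulk; i++) {
                results.add(t.element);
            }
        }
        return results;
    }

    /** Counting sums the bulks of whatever traversers arrive. */
    static long count(List<Traverser> in) {
        long total = 0;
        for (Traverser t : in) {
            total += t.bulk;
        }
        return total;
    }

    /** Models E().outV() on the graph from the issue: both edges leave v[1]. */
    static List<Traverser> outVOfAllEdges() {
        List<Traverser> ts = new ArrayList<>();
        ts.add(new Traverser("v[1]", 1));
        ts.add(new Traverser("v[1]", 1));
        return ts;
    }

    public static void main(String[] args) {
        // Barrier with bulking: one traverser of bulk 2, so both results survive.
        System.out.println(emit(barrier(outVOfAllEdges(), true)));  // [v[1], v[1]]
        // Barrier with bulk forced to 1: the merge acts like a dedup.
        System.out.println(emit(barrier(outVOfAllEdges(), false))); // [v[1]]
        // No barrier before counting: every traverser is counted.
        System.out.println(count(outVOfAllEdges()));                // 2
        // No barrier at all: duplicates come through untouched.
        System.out.println(emit(outVOfAllEdges()));                 // [v[1], v[1]]
    }
}
```

In this model the duplicate only disappears when a barrier merges traversers while bulk is pinned to 1, which mirrors why removing the barrier-injecting strategy restores the duplicates in the console session above.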
> Duplicate elements are omitted when turning off bulk optimization
> -----------------------------------------------------------------
>
> Key: TINKERPOP-2875
> URL: https://issues.apache.org/jira/browse/TINKERPOP-2875
> Project: TinkerPop
> Issue Type: Bug
> Components: process
> Affects Versions: 3.5.5
> Reporter: Lei Tang
> Priority: Major
>
> I create three vertices and two edges.
> {code:java}
> Vertex v1 = g.addV("vl0").property("vp0", 1).next(); // v[1]
> Vertex v2 = g.addV("vl0").property("vp0", 2).next(); // v[2]
> Vertex v3 = g.addV("vl0").property("vp0", 3).next(); // v[3]
> Edge e1 = g.addE("el0").from(v1).to(v2).next();
> Edge e2 = g.addE("el0").from(v1).to(v3).next();{code}
> When I execute the query 'g.E().outV()' without using bulk optimization, I
> expect the result \{v1,v1} is returned. However, it removes the duplicate
> vertices.
> {code:java}
> gremlin> :> g.withBulk(false).E().outV()
> ==>v[1]
> gremlin> :> g.E().outV()
> ==>v[1]
> ==>v[1]
> {code}
> Since I do not use bulk operations in this query, I expect that even if we
> turn off bulk optimization, we can compute the correct result.