[
https://issues.apache.org/jira/browse/TINKERPOP-2376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17159237#comment-17159237
]
ASF GitHub Bot commented on TINKERPOP-2376:
-------------------------------------------
spmallette merged pull request #1301:
URL: https://github.com/apache/tinkerpop/pull/1301
> Probability distribution controlled by weight when using sample step
> --------------------------------------------------------------------
>
> Key: TINKERPOP-2376
> URL: https://issues.apache.org/jira/browse/TINKERPOP-2376
> Project: TinkerPop
> Issue Type: Improvement
> Components: process
> Affects Versions: 3.4.6
> Environment: Gremlin-Tinkerpop 3.4.6 on Fedora 32
> Reporter: zjxian
> Assignee: Stephen Mallette
> Priority: Minor
> Attachments: SampleGlobalStep.java, out.csv
>
>
> Create a simple graph with 1 central node and 3 surrounding nodes.
> Add 3 edges with equal weight (1) to form a star graph.
> Traverse from the center ( v[0] ) to the other 3 nodes, sample(1), and record the
> destination node.
> Do that 10000 times.
> Expected probability distribution:
> v[1]:v[2]:v[3] = 3333:3333:3333 (1:1:1)
> What I got:
> v[1]:v[2]:v[3] = 3320:4439:2241
> I've checked some source files, such as
> ([https://github.com/apache/tinkerpop/blob/master/gremlin-core/src/main/java/org/apache/tinkerpop/gremlin/process/traversal/step/filter/SampleGlobalStep.java]).
> Based on that code, the probability distribution should come out as 1/3:4/9:2/9,
> which is very close to the results I got.
> I think some improvement is needed here to make "random walk" in TinkerPop
> really useful.
> The script I used:
> {code:java}
> // Gremlin Console script
> conf = new BaseConfiguration()
> conf.setProperty("gremlin.tinkergraph.vertexIdManager", "LONG")
> conf.setProperty("gremlin.tinkergraph.edgeIdManager", "LONG")
> conf.setProperty("gremlin.tinkergraph.vertexPropertyIdManager", "LONG")
> graph = TinkerGraph.open(conf)
> g = graph.traversal()
> // create the center v[0] and the three surrounding vertices v[1]..v[3]
> for (i = 0; i <= 3; i++) {
>   g.addV().iterate()
> }
> // connect the center to each surrounding vertex with weight 1
> for (i = 1; i <= 3; i++) {
>   g.V(0).addE("connect").property("weight", 1).to(g.V(i)).iterate()
> }
> // start from a fresh output file with a header row
> ["bash", "-c", "rm -f out.csv"].execute().waitFor()
> file = new File("out.csv")
> file.append("id\r\n")
> // sample one weighted out-edge 10000 times and record the destination id
> for (i = 0; i < 10000; i++) {
>   g.V(0).outE().sample(1).by("weight").otherV().map { file.append(it.get().id() + "\r\n") }.iterate()
> }
> {code}
> See the results in the attached out.csv.
>
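To make the 1/3:4/9:2/9 figure in the report concrete, here is a minimal Java sketch of one selection scheme that would produce exactly those ratios: scan the candidates in order, draw a fresh random number for each, and accept a candidate when the draw is at most the cumulative weight fraction so far. This is an assumption made for illustration and is not copied from SampleGlobalStep.java. The class and method names (SampleBiasSketch, sequentialDrawDistribution, singleDrawPick) are hypothetical. For contrast, singleDrawPick shows the usual single-draw weighted choice, which is uniform when all weights are equal.

```java
import java.util.Random;

final class SampleBiasSketch {

    // Exact selection probabilities for the hypothetical "fresh draw per candidate" scan:
    // candidate k is accepted when a new uniform draw is <= (cumulative weight up to k) / total.
    static double[] sequentialDrawDistribution(double[] weights) {
        double total = 0.0;
        for (double w : weights) {
            total += w;
        }
        double[] result = new double[weights.length];
        double reachProbability = 1.0; // probability that no earlier candidate was accepted
        double running = 0.0;
        for (int k = 0; k < weights.length; k++) {
            running += weights[k];
            double accept = Math.min(1.0, running / total);
            result[k] = reachProbability * accept;
            reachProbability *= (1.0 - accept);
        }
        return result;
    }

    // Weighted choice with a single uniform draw compared against the cumulative weights.
    static int singleDrawPick(double[] weights, Random random) {
        double total = 0.0;
        for (double w : weights) {
            total += w;
        }
        double target = random.nextDouble() * total;
        double running = 0.0;
        for (int k = 0; k < weights.length; k++) {
            running += weights[k];
            if (target < running) {
                return k;
            }
        }
        return weights.length - 1; // guard against floating-point round-off
    }

    public static void main(String[] args) {
        double[] weights = {1.0, 1.0, 1.0}; // the three equal-weight edges of the star graph
        double[] biased = sequentialDrawDistribution(weights);
        System.out.printf("fresh draw per candidate: %.4f : %.4f : %.4f%n",
                biased[0], biased[1], biased[2]);

        int[] counts = new int[weights.length];
        Random random = new Random(42L);
        for (int i = 0; i < 10000; i++) {
            counts[singleDrawPick(weights, random)]++;
        }
        System.out.printf("single draw, 10000 runs: %d : %d : %d%n",
                counts[0], counts[1], counts[2]);
    }
}
```

With three equal weights, the first method yields 1/3, (2/3)(2/3) = 4/9 and (2/3)(1/3)(1) = 2/9. Scaled to 10000 runs, that is roughly 3333:4444:2222, close to the reported 3320:4439:2241.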