[
https://issues.apache.org/jira/browse/S2GRAPH-60?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15199186#comment-15199186
]
ASF GitHub Bot commented on S2GRAPH-60:
---------------------------------------
Github user asfgit closed the pull request at:
https://github.com/apache/incubator-s2graph/pull/43
> Add divide operation to scorePropagateOp
> ----------------------------------------
>
> Key: S2GRAPH-60
> URL: https://issues.apache.org/jira/browse/S2GRAPH-60
> Project: S2Graph
> Issue Type: New Feature
> Reporter: DOYUNG YOON
> Assignee: DOYUNG YOON
> Priority: Trivial
> Labels: newbie, query, score
> Original Estimate: 168h
> Remaining Estimate: 168h
>
> Ratios are a common need in service analysis. The usual way to compute one
> is to divide two counts or aggregated values. S2Graph queries already support
> counting and aggregating values in S2Graph storage, so a ratio can be
> computed by fetching two such values and dividing them. That works, but
> there is a simpler way: perform the calculation inside the S2Graph web
> application, with a single RPC and a single graph query call.
> This issue proposes such a ratio-calculation query.
> Suppose we have two labels, an impression feedback label and a click
> feedback label. We can then get the number of impressions and the number of
> clicks for a user, and from those two values compute the CTR (Click-Through
> Rate) using the two count queries below.
> Impression query
> {noformat}
> {
>   "srcVertices": [{
>     "serviceName": "some_service",
>     "columnName": "user_id",
>     "id": "user_a"
>   }],
>   "steps": [{
>     "step": [{
>       "label": "impression_feedback_label",
>       "direction": "out",
>       "offset": 0,
>       "limit": 100
>     }]
>   }]
> }
> {noformat}
> Click query
> {noformat}
> {
>   "srcVertices": [{
>     "serviceName": "some_service",
>     "columnName": "user_id",
>     "id": "user_a"
>   }],
>   "steps": [{
>     "step": [{
>       "label": "click_feedback_label",
>       "direction": "out",
>       "offset": 0,
>       "limit": 100
>     }]
>   }]
> }
> {noformat}
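The two-query approach can be sketched client-side as follows. This is only an illustration: `count_query` and `click_through_rate` are hypothetical helper names, the query body simply mirrors the JSON in the issue, and the code for sending the queries and counting the returned edges is left out because the issue does not describe the endpoint or the response format.

```python
from typing import Any, Dict


def count_query(label: str, user_id: str = "user_a", limit: int = 100) -> Dict[str, Any]:
    """Build a single-step query body shaped like the two examples in the issue."""
    return {
        "srcVertices": [{
            "serviceName": "some_service",
            "columnName": "user_id",
            "id": user_id,
        }],
        "steps": [{
            "step": [{
                "label": label,
                "direction": "out",
                "offset": 0,
                "limit": limit,
            }]
        }],
    }


def click_through_rate(clicks: int, impressions: int) -> float:
    """Return clicks / impressions, or 0.0 when there are no impressions."""
    if impressions == 0:
        return 0.0
    return clicks / impressions
```

A caller would build one body with `count_query("impression_feedback_label")` and another with `count_query("click_feedback_label")`, send each as its own request, and pass the two resulting counts to `click_through_rate`. That is two round trips, which is the cost the proposed `divide` operation is meant to avoid.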
> After fetching the results of the two queries above, we can compute the CTR.
> However, with a `divide` operation for `scorePropagateOp`, a single query
> is enough.
> {noformat}
> {
>   "limit" : 10,
>   "groupBy" : [ "from" ],
>   "duplicate" : "sum",
>   "srcVertices" : [ {
>     "serviceName" : "some_service",
>     "columnName" : "user_id",
>     "id" : "user_a"
>   } ],
>   "steps" : [ {
>     "step" : [ {
>       "label" : "impression_feedback_label",
>       "direction" : "out",
>       "offset" : 0,
>       "limit" : 10,
>       "groupBy" : [ "from" ],
>       "duplicate" : "countSum",
>       "transform" : [ [ "_from" ] ]
>     } ]
>   }, {
>     "step" : [ {
>       "label" : "click_feedback_label",
>       "direction" : "out",
>       "offset" : 0,
>       "limit" : 10,
>       "scorePropagateOp" : "divide",
>       "scorePropagateShrinkage" : 500
>     } ]
>   } ]
> }
> {noformat}
> There is another query parameter, `scorePropagateShrinkage`, which is used
> to normalize the results. We sort the results by the ratio alone, but a raw
> ratio can be misleading when the counts are small: 1.0 from 1/1 ranks above
> 0.9 from 9/10, even though the latter rests on more data. To compensate,
> `scorePropagateShrinkage` adds a sufficiently large value to the denominator.
> With a shrinkage of 500, the scores become 1 / (1 + 500) = 0.00199600798403
> and 9 / (10 + 500) = 0.01764705882353, so the latter is now the larger value.
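The effect of the shrinkage term on ranking can be checked with a few lines of Python. This is a sketch of the arithmetic only, not of S2Graph's implementation: `shrunk_ratio` is a hypothetical name, and it assumes the shrinkage value is added to each ratio's own denominator, so 9/10 becomes 9 / (10 + 500).

```python
def shrunk_ratio(numerator: float, denominator: float, shrinkage: float = 0.0) -> float:
    """Ratio with a constant added to the denominator (assumed formula)."""
    return numerator / (denominator + shrinkage)


# Two hypothetical users: (numerator, denominator) pairs from the issue text.
candidates = {"one_of_one": (1, 1), "nine_of_ten": (9, 10)}

# Without shrinkage the 1/1 ratio ranks first; with shrinkage 500 the order flips.
raw_order = sorted(candidates, key=lambda k: shrunk_ratio(*candidates[k]), reverse=True)
shrunk_order = sorted(candidates, key=lambda k: shrunk_ratio(*candidates[k], 500), reverse=True)
```

Here `raw_order` puts `one_of_one` first and `shrunk_order` puts `nine_of_ten` first. The larger the shrinkage, the more the score behaves like the raw numerator, so the value should be chosen relative to typical denominator sizes.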
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)