[ 
https://issues.apache.org/jira/browse/S2GRAPH-60?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

DOYUNG YOON updated S2GRAPH-60:
-------------------------------
    Summary: Add divide operation to scorePropagateOp  (was: Add “divide” 
operation to “scorePropagateOp")

> Add divide operation to scorePropagateOp
> ----------------------------------------
>
>                 Key: S2GRAPH-60
>                 URL: https://issues.apache.org/jira/browse/S2GRAPH-60
>             Project: S2Graph
>          Issue Type: New Feature
>            Reporter: DOYUNG YOON
>            Assignee: DOYUNG YOON
>            Priority: Trivial
>              Labels: newbie, query, score
>   Original Estimate: 168h
>  Remaining Estimate: 168h
>
> Ratio values are a common need in service analysis. The usual way to compute 
> a ratio is to divide one count or aggregate by another. S2Graph queries 
> already support counting and aggregating values within S2Graph storage, so a 
> client can fetch those values and divide them itself. That works, but there 
> is a simpler way: perform the division inside the S2Graph web application, 
> so that the ratio comes back from a single RPC, that is, one graph query 
> call.
> This issue proposes such a ratio calculation query. 
> Suppose we have two labels (an impression feedback label and a click 
> feedback label). We can then get the number of impressions and the number of 
> clicks for a user, and from those two values compute the CTR (Click Through 
> Rate) with the two count queries below.
> Impression query
> {noformat}
> {
>   "srcVertices": [{
>     "serviceName": "some_service",
>     "columnName": "user_id",
>     "id": "user_a"
>   }],
>   "steps": [{
>     "step": [{
>       "label": "impression_feedback_label",
>       "direction": "out",
>       "offset": 0,
>       "limit": 100
>     }]
>   }]
> }
> {noformat}
> Click query
> {noformat}
> {
>   "srcVertices": [{
>     "serviceName": "some_service",
>     "columnName": "user_id",
>     "id": "user_a"
>   }],
>   "steps": [{
>     "step": [{
>       "label": "click_feedback_label",
>       "direction": "out",
>       "offset": 0,
>       "limit": 100
>     }]
>   }]
> }
> {noformat}
> After fetching the results of the two queries above, we can compute the CTR.
> However, we can do this in a single query by adding a `divide` operation to 
> `scorePropagateOp`.
> {noformat}
> {
>   "limit" : 10,
>   "groupBy" : [ "from" ],
>   "duplicate" : "sum",
>   "srcVertices" : [ {
>     "serviceName" : "some_service",
>     "columnName" : "user_id",
>     "id" : "user_a"
>   } ],
>   "steps" : [ {
>     "step" : [ {
>       "label" : "impression_feedback_label",
>       "direction" : "out",
>       "offset" : 0,
>       "limit" : 10,
>       "groupBy" : [ "from" ],
>       "duplicate" : "countSum",
>       "transform" : [ [ "_from" ] ]
>     } ]
>   }, {
>     "step" : [ {
>       "label": "click_feedback_label",
>       "direction" : "out",
>       "offset" : 0,
>       "limit" : 10,
>       "scorePropagateOp" : "divide",
>       "scorePropagateShrinkage" : 500
>     } ]
>   } ]
> }
> {noformat}
> There is another query parameter, `scorePropagateShrinkage`, which is used 
> to normalize the results. We sort the results by the ratio value alone, but 
> a raw ratio can be misleading when the counts are small: a ratio of 1.0 from 
> 1/1 ranks above 0.9 from 9/10. For this reason, we can add a sufficiently 
> large `scorePropagateShrinkage` value to the denominator. With a shrinkage of 
> 500, the scores become 1 / (1 + 500) = 0.00199600798403 and 
> 9 / (10 + 500) = 0.01764705882353, so the latter is now the larger value.
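
The effect of the shrinkage term on ranking can be sketched in a few lines of Python. This is only an illustration of the arithmetic described in the issue, not S2Graph code: the function name `shrunk_ratio`, the candidate names, and the choice to score an empty denominator as 0 are all assumptions made for the example. Each candidate's shrinkage is added to its own denominator.

```python
def shrunk_ratio(numerator: float, denominator: float, shrinkage: float = 0.0) -> float:
    """Return numerator / (denominator + shrinkage).

    Hypothetical helper; the name and signature are not part of S2Graph.
    """
    total = denominator + shrinkage
    if total == 0:
        # Assumption: a candidate with nothing in the denominator scores 0.
        return 0.0
    return numerator / total


# (clicks, impressions) for two hypothetical candidates.
candidates = {"one_of_one": (1, 1), "nine_of_ten": (9, 10)}

# Raw ratio: the 1/1 candidate wins even though it has almost no data.
plain = {name: shrunk_ratio(c, i) for name, (c, i) in candidates.items()}

# Shrinkage of 500 added to each denominator: the 9/10 candidate now wins.
shrunk = {name: shrunk_ratio(c, i, 500) for name, (c, i) in candidates.items()}

best_plain = max(plain, key=plain.get)
best_shrunk = max(shrunk, key=shrunk.get)
print(best_plain, best_shrunk)
```

A large shrinkage value pulls every ratio toward zero, but it penalizes candidates with small denominators the most, which is why the ordering flips.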



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
