DOYUNG YOON created S2GRAPH-123:
-----------------------------------
Summary: Support different index on out/in direction.
Key: S2GRAPH-123
URL: https://issues.apache.org/jira/browse/S2GRAPH-123
Project: S2Graph
Issue Type: New Feature
Affects Versions: 0.2.0
Reporter: DOYUNG YOON
Assignee: DOYUNG YOON
Fix For: 0.2.0
In some situations, users may want different behavior depending on the
`direction` of an edge.
In my experience deploying and operating S2Graph on users' news article
click activity, it is extremely common for a few articles to receive most of
the clicks.
To describe the problem more formally, suppose we have a `user_article_click`
label where each edge has a `user_id` as its source vertex and an `article_id`
as its target vertex.
In this case, edges in the 'out' direction are spread evenly because we
prepend a murmur hash to the row key. We also have very few edges per source
vertex (`user_id`), since no individual can click millions of articles.
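To illustrate why a hash prefix spreads 'out' direction rows evenly, here is a minimal sketch. It is not S2Graph's actual row key layout: `salted_row_key`, `bucket_of`, the 2-byte prefix, and the 8 buckets are all hypothetical, and `hashlib.md5` is used as a standard-library stand-in for murmur hash.

```python
import hashlib
from collections import Counter


def salted_row_key(vertex_id: str, prefix_bytes: int = 2) -> bytes:
    """Prepend a short hash of the vertex id so keys scatter over the key space.

    md5 is only a stand-in here; the issue text mentions murmur hash.
    """
    raw = vertex_id.encode("utf-8")
    digest = hashlib.md5(raw).digest()
    return digest[:prefix_bytes] + raw


def bucket_of(row_key: bytes, num_buckets: int) -> int:
    """Map the first two prefix bytes to one of num_buckets hypothetical regions."""
    return int.from_bytes(row_key[:2], "big") * num_buckets // 65536


# 10,000 distinct source vertices, each owning one small row.
keys = [salted_row_key(f"user_{i}") for i in range(10_000)]
load = Counter(bucket_of(k, 8) for k in keys)
```

With many distinct `user_id` values and only a few edges behind each, the rows per bucket come out roughly equal, which is the even spread described above.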
However, the 'in' direction, which holds all edges connecting every `user_id`
to each `article_id`, is a different story. Only a few `article_id`s get a
large number of clicks from millions of users, and each of them quickly becomes
a `super node`. This leads to excessive region server resource usage, and
keeping millions of edges on one single source vertex is not reasonable anyway,
because sending millions of edges to a client would time out.
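The asymmetry between the two directions can be seen on a small synthetic click log. This is only a sketch: the data is made up for illustration (every user clicks one hot article plus one of 100 ordinary ones), and none of the names come from S2Graph.

```python
from collections import Counter

# Synthetic (user_id, article_id) edges for a `user_article_click` label.
clicks = []
for u in range(1000):
    clicks.append((f"user_{u}", "article_hot"))        # the popular article
    clicks.append((f"user_{u}", f"article_{u % 100}"))  # an ordinary article

# 'out' direction: edges grouped by source vertex (user_id).
out_degree = Counter(src for src, _ in clicks)
# 'in' direction: edges grouped by target vertex (article_id).
in_degree = Counter(tgt for _, tgt in clicks)

largest_out = max(out_degree.values())
largest_in = max(in_degree.values())
```

Here every user has an out-degree of 2, while `article_hot` collects one edge from every user. Scaled up to millions of users, that single `article_id` is the `super node`.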
Currently, there is no way to control how edges are processed for each
direction, but the case above can be avoided if we provide options for it.
I suggest a new feature that provides a separate index, with write options,
for each `direction`.
Possible write options are as follows (based on our write transaction steps).
# `IndexEdge`: dropAll/sampling/storeAll(default)
# `SnapshotEdge`: drop/store(default)
# `Degree`: ignore/update(default)
By enabling or disabling each element of the write transaction, users can
decide what to do when they know what their data will look like.
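One way the proposed options could be modeled is sketched below. Only the option names (`dropAll`/`sampling`/`storeAll`, `drop`/`store`, `ignore`/`update`) and their defaults come from the list above; the `WriteOptions` class, the `sample_rate` field, and `planned_writes` are hypothetical and are not an existing S2Graph API.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class IndexEdgeOption(Enum):
    DROP_ALL = "dropAll"
    SAMPLING = "sampling"
    STORE_ALL = "storeAll"  # default


class SnapshotEdgeOption(Enum):
    DROP = "drop"
    STORE = "store"  # default


class DegreeOption(Enum):
    IGNORE = "ignore"
    UPDATE = "update"  # default


@dataclass(frozen=True)
class WriteOptions:
    index_edge: IndexEdgeOption = IndexEdgeOption.STORE_ALL
    snapshot_edge: SnapshotEdgeOption = SnapshotEdgeOption.STORE
    degree: DegreeOption = DegreeOption.UPDATE
    sample_rate: float = 1.0  # hypothetical; only consulted for SAMPLING


def planned_writes(opts: WriteOptions, draw: float) -> List[str]:
    """List the write-transaction elements that would be written for one edge.

    `draw` is a number in [0, 1) supplied by the caller (e.g. random.random()),
    so the sampling decision stays easy to test.
    """
    writes = []
    sampled_in = opts.index_edge is IndexEdgeOption.SAMPLING and draw < opts.sample_rate
    if opts.index_edge is IndexEdgeOption.STORE_ALL or sampled_in:
        writes.append("IndexEdge")
    if opts.snapshot_edge is SnapshotEdgeOption.STORE:
        writes.append("SnapshotEdge")
    if opts.degree is DegreeOption.UPDATE:
        writes.append("Degree")
    return writes


# Hypothetical per-direction settings for the `user_article_click` label.
label_options = {
    "out": WriteOptions(),  # all defaults: store everything
    "in": WriteOptions(index_edge=IndexEdgeOption.SAMPLING, sample_rate=0.01),
}
```

With these settings, the 'out' direction keeps the default behavior, while the 'in' direction stores an `IndexEdge` for about 1% of the edges and still writes the `SnapshotEdge` and updates the `Degree`.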
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)