[
https://issues.apache.org/jira/browse/S2GRAPH-26?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15101269#comment-15101269
]
DOYUNG YOON commented on S2GRAPH-26:
------------------------------------
Actually, I am suggesting dropping mutateEdge and mutateVertex on Graph, which
process a single request.
The reason is that squashing multiple edges according to their edge reference
(from, to, label, direction) is very cheap, and it can be applied to any storage
backend (that is, it is not specific to the HBase storage).
I also suggest changing mutateElements to call mutateEdges/mutateVertices after
splitting the input graph elements into a sequence of edges and a sequence of
vertices.
I think this refactoring will reduce the chance of a caller bypassing the
squash optimization.
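
A minimal sketch of the split-and-group step proposed above. Every type and
method name here (GraphElement, Vertex, Edge, EdgeKey, MutateSketch, split,
groupByReference) is hypothetical and simplified for illustration; it is not
the actual S2Graph API.

```scala
// Hypothetical, simplified element types; the real S2Graph classes differ.
sealed trait GraphElement
final case class Vertex(id: String, ts: Long) extends GraphElement
final case class Edge(from: String, to: String, label: String,
                      direction: String, op: String, ts: Long) extends GraphElement

// The edge reference (from, to, label, direction) used as the squash key.
final case class EdgeKey(from: String, to: String, label: String, direction: String)

object MutateSketch {
  def keyOf(e: Edge): EdgeKey = EdgeKey(e.from, e.to, e.label, e.direction)

  // Split a mixed batch into a sequence of edges and a sequence of vertices,
  // preserving the input order within each sequence.
  def split(elements: Seq[GraphElement]): (Seq[Edge], Seq[Vertex]) = {
    val edges = elements.collect { case e: Edge => e }
    val vertices = elements.collect { case v: Vertex => v }
    (edges, vertices)
  }

  // Group edges by their reference so that all requests on the same edge
  // can be handled together, independent of the storage backend.
  def groupByReference(edges: Seq[Edge]): Map[EdgeKey, Seq[Edge]] =
    edges.groupBy(keyOf)
}
```

In this sketch a mutateElements implementation would call split once and hand
each sequence to mutateEdges or mutateVertices, so single-request callers go
through the same grouping path.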
> Apply squash optimization on mutateElements and make it default behavior on
> Storage.
> ------------------------------------------------------------------------------------
>
> Key: S2GRAPH-26
> URL: https://issues.apache.org/jira/browse/S2GRAPH-26
> Project: S2Graph
> Issue Type: Improvement
> Reporter: DOYUNG YOON
> Assignee: DOYUNG YOON
> Labels: optimization, write
> Original Estimate: 72h
> Remaining Estimate: 72h
>
> Currently AsynchbaseStorage, which implements the Storage trait using HBase,
> applies the squash optimization to snapshot edges in mutateEdges.
> For example, if the requests on the same snapshotEdge consist of
> (insert, delete, insert, delete, insert) in timestamp order, then we
> only need to apply the last insert.
> ex) Let's assume that there are 5 requests from “shon” to “dun” with label
> “friend”: insert(t0), delete(t1), insert(t2), delete(t3), insert(t4).
> Without squashing requests on the same snapshot edge in memory, the following
> 5 actions need to be done for each request.
> # fetch the snapshot edge.
> # lock this snapshot edge.
> # build the new update, delete, insert, and degree mutations on IndexEdges.
> # fire the above update/delete/insert/degree mutations into HBase.
> # release the lock on this snapshotEdge.
> Instead, we can do steps 1, 2, and 5 (fetch, lock, release lock) once and
> squash the mutations built from the multiple requests.
> The above logic is needed to keep multiple indexEdges consistent; its purpose
> is to allow only one thread at a time to mutate the same snapshot edge (note
> the lock). Since we acquire a lock on this edge anyway, it is much more
> efficient to squash multiple requests on the same snapshotEdge and avoid
> repeating the heavy operations: fetch, lock, release lock.
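
The squash in the quoted example can be sketched as follows, assuming (as in
that example) that only the request with the latest timestamp determines the
final state of the snapshot edge. Request and SquashSketch are hypothetical
names for illustration, not S2Graph classes, and the real optimization may
need to merge mutations rather than keep just one.

```scala
// Hypothetical request type: op is "insert" or "delete", ts is the request timestamp.
final case class Request(op: String, ts: Long)

object SquashSketch {
  // Keep only the request with the greatest timestamp; None for an empty batch.
  // The input does not need to be sorted.
  def squash(requests: Seq[Request]): Option[Request] =
    if (requests.isEmpty) None else Some(requests.maxBy(_.ts))
}
```

With the five requests insert(t0), delete(t1), insert(t2), delete(t3),
insert(t4) as input, squash returns only insert(t4), so fetch, lock, and
release lock run once instead of five times.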