[
https://issues.apache.org/jira/browse/S2GRAPH-26?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15109929#comment-15109929
]
ASF GitHub Bot commented on S2GRAPH-26:
---------------------------------------
GitHub user SteamShon opened a pull request:
https://github.com/apache/incubator-s2graph/pull/14
[S2GRAPH-26] Apply squash optimization on mutateElements and make it
default behavior on Storage.
+ Change Graph.mutateElements to use mutateEdges and mutateVertices to use
squash optimization.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/SteamShon/incubator-s2graph S2GRAPH-26
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/incubator-s2graph/pull/14.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #14
----
commit e7518db227c947a1f53195c13819ecd567e0ce16
Author: DO YUNG YOON <[email protected]>
Date: 2016-01-21T02:29:56Z
Change Graph.mutateElements to use mutateEdges and mutateVertices to use
squash optimization.
----
> Apply squash optimization on mutateElements and make it default behavior on
> Storage.
> ------------------------------------------------------------------------------------
>
> Key: S2GRAPH-26
> URL: https://issues.apache.org/jira/browse/S2GRAPH-26
> Project: S2Graph
> Issue Type: Improvement
> Reporter: DOYUNG YOON
> Assignee: DOYUNG YOON
> Labels: optimization, write
> Original Estimate: 72h
> Remaining Estimate: 72h
>
> Currently AsynchbaseStorage, which implements the Storage trait on top of
> HBase, applies the squash optimization to snapshot edges in mutateEdges.
> For example, if the requests on the same snapshotEdge are (insert, delete,
> insert, delete, insert) in timestamp order, then we only need to apply the
> last insert.
> ex) Let's assume that there are 5 requests on the edge from “shon” to “dun”
> with label “friend”: insert(t0), delete(t1), insert(t2), delete(t3),
> insert(t4). Without squashing the requests on the same snapshot edge in
> memory, the following 5 actions need to be done for every request.
> # fetch the snapshot edge.
> # lock this snapshot edge.
> # build the new update, delete, insert, and degree mutations on IndexEdges.
> # fire the above update/delete/insert/degree mutations into HBase.
> # release the lock on this snapshotEdge.
> With squashing, we can do steps 1, 2, and 5 (fetch, lock, release lock) only
> once and squash the mutations built from the multiple requests.
> The above logic is needed to keep multiple indexEdges consistent: its
> purpose is to ensure that only one thread at a time can mutate a given
> snapshot edge (note the lock). Since we acquire a lock on this edge anyway,
> it is much more efficient to squash multiple requests on the same
> snapshotEdge and avoid repeating the heavy operations: fetch, lock, and
> release lock.
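
The squashing described in the issue can be sketched roughly as follows. This
is a minimal illustration in Python, not S2Graph's Scala code: the names
Request, squash and mutate_edges are hypothetical, only the insert/delete case
from the example is covered, and the fetch/lock/release cycle is reduced to a
counter instead of real HBase calls.

```python
from typing import NamedTuple


class Request(NamedTuple):
    """One mutation request on an edge (hypothetical type, for illustration)."""
    src: str
    tgt: str
    label: str
    op: str  # "insert" or "delete"
    ts: int  # request timestamp


def squash(requests):
    """Keep only the latest request (by timestamp) for each snapshot edge."""
    latest = {}
    for req in requests:
        key = (req.src, req.tgt, req.label)  # identifies one snapshot edge
        if key not in latest or req.ts >= latest[key].ts:
            latest[key] = req
    return latest


def mutate_edges(requests):
    """Apply the squashed requests and count fetch/lock/release cycles."""
    lock_cycles = 0
    exists = {}
    for key, req in squash(requests).items():
        # Assumption: fetch, lock and release would happen here, once per
        # snapshot edge rather than once per request.
        lock_cycles += 1
        exists[key] = req.op == "insert"
    return exists, lock_cycles


# The five requests from the example, deliberately out of timestamp order.
requests = [
    Request("shon", "dun", "friend", "delete", 3),
    Request("shon", "dun", "friend", "insert", 0),
    Request("shon", "dun", "friend", "insert", 4),
    Request("shon", "dun", "friend", "delete", 1),
    Request("shon", "dun", "friend", "insert", 2),
]
exists, lock_cycles = mutate_edges(requests)
```

For the five requests in the example, only insert(t4) survives the squash, so
the snapshot edge goes through a single fetch/lock/release cycle instead of
five.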
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)