[ https://issues.apache.org/jira/browse/S2GRAPH-226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16545018#comment-16545018 ]
ASF GitHub Bot commented on S2GRAPH-226:
----------------------------------------

GitHub user SteamShon opened a pull request:

    https://github.com/apache/incubator-s2graph/pull/182

    [S2GRAPH-226]: Provide example spark jobs to explain how to utilize WAL log.

    - initial commit: add wal package, WalLog class, UserDefinedAggregateFunction.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/SteamShon/incubator-s2graph S2GRAPH-226

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/incubator-s2graph/pull/182.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #182

----
commit bbc64682dee2fa6cd269ee2b90a010ec29cacd45
Author: DO YUNG YOON <steamshon@...>
Date:   2018-07-16T09:58:32Z

    add wal package, WalLog class, UserDefinedAggregateFunction.

----

> Provide example spark jobs to explain how to utilize WAL log.
> -------------------------------------------------------------
>
>                 Key: S2GRAPH-226
>                 URL: https://issues.apache.org/jira/browse/S2GRAPH-226
>             Project: S2Graph
>          Issue Type: New Feature
>          Components: s2core, s2jobs
>            Reporter: DOYUNG YOON
>            Assignee: DOYUNG YOON
>            Priority: Major
>   Original Estimate: 336h
>  Remaining Estimate: 336h
>
> Even though s2graph publishes all incoming vertices and edges to Kafka, there is no
> example showing how to use this WAL log.
> I suggest adding a simple example showing how to process the WAL, and let me
> explain which use cases this example can benefit.
> At Kakao, s2graph has been used as the fact storage, which stores all of a user's
> activities, such as clicking content, buying a product, or issuing a search query.
> {noformat}
> [{
>   "timestamp": 1,
>   "elem": "e",
>   "from": "steamshon",
>   "to": "s2graph",
>   "label": "search_query",
>   "props": {}
> }, {
>   "timestamp": 10,
>   "elem": "e",
>   "from": "steamshon",
>   "to": "github.com/apache/incubator-s2graph",
>   "label": "content_click",
>   "props": {}
> }, {
>   "timestamp": 12,
>   "elem": "v",
>   "id": "steamshon",
>   "serviceName": "s2graph",
>   "columnName": "user",
>   "props": {
>     "gender": "M"
>   }
> }]
> {noformat}
> Each activity (a label, in s2graph terms) forms its own graph, but
> when they are all connected together, they give much more information.
> The edges above can be aggregated into a vertex.
> It is up to users how to connect the graphs, but in our case we used `user`
> to merge multiple graphs. For example, we made each activity, such as click
> content, buy a product, and search query, use the same `userId` for the same
> `user`.
> Below is simple example data.
> {noformat}
> {
>   "timestamp": 10,
>   "elem": "v",
>   "id": "steamshon",
>   "serviceName": "s2graph",
>   "columnName": "user",
>   "props": {
>     "gender": "M",
>     "edges": [{
>       "timestamp": 1,
>       "to": "s2graph",
>       "label": "search_query",
>       "props": {}
>     }, {
>       "timestamp": 10,
>       "to": "github.com/apache/incubator-s2graph",
>       "label": "content_click",
>       "props": {}
>     }]
>   }
> }
> {noformat}
> This connected graph can be used not only for OLTP but also for OLAP.
> I believe the s2graph WAL log is a good way to integrate OLTP and OLAP, and adding
> this example can help users understand how to leverage it.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
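
The aggregation described in the issue (folding a user's edges into that user's vertex) could be sketched roughly as follows. This is a plain-Python illustration, not the Spark job or the UserDefinedAggregateFunction from the pull request. The helper name `aggregate_by_user` is hypothetical, the records are assumed to be dicts shaped like the example JSON above, and the merged vertex simply keeps the vertex record's own timestamp, which is an assumption.

```python
import json
from collections import defaultdict


def aggregate_by_user(wal_records):
    """Fold edge records into the vertex record of their `from` user.

    Hypothetical helper: assumes each record is a dict shaped like the
    issue's example, with "elem" equal to "e" (edge) or "v" (vertex).
    """
    edges_by_user = defaultdict(list)
    vertices = {}
    for record in wal_records:
        if record["elem"] == "e":
            # Drop "elem" and "from": the source id becomes the grouping key.
            edge = {k: v for k, v in record.items() if k not in ("elem", "from")}
            edges_by_user[record["from"]].append(edge)
        elif record["elem"] == "v":
            vertices[record["id"]] = record

    merged = []
    for user_id, vertex in vertices.items():
        # Copy props so the input records are left unmodified.
        props = dict(vertex["props"])
        props["edges"] = sorted(
            edges_by_user.get(user_id, []), key=lambda e: e["timestamp"]
        )
        merged.append({**vertex, "props": props})
    # Note: edges whose `from` user has no vertex record are dropped here.
    return merged


wal = [
    {"timestamp": 1, "elem": "e", "from": "steamshon", "to": "s2graph",
     "label": "search_query", "props": {}},
    {"timestamp": 10, "elem": "e", "from": "steamshon",
     "to": "github.com/apache/incubator-s2graph",
     "label": "content_click", "props": {}},
    {"timestamp": 12, "elem": "v", "id": "steamshon",
     "serviceName": "s2graph", "columnName": "user",
     "props": {"gender": "M"}},
]

result = aggregate_by_user(wal)
print(json.dumps(result, indent=2))
```

In a Spark job the same idea would likely be expressed as a group-by on the user id followed by an aggregate over the edge rows; the dictionary grouping here only stands in for that step.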