[ https://issues.apache.org/jira/browse/S2GRAPH-15?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15114644#comment-15114644 ]
ASF GitHub Bot commented on S2GRAPH-15:
---------------------------------------

GitHub user emeth-kim commented on the pull request:

    https://github.com/apache/incubator-s2graph/pull/15#issuecomment-174370175

I launched a Marathon job with the following JSON:

```
{
  "id": "stat.json-1453453512",
  "instances": 1,
  "cpus": 0.1,
  "mem": 128.0,
  "container": {
    "type": "DOCKER",
    "docker": {
      "image": "path/to/image/s2lambda:stat.json-1453453512",
      "forcePullImage": true
    }
  }
}
```

This job reads data from Kafka, computes counts by `serviceName`, `label`, `operation`, and `log_type`, then writes the results back to Kafka.

The job has been running without any problems for 3 days.

> S2Lambda, speed and batch layers of the lambda architecture
> -----------------------------------------------------------
>
>                 Key: S2GRAPH-15
>                 URL: https://issues.apache.org/jira/browse/S2GRAPH-15
>             Project: S2Graph
>          Issue Type: New Feature
>            Reporter: Minseok Kim
>              Labels: features
>         Attachments: s2lambda.001.png
>
>
> h4. Background
> From the lambda architecture point of view, S2Graph provides a great
> real-time view with its serving layer on HBase.
> The input stream coming from the REST API is stored in HBase, and it can be
> served by graph queries in real time.
> The stream, which is a write-ahead log, is also written to Kafka, which
> allows us to do a lot of things.
> Several works (or sub-projects) use this stream:
> * S2Counter - computes real-time counts by combinations of properties,
> using the Kafka stream directly.
> * WalToHdfs - Kafka stream to the incremental view
> * S2ML - runs machine learning algorithms using the incremental view.
> * …
> h4. S2Lambda
> Because the above works were developed separately, they use different
> Spark versions and contain duplicated code.
> This makes builds difficult and hurts code reusability.
> S2Lambda should be designed to solve this problem by supporting a general
> framework for the speed and batch layers.
> IMHO, a JSON-formatted job description should be designed first, compatible
> with both the speed and batch layers.
> S2Lambda is then implemented to conform to it.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
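
As a rough illustration of the aggregation described in the comment (counting records by `serviceName`, `label`, `operation`, and `log_type`), here is a minimal Python sketch. It is not the S2Lambda implementation: the record layout, the sample values, and the helper names `count_by_keys` and `to_messages` are assumptions made for illustration, and an in-memory list stands in for the Kafka input and output topics.

```python
import json
from collections import Counter

# Hypothetical log records; the real S2Graph WAL format is not shown in this thread.
SAMPLE_LOGS = [
    {"serviceName": "svc_a", "label": "friend", "operation": "insert", "log_type": "edge"},
    {"serviceName": "svc_a", "label": "friend", "operation": "insert", "log_type": "edge"},
    {"serviceName": "svc_a", "label": "friend", "operation": "delete", "log_type": "edge"},
    {"serviceName": "svc_b", "label": "user", "operation": "insert", "log_type": "vertex"},
]

# The four grouping fields named in the comment.
GROUP_KEYS = ("serviceName", "label", "operation", "log_type")


def count_by_keys(records, keys=GROUP_KEYS):
    """Count records per distinct combination of the given keys."""
    return Counter(tuple(record[k] for k in keys) for record in records)


def to_messages(counts, keys=GROUP_KEYS):
    """Serialize each (key combination, count) pair as one JSON string.

    In a real job these strings would be produced to an output topic;
    here they are simply returned as a list.
    """
    return [
        json.dumps({**dict(zip(keys, combo)), "count": n}, sort_keys=True)
        for combo, n in sorted(counts.items())
    ]


counts = count_by_keys(SAMPLE_LOGS)
messages = to_messages(counts)
```

With the sample input, `messages` holds three JSON strings, one per distinct key combination. A streaming version would apply the same grouping per micro-batch or window rather than over a fixed list.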