[ 
https://issues.apache.org/jira/browse/S2GRAPH-15?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15114644#comment-15114644
 ] 

ASF GitHub Bot commented on S2GRAPH-15:
---------------------------------------

Github user emeth-kim commented on the pull request:

    https://github.com/apache/incubator-s2graph/pull/15#issuecomment-174370175
  
    I launched a Marathon job with the following JSON:
    
    ```
    {
      "id": "stat.json-1453453512",
      "instances": 1,
      "cpus": 0.1,
      "mem": 128.0,
      "container": {
        "type": "DOCKER",
        "docker": {
          "image": "path/to/image/s2lambda:stat.json-1453453512",
          "forcePullImage": true
        }
      }
    }
    ```
    This job reads data from Kafka, computes
      - the count by `serviceName`, `label`, `operation`, and `log_type`
    and then writes the results back to Kafka.
    
    The job has been running without any problems for 3 days.
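
For readers who want a concrete picture of the count-by step described above, here is a minimal sketch in plain Python. It is not the S2Lambda implementation: the Kafka input and output are replaced by in-memory lists, the records are assumed to be JSON objects carrying the four fields named in the comment, and the function names `count_by_keys` and `to_output_messages` are hypothetical.

```python
import json
from collections import Counter

# Grouping fields taken from the comment above.
GROUP_KEYS = ("serviceName", "label", "operation", "log_type")


def count_by_keys(messages, keys=GROUP_KEYS):
    """Count records by the tuple of values found under `keys`.

    `messages` stands in for records consumed from Kafka; each one is
    assumed (for illustration only) to be a JSON-encoded object.
    """
    counts = Counter()
    for raw in messages:
        record = json.loads(raw)
        # Missing fields are grouped under None rather than raising.
        counts[tuple(record.get(k) for k in keys)] += 1
    return counts


def to_output_messages(counts, keys=GROUP_KEYS):
    """Serialize each (group, count) pair as one JSON message.

    The returned list stands in for messages produced back to Kafka.
    """
    return [
        json.dumps({**dict(zip(keys, group)), "count": n}, sort_keys=True)
        for group, n in counts.items()
    ]
```

Feeding a handful of sample records through `count_by_keys` and then `to_output_messages` yields one output message per distinct combination of the four fields, each with its count.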


> S2Lambda, speed and batch layers of the lambda architecture
> -----------------------------------------------------------
>
>                 Key: S2GRAPH-15
>                 URL: https://issues.apache.org/jira/browse/S2GRAPH-15
>             Project: S2Graph
>          Issue Type: New Feature
>            Reporter: Minseok Kim
>              Labels: features
>         Attachments: s2lambda.001.png
>
>
> h4. Background
> From the lambda architecture point of view, S2Graph provides a great 
> real-time view with a serving layer on HBase.
> The input stream coming from the REST API is stored in HBase, and it can be 
> served by graph queries in real time.
> The stream, which is a write-ahead log, is also written to Kafka, which 
> allows us to do a lot of things.
> There are several works (or sub-projects) using this stream.
>   * S2Counter - computes real-time counts by combinations of 
> properties, using the Kafka stream directly.
>   * WalToHdfs - Kafka stream to the incremental view
>   * S2ML - runs machine learning algorithms using the incremental view.
>   * …
> h4. S2Lambda
> Because the above works have been developed separately, they use different 
> Spark versions and contain duplicated code.
> This makes building difficult and hinders code reuse.
> S2Lambda should be designed to solve this problem by providing a general 
> framework for the speed and batch layers.
> IMHO, first, a JSON-formatted job description should be designed to be 
> compatible with both the speed and batch layers;
> then S2Lambda is implemented to conform to it.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
