[image: pasted1]
(The figure shows S2Graph from a lambda architecture point of view. As
mentioned in [S2GRAPH-15] <https://issues.apache.org/jira/browse/S2GRAPH-15>,
S2Graph provides a great real-time view, with its serving layer on HBase.)

Hi folks.

I think we should implement a common Spark environment for the batch and
speed (streaming) layers.
This environment would be valuable for the loader and s2counter, which are
currently parts of S2Graph, as well as for other jobs in the project roadmap
(draft:
http://markmail.org/message/uigl6pbt6urelgma?q=list:org%2Eapache%2Es2graph).

In my opinion, the environment should include three important features:
  1. A Spark launcher with an HA, fault-tolerant scheduler such as
Marathon or Chronos.
  2. A resumable Kafka stream (at-least-once or exactly-once)
  3. Configurable engines, as in PredictionIO
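
To make item 2 a bit more concrete, here is a minimal sketch of what "resumable, at-least-once" could mean. It is not S2Graph, Spark, or Kafka code: the log is a plain Python list standing in for one Kafka partition, and OffsetStore and run_stream are hypothetical names invented for this illustration. The idea is only that the offset is committed after a batch has been handled, so a restart resumes from the last commit and may replay the batch that was in flight.

```python
from typing import Callable, Dict, List


class OffsetStore:
    """Hypothetical offset store. A dict stands in for a durable
    backend (this is an assumption, not an existing S2Graph class)."""

    def __init__(self) -> None:
        self._offsets: Dict[str, int] = {}

    def load(self, topic: str) -> int:
        # Start from 0 when nothing has been committed for the topic yet.
        return self._offsets.get(topic, 0)

    def commit(self, topic: str, offset: int) -> None:
        self._offsets[topic] = offset


def run_stream(
    log: List[str],
    topic: str,
    store: OffsetStore,
    handle: Callable[[str], None],
    batch_size: int = 2,
) -> int:
    """Consume `log` from the last committed offset.

    The offset is committed only after every record in the batch has
    been handled. If `handle` raises, the commit is skipped, so the
    next run replays that batch: at-least-once, not exactly-once.
    """
    offset = store.load(topic)
    while offset < len(log):
        batch = log[offset:offset + batch_size]
        for record in batch:
            handle(record)  # may raise; the batch is then left uncommitted
        offset += len(batch)
        store.commit(topic, offset)
    return offset
```

Exactly-once would additionally need the handler to be idempotent, or the output and the offset to be written atomically; this sketch does neither.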

Thanks

Minseok
