Hi folks.

I think we should discuss which subprojects we provide by the next release.

Since the initial code import to Apache, we have not worked on any
subprojects other than s2core and s2rest_play.

Here is what you can find in each subproject (from our README).

   1. s2core: The core library, containing the data abstractions for graph
   entities, storage adapters and utilities.
   2. s2rest_play: The REST server built with Play framework
   <https://www.playframework.com/>, providing the write and query API.
   3. s2rest_netty: The REST server built directly using Netty,
   implementing only the query API.
   4. loader: A collection of Spark jobs for bulk loading streaming data
   into S2Graph.
   5. spark: Spark utilities for loader and s2counter_loader.
   6. s2counter_core: The core library providing data structures and logic
   for s2counter_loader.
   7. s2counter_loader: Spark streaming jobs that consume Kafka WAL logs
   and calculate various top-*K* results on-the-fly.


I would like to suggest merging loader, spark, and s2counter_loader into one
project called s2loader, responsible for the streaming/batch utilities that
work with S2Graph.

The reason behind this is to improve the codebase (we currently have a lot
of duplicated code, and it seems quite abandoned).

Documentation is also missing, so we should provide solid documentation to
help others understand these subprojects.

Finally, there are no specs or test cases. I think adding test cases is
important because it lets us start refactoring our code into an easily
testable form.

I opened a discussion thread at
http://markmail.org/message/3j2hbfquwwybyz4e but it has not received much
attention, so please give any feedback on this so that we can start working
on our subprojects.

Thanks.
