Re: [DISCUSS] manage our subprojects

daewon Mon, 28 Nov 2016 06:59:54 -0800

I agree with integrating the Spark-related projects, such as streaming and batch.

Because the above projects have similar functionality, there is a lot of
code duplication.


Integrating the projects should make it easier to eliminate unneeded
projects and reduce redundancy.

First, after integrating the streaming- and batch-related projects, I would
like to discuss removing the 's2rest_netty' project in order to integrate
the HTTP layer.

S2GRAPH-85 (https://issues.apache.org/jira/browse/S2GRAPH-85) already has a
bit of discussion about the HTTP layer.

On Mon, Nov 28, 2016 at 7:16 PM DO YUNG YOON <[email protected]> wrote:

> Hi folks.
>
> I think we should discuss which subprojects we provide by the next release.
>
> Since the initial code import to Apache, we have not worked on any
> subprojects other than s2core and s2rest_play.
>
> Here is what you can find in each subproject (from our README).
>
>    1. s2core: The core library, containing the data abstractions for graph
>    entities, storage adapters and utilities.
>    2. s2rest_play: The REST server built with Play framework
>    <https://www.playframework.com/>, providing the write and query API.
>    3. s2rest_netty: The REST server built directly using Netty,
>    implementing only the query API.
>    4. loader: A collection of Spark jobs for bulk loading streaming data
>    into S2Graph.
>    5. spark: Spark utilities for loader and s2counter_loader.
>    6. s2counter_core: The core library providing data structures and logic
>    for s2counter_loader.
>    7. s2counter_loader: Spark streaming jobs that consume Kafka WAL logs
>    and calculate various top-*K* results on-the-fly.
>
>
> I suggest merging loader, spark, and s2counter_loader into one project
> called s2loader, responsible for the streaming/batch utilities that work
> with S2Graph.
>
> The reason behind this is to improve the codebase (we currently have a lot
> of duplicate code, and it seems quite abandoned).
>
> Also, documentation is missing, so we should provide solid documentation to
> help others understand these subprojects.
>
> Finally, there are no specs or test cases. I think adding test cases is
> important because it lets us start refactoring our code into something
> easily testable.
>
> I opened a discussion thread at
> http://markmail.org/message/3j2hbfquwwybyz4e but it has not received enough
> attention, so please give any feedback on this so we can start to work on
> our subprojects.
>
> Thanks.
>
