Here are the problems with the S2Counter project, from my perspective.

1. There is no documentation or specification at all.
2. There is a lot of code that seems unnecessary if s2counter uses s2core.
In particular, s2counter_core has its own HBase schema and
serializer/deserializer. It even has a schema for Redis. All of this code
is just what we imported when incubation started, and it has not been
maintained since.
3. There are no tests, and the s2counter_core code is not easily testable.
4. Much of the counting logic is implemented in the s2counter_loader
project directly on Spark RDDs. I think all logic related to counting
should be implemented as POJOs, not on RDDs.

I suggest the following for the s2counter_core project.
1. Deprecate the old code.
2. Create a Counter class that contains all of the counting logic.
3. Create a CounterClient that holds the external resources, such as
s2graph, an HTTP client, etc.
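
To make items 2 and 3 concrete, here is a rough sketch in Java. It is
only an illustration of the split I have in mind, not existing code: the
method names, the CounterStore interface and the in-memory map are all
assumptions made up for this example.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical POJO: all counting logic lives here, with no Spark or HBase types.
final class Counter {
    private final Map<String, Long> counts = new HashMap<>();

    // Add delta to the count for key and return the new total.
    long increment(String key, long delta) {
        return counts.merge(key, delta, Long::sum);
    }

    // Current count for key, or 0 if the key has never been counted.
    long get(String key) {
        return counts.getOrDefault(key, 0L);
    }
}

// Hypothetical boundary to an external resource (s2graph, HTTP client, etc.).
interface CounterStore {
    void save(String key, long value);
}

// Hypothetical client: holds the external resources, delegates counting to Counter.
final class CounterClient {
    private final Counter counter = new Counter();
    private final CounterStore store;

    CounterClient(CounterStore store) {
        this.store = store;
    }

    // Count in memory, then write the new total to the external store.
    long increment(String key, long delta) {
        long total = counter.increment(key, delta);
        store.save(key, total);
        return total;
    }
}
```

With this split, Counter can be unit tested with no cluster at all, and
CounterClient can be tested by passing in a fake CounterStore.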

And for the s2counter_loader project:
1. Maybe we should remove this project and merge it into the loader
project.
2. The loader should initialize a CounterClient and simply delegate
operations to it.
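
A loader that only delegates could look roughly like the sketch below.
Again, this is hypothetical: the CounterClient interface here is a
minimal stand-in with one assumed method, and CounterLoader and process
are names invented for the example.

```java
import java.util.List;

// Hypothetical minimal stand-in for the proposed CounterClient.
interface CounterClient {
    void increment(String key, long delta);
}

// Hypothetical loader: it has no counting logic of its own, it only delegates.
final class CounterLoader {
    private final CounterClient client;

    CounterLoader(CounterClient client) {
        this.client = client;
    }

    // Called once per batch by whatever framework drives the loader.
    void process(List<String> keys) {
        for (String key : keys) {
            client.increment(key, 1L);
        }
    }
}
```

Nothing in CounterLoader depends on Spark types, so the same process
call could be made from a Spark foreachPartition or from any other
runner.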

If users want to use a different stream processing framework, other than
Spark Streaming, they should be able to initialize a CounterClient and
just call its methods.
What do you guys think?
