Chris Riccomini created SAMZA-11:
------------------------------------
Summary: Refactor Samza subprojects
Key: SAMZA-11
URL: https://issues.apache.org/jira/browse/SAMZA-11
Project: Samza
Issue Type: Bug
Reporter: Chris Riccomini
In a recent merge, I refactored some packaging in samza-api, so that not
everything was just in the samza root package space. I did the package re-org
based on logical grouping (i.e. classes that were similar in some logical way
would end up in the same package space).
There's been some discussion about the package structure. Right now, the
samza-api package has a mixture of framework-level interfaces and public
(user-facing) interfaces. For example, SystemConsumer is a framework interface,
but StreamTask is a public user-facing interface.
The question is: does it make sense to split these two interface groups up in
some way? If so, how?
I think it does make sense to split the interfaces up, because it's basically
free to do, and will make it very obvious (based on where you put an interface)
what the contract is with the parties involved (framework implementor, and
stream task implementor). It will also allow us to generate two different sets
of Javadocs: one for framework developers, and one for regular StreamTask
developers. Lastly, it'll let the average developer just go to a specific
subfolder, and learn about Samza, without having to pay attention to things
like serializers, system consumers, etc.
I can think of three ways to do this:
1. Annotations (@Public, @Framework).
2. Within samza-api, add two package spaces: samza.api, and samza.framework.
3. Create two separate artifacts/projects: samza-api, and samza-framework-api.
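For comparison, here is a rough sketch of what the annotation approach (option 1) might look like. The annotation names come from the list above; the retention and target settings, the empty stand-in interfaces, and the audienceOf helper are assumptions made for illustration, not existing Samza code.

```java
import java.lang.annotation.Documented;
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

public class AudienceAnnotationsSketch {
  /** Marks an interface that stream task implementors are expected to use. */
  @Documented
  @Retention(RetentionPolicy.RUNTIME)
  @Target(ElementType.TYPE)
  public @interface Public {}

  /** Marks an interface meant for framework implementors only. */
  @Documented
  @Retention(RetentionPolicy.RUNTIME)
  @Target(ElementType.TYPE)
  public @interface Framework {}

  // Empty stand-ins for the real interfaces; their methods are omitted here.
  @Public
  public interface StreamTask {}

  @Framework
  public interface SystemConsumer {}

  /** Hypothetical helper: reports which audience a type is annotated for. */
  public static String audienceOf(Class<?> type) {
    if (type.isAnnotationPresent(Public.class)) {
      return "public";
    }
    if (type.isAnnotationPresent(Framework.class)) {
      return "framework";
    }
    return "unmarked";
  }

  public static void main(String[] args) {
    System.out.println("StreamTask: " + audienceOf(StreamTask.class));
    System.out.println("SystemConsumer: " + audienceOf(SystemConsumer.class));
  }
}
```

With this approach the contract lives in metadata rather than in the artifact or package layout, so tooling (a Javadoc filter, for instance) would have to read the annotations to separate the two audiences.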
I prefer the third approach. First, it means we can keep the existing package
names (samza.task rather than samza.api.task, samza.system rather than
samza.framework.system, etc). Second, it means we can generate two completely separate Javadoc
directories. This is kind of cool because it should make things more
straightforward for the average user to pick up. Third, it actually means that
you could implement an entirely separate framework to run Samza tasks without
using our framework API. I'm not saying this is a good idea, but it's generally
a good sign when things line up this way.
My proposed change would be:
Refactor Samza API to:
* samza-task-api
* samza-framework-api (depends on samza-task-api)
Refactor samza-core to:
* samza-util (depends on samza-task-api)
* samza-container (depends on samza-task-api, samza-framework-api, and
samza-util)
The reason for the container/util split is that there are some other
subprojects (samza-kafka, and samza-yarn) that pull in all of core just to get
access to some util classes. This split would allow other subprojects (and
other projects, in general) to pull in util stuff without pulling in all of
container, which they don't need.
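The proposed layout can be sketched as a small dependency graph to check what a consumer of samza-util would pull in. This is only an illustration: the module names are the ones proposed above, the sketch assumes samza-framework-api depends on samza-task-api, and the class and method names are made up for the example.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

public class ModuleLayoutSketch {
  // Direct dependencies of each proposed subproject.
  // Assumption: samza-framework-api depends on samza-task-api.
  static final Map<String, List<String>> DEPENDS_ON = Map.of(
      "samza-task-api", List.<String>of(),
      "samza-framework-api", List.of("samza-task-api"),
      "samza-util", List.of("samza-task-api"),
      "samza-container",
          List.of("samza-task-api", "samza-framework-api", "samza-util"));

  /** Returns every module reachable from the given module's dependencies. */
  public static Set<String> transitiveDependencies(String module) {
    Set<String> seen = new TreeSet<>();
    Deque<String> pending =
        new ArrayDeque<>(DEPENDS_ON.getOrDefault(module, List.<String>of()));
    while (!pending.isEmpty()) {
      String next = pending.pop();
      // Only expand a module the first time it is seen.
      if (seen.add(next)) {
        pending.addAll(DEPENDS_ON.getOrDefault(next, List.<String>of()));
      }
    }
    return seen;
  }

  public static void main(String[] args) {
    System.out.println("samza-util pulls in: "
        + transitiveDependencies("samza-util"));
    System.out.println("samza-container pulls in: "
        + transitiveDependencies("samza-container"));
  }
}
```

Under these assumptions, anything that depends only on samza-util gets samza-task-api transitively and nothing from samza-container.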