[jira] [Created] (SAMZA-11) Refactor Samza subprojects

Chris Riccomini (JIRA) Tue, 13 Aug 2013 09:36:36 -0700

Chris Riccomini created SAMZA-11:
------------------------------------

             Summary: Refactor Samza subprojects
                 Key: SAMZA-11
                 URL: https://issues.apache.org/jira/browse/SAMZA-11
             Project: Samza
          Issue Type: Bug
            Reporter: Chris Riccomini



In a recent merge, I refactored some packaging in samza-api, so that not 
everything was just in the samza root package space. I did the package re-org 
based on logical grouping (i.e. classes that were similar in some logical way 
would end up in the same package space).

There's been some discussion about the package structure. Right now, the 
samza-api package has a mixture of framework-level interfaces and public 
(user-facing) interfaces. For example, SystemConsumer is a framework interface, 
but StreamTask is a public user-facing interface.

The question is: does it make sense to split these two interface groups up in 
some way? If so, how?

I think it does make sense to split the interfaces up, because it's basically 
free to do, and will make it very obvious (based on where you put an interface) 
what the contract is with the parties involved (framework implementor, and 
stream task implementor). It will also allow us to generate two different sets 
of Javadocs: one for framework developers, and one for regular StreamTask 
developers. Lastly, it'll let the average developer just go to a specific 
subfolder, and learn about Samza, without having to pay attention to things 
like serializers, system consumers, etc.

I can think of three ways to do this:

1. Annotations (@Public, @Framework).
2. Within samza-api, add two package spaces: samza.api, and samza.framework.
3. Create two separate artifacts/projects: samza-api, and samza-framework-api.
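
As a rough illustration of option 1 only, here is a minimal Java sketch. The 
annotation types Public and Framework are hypothetical (this ticket only names 
them as an idea), and the StreamTask and SystemConsumer interfaces below are 
empty stand-ins, not the real Samza definitions.

```java
import java.lang.annotation.Documented;
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

public class AudienceSketch {
  // Hypothetical marker for interfaces aimed at stream task developers.
  @Documented
  @Retention(RetentionPolicy.RUNTIME)
  @Target(ElementType.TYPE)
  @interface Public {}

  // Hypothetical marker for interfaces aimed at framework implementors.
  @Documented
  @Retention(RetentionPolicy.RUNTIME)
  @Target(ElementType.TYPE)
  @interface Framework {}

  // Empty stand-ins; the real interfaces have methods that are omitted here.
  @Public
  interface StreamTask {}

  @Framework
  interface SystemConsumer {}

  // Reports which audience a type is marked for, based on its annotation.
  static String audienceOf(Class<?> type) {
    if (type.isAnnotationPresent(Public.class)) {
      return "public";
    }
    if (type.isAnnotationPresent(Framework.class)) {
      return "framework";
    }
    return "unmarked";
  }

  public static void main(String[] args) {
    System.out.println("StreamTask: " + audienceOf(StreamTask.class));
    System.out.println("SystemConsumer: " + audienceOf(SystemConsumer.class));
  }
}
```

With this approach the contract is visible on each interface, but both groups 
still live in one artifact and one set of Javadocs, which is the trade-off 
against options 2 and 3.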

I prefer the third approach. First, it means we can keep the package spaces the 
same (samza.task rather than samza.api.task, samza.system rather than 
samza.framework.system, etc.). Second, it means we can generate two completely 
separate Javadoc 
directories. This is kind of cool because it should make things more 
straightforward for the average user to pick up. Third, it actually means that 
you could implement an entirely separate framework to run Samza tasks without 
using our framework API. I'm not saying this is a good idea, but it's generally 
a good sign when things line up this way.

My proposed change would be:

Refactor Samza API to:

 * samza-task-api
 * samza-framework-api (depends on samza-task-api)

Refactor samza-core to:

 * samza-util (depends on samza-task-api)
 * samza-container (depends on samza-task-api, samza-framework-api, and 
   samza-util)

The reason for the container/util split is that there are some other 
subprojects (samza-kafka, and samza-yarn) that pull in all of core just to get 
access to some util classes. This split would allow other subprojects (and 
other projects, in general) to pull in util stuff without pulling in all of 
container, which they don't need.
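
To make the effect of the container/util split concrete, here is a small Java 
sketch that models the proposed subprojects as a dependency map and computes 
what a consumer pulls in transitively. The edges for the four proposed modules 
follow the lists above, reading the samza-framework-api dependency as 
samza-task-api. The samza-kafka entry is an assumption for illustration: it 
supposes samza-kafka would depend only on samza-util after the split.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

public class ModuleDeps {
  // Direct dependencies of each module under the proposed layout.
  static final Map<String, List<String>> PROPOSED = new HashMap<>();

  static {
    PROPOSED.put("samza-task-api", Collections.<String>emptyList());
    PROPOSED.put("samza-framework-api", Arrays.asList("samza-task-api"));
    PROPOSED.put("samza-util", Arrays.asList("samza-task-api"));
    PROPOSED.put("samza-container",
        Arrays.asList("samza-task-api", "samza-framework-api", "samza-util"));
    // Assumption: after the split, samza-kafka needs only the util classes.
    PROPOSED.put("samza-kafka", Arrays.asList("samza-util"));
  }

  // Returns every module reachable from the given module, excluding itself.
  static Set<String> transitiveDeps(String module) {
    Set<String> seen = new TreeSet<>();
    Deque<String> pending = new ArrayDeque<>(
        PROPOSED.getOrDefault(module, Collections.<String>emptyList()));
    while (!pending.isEmpty()) {
      String dep = pending.pop();
      if (seen.add(dep)) {
        pending.addAll(
            PROPOSED.getOrDefault(dep, Collections.<String>emptyList()));
      }
    }
    return seen;
  }

  public static void main(String[] args) {
    System.out.println("samza-kafka pulls in: " + transitiveDeps("samza-kafka"));
    System.out.println(
        "samza-container pulls in: " + transitiveDeps("samza-container"));
  }
}
```

Under these assumed edges, samza-kafka reaches samza-util and samza-task-api 
but never samza-container, which is the point of the split.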

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
