[ 
https://issues.apache.org/jira/browse/HBASE-5977?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt Corgan updated HBASE-5977:
-------------------------------

    Attachment: Potential-HBase-Module-Descriptions-v1.pdf
                Potential-HBase-Modules-v1.pdf

Jesse, Stack and I have discussed this from a few different angles to try to 
identify some of the reasons for creating modules.  The main benefit of modules 
is to isolate complex implementations behind simple interfaces.  The main 
drawback is that modules add overhead in the form of more projects to open in 
Eclipse and more jar files in the build.

Pasting from HBASE-5720 some arguments for creating a "codec" module that 
contains wrapper classes for individual HFile block types:
* make the code more testable, like a normal in-memory data structure, without 
having to set up heavyweight testing environments
* separate the encoding concerns from IO concerns. After the checksum happens, 
encoders/decoders should not even know what an IOException is
* strongly discourage people from modifying anything in the codec packages 
without knowing what they're getting into
* ensure the main project code only references the interfaces and not any codec 
internals (check that the main project compiles without codecs on the classpath)
* make it easier for contributors to develop and profile the codecs without 
having to become experts in all aspects of HBase
* help to simplify the main project. Imagine if the gzip or snappy internals 
were sprinkled throughout the regionserver code. Yikes.
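
To make the points above a bit more concrete, here is a rough sketch of what 
"simple interface, complex implementation" could look like.  All names here 
(CodecSketch, BlockCodec, RunLengthCodec) are made up for illustration and are 
not existing HBase classes, and the toy run-length encoding just stands in for 
a real block encoder.  The interface deals only in byte arrays and never 
mentions IOException, so it can be tested like any in-memory data structure:

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

// Hypothetical sketch only: none of these types exist in HBase.
final class CodecSketch {

  /** The only type the main project would reference: bytes in, bytes out, no IOException. */
  interface BlockCodec {
    byte[] encode(byte[] raw);
    byte[] decode(byte[] encoded);
  }

  /** Toy run-length codec standing in for a complex implementation behind the interface. */
  static final class RunLengthCodec implements BlockCodec {
    @Override
    public byte[] encode(byte[] raw) {
      // Worst case is one (count, value) pair per input byte.
      ByteBuffer out = ByteBuffer.allocate(raw.length * 2);
      int i = 0;
      while (i < raw.length) {
        byte value = raw[i];
        int run = 1;
        // Cap runs at 255 so the count fits in one unsigned byte.
        while (i + run < raw.length && raw[i + run] == value && run < 255) {
          run++;
        }
        out.put((byte) run).put(value);
        i += run;
      }
      return Arrays.copyOf(out.array(), out.position());
    }

    @Override
    public byte[] decode(byte[] encoded) {
      if (encoded.length % 2 != 0) {
        throw new IllegalArgumentException("expected (count, value) pairs");
      }
      // First pass: total decoded length; second pass: fill each run.
      int total = 0;
      for (int i = 0; i < encoded.length; i += 2) {
        total += encoded[i] & 0xFF;
      }
      byte[] out = new byte[total];
      int pos = 0;
      for (int i = 0; i < encoded.length; i += 2) {
        int run = encoded[i] & 0xFF;
        Arrays.fill(out, pos, pos + run, encoded[i + 1]);
        pos += run;
      }
      return out;
    }
  }

  public static void main(String[] args) {
    // Callers hold the interface type; the implementation could live in another module.
    BlockCodec codec = new RunLengthCodec();
    byte[] raw = {1, 1, 1, 2, 2, 3};
    byte[] roundTrip = codec.decode(codec.encode(raw));
    System.out.println(Arrays.equals(raw, roundTrip));
  }
}
```

In a module layout along these lines, BlockCodec would sit somewhere the main 
project can see it while RunLengthCodec would sit in the codec module, so a 
round-trip test needs no cluster, filesystem or mock IO.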

Attaching Potential-HBase-Modules-v1.pdf and 
Potential-HBase-Module-Descriptions-v1.pdf to illustrate a possible roadmap for 
extracting modules.  We currently have hbase-server, and are first going to 
"pull up" some files into hbase-common.  Eventually we may "push down" an 
integration-test module.  

Extracting these modules can't really be done all at once, so this is just a 
roadmap meant to start discussion.  For example, there's probably an 
opportunity to isolate some of regionserver and master code, but they also 
share a lot.  This v1 doc shows a push down of master code out of the server 
module, but we probably need to think through that in more detail.

* Link to dependency chart: 
https://docs.google.com/presentation/d/16Kf9FAFjtneWwCnpy9Bql4QhXmORf7U9uJLoRobePHQ/edit
* Link to description doc: 
https://docs.google.com/document/d/1RHrUa9qWGvIR6ZmqVYP17rS7JTPSzCFCPKNjTo-XY38/edit

                
> Usage of modules 
> -----------------
>
>                 Key: HBASE-5977
>                 URL: https://issues.apache.org/jira/browse/HBASE-5977
>             Project: HBase
>          Issue Type: Brainstorming
>          Components: build
>    Affects Versions: 0.96.0
>            Reporter: Jesse Yates
>         Attachments: Potential-HBase-Module-Descriptions-v1.pdf, 
> Potential-HBase-Modules-v1.pdf
>
>   Original Estimate: 336h
>  Remaining Estimate: 336h
>
> With HBASE-4336, HBase will have the ability to add multiple modules for 
> different aspects of the codebase (less tests, see HBASE-4336 for details). 
> We need to set a policy for when modules should be used versus putting the 
> code into a single existing module or dispersed across modules. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
