mmiklavc opened a new pull request #1483: METRON-2217 Migrate current HBase 
client from HTableInterface to Table
URL: https://github.com/apache/metron/pull/1483
 
 
   ## Contributor Comments
   
   https://issues.apache.org/jira/browse/METRON-2217
   
   I want to get this in front of people so review can start as soon as 
possible. It will take me a couple of days to work through a reasonable test 
plan, but that should not hold up review of the approach. Of note: HBase's 
change from HTableInterface to Table shifts the burden of connection 
management from the HTable implementation to the end user/client. We previously 
had very few, if any, hooks to close our HBase tables or otherwise clean up 
resources. In response to this change, there were a couple of options for 
dealing with this:
   1. Completely rewrite our HBase client logic to fully manage connection 
lifecycle
   2. Isolate the connection management change to the existing HTableProvider 
implementation used throughout Metron and make a smaller, incremental change to 
set us up for the eventual upgrade to HBase 2.x.
   
   This PR takes the approach in option 2. The biggest question surrounding 
this approach is whether the connection management changes introduced in the 
TableProvider are sufficient, or whether we need to immediately take a more 
robust connection pooling approach to dealing with HBase connections. I spent 
some time looking at the current HTable implementation that we depend on. Every 
time an HTable is created, the underlying code makes a call to an internal 
connection manager. It's unclear to me what the connection management contract 
is for the user in this case, e.g. stale connection cleanup, connection 
retries, connection pooling, etc. This is probably the riskiest part of this 
change. I'm handling this in two ways:
   1. Making the connections thread safe through use of a ThreadLocal 
connection variable. There are issues with instantiating a ThreadLocal variable 
by default in code that will be serialized by Storm, because ThreadLocal is not 
serializable. To avoid a massive API rewrite that would add initialization 
similar to our other startup hooks, e.g. StellarFunctions.initialize(), I opted 
for an approach that gets a similar effect via lazy initialization.
   2. Providing some basic connection retry logic that will initiate a new 
connection if the thread's current connection happens to have closed for 
whatever reason.
   
   The combination of these two measures provides a semi-robust way to handle 
connections without boiling the ocean, and offers per-thread connection reuse 
that should limit the overall number of connections we keep open to HBase in a 
reasonable and reliable way.
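   
   To make the two measures above concrete for reviewers, here is a minimal, self-contained sketch of the pattern. It is not the code in this PR. `ManagedConnection`, `SerializableConnectionFactory`, `InMemoryConnection`, `InMemoryConnectionFactory`, and `PerThreadConnectionProvider` are hypothetical stand-ins invented for illustration, and the stand-in factory throws no checked exceptions, whereas real HBase connection creation can fail with an `IOException` that would need handling.

```java
import java.io.Closeable;
import java.io.Serializable;

/** Hypothetical stand-in for an HBase connection (not the real HBase interface). */
interface ManagedConnection extends Closeable {
  boolean isClosed();

  @Override
  void close(); // narrowed: this stand-in does not throw IOException
}

/** Hypothetical factory; Serializable so the provider can be shipped to workers. */
interface SerializableConnectionFactory extends Serializable {
  ManagedConnection create();
}

/** Trivial in-memory connection, used only to exercise the provider. */
class InMemoryConnection implements ManagedConnection {
  private volatile boolean closed;

  @Override
  public boolean isClosed() { return closed; }

  @Override
  public void close() { closed = true; }
}

class InMemoryConnectionFactory implements SerializableConnectionFactory {
  private static final long serialVersionUID = 1L;

  @Override
  public ManagedConnection create() { return new InMemoryConnection(); }
}

/** Hands out one connection per thread, created lazily and replaced if closed. */
class PerThreadConnectionProvider implements Serializable {
  private static final long serialVersionUID = 1L;

  private final SerializableConnectionFactory factory;

  // ThreadLocal is not Serializable, so the field is transient and is only
  // created on first use (including first use after deserialization).
  private transient ThreadLocal<ManagedConnection> threadConnection;

  PerThreadConnectionProvider(SerializableConnectionFactory factory) {
    this.factory = factory;
  }

  // Lazily creates the ThreadLocal holder; synchronized so that concurrent
  // first callers all end up sharing the same holder instance.
  private synchronized ThreadLocal<ManagedConnection> holder() {
    if (threadConnection == null) {
      threadConnection = new ThreadLocal<>();
    }
    return threadConnection;
  }

  /** Returns this thread's connection, opening a new one if none exists or it has closed. */
  ManagedConnection getConnection() {
    ThreadLocal<ManagedConnection> local = holder();
    ManagedConnection connection = local.get();
    if (connection == null || connection.isClosed()) {
      connection = factory.create();
      local.set(connection);
    }
    return connection;
  }
}
```

   In this sketch, repeated calls to `getConnection()` on one thread return the same instance until it is closed, different threads get different instances, and the provider survives Java serialization because the `ThreadLocal` is never part of the serialized state. It does not attempt pooling or stale-connection detection beyond the `isClosed()` check, which is the open question raised above.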
   
   ## Pull Request Checklist
   
   Thank you for submitting a contribution to Apache Metron.  
   Please refer to our [Development 
Guidelines](https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=61332235)
 for the complete guide to follow for contributions.  
   Please refer also to our [Build Verification 
Guidelines](https://cwiki.apache.org/confluence/display/METRON/Verifying+Builds?show-miniview)
 for complete smoke testing guides.  
   
   
   In order to streamline the review of the contribution we ask you follow 
these guidelines and ask you to double check the following:
   
   ### For all changes:
   - [ ] Is there a JIRA ticket associated with this PR? If not one needs to be 
created at [Metron 
Jira](https://issues.apache.org/jira/browse/METRON/?selectedTab=com.atlassian.jira.jira-projects-plugin:summary-panel).
   - [ ] Does your PR title start with METRON-XXXX where XXXX is the JIRA 
number you are trying to resolve? Pay particular attention to the hyphen "-" 
character.
   - [ ] Has your PR been rebased against the latest commit within the target 
branch (typically master)?
   
   
   ### For code changes:
   - [ ] Have you included steps to reproduce the behavior or problem that is 
being changed or addressed?
   - [ ] Have you included steps or a guide to how the change may be verified 
and tested manually?
   - [ ] Have you ensured that the full suite of tests and checks have been 
executed in the root metron folder via:
     ```
     mvn -q clean integration-test install && dev-utilities/build-utils/verify_licenses.sh
     ```
   
   - [ ] Have you written or updated unit tests and/or integration tests to 
verify your changes?
   - [ ] If adding new dependencies to the code, are these dependencies 
licensed in a way that is compatible for inclusion under [ASF 
2.0](http://www.apache.org/legal/resolved.html#category-a)?
   - [ ] Have you verified the basic functionality of the build by building and 
running locally with Vagrant full-dev environment or the equivalent?
   
   ### For documentation related changes:
   - [ ] Have you ensured that the format looks appropriate for the output in 
which it is rendered by building and verifying the site-book? If not, then run 
the following commands and then verify the changes via 
`site-book/target/site/index.html`:
   
     ```
     cd site-book
     mvn site
     ```
   
   - [ ] Have you ensured that any documentation diagrams have been updated, 
along with their source files, using [draw.io](https://www.draw.io/)? See 
[Metron Development 
Guidelines](https://cwiki.apache.org/confluence/display/METRON/Development+Guidelines)
 for instructions.
   
   #### Note:
   Please ensure that once the PR is submitted, you check travis-ci for build 
issues and submit an update to your PR as soon as possible.
   It is also recommended that [travis-ci](https://travis-ci.org) is set up for 
your personal repository such that your branches are built there before 
submitting a pull request.
   
