[ 
https://issues.apache.org/jira/browse/TRAFODION-1729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15084500#comment-15084500
 ] 

liu ming commented on TRAFODION-1729:
-------------------------------------

Thanks Stack, I need to learn how to submit a patch to the HBase community.
And thanks Dave, here is the list of coprocessors:
org.apache.hadoop.hbase.coprocessor.transactional.TrxRegionObserver
org.apache.hadoop.hbase.coprocessor.transactional.TrxRegionEndpoint
org.apache.hadoop.hbase.coprocessor.AggregateImplementation
In the future, there will be one more:
org.apache.hadoop.hbase.coprocessor.transactional.SsccRegionEndpoint

The overhead of adding a coprocessor is incurred only when a Region is first 
opened. I assume it makes no difference whether we add it through the 
addCoprocessor() Java API or by modifying hbase-site.xml; these are just two 
different ways to add a coprocessor in HBase. hbase-site.xml is a global 
setting, so it makes ALL HBase tables load the Trafodion coprocessors, even 
native HBase tables that were not created by Trafodion. With this change, we 
can ensure that only Trafodion tables are equipped with these coprocessors. 
Yet another reason to make this change :-) 
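
To illustrate the per-table approach, here is a minimal sketch in plain Java 
(no HBase dependency) of the table attributes that I would expect 
addCoprocessor() to produce for one Trafodion table. The class and method 
names are hypothetical, and the attribute layout ("coprocessor$N" keys with 
"|class|priority|" values for classes already on the classpath, and a default 
user priority of Integer.MAX_VALUE / 4) is an assumption that should be 
checked against the HBase version in use.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of HBase or Trafodion.
class CoprocessorAttributeSketch {
    // Assumption: mirrors the default "user" coprocessor priority in HBase.
    static final int PRIORITY_USER = Integer.MAX_VALUE / 4;

    // The three coprocessors listed in this comment.
    static final List<String> TRAFODION_COPROCESSORS = Collections.unmodifiableList(Arrays.asList(
        "org.apache.hadoop.hbase.coprocessor.transactional.TrxRegionObserver",
        "org.apache.hadoop.hbase.coprocessor.transactional.TrxRegionEndpoint",
        "org.apache.hadoop.hbase.coprocessor.AggregateImplementation"));

    // Builds one attribute per coprocessor, numbered from 1 in list order.
    // Assumed value layout: [jar path]|class|priority|[args]; the jar path
    // and args are left empty because the classes are on the classpath.
    static Map<String, String> tableAttributes(List<String> classNames, int priority) {
        Map<String, String> attrs = new LinkedHashMap<>();
        int n = 1;
        for (String className : classNames) {
            attrs.put("coprocessor$" + n, "|" + className + "|" + priority + "|");
            n++;
        }
        return attrs;
    }

    public static void main(String[] args) {
        // Only the table carrying these attributes loads the coprocessors.
        tableAttributes(TRAFODION_COPROCESSORS, PRIORITY_USER)
            .forEach((key, value) -> System.out.println(key + " => " + value));
    }
}
```

Because the attributes live on the table descriptor, a native HBase table 
without them would load none of these classes.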

> change the coprocessor deployment method
> ----------------------------------------
>
>                 Key: TRAFODION-1729
>                 URL: https://issues.apache.org/jira/browse/TRAFODION-1729
>             Project: Apache Trafodion
>          Issue Type: Improvement
>          Components: dtm
>            Reporter: liu ming
>            Assignee: mashengchen
>
> I have a proposal to change our current HBase coprocessor configuration 
> method. 
> There are three ways to add a coprocessor to an HBase table:
> 1.       By editing hbase-site.xml, which loads the coprocessor for ALL 
> tables (Trafodion is using this method now)
> 2.       Via an HBase shell command
> 3.       Via the HTableDescriptor.addCoprocessor() Java API
> Trafodion now uses the first method. I propose to use method 3. I have 
> finished a prototype, and it seems to work well in testing.
>  
> Here are the reasons I propose for this change:
> At present, the Trafodion installer needs to modify hbase-site.xml and 
> then restart the HBase instance for the configuration to take effect. This 
> step not only complicates the installer but also makes users think Trafodion 
> is intrusive into the underlying HBase system. It would be ideal if we could 
> avoid this step. Another problem: in CDH, there is a concept called ‘region 
> server group’ or something similar, so the settings have to be carefully 
> handled by the installer to apply to all groups. As we saw recently in the 
> WebRoot deployment, Trafodion failed for this reason. All of this is very 
> error prone and complicates the Trafodion installer. Once CDH or HDP changes 
> something, Trafodion may fail again.
>  
> So I spent time investigating why we need to restart HBase in order to 
> install Trafodion.  
> As I understand it, there are 3 major reasons:
> 1.       To add the hbase-trx coprocessors
> 2.       To override HRegion with TransactionalRegion
> 3.       Various configuration settings, which need to be checked one by one.
> The first can be avoided by applying my proposed change.
> As for the second, I looked through TransactionalRegion.java and found that 
> the only reason (now) is to make the getScanner() method public so that it 
> can be invoked by the coprocessor. And there are only 1 or 2 places where 
> that API is invoked in Trafodion code. I checked with Kevin, and he 
> suggested that we can also avoid this by using ‘java reflection’. 
> All other configuration items look to some extent like ‘nice to have’ 
> rather than ‘must have’. I also found two config items that seem never to 
> be used:
> hbase.bulkload.staging.dir     /hbase-staging         (Suresh can confirm, 
> but I searched all the code and it seems this is never used)
> hbase.regionserver.region.transactional.tlog   true     (Narendra can 
> confirm, this is NEVER used, maybe a legacy config item?)
> Yes, for now there are still some other config items that seem unavoidable, 
> but I hope we can find some way to remove them in the future. I am not trying 
> to solve all issues right now; I just want to start the effort to remove 
> unnecessary HBase reconfiguration.
> For example, coprocessors can be added to a table at run time, with no need 
> to edit hbase-site.xml and restart HBase. This is only the first step 
> toward removing the deep impact on the current HBase config and the need to 
> restart HBase.
>  
> So I am asking for your opinions about this change. If you think it is 
> necessary, I will go ahead and file a JIRA and fix it. 
>  
> I strongly recommend getting rid of the ‘modify hbase-site.xml and 
> restart your hbase’ step for Trafodion installation. It should be an option 
> for tuning the system to best suit Trafodion, not a forced step. Note that 
> Apache Phoenix is also SQL on HBase, and its installation changes nothing 
> in the underlying HBase: it is very lightweight and does not ‘intrude into’ 
> the existing HBase system. Trafodion is considered heavy and intrusive in 
> this respect, and I feel we may be able to change this.
>  
> Should I start this discussion on the dev mailing list?
>  
> P.S. Here is a list of the changed config items. My proposal will remove 
> the last one; I hope we can get rid of all of them:
> hbase.master.distributed.log.splitting      false
> hbase.snapshot.master.timeoutMillis      600000
> hbase_regionserver_lease_period         600000
> hbase.hregion.impl                     
> org.apache.hadoop.hbase.regionserver.transactional.TransactionalRegion
> hbase.regionserver.region.split.policy      
> org.apache.hadoop.hbase.regionserver.ConstantSizeRegionSplitPolicy
> hbase.snapshot.enabled                 true
> hbase.bulkload.staging.dir                /hbase-staging
> hbase.regionserver.region.transactional.tlog  true
> hbase.snapshot.region.timeout            600000
> hbase_coprocessor_region_classes  
> org.apache.hadoop.hbase.coprocessor.transactional.TrxRegionObserver,org.apache.hadoop.hbase.coprocessor.transactional.TrxRegionEndpoint,org.apache.hadoop.hbase.coprocessor.AggregateImplementation
>  
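
The ‘java reflection’ idea for getScanner() mentioned in the description 
could look roughly like the sketch below. It uses a hypothetical stand-in 
class (FakeRegion) instead of the real HRegion, and a simplified String 
argument instead of the real scanner parameters, so it shows only the 
mechanism of calling a non-public method without subclassing.

```java
import java.lang.reflect.Method;

// Hypothetical sketch; FakeRegion is a stand-in, not an HBase class.
class ReflectionScannerSketch {
    public static class FakeRegion {
        // Non-public, like the method that TransactionalRegion re-exposes.
        protected String getScanner(String scanSpec) {
            return "scanner(" + scanSpec + ")";
        }
    }

    // Looks up a one-argument method declared on the target's own class,
    // lifts the access check, and invokes it.
    static Object invokeHidden(Object target, String methodName,
                               Class<?> argType, Object arg) {
        try {
            Method m = target.getClass().getDeclaredMethod(methodName, argType);
            m.setAccessible(true);
            return m.invoke(target, arg);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("cannot invoke " + methodName, e);
        }
    }

    public static void main(String[] args) {
        Object scanner = invokeHidden(new FakeRegion(), "getScanner",
                                      String.class, "row1..row9");
        System.out.println(scanner);
    }
}
```

With this approach the coprocessor would not need hbase.hregion.impl to swap 
in a subclass, at the cost of depending on a non-public method that may 
change between HBase releases.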



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)