[ 
https://issues.apache.org/jira/browse/HBASE-6800?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13457564#comment-13457564
 ] 

Andrew Purtell commented on HBASE-6800:
---------------------------------------

bq. (1) How do the users figure out what co-processor applications are stable, 
so that they can use them in their production deployment?

This is exactly the motivation for starting all coprocessor based 
applications/contributions as external projects. We will have no registry of 
"approved" or "stable" coprocessor applications. I'd imagine users would expect 
all such apps in the HBase distribution proper to be in such a state. Beyond 
that, I don't think the project has the bandwidth to track a number of 
ideas in development. We can't know in advance what support, interest, or 
stability any given contribution would have, so starting as an external project 
lets each one establish this on its own merits. A popular and well-cared-for 
contribution would eventually be a candidate for inclusion in the HBase source 
distribution proper. This is my characterization of what has been discussed and 
the consensus reached by the PMC. If others feel this is in error, or that we 
should do something differently here, please speak up. 

bq. (2) How do we ensure the co-processor applications continue to be 
compatible with the changes in the HBase project, and compatible with each 
other?

We don't. The onus is on the contributor. If at some point the consensus of the 
project is to bring a particular contribution into the ASF HBase source 
distribution, then at that point we must ensure these things, but only for 
what is in the source distribution. 

bq. (3) How do the users get the co-processor applications? They can no longer 
get these from the Apache HBase release, and may need to perform manual 
integrations - not something average business users will do, and the main 
reason that we put the full HBase source tree out.

HBase is a mavenized project and your DOT system is a coprocessor application. 
Barring issues with the CP framework itself, I can see no technical reason why 
you have to include and maintain a full fork of HBase. Simply depend on HBase 
project artifacts, and the complete DOT application can be compiled as a jar to 
drop on the classpath of an HBase installation. Where the CP framework may be 
insufficient, we can address that. Or, if there is some other technical reason 
(like a patch to core HBase), please list it so we can look at addressing it. 

As Ted also says, the modularization of HBase means we could accept a 
mavenized project that depends on HBase core artifacts pretty easily. 
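
To illustrate the packaging approach described above, a standalone project's 
pom.xml might look roughly like the sketch below. Everything specific in it is 
an assumption for illustration only: the `com.example` groupId, the 
`dot-coprocessor` artifactId, and the version values are hypothetical 
placeholders, and the `hbase-server` artifactId presumes the modularized 
layout (a non-modularized release would depend on the single 
`org.apache.hbase:hbase` artifact instead).

```xml
<!-- Hypothetical pom.xml for an external coprocessor project; names and
     versions are placeholders, not taken from the DOT proposal. -->
<project xmlns="http://maven.apache.org/POM/4.0.0">
  <modelVersion>4.0.0</modelVersion>
  <groupId>com.example</groupId>
  <artifactId>dot-coprocessor</artifactId>
  <version>0.1.0-SNAPSHOT</version>
  <packaging>jar</packaging>

  <properties>
    <!-- Placeholder: use the exact version string of the HBase release
         that the target cluster runs. -->
    <hbase.version>0.96.0</hbase.version>
  </properties>

  <dependencies>
    <!-- "provided" scope: the HBase installation already supplies these
         classes at runtime, so they are not bundled into the jar. -->
    <dependency>
      <groupId>org.apache.hbase</groupId>
      <artifactId>hbase-server</artifactId>
      <version>${hbase.version}</version>
      <scope>provided</scope>
    </dependency>
  </dependencies>
</project>
```

With a layout like this, `mvn package` would produce a jar that can be dropped 
on the region server classpath, and a region observer class in it could then be 
registered through the `hbase.coprocessor.region.classes` property in 
hbase-site.xml.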
                
> Build a Document Store on HBase for Better Query Processing
> -----------------------------------------------------------
>
>                 Key: HBASE-6800
>                 URL: https://issues.apache.org/jira/browse/HBASE-6800
>             Project: HBase
>          Issue Type: New Feature
>          Components: coprocessors, performance
>    Affects Versions: 0.96.0
>            Reporter: Jason Dai
>         Attachments: dot-deisgn.pdf
>
>
> In the last couple of years, more and more people have begun to stream data 
> into HBase in near real time and use high-level queries (e.g., Hive) to 
> analyze the data in HBase directly. 
> While HBase already has very effective MapReduce integration with its good 
> scanning performance, query processing using MapReduce on HBase still has 
> significant gaps compared to HDFS: ~3x space overheads and 3~5x performance 
> overheads according to our measurements.
> We propose to implement a document store on HBase, which can greatly improve 
> query processing on HBase (by leveraging the relational model and read-mostly 
> access patterns). According to our prototype, it can reduce space usage by 
> up to ~3x and speed up query processing by up to ~1.8x.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
