[ 
https://issues.apache.org/jira/browse/HBASE-6800?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13457508#comment-13457508
 ] 

Jason Dai commented on HBASE-6800:
----------------------------------

bq. coprocessor based applications should begin as independent code 
contributions, perhaps hosted in a GitHub repository
bq. It would be helpful if only the changes on top of stock HBase code appear 
here.
This could work, though I think we need to figure out how to address several 
implications of the proposal, such as:
(1) How do users figure out which coprocessor applications are stable enough 
to use in their production deployments?
(2) How do we ensure that the coprocessor applications remain compatible with 
changes in the HBase project, and with each other?
(3) How do users get the coprocessor applications? They could no longer get 
them from the Apache HBase release and might need to perform manual 
integration, which is not something average business users will do. That is 
the main reason we put the full HBase source tree out (several of our users 
and customers want a prototype of DOT to try out).

bq. We would be delighted to work with you on the necessary coprocessor 
framework extensions. I'd recommend a separate JIRA specifically for this.
Yes, we do plan to submit the proposal for observers for the filter operations 
as a separate JIRA (the original plan was to make it a sub-task of this JIRA).


                
> Build a Document Store on HBase for Better Query Processing
> -----------------------------------------------------------
>
>                 Key: HBASE-6800
>                 URL: https://issues.apache.org/jira/browse/HBASE-6800
>             Project: HBase
>          Issue Type: New Feature
>          Components: coprocessors, performance
>    Affects Versions: 0.96.0
>            Reporter: Jason Dai
>         Attachments: dot-deisgn.pdf
>
>
> In the last couple of years, more and more people have begun to stream data 
> into HBase in near real time and to use high-level queries (e.g., Hive) to 
> analyze the data in HBase directly. While HBase already has very effective 
> MapReduce integration and good scanning performance, query processing using 
> MapReduce on HBase still has significant gaps compared to HDFS: ~3x space 
> overhead and 3-5x performance overhead according to our measurements.
> We propose to implement a document store on HBase, which can greatly improve 
> query processing on HBase (by leveraging the relational model and read-mostly 
> access patterns). According to our prototype, it can reduce space usage by 
> up to ~3x and speed up query processing by up to ~1.8x.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
