[jira] [Commented] (HBASE-6800) Build a Document Store on HBase for Better Query Processing

Andrew Purtell (JIRA) Tue, 18 Sep 2012 09:52:08 -0700

    [ 
https://issues.apache.org/jira/browse/HBASE-6800?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13457945#comment-13457945
 ]


Andrew Purtell commented on HBASE-6800:
---------------------------------------

bq. If moving your code into Apache is a goal, you could also start the 
coprocessor project in the Apache Incubator. You could do that while being 
consistent with Andrew's suggested methodology (not forking HBase, Mavenized 
integration...).

This is a good suggestion. Panthera isn't so much an enhancement to HBase as 
a full application on top of it, with a wider scope than just HBase: it also 
covers Hive and additional new components. Within the scope of the HBase 
project alone, API changes, core changes, and (incorporating my earlier 
comment) utility coprocessors of sufficient generality make a lot of sense, as 
does addressing the meta issues raised (i.e., should HBase have 
Eclipse-plugin-like tooling for getting and installing CPs?). HBase should be 
a good platform for your work; let us know what you need.
                
> Build a Document Store on HBase for Better Query Processing
> -----------------------------------------------------------
>
>                 Key: HBASE-6800
>                 URL: https://issues.apache.org/jira/browse/HBASE-6800
>             Project: HBase
>          Issue Type: New Feature
>          Components: coprocessors, performance
>    Affects Versions: 0.96.0
>            Reporter: Jason Dai
>         Attachments: dot-deisgn.pdf
>
>
> In the last couple of years, more and more people have begun to stream data 
> into HBase in near real time and to use high-level queries (e.g., Hive) to 
> analyze the data in HBase directly. 
> While HBase already has very effective MapReduce integration and good 
> scanning performance, query processing using MapReduce on HBase still has 
> significant gaps compared to HDFS: ~3x space overhead and 3-5x performance 
> overhead, according to our measurements.
> We propose to implement a document store on HBase, which can greatly improve 
> query processing on HBase (by leveraging the relational model and read-mostly 
> access patterns). According to our prototype, it can reduce space usage by 
> up to ~3x and speed up query processing by up to ~1.8x.

