RE: HBase: project ideas

Jonathan Gray Thu, 19 Aug 2010 12:27:31 -0700

Himanshu,

It seems like you might be interested in using Coprocessors for things like
low-latency aggregates.  This is a big area of interest for some of us, but
there has not been a lot of concerted effort in this direction yet.  There is
plenty to do here for a research project.
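
To make "low-latency aggregates" a bit more concrete, here is a rough sketch of the general idea: each region computes a small partial result next to its own data, and the client only merges those partials instead of pulling every row. This is plain Python and not HBase code; the names PartialAggregate, aggregate_region and aggregate_table are made up for illustration and do not correspond to any actual Coprocessor API.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple


@dataclass
class PartialAggregate:
    """Small partial result, one per region (hypothetical type)."""
    count: int = 0
    total: float = 0.0

    def merge(self, other: "PartialAggregate") -> "PartialAggregate":
        # Counts and sums combine associatively, so partials can be
        # merged in any order.
        return PartialAggregate(self.count + other.count,
                                self.total + other.total)


def aggregate_region(rows: Iterable[Tuple[str, float]]) -> PartialAggregate:
    """Assumed server-side step: scan one region's rows locally."""
    partial = PartialAggregate()
    for _row_key, value in rows:
        partial.count += 1
        partial.total += value
    return partial


def aggregate_table(regions: List[List[Tuple[str, float]]]) -> PartialAggregate:
    """Assumed client-side step: merge one small partial per region."""
    result = PartialAggregate()
    for region_rows in regions:
        result = result.merge(aggregate_region(region_rows))
    return result


# Toy data standing in for three regions of one table.
regions = [
    [("row-a", 1.0), ("row-b", 2.0)],
    [("row-m", 3.5)],
    [],
]
summary = aggregate_table(regions)
mean = summary.total / summary.count if summary.count else 0.0
```

Only count and sum travel back to the client here, which is why an average can be derived without shipping the rows themselves.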


Check out:

https://issues.apache.org/jira/browse/HBASE-2000

And specifically:

https://issues.apache.org/jira/browse/HBASE-1512

JG

> -----Original Message-----
> From: Himanshu Vashishtha [mailto:[email protected]]
> Sent: Thursday, August 19, 2010 11:30 AM
> To: [email protected]
> Cc: [email protected]
> Subject: Re: HBase: project ideas
> 
> Hello Stack,
> Thanks for the reply. Please see inline.
> 
> Cheers,
> Himanshu
> 
> On Thu, Aug 19, 2010 at 11:22 AM, Stack <[email protected]> wrote:
> 
> > On Thu, Aug 19, 2010 at 2:47 AM, Himanshu Vashishtha
> > <[email protected]> wrote:
> > > Dear All:
> > > I have been looking around HBase (running/debugging it, etc.) for a couple
> > > of weeks now, and it is fascinating. I am in search of a good project for my
> > > grad studies, focusing on HBase, but am not able to finalize it. I am
> > > looking for a project idea that I can use. It can be a user or a dev
> > > project, I am open to all :)
> > >
> > > One idea (user specific) is to migrate an XQuery-like tool that uses a
> > > relational db schema (there are a bunch of papers suggesting it) to HBase,
> > > but I am not sure whether it is really a judicious use of HBase. Please
> > > suggest.
> > >
> > >
> >
> > Hello Himanshu.
> >
> > It's hard to make a suggestion when I've no clue as to your interests.
> >
> Hadoop fascinates me. I wrote a tool for my lab which indexes a given
> document collection (of plain text files) and then lets the user query it
> with four predefined operations... I store those indexes on HDFS using
> MapFiles (to reduce the request-response latency).
> 
> > Can you cite some of the papers you mention?
> 
> So, I want to carry it forward for XML, and I came across two approaches:
> indexing the doc, OR storing them in an rdbms style while also considering
> schema info.
> 
> Paper (for the index-based approach): An efficient inverted index technique
> for XML documents using RDBMS, Chiyoung Seo, others..2003.
> 
> And for the rdbms approach: A Comprehensive XQuery to SQL Translation using
> Dynamic Interval Encoding. David DeHaan, David Toman, Mariano P. Consens,
> others... in 2003, and its references.
> 
> I developed a prototype for the index-based one in HBase, but it is limited
> in usage (due to its inherent approach of indexing, you can't fire elegant
> operations like summing, grouping, etc.). It's quite raw.
> 
> > + Have you looked at HIVE?  It might be more pertinent making this run
> > better atop hbase rather than making a new XQuery-like tool for hbase.
> >
> 
> Not yet. I read that it runs an MR job for every query, which slows its
> response time, so I skipped past it. But yes, I see it does provide a lot of
> relational schema stuff.
> 
> > + Build an app that allows various kinds of location queries using a
> > geohashing+hbase combo.  There are a few fellas floating on the list who
> > might be able to help you out on this project.
> >
> > For extra points, whatever you do, build it using hbase-2000 coprocessors.
> 
> I am sorry I couldn't get this.
> >
> 
> 
> > Thanks for writing to the list, Himanshu.
> > St.Ack
> >
