Re: Online/Realtime query with filter and join?

Pradeep Gollakota Mon, 02 Dec 2013 11:12:42 -0800

@Viral I'm not sure... I just know that they mentioned on the front page
that PrestoDB can query HBase tables.



On Mon, Dec 2, 2013 at 11:07 AM, James Taylor <[email protected]> wrote:

> I agree with Doug Meil's advice. Start with your row key design. In
> Phoenix, your PRIMARY KEY CONSTRAINT defines your row key. You should lead
> with the columns that you'll filter against most frequently. Then, take a
> look at adding secondary indexes to speed up queries against other columns.
>
> Thanks,
> James
>
>
> On Mon, Dec 2, 2013 at 11:01 AM, Pradeep Gollakota <[email protected]> wrote:
>
> > In addition to Impala and Phoenix, I'm going to throw PrestoDB into the
> > mix. :)
> >
> > http://prestodb.io/
> >
> >
> > On Mon, Dec 2, 2013 at 10:58 AM, Doug Meil <[email protected]> wrote:
> >
> > >
> > > > You are going to want to figure out a rowkey (or a set of tables with
> > > > rowkeys) to restrict the number of I/Os. If you just slap Impala in
> > > > front of HBase (or even Phoenix, for that matter) you could write SQL
> > > > against it, but if it winds up doing a full scan of an HBase table
> > > > underneath, you won't get your < 100ms response time.
> > >
> > > > Note: I'm not saying you can't do this with Impala or Phoenix, I'm just
> > > > saying start with the rowkeys first so that you limit the I/O. Then
> > > > start adding frameworks as needed (and/or build a schema with Phoenix
> > > > in the same rowkey exercise).
> > >
> > > > Such response-time requirements make me think that this is for
> > > > application support, so why the requirement for SQL? Might want to
> > > > start writing it as a Java program first.
> > >
> > >
> > >
> > >
> > >
> > >
> > >
> > >
> > >
> > > On 11/29/13 4:32 PM, "Mourad K" <[email protected]> wrote:
> > >
> > > >You might want to consider something like Impala or Phoenix, I presume
> > > >you are trying to do some report query for dashboard or UI?
> > > >MapReduce is certainly not adequate as there is too much latency on
> > > > >startup. If you want to give this a try, CDH4 and Impala are a good
> > > > >start.
> > > >
> > > >Mouradk
> > > >
> > > >On 29 Nov 2013, at 10:33, Ramon Wang <[email protected]> wrote:
> > > >
> > > > >> The general performance requirement for each query is less than 100 ms,
> > > > >> that's the average level. Sounds crazy, but yes we need to find a way
> > > > >> for it.
> > > >>
> > > >> Thanks
> > > >> Ramon
> > > >>
> > > >>
> > > > >> On Fri, Nov 29, 2013 at 5:01 PM, yonghu <[email protected]> wrote:
> > > >>
> > > > >>> The question is what you mean by "real-time". What is your
> > > > >>> performance requirement? In my opinion, MapReduce is not suitable
> > > > >>> for real-time data processing.
> > > >>>
> > > >>>
> > > > >>> On Fri, Nov 29, 2013 at 9:55 AM, Azuryy Yu <[email protected]> wrote:
> > > >>>
> > > > >>>> You can try Phoenix.
> > > >>>> On 2013-11-29 3:44 PM, "Ramon Wang" <[email protected]> wrote:
> > > >>>>
> > > >>>>> Hi Folks
> > > >>>>>
> > > > >>>>> It seems to be impossible, but I still want to check if there is a
> > > > >>>>> way we can do "complex" queries on HBase with "ORDER BY", "JOIN",
> > > > >>>>> etc., like we have with a normal RDBMS. We are asked to provide
> > > > >>>>> such a solution. Any ideas? Thanks for your help.
> > > >>>>>
> > > > >>>>> BTW, I think maybe Impala from CDH would be a way to go, but I
> > > > >>>>> haven't had time to check it yet.
> > > >>>>>
> > > >>>>> Thanks
> > > >>>>> Ramon
> > > >>>
> > >
> > >
> >
>

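The row key advice quoted above (lead the key with the columns you filter on
most often, so that a query becomes a short range scan instead of a full table
scan) can be illustrated with a small sketch. This is a toy in-memory model in
Python, not HBase or Phoenix code: the sorted list stands in for HBase's
key-ordered storage, and the key layout "<customer_id>|<order_date>|<order_id>",
the sample rows, and the function names are all assumptions made up for the
example.

```python
from bisect import bisect_left

# Toy stand-in for a table whose rows are stored sorted by row key.
# Hypothetical key layout: "<customer_id>|<order_date>|<order_id>"
rows = sorted([
    ("c001|2013-11-01|o17", {"total": 40}),
    ("c001|2013-11-29|o42", {"total": 75}),
    ("c002|2013-11-15|o23", {"total": 10}),
    ("c002|2013-12-02|o51", {"total": 90}),
    ("c003|2013-11-30|o44", {"total": 55}),
])
keys = [key for key, _ in rows]


def prefix_scan(prefix):
    """Range scan: seek to the first key >= prefix, read while keys match.

    Returns (matching rows, number of matching rows read).
    """
    start = bisect_left(keys, prefix)
    matched = []
    i = start
    while i < len(keys) and keys[i].startswith(prefix):
        matched.append(rows[i])
        i += 1
    return matched, len(matched)


def full_scan(predicate):
    """Filter on something that is not a key prefix: every row is read.

    Returns (matching rows, number of rows read).
    """
    matched = []
    touched = 0
    for key, value in rows:
        touched += 1
        if predicate(key, value):
            matched.append((key, value))
    return matched, touched


# Filtering on the leading key column reads only that customer's rows.
by_customer, read_by_prefix = prefix_scan("c002|")

# Filtering on a non-key value has to look at the whole table.
big_orders, read_by_scan = full_scan(lambda key, value: value["total"] > 50)

print(len(by_customer), "matches,", read_by_prefix, "rows read (prefix scan)")
print(len(big_orders), "matches,", read_by_scan, "rows read (full scan)")
```

With five rows the difference is trivial, but the number of rows read by the
prefix scan grows with the size of the result, while the full scan grows with
the size of the table. That gap is what decides whether a sub-100 ms target is
realistic.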