Thanks Ted. What I meant was that pre-computing the aggregations, as with cubes, might be better. But as you mentioned, that only holds if I know the queries ahead of time.
On Mon, Jun 10, 2013 at 2:20 PM, Ted Dunning <[email protected]> wrote:
> On Mon, Jun 10, 2013 at 10:35 AM, AnilKumar B <[email protected]> wrote:
>
> > Hi,
> >
> > I went through the Drill documentation and am going through the source
> > code. I have a few questions regarding Drill. Can anyone help me
> > understand it better?
> >
> > 1) How are the Drill aggregations real time? It is going to scan all
> > the records anyway, right? What exactly does it optimize compared to
> > MapReduce-based Hive (considering the index feature)?
>
> Real-time is often used in a bit of a sloppy fashion. The meaning with
> respect to Drill is "ad hoc, interactive queries".
>
> > 2) For aggregations, isn't cube materialization a better solution?
> > For example, an HBase-Lattice kind of solution.
>
> Cubes are fine if you know what you are doing ahead of time. They still
> require a pass over the data. Nothing prevents Drill from creating and/or
> using cubes.
>
> > 3) What exactly are the real use cases for Drill? Whenever we say
> > interactive, we mostly include aggregations, and when we say aggregations
> > they definitely cannot be real time when we scan the whole raw data.
>
> Aggregation is a fine use case. There are many others as well. For
> instance, incremental cooccurrence counting. Or, with special UDFs, the
> inner loop of many machine learning applications.
>
> Drill has an especially flexible scanner API which will allow cross data
> source scanning.
>
> Not sure what you are getting at, though, so I may have misinterpreted
> something you said.
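As an aside, the cube trade-off Ted describes can be sketched with a toy example (plain Python, not Drill code; the record layout and dimension names here are made up for illustration): a pre-computed cube answers the group-bys it was built for without rescanning, but any query on a dimension the cube lacks still forces a full pass over the raw data.

```python
# Toy illustration of cube pre-aggregation vs. ad hoc scanning.
# All names and data here are hypothetical, not from Drill.
from collections import defaultdict

# Hypothetical raw fact records: (region, product, hour, amount).
records = [
    ("us", "book", 9, 10.0),
    ("us", "book", 10, 5.0),
    ("eu", "toy", 9, 7.0),
    ("eu", "book", 11, 3.0),
]

def build_cube(rows, dims):
    """One pass over the data pre-aggregates sums for the chosen dims."""
    cube = defaultdict(float)
    for region, product, hour, amount in rows:
        fields = {"region": region, "product": product, "hour": hour}
        cube[tuple(fields[d] for d in dims)] += amount
    return cube

# Cube built ahead of time over (region, product) only.
cube = build_cube(records, ("region", "product"))

# Known-in-advance query: answered from the cube, no rescan.
print(cube[("us", "book")])                         # 15.0

# Ad hoc query on a dimension the cube lacks (hour): full scan required.
print(sum(a for _, _, h, a in records if h == 9))   # 17.0
```

This is the point about knowing the workload ahead of time: the cube still costs one pass to build, and only helps for the dimension combinations chosen up front, whereas an ad hoc engine like Drill scans on demand.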
