Re: Probabilistic data structures in Drill

Edmon Begoli Sun, 01 May 2016 19:27:06 -0700

Yes, I am preparing a research seminar, and I am doing a survey of the uses
of probabilistic and synopsis data structures in post-Hadoop "Big Data"
technologies.


On Sun, May 1, 2016 at 8:34 PM, Julian Hyde <[email protected]> wrote:

> Drill also makes use of hash tables and hash partitioning.
>
> I’m not sure what the purpose of your question was. Are you carrying out a
> survey?
>
> Julian
>
>
> > On May 1, 2016, at 5:22 PM, Ted Dunning <[email protected]> wrote:
> >
> > Drill doesn't use any such data structures in itself. The emphasis has been
> > on being correct first with the option of introducing approximations later.
> >
> > That said, you can definitely define aggregators yourself. Last I checked,
> > however, user defined aggregators are single level ... that means that
> > everything that gets aggregated has to go through a single function which
> > definitely limits scalability. This was several months ago, though, so
> > things may have improved by now.
> >
> > Perhaps somebody can comment on whether multi-level user-defined
> > aggregators are possible?
> >
> >
> >
> > On Sat, Apr 30, 2016 at 8:32 AM, Edmon Begoli <[email protected]> wrote:
> >
> >> Is Drill using any of the probabilistic data structures [1], and if so -
> >> which ones and how?
> >>
> >> Thank you,
> >> Edmon
> >>
> >> 1. Probabilistic Data Structures -
> >> https://en.m.wikipedia.org/wiki/Category:Probabilistic_data_structures
> >>
>
>
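
The single-level versus multi-level distinction raised in the thread can be
sketched outside of Drill. The Python below is only an illustration under
stated assumptions: it does not use Drill's aggregate-function API, the names
partial, merge and estimate are hypothetical, and the K-minimum-values
distinct-count sketch is just one probabilistic structure picked as an
example. It shows that when partial states can be merged, each partition can
be aggregated independently and the small partial results combined, so no
single function has to see every row.

```python
import hashlib

K = 256  # sketch size; an arbitrary value chosen for this illustration


def _hash64(value):
    """Map a value to a 64-bit integer using SHA-1 (stable across runs)."""
    digest = hashlib.sha1(str(value).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def partial(values, k=K):
    """Level 1: summarize one partition as its k smallest distinct hashes."""
    return sorted({_hash64(v) for v in values})[:k]


def merge(a, b, k=K):
    """Level 2: combine two partial sketches into one of the same size."""
    return sorted(set(a) | set(b))[:k]


def estimate(sketch, k=K):
    """Estimate the number of distinct values from a KMV sketch."""
    if len(sketch) < k:
        # Fewer than k distinct hashes were seen, so the count is exact.
        return float(len(sketch))
    kth_smallest = sketch[k - 1] / 2**64  # normalize to the unit interval
    return (k - 1) / kth_smallest


# 20,000 rows holding 5,000 distinct values.
values = [i % 5000 for i in range(20000)]

# Single-level: one function sees every row.
single = partial(values)

# Multi-level: four partitions are summarized independently, then merged.
parts = [partial(values[i::4]) for i in range(4)]
merged = parts[0]
for p in parts[1:]:
    merged = merge(merged, p)
```

Here merged and single hold the same sketch, so estimate returns the same
approximate distinct count either way. Only the merge step has to run in one
place, and it handles at most K hashes per partition rather than every row.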
