Yes, I am preparing a research seminar, and I am doing a survey of the uses of probabilistic and synopsis data structures in post-Hadoop "Big Data" technologies.
On Sun, May 1, 2016 at 8:34 PM, Julian Hyde <[email protected]> wrote:

> Drill also makes use of hash tables and hash partitioning.
>
> I’m not sure what was the purpose of your question. Are you carrying out a
> survey?
>
> Julian
>
> > On May 1, 2016, at 5:22 PM, Ted Dunning <[email protected]> wrote:
> >
> > Drill doesn't use any such data structures in itself. The emphasis has been
> > on being correct first with the option of introducing approximations later.
> >
> > That said, you can definitely define aggregators yourself. Last I checked,
> > however, user defined aggregators are single level ... that means that
> > everything that gets aggregated has to go through a single function which
> > definitely limits scalability. This was several months ago, though, so
> > things may have improved by now.
> >
> > Perhaps somebody can comment on whether multi-level user-defined
> > aggregators are possible?
> >
> > On Sat, Apr 30, 2016 at 8:32 AM, Edmon Begoli <[email protected]> wrote:
> >
> >> Is Drill using any of the probabilistic data structures [1], and if so -
> >> which ones and how?
> >>
> >> Thank you,
> >> Edmon
> >>
> >> 1. Probabilistic Data Structures -
> >> https://en.m.wikipedia.org/wiki/Category:Probabilistic_data_structures
