I have to admit, I didn't realize columnar was such a big part of Drill.  I 
guess that's consistent with Dremel, so it makes sense.  I always thought the 
emphasis was on heterogeneous data access, not on perf.  Cool!

So with that in mind, does Drill do much with vector processing/SIMD operations?

-----Original Message-----
From: Jacques Nadeau [mailto:[email protected]] 
Sent: Monday, August 17, 2015 1:17 AM
To: [email protected]
Subject: Re: Benchmarks for Apache Drill

Drill is very fast.  This is because nearly everybody on the Drill team is 
focused on performance.  We haven't published any formal benchmarks yet.
That being said, there are a few out there.  I see that Ted mentioned the Intel 
one.  Another is here [1]. As Ted mentioned, these blogs test older and 
pre-release versions of Drill.  Nonetheless, Drill already outshines nearly all 
of the competition.  That being said, the reality is that most benchmarks are 
very skewed and poorly executed so I strongly recommend you try out Drill on 
your workload.  Once you get set up, ask the community for help tuning the 
system.  Many others are finding it to be incredibly fast and it has repeatedly 
displaced commercial MPP databases and older open source technologies.

Drill is the only open source pure columnar in-memory execution engine today.  
This means that Drill has the right architecture to continue to increase its 
lead over other engines. (Think of this as future-proofing.)  We'll be 
enhancing the engine with items including columnar functions, compilation 
optimizations and customized relational operators in the coming months.  This 
will simply extend Drill's performance lead.

thanks,
Jacques

[1] http://allegro.tech/fast-data-hackathon.html

--
Jacques Nadeau
CTO and Co-Founder, Dremio

On Sun, Aug 16, 2015 at 1:47 AM, Ming Han Teh <[email protected]> wrote:

> Hi,
>
> Are there any benchmarks on Apache Drill?
> (standalone benchmarks OR vs Impala/Presto)
>
> Thanks,
> Ming Han
>
