Re: query time comparison to several SQL engines

James Turton Thu, 07 Apr 2022 07:30:27 -0700

What might be the biggest factor affecting running time here is that Drill's query execution is not fault tolerant while Spark's is. The philosophy is different. Drill's says "when you're doing interactive analytics and a node dies, killing your query as it goes, just run the query again."
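
A minimal sketch of what "just run the query again" could look like from the client side. Everything here is an assumption for illustration: `run_with_retry` and `flaky_query` are hypothetical names, not part of Drill or Spark, and a real client would pass in whatever function submits its SQL.

```python
import time


def run_with_retry(run_query, attempts=3, delay_seconds=0.0):
    """Call run_query() until it succeeds or the attempts run out.

    Hypothetical helper: the engine does no recovery of its own,
    so the client simply resubmits the whole query after a failure.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return run_query()
        except Exception as error:
            # The query died part-way (e.g. a node went down); start over.
            last_error = error
            if attempt < attempts:
                time.sleep(delay_seconds)
    raise last_error


# Stand-in for a real query submission: fails once, then succeeds.
calls = {"count": 0}


def flaky_query():
    calls["count"] += 1
    if calls["count"] == 1:
        raise ConnectionError("node died mid-query")
    return [("row", 1)]


result = run_with_retry(flaky_query)
```

The trade-off is that a failed query loses all of its work, which is acceptable for short interactive queries and much less so for long-running jobs.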


On 2022/04/07 16:11, Wes Peng wrote:

Hi Jacek,
Spark and Drill are not directly related, but they have a similar architecture.
If you read the book "Learning Apache Drill" (I guess it's free online), chapter 3 gives you Drill's SQL engine architecture.
It's quite similar to Spark's.
And the distributed implementation architecture is almost the same as Spark's.
Though they are separate products, they have a similar implementation IMO.
No, I didn't use a statement optimized for Drill. It's just a common SQL statement.
The reason Drill is faster, I think, is Drill's direct mmap technology. It consumes more memory than Spark, so it is faster.
Thanks.


Jacek Laskowski wrote:
Is it true that Drill is Spark, or vice versa, under the hood? If so, how is it possible that Drill is faster? What does Drill do to make the query faster? Could it be that you used a type of query Drill is optimized for? Just guessing and am really curious (not implying that one is better or worse than the other(s)).



