Re: Drill Fragment setup time

Paul Rogers Wed, 05 Feb 2020 13:56:10 -0800

Hi Aditya,

While I cannot comment on MapR-DB in particular, I can say that, in general, 
Drill is designed for fairly large queries. There is a trade-off between the 
overhead of code gen and planning vs. the cost at runtime. Drill tends to 
invest more in up-front planning and code gen to minimize runtime costs.


Of course, if your query scans just a few rows (MapR-DB has indexes), then 
Drill's trade-off might not work out as well as if Drill were scanning multiple 
GBs of data.
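
As a rough illustration of this trade-off, the sketch below takes the setup and process times from the profile quoted further down and asks what fraction of the fragment's time would go to setup at larger row counts. It rests on two assumptions that are not from Drill itself: that setup is a fixed cost, and that process time grows linearly with the number of rows. The function name setup_share is made up for this example.

```python
# Figures copied from the minor-fragment profile quoted in this thread.
SETUP_S = 6.242    # setup time, treated here as a fixed cost (assumption)
PROCESS_S = 0.384  # process time reported for the fragment
ROWS = 10_235      # "Max Records" value, used here as the row count (assumption)


def setup_share(rows: int) -> float:
    """Return the fraction of (setup + process) time spent in setup.

    Assumes process time scales linearly with the row count, which is a
    simplification for illustration only.
    """
    process = PROCESS_S * rows / ROWS
    return SETUP_S / (SETUP_S + process)


# Compare the observed row count with hypothetical larger scans.
for rows in (ROWS, 100 * ROWS, 10_000 * ROWS):
    print(f"{rows:>12,} rows: setup is {setup_share(rows):.1%} of the total")
```

With the observed numbers, setup accounts for roughly 94% of the fragment's time; under the linear assumption that share shrinks quickly as the scan grows, which is the case Drill is tuned for.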

That said, 6 seconds seems like a long time. In my experience, Drill can set up 
and execute queries in a few hundred ms. So, there are two possible sources of 
delay.

First, how complex is the query? Simple queries should be very fast. If, 
however, the query has a very large number of columns, GROUP BY keys, etc., 
then we have occasionally seen longer planning times.

The other possible delay would relate to interaction with MapR-DB. For that, 
MapR folks would have better insight and might offer ways of identifying and 
resolving any issues.

Thanks,
- Paul

 

    On Wednesday, February 5, 2020, 12:41:56 PM PST, Aditya Allamraju wrote:
 
 Team,

Is there a way to reduce the "setup time" for a minor fragment?
In my case, it's Drill on a MapR-DB JSON table.

According to the documentation, setup time is the time consumed for "runtime
code generation and opening a file".
While going through a query profile, I see the following:

Minor Fragment  Hostname   Setup Time  Process Time  Wait Time  Max Batches  Max Records  Peak Memory
07-00-03        hostA.com  *6.242s*    0.384s        0.000s     3            10,235       7MB

Thanks
Aditya
