Hi Aditya,
While I cannot comment on MapR-DB in particular, I can say that, in general,
Drill is designed for fairly large queries. There is a trade-off between the
overhead of code gen and planning vs. the cost at runtime. Drill tends to
invest more in up-front planning and code gen to minimize runtime costs.
Of course, if your query scans just a few rows (MapR-DB has indexes), then
Drill's trade-off might not work out as well as if Drill were scanning multiple
GBs of data.
That said, 6 seconds seems like a long time. In my experience, Drill can set up
and execute queries in a few hundred ms. So, there are two possible sources of
delay.
First, how complex is the query? Simple queries should be very fast. If,
however, you have a very large number of columns, GROUP BY keys, etc., then we
have occasionally seen longer planning times.
The other possible delay would relate to interaction with MapR-DB. For that,
MapR folks would have better insight and might offer ways of identifying and
resolving any issues.
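To put the numbers from your profile row in perspective, here is a minimal sketch that computes how much of the fragment's time went to setup. The helper names (parse_seconds, setup_share) are hypothetical and not part of Drill, and the sketch assumes the durations are reported in seconds with an "s" suffix, as in your output:

```python
def parse_seconds(text: str) -> float:
    """Convert a profile duration such as '6.242s' to float seconds.

    Assumes a plain seconds value with an 's' suffix; any '*' emphasis
    markers copied from the email are stripped first.
    """
    return float(text.strip().strip("*").rstrip("s"))


def setup_share(setup: str, process: str, wait: str) -> float:
    """Return setup time as a fraction of setup + process + wait."""
    parts = [parse_seconds(t) for t in (setup, process, wait)]
    total = sum(parts)
    return parts[0] / total if total else 0.0


# Values taken from the minor fragment 07-00-03 row in the original message.
share = setup_share("6.242s", "0.384s", "0.000s")
print(f"setup share: {share:.1%}")  # prints: setup share: 94.2%
```

In other words, roughly 94% of that fragment's time was setup rather than processing, which is why the setup step is the place to look.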
Thanks,
- Paul
On Wednesday, February 5, 2020, 12:41:56 PM PST, Aditya Allamraju
<[email protected]> wrote:
Team,
Is there a way to reduce the "setup time" for a minor fragment?
In my case, it's Drill on a MapR-DB JSON table.
According to the documentation, it is the time consumed for "runtime code generation and
opening a file".
While going through a query profile, I see the following:
Minor Fragment | Hostname  | Setup Time | Process Time | Wait Time | Max Batches | Max Records | Peak Memory
07-00-03       | hostA.com | *6.242s*   | 0.384s       | 0.000s    | 3           | 10,235      | 7MB
Thanks
Aditya