I am performance testing a single Drill instance with different vCPU configurations in AWS. I have Parquet files on an EFS volume and use the same data for each EC2 instance.
I have tested with 4, 8, and 16 vCPUs. Query time is ~25, ~15, and ~12 seconds, respectively. I have not changed any of the options; this is an out-of-the-box 1.11 installation. What Drill tuning options should I experiment with? I have read https://drill.apache.org/docs/asynchronous-parquet-reader/ but it is too technical for me to digest, and it reads as though the default options are the best ones.

The query looks like this:

    SELECT store_key, SUM(sales_dollars) sd
    FROM dfs.root.sales_p
    GROUP BY store_key
    ORDER BY sd DESC
    LIMIT 10

Dan Holmes | Architect | Revenue Analytics, Inc.
300 Galleria Parkway, Suite 1900 | Atlanta, Georgia 30339
Direct: 770.859.1255 Cell: 404.617.3444
www.revenueanalytics.com
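
P.S. To put the timings in this message in perspective, here is a rough sketch of the scaling they imply. It assumes the approximate figures of 25, 15, and 12 seconds at 4, 8, and 16 vCPUs, and treats the 4 vCPU run as the baseline; the function name and layout are only illustrative.

```python
# Approximate timings from this message: vCPUs -> seconds (rounded figures).
timings = {4: 25.0, 8: 15.0, 16: 12.0}

def scaling(timings):
    """Return (vcpus, speedup, efficiency) relative to the smallest configuration."""
    base_cpus = min(timings)
    base_time = timings[base_cpus]
    rows = []
    for cpus in sorted(timings):
        speedup = base_time / timings[cpus]   # how much faster than the baseline
        ideal = cpus / base_cpus              # speedup if scaling were linear
        rows.append((cpus, speedup, speedup / ideal))
    return rows

for cpus, speedup, efficiency in scaling(timings):
    print(f"{cpus:>2} vCPUs: speedup {speedup:.2f}x, efficiency {efficiency:.0%}")
```

By this measure, doubling from 4 to 8 vCPUs gives roughly a 1.67x speedup, while quadrupling to 16 gives only about 2.08x, so the returns diminish noticeably with the default options.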
