Another thought: record batch tree creation time should be short. If any
substantial work needs to be done, we should move it to setup.

On Sep 25, 2015 6:47 AM, "Jacques Nadeau" <jacq...@dremio.com> wrote:
> Limit zero shouldn't use any readers if we know the schema. Look at the
> upstream constant reduction rule. We should be able to go straight from
> Calcite algebra to result without hitting any execution code. Think direct
> response, same as explain.
>
> On Sep 24, 2015 10:46 PM, "Jinfeng Ni" <jinfengn...@gmail.com> wrote:
>
>> The query itself is quite simple; it normally should not take 60
>> seconds for planning. I guess most of the planning time is spent on
>> reading parquet metadata. The metadata caching that Steven worked on
>> should help in this case.
>>
>> On Thu, Sep 24, 2015 at 10:42 PM, Sudheesh Katkam <skat...@maprtech.com> wrote:
>> > For the table below, 33 seconds for execution (which includes parquet
>> > reader initialization) and 60 seconds for planning.
>> >
>> >> On Sep 24, 2015, at 10:01 PM, Jinfeng Ni <jinfengn...@gmail.com> wrote:
>> >>
>> >> "FragmentExecutor took 1,070,926 ms to create RecordBatch tree."
>> >>
>> >> 1,070,926 ms is roughly 17.8 minutes. In other words, the majority of
>> >> the 18 minutes of execution in the Hive case is spent on the
>> >> initialization of Hive readers. If we want to improve "limit n", we
>> >> should probably make initialization of the Hive readers lazy: only when
>> >> Drill has to read rows from a reader do we initialize it. Otherwise,
>> >> initializing all the readers before reading a single row means a long
>> >> setup time for a "limit n" query when n is relatively small.
>> >>
>> >> For the second case, the 94-second query time seems too long as well.
>> >> I guess most of the time is spent on parquet reader initialization (?)
>> >>
>> >> On Thu, Sep 24, 2015 at 9:32 PM, Sudheesh Katkam <skat...@maprtech.com> wrote:
>> >>> Hey y'all,
>> >>>
>> >>> ### Short Question:
>> >>>
>> >>> How do we improve the performance of SELECT * FROM plugin.table LIMIT 0?
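[Editor's note: Jacques' suggestion above -- answering LIMIT 0 directly from the planner when the schema is known, the same way EXPLAIN answers without running a fragment -- could be sketched roughly as follows. All names here are invented for illustration; this is not Calcite's or Drill's actual API.]

```java
import java.util.Collections;
import java.util.List;

// Hypothetical short circuit: if the limit is zero and the schema is already
// known from metadata, build the empty result directly during planning --
// a direct response like EXPLAIN, never creating readers or fragments.
class LimitZeroShortCircuit {

    static final class DirectResult {
        final List<String> columns;          // schema header only
        DirectResult(List<String> columns) {
            this.columns = Collections.unmodifiableList(columns);
        }
        int rowCount() { return 0; }         // LIMIT 0 => no rows, ever
    }

    // Returns a direct result when applicable, or null to fall through to
    // normal fragment execution.
    static DirectResult tryDirectResponse(long limit, List<String> knownSchema) {
        if (limit == 0 && knownSchema != null) {
            return new DirectResult(knownSchema);
        }
        return null;                          // unknown schema or limit > 0
    }
}
```

With a rule like this, SELECT * FROM hive.region LIMIT 0 would return the three-column header without opening any of the 20,000 readers discussed below.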
>> >>>
>> >>> ### Extended Question:
>> >>>
>> >>> While investigating DRILL-3623
>> >>> <https://issues.apache.org/jira/browse/DRILL-3623>, I did an analysis
>> >>> to see where we spend time for a SELECT * FROM hive.table LIMIT 0 query.
>> >>>
>> >>> ## Setup:
>> >>> Copy drill/sample-data/region.parquet (x 20000) into a DFS
>> >>> (MapR-FS in my case) directory named region. Create a Hive external
>> >>> table pointing to region. Run Drill with the default configuration.
>> >>>
>> >>> ## Now there are two ways to query this table:
>> >>>
>> >>>> SELECT * FROM hive.region LIMIT 0;
>> >>> +--------------+---------+------------+
>> >>> | r_regionkey  | r_name  | r_comment  |
>> >>> +--------------+---------+------------+
>> >>> +--------------+---------+------------+
>> >>> No rows selected (1203.179 seconds)
>> >>> ...
>> >>>
>> >>>> SELECT * FROM dfs.test.region LIMIT 0;
>> >>> +--------------+---------+------------+
>> >>> | r_regionkey  | r_name  | r_comment  |
>> >>> +--------------+---------+------------+
>> >>> +--------------+---------+------------+
>> >>> No rows selected (94.396 seconds)
>> >>>
>> >>> Currently, we use HiveRecordReader in the first case and
>> >>> ParquetRecordReader in the second case. With DRILL-3209
>> >>> <https://issues.apache.org/jira/browse/DRILL-3209>, both queries will
>> >>> use ParquetRecordReader. However, for formats that are non-native to
>> >>> Drill, or for other storage plugins, we still face this problem.
>> >>> Summarizing the query profiles:
>> >>>
>> >>> +-------+-----------+---------------+----------------+
>> >>> | Query | Fragments | Planning time | Execution time |
>> >>> +-------+-----------+---------------+----------------+
>> >>> | hive  | 1         | ~2 min        | ~18 min        |
>> >>> | dfs   | 1         | ~1 min        | ~33 sec        |
>> >>> +-------+-----------+---------------+----------------+
>> >>>
>> >>> ## The time hogs:
>> >>>
>> >>> # Planning time in both cases needs to improve. How?
>> >>>
>> >>> # With respect to execution: in the first case, the ImplCreator.getExec(…)
>> >>> call in the FragmentExecutor took 1,070,926 ms to create the RecordBatch
>> >>> tree. There are 20,000 readers being initialized in HiveScanBatchCreator.
>> >>> How do we avoid this? What are the implications of chained impersonation
>> >>> (opening readers in ctor() rather than in setup())?
>> >>>
>> >>> ### Extending further:
>> >>>
>> >>> This can be generalized to any "LIMIT n" query where n is a small
>> >>> number. For n > 0, we parallelize scanning, so a LIMIT 1 query runs
>> >>> faster than LIMIT 0. However, there is a sweet "n" after which
>> >>> parallelization hurts.
>> >>>
>> >>> ###
>> >>>
>> >>> Thank you,
>> >>> Sudheesh
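[Editor's note: one way to read Jinfeng's lazy-initialization suggestion and Sudheesh's ctor-vs-setup question together: keep only cheap factories at RecordBatch-tree creation time, and open each reader when next() first needs it. A minimal sketch with hypothetical names -- not Drill's real RecordReader or ScanBatch interfaces.]

```java
import java.util.Iterator;
import java.util.List;
import java.util.function.Supplier;

// Stand-in for an expensive reader (e.g. a Hive reader whose constructor
// opens files and reads footers).
interface RecordReader {
    int next();   // rows produced by this call, 0 when the reader is done
}

// Scan over many readers that defers reader construction until rows are
// actually requested. For a small LIMIT n, only the first reader(s) are
// ever opened, instead of all 20,000 up front.
class LazyScan {
    private final Iterator<Supplier<RecordReader>> pending;
    private RecordReader current;
    int openedReaders = 0;   // instrumentation for this sketch

    LazyScan(List<Supplier<RecordReader>> readerFactories) {
        this.pending = readerFactories.iterator();   // cheap: nothing opened yet
    }

    int next() {
        while (true) {
            if (current == null) {
                if (!pending.hasNext()) return 0;    // all readers exhausted
                current = pending.next().get();      // open lazily, only here
                openedReaders++;
            }
            int rows = current.next();
            if (rows > 0) return rows;
            current = null;                          // advance to next reader
        }
    }
}
```

Under this shape, a LIMIT 1 over the 20,000-file region table would open exactly one reader before the limit operator cuts the scan off.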
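[Editor's note: for the "sweet n" observation at the end, one heuristic -- purely illustrative, not Drill's planner logic -- is to cap scan width by the number of readers plausibly needed to satisfy the limit, rather than parallelizing across every file. The estRowsPerReader estimate is an assumed input.]

```java
// Cap scan parallelization width by the limit. estRowsPerReader is an
// assumed per-reader row estimate; limit < 0 means "no limit".
class LimitAwareParallelism {
    static int scanWidth(long limit, int totalReaders,
                         long estRowsPerReader, int maxWidth) {
        if (limit < 0) {
            return Math.min(totalReaders, maxWidth);   // no limit: full width
        }
        // Readers needed to produce `limit` rows, rounded up, at least one.
        long needed = (limit + estRowsPerReader - 1) / estRowsPerReader;
        needed = Math.max(1, needed);
        return (int) Math.min(Math.min(needed, totalReaders), maxWidth);
    }
}
```

For small n the width collapses toward one reader (avoiding the setup cost that dominates here), while large n still gets full parallelism.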