Sudheesh Katkam created DRILL-3921:
--------------------------------------
Summary: Hive LIMIT 1 queries take too long
Key: DRILL-3921
URL: https://issues.apache.org/jira/browse/DRILL-3921
Project: Apache Drill
Issue Type: Bug
Components: Execution - Flow
Reporter: Sudheesh Katkam
Assignee: Sudheesh Katkam
Fragment initialization on a Hive table that is backed by a directory of many
files can take a very long time. This is most evident in LIMIT 1 queries, which
need only one row. The root cause is that the underlying reader in
HiveRecordReader is initialized when the constructor is called, rather than
when setup is called.
Two changes need to be made:
1) Lazily initialize the underlying record reader in HiveRecordReader.
2) Allow a callable to be run as a proxy user within an operator (through
OperatorContext). This is required because the underlying record reader must
be initialized as a proxy user (a proxy for the owner of the file).
Previously, the proxy-user handling took place while creating the record batch
tree.
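
The two changes above can be illustrated with a minimal, self-contained sketch.
It is not Drill's code: ProxyRunner, RecordingProxyRunner, UnderlyingReader and
LazyReaderSketch are hypothetical names invented for this example, and the
runAs method only stands in for whatever OperatorContext would expose. In a
real Hadoop setting the proxy step would presumably go through something like
UserGroupInformation.doAs, which is omitted here to keep the sketch free of
dependencies.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;

// Hypothetical stand-in for a "run this callable as a proxy user" facility.
interface ProxyRunner {
    <T> T runAs(String user, Callable<T> task) throws Exception;
}

// Test double: records which user each task ran as, then runs it inline.
final class RecordingProxyRunner implements ProxyRunner {
    final List<String> users = new ArrayList<>();

    @Override
    public <T> T runAs(String user, Callable<T> task) throws Exception {
        users.add(user);
        return task.call();
    }
}

// Hypothetical stand-in for the expensive underlying record reader.
final class UnderlyingReader {
    final String openedBy;

    UnderlyingReader(String openedBy) {
        this.openedBy = openedBy;
    }
}

// Sketch of a reader whose constructor is cheap and whose setup() does the work.
final class LazyReaderSketch {
    private final String fileOwner;
    private final ProxyRunner runner;
    private UnderlyingReader reader; // stays null until setup() is called

    LazyReaderSketch(String fileOwner, ProxyRunner runner) {
        // Only remember what is needed later; no initialization happens here.
        this.fileOwner = fileOwner;
        this.runner = runner;
    }

    void setup() throws Exception {
        if (reader == null) {
            // Initialize as the file owner, and only on first use.
            reader = runner.runAs(fileOwner, () -> new UnderlyingReader(fileOwner));
        }
    }

    boolean isInitialized() {
        return reader != null;
    }

    String openedBy() {
        return reader == null ? null : reader.openedBy;
    }
}
```

With this shape, constructing many readers up front costs almost nothing, and a
LIMIT 1 query pays the initialization cost only for the readers on which setup
is actually called.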