[
https://issues.apache.org/jira/browse/HIVE-20552?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Teddy Choi updated HIVE-20552:
------------------------------
Attachment: HIVE-20552.3.patch
> Get Schema from LogicalPlan faster
> ----------------------------------
>
> Key: HIVE-20552
> URL: https://issues.apache.org/jira/browse/HIVE-20552
> Project: Hive
> Issue Type: Improvement
> Reporter: Teddy Choi
> Assignee: Teddy Choi
> Priority: Major
> Labels: pull-request-available
> Attachments: HIVE-20552.1.patch, HIVE-20552.2.patch,
> HIVE-20552.3.patch
>
>
> Getting the schema of a query currently requires compiling, optimizing, and
> generating a TezPlan, which adds unnecessary overhead when only the
> LogicalPlan is needed.
> 1. Copy the method {{HiveMaterializedViewsRegistry.parseQuery}}, making it
> {{public static}} and putting it in a utility class.
> 2. Change the return statement of the method to
> {{return analyzer.getResultSchema();}}
> 3. Change the return type of the method to {{List<FieldSchema>}}.
> 4. Call the new method from {{GenericUDTFGetSplits.createPlanFragment}},
> replacing the current code, which does this:
> {code}
> if (num == 0) {
>   // Schema only
>   return new PlanFragment(null, schema, null);
> }
> {code}
> Move the call earlier in {{getPlanFragment}}, right after the HiveConf
> is created, bypassing the code that uses {{HiveTxnManager}} and
> {{Driver}}.
> 5. Convert the {{List<FieldSchema>}} to
> {{org.apache.hadoop.hive.llap.Schema}}.
> 6. Return from {{getPlanFragment}} by returning
> {{new PlanFragment(null, schema, null)}}.
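
The steps above can be sketched roughly as follows. This is a minimal, self-contained illustration and not the attached patch: FieldSchema, Schema, and PlanFragment are simplified stand-ins for the Hive classes, and parseQuery, convertSchema, and compileFully are hypothetical placeholders for the real analyzer call, the schema conversion, and the full compile/optimize/TezPlan path.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class SchemaOnlySketch {
    // Simplified stand-in for Hive's FieldSchema (assumption: name and type only).
    static final class FieldSchema {
        final String name;
        final String type;
        FieldSchema(String name, String type) { this.name = name; this.type = type; }
    }

    // Simplified stand-in for org.apache.hadoop.hive.llap.Schema.
    static final class Schema {
        final List<String> columns;
        Schema(List<String> columns) { this.columns = columns; }
    }

    // Simplified stand-in for PlanFragment; only the schema slot is filled on the fast path.
    static final class PlanFragment {
        final Object work;
        final Schema schema;
        final Object jobConf;
        PlanFragment(Object work, Schema schema, Object jobConf) {
            this.work = work;
            this.schema = schema;
            this.jobConf = jobConf;
        }
    }

    // Counts how often the expensive path runs, so the bypass can be observed.
    static int fullCompilations = 0;

    // Steps 1-3 (hypothetical): the real utility method would run the semantic
    // analyzer and return analyzer.getResultSchema(); here the result is hard-coded.
    static List<FieldSchema> parseQuery(String query) {
        return Arrays.asList(new FieldSchema("id", "int"), new FieldSchema("name", "string"));
    }

    // Step 5 (hypothetical): convert the field list into the LLAP-style schema.
    static Schema convertSchema(List<FieldSchema> fields) {
        List<String> columns = new ArrayList<>();
        for (FieldSchema f : fields) {
            columns.add(f.name + " " + f.type);
        }
        return new Schema(columns);
    }

    // Placeholder for the existing path that uses the transaction manager and Driver.
    static PlanFragment compileFully(String query) {
        fullCompilations++;
        return new PlanFragment(new Object(), convertSchema(parseQuery(query)), new Object());
    }

    // Steps 4 and 6: decide early, before any expensive compilation work starts.
    static PlanFragment createPlanFragment(String query, int num) {
        if (num == 0) {
            // Schema only: skip compilation, optimization, and plan generation.
            return new PlanFragment(null, convertSchema(parseQuery(query)), null);
        }
        return compileFully(query);
    }

    public static void main(String[] args) {
        PlanFragment fragment = createPlanFragment("SELECT id, name FROM t", 0);
        System.out.println(fragment.schema.columns + " fullCompilations=" + fullCompilations);
    }
}
```

The point of the sketch is the ordering: the schema-only check runs before the expensive path is entered, so a request with num == 0 never reaches the placeholder for full compilation.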