[ https://issues.apache.org/jira/browse/HIVE-17035?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Rajesh Balamohan updated HIVE-17035:
------------------------------------
    Description:
In a fairly large query with tens of left joins, creating the lineageInfo alone took 1500+ seconds. The table had a large number of columns, and at some stages 7000+ value columns were processed in {{ReduceSinkLineage}}, even though only 50 columns were projected in the query.

It would be good to invoke the lineage transform after the rest of the optimizers in {{Optimizer}} have been invoked. This would avoid unwanted processing and help in improving the runtime.

  was:
In a fairly large query with tens of left joins, creating the lineageInfo alone took 1500+ seconds. The table had a large number of columns, and at some stages 7000+ value columns were processed in {{ReduceSinkLineage}}, even though only 50 columns were projected in the query.

It would be good to invoke the lineage transform after the rest of the optimizers in {{Optimizer}} have been invoked. This would avoid help in improving the runtime.

> Optimizer: Lineage transform() should be invoked after rest of the optimizers are invoked
> -----------------------------------------------------------------------------------------
>
>                 Key: HIVE-17035
>                 URL: https://issues.apache.org/jira/browse/HIVE-17035
>             Project: Hive
>          Issue Type: Bug
>          Components: Logical Optimizer
>            Reporter: Rajesh Balamohan
>            Assignee: Rajesh Balamohan
>            Priority: Minor
>         Attachments: HIVE-17035.1.patch
>
>
> In a fairly large query with tens of left joins, creating the lineageInfo
> alone took 1500+ seconds. The table had a large number of columns, and at
> some stages 7000+ value columns were processed in {{ReduceSinkLineage}},
> even though only 50 columns were projected in the query.
> It would be good to invoke the lineage transform after the rest of the
> optimizers in {{Optimizer}} have been invoked. This would avoid unwanted
> processing and help in improving the runtime.
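To illustrate why the ordering matters, here is a minimal, self-contained sketch. It is not Hive code and does not reflect the attached patch: the `Plan`, `Transform`, `PruneColumns` and `Lineage` types are hypothetical stand-ins, and the lineage cost is simply modelled as the number of columns the pass visits. The 7000 and 50 figures are taken from the description above. The assumption being illustrated is that an earlier optimizer step narrows the plan to the projected columns, so a lineage pass that runs afterwards has far less to walk.

```java
import java.util.ArrayList;
import java.util.List;

public class LineageOrderSketch {

    /** Hypothetical stand-in for a query plan: only the columns its operators still carry. */
    static final class Plan {
        final List<String> columns;
        Plan(List<String> columns) { this.columns = columns; }
    }

    /** Hypothetical transform interface; an assumption, not Hive's actual API. */
    interface Transform {
        Plan transform(Plan plan);
    }

    /** Toy column pruner: keeps only the first `projected` columns. */
    static final class PruneColumns implements Transform {
        private final int projected;
        PruneColumns(int projected) { this.projected = projected; }
        @Override public Plan transform(Plan plan) {
            int keep = Math.min(projected, plan.columns.size());
            return new Plan(new ArrayList<>(plan.columns.subList(0, keep)));
        }
    }

    /** Toy lineage pass: its cost is modelled as the number of columns it visits. */
    static final class Lineage implements Transform {
        int columnsVisited = 0;
        @Override public Plan transform(Plan plan) {
            columnsVisited += plan.columns.size();
            return plan; // lineage only inspects the plan, it does not rewrite it
        }
    }

    /** Runs a two-step pipeline over a 7000-column plan and reports the lineage cost. */
    static int columnsVisitedByLineage(boolean lineageLast) {
        List<String> columns = new ArrayList<>();
        for (int i = 0; i < 7000; i++) {
            columns.add("col" + i);
        }
        Lineage lineage = new Lineage();
        Transform prune = new PruneColumns(50);

        List<Transform> pipeline = new ArrayList<>();
        if (lineageLast) {
            pipeline.add(prune);
            pipeline.add(lineage);
        } else {
            pipeline.add(lineage);
            pipeline.add(prune);
        }

        Plan plan = new Plan(columns);
        for (Transform t : pipeline) {
            plan = t.transform(plan);
        }
        return lineage.columnsVisited;
    }

    public static void main(String[] args) {
        System.out.println("lineage first: " + columnsVisitedByLineage(false)); // prints 7000
        System.out.println("lineage last:  " + columnsVisitedByLineage(true));  // prints 50
    }
}
```

In this toy model the lineage pass visits 7000 columns when it runs first and 50 when it runs last. How large the saving is in Hive itself depends on which optimizers actually narrow the operator tree before lineage is computed.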