[
https://issues.apache.org/jira/browse/HIVE-27080?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Soumyakanti Das reassigned HIVE-27080:
--------------------------------------
Assignee: Soumyakanti Das
> Support project pushdown in JDBC storage handler even when filters are not
> pushed
> ---------------------------------------------------------------------------------
>
> Key: HIVE-27080
> URL: https://issues.apache.org/jira/browse/HIVE-27080
> Project: Hive
> Issue Type: Improvement
> Components: CBO
> Affects Versions: 4.0.0-alpha-2
> Reporter: Stamatis Zampetakis
> Assignee: Soumyakanti Das
> Priority: Major
> Labels: pull-request-available
> Attachments: jdbc_project_pushdown.q
>
>
> {code:sql}
> CREATE EXTERNAL TABLE book
> (
> id int,
> title varchar(20),
> author int
> )
> STORED BY
> 'org.apache.hive.storage.jdbc.JdbcStorageHandler'
> TBLPROPERTIES (
> "hive.sql.database.type" = "POSTGRES",
> "hive.sql.jdbc.driver" = "org.postgresql.Driver",
> "hive.sql.jdbc.url" = "jdbc:postgresql://localhost:5432/qtestDB",
> "hive.sql.dbcp.username" = "qtestuser",
> "hive.sql.dbcp.password" = "qtestpassword",
> "hive.sql.table" = "book"
> );
> {code}
> {code:sql}
> explain cbo select id from book where title = 'Les Miserables';
> {code}
> {noformat}
> CBO PLAN:
> HiveJdbcConverter(convention=[JDBC.POSTGRES])
> JdbcProject(id=[$0])
> JdbcFilter(condition=[=($1, _UTF-16LE'Les Miserables')])
> JdbcHiveTableScan(table=[[default, book]], table:alias=[book])
> {noformat}
> +Good case:+ Only the id column is fetched from the underlying database (see
> JdbcProject) since it is necessary for the result.
> {code:sql}
> explain cbo select id from book where UPPER(title) = 'LES MISERABLES';
> {code}
> {noformat}
> CBO PLAN:
> HiveProject(id=[$0])
> HiveFilter(condition=[=(CAST(UPPER($1)):VARCHAR(2147483647) CHARACTER SET
> "UTF-16LE", _UTF-16LE'LES MISERABLES')])
> HiveProject(id=[$0], title=[$1], author=[$2])
> HiveJdbcConverter(convention=[JDBC.POSTGRES])
> JdbcHiveTableScan(table=[[default, book]], table:alias=[book])
> {noformat}
> +Bad case:+ All table columns are fetched from the database although only id
> and title are necessary; id is the result so cannot be dropped and title is
> needed for HiveFilter since the UPPER operation was not pushed in the DBMS.
> The author column is not needed at all so the plan should have a JdbcProject
> with id, and title, on top of the JdbcHiveTableScan.
> Although it doesn't seem a big deal in some cases tables are pretty wide
> (more than 100 columns) while the queries rarely return all of them.
> Improving project pushdown to handle such cases can give a major performance
> boost.
> Pushing the filter with UPPER to JDBC storage handler is also a relevant
> improvement but this should be tracked under another ticket.
> The problem can be reproduced by running:
> {noformat}
> mvn test -Dtest=TestMiniLlapLocalCliDriver -Dqfile=jdbc_project_pushdown.q
> -Dtest.output.overwrite
> {noformat}
--
This message was sent by Atlassian Jira
(v8.20.10#820010)