[ https://issues.apache.org/jira/browse/HIVE-6492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13913795#comment-13913795 ]
Selina Zhang commented on HIVE-6492:
------------------------------------

It is not rare for a table to have 1000+ partitions. To keep users from issuing a query without knowing how many partitions it will scan, a new configuration variable, "hive.limit.query.max.table.partition", lets system admins protect the grid. The default value is -1, which means no limit.

This variable is ignored in the following cases:
1. Simple fetch query with limit: select * from table limit n;
2. Metadata-only query: select distinct partition_key from partition_table;

There is one special case: BI tools such as Tableau (connected through the ODBC driver) sometimes issue "select * from table" at the initial stage to figure out table metadata. This does not hurt the grid, because Tableau cancels the query after it receives one or two rows. So that Tableau can still work, code is added to mark the query client type, such as CLIDriver or JDBC, and only ODBC-sourced queries are allowed through.

> limit partition number involved in a table scan
> -----------------------------------------------
>
>                 Key: HIVE-6492
>                 URL: https://issues.apache.org/jira/browse/HIVE-6492
>             Project: Hive
>          Issue Type: New Feature
>          Components: Query Processor
>    Affects Versions: 0.12.0
>            Reporter: Selina Zhang
>             Fix For: 0.13.0
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> To protect the cluster, a new configuration variable
> "hive.limit.query.max.table.partition" is added to the Hive configuration to
> limit the number of table partitions involved in a table scan.
> The default value will be set to -1, which means there is no limit by default.
> This variable will not affect "metadata only" queries.

--
This message was sent by Atlassian JIRA
(v6.1.5#6160)