[
https://issues.apache.org/jira/browse/HIVE-9447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14939016#comment-14939016
]
Pengcheng Xiong commented on HIVE-9447:
---------------------------------------
+! pending QA run.
> Metastore: inefficient Oracle query for removing unused column descriptors
> when add/drop table/partition
> --------------------------------------------------------------------------------------------------------
>
> Key: HIVE-9447
> URL: https://issues.apache.org/jira/browse/HIVE-9447
> Project: Hive
> Issue Type: Improvement
> Components: Metastore
> Affects Versions: 0.14.0
> Reporter: Selina Zhang
> Assignee: Selina Zhang
> Attachments: HIVE-9447.1.patch, HIVE-9447.2.patch
>
> Original Estimate: 3h
> Remaining Estimate: 3h
>
> Metastore needs removing unused column descriptors when drop/add partitions
> or tables. For query the unused column descriptor, the current implementation
> utilizes datanuleus' range function, which basically equals LIMIT syntax.
> However, Oracle does not support LIMIT, the query is converted as
> {quote}
> SQL> SELECT * FROM (SELECT subq.*,ROWNUM rn FROM (SELECT
> 'org.apache.hadoop.hive.metastore.model.MStorageDescriptor' AS
> NUCLEUS_TYPE,A0.INPUT_FORMAT,A0.IS_COMPRESSED,A0.IS_STOREDASSUBDIRECTORIES,A0.LOCATION,
> A0.NUM_BUCKETS,A0.OUTPUT_FORMAT,A0.SD_ID FROM drhcat.SDS A0
> WHERE A0.CD_ID = ? ) subq ) WHERE rn <= 1;
> {quote}
> Given that CD_ID is not very selective, this query may have to access large
> amount of rows (depends how many partitions the table has, millions of rows
> in our case). Metastore may become unresponsive because of this.
> Since Metastore only needs to know if the specific CD_ID is referenced in SDS
> table and does not need access the whole row. We can use
> {quote}
> select count(1) from SDS where SDS.CD_ID=?
> {quote}
> CD_ID is index column, the above query will do range scan for index, which is
> faster.
> For other DBs support LIMIT syntax such as MySQL, this problem does not
> exist. However, the new query does not hurt.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)