[
https://issues.apache.org/jira/browse/HIVE-22957?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Syed Shameerur Rahman updated HIVE-22957:
-----------------------------------------
Summary: Add Predicate Filtering In MSCK REPAIR TABLE (was: Predicate
Filtering In MSCK REPAIR TABLE)
> Add Predicate Filtering In MSCK REPAIR TABLE
> --------------------------------------------
>
> Key: HIVE-22957
> URL: https://issues.apache.org/jira/browse/HIVE-22957
> Project: Hive
> Issue Type: Improvement
> Components: Standalone Metastore
> Reporter: Syed Shameerur Rahman
> Assignee: Syed Shameerur Rahman
> Priority: Major
> Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-22957.01.patch
>
> Time Spent: 10m
> Remaining Estimate: 0h
>
> Currently, the MSCK command supports either a full repair of a table (all
> partitions) or a repair of a subset of partitions based on a partitionSpec.
> The aim of this Jira is to introduce a filterExp (=, !=, <, >, >=, <=, LIKE)
> in the MSCK command so that a larger subset of partitions can be recovered
> (added/deleted) without firing a full repair, which might take a long time if
> the number of partitions is huge.
> *Approach*:
> The initial approach is to add a WHERE clause to the MSCK command, e.g. MSCK
> REPAIR TABLE <tbl_name> ADD|DROP|SYNC PARTITIONS WHERE <pcol1>
> <filter_operator> <value> AND ....
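To make the proposed syntax concrete, here is a minimal Python sketch of how such a command could be split into a table name, an action, and a list of (column, operator, value) conditions. It is only an illustration of the grammar quoted above, not Hive's parser: the function name `parse_msck`, the regular expressions, and the sample table `sales` are all hypothetical, and the sketch assumes conditions are joined only by AND and that quoted values do not themselves contain the word AND.

```python
import re

# Hypothetical pattern for the proposed command shape:
#   MSCK REPAIR TABLE <tbl_name> ADD|DROP|SYNC PARTITIONS [WHERE <conditions>]
MSCK_RE = re.compile(
    r"^\s*MSCK\s+REPAIR\s+TABLE\s+(?P<table>\w+)"
    r"\s+(?P<action>ADD|DROP|SYNC)\s+PARTITIONS"
    r"(?:\s+WHERE\s+(?P<where>.+?))?\s*;?\s*$",
    re.IGNORECASE,
)

# Two-character operators come first so ">=" is not read as ">" then "=".
COND_RE = re.compile(
    r"^\s*(?P<col>\w+)\s*(?P<op>!=|>=|<=|=|<|>|LIKE)\s*(?P<val>'[^']*'|\S+)\s*$",
    re.IGNORECASE,
)


def parse_msck(command):
    """Return (table, action, [(column, operator, value), ...])."""
    m = MSCK_RE.match(command)
    if m is None:
        raise ValueError("not a recognised MSCK REPAIR command: %r" % command)
    conditions = []
    where = m.group("where")
    if where:
        # Assumption: conditions are combined with AND only.
        for part in re.split(r"\s+AND\s+", where, flags=re.IGNORECASE):
            c = COND_RE.match(part)
            if c is None:
                raise ValueError("cannot parse condition: %r" % part)
            conditions.append(
                (c.group("col"), c.group("op").upper(), c.group("val").strip("'"))
            )
    return m.group("table"), m.group("action").upper(), conditions
```

For example, `parse_msck("MSCK REPAIR TABLE sales ADD PARTITIONS WHERE ds >= '2020-01-01' AND country = 'US'")` yields the table `sales`, the action `ADD`, and the two conditions on `ds` and `country`.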
> *Flow:*
> 1) Parse the WHERE clause and generate a filterExpression.
> 2) Fetch all the partitions from the metastore that match the filter
> expression.
> 3) Fetch all the partition paths from the filesystem.
> 4) Remove all the partition paths that do not match the filter
> expression.
> 5) Based on ADD | DROP | SYNC, do the remaining steps.
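The flow above can be sketched in Python as a set comparison between the filtered metastore partitions and the filtered filesystem paths. This is a simplified illustration under several assumptions, not the patch's implementation: partitions are represented as path strings such as `ds=2020-01-01/country=US`, values are compared as plain strings, LIKE supports only `%` and `_`, and all function names (`matches`, `parse_path`, `plan_repair`) are hypothetical. It also assumes that ADD registers partitions found only on the filesystem, DROP removes partitions found only in the metastore, and SYNC does both.

```python
import operator
import re

# Comparison operators from the proposal, applied to string values (assumption).
OPS = {
    "=": operator.eq, "!=": operator.ne,
    "<": operator.lt, ">": operator.gt,
    "<=": operator.le, ">=": operator.ge,
}


def like(value, pattern):
    """SQL-style LIKE: '%' matches any run of characters, '_' exactly one."""
    regex = "".join(
        ".*" if ch == "%" else "." if ch == "_" else re.escape(ch)
        for ch in pattern
    )
    return re.fullmatch(regex, value) is not None


def parse_path(path):
    """Turn 'ds=2020-01-01/country=US' into {'ds': '2020-01-01', 'country': 'US'}."""
    return dict(part.split("=", 1) for part in path.strip("/").split("/"))


def matches(spec, conditions):
    """True if the partition spec satisfies every (column, op, value) condition."""
    for col, op, val in conditions:
        actual = spec[col]
        ok = like(actual, val) if op == "LIKE" else OPS[op](actual, val)
        if not ok:
            return False
    return True


def plan_repair(metastore_paths, filesystem_paths, conditions, action):
    """Return (partitions_to_add, partitions_to_drop) for the given action."""
    # Step 2: metastore partitions that match the filter.
    in_metastore = {p for p in metastore_paths if matches(parse_path(p), conditions)}
    # Steps 3 and 4: filesystem paths, minus those that do not match the filter.
    on_filesystem = {p for p in filesystem_paths if matches(parse_path(p), conditions)}
    # Step 5: decide what to do based on ADD | DROP | SYNC.
    to_add = sorted(on_filesystem - in_metastore) if action in ("ADD", "SYNC") else []
    to_drop = sorted(in_metastore - on_filesystem) if action in ("DROP", "SYNC") else []
    return to_add, to_drop
```

With the condition `ds >= '2020-01-01'`, partitions outside that range are ignored on both sides, so only the filtered subset is compared and repaired.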
--
This message was sent by Atlassian Jira
(v8.3.4#803005)