[ 
https://issues.apache.org/jira/browse/HIVE-22957?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Syed Shameerur Rahman updated HIVE-22957:
-----------------------------------------
    Description: 
*Design Doc: *
[^Design Doc_ Partition Filtering In MSCK REPAIR TABLE.pdf] 

  was:
Currently the MSCK command supports either a full repair of a table (all 
partitions) or a repair of a subset of partitions based on a partitionSpec. 
The aim of this Jira is to introduce a filterExp (=, !=, <, >, >=, <=, LIKE) 
in the MSCK command so that a larger subset of partitions can be recovered 
(added/deleted) without triggering a full repair, which can take a long time 
when the number of partitions is large.

*Approach*:

The initial approach is to add a WHERE clause to the MSCK command, e.g.: MSCK 
REPAIR TABLE <tbl_name> ADD|DROP|SYNC PARTITIONS WHERE <pcol1> 
<filter_operator> <value> AND ....

*Flow:*

1) Parse the WHERE clause and generate a filterExpression.

2) Fetch all partitions from the metastore that match the filter 
expression.

3) Fetch all partition paths from the filesystem.

4) Remove all partition paths that do not match the filter expression.

5) Based on ADD | DROP | SYNC, do the remaining steps.
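
The flow above can be illustrated with a minimal Python sketch. It is not the 
Hive implementation: the function names (matches, spec_from_path, 
plan_repair), the representation of a parsed filter as (column, operator, 
value) tuples, and the key=value/key=value path layout are all assumptions 
made for illustration. Step 1 (parsing) is assumed to be done already, values 
are compared as plain strings, and LIKE is approximated with glob matching.

```python
import fnmatch
import operator

# Comparison operators listed in the proposal; LIKE is handled separately.
OPS = {
    "=": operator.eq, "!=": operator.ne,
    "<": operator.lt, ">": operator.gt,
    "<=": operator.le, ">=": operator.ge,
}


def spec_from_path(path):
    """Turn 'year=2020/month=03' into {'year': '2020', 'month': '03'}."""
    return dict(part.split("=", 1) for part in path.strip("/").split("/"))


def matches(spec, filters):
    """True if the partition spec satisfies every (col, op, value) condition.

    Conditions are ANDed. Values are compared as strings, which is a
    simplification; a real implementation would need typed comparison.
    """
    for col, op, value in filters:
        actual = spec.get(col)
        if actual is None:
            return False
        if op.upper() == "LIKE":
            # Approximate SQL wildcards (% and _) with glob wildcards.
            pattern = value.replace("%", "*").replace("_", "?")
            if not fnmatch.fnmatchcase(actual, pattern):
                return False
        elif not OPS[op](actual, value):
            return False
    return True


def plan_repair(metastore_paths, filesystem_paths, filters, mode="SYNC"):
    """Return (to_add, to_drop) restricted to partitions matching the filter."""
    # Step 2: metastore partitions that match the filter expression.
    in_metastore = {p for p in metastore_paths
                    if matches(spec_from_path(p), filters)}
    # Steps 3-4: filesystem partition paths, minus those that do not match.
    on_disk = {p for p in filesystem_paths
               if matches(spec_from_path(p), filters)}
    # Step 5: ADD registers paths missing from the metastore, DROP removes
    # metastore entries whose path is gone, SYNC does both.
    to_add = sorted(on_disk - in_metastore) if mode in ("ADD", "SYNC") else []
    to_drop = sorted(in_metastore - on_disk) if mode in ("DROP", "SYNC") else []
    return to_add, to_drop


if __name__ == "__main__":
    metastore = ["year=2019/month=12", "year=2020/month=01", "year=2020/month=02"]
    filesystem = ["year=2019/month=11", "year=2020/month=02", "year=2020/month=03"]
    print(plan_repair(metastore, filesystem, [("year", "=", "2020")]))
```

With the filter year = 2020, the 2019 partitions are ignored on both sides, 
so only the 2020 differences are considered for adding or dropping.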


> Add Predicate Filtering In MSCK REPAIR TABLE
> --------------------------------------------
>
>                 Key: HIVE-22957
>                 URL: https://issues.apache.org/jira/browse/HIVE-22957
>             Project: Hive
>          Issue Type: Improvement
>            Reporter: Syed Shameerur Rahman
>            Assignee: Syed Shameerur Rahman
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 4.0.0
>
>         Attachments: Design Doc_ Partition Filtering In MSCK REPAIR 
> TABLE.pdf, HIVE-22957.01.patch
>
>          Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> *Design Doc: *
> [^Design Doc_ Partition Filtering In MSCK REPAIR TABLE.pdf] 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
