[GitHub] [spark] MaxGekk edited a comment on pull request #31097: [SPARK-31891][SQL] Recover tables with non-existing partition locations

GitBox Mon, 11 Jan 2021 02:02:31 -0800


MaxGekk edited a comment on pull request #31097:
URL: https://github.com/apache/spark/pull/31097#issuecomment-757842780



   @cloud-fan @HyukjinKwon Hive has special options for `MSCK REPAIR TABLE` to control recovery (see the [doc](https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-RecoverPartitions(MSCKREPAIRTABLE))):
   - `MSCK REPAIR TABLE .. ADD PARTITIONS` - recovers only new partitions (present in the file system but not in the metastore)
   - `MSCK REPAIR TABLE .. DROP PARTITIONS` - removes missing partitions (present in the metastore but not in the file system)
   - `MSCK REPAIR TABLE .. SYNC PARTITIONS` - is equivalent to `MSCK REPAIR TABLE .. ADD PARTITIONS` + `MSCK REPAIR TABLE .. DROP PARTITIONS`
   
   The default `MSCK REPAIR TABLE` is a shortcut for `MSCK REPAIR TABLE .. ADD PARTITIONS`, and `ALTER TABLE .. RECOVER PARTITIONS` is equivalent to `MSCK REPAIR TABLE .. ADD PARTITIONS` as well.
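
To make the three modes concrete, here is a minimal sketch in Python that models them as set operations over partition names. It is only an illustration of the semantics described above, not how Hive or Spark implement the command; the `repair` function and the partition strings are hypothetical.

```python
def repair(catalog, storage, mode="ADD"):
    """Return the catalog's partition set after a repair in the given mode.

    catalog: partitions registered in the metastore (hypothetical input)
    storage: partitions whose directories exist in the file system
    mode:    "ADD" (the default, as in plain MSCK REPAIR TABLE), "DROP" or "SYNC"
    """
    if mode not in ("ADD", "DROP", "SYNC"):
        raise ValueError(f"unknown mode: {mode}")
    catalog, storage = set(catalog), set(storage)
    result = set(catalog)
    if mode in ("ADD", "SYNC"):
        # Register partitions that exist on disk but not in the catalog.
        result |= storage - catalog
    if mode in ("DROP", "SYNC"):
        # Forget partitions whose location no longer exists on disk.
        result -= catalog - storage
    return result


catalog = {"p=1", "p=2"}
storage = {"p=2", "p=3"}
print(sorted(repair(catalog, storage, "ADD")))   # ['p=1', 'p=2', 'p=3']
print(sorted(repair(catalog, storage, "DROP")))  # ['p=2']
print(sorted(repair(catalog, storage, "SYNC")))  # ['p=2', 'p=3']
```

In this model, `ADD` leaves the stale `p=1` entry in the catalog, which corresponds to the non-existing partition location case this PR is about.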
   
   In fact, Spark currently implements only `MSCK REPAIR TABLE .. ADD PARTITIONS`.
   
   Should we follow Hive? WDYT?





