Jason Dere created HIVE-20998:
---------------------------------

             Summary: HiveStrictManagedMigration utility should update DB/Table 
location as last migration steps
                 Key: HIVE-20998
                 URL: https://issues.apache.org/jira/browse/HIVE-20998
             Project: Hive
          Issue Type: Sub-task
            Reporter: Jason Dere
            Assignee: Jason Dere


When processing a database or table, the HiveStrictManagedMigration utility 
currently changes the database/table location as the first step in processing 
that database/table. Unfortunately, if an error occurs while processing the 
database or table, there may still be migration work left for that db/table, 
which requires running the migration again. However, the migration tool only 
processes dbs/tables that still have the old warehouse location, so it will 
skip over the db/table when the migration is run again.
 One fix here is to set the new location as the last step after all of the 
migration work is done:
 - The new table location will not be set until all of its partitions have been 
successfully migrated.
 - The new database location will not be set until all of its tables have been 
successfully migrated.
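
The ordering proposed above can be sketched roughly as follows. This is only 
an illustration in Python, not the actual HiveStrictManagedMigration code: the 
warehouse roots, the dict layout and the function names are all hypothetical. 
The point is that a failure part-way through leaves the old location in place, 
so the db/table is still picked up on the next run.

```python
# Hypothetical warehouse roots, used only for this illustration.
OLD_ROOT = "/apps/hive/warehouse"
NEW_ROOT = "/warehouse/tablespace/managed/hive"


def needs_migration(location):
    # Mirrors the tool's behaviour described above: only objects still
    # under the old warehouse root are processed.
    return location.startswith(OLD_ROOT)


def migrate_table(table, migrate_partition):
    """Migrate every partition first, then update the table location last."""
    if not needs_migration(table["location"]):
        return  # fully migrated on an earlier run
    for part in table["partitions"]:
        # If this raises, the table location below is never updated,
        # so a rerun will process this table again.
        migrate_partition(part)
    table["location"] = table["location"].replace(OLD_ROOT, NEW_ROOT, 1)


def migrate_database(db, migrate_partition):
    """Migrate every table first, then update the database location last."""
    if not needs_migration(db["location"]):
        return
    for table in db["tables"]:
        migrate_table(table, migrate_partition)
    db["location"] = db["location"].replace(OLD_ROOT, NEW_ROOT, 1)
```

In this sketch a rerun repeats the per-partition work for a table that failed 
midway, so that work is assumed to be safe to repeat.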

For existing migrations that failed with an error, the following workaround can 
be done so that the db/tables can be re-processed by the migration tool:
 1) Use the migration tool logs to find which databases/tables failed during 
processing.
 2) For each db/table, change the location of the database and table back to 
the old location:
 ALTER DATABASE tpcds_bin_partitioned_orc_10 SET LOCATION 
'hdfs://ns1/apps/hive/warehouse/tpcds_bin_partitioned_orc_10.db';
 ALTER TABLE tpcds_bin_partitioned_orc_10.store_sales SET LOCATION 
'hdfs://ns1/apps/hive/warehouse/tpcds_bin_partitioned_orc_10.db/store_sales';
 3) Rerun the migration tool.
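
When many dbs/tables failed, the ALTER statements for the workaround could be 
generated rather than typed by hand. The Python helper below is a hypothetical 
sketch, not part of the migration tool. It assumes the default layout 
<old warehouse>/<db>.db/<table>; any database or table that had a custom 
location needs its real old path instead.

```python
def rollback_statements(old_warehouse, failed):
    """Build ALTER statements that point failed dbs/tables back at the old warehouse.

    old_warehouse: old warehouse root, e.g. 'hdfs://ns1/apps/hive/warehouse'
    failed: dict mapping database name -> list of table names taken from the
            migration tool logs (hypothetical input format).
    """
    root = old_warehouse.rstrip("/")
    statements = []
    for db, tables in failed.items():
        # Assumes the default <warehouse>/<db>.db directory naming.
        db_location = f"{root}/{db}.db"
        statements.append(f"ALTER DATABASE {db} SET LOCATION '{db_location}';")
        for table in tables:
            statements.append(
                f"ALTER TABLE {db}.{table} SET LOCATION '{db_location}/{table}';"
            )
    return statements


if __name__ == "__main__":
    failed = {"tpcds_bin_partitioned_orc_10": ["store_sales"]}
    for stmt in rollback_statements("hdfs://ns1/apps/hive/warehouse", failed):
        print(stmt)
```

The generated statements should be reviewed before being run against the 
metastore.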



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
