[ 
https://issues.apache.org/jira/browse/HIVE-11072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15629702#comment-15629702
 ] 

Naveen Gangam commented on HIVE-11072:
--------------------------------------

[~aihuaxu] Thanks for the review and the comments.
I had fixed this in the prepare.sh script (which is used in the schema upgrade 
testing) but missed it in this script. I will fix JAVA_HOME the same way I did 
in prepare.sh, where it's read from /etc/alternatives.

The table names are not hard-coded in the script; they are detected from the 
Hive schema SQL files, so the script should work for any tables added in future 
schemas. However, the foreign keys are hard-coded in the script. This 
information is used to determine whether or not to generate a random column 
value: for foreign key references, the script reuses the already-generated 
values, which are stored in a map. There is no consistent way to determine the 
FK references by parsing the schema files across the different DBs.
For example, Derby does it via ALTER TABLE (PK/FK constraints are not part of 
the CREATE TABLE statement):
{code}
ALTER TABLE "APP"."IDXS" ADD CONSTRAINT "IDXS_FK1" FOREIGN KEY ("ORIG_TBL_ID") 
REFERENCES "APP"."TBLS" ("TBL_ID") ON DELETE NO ACTION ON UPDATE NO ACTION;
{code}

Oracle:
{code}
ALTER TABLE IDXS ADD CONSTRAINT IDXS_FK1 FOREIGN KEY (ORIG_TBL_ID) REFERENCES 
TBLS (TBL_ID) INITIALLY DEFERRED ;
{code}
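The generate-then-reuse behavior described above can be sketched roughly as 
follows (a minimal illustration only; the names, map layout, and value ranges 
are assumptions, not the actual script's):

```python
import random

# Hardcoded FK relationships, since they cannot be parsed consistently
# across the per-DB schema files: (table, column) -> (ref_table, ref_column).
FOREIGN_KEYS = {
    ("IDXS", "ORIG_TBL_ID"): ("TBLS", "TBL_ID"),
}

# Values already generated, keyed by (table, column).
generated = {}

def column_value(table, column):
    """Reuse an already-generated referenced value for FK columns;
    otherwise generate and remember a fresh random value."""
    ref = FOREIGN_KEYS.get((table, column))
    if ref is not None:
        # FK column: pick one of the values already emitted for the
        # referenced column, so the row satisfies the constraint.
        return random.choice(generated[ref])
    value = random.randint(1, 10**6)
    generated.setdefault((table, column), []).append(value)
    return value
```

This assumes rows for a referenced table (e.g. TBLS) are generated before the 
referencing table (e.g. IDXS), so the map already holds values when the FK 
column is filled.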

I intend to remove these hardcoded FKs in the future, either by adding more 
logic to the schema-file parsing, or, as the simplest fix, by adding a static 
value for these columns in the dataload.properties file. That would eliminate 
the need to generate values for columns that have FK references.
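For illustration, the static-value approach might look like the entries below 
(the key naming convention here is purely an assumption; dataload.properties 
may use a different format):

```
# Hypothetical entries: pin FK-referenced columns to fixed values so the
# generator never has to resolve the reference at data-load time.
TBLS.TBL_ID=1
IDXS.ORIG_TBL_ID=1
```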



> Add data validation between Hive metastore upgrades tests
> ---------------------------------------------------------
>
>                 Key: HIVE-11072
>                 URL: https://issues.apache.org/jira/browse/HIVE-11072
>             Project: Hive
>          Issue Type: New Feature
>          Components: Tests
>            Reporter: Sergio Peña
>            Assignee: Naveen Gangam
>         Attachments: HIVE-11072.1.patch, HIVE-11072.2.patch, 
> HIVE-11072.3.patch, HIVE-11072.4.patch
>
>
> An existing Hive metastore upgrade test is running on Hive jenkins. However, 
> these scripts test only the database schema upgrade, not data validation 
> between upgrades.
> We should validate data between metastore version upgrades. Using data 
> validation, we may ensure that data won't be damaged, or corrupted when 
> upgrading the Hive metastore.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
