[
https://issues.apache.org/jira/browse/HIVE-11072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15629702#comment-15629702
]
Naveen Gangam commented on HIVE-11072:
--------------------------------------
[~aihuaxu] Thanks for the review and the comments.
I had fixed this for the prepare.sh script but missed it in this script. I
will fix JAVA_HOME the same way I did for prepare.sh (that's used in the schema
upgrade testing), where it's read from /etc/alternatives.
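For illustration, a minimal shell sketch of that approach (the function name and path handling here are my assumptions, not the actual prepare.sh code):
{code}
# Hypothetical sketch: resolve the java binary through /etc/alternatives and
# strip the trailing /bin/java to get a usable JAVA_HOME. Assumes a Linux host
# with the alternatives system configured.
detect_java_home() {
  local java_bin
  java_bin=$(readlink -f "${1:-/etc/alternatives/java}") || return 1
  printf '%s\n' "${java_bin%/bin/java}"
}
{code}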
The table names are not hard-coded in the script. They are detected from the
Hive schema SQL files, so the script should work for any new tables added in
future schemas. The foreign keys, however, are hard-coded in the script. This
information determines whether or not to generate a random column value: for
foreign key references, the script reuses already-generated values that are
stored in a map. There is no consistent way to determine the FK references by
parsing the schema files, because the syntax differs across databases.
For example,
Derby adds them via ALTER TABLE (PK/FK constraints are not part of the CREATE
TABLE statement):
{code}
ALTER TABLE "APP"."IDXS" ADD CONSTRAINT "IDXS_FK1" FOREIGN KEY ("ORIG_TBL_ID")
REFERENCES "APP"."TBLS" ("TBL_ID") ON DELETE NO ACTION ON UPDATE NO ACTION;
{code}
Oracle:
{code}
ALTER TABLE IDXS ADD CONSTRAINT IDXS_FK1 FOREIGN KEY (ORIG_TBL_ID) REFERENCES
TBLS (TBL_ID) INITIALLY DEFERRED ;
{code}
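To make the difference concrete, here is a rough shell sketch of what parsing such statements would involve (illustrative only; the function name and the pattern are my assumptions, not the committed script):
{code}
# Illustrative only -- not the committed script. Strips the quoting that
# differs between Derby and Oracle, then extracts
# "fk_column=referenced_table.referenced_column" pairs. Assumes each
# ALTER TABLE ... FOREIGN KEY statement fits on a single line; trailing
# clauses (ON DELETE ..., INITIALLY DEFERRED) are ignored.
extract_fk() {
  tr -d '"' |
    sed -n 's/.*FOREIGN KEY (\([^)]*\)) REFERENCES \([^ ]*\) *(\([^)]*\)).*/\1=\2.\3/p'
}
{code}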
I intend to remove these hardcoded FKs in the future, either by adding more
logic to the schema-file parsing or, as the simplest fix, by adding static
values for these columns in the dataload.properties file. That would eliminate
the need to generate values for columns that have FK references.
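For illustration, such dataload.properties entries might look like this (the key names and format are purely hypothetical; the actual property format is not defined yet):
{code}
# Hypothetical entries pinning FK-referenced columns to fixed values
IDXS.ORIG_TBL_ID=1
IDXS.INDEX_TBL_ID=1
{code}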
> Add data validation between Hive metastore upgrades tests
> ---------------------------------------------------------
>
> Key: HIVE-11072
> URL: https://issues.apache.org/jira/browse/HIVE-11072
> Project: Hive
> Issue Type: New Feature
> Components: Tests
> Reporter: Sergio Peña
> Assignee: Naveen Gangam
> Attachments: HIVE-11072.1.patch, HIVE-11072.2.patch,
> HIVE-11072.3.patch, HIVE-11072.4.patch
>
>
> An existing Hive metastore upgrade test is running on Hive jenkins. However,
> these scripts test only the database schema upgrade, not data validation
> between upgrades.
> We should validate data between metastore version upgrades. With data
> validation, we can ensure that data is not damaged or corrupted when
> upgrading the Hive metastore.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)