Copilot commented on code in PR #6521:
URL: https://github.com/apache/hive/pull/6521#discussion_r3376961962


##########
standalone-metastore/metastore-server/src/main/sql/mssql/upgrade-4.2.0-to-4.3.0.mssql.sql:
##########
@@ -5,6 +5,36 @@ ALTER TABLE MATERIALIZATION_REBUILD_LOCKS ADD MRL_CAT_NAME 
nvarchar(128) NOT NUL
 
 CREATE INDEX MIN_HISTORY_WRITE_ID_IDX ON MIN_HISTORY_WRITE_ID (MH_DATABASE, 
MH_TABLE, MH_WRITEID);
 
+ALTER TABLE TXN_COMPONENTS ADD TC_ID bigint IDENTITY(1,1) NOT NULL;
+ALTER TABLE TXN_COMPONENTS ADD CONSTRAINT TXN_COMPONENTS_PK PRIMARY KEY 
CLUSTERED (TC_ID);
+
+ALTER TABLE COMPLETED_TXN_COMPONENTS ADD CTC_ID bigint IDENTITY(1,1) NOT NULL;
+ALTER TABLE COMPLETED_TXN_COMPONENTS ADD CONSTRAINT 
COMPLETED_TXN_COMPONENTS_PK PRIMARY KEY CLUSTERED (CTC_ID);
+
+ALTER TABLE COMPACTION_METRICS_CACHE ADD CMC_ID bigint IDENTITY(1,1) NOT NULL;
+ALTER TABLE COMPACTION_METRICS_CACHE ADD CONSTRAINT 
COMPACTION_METRICS_CACHE_PK PRIMARY KEY CLUSTERED (CMC_ID);
+
+ALTER TABLE WRITE_SET ADD WS_ID bigint IDENTITY(1,1) NOT NULL;
+ALTER TABLE WRITE_SET ADD CONSTRAINT WRITE_SET_PK PRIMARY KEY CLUSTERED 
(WS_ID);

Review Comment:
   SQL Server `ALTER TABLE ... ADD ... IDENTITY` + adding clustered PK 
constraints can be long-running and will take schema modification locks. For 
large metastore tables (TXN_COMPONENTS/WRITE_SET/etc.) this can introduce 
significant upgrade downtime; consider documenting expected impact or using a 
staged migration (add column, populate, then add PK) if downtime is a concern.



##########
standalone-metastore/metastore-server/src/main/sql/postgres/upgrade-4.2.0-to-4.3.0.postgres.sql:
##########
@@ -5,6 +5,25 @@ ALTER TABLE "MATERIALIZATION_REBUILD_LOCKS" ADD COLUMN 
"MRL_CAT_NAME" varchar(12
 
 CREATE INDEX "MIN_HISTORY_WRITE_ID_IDX" ON "MIN_HISTORY_WRITE_ID" 
("MH_DATABASE", "MH_TABLE", "MH_WRITEID");
 
+ALTER TABLE "TXN_COMPONENTS" ADD COLUMN "TC_ID" bigserial PRIMARY KEY;
+ALTER TABLE "COMPLETED_TXN_COMPONENTS" ADD COLUMN "CTC_ID" bigserial PRIMARY 
KEY;
+ALTER TABLE "COMPACTION_METRICS_CACHE" ADD COLUMN "CMC_ID" bigserial PRIMARY 
KEY;
+ALTER TABLE "WRITE_SET" ADD COLUMN "WS_ID" bigserial PRIMARY KEY;

Review Comment:
   Adding `bigserial PRIMARY KEY` columns via `ALTER TABLE ... ADD COLUMN ...` 
forces an `ACCESS EXCLUSIVE` lock and rewrites/scans large transactional tables 
(e.g., TXN_COMPONENTS/WRITE_SET). On busy metastores this can cause a long 
upgrade outage; consider a lower-lock migration (add nullable column, backfill 
IDs in batches, then add NOT NULL + PK) or document expected downtime/size 
assumptions.



##########
standalone-metastore/metastore-server/src/main/sql/mysql/upgrade-4.2.0-to-4.3.0.mysql.sql:
##########
@@ -5,6 +5,19 @@ ALTER TABLE MATERIALIZATION_REBUILD_LOCKS ADD  MRL_CAT_NAME 
varchar(128) NOT NUL
 
 CREATE INDEX MIN_HISTORY_WRITE_ID_IDX ON MIN_HISTORY_WRITE_ID (MH_DATABASE, 
MH_TABLE, MH_WRITEID);
 
+ALTER TABLE TXN_COMPONENTS ADD TC_ID bigint NOT NULL AUTO_INCREMENT PRIMARY 
KEY FIRST;
+ALTER TABLE COMPLETED_TXN_COMPONENTS ADD CTC_ID bigint NOT NULL AUTO_INCREMENT 
PRIMARY KEY FIRST;
+ALTER TABLE COMPACTION_METRICS_CACHE ADD CMC_ID bigint NOT NULL AUTO_INCREMENT 
PRIMARY KEY FIRST;
+ALTER TABLE WRITE_SET ADD WS_ID bigint NOT NULL AUTO_INCREMENT PRIMARY KEY 
FIRST;
+

Review Comment:
   This upgrade adds `AUTO_INCREMENT PRIMARY KEY` columns to existing large txn 
tables (TXN_COMPONENTS/WRITE_SET/etc.). In MySQL these `ALTER TABLE` operations 
are typically table-rebuilding and can be very slow + blocking for large 
installations; consider using an online schema change approach (if supported in 
your deployment) or at least calling out the expected downtime/impact in 
release notes.



##########
standalone-metastore/metastore-server/src/main/sql/oracle/upgrade-4.2.0-to-4.3.0.oracle.sql:
##########
@@ -5,6 +5,46 @@ ALTER TABLE MATERIALIZATION_REBUILD_LOCKS ADD (MRL_CAT_NAME 
VARCHAR2(128) DEFAUL
 
 CREATE INDEX MIN_HISTORY_WRITE_ID_IDX ON MIN_HISTORY_WRITE_ID (MH_DATABASE, 
MH_TABLE, MH_WRITEID);
 
+ALTER TABLE TXN_COMPONENTS ADD (TC_ID NUMBER(19) GENERATED BY DEFAULT AS 
IDENTITY PRIMARY KEY);
+ALTER TABLE COMPLETED_TXN_COMPONENTS ADD (CTC_ID NUMBER(19) GENERATED BY 
DEFAULT AS IDENTITY PRIMARY KEY);
+ALTER TABLE COMPACTION_METRICS_CACHE ADD (CMC_ID NUMBER(19) GENERATED BY 
DEFAULT AS IDENTITY PRIMARY KEY);
+ALTER TABLE WRITE_SET ADD (WS_ID NUMBER(19) GENERATED BY DEFAULT AS IDENTITY 
PRIMARY KEY);

Review Comment:
   For Oracle, the upgrade adds identity primary keys to several existing 
transactional tables (TXN_COMPONENTS/WRITE_SET/etc.). These DDLs can be 
long-running/blocking on large tables; since you already use a TMP-table copy 
approach later in the script, consider whether a similar staged approach is 
needed here as well, or explicitly document expected downtime/size assumptions 
for Oracle upgrades.



##########
standalone-metastore/metastore-server/src/main/sql/derby/upgrade-4.2.0-to-4.3.0.derby.sql:
##########
@@ -3,5 +3,22 @@ ALTER TABLE MATERIALIZATION_REBUILD_LOCKS ADD COLUMN 
MRL_CAT_NAME varchar(128) N
 
 CREATE INDEX MIN_HISTORY_WRITE_ID_IDX ON MIN_HISTORY_WRITE_ID (MH_DATABASE, 
MH_TABLE, MH_WRITEID);
 
+ALTER TABLE TXN_COMPONENTS ADD COLUMN TC_ID BIGINT PRIMARY KEY GENERATED BY 
DEFAULT AS IDENTITY;
+ALTER TABLE COMPLETED_TXN_COMPONENTS ADD COLUMN CTC_ID BIGINT PRIMARY KEY 
GENERATED BY DEFAULT AS IDENTITY;
+ALTER TABLE COMPACTION_METRICS_CACHE ADD COLUMN CMC_ID BIGINT PRIMARY KEY 
GENERATED BY DEFAULT AS IDENTITY;
+ALTER TABLE WRITE_SET ADD COLUMN WS_ID BIGINT PRIMARY KEY GENERATED BY DEFAULT 
AS IDENTITY;
+

Review Comment:
   Derby `ALTER TABLE` operations that add identity primary keys and new PK 
constraints can be expensive and locking (often requiring a full table 
scan/rewrite). If this upgrade is expected to run on non-trivial embedded Derby 
metastores, consider documenting potential upgrade time/downtime or applying a 
staged backfill to reduce lock time.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to