github-actions[bot] commented on code in PR #64007:
URL: https://github.com/apache/doris/pull/64007#discussion_r3371098337


##########
regression-test/suites/external_table_p0/hive/test_hive_view_schema_drift.groovy:
##########
@@ -0,0 +1,115 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+// Regression test for: LogicalView.computeOutput() IndexOutOfBoundsException when
+// an underlying Hive table gains new columns (schema drift) after the Hive view was created.
+//
+// Repro:
+//   1. Create a Hive base table (3 cols) and a native Hive VIEW on it.
+//   2. In Doris, register the Hive catalog and query the external view — OK (3 cols).
+//   3. ADD COLUMN to the Hive base table via hive_docker.
+//   4. REFRESH TABLE <base_table> in Doris (view HMS schema NOT refreshed).
+//   5. Query the external view again — used to crash:
+//        errCode = 2, detailMessage = Index 3 out of bounds for length 3
+//      because LogicalView.computeOutput() iterated childOutput (4 slots from the
+//      re-analyzed view body) but called view.getFullSchema().get(i) on a 3-element
+//      list (the Hive view's HMS schema at creation time).
+//
+// The fix: use Math.min(childOutput.size(), fullSchema.size()) as the loop bound,
+// preserving the view's declared output contract while preventing the crash.
+
+suite("test_hive_view_schema_drift", "p0,external,hive_docker") {
+
+    String enabled = context.config.otherConfigs.get("enableHiveTest")
+    if (enabled == null || !enabled.equalsIgnoreCase("true")) {
+        logger.info("disable Hive test.")
+        return;
+    }
+
+    for (String hivePrefix : ["hive2", "hive3"]) {
+        setHivePrefix(hivePrefix)
+        String hms_port = context.config.otherConfigs.get(hivePrefix + "HmsPort")
+        String externalEnvIp = context.config.otherConfigs.get("externalEnvIp")
+        String catalog_name = "test_${hivePrefix}_view_schema_drift"
+        String db = "test_view_schema_drift_db"
+        String base_table = "test_view_schema_drift_base"
+        String hive_view = "test_view_schema_drift_view"
+
+        try {
+            // ---- Register Hive catalog in Doris ----
+            sql """drop catalog if exists ${catalog_name}"""
+            sql """CREATE CATALOG ${catalog_name} PROPERTIES (
+                'type'='hms',
+                'hive.metastore.uris' = 'thrift://${externalEnvIp}:${hms_port}',
+                'hadoop.username' = 'hive'
+            )"""
+
+            // ---- Create Hive database, base table (3 cols), and a native Hive VIEW ----
+            // The view is created through hive_docker so it is a native Hive view
+            // (ExternalView in Doris). Its HMS schema records exactly 3 columns.
+            hive_docker """drop database if exists ${db} cascade"""
+            hive_docker """create database ${db}"""
+            hive_docker """
+                create table ${db}.${base_table} (
+                    id     bigint,
+                    name   string,
+                    age    string
+                )
+                partitioned by (dt string)
+                stored as parquet
+            """
+            hive_docker """
+                create view ${db}.${hive_view} as

Review Comment:
   This regression still does not exercise the overflow path fixed in 
`LogicalView.computeOutput()`. The Hive view body is defined with an explicit 
projection (`select id, name, age`), and 
`BindRelation.parseAndAnalyzeExternalView()` re-analyzes the SQL returned by 
`HMSExternalTable.getViewText()`. After adding `score` to the base table, that 
view body still produces only the original three slots, so `childOutput.size()` 
does not exceed `view.getFullSchema().size()` and the old code would not hit 
`view.getFullSchema().get(3)`. As a result, the test can pass without the 
production fix. Please make the view body actually re-expand to include the 
newly added base-table column (or otherwise assert a pre-fix failure), so this 
end-to-end test proves the schema-drift crash is covered.
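
   To illustrate the point, here is a minimal sketch of the index arithmetic only. It is not the Doris implementation: `ViewOutputSketch`, `bindUnbounded`, and `bindBounded` are hypothetical names, and plain string lists stand in for the slots of `childOutput` and the columns of `view.getFullSchema()`. It assumes the pre-fix loop ran over `childOutput.size()` and the fixed loop runs over the `Math.min(...)` bound described in the test header.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of the loop bound; not the actual LogicalView code.
class ViewOutputSketch {

    // Pre-fix shape: walk every child slot and read the schema at the same index.
    static List<String> bindUnbounded(List<String> childOutput, List<String> fullSchema) {
        List<String> out = new ArrayList<>();
        for (int i = 0; i < childOutput.size(); i++) {
            out.add(fullSchema.get(i)); // throws once i >= fullSchema.size()
        }
        return out;
    }

    // Post-fix shape: stop at the shorter of the two lists.
    static List<String> bindBounded(List<String> childOutput, List<String> fullSchema) {
        int bound = Math.min(childOutput.size(), fullSchema.size());
        List<String> out = new ArrayList<>();
        for (int i = 0; i < bound; i++) {
            out.add(fullSchema.get(i));
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> fullSchema = List.of("id", "name", "age");

        // Explicit projection: the view body still yields 3 slots after ADD COLUMN.
        List<String> explicitProjection = List.of("id", "name", "age");
        // Re-expanding body: the new base-table column shows up as a 4th slot.
        List<String> reExpanded = List.of("id", "name", "age", "score");

        // 3 vs 3: the pre-fix loop never reaches index 3, so nothing fails.
        System.out.println(bindUnbounded(explicitProjection, fullSchema));

        // 4 vs 3: only this shape reaches fullSchema.get(3).
        try {
            bindUnbounded(reExpanded, fullSchema);
        } catch (IndexOutOfBoundsException e) {
            System.out.println("pre-fix loop overflowed");
        }

        // The bounded loop keeps the declared 3-column contract.
        System.out.println(bindBounded(reExpanded, fullSchema));
    }
}
```

   In this model, only the four-slot case separates the two loops. With the three-slot case that the current view definition produces, both loops return the same result, which is why the test cannot distinguish the fixed code from the unfixed code.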



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

