github-actions[bot] commented on code in PR #63543:
URL: https://github.com/apache/doris/pull/63543#discussion_r3288197221

##########
regression-test/suites/job_p0/streaming_job/cdc/test_streaming_postgres_job_source_timezone.groovy:
##########
@@ -0,0 +1,226 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements. See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership. The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License. You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied. See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+
+import org.awaitility.Awaitility
+
+import static java.util.concurrent.TimeUnit.SECONDS
+
+// PG counterpart of test_streaming_mysql_job_source_timezone.
+//
+// PG temporal semantics relevant to cdc tz handling:
+//   timestamp   - wall clock, no tz; debezium emits epoch-style schema, cdc
+//                 bypasses serverTimeZone.
+//   timestamptz - normalized to UTC on write per session TimeZone; debezium
+//                 emits ZonedTimestamp ISO string, cdc renders it using
+//                 serverTimeZone.
+//   timetz      - time-of-day with offset; PG retains the session offset
+//                 instead of normalizing to UTC. Debezium emits ZonedTime.
+//                 (Note: cdc currently misses this case in convert(); the
+//                 expected values below assume the upstream-fix behavior.)
+//   date        - no tz; literal day, must not drift across +08 boundary.
+//
+// Coverage:
+//   * source session tz set to +08 / -05 / +00, same wall clock written ->
+//     different UTC instants for timestamptz prove cdc honors source session.
+//   * NULL row across every temporal column.
+//   * epoch lower bound ('1970-01-01 00:00:01Z').
+//   * Binlog path mirrors snapshot themes, plus an UPDATE on tstz0 under +08.
+//
+// jdbc_url uses timezone=UTC so cdc renders timestamptz back to UTC wall
+// clock regardless of the source session offset.
+suite("test_streaming_postgres_job_source_timezone", "p0,external,pg,external_docker,external_docker_pg,nondatalake") {
+    def jobName = "test_streaming_postgres_job_source_timezone_name"
+    def currentDb = (sql "select database()")[0][0]
+    def table1 = "streaming_pg_source_timezone"
+    def pgDB = "postgres"
+    def pgSchema = "cdc_test"
+    def pgUser = "postgres"
+    def pgPassword = "123456"
+
+    sql """DROP JOB IF EXISTS where jobname = '${jobName}'"""
+    sql """drop table if exists ${currentDb}.${table1} force"""
+
+    String enabled = context.config.otherConfigs.get("enableJdbcTest")
+    if (enabled != null && enabled.equalsIgnoreCase("true")) {
+        String pg_port = context.config.otherConfigs.get("pg_14_port");
+        String externalEnvIp = context.config.otherConfigs.get("externalEnvIp")
+        String s3_endpoint = getS3Endpoint()
+        String bucket = getS3BucketName()
+        String driver_url = "https://${bucket}.${s3_endpoint}/regression/jdbc_driver/postgresql-42.5.0.jar"
+
+        connect("${pgUser}", "${pgPassword}", "jdbc:postgresql://${externalEnvIp}:${pg_port}/${pgDB}") {
+            sql """DROP TABLE IF EXISTS ${pgDB}.${pgSchema}.${table1}"""
+            sql """
+            create table ${pgDB}.${pgSchema}.${table1} (
+                id integer PRIMARY KEY,
+                tag varchar(32),
+                ts timestamp,
+                tstz0 timestamptz(0),
+                tstz3 timestamptz(3),
+                tstz6 timestamptz(6),
+                ttz time with time zone,

Review Comment:
This adds `time with time zone` to a P0 regression suite, but the current CDC converter still does not have a specific branch for Debezium `io.debezium.time.ZonedTime` (the nearby comment also says the expected values assume an upstream-fix behavior). In the current code path, named schemas not matched by `Time`, `Timestamp`, `ZonedTimestamp`, etc. fall through to `dbzObj.toString()`, so this test can fail or encode the offset differently from the committed `.out` (`+08` / `-05`). Please either add/land the actual `ZonedTime` conversion before enabling this assertion, or remove `ttz` from this P0 case until the behavior is implemented.


##########
regression-test/suites/job_p0/streaming_job/cdc/test_streaming_mysql_job_jdbc_servertimezone.groovy:
##########
@@ -0,0 +1,137 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements. See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership. The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License. You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied. See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+
+import org.awaitility.Awaitility
+
+import static java.util.concurrent.TimeUnit.SECONDS
+
+// Recommended end-to-end tz configuration: align jdbc_url's serverTimezone
+// with Doris session time_zone, so Doris users see TIMESTAMP columns as
+// wall-clock in the cluster's local tz.
+//
+// jdbc_url is built from the Doris session tz at runtime, so the case works
+// on clusters configured with different default tz values without code
+// changes.
+//
+// Setup:
+//   source SET SESSION time_zone='+01:00', INSERT '2024-06-15 11:00:00'
+//     ts0 (TIMESTAMP) -> source-internal UTC instant 2024-06-15 10:00:00Z
+//     dt0 (DATETIME) -> literal '2024-06-15 11:00:00'
+//   jdbc_url serverTimezone=<Doris session tz>
+//
+// Expectations at Doris (.out is pre-filled for the standard Doris default
+// session time_zone '+08:00'):
+//   ts0 -> '2024-06-15T18:00' (UTC 10:00Z + 8h = 18:00 in +08)
+//   dt0 -> '2024-06-15T11:00' (DATETIME has no tz semantics, stored verbatim)
+suite("test_streaming_mysql_job_jdbc_servertimezone", "p0,external,mysql,external_docker,external_docker_mysql,nondatalake") {
+    def jobName = "test_streaming_mysql_job_jdbc_servertimezone_name"
+    def currentDb = (sql "select database()")[0][0]
+    def table1 = "streaming_mysql_jdbc_servertimezone"
+    def mysqlDb = "test_cdc_db"
+
+    sql """DROP JOB IF EXISTS where jobname = '${jobName}'"""
+    sql """drop table if exists ${currentDb}.${table1} force"""
+
+    String enabled = context.config.otherConfigs.get("enableJdbcTest")
+    if (enabled != null && enabled.equalsIgnoreCase("true")) {
+        String mysql_port = context.config.otherConfigs.get("mysql_57_port");
+        String externalEnvIp = context.config.otherConfigs.get("externalEnvIp")
+        String s3_endpoint = getS3Endpoint()
+        String bucket = getS3BucketName()
+        String driver_url = "https://${bucket}.${s3_endpoint}/regression/jdbc_driver/mysql-connector-j-8.4.0.jar"
+
+        // Read Doris session tz so the cdc job aligns with it.
+        def dorisTz = (sql "select @@time_zone")[0][0]

Review Comment:
The test reads `@@time_zone` at runtime and uses it as `serverTimezone`, but the committed `.out` is fixed for Doris `+08:00` (`ts0` is expected as `2024-06-15T18:00`). On any runner whose Doris session timezone is not `+08:00`, the CDC job will correctly render a different wall clock and this regression will fail. The PostgreSQL jdbc_servertimezone case has the same pattern.
Please make the suite deterministic, for example by setting the Doris session timezone to the value used by the `.out` before reading it, or by using a fixed timezone in both the URL and expected output.
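
To make the timezone dependence concrete, here is a minimal Java sketch of the arithmetic described in the suite's header comment. It uses only `java.time`; the class and method names (`Ts0Rendering`, `render`) are hypothetical, and the `+00:00` and `-05:00` offsets are picked purely for illustration. It is not code from the CDC job.

```java
import java.time.LocalDateTime;
import java.time.ZoneOffset;

class Ts0Rendering {
    // Wall clock seen when a value written under sourceOffset is rendered
    // under targetOffset (same instant, different offset).
    static LocalDateTime render(LocalDateTime sourceWallClock,
                                ZoneOffset sourceOffset,
                                ZoneOffset targetOffset) {
        return sourceWallClock
                .atOffset(sourceOffset)
                .withOffsetSameInstant(targetOffset)
                .toLocalDateTime();
    }

    public static void main(String[] args) {
        // Values from the suite comment: source session '+01:00',
        // INSERT '2024-06-15 11:00:00' into ts0.
        LocalDateTime inserted = LocalDateTime.of(2024, 6, 15, 11, 0);
        ZoneOffset source = ZoneOffset.of("+01:00");

        // Only the first offset matches the committed .out; the others are
        // illustrative stand-ins for a runner with a different session tz.
        for (String tz : new String[] {"+08:00", "+00:00", "-05:00"}) {
            System.out.println(tz + " -> " + render(inserted, source, ZoneOffset.of(tz)));
        }
    }
}
```

Under `+08:00` this yields `2024-06-15T18:00`, matching the expected `ts0`; under the other two offsets it yields `2024-06-15T10:00` and `2024-06-15T05:00`, which is why a `.out` pinned to one offset cannot pass when the session timezone is read at runtime.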

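
For the `ttz` comment on the PostgreSQL suite, the following is a rough sketch of what an explicit `ZonedTime` branch could look like. It is not the Doris converter: `ZonedTimeSketch` and `convert` are hypothetical names, the input is assumed to be an ISO-8601 offset-time string such as `10:00:00+08:00`, and the output format the `.out` actually expects is not known from this review.

```java
import java.time.OffsetTime;
import java.time.format.DateTimeFormatter;

class ZonedTimeSketch {
    // Debezium's logical schema name for time-with-offset values.
    static final String ZONED_TIME_SCHEMA = "io.debezium.time.ZonedTime";

    // Hypothetical stand-in for the converter's dispatch on schema name.
    static String convert(String schemaName, Object dbzObj) {
        if (dbzObj == null) {
            return null;
        }
        if (ZONED_TIME_SCHEMA.equals(schemaName)) {
            // Assumption: the value arrives as an ISO-8601 offset time string.
            OffsetTime parsed = OffsetTime.parse(dbzObj.toString());
            // Format explicitly so the rendering is a deliberate choice
            // rather than whatever toString() happens to produce.
            return DateTimeFormatter.ISO_OFFSET_TIME.format(parsed);
        }
        // Unmatched named schemas fall through, as described in the comment.
        return dbzObj.toString();
    }

    public static void main(String[] args) {
        System.out.println(convert(ZONED_TIME_SCHEMA, "10:00:00+08:00"));
        System.out.println(convert("some.other.Schema", 42));
    }
}
```

A dedicated branch like this gives the suite a single place that defines how the offset is encoded, so the expected values can be checked against implemented behavior instead of an assumed upstream fix.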