Copilot commented on code in PR #64741:
URL: https://github.com/apache/doris/pull/64741#discussion_r3458731403


##########
fs_brokers/cdc_client/src/main/java/org/apache/doris/cdcclient/source/deserialize/DebeziumJsonDeserializer.java:
##########
@@ -412,19 +412,50 @@ private Object convertToArray(Schema fieldSchema, Object 
dbzObj) {
         return dbzObj.toString();
     }
 
+    // Format a since-midnight time value given as total microseconds (may be 
negative or exceed
+    // 24h) as MySQL TIME literal text: ±HH:MM:SS[.ffffff], stripping trailing 
fractional zeros.
+    private static String formatTimeText(long microsTotal) {
+        boolean negative = microsTotal < 0;
+        long abs = Math.abs(microsTotal);
+        long hours = abs / 3_600_000_000L;
+        long rem = abs % 3_600_000_000L;
+        long minutes = rem / 60_000_000L;
+        rem %= 60_000_000L;
+        long seconds = rem / 1_000_000L;
+        long frac = rem % 1_000_000L;
+        String sign = negative ? "-" : "";
+        if (frac == 0) {
+            return String.format("%s%02d:%02d:%02d", sign, hours, minutes, 
seconds);
+        }
+        String fracStr = String.format("%06d", frac).replaceAll("0+$", "");
+        return String.format("%s%02d:%02d:%02d.%s", sign, hours, minutes, 
seconds, fracStr);
+    }
+
     protected Object convertToTime(Object dbzObj, Schema schema) {
         try {
             if (dbzObj instanceof Long) {
+                long v = (Long) dbzObj;
                 switch (schema.name()) {
                     case MicroTime.SCHEMA_NAME:
-                        // micro to nano
-                        return LocalTime.ofNanoOfDay((Long) dbzObj * 
1000L).toString();
+                        // MySQL TIME spans [-838:59:59, 838:59:59]; 
out-of-range (negative or
+                        // >=24h) cannot use LocalTime, format as ±HH:MM:SS 
instead. micro to nano.
+                        if (v >= 0 && v < 86_400_000_000L) {
+                            return LocalTime.ofNanoOfDay(v * 1000L).toString();
+                        }
+                        return formatTimeText(v);
                     case NanoTime.SCHEMA_NAME:
-                        return LocalTime.ofNanoOfDay((Long) dbzObj).toString();
+                        if (v >= 0 && v < 86_400_000_000_000L) {
+                            return LocalTime.ofNanoOfDay(v).toString();
+                        }
+                        return formatTimeText(v / 1000L);

Review Comment:
   `NanoTime` in-range formatting uses `LocalTime.ofNanoOfDay(v).toString()` 
(up to 9 fractional digits), but out-of-range formatting truncates to 
microseconds via `v / 1000L` and emits at most 6 digits. This creates an 
observable precision/format inconsistency for the same Debezium logical type 
depending on range. Consider aligning behavior by either (a) formatting nanos 
(±HH:MM:SS[.fffffffff]) for out-of-range NanoTime as well, or (b) consistently 
downscaling to micros for *both* branches of NanoTime to guarantee stable 
output.



##########
fs_brokers/cdc_client/src/main/java/org/apache/doris/cdcclient/source/deserialize/DebeziumJsonDeserializer.java:
##########
@@ -412,19 +412,50 @@ private Object convertToArray(Schema fieldSchema, Object 
dbzObj) {
         return dbzObj.toString();
     }
 
+    // Format a since-midnight time value given as total microseconds (may be 
negative or exceed
+    // 24h) as MySQL TIME literal text: ±HH:MM:SS[.ffffff], stripping trailing 
fractional zeros.
+    private static String formatTimeText(long microsTotal) {
+        boolean negative = microsTotal < 0;
+        long abs = Math.abs(microsTotal);
+        long hours = abs / 3_600_000_000L;
+        long rem = abs % 3_600_000_000L;
+        long minutes = rem / 60_000_000L;
+        rem %= 60_000_000L;
+        long seconds = rem / 1_000_000L;
+        long frac = rem % 1_000_000L;
+        String sign = negative ? "-" : "";
+        if (frac == 0) {
+            return String.format("%s%02d:%02d:%02d", sign, hours, minutes, 
seconds);
+        }
+        String fracStr = String.format("%06d", frac).replaceAll("0+$", "");

Review Comment:
   `replaceAll` compiles/executes a regex for every formatted TIME with 
fractional seconds, which can be relatively expensive in a hot deserialization 
path. Consider trimming trailing zeros without regex (e.g., scan from the end) 
or using a small utility to avoid regex overhead and extra allocations.



##########
fs_brokers/cdc_client/src/test/java/org/apache/doris/cdcclient/itcase/PostgresTimeRangeITCase.java:
##########
@@ -0,0 +1,119 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+package org.apache.doris.cdcclient.itcase;
+
+import static org.assertj.core.api.Assertions.assertThat;
+
+import org.apache.doris.cdcclient.common.Env;
+import org.apache.doris.job.cdc.split.SnapshotSplit;
+
+import com.fasterxml.jackson.databind.JsonNode;
+import com.fasterxml.jackson.databind.ObjectMapper;
+import org.junit.jupiter.api.AfterEach;
+import org.junit.jupiter.api.BeforeEach;
+import org.junit.jupiter.api.Test;
+import org.testcontainers.containers.PostgreSQLContainer;
+import org.testcontainers.junit.jupiter.Container;
+import org.testcontainers.junit.jupiter.Testcontainers;
+import org.testcontainers.utility.DockerImageName;
+
+import java.sql.Connection;
+import java.sql.DriverManager;
+import java.sql.Statement;
+import java.util.HashMap;
+import java.util.List;
+import java.util.Map;
+import java.util.concurrent.atomic.AtomicLong;
+
+/**
+ * Guards that the MySQL TIME out-of-range fix 
(DebeziumJsonDeserializer.convertToTime) does not
+ * regress PostgreSQL time handling. PG time values share the same MicroTime 
path: in-range values
+ * must stay byte-for-byte on the unchanged LocalTime branch, and the PG-legal 
boundary '24:00:00'
+ * is now formatted as text instead of falling back to the raw long.
+ */
+@Testcontainers
+class PostgresTimeRangeITCase {
+
+    private static final ObjectMapper MAPPER = new ObjectMapper();
+    private static final AtomicLong JOB_ID_SEQ = new AtomicLong(585_000);
+
+    @Container
+    static final PostgreSQLContainer<?> POSTGRES =
+            new PostgreSQLContainer<>(DockerImageName.parse("postgres:14"))
+                    .withCommand("postgres", "-c", "wal_level=logical");
+
+    private String jobId;
+
+    @BeforeEach
+    void setUp() throws Exception {
+        jobId = String.valueOf(JOB_ID_SEQ.incrementAndGet());
+        try (Connection conn = connect();
+                Statement st = conn.createStatement()) {
+            st.execute("DROP TABLE IF EXISTS t_time");
+            st.execute("CREATE TABLE t_time (id INT PRIMARY KEY, t_col 
TIME(6))");
+            st.execute("ALTER TABLE t_time REPLICA IDENTITY FULL");
+            // id 1: ordinary in-range value -- the fix must leave it 
byte-for-byte unchanged.
+            // id 2: PG-legal upper boundary 24:00:00 -- a raw-long fallback 
before the fix.
+            st.execute("INSERT INTO t_time VALUES (1,'12:34:56.123456'), 
(2,'24:00:00')");

Review Comment:
   This IT case uses a fixed table name (`t_time`) shared by all runs. If the 
test suite is ever executed with JUnit parallelization (or multiple forks), 
concurrent runs can drop/recreate/insert into the same table and cause flakes. 
Consider incorporating `jobId` into the table name (as done in 
`PostgresTypeConsistencyITCase`) and dropping it in `@AfterEach` to isolate 
runs.



##########
fs_brokers/cdc_client/src/main/java/org/apache/doris/cdcclient/source/deserialize/DebeziumJsonDeserializer.java:
##########
@@ -412,19 +412,50 @@ private Object convertToArray(Schema fieldSchema, Object 
dbzObj) {
         return dbzObj.toString();
     }
 
+    // Format a since-midnight time value given as total microseconds (may be 
negative or exceed
+    // 24h) as MySQL TIME literal text: ±HH:MM:SS[.ffffff], stripping trailing 
fractional zeros.
+    private static String formatTimeText(long microsTotal) {
+        boolean negative = microsTotal < 0;
+        long abs = Math.abs(microsTotal);

Review Comment:
   `Math.abs(long)` overflows for `Long.MIN_VALUE` (it returns the same 
negative value), which would break the hour/min/sec decomposition and sign 
handling. Even if current callers are expected to be within MySQL/PG ranges, 
this helper is generic and easy to harden with a `Long.MIN_VALUE` special-case 
(or by using an unsigned/BigInteger approach).



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to