[ 
https://issues.apache.org/jira/browse/FLINK-38555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18033150#comment-18033150
 ] 

yuanfenghu edited comment on FLINK-38555 at 10/27/25 9:26 AM:
--------------------------------------------------------------

[~ruanhang1993]   [~kirubankamaraj]  
 
 
I tried to fix this, but it turned out to be harder than expected. The 
change touches both DATETIME (precision) and TIMESTAMP (time zone) handling, 
and so far I have not found a clean solution. I'm not very familiar with the 
CDC source code. Could you take a look and give me some suggestions? My idea 
is to convert the long value to LocalDateTime before comparing, but I have 
not found an elegant way to write that conversion.
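
A rough sketch of that idea, for discussion only. It assumes the long is epoch 
milliseconds and that the caller knows which time zone the LocalDateTime was 
derived in; neither assumption is taken from the CDC source. The class and 
method names (TemporalCompareSketch, toLocalDateTime, compareEpochMillis) are 
hypothetical and do not exist in RecordUtils.

```java
import java.time.Instant;
import java.time.LocalDateTime;
import java.time.ZoneId;
import java.time.ZoneOffset;
import java.time.temporal.ChronoUnit;

// Hypothetical sketch, not part of Flink CDC.
class TemporalCompareSketch {

    // Assumption: the long holds epoch milliseconds, interpreted in the given zone.
    static LocalDateTime toLocalDateTime(long epochMillis, ZoneId zone) {
        return LocalDateTime.ofInstant(Instant.ofEpochMilli(epochMillis), zone);
    }

    // Compares a long against a LocalDateTime without calling toString().
    // The LocalDateTime side is truncated to milliseconds so that both
    // operands carry the same precision before they are compared.
    static int compareEpochMillis(long epochMillis, LocalDateTime other, ZoneId zone) {
        LocalDateTime left = toLocalDateTime(epochMillis, zone);
        LocalDateTime right = other.truncatedTo(ChronoUnit.MILLIS);
        return left.compareTo(right);
    }

    public static void main(String[] args) {
        ZoneId zone = ZoneOffset.UTC;
        LocalDateTime key = LocalDateTime.of(2025, 10, 24, 10, 15, 18, 27_000_000);
        long millis = key.toInstant(ZoneOffset.UTC).toEpochMilli();

        // Same point in time: compares as equal.
        System.out.println(Integer.signum(compareEpochMillis(millis, key, zone)));
        // One millisecond later on the long side: compares as greater.
        System.out.println(Integer.signum(compareEpochMillis(millis + 1, key, zone)));
        // Sub-millisecond digits on the LocalDateTime side are truncated away.
        System.out.println(
                Integer.signum(compareEpochMillis(millis, key.plusNanos(500_000), zone)));
    }
}
```

The two open points are visible here: the truncation unit has to match the 
column precision, and the ZoneId has to match the zone used when the other 
operand was produced. Getting either one wrong changes the ordering.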

Thanks~



> Optimize performance of `RecordUtils.compareObjects()` method by avoiding 
> unnecessary `toString()` calls for temporal types (LocalDateTime, LocalDate, 
> Instant, etc.).
> ----------------------------------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: FLINK-38555
>                 URL: https://issues.apache.org/jira/browse/FLINK-38555
>             Project: Flink
>          Issue Type: Bug
>          Components: Flink CDC
>    Affects Versions: cdc-3.5.0
>            Reporter: yuanfenghu
>            Assignee: yuanfenghu
>            Priority: Critical
>             Fix For: cdc-3.6.0
>
>         Attachments: image-2025-10-24-10-15-18-027.png, 
> image-2025-10-24-10-15-37-328.png
>
>
> h2. Background
> While analyzing flame graphs of a Flink CDC MySQL source job, I identified 
> that `RecordUtils.splitKeyRangeContains()` was a performance bottleneck. 
> Further investigation revealed that `compareObjects()` was using `toString()` 
> to compare temporal objects, which is significantly slower than direct 
> comparison.
>  
> h3. Root Cause
> In the current implementation:
> {code:java}
> private static int compareObjects(Object o1, Object o2) {
>     if (o1 instanceof Comparable && o1.getClass().equals(o2.getClass())) {
>         return ((Comparable) o1).compareTo(o2);
>     } else if (isNumericObject(o1) && isNumericObject(o2)) {
>         return toBigDecimal(o1).compareTo(toBigDecimal(o2));
>     } else {
>         return o1.toString().compareTo(o2.toString());
>     }
> }{code}
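> The first condition only succeeds when both operands share the same runtime 
> class. When a temporal value such as `LocalDateTime` is compared against a 
> value of a different type, that check fails and the comparison falls through 
> to the `toString()` path.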
> h3. Impact
> This method is called extensively during the snapshot phase when evaluating 
> whether binlog records fall within completed split ranges. For tables with:
>  - Temporal types (DATETIME, TIMESTAMP, DATE, TIME) as chunk keys
>  - High binlog throughput during snapshot phase
>  - Many splits (large tables with small chunk size)
> The performance impact can be significant (80% CPU in some cases).
> !image-2025-10-24-10-15-37-328.png!
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
