JNSimba opened a new issue #8133: URL: https://github.com/apache/incubator-doris/issues/8133
### Search before asking

- [X] I had searched in the [issues](https://github.com/apache/incubator-doris/issues?q=is%3Aissue) and found no similar issues.

### Version

0.15.1-rc09

### What's Wrong?

I want Stream Load to skip rows whose values are longer than the column allows, but when the batch contains only one row, `max_filter_ratio` has no effect.

Example:

1. Create the table:

```sql
CREATE TABLE `test_jieru1` (
  `id` int(11) NULL COMMENT "",
  `name` varchar(10) NULL COMMENT ""
) ENGINE=OLAP
UNIQUE KEY(`id`)
COMMENT "OLAP"
DISTRIBUTED BY HASH(`id`) BUCKETS 1
PROPERTIES (
  "replication_allocation" = "tag.location.default: 3",
  "in_memory" = "false",
  "storage_format" = "V2"
);
```

2. Data in `example.json`:

`[{"id":"1","name":"zhangsanzhangsanzhangsan"}]`

3. Stream Load command:

`curl -v --location-trusted -u root:123456 -H "format: json" -H "strip_outer_array: true" -H "jsonpaths: [\"$.id\", \"$.name\"]" -H "max_filter_ratio:1" -H "columns: id,name" -T example.json http://127.0.0.1:8030/api/test/test/_stream_load`

Response:

```
{
    "TxnId": 67452740,
    "Label": "711803f3-d44f-46c8-bc58-68f4ebb394d9",
    "Status": "Fail",
    "Message": "all partitions have no load data",
    "NumberTotalRows": 1,
    "NumberLoadedRows": 0,
    "NumberFilteredRows": 1,
    "NumberUnselectedRows": 0,
    "LoadBytes": 40,
    "LoadTimeMs": 6,
    "BeginTxnTimeMs": 0,
    "StreamLoadPutTimeMs": 0,
    "ReadDataTimeMs": 0,
    "WriteDataTimeMs": 4,
    "CommitAndPublishTimeMs": 0,
    "ErrorURL": "http://127.0.0.1:8040/api/_load_error_log?file=__shard_1005/error_log_insert_stmt_26496fe992d7e55c-409680f9eba491b4_26496fe992d7e55c_409680f9eba491b4"
}

errorURL detail is :
Reason: the length of input is too long than schema. column_name: name; input_str: [zhangsanzhangsanzhangsan] schema length: 10; actual length: 24; . src line: [];
```

**When the batch contains more than one row, the load succeeds**, e.g.:

`[{"id":"1","name":"zhangsanzhangsanzhangsan"},{"id":1,"name":"wangwu"}]`

The response is:

```
{
    "TxnId": 67452049,
    "Label": "2fea62bd-eb35-4a98-ae64-c861820e135c",
    "Status": "Success",
    "Message": "OK",
    "NumberTotalRows": 2,
    "NumberLoadedRows": 1,
    "NumberFilteredRows": 1,
    "NumberUnselectedRows": 0,
    "LoadBytes": 65,
    "LoadTimeMs": 23,
    "BeginTxnTimeMs": 0,
    "StreamLoadPutTimeMs": 0,
    "ReadDataTimeMs": 0,
    "WriteDataTimeMs": 5,
    "CommitAndPublishTimeMs": 15,
    "ErrorURL": "http://127.0.0.1:8040/api/_load_error_log?file=__shard_945/error_log_insert_stmt_7440e508eae8659f-ce310b4ed5edf196_7440e508eae8659f_ce310b4ed5edf196"
}
```

### What You Expected?

`max_filter_ratio` should also take effect when the batch contains a single row.

### How to Reproduce?

_No response_

### Anything Else?

_No response_

### Are you willing to submit PR?

- [ ] Yes I am willing to submit a PR!

### Code of Conduct

- [X] I agree to follow this project's [Code of Conduct](https://www.apache.org/foundation/policies/conduct)

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at: [email protected]
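
The two responses in this report are consistent with two separate checks: a filter-ratio check that passes in both cases (1/1 and 1/2 are both within `max_filter_ratio: 1`), and a second check that rejects a load in which no row survives filtering. The following is a minimal Python sketch of that reading, not Doris code. `LoadResult`, `evaluate_load`, and the `"too many filtered rows"` message are hypothetical names introduced for illustration, and the order of the checks is an assumption.

```python
from dataclasses import dataclass


@dataclass
class LoadResult:
    """Row counters as reported in a Stream Load response (unselected rows ignored)."""
    total_rows: int
    filtered_rows: int

    @property
    def loaded_rows(self) -> int:
        return self.total_rows - self.filtered_rows


def evaluate_load(result: LoadResult, max_filter_ratio: float) -> tuple[str, str]:
    """Hypothetical model of the behaviour observed in this issue.

    Assumption: the filter-ratio check and the "no rows loaded" check are
    independent, so the second can fail a load that the first accepts.
    """
    if result.total_rows > 0:
        ratio = result.filtered_rows / result.total_rows
        if ratio > max_filter_ratio:
            # Placeholder message, not taken from Doris.
            return "Fail", "too many filtered rows"
    if result.loaded_rows == 0:
        # Message copied from the failing response in the report.
        return "Fail", "all partitions have no load data"
    return "Success", "OK"


# Single-row batch from the report: ratio 1/1 is allowed, yet nothing is loaded.
single = evaluate_load(LoadResult(total_rows=1, filtered_rows=1), max_filter_ratio=1.0)
# Two-row batch from the report: ratio 1/2 is allowed and one row is loaded.
double = evaluate_load(LoadResult(total_rows=2, filtered_rows=1), max_filter_ratio=1.0)
```

Under this model the failure depends on every row being filtered, not on the batch size itself, so a larger batch in which all rows are too long would presumably fail the same way.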
