zhanggougou opened a new issue, #6089: URL: https://github.com/apache/seatunnel/issues/6089
### Search before asking

- [X] I had searched in the [issues](https://github.com/apache/seatunnel/issues?q=is%3Aissue+label%3A%22bug%22) and found no similar issues.

### What happened

When mmap is used to speed up file generation, the generated file may contain `<0x00>` (NUL) bytes. My ClickHouse local version is 22.3.17.13. The `<0x00>` bytes cause `clickhouse local` to throw an exception, so I made a change that uses `fileChannel.position()` to avoid writing `<0x00>`.

### SeaTunnel Version

2.3.1

### SeaTunnel Config

```conf
env {
  spark.app.name = "hive_to_ck_file_online"
  spark.yarn.queue="root.private"
  spark.executor.instances = 20
  spark.executor.cores = 1
  spark.executor.memory = 16g
  spark.sql.catalogImplementation = "hive"
  spark.executor.extraJavaOptions = "-Dfile.encoding=UTF-8"
  spark.driver.extraJavaOptions = "-Dfile.encoding=UTF-8"
  spark.hadoop.hive.exec.dynamic.partition = "true"
  spark.hadoop.hive.exec.dynamic.partition.mode = "nonstrict"
  spark.debug.maxToStringFields = 100000
  spark.speculation=false
  spark.yarn.maxAppAttempts=1
  spark.yarn.max.executor.failures=1
  spark.stage.maxConsecutiveAttempts=1
  spark.blacklist.enabled=false
}

source {
  Hive {
    metastore_uri="****"
    table_name="st_site.ch_push_test"
    read_partitions= ["ds=20231103","ds=20231104","ds=20231105","ds=20231106","ds=20231107","ds=20231108","ds=20231109","ds=20231110","ds=20231111","ds=20231112","ds=20231113","ds=20231114","ds=20231115","ds=20231116","ds=20231117","ds=20231118","ds=20231119","ds=20231120","ds=20231121","ds=20231122","ds=20231123","ds=20231124","ds=20231125","ds=20231126","ds=20231127","ds=20231128","ds=20231129","ds=20231130","ds=20231201","ds=20231202","ds=20231203"]
    parallelism= 1000
  }
}

sink{
  ClickhouseFile {
    host = "****"
    database = "db_test"
    table = "t_st_bill_line_weightrange_detail6"
    username = "****"
    password = "****"
    clickhouse_local_path = "/usr/bin/clickhouse local"
    node_pass = [{
      node_address = "****"
      username="****"
      password = "****"
    }
    ]
  }
}
```

### Running Command

```shell
./bin/start-seatunnel-spark-2-connector-v2.sh --master yarn --deploy-mode cluster --config ${1}
```

### Error Exception

```log
ERROR ClickhouseFileSinkWriter: Code: 27. DB::Exception: Cannot parse input: expected '\t' at end of stream.: Buffer has gone, cannot extract information about what has been parsed.: While executing TabSeparatedRowInputFormat: While executing File. (CANNOT_PARSE_INPUT_ASSERTION_FAILED)
```

### Zeta or Flink or Spark Version

_No response_

### Java or Scala Version

_No response_

### Screenshots

This is the data file:

<img width="1222" alt="image" src="https://github.com/apache/seatunnel/assets/25924003/746bf928-52fd-425c-8b3d-7127a230a593">

This is my fix. With it, the file no longer contains `<0x00>` and `clickhouse local` runs successfully:

<img width="1365" alt="image" src="https://github.com/apache/seatunnel/assets/25924003/062dea73-066d-4a69-b854-bdaced6aad64">
<img width="1357" alt="image" src="https://github.com/apache/seatunnel/assets/25924003/035bacbb-2aff-441d-a778-9e61219084c8">

### Are you willing to submit PR?

- [X] Yes I am willing to submit a PR!

### Code of Conduct

- [X] I agree to follow this project's [Code of Conduct](https://www.apache.org/foundation/policies/conduct)
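
For readers who cannot open the screenshots, here is a minimal, standalone sketch of how an mmap-based writer can leave `<0x00>` bytes in a file, and one possible way to avoid them. It is not the SeaTunnel code and not necessarily the fix shown in the screenshots: the class `MmapPaddingDemo`, its method names, the sample rows, and the 64-byte mapping size are all hypothetical. The assumption is that the writer maps a region larger than the data it ends up writing, so the unwritten tail of the region stays zero-filled; mapping exactly the bytes of each row at a tracked offset avoids that tail.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;
import java.util.Arrays;
import java.util.List;

public class MmapPaddingDemo {

    // Hypothetical "before" writer: maps one fixed-size region up front.
    // Mapping in READ_WRITE mode grows the file to mappedSize, so any part
    // of the region that is never written remains as 0x00 bytes on disk.
    static void writeWithFixedMapping(Path file, List<byte[]> rows, int mappedSize) {
        try (FileChannel ch = FileChannel.open(file,
                StandardOpenOption.CREATE, StandardOpenOption.READ,
                StandardOpenOption.WRITE, StandardOpenOption.TRUNCATE_EXISTING)) {
            MappedByteBuffer buf = ch.map(FileChannel.MapMode.READ_WRITE, 0, mappedSize);
            for (byte[] row : rows) {
                buf.put(row);
            }
            buf.force();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Hypothetical "after" writer: tracks the write offset and maps exactly
    // row.length bytes for each row, so the file never grows past the data.
    static void writeWithExactMapping(Path file, List<byte[]> rows) {
        try (FileChannel ch = FileChannel.open(file,
                StandardOpenOption.CREATE, StandardOpenOption.READ,
                StandardOpenOption.WRITE, StandardOpenOption.TRUNCATE_EXISTING)) {
            long offset = 0;
            for (byte[] row : rows) {
                MappedByteBuffer buf = ch.map(FileChannel.MapMode.READ_WRITE, offset, row.length);
                buf.put(row);
                buf.force();
                offset += row.length;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Counts 0x00 bytes in the file; the sample rows contain none themselves.
    static long countNulBytes(Path file) {
        try {
            long count = 0;
            for (byte b : Files.readAllBytes(file)) {
                if (b == 0) {
                    count++;
                }
            }
            return count;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Two tab-separated sample rows (made-up data).
        List<byte[]> rows = Arrays.asList(
                "1\tfoo\n".getBytes(StandardCharsets.UTF_8),
                "2\tbar\n".getBytes(StandardCharsets.UTF_8));
        String tmp = System.getProperty("java.io.tmpdir");
        Path fixed = Paths.get(tmp, "mmap-fixed-" + System.nanoTime() + ".tsv");
        Path exact = Paths.get(tmp, "mmap-exact-" + System.nanoTime() + ".tsv");

        writeWithFixedMapping(fixed, rows, 64);
        writeWithExactMapping(exact, rows);

        System.out.println("fixed mapping NUL bytes: " + countNulBytes(fixed));
        System.out.println("exact mapping NUL bytes: " + countNulBytes(exact));
    }
}
```

Mapping once per row is only for illustration and would be slow for real workloads; truncating the file to the number of bytes actually written, or writing through `FileChannel.write`, are alternative ways to keep the trailing zero bytes out of the file handed to `clickhouse local`.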