AdkinsHan opened a new issue, #6868: URL: https://github.com/apache/seatunnel/issues/6868
### Search before asking

- [X] I had searched in the [issues](https://github.com/apache/seatunnel/issues?q=is%3Aissue+label%3A%22bug%22) and found no similar issues.

### What happened

When I use Spark local mode to read a local CSV file into a Hive table, the data is written 3 times (3N rows for N input rows). This does not happen in Spark yarn mode with cluster deploy mode (see the summary). I previously used SeaTunnel 1.5, where the migration process ran in local mode, but when I tested version 2.3.5 the same way, the data was duplicated.

Summary:

- --master **local** --deploy-mode **client**: 3 times
- --master **yarn** --deploy-mode **client**: 3 times
- --master **yarn** --deploy-mode **cluster**: right

My CSV file has 2076 rows, but `select count(1) from xx` returns 3 * 2076.

### SeaTunnel Version

2.3.5

### SeaTunnel Config

```conf
env {
  # seatunnel defined streaming batch duration in seconds
  execution.parallelism = 4
  job.mode = "BATCH"
  spark.executor.instances = 4
  spark.executor.cores = 4
  spark.executor.memory = "4g"
  spark.sql.catalogImplementation = "hive"
  spark.hadoop.hive.exec.dynamic.partition = "true"
  spark.hadoop.hive.exec.dynamic.partition.mode = "nonstrict"
}

source {
  LocalFile {
    schema {
      fields {
        sku = string
        sku_group = string
        pb = string
        series = string
        pn = string
        mater_n = string
      }
    }
    path = "/data/ghyworkbase/uploadfile/h019-ods_file_pjp_old_new_sku_yy.csv"
    file_format_type = "csv"
    skip_header_row_number=1
    result_table_name="ods_file_pjp_old_new_sku_yy_source"
  }
}

transform {
  Sql {
    source_table_name="ods_file_pjp_old_new_sku_yy_source"
    query = "select sku,sku_group,pb,series,pn,mater_n,TO_CHAR(CURRENT_DATE(),'yyyy') as dt_year from ods_file_pjp_old_new_sku_yy_source "
    result_table_name="ods_file_pjp_old_new_sku_yy"
  }
}

sink {
  # Console {
  #   source_table_name = "ods_file_pjp_old_new_sku_yy"
  # }
  Hive {
    source_table_name="ods_file_pjp_old_new_sku_yy"
    table_name = "ghydata.ods_file_pjp_old_new_sku_yy"
    metastore_uri = "thrift://"
  }
}
```

### Running Command

```shell
sh /data/seatunnel/seatunnel-2.3.4/bin/start-seatunnel-spark-3-connector-v2.sh \
  --master local \
  --deploy-mode client \
  --queue ghydl \
  --executor-instances 4 \
  --executor-cores 4 \
  --executor-memory 4g \
  --name "h019-ods_file_pjp_old_new_sku_yy" \
  --config /2.3.5/h019-ods_file_pjp_old_new_sku_yy.conf
```

### Error Exception

```log
nothing but data 3*
```

### Zeta or Flink or Spark Version

_No response_

### Java or Scala Version

/usr/local/jdk/jdk1.8.0_341

### Screenshots

_No response_

### Are you willing to submit PR?

- [X] Yes I am willing to submit a PR!

### Code of Conduct

- [X] I agree to follow this project's [Code of Conduct](https://www.apache.org/foundation/policies/conduct)

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
