chuang-wang-pre opened a new issue, #314: URL: https://github.com/apache/doris-spark-connector/issues/314
### Search before asking

- [x] I had searched in the [issues](https://github.com/apache/incubator-doris/issues?q=is%3Aissue) and found no similar issues.

### Version

spark-doris-connector: 25.0.1
doris: 3.0.0
spark: 3.0.1

### What's Wrong?

```
val dorisTableIdentifier = "doris_db.doris_table"
val hiveTableName = "hive_db.hive_table"
val timeColumn = "ctime"
val selectedColumnsStr = args(5).trim
val startTime = "2025-05-06 00:00:00"
val endTime = "2025-05-07 00:00:00"

val appName = s"doris-to-hive-$hiveTableName"
val spark = SparkSession.builder()
  .appName(appName)
  .enableHiveSupport()
  .getOrCreate()

// 1. read data from doris
val dorisDF = spark.read
  .format("doris")
  .option("doris.fenodes", feNodes)
  .option("doris.table.identifier", dorisTableIdentifier)
  .option("user", user)
  .option("password", password)
  .load()
  .filter(col(timeColumn) >= lit(startTime) && col(timeColumn) < lit(endTime)) // limit timespan
  .select(selectedColumns.map(col): _*) // select columns

log.info("doris data count: {}", dorisDF.count())
Thread.sleep(1000)
log.info("doris data count: {}", dorisDF.count())
Thread.sleep(5000)
log.info("doris data count: {}", dorisDF.count())

dorisDF.createOrReplaceTempView("doris_data_detail")

// 2. write to hive
val insertSql = s"""
  |INSERT OVERWRITE TABLE $hiveTableName PARTITION (pt='20250410000000')
  |SELECT
  |$selectedColumnsStr
  |FROM doris_data_detail
  |""".stripMargin
log.info("insert hive sql: {}", insertSql)
spark.sql(insertSql)

spark.stop()
```

I used this code to copy data from Doris to Hive (doris2hive) and found that the Hive table contained fewer rows than the Doris table. To investigate, I added logging that records the row count of the DataFrame three times in a row.
The log is as follows:

```
25/05/07 19:43:56 INFO Doris2HiveTask$: doris data count: 68684
25/05/07 19:43:59 INFO Doris2HiveTask$: doris data count: 97918
25/05/07 19:44:05 INFO Doris2HiveTask$: doris data count: 99903
```

The row count in Doris:

<img width="688" alt="Image" src="https://github.com/user-attachments/assets/3e01790b-bc10-4f76-bfe9-13cb1bcc3555" />

Why did this happen? Is this a bug?

### What You Expected?

An explanation of why this happens.

### How to Reproduce?

_No response_

### Anything Else?

_No response_

### Are you willing to submit PR?

- [ ] Yes I am willing to submit a PR!

### Code of Conduct

- [x] I agree to follow this project's [Code of Conduct](https://www.apache.org/foundation/policies/conduct)

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at: [email protected]
