logicbaby opened a new issue, #5081: URL: https://github.com/apache/paimon/issues/5081
### Search before asking

- [x] I searched in the [issues](https://github.com/apache/paimon/issues) and found nothing similar.

### Paimon version

- paimon-flink-1.20-1.0.1.jar
- paimon-s3-1.0.1.jar
- paimon-flink-action-1.0.1.jar

### Compute Engine

flink-1.20.0

### Minimal reproduce step

Use the MySQL CDC action to sync a table into a Paimon table stored on S3. The job cannot complete a checkpoint; the taskmanager reports:

```
Caused by: java.lang.RuntimeException: s3://paas-flink-prod/flink-paimon/wh/chen.db/department/bucket-0/data-65dbb220-7017-468d-affb-1de9dd6e4105-0.parquet is not a Parquet file. Expected magic number at tail, but found [21, 0, 21, -32]
	at org.apache.paimon.shade.org.apache.parquet.hadoop.ParquetFileReader.readFooter(ParquetFileReader.java:162) ~[paimon-flink-1.20-1.0.1.jar:1.0.1]
	at org.apache.paimon.shade.org.apache.parquet.hadoop.ParquetFileReader.<init>(ParquetFileReader.java:243) ~[paimon-flink-1.20-1.0.1.jar:1.0.1]
	at org.apache.paimon.format.parquet.ParquetUtil.getParquetReader(ParquetUtil.java:85) ~[paimon-flink-1.20-1.0.1.jar:1.0.1]
	at org.apache.paimon.format.parquet.ParquetUtil.extractColumnStats(ParquetUtil.java:52) ~[paimon-flink-1.20-1.0.1.jar:1.0.1]
	at org.apache.paimon.format.parquet.ParquetSimpleStatsExtractor.extractWithFileInfo(ParquetSimpleStatsExtractor.java:78) ~[paimon-flink-1.20-1.0.1.jar:1.0.1]
	at org.apache.paimon.format.parquet.ParquetSimpleStatsExtractor.extract(ParquetSimpleStatsExtractor.java:71) ~[paimon-flink-1.20-1.0.1.jar:1.0.1]
	at org.apache.paimon.io.StatsCollectingSingleFileWriter.fieldStats(StatsCollectingSingleFileWriter.java:105) ~[paimon-flink-1.20-1.0.1.jar:1.0.1]
	at org.apache.paimon.io.KeyValueDataFileWriter.result(KeyValueDataFileWriter.java:169) ~[paimon-flink-1.20-1.0.1.jar:1.0.1]
	at org.apache.paimon.io.KeyValueDataFileWriter.result(KeyValueDataFileWriter.java:58) ~[paimon-flink-1.20-1.0.1.jar:1.0.1]
	at org.apache.paimon.io.RollingFileWriter.closeCurrentWriter(RollingFileWriter.java:135) ~[paimon-flink-1.20-1.0.1.jar:1.0.1]
	at org.apache.paimon.io.RollingFileWriter.close(RollingFileWriter.java:167) ~[paimon-flink-1.20-1.0.1.jar:1.0.1]
	at org.apache.paimon.mergetree.MergeTreeWriter.flushWriteBuffer(MergeTreeWriter.java:235) ~[paimon-flink-1.20-1.0.1.jar:1.0.1]
	at org.apache.paimon.mergetree.MergeTreeWriter.prepareCommit(MergeTreeWriter.java:264) ~[paimon-flink-1.20-1.0.1.jar:1.0.1]
```

I downloaded this Parquet file and verified it is valid.

CDC parameters:

```shell
local:///opt/flink/usrlib/paimon-flink-action-1.0.1.jar \
  mysql_sync_table \
  --warehouse s3://paas-flink-prod/flink-paimon/wh \
  --database chen \
  --table department \
  --mysql_conf hostname=rm-xxx.mysql.rds.aliyuncs.com \
  --mysql_conf username=** \
  --mysql_conf password='**' \
  --mysql_conf database-name='xxx' \
  --mysql_conf table-name='department'
```

### What doesn't meet your expectations?

S3 cannot be used as the Paimon warehouse backend storage; HDFS works fine.

### Anything else?

_No response_

### Are you willing to submit a PR?

- [ ] I'm willing to submit a PR!
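Since the error is specifically about the magic bytes at the file tail, here is a minimal sketch (a hypothetical helper, not Paimon or Parquet library code) of the check the reader is effectively performing: a valid Parquet file begins and ends with the 4-byte ASCII magic `PAR1`. Running this against the downloaded file is a quick way to confirm whether the object on S3 differs from what the reader saw.

```python
import io

# A valid Parquet file starts with b"PAR1" and ends with b"PAR1"
# (the tail magic follows the 4-byte footer-metadata length).
PARQUET_MAGIC = b"PAR1"

def check_parquet_magic(path: str) -> bool:
    """Return True if both the head and tail of the file carry the
    Parquet magic bytes; False otherwise (e.g. a truncated or
    partially visible object, as the stack trace suggests)."""
    with open(path, "rb") as f:
        head = f.read(4)
        f.seek(-4, io.SEEK_END)  # last 4 bytes of the file
        tail = f.read(4)
    return head == PARQUET_MAGIC and tail == PARQUET_MAGIC
```

For reference, the bytes reported in the error (`[21, 0, 21, -32]`, i.e. `0x15 0x00 0x15 0xE0`) are not the footer magic, which is consistent with the reader observing an incomplete view of the object rather than the finished file.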
