tooptoop4 opened a new issue #2392:
URL: https://github.com/apache/hudi/issues/2392
**Describe the problem you faced**
Under one of my Hudi table paths on S3, I have over 30,000 files with
.commits_.archive. in the name,
for example (only a few are listed below):
```
mytablepath/.hoodie/.commits_.archive.31370_1-0-1
mytablepath/.hoodie/.commits_.archive.31371_1-0-1
mytablepath/.hoodie/.commits_.archive.31372_1-0-1
mytablepath/.hoodie/.commits_.archive.31373_1-0-1
mytablepath/.hoodie/.commits_.archive.31374_1-0-1
mytablepath/.hoodie/.commits_.archive.31375_1-0-1
mytablepath/.hoodie/.commits_.archive.31376_1-0-1
mytablepath/.hoodie/.commits_.archive.31377_1-0-1
mytablepath/.hoodie/.commits_.archive.31378_1-0-1
mytablepath/.hoodie/.commits_.archive.31379_1-0-1
```
I have ingested into this table over 30,000 times.
There are only 51 other, non-archive files:
```
mytablepath/.hoodie/.temp/20200929011850/unknown/bcb735bd-e33d-457b-8971-2818e130ec28-0_0-25-169_20200929011850.marker
mytablepath/.hoodie/.temp/20201106041159/unknown/bcb735bd-e33d-457b-8971-2818e130ec28-0_0-25-169_20201106041159.marker
mytablepath/.hoodie/.temp/20201125123321/unknown/bcb735bd-e33d-457b-8971-2818e130ec28-0_0-25-169_20201125123321.marker
mytablepath/.hoodie/.temp/20201125123321/unknown/bcb735bd-e33d-457b-8971-2818e130ec28-0_0-25-222_20201125123321.marker
mytablepath/.hoodie/.temp/20201208015244/unknown/bcb735bd-e33d-457b-8971-2818e130ec28-0_0-25-169_20201208015244.marker
mytablepath/.hoodie/.temp/20201208015244/unknown/bcb735bd-e33d-457b-8971-2818e130ec28-0_0-25-222_20201208015244.marker
mytablepath/.hoodie/.temp/20201208065657/unknown/bcb735bd-e33d-457b-8971-2818e130ec28-0_0-25-169_20201208065657.marker
mytablepath/.hoodie/.temp/20201212205947/unknown/bcb735bd-e33d-457b-8971-2818e130ec28-0_0-25-169_20201212205947.marker
mytablepath/.hoodie/.temp/20201223083212/unknown/bcb735bd-e33d-457b-8971-2818e130ec28-0_0-25-169_20201223083212.marker
mytablepath/.hoodie/.temp/20201224042147/unknown/bcb735bd-e33d-457b-8971-2818e130ec28-0_0-25-169_20201224042147.marker
mytablepath/.hoodie/20200929011850.commit.requested
mytablepath/.hoodie/20200929011850.inflight
mytablepath/.hoodie/20201020130743.commit.requested
mytablepath/.hoodie/20201105000925.commit.requested
mytablepath/.hoodie/20201106041159.commit.requested
mytablepath/.hoodie/20201106041159.inflight
mytablepath/.hoodie/20201125123321.commit.requested
mytablepath/.hoodie/20201125123321.inflight
mytablepath/.hoodie/20201203184433.commit.requested
mytablepath/.hoodie/20201208015244.commit.requested
mytablepath/.hoodie/20201208015244.inflight
mytablepath/.hoodie/20201208065657.commit.requested
mytablepath/.hoodie/20201208065657.inflight
mytablepath/.hoodie/20201210023407.commit.requested
mytablepath/.hoodie/20201212205947.commit.requested
mytablepath/.hoodie/20201212205947.inflight
mytablepath/.hoodie/20201213163733.commit.requested
mytablepath/.hoodie/20201213163733.inflight
mytablepath/.hoodie/20201216040208.commit.requested
mytablepath/.hoodie/20201216040208.inflight
mytablepath/.hoodie/20201223083212.commit.requested
mytablepath/.hoodie/20201223083212.inflight
mytablepath/.hoodie/20201224042147.commit.requested
mytablepath/.hoodie/20201224042147.inflight
mytablepath/.hoodie/20201229164435.clean
mytablepath/.hoodie/20201229164435.clean.inflight
mytablepath/.hoodie/20201229164435.clean.requested
mytablepath/.hoodie/20201229164751.clean
mytablepath/.hoodie/20201229164751.clean.inflight
mytablepath/.hoodie/20201229164751.clean.requested
mytablepath/.hoodie/20201229164751.commit
mytablepath/.hoodie/20201229164751.commit.requested
mytablepath/.hoodie/20201229164751.inflight
mytablepath/.hoodie/20201229165044.commit
mytablepath/.hoodie/20201229165044.commit.requested
mytablepath/.hoodie/20201229165044.inflight
mytablepath/.hoodie/hoodie.properties
mytablepath/unknown/.hoodie_partition_metadata
mytablepath/unknown/bcb735bd-e33d-457b-8971-2818e130ec28-0_0-25-169_20201229164435.parquet
mytablepath/unknown/bcb735bd-e33d-457b-8971-2818e130ec28-0_0-25-169_20201229164751.parquet
mytablepath/unknown/bcb735bd-e33d-457b-8971-2818e130ec28-0_0-25-169_20201229165044.parquet
```
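The archive versus non-archive breakdown above can be reproduced from a plain listing of object keys. This is only a minimal sketch: it assumes the keys are already available as a list of strings, the regular expression is inferred from the file names shown in this issue rather than from any Hudi specification, and `classify` and `summarize` are hypothetical helper names that are not part of Hudi.

```python
import re
from collections import Counter

# Assumption: archive files look like ".commits_.archive.<number>_<suffix>",
# as in the names listed in this issue (e.g. ".commits_.archive.31370_1-0-1").
ARCHIVE_RE = re.compile(r"/\.hoodie/\.commits_\.archive\.\d+_[\d-]+$")


def classify(key):
    """Put one object key into a coarse category based on its path."""
    if ARCHIVE_RE.search(key):
        return "archive"
    if "/.hoodie/.temp/" in key:
        return "marker"
    if "/.hoodie/" in key:
        return "hoodie_meta"  # timeline files and hoodie.properties
    return "partition"  # data files and .hoodie_partition_metadata


def summarize(keys):
    """Count how many keys fall into each category."""
    return Counter(classify(k) for k in keys)


if __name__ == "__main__":
    sample = [
        "mytablepath/.hoodie/.commits_.archive.31370_1-0-1",
        "mytablepath/.hoodie/.commits_.archive.31371_1-0-1",
        "mytablepath/.hoodie/20201229165044.commit",
        "mytablepath/.hoodie/hoodie.properties",
        "mytablepath/unknown/.hoodie_partition_metadata",
    ]
    for category, count in sorted(summarize(sample).items()):
        print(category, count)
```

Run against the full listing, the `archive` count is the number that keeps growing, while the other categories stay small.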
**Expected behavior**
How can I prevent so many .commits_.archive. files from being created?
**Environment Description**
* Hudi version : 0.5.3
* Spark version : 2.4.6
* Hive version : 2.3.4
* Hadoop version : 2.8.5
* Storage (HDFS/S3/GCS..) : S3
* Running on Docker? (yes/no) : no
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org