noahtaite commented on issue #9805:
URL: https://github.com/apache/hudi/issues/9805#issuecomment-1745640608
Hey @ad1happy2go
I reproduced the issue and found a workaround in my dev environment:
```
Reproduces the issue:
- Generate table with glue sync
- Partition delete without glue sync
- Partitions aren’t removed from glue
- Bulk insert with glue sync
- Newly inserted partitions aren’t in glue

Works as expected:
- Generate table with glue sync
- Partition delete with glue sync
- Partitions are removed from glue
- Bulk insert with glue sync
- Newly inserted partitions are in glue
```
So it seems I can get the expected behaviour by configuring my
DELETE_PARTITION write to use AWS Glue sync as well. My assumption is that,
when the delete was not synced, the next bulk_insert runs Glue sync across
both the replacecommit and the deltacommit, and so drops the incoming
partitions.
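
For illustration, a DELETE_PARTITION write with Glue sync enabled could be
configured roughly as below. This is only a sketch, not the exact job from my
environment: the helper name, table, database, and partition values are
hypothetical, and it assumes the sync goes through AwsGlueCatalogSyncTool via
the generic meta sync options.

```python
def delete_partition_options(table, database, partitions, partition_field,
                             glue_sync=True):
    """Build Hudi write options for a DELETE_PARTITION write.

    Hypothetical helper for illustration; all argument values are assumptions.
    """
    opts = {
        "hoodie.table.name": table,
        "hoodie.datasource.write.operation": "delete_partition",
        # Comma-separated list of partition paths to drop.
        "hoodie.datasource.write.partitions.to.delete": ",".join(partitions),
        "hoodie.datasource.write.partitionpath.field": partition_field,
    }
    if glue_sync:
        # Assumption: Glue sync is driven by the AWS Glue catalog sync tool.
        opts.update({
            "hoodie.datasource.meta.sync.enable": "true",
            "hoodie.meta.sync.client.tool.class":
                "org.apache.hudi.aws.sync.AwsGlueCatalogSyncTool",
            "hoodie.datasource.hive_sync.database": database,
            "hoodie.datasource.hive_sync.table": table,
            "hoodie.datasource.hive_sync.partition_fields": partition_field,
        })
    return opts


opts = delete_partition_options(
    table="example_table",            # hypothetical
    database="example_db",            # hypothetical
    partitions=["dt=2023-10-01", "dt=2023-10-02"],
    partition_field="dt",
)

# With a SparkSession and a DataFrame `df` available, the write would look like:
# df.write.format("hudi").options(**opts).mode("append").save(base_path)
```

Passing glue_sync=False gives the options for the failing scenario, where the
delete is not synced to Glue.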
Maybe this is a documentation issue more than anything? There is not much
documentation on how to use the DELETE_PARTITION operation end to end, with
the best example (IMO) being this video by @soumilshah1995:
https://www.youtube.com/watch?v=QqCiycIgSFk&t=387s
In this video he has Glue sync disabled for DELETE_PARTITION, which led me
to think that disabling it was necessary for delete_partition to work. Is
enabling Glue sync for the DELETE_PARTITION operation supported?