voonhous commented on issue #6716:
URL: https://github.com/apache/hudi/issues/6716#issuecomment-1254590318
@yihua Thank you for the reply.
> Is the INSERT_OVERWRITE the only write action
Yes, INSERT_OVERWRITE is the only action performed on the table, i.e. every
insert rewrites a given partition, regardless of whether that partition
already exists.
For simplicity, the example below, which reproduces the issue, uses a
non-partitioned table.
```sql
drop table if exists insert_overwrite_archive_test purge;
create table if not exists insert_overwrite_archive_test (
id int,
name string,
price double,
_ts long
) using hudi
tblproperties (
type = 'cow',
primaryKey = 'id',
preCombineField = '_ts'
) location 'hdfs://insert_overwrite_archive_test';
-- INSERT_OVERWRITE 64 times
INSERT OVERWRITE insert_overwrite_archive_test VALUES (1, "test", 0.00, 1);
INSERT OVERWRITE insert_overwrite_archive_test VALUES (1, "test", 0.00, 1);
INSERT OVERWRITE insert_overwrite_archive_test VALUES (1, "test", 0.00, 1);
INSERT OVERWRITE insert_overwrite_archive_test VALUES (1, "test", 0.00, 1);
INSERT OVERWRITE insert_overwrite_archive_test VALUES (1, "test", 0.00, 1);
INSERT OVERWRITE insert_overwrite_archive_test VALUES (1, "test", 0.00, 1);
INSERT OVERWRITE insert_overwrite_archive_test VALUES (1, "test", 0.00, 1);
INSERT OVERWRITE insert_overwrite_archive_test VALUES (1, "test", 0.00, 1);
INSERT OVERWRITE insert_overwrite_archive_test VALUES (1, "test", 0.00, 1);
INSERT OVERWRITE insert_overwrite_archive_test VALUES (1, "test", 0.00, 1);
INSERT OVERWRITE insert_overwrite_archive_test VALUES (1, "test", 0.00, 1);
INSERT OVERWRITE insert_overwrite_archive_test VALUES (1, "test", 0.00, 1);
INSERT OVERWRITE insert_overwrite_archive_test VALUES (1, "test", 0.00, 1);
INSERT OVERWRITE insert_overwrite_archive_test VALUES (1, "test", 0.00, 1);
INSERT OVERWRITE insert_overwrite_archive_test VALUES (1, "test", 0.00, 1);
INSERT OVERWRITE insert_overwrite_archive_test VALUES (1, "test", 0.00, 1);
INSERT OVERWRITE insert_overwrite_archive_test VALUES (1, "test", 0.00, 1);
INSERT OVERWRITE insert_overwrite_archive_test VALUES (1, "test", 0.00, 1);
INSERT OVERWRITE insert_overwrite_archive_test VALUES (1, "test", 0.00, 1);
INSERT OVERWRITE insert_overwrite_archive_test VALUES (1, "test", 0.00, 1);
INSERT OVERWRITE insert_overwrite_archive_test VALUES (1, "test", 0.00, 1);
INSERT OVERWRITE insert_overwrite_archive_test VALUES (1, "test", 0.00, 1);
INSERT OVERWRITE insert_overwrite_archive_test VALUES (1, "test", 0.00, 1);
INSERT OVERWRITE insert_overwrite_archive_test VALUES (1, "test", 0.00, 1);
INSERT OVERWRITE insert_overwrite_archive_test VALUES (1, "test", 0.00, 1);
INSERT OVERWRITE insert_overwrite_archive_test VALUES (1, "test", 0.00, 1);
INSERT OVERWRITE insert_overwrite_archive_test VALUES (1, "test", 0.00, 1);
INSERT OVERWRITE insert_overwrite_archive_test VALUES (1, "test", 0.00, 1);
INSERT OVERWRITE insert_overwrite_archive_test VALUES (1, "test", 0.00, 1);
INSERT OVERWRITE insert_overwrite_archive_test VALUES (1, "test", 0.00, 1);
INSERT OVERWRITE insert_overwrite_archive_test VALUES (1, "test", 0.00, 1);
INSERT OVERWRITE insert_overwrite_archive_test VALUES (1, "test", 0.00, 1);
INSERT OVERWRITE insert_overwrite_archive_test VALUES (1, "test", 0.00, 1);
INSERT OVERWRITE insert_overwrite_archive_test VALUES (1, "test", 0.00, 1);
INSERT OVERWRITE insert_overwrite_archive_test VALUES (1, "test", 0.00, 1);
INSERT OVERWRITE insert_overwrite_archive_test VALUES (1, "test", 0.00, 1);
INSERT OVERWRITE insert_overwrite_archive_test VALUES (1, "test", 0.00, 1);
INSERT OVERWRITE insert_overwrite_archive_test VALUES (1, "test", 0.00, 1);
INSERT OVERWRITE insert_overwrite_archive_test VALUES (1, "test", 0.00, 1);
INSERT OVERWRITE insert_overwrite_archive_test VALUES (1, "test", 0.00, 1);
INSERT OVERWRITE insert_overwrite_archive_test VALUES (1, "test", 0.00, 1);
INSERT OVERWRITE insert_overwrite_archive_test VALUES (1, "test", 0.00, 1);
INSERT OVERWRITE insert_overwrite_archive_test VALUES (1, "test", 0.00, 1);
INSERT OVERWRITE insert_overwrite_archive_test VALUES (1, "test", 0.00, 1);
INSERT OVERWRITE insert_overwrite_archive_test VALUES (1, "test", 0.00, 1);
INSERT OVERWRITE insert_overwrite_archive_test VALUES (1, "test", 0.00, 1);
INSERT OVERWRITE insert_overwrite_archive_test VALUES (1, "test", 0.00, 1);
INSERT OVERWRITE insert_overwrite_archive_test VALUES (1, "test", 0.00, 1);
INSERT OVERWRITE insert_overwrite_archive_test VALUES (1, "test", 0.00, 1);
INSERT OVERWRITE insert_overwrite_archive_test VALUES (1, "test", 0.00, 1);
INSERT OVERWRITE insert_overwrite_archive_test VALUES (1, "test", 0.00, 1);
INSERT OVERWRITE insert_overwrite_archive_test VALUES (1, "test", 0.00, 1);
INSERT OVERWRITE insert_overwrite_archive_test VALUES (1, "test", 0.00, 1);
INSERT OVERWRITE insert_overwrite_archive_test VALUES (1, "test", 0.00, 1);
INSERT OVERWRITE insert_overwrite_archive_test VALUES (1, "test", 0.00, 1);
INSERT OVERWRITE insert_overwrite_archive_test VALUES (1, "test", 0.00, 1);
INSERT OVERWRITE insert_overwrite_archive_test VALUES (1, "test", 0.00, 1);
INSERT OVERWRITE insert_overwrite_archive_test VALUES (1, "test", 0.00, 1);
INSERT OVERWRITE insert_overwrite_archive_test VALUES (1, "test", 0.00, 1);
INSERT OVERWRITE insert_overwrite_archive_test VALUES (1, "test", 0.00, 1);
INSERT OVERWRITE insert_overwrite_archive_test VALUES (1, "test", 0.00, 1);
INSERT OVERWRITE insert_overwrite_archive_test VALUES (1, "test", 0.00, 1);
INSERT OVERWRITE insert_overwrite_archive_test VALUES (1, "test", 0.00, 1);
INSERT OVERWRITE insert_overwrite_archive_test VALUES (1, "test", 0.00, 1);
```
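The 64 statements above are identical, so they can also be generated rather than pasted. The following is only a sketch: the helper name `gen_overwrites` and the output file name `overwrites.sql` are made up for illustration and are not part of Hudi or Spark.

```shell
# Hypothetical helper: print COUNT identical INSERT OVERWRITE statements
# for TABLE, one per line, matching the statements used in the repro above.
gen_overwrites() {
  table="$1"
  count="$2"
  i=0
  while [ "$i" -lt "$count" ]; do
    printf 'INSERT OVERWRITE %s VALUES (1, "test", 0.00, 1);\n' "$table"
    i=$((i + 1))
  done
}

# Example usage (assumed file name):
#   gen_overwrites insert_overwrite_archive_test 64 > overwrites.sql
```

The generated file can then be run after the `create table` statement, for example with `spark-sql -f overwrites.sql`, assuming the session is already configured for Hudi.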
After the INSERT_OVERWRITE operations have completed, we can count the
replacecommit files in the HDFS `.hoodie` directory as follows:
```shell
$ hdfs dfs -ls hdfs://insert_overwrite_archive_test/.hoodie | grep -c '\.replacecommit$'
64
```
Let us also check the contents of the archive folder:
```shell
$ hdfs dfs -ls hdfs://insert_overwrite_archive_test/.hoodie/archived
```
The above command returns nothing, as no archiving has been done.