eric666666 opened a new issue, #5058:
URL: https://github.com/apache/iceberg/issues/5058
I'm using an Iceberg table in v2 format. I use Spark 3.2 to rewrite the
Iceberg data files (additionally passing a where clause), and then run an
expire statement to expire the old delete files. Only the old small data files
are deleted; the equality delete files and position delete files are not
deleted and still remain in the filesystem.
Rewrite data files SQL:
`CALL hive_prod.system.rewrite_data_files(table => 'test.mock_pre_dwv'
, where => 'dt >= "2022-06-04" '
, options => map (
'delete-file-threshold','1'
,'min-input-files','1'
,'partial-progress.enabled','true'
,'max-concurrent-file-group-rewrites','20'
)
);`
Expire snapshots SQL:
`CALL hive_prod.system.expire_snapshots(table => 'test.mock_pre_dwv',
older_than => timestamp '2022-06-08 11:31:49',retain_last => 1)
;`
Result of the Spark expire action:
{
"deleted_data_files_count": 5,
"deleted_position_delete_files_count": 0,
"deleted_equality_delete_files_count": 0,
"deleted_manifest_files_count": 588,
"deleted_manifest_lists_count": 319
}
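One way to check whether the remaining delete files are still referenced by the current snapshot, rather than just left behind on disk, might be to count the entries in the table's `files` metadata table by content type. This is only a sketch and was not run as part of this report. It assumes the Iceberg version in use lists delete files in `files`, which may depend on the release.

```sql
-- Sketch only: count files tracked by the current snapshot, grouped by content type.
-- content: 0 = data file, 1 = position delete file, 2 = equality delete file
-- Assumption: this Iceberg version includes delete files in the `files` metadata table.
SELECT content, count(*) AS file_count
FROM hive_prod.test.mock_pre_dwv.files
GROUP BY content
ORDER BY content;
```

If rows with content 1 or 2 come back, the current snapshot still references those delete files, so expire_snapshots would not be expected to remove them.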
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]