aokolnychyi opened a new issue #1599:
URL: https://github.com/apache/iceberg/issues/1599


   We can define the following stored procedure for removing orphan files.
   
   ```
   CALL catalog.schema.remove_orphan_files(
     namespace => 'namespace_name',
     table => 'table_name',
     older_than => timestamp,
     ignore_scheme => true, -- optional
     ignore_authority => true, -- optional
     dry_run => true -- optional
   )
   ```
   
   The stored procedure should return the list of file locations that are 
considered orphaned.
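
As a rough illustration of what "considered orphaned" could mean, the sketch below treats a file as orphaned when it sits under the table location, is older than `older_than`, and is not referenced by table metadata. The helper names (`normalize`, `find_orphan_files`) and the exact effect of `ignore_scheme` and `ignore_authority` on the comparison are assumptions made for this example, not part of the proposal. It only computes the list, which is presumably what a `dry_run => true` call would return without deleting anything.

```python
from datetime import datetime, timezone
from urllib.parse import urlsplit


def normalize(location, ignore_scheme=False, ignore_authority=False):
    """Reduce a file location to the key used when comparing locations.

    Hypothetical helper: dropping the scheme and/or authority lets
    's3://bucket/x' and 's3a://bucket/x' compare as the same file.
    """
    parts = urlsplit(location)
    scheme = "" if ignore_scheme else parts.scheme
    authority = "" if ignore_authority else parts.netloc
    return (scheme, authority, parts.path)


def find_orphan_files(listed, referenced, older_than,
                      ignore_scheme=False, ignore_authority=False):
    """Return the sorted locations that look orphaned.

    listed:     dict mapping file location -> last-modified datetime
    referenced: iterable of locations reachable from table metadata
    older_than: only files modified strictly before this are candidates
    """
    known = {
        normalize(loc, ignore_scheme, ignore_authority) for loc in referenced
    }
    return sorted(
        loc
        for loc, modified in listed.items()
        if modified < older_than
        and normalize(loc, ignore_scheme, ignore_authority) not in known
    )
```

Without `ignore_scheme`, a file listed as `s3a://...` but referenced as `s3://...` would be reported as orphaned, which is presumably the situation the optional flags are meant to handle.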
   
   It can be called with an interval:
   
   ```
   CALL catalog.schema.remove_orphan_files(
     namespace => 'namespace_name',
     table => 'table_name',
     older_than => CURRENT_TIMESTAMP - INTERVAL '5' DAYS,
     dry_run => true
   )
   ```
   
   Or with a given timestamp:
   
   ```
   CALL catalog.schema.remove_orphan_files(
     table => 'schema.table_name',
     older_than => TIMESTAMP '2020-01-19 03:14:07',
     dry_run => false
   )
   ```
   
   **Note**: we must validate that `older_than` is far enough in the past. We 
should either require it to be at least 1 day before the current time or allow 
users to configure the minimum interval through table properties.
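
A minimal sketch of that check, assuming a 1-day default that a table property can override. The property name `remove-orphan-files.min-interval-ms` and the function `validate_older_than` are hypothetical and exist only for this example.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical table property name, used only for illustration.
MIN_INTERVAL_PROPERTY = "remove-orphan-files.min-interval-ms"
DEFAULT_MIN_INTERVAL = timedelta(days=1)


def validate_older_than(older_than, table_properties, now=None):
    """Reject an `older_than` timestamp that is too close to the present.

    table_properties: dict of string property names to string values
    now:              injectable clock for testing; defaults to current UTC time
    """
    now = now or datetime.now(timezone.utc)
    raw = table_properties.get(MIN_INTERVAL_PROPERTY)
    min_interval = (
        timedelta(milliseconds=int(raw)) if raw is not None else DEFAULT_MIN_INTERVAL
    )
    latest_allowed = now - min_interval
    if older_than > latest_allowed:
        raise ValueError(
            f"older_than ({older_than.isoformat()}) must be at or before "
            f"{latest_allowed.isoformat()} (minimum interval: {min_interval})"
        )
    return older_than
```

With an empty property map, a call using `now - timedelta(days=5)` passes and one using `now - timedelta(hours=1)` is rejected. Setting the property to `"60000"` would lower the minimum to one minute.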




